It looks like out(3) is saying that the output will be of length 3. The + is the operation that the scan is using, and what that looks like for the first example (an exclusive scan) is:
With input (1,2,3)
First output: the sum of all terms before the first. This is an empty sum, so it's 0
Second output: the sum of all terms before the second. This is just 1
Third output: the sum of all terms before the third. This is 1+2=3
A couple more terms there would probably have helped to see the pattern for those unfamiliar with the concept
Given the recursive nature of the fibonacci sequence, I don't think a GPU is a good approach here.
I think this is technically not a "scan" operation. Looking at the definition of scan at the beginning, it operates on the inputs only (though can be rewritten to be a recursive form). But by taking the outputs from one step as the inputs to the next step, I think this violates the definition of the scan operation.
the definition of scan does match the code.
notice how y1=x0 * x1, and y2=x0 * x1 * x2 = y1 * x2.
it is the same as defining it using it as yn=y(n-1) * xn.
No one is trying to apply a scan to these inputs, these are just some inputs you picked...
The scan is applied to a different input, and the output is the Fibonaci sequence.
No, The "inputs" to the current step are the outputs from prior steps. The actual inputs to the process are 0, 1, 0, 0, 0... Because prior outputs are used as inputs, the fibonacci sequence is an IIR (Infinite impulse response) filter, where the equation is:
y[n] = x[n] + y[n-1] + y[n-2].
For x[n], the inputs are 0, 1, 0, 0.
The y[n] sequence then becomes 0, 1, 1, 2, 3, 5, 8, 13, ...
The Z- transform of the above yields a pole outside the unit circle, which tracks with the fibonacci sequence going to infinite.
Because outputs are fed back in, this cannot be done as a scan operation (which is better called an FIR (finite impulse response) filter).
You're not describing the algorithm used in the blog post.
The algorithm used in the blogpost has all the inputs set to the matrix (1,1,1,0)(2x2 matrix flattened).
And the function used is a matrix multiplication.
This algorithm does succesfully produce the Fibonacci sequence, with the Fibonacci number itself being stored on bottom-right cell of each output matrix.
I am describing an algorithm that a Junior level EE student can derive for the fibonacci sequence. And no it is not a scan. A scan operation operates on the inputs only (after pole-zero cancellation in the transfer function). The Fibonacci sequence operates on the prior outputs of the operation.
I never said your algorithm is a scan, I just said the algorithm in the post is.
Nowhere in the post was it even defined if the implementation recomputes all the calculations using every previous input for every output or uses the previous output, either way, the final result is the same.
No it is not a scan (either my algorithm or the one in the post). And it fundamentally cannot be a scan operation as scan only operates on inputs. The fibonacci algorithm operates on outputs.
Ignoring the fact that "scan only operates on inputs" is wrong, the n'th Fibonaci can be calculated as the bottom right element of the matrix (1,1,1,0)(flattened 2x2) raised to the n'th power.
where in this definition did I use any previous inputs?
I've been sitting on this little trick I found for a little while. While it's useful and possibly lends itself to SIMD/GPU-code, Fibonacci numbers get very large very quickly that it's almost always better to just have a look-up table when you are limited to 32 or 64 bit integers. With 32-bit integers, you only need a table of 47 elements before overflowing, with 64-bit numbers, you would need 93.
$ cat fib.py
import numpy as np
n = 99999999
Mod = 9837
R = np.identity(2, dtype=int)
M = np.matrix([[1, 1], [1, 0]], dtype=int)
while n:
if n % 2:
R = (R * M) % Mod
M = (M * M) % Mod
n //= 2
print(R[0, 1])
$ time python3 fib.py
7558
real 0m0.054s
user 0m0.046s
sys 0m0.008s
The real power of using matmul for Fibonaci numbers is that you can more efficiently compute them using Exponentiation by Squaring(log N matmuls instead of N matmuls)
Otherwise you're better off using the normal scalar algorithm.
pacificlattice@reddit
uhh this is pretty thoroughly criticized on hackernews but
searching vectorial additon chains using rl and similar methods have been tried using deep learning which uses gpu lol
devraj7@reddit
Can someone explain why applying + to (0,1,2) and (3) outputs this?
le_birb@reddit
It looks like out(3) is saying that the output will be of length 3. The + is the operation that the scan is using, and what that looks like for the first example (an exclusive scan) is:
With input (1,2,3)
First output: the sum of all terms before the first. This is an empty sum, so it's 0
Second output: the sum of all terms before the second. This is just 1
Third output: the sum of all terms before the third. This is 1+2=3
A couple more terms there would probably have helped to see the pattern for those unfamiliar with the concept
ronniethelizard@reddit
Given the recursive nature of the fibonacci sequence, I don't think a GPU is a good approach here.
I think this is technically not a "scan" operation. Looking at the definition of scan at the beginning, it operates on the inputs only (though can be rewritten to be a recursive form). But by taking the outputs from one step as the inputs to the next step, I think this violates the definition of the scan operation.
barr520@reddit
the definition of scan does match the code.
notice how y1=x0 * x1, and y2=x0 * x1 * x2 = y1 * x2.
it is the same as defining it using it as yn=y(n-1) * xn.
ronniethelizard@reddit
No.
The inputs to this sequence are 0,1,0,0,0,0,0,0,0,0,0...
Doing a scan (with addition) on that will yield 0,1,1,1,1,1,1.
barr520@reddit
where do you see 0,1,0,0,0,...? the inputs are a series of the same matrix.
ronniethelizard@reddit
Those are the inputs to the fibonacci sequence.
barr520@reddit
No one is trying to apply a scan to these inputs, these are just some inputs you picked...
The scan is applied to a different input, and the output is the Fibonaci sequence.
ronniethelizard@reddit
No, The "inputs" to the current step are the outputs from prior steps. The actual inputs to the process are 0, 1, 0, 0, 0... Because prior outputs are used as inputs, the fibonacci sequence is an IIR (Infinite impulse response) filter, where the equation is:
y[n] = x[n] + y[n-1] + y[n-2].
For x[n], the inputs are 0, 1, 0, 0.
The y[n] sequence then becomes 0, 1, 1, 2, 3, 5, 8, 13, ...
The Z- transform of the above yields a pole outside the unit circle, which tracks with the fibonacci sequence going to infinite.
Because outputs are fed back in, this cannot be done as a scan operation (which is better called an FIR (finite impulse response) filter).
barr520@reddit
You're not describing the algorithm used in the blog post.
The algorithm used in the blogpost has all the inputs set to the matrix (1,1,1,0)(2x2 matrix flattened).
And the function used is a matrix multiplication.
This algorithm does succesfully produce the Fibonacci sequence, with the Fibonacci number itself being stored on bottom-right cell of each output matrix.
And yes, this *is* a scan.
ronniethelizard@reddit
I am describing an algorithm that a Junior level EE student can derive for the fibonacci sequence. And no it is not a scan. A scan operation operates on the inputs only (after pole-zero cancellation in the transfer function). The Fibonacci sequence operates on the prior outputs of the operation.
barr520@reddit
I never said your algorithm is a scan, I just said the algorithm in the post is.
Nowhere in the post was it even defined if the implementation recomputes all the calculations using every previous input for every output or uses the previous output, either way, the final result is the same.
ronniethelizard@reddit
No it is not a scan (either my algorithm or the one in the post). And it fundamentally cannot be a scan operation as scan only operates on inputs. The fibonacci algorithm operates on outputs.
barr520@reddit
Ignoring the fact that "scan only operates on inputs" is wrong, the n'th Fibonaci can be calculated as the bottom right element of the matrix (1,1,1,0)(flattened 2x2) raised to the n'th power.
where in this definition did I use any previous inputs?
ronniethelizard@reddit
That is the definition given in the blog post. The x's are the inputs and the y's are the outputs.
If you didn't use the inputs, it is by definition, not a scan operation.
barr520@reddit
Right, but utilizing outputs you already calculated will give the same result with less work, which is why almost every scan implementation does it.
typo , i meant that I only used inputs, and no outputs
Wunkolo@reddit
I've been sitting on this little trick I found for a little while. While it's useful and possibly lends itself to SIMD/GPU-code, Fibonacci numbers get very large very quickly that it's almost always better to just have a look-up table when you are limited to 32 or 64 bit integers. With 32-bit integers, you only need a table of 47 elements before overflowing, with 64-bit numbers, you would need 93.
TheoreticalDumbass@reddit
69WaysToFuck@reddit
What is the mod based on?
TheoreticalDumbass@reddit
on the article, they used the same mod
barr520@reddit
The real power of using matmul for Fibonaci numbers is that you can more efficiently compute them using Exponentiation by Squaring(log N matmuls instead of N matmuls) Otherwise you're better off using the normal scalar algorithm.