Calculating the Fibonacci numbers on GPU

[-]

pacificlattice@reddit

uhh this is pretty thoroughly criticized on hackernews but

searching vectorial additon chains using rl and similar methods have been tried using deep learning which uses gpu lol

[-]

devraj7@reddit

out: 0;1;3;

Can someone explain why applying + to (0,1,2) and (3) outputs this?

[-]

le_birb@reddit

It looks like out(3) is saying that the output will be of length 3. The + is the operation that the scan is using, and what that looks like for the first example (an exclusive scan) is:

With input (1,2,3)

First output: the sum of all terms before the first. This is an empty sum, so it's 0

Second output: the sum of all terms before the second. This is just 1

Third output: the sum of all terms before the third. This is 1+2=3

A couple more terms there would probably have helped to see the pattern for those unfamiliar with the concept

[-]

ronniethelizard@reddit

Given the recursive nature of the fibonacci sequence, I don't think a GPU is a good approach here.
I think this is technically not a "scan" operation. Looking at the definition of scan at the beginning, it operates on the inputs only (though can be rewritten to be a recursive form). But by taking the outputs from one step as the inputs to the next step, I think this violates the definition of the scan operation.

[-]

barr520@reddit

the definition of scan does match the code.
notice how y1=x0 * x1, and y2=x0 * x1 * x2 = y1 * x2.
it is the same as defining it using it as yn=y(n-1) * xn.

[-]

ronniethelizard@reddit

No.

The inputs to this sequence are 0,1,0,0,0,0,0,0,0,0,0...

Doing a scan (with addition) on that will yield 0,1,1,1,1,1,1.

[-]

barr520@reddit

where do you see 0,1,0,0,0,...? the inputs are a series of the same matrix.

[-]

ronniethelizard@reddit

Those are the inputs to the fibonacci sequence.

[-]

barr520@reddit

No one is trying to apply a scan to these inputs, these are just some inputs you picked...
The scan is applied to a different input, and the output is the Fibonaci sequence.

[-]

No, The "inputs" to the current step are the outputs from prior steps. The actual inputs to the process are 0, 1, 0, 0, 0... Because prior outputs are used as inputs, the fibonacci sequence is an IIR (Infinite impulse response) filter, where the equation is:

y[n] = x[n] + y[n-1] + y[n-2].

For x[n], the inputs are 0, 1, 0, 0.

The y[n] sequence then becomes 0, 1, 1, 2, 3, 5, 8, 13, ...

The Z- transform of the above yields a pole outside the unit circle, which tracks with the fibonacci sequence going to infinite.

Because outputs are fed back in, this cannot be done as a scan operation (which is better called an FIR (finite impulse response) filter).

[-]

barr520@reddit

You're not describing the algorithm used in the blog post.

The algorithm used in the blogpost has all the inputs set to the matrix (1,1,1,0)(2x2 matrix flattened).

And the function used is a matrix multiplication.

This algorithm does succesfully produce the Fibonacci sequence, with the Fibonacci number itself being stored on bottom-right cell of each output matrix.

And yes, this *is* a scan.

[-]

ronniethelizard@reddit

I am describing an algorithm that a Junior level EE student can derive for the fibonacci sequence. And no it is not a scan. A scan operation operates on the inputs only (after pole-zero cancellation in the transfer function). The Fibonacci sequence operates on the prior outputs of the operation.

[-]

barr520@reddit

I never said your algorithm is a scan, I just said the algorithm in the post is.

Nowhere in the post was it even defined if the implementation recomputes all the calculations using every previous input for every output or uses the previous output, either way, the final result is the same.

[-]

ronniethelizard@reddit

No it is not a scan (either my algorithm or the one in the post). And it fundamentally cannot be a scan operation as scan only operates on inputs. The fibonacci algorithm operates on outputs.

[-]

barr520@reddit

Ignoring the fact that "scan only operates on inputs" is wrong, the n'th Fibonaci can be calculated as the bottom right element of the matrix (1,1,1,0)(flattened 2x2) raised to the n'th power.
where in this definition did I use any previous inputs?

[-]

ronniethelizard@reddit

Ignoring the fact that "scan only operates on inputs"

That is the definition given in the blog post. The x's are the inputs and the y's are the outputs.

where in this definition did I use any previous inputs?

If you didn't use the inputs, it is by definition, not a scan operation.

[-]

barr520@reddit

That is the definition given in the blog post. The x's are the inputs and the y's are the outputs.

Right, but utilizing outputs you already calculated will give the same result with less work, which is why almost every scan implementation does it.

If you didn't use the inputs, it is by definition, not a scan operation.

typo , i meant that I only used inputs, and no outputs

[-]

Wunkolo@reddit

I've been sitting on this little trick I found for a little while. While it's useful and possibly lends itself to SIMD/GPU-code, Fibonacci numbers get very large very quickly that it's almost always better to just have a look-up table when you are limited to 32 or 64 bit integers. With 32-bit integers, you only need a table of 47 elements before overflowing, with 64-bit numbers, you would need 93.

[-]

TheoreticalDumbass@reddit

$ cat fib.py
import numpy as np

n = 99999999
Mod = 9837
R = np.identity(2, dtype=int)
M = np.matrix([[1, 1], [1, 0]], dtype=int)

while n:
    if n % 2:
        R = (R * M) % Mod
    M = (M * M) % Mod
    n //= 2

print(R[0, 1])

$ time python3 fib.py
7558

real    0m0.054s
user    0m0.046s
sys     0m0.008s

[-]