Assembling Picard out of 4,096 clips of my favorite scene

Posted by Gedanke@reddit | TNG | 4 comments

How I did it, in case you are curious:

ELI5: I took an image of my captain and had the computer search for the exact frames in the 4 lights scene that, when assembled, match the image. Once we know the right frames, creating the zoom animation is easy.

Technical: Let f(t) be a function that takes 4,096 timestamps and returns a tensor representing the mosaic assembled from the corresponding frames of the clip. The magic is that this function can be made differentiable. The tricky part is making the sampling of a specific video frame differentiable, but there is a trick for that: instead of picking a hard frame index, which breaks gradients, the model learns a probability distribution over all frames and "smoothly" grabs the right one with a matrix multiplication.
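
For the curious, here is a minimal sketch of what that "smooth" frame grab could look like in PyTorch. It is not my exact code: the function name `soft_select_frames`, the per-tile `logits` parameterization, and the `temperature` knob are all assumptions made for illustration. The idea is that a softmax turns each tile's scores into a probability distribution over frames, and one matrix multiplication blends the frames by those probabilities.

```python
import torch


def soft_select_frames(logits: torch.Tensor, frames: torch.Tensor,
                       temperature: float = 1.0) -> torch.Tensor:
    """Differentiably pick one frame per tile (hypothetical helper).

    logits: (num_tiles, num_frames) learnable scores, one row per mosaic tile.
    frames: (num_frames, C, H, W) decoded frames of the clip.
    Returns (num_tiles, C, H, W): a probability-weighted blend per tile.
    """
    # Each row becomes a probability distribution over all frames.
    weights = torch.softmax(logits / temperature, dim=-1)
    # Flatten frames so the blend is a single matrix multiplication.
    flat = frames.reshape(frames.shape[0], -1)          # (num_frames, C*H*W)
    mixed = weights @ flat                              # (num_tiles, C*H*W)
    return mixed.reshape(logits.shape[0], *frames.shape[1:])


# Toy usage: random tensors stand in for real video frames.
torch.manual_seed(0)
frames = torch.rand(10, 3, 4, 4)
logits = torch.zeros(6, 10, requires_grad=True)
tiles = soft_select_frames(logits, frames)
tiles.sum().backward()  # gradients flow back to the logits

# A sharply peaked row behaves like a hard index into the frames.
peaked = torch.zeros(1, 10)
peaked[0, 3] = 50.0
hard_like = soft_select_frames(peaked, frames)
```

When one frame's score dominates, the blend collapses to (almost exactly) that frame, so the soft version can end up at a hard choice while still giving the optimizer gradients along the way.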

With this setup, we can define a loss function L(t) that scores how closely the mosaic assembled from those timestamps resembles the target Picard image. From there, we just use standard ML tooling (PyTorch) to find the timestamps that minimize that loss, and voilà!
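
A rough sketch of that optimization loop, under stated assumptions: the loss here is plain mean squared error and the optimizer is Adam with an arbitrary learning rate, neither of which is confirmed as what I actually used. The sizes are toy values (the real thing has 4,096 tiles), random tensors stand in for the clip frames and the target image, and `assemble` is a hypothetical helper.

```python
import torch

torch.manual_seed(0)

# Toy sizes: a 4 x 4 grid of 2 x 2 pixel tiles chosen from 12 frames.
grid, tile, num_frames = 4, 2, 12
frames = torch.rand(num_frames, 3, tile, tile)      # stand-in for clip frames
target = torch.rand(3, grid * tile, grid * tile)    # stand-in for target image

# One row of learnable scores per tile (assumed parameterization).
logits = torch.zeros(grid * grid, num_frames, requires_grad=True)
optimizer = torch.optim.Adam([logits], lr=0.1)


def assemble(logits: torch.Tensor) -> torch.Tensor:
    """Blend frames per tile and lay the tiles out as one (C, H, W) image."""
    weights = torch.softmax(logits, dim=-1)
    tiles = weights @ frames.reshape(num_frames, -1)
    tiles = tiles.reshape(grid, grid, 3, tile, tile)   # (row, col, C, h, w)
    # Reorder to (C, row, h, col, w), then merge into full height and width.
    return tiles.permute(2, 0, 3, 1, 4).reshape(3, grid * tile, grid * tile)


losses = []
for step in range(200):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(assemble(logits), target)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Read off the hard frame index chosen for each tile of the mosaic.
best_frames = logits.argmax(dim=-1).reshape(grid, grid)
```

After training, `best_frames` holds one frame index per tile, and those indices map back to timestamps in the clip.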

