trialofmiles@reddit
Summed-area tables, and their use for approximate Gaussian filtering as an O(1)-per-pixel option, are a very old idea in computer vision.
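For anyone who hasn't seen it, the core is just 2D prefix sums. A minimal sketch (the function names are mine, not from the article or any particular library):

```rust
// Build a summed-area table (integral image) over a grayscale image.
// sat[y][x] holds the sum of all pixels in the rectangle (0,0)..(x,y).
fn summed_area_table(img: &[f32], w: usize, h: usize) -> Vec<f32> {
    let mut sat = vec![0.0f32; w * h];
    for y in 0..h {
        let mut row_sum = 0.0f32;
        for x in 0..w {
            row_sum += img[y * w + x];
            let above = if y > 0 { sat[(y - 1) * w + x] } else { 0.0 };
            sat[y * w + x] = row_sum + above;
        }
    }
    sat
}

// Mean over the inclusive box [x0..=x1] x [y0..=y1]: four lookups
// per pixel, no matter how large the kernel radius is.
fn box_mean(sat: &[f32], w: usize, x0: usize, y0: usize, x1: usize, y1: usize) -> f32 {
    let a = if x0 > 0 && y0 > 0 { sat[(y0 - 1) * w + x0 - 1] } else { 0.0 };
    let b = if y0 > 0 { sat[(y0 - 1) * w + x1] } else { 0.0 };
    let c = if x0 > 0 { sat[y1 * w + x0 - 1] } else { 0.0 };
    let d = sat[y1 * w + x1];
    (d - b - c + a) / ((x1 - x0 + 1) * (y1 - y0 + 1)) as f32
}
```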
Kaloffl@reddit
The O(K) -> O(1) optimization can be made without creating "blocky" outputs by turning it into an Infinite Impulse Response implementation. Intel had a nice article, but took it down at some point: https://web.archive.org/web/20151008041501/https://software.intel.com/en-us/articles/iir-gaussian-blur-filter-implementation-using-intel-advanced-vector-extensions/
Interpreting the input bytes linearly pretty much means that your blur is completely broken. You can test this by taking a picture with a pixel-sized black-and-white checkerboard: blur it and watch its brightness shift completely. You need to decode (sRGB to linear light) before blurring and re-encode afterwards. The blurred pictures in the post are also too dark (compared to GIMP's Gaussian blur), but it's less obvious.
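Roughly, the round trip looks like this (a minimal sketch using the standard sRGB transfer function; the helper names are my own):

```rust
// Decode one 8-bit sRGB channel to linear light.
fn srgb_to_linear(v: u8) -> f32 {
    let s = v as f32 / 255.0;
    if s <= 0.04045 { s / 12.92 } else { ((s + 0.055) / 1.055).powf(2.4) }
}

// Encode linear light back to 8-bit sRGB.
fn linear_to_srgb(l: f32) -> u8 {
    let s = if l <= 0.0031308 { 12.92 * l } else { 1.055 * l.powf(1.0 / 2.4) - 0.055 };
    (s.clamp(0.0, 1.0) * 255.0).round() as u8
}

// The checkerboard test: averaging black and white in linear light
// re-encodes to ~188, not the 128 you get by averaging raw bytes.
fn blend_linear(a: u8, b: u8) -> u8 {
    linear_to_srgb((srgb_to_linear(a) + srgb_to_linear(b)) / 2.0)
}
```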
Kaloffl@reddit
Surprised to see no mention of the Infinite Impulse Response way to implement Gaussian blur. Intel used to have an article about it: https://web.archive.org/web/20151008041501/https://software.intel.com/en-us/articles/iir-gaussian-blur-filter-implementation-using-intel-advanced-vector-extensions/
The nice thing about this way of implementing the filter is that it doesn't matter how large a blur you want, you always have the same number of coefficients and the same number of iterations.
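The gist, sketched with a first-order recursive filter (real implementations like the Intel one, Deriche, or Young–van Vliet use third-order filters with published coefficient formulas; the sigma-to-alpha mapping below is only illustrative):

```rust
// One forward (causal) plus one backward (anti-causal) exponential
// pass per axis: per-pixel cost is independent of sigma.
fn iir_smooth_1d(row: &mut [f32], sigma: f32) {
    // Illustrative feedback coefficient; not a published formula.
    let alpha = 1.0 - (-1.0 / sigma).exp();
    for i in 1..row.len() {
        row[i] = row[i - 1] + alpha * (row[i] - row[i - 1]);
    }
    // The backward pass makes the impulse response symmetric.
    for i in (0..row.len().saturating_sub(1)).rev() {
        row[i] = row[i + 1] + alpha * (row[i] - row[i + 1]);
    }
}
```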
KadmonX@reddit
So you’re essentially approximating Gaussian blur with multiple box filter passes now?
bentheaeg@reddit
That was already the case prior to the change this article is about; "fast" methods being approximations is very common.
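By the central limit theorem, repeated box filters converge on a Gaussian, and three passes are usually indistinguishable by eye. A minimal 1D sketch (names and the clamp-to-edge handling are my own choices):

```rust
// Sliding-window box blur with clamp-to-edge: O(1) work per sample.
fn box_blur_1d(src: &[f32], radius: usize) -> Vec<f32> {
    let n = src.len();
    let norm = 1.0 / (2 * radius + 1) as f32;
    let tap = |i: isize| src[i.clamp(0, n as isize - 1) as usize];
    let mut sum: f32 = (-(radius as isize)..=radius as isize).map(tap).sum();
    let mut out = Vec::with_capacity(n);
    for i in 0..n as isize {
        out.push(sum * norm);
        // Slide the window: add the next tap on the right, drop the left.
        sum += tap(i + radius as isize + 1) - tap(i - radius as isize);
    }
    out
}

// Each extra pass makes the result more Gaussian-shaped.
fn approx_gaussian_1d(src: &[f32], radius: usize, passes: usize) -> Vec<f32> {
    (0..passes).fold(src.to_vec(), |cur, _| box_blur_1d(&cur, radius))
}
```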
KadmonX@reddit
Yeah, but their second pass caught my eye. I don't think it's the best approach, and it could be done faster. Beyond that, I first learned about this method about 25 years ago; without the words "faster" and "Rust", there was nothing new.
bentheaeg@reddit
Have you even read the article? It's not about that. It's your second comment that's off base: the whole point is switching computations from floats to ints, and how to keep the design clean across image types, that's all. First, you could read it; second, if you don't like it, that's fine, but unrelated comments are just a waste of your time.
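The kind of switch in question looks something like this (a hedged sketch; the 16.16 fixed-point format and the names are my assumptions, not taken from the article):

```rust
// Replace a per-pixel float division by the window size with a
// precomputed fixed-point reciprocal and a right shift.
const FP_SHIFT: u32 = 16;

// Precompute once per blur pass, not per pixel.
fn reciprocal(window: u32) -> u32 {
    ((1u32 << FP_SHIFT) + window / 2) / window
}

// Per pixel: one integer multiply and one shift, no division.
fn normalize_sum(sum: u32, recip: u32) -> u8 {
    ((sum as u64 * recip as u64) >> FP_SHIFT) as u8
}
```

When the window size is itself a power of two, the division collapses to a plain shift.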
KadmonX@reddit
I’ve read the article. Do you mean they replaced floating-point division with a bitwise integer shift? And now it all takes about 3 cycles instead of 15-30? That’s not a new trick either. I’m more curious about why they used floats in the first place - presumably for rendering. But if it’s all for rendering, why not use the GPU? Why not use multithreading? So the key takeaway here is just approximating the Gaussian filter with a box filter. Everything else raises more questions than it answers.
gmes78@reddit
The article does explain it: HDR images use floats.
UninterestingDrivel@reddit
Perhaps you're not the target audience then. Have you considered that the world might not revolve around you?
YamGlobally@reddit
Have you considered that you're not as smart as you think?
programming-ModTeam@reddit
Your post or comment was removed for the following reason or reasons:
Your post or comment was overly uncivil.
iVerner@reddit
How to say you didn’t read the article without saying you didn’t read the article.
FirmMost2@reddit
but it's faster and has rust in the name!
thuiop1@reddit
Good write-up! Thanks for your work!
bentheaeg@reddit
Nice write-up, thank you! Would be curious to see how this compiles, SIMD-wise.