TreasuredPogrom@reddit
If I had three wishes, one would be to meet this cat.
its_a_gibibyte@reddit
That inversion isn't very compelling. They've just been able to recreate a blurry mess, not anything with genuinely identifiable information.
torsten_dev@reddit
Storing the SHA of the PhotoDNA hash should be non-reversible, which I hope is what they're actually doing.
i_invented_the_ipod@reddit
This can't actually be all that surprising of a result, can it? Given the requirements for PhotoDNA (small size, resistant to most minor modifications to the image file), it kind of has to encode some of the large-scale structure of the image, right? It's interesting to use a NN for the reversing, instead of reverse-engineering the actual algorithm, though.
By following the links in the article, to other links, I eventually found this description of the actual algorithm:
> First, a full-resolution color image is converted to
> grayscale and downsized to a lower and fixed
> resolution of 400 × 400 pixels.
> ...
> Next, a high-pass filter is applied to the reduced
> resolution image to highlight the most informative
> parts of the image.
> Then, the image is partitioned into non-overlapping
> quadrants from which basic statistical
> measurements of the underlying content are
> extracted and packed into a feature vector.
> Finally, we compute the similarity of two hashes
> as the Euclidean distance between two feature
> vectors, with distances below a specified
> threshold qualifying as a match.
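To make the quoted steps concrete, here is a rough toy sketch of that kind of pipeline in plain Python. It is not PhotoDNA: the quote doesn't say which high-pass filter or which statistics are used, so the 3x3 box-blur subtraction, the mean-absolute-value feature, the 32 × 32 working size (instead of 400 × 400), and the 4 × 4 grid are all assumptions of mine, as are the function names.

```python
import math

def to_grayscale(rgb):
    """Rows of (r, g, b) tuples -> rows of luminance floats."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]

def downsize(gray, size):
    """Nearest-neighbour resample to size x size (the quote uses 400)."""
    h, w = len(gray), len(gray[0])
    return [[gray[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]

def high_pass(img):
    """Assumed filter: each pixel minus the mean of its 3x3 neighbourhood."""
    n = len(img)
    out = []
    for y in range(n):
        row = []
        for x in range(n):
            neigh = [img[j][i]
                     for j in range(max(0, y - 1), min(n, y + 2))
                     for i in range(max(0, x - 1), min(n, x + 2))]
            row.append(img[y][x] - sum(neigh) / len(neigh))
        out.append(row)
    return out

def block_features(img, grid):
    """Assumed statistic: mean absolute value per non-overlapping block."""
    step = len(img) // grid
    feats = []
    for by in range(grid):
        for bx in range(grid):
            vals = [abs(img[y][x])
                    for y in range(by * step, (by + 1) * step)
                    for x in range(bx * step, (bx + 1) * step)]
            feats.append(sum(vals) / len(vals))
    return feats

def toy_hash(rgb, size=32, grid=4):
    """Grayscale -> downsize -> high-pass -> per-block stats."""
    return block_features(high_pass(downsize(to_grayscale(rgb), size)), grid)

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def make_image(checker_on_left, offset=0):
    """Hypothetical 64x64 test image: checkerboard on one half, flat grey on the other."""
    img = []
    for y in range(64):
        row = []
        for x in range(64):
            if (x < 32) == checker_on_left:
                v = 255 if ((x // 8) + (y // 8)) % 2 == 0 else 0
            else:
                v = 128
            v += offset
            row.append((v, v, v))
        img.append(row)
    return img

a = make_image(checker_on_left=True)
a_bright = make_image(checker_on_left=True, offset=2)  # slightly brightened copy
b = make_image(checker_on_left=False)                  # structurally different image

print("a vs brightened a:", distance(toy_hash(a), toy_hash(a_bright)))
print("a vs b:           ", distance(toy_hash(a), toy_hash(b)))
```

Even in this toy version, each feature is tied to a fixed region of the image, which is why the vector ends up carrying coarse spatial structure.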
So, that tracks. Anything which "reverses" the algorithm will by necessity produce a small greyscale image of the original picture. I suppose there are probably ways to obfuscate the feature vectors in the published hash, but given the nature of similarity hashing, you can't actually produce a similarity hash that has the usual desirable characteristics of a cryptographic hash - they're distinctly different things.
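A quick way to see why the two kinds of hash don't mix: take two feature vectors that differ by a single step and run each through SHA-256. The vectors below are made-up values for illustration, not real PhotoDNA output, and the helper names are mine.

```python
import hashlib
import math
import struct

def euclidean(a, b):
    """Euclidean distance, the similarity measure from the quoted description."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def digest(features):
    """SHA-256 over the feature vector packed as unsigned bytes."""
    packed = struct.pack(f"{len(features)}B", *features)
    return hashlib.sha256(packed).hexdigest()

def differing_bits(hex_a, hex_b):
    """Number of bit positions where two hex digests disagree."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

# Hypothetical feature vectors: the second has one value nudged by 1.
original = [12, 200, 37, 90, 5, 148, 66, 251]
near_dup = [12, 201, 37, 90, 5, 148, 66, 251]

print("feature distance:", euclidean(original, near_dup))
print("digest of original:", digest(original))
print("digest of near_dup:", digest(near_dup))
print("differing digest bits (of 256):",
      differing_bits(digest(original), digest(near_dup)))
```

The feature distance stays tiny, so a threshold match still works on the raw vectors, but the digests are unrelated. Storing only the digest would limit you to exact-match lookups and lose the "resistant to minor modifications" property.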
yawara25@reddit (OP)
What I took away as the importance is that Microsoft seems to have presented it as resistant to reversing, and this disproves that claim. As someone who's never looked into the details of how a perceptual hash really works, I found this very surprising.
fnork@reddit
You just published a CSAM generator. The tech is neat, and Microsoft are bastards, but you could be the one who gets in trouble.
yawara25@reddit (OP)
I didn't publish anything. This is an article some random guy wrote 5 years ago.
i_invented_the_ipod@reddit
I mean, fair enough - it's definitely not easily reversible, and several destructive steps in the pre-processing mean that it can't be deterministically reversed into the original image.
Using a GAN to produce "plausible" original images from the hash is going to be very susceptible to the initial training data. You can see that a bit in the results in the article, where a net that was trained on a particular source is better at reproducing results from that source.
Unless someone trains their net on actual CSAM (and yuck, why would you do that?), it's not likely to produce results very similar to the original image the hash was computed from.
Ameisen@reddit
I like that one of the samples for Reddit has a faint, overlain "This image..." message, as one of the samples used for learning couldn't be found and so the error image was used instead...