archive provided by lainchan.jp

lainchan archive - /sec/ - 2489

File: 1479272440753.png (27.69 KB, 300x150, 7365254.jpg)


Let's say I want to share an image with someone online, but I don't want them to be able to plug the file into Google's reverse image search, because the source hosting the image (which I have no control over in this hypothetical scenario) contains personal information I'm not interested in sharing.

Obviously, there are ways to distort or otherwise change an image far enough that the algorithm of any reverse image engine can't associate it with the original host, but exactly how severely do I need to change the image to keep it from leaving a trail of bread crumbs back to a sensitive source?

Obviously metadata would need to be wiped, but I don't know how much visual distortion would be necessary. Cropping the image? Slightly changing the colors in an image editor? Stretching it by a few pixels? Or are we talking a more blunt method, like adding harsh noise, or some kind of serious image warping?
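
For the metadata part, here is a minimal sketch, assuming the file is an ordinary JPEG: walk the segment markers and drop the APP1 (EXIF/XMP), APP13 (IPTC/Photoshop) and COM (comment) segments, copying everything else untouched. The names `strip_jpeg_metadata` and `DROPPED_MARKERS` are made up for this example, and it does nothing for other formats or for information visible in the picture itself, so treat it as an illustration rather than a vetted tool.

```python
import struct

# Marker bytes of the segments this sketch throws away (an assumption about
# where metadata usually lives): APP1 = EXIF/XMP, APP13 = IPTC, COM = comment.
DROPPED_MARKERS = {0xE1, 0xED, 0xFE}


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG byte string without APP1/APP13/COM segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    out = bytearray(data[:2])
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("unexpected byte where a marker should be")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: the rest is compressed image data, copy it as-is.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker not in DROPPED_MARKERS:
            out += data[pos:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no start-of-scan marker found")
```

Usage would be something like reading the file in binary mode, passing the bytes through `strip_jpeg_metadata`, and writing the result to a new file.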


I've read that adding large amounts of random noise can confuse image recognition services, but this is a game of cat and mouse.

Microsoft has a new system called PhotoDNA that ignores color by converting to grayscale, and ignores resizing, flipping, and similar techniques by splitting an image into small parts and comparing those for similarities.
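
To see why color tweaks and resizing don't help against that kind of approach, here is a toy block hash. It is not Microsoft's algorithm (which isn't public in this form), just a sketch of the general idea: convert to gray, cut the image into a grid, and record which cells are brighter than average. All function names and the Rec. 601 gray weights are assumptions for the example.

```python
def to_gray(pixel):
    """Rec. 601 luma; the choice of weights is an assumption."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def block_hash(image, grid=4):
    """image: list of rows of (r, g, b); sides must be divisible by grid."""
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // grid, width // grid
    means = []
    for gy in range(grid):
        for gx in range(grid):
            total = 0.0
            for y in range(gy * cell_h, (gy + 1) * cell_h):
                for x in range(gx * cell_w, (gx + 1) * cell_w):
                    total += to_gray(image[y][x])
            means.append(total / (cell_h * cell_w))
    overall = sum(means) / len(means)
    # One bit per cell: is this cell brighter than the image as a whole?
    return [1 if m > overall else 0 for m in means]


def hamming(a, b):
    """Number of positions where two hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def make_test_image(size=8):
    """Synthetic picture: dark left half, bright right half."""
    return [[(20, 20, 20) if x < size // 2 else (200, 200, 200)
             for x in range(size)] for _ in range(size)]


def tint_red(image, amount=30):
    """Shift every pixel toward red, like a quick color tweak in an editor."""
    return [[(min(255, r + amount), g, b) for (r, g, b) in row] for row in image]


def upscale2x(image):
    """Double width and height by repeating pixels."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


original = make_test_image()
print(hamming(block_hash(original), block_hash(tint_red(original))))
print(hamming(block_hash(original), block_hash(upscale2x(original))))
```

On this synthetic image the tinted and upscaled copies hash identically to the original, which is the point: changes that leave the coarse brightness layout alone don't move the fingerprint.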

It's questionable whether this can be done satisfactorily. If there's a potential vulnerability, isn't that simply a hidden vulnerability? Is there a way you could layer security so that even if the image is found, it won't matter?


change the encoding (JPEG -> PNG) and flip it horizontally (LTR -> RTL)


File: 1479279062611.png (245.21 KB, 200x126, 9dcbeaf5cb2d4206d9c85416db9ddeaa6f55cfc2.png)

just don't let Google index your site?


>just don't let Google index your site?
basically this. You can tell Google's image bot to fuarrrk off instead of jumping through hoops to fool some image search.
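
A minimal sketch of what that could look like, with the robots.txt text held in a Python string so the rules can be checked with the standard library's `urllib.robotparser`. `Googlebot-Image` is the token Google documents for its image crawler; the `/images/` path, the `SomeOtherBot` name and the `allowed` helper are made up for the example. It only helps for a site you control, and only against crawlers that choose to obey it.

```python
import urllib.robotparser

# Hypothetical robots.txt: keep Google's image crawler out entirely and
# keep every other crawler out of the (assumed) /images/ directory.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow: /images/
"""


def allowed(user_agent, url):
    """Return True if ROBOTS_TXT permits this crawler to fetch the URL."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("Googlebot-Image", "http://site/page/image333"))
print(allowed("SomeOtherBot", "http://site/images/image333.jpg"))
print(allowed("SomeOtherBot", "http://site/page/image333"))
```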



In this vein you can split this into further strategies.

First you'd disallow indexing in robots.txt. Then, for spiders that don't respect it, you'll probably want to check the User-Agent header for a plausible browser. For sneakier culprits, the best bet is a landing page that sets a session cookie verifying the visitor isn't a 'dumb spider', so that when the image loads within that page the server can check that a valid 'site gateway' was hit first.

So http://site/page/image333 would serve HTML with an img tag that pulls image333.jpg, plus server-side logic that checks that a session was generated from /page/image333.
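
The gateway check could be sketched like this, framework-agnostic and under stated assumptions: the landing page hands out a token derived from its own path with a server-side secret, and the image handler recomputes it before serving. `SECRET_KEY`, `issue_gateway_token` and `may_serve_image` are hypothetical names, and how the token travels (presumably a session cookie) is left to whatever web stack is in use.

```python
import hashlib
import hmac
import secrets

# Server-side secret, generated at startup (assumption: a single process;
# a real deployment would need to persist or share it).
SECRET_KEY = secrets.token_bytes(32)


def issue_gateway_token(page_path):
    """Called when /page/image333 is served; the result goes into the cookie."""
    return hmac.new(SECRET_KEY, page_path.encode(), hashlib.sha256).hexdigest()


def may_serve_image(page_path, cookie_token):
    """Called by the image handler: was the matching landing page hit first?"""
    if cookie_token is None:
        return False  # no session at all, e.g. a spider fetching the .jpg directly
    expected = issue_gateway_token(page_path)
    # Constant-time comparison so the check doesn't leak how close a guess was.
    return hmac.compare_digest(expected, cookie_token)


token = issue_gateway_token("/page/image333")
print(may_serve_image("/page/image333", token))
print(may_serve_image("/page/image333", None))
```

This only filters clients that skip the landing page or ignore cookies; anything that behaves like a full browser gets through the same as a human visitor would.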


That would be totally useless.
Reverse image search engines don't care about the format of the image, and they use more intelligent techniques than just comparing images side-by-side.

As >>2490 said, they can use chunks of images (applying bag-of-words algorithms) and don't even need to match colors. They use differences in level between adjacent pixels to detect edges and shapes.

I don't know if there is a perfect way to circumvent those search engines, but I guess that if a human can distinguish the subject of the picture, the SE will be able to find the original image (probably among others, but still).


All of them used to be defeatable via a horizontal flip. Not sure if it's like that anymore.

You should also try a horizontal squeeze (i.e. compress the width to 90% but leave the height at 100%).

You'll have to experiment. Find an image that's reverse search-able and fuarrrk with it until you can't find it anymore.
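
Part of that experimenting can be done offline first. Below is a toy harness, assuming a simple gradient ("difference") fingerprint that normalizes the image size before comparing neighbouring pixels. No claim that any real engine works exactly this way; `resample`, `fingerprint`, `mirror` and `squeeze_width` are names invented for the sketch, and the test image is a plain left-to-right brightness ramp, which is the extreme case.

```python
def resample(gray, new_w, new_h):
    """Nearest-neighbour resize of a grid of gray values."""
    old_h, old_w = len(gray), len(gray[0])
    return [[gray[y * old_h // new_h][x * old_w // new_w]
             for x in range(new_w)] for y in range(new_h)]


def fingerprint(gray, size=8):
    """Shrink to (size+1) x size, then 1 bit per pixel: brighter than its right neighbour?"""
    small = resample(gray, size + 1, size)
    return [1 if row[x] > row[x + 1] else 0 for row in small for x in range(size)]


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return sum(x != y for x, y in zip(a, b))


def mirror(gray):
    """Horizontal flip."""
    return [row[::-1] for row in gray]


def squeeze_width(gray, factor):
    """Scale the width by factor, leave the height alone."""
    return resample(gray, int(len(gray[0]) * factor), len(gray))


# Synthetic 20x8 image: brightness rises from left to right.
ramp = [[x * 10 for x in range(20)] for _ in range(8)]
base = fingerprint(ramp)

print("flip:", hamming(base, fingerprint(mirror(ramp))), "of", len(base), "bits differ")
print("squeeze to 90%:", hamming(base, fingerprint(squeeze_width(ramp, 0.9))), "bits differ")
```

On this ramp the flip changes every bit while the 90% squeeze changes none, since the fingerprint resizes everything to the same grid anyway. Real images will land somewhere in between, which is why testing against the actual search engine is still the final word.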


File: 1479507938199.png (361.8 KB, 200x105, break neural networks.png)

applying a trippy filter might work.


File: 1479592025478.png (1.42 MB, 200x84, intelligent-memes.gif)

>Obviously metadata would need to be wiped, but I don't know how much visual distortion would be necessary. Cropping the image? Slightly changing the colors in an image editor? Stretching it by a few pixels? Or are we talking a more blunt method, like adding harsh noise, or some kind of serious image warping?

Consider that all of the patterns, noise, or algorithms used to obfuscate the meme could be used to individually tag and identify the original obfuscation source of the meme. Is this a process you want to perform?

This may be better off decentralized, using banked processing to handle as many memes as possible.


File: 1480542094753.png (17.82 KB, 200x125, Untitled.png)

>ignores color by converting to gray-scale
I could see this being a weakness in some ways...
There will be some value of red that converts to exactly the same gray as some shade of green...
I.e. an image made of selectively chosen RGB shades will render to a uniform gray sheet, producing many false positives.

If you like 3-colour images and all that... pic related (ish - depending on the gray-scale conversion used).
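
A quick worked example of the collision, assuming the common Rec. 601 weights (0.299 R + 0.587 G + 0.114 B) and rounding to an 8-bit value. Other conversions, including the various desaturate modes in image editors, use different weights, so the exact shades below only collide under this assumption.

```python
def luma(r, g, b):
    """8-bit gray value using Rec. 601 weights (an assumed conversion)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


# Three visibly different shades picked to land on the same gray level.
dark_red = (97, 0, 0)
dark_green = (0, 50, 0)
pure_blue = (0, 0, 255)

for name, shade in [("red", dark_red), ("green", dark_green), ("blue", pure_blue)]:
    print(name, shade, "->", luma(*shade))
```

All three come out as gray level 29, so a three-colour picture drawn in these shades turns into a flat sheet after this particular conversion.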


Maybe convert to a different color space.


Could you print it out and take a picture of the printed picture? You could bend the paper or take the picture at an angle to distort the image. Low tech.


Just tell Google to fuarrrk off.

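# Refuse (403 Forbidden) any request whose User-Agent names a Google crawler.
# The main Googlebot UA starts with "Mozilla/5.0 (compatible; Googlebot/...",
# so "Googlebot" is matched anywhere in the string rather than anchored with ^.
# That one pattern also covers Googlebot-Image and Googlebot-News.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} AdsBot-Google [OR]
RewriteCond %{HTTP_USER_AGENT} Feedfetcher-Google [OR]
RewriteCond %{HTTP_USER_AGENT} ^Google [OR]
RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} Mediapartners-Google
RewriteRule .* - [F]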
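
Before deploying patterns like these it's worth checking which User-Agent strings they actually catch. RewriteCond patterns are regular expressions matched anywhere in the string unless anchored, so `re.search` is a fair stand-in. The helper name `is_blocked` and the two pattern lists are made up for this sketch; the UA strings are the ones Google publishes for its main crawler and its image crawler.

```python
import re


def is_blocked(user_agent, patterns):
    """True if any pattern matches, mimicking a chain of [OR] RewriteConds."""
    return any(re.search(p, user_agent) for p in patterns)


ANCHORED = [r"^Googlebot", r"^Googlebot-Image"]  # must match at the start of the UA
UNANCHORED = [r"Googlebot"]                      # may match anywhere in the UA

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
IMAGE_BOT_UA = "Googlebot-Image/1.0"

for label, patterns in [("anchored", ANCHORED), ("unanchored", UNANCHORED)]:
    print(label,
          "| main crawler blocked:", is_blocked(GOOGLEBOT_UA, patterns),
          "| image crawler blocked:", is_blocked(IMAGE_BOT_UA, patterns))
```

The anchored patterns stop the image crawler but let the main crawler through, because its User-Agent begins with "Mozilla/5.0". Either way this is only a header check, and anything that lies about its User-Agent walks past it.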


File: 1483926749482.png (55.84 KB, 200x125, challenegeaccepted.png)

le gimp

to le rescue


Dunno what you did to the others... but the image at bottom right is the one I get when I just go to mode/gray-scale without any further messing... I think there is still some color difference but it's close to 0 - just about illustrates the idea.


This doesn't work retroactively though. If the site has already been crawled, the data they already have will remain indexed for some time. Unless they changed some policy.


besides the advice already provided here, you can send removal claims to Google. I think everyone here should be strongly arguing for legally mandated algorithms to take down personal images after a claim is filed.