Why Does Dropbox Add a Unique ID to Every Photo?

Following on from my last post about Dropbox changing my photos, I noticed a new exif field of “Image Unique ID” embedded by Dropbox in the image.

This ID would allow Dropbox to track unique files across their storage estate to avoid duplication. Equally it could used to track the original file and who uploaded it from a cropped version posted online, especially if law enforcement turned up with legal papers and demanded access.

Think about leaked documents or protest photos, yes it’s good practise to strip the meta data out but not everyone does.

This again comes back to what Dropbox and it’s camera upload feature is doing and is it documented anywhere?

Note Google Photos does not embedded any tracking data in the exif of the image I tested by uploading and downloading it.

The Hash

The hash for IMG_7082 is 8af323e74def610b0000000000000000 which looks like a 128bit hash but with only the first 64 bits populated. I’ve tried a number of hash tools on various parts of the original but they don’t match the unique ID. I have tested just the pure image data from original and Dropbox modified images.

Reverse engineering the hash function isn’t the real issue here,  the real question is why has this ID been added?

Dropbox iPhone Camera Upload Changes Photos

When Google announced their new Photos tools I decided to give it a go and see what Google’s machine learning could extract from my 83,292 photos stretching back 15 years. I’m sure you know that Google are offering “unlimited” and “free” storage for photos so long as you allow them to optimize your photos. I’m happy with the trade-off in quality as I already manage an archive of full resolution (or so I thought) photos via f-spot and have backup arrangements for it.

Dropbox Camera Upload

I have used the Dropbox Camera Upload feature for about 18 months to get photos off my iPhone and on to my various other devices and offsite backup server. Dropbox state that “When you open the app, photos and videos from your iPhone or iPad are saved to Dropbox at their original size and quality in a private Camera Uploads folder.”

This statement hides the fact, that the Dropbox app re-compresses your photos before it uploads them. I found this out when I used the desktop backup client to seed Google Photos from my Dropbox camera folder, before activating the apps on my iPhone and iPad.

Google checksum all photos before uploading to avoid duplication. When I enabled the Google Photos app on my iOS devices to upload directly from the iOS camera roll, the app started to upload all my photos again. This led to duplicated photos and a few gigs of wasted upload bandwidth. I wanted to understand why this happened and adapt my photo work flow to avoid it happening again.

Image checksums

First of all I extracted a single photo IMG_7082 taken that day directly from my iPhone over USB. I copied the file from the DCIM folder on the phone, gaving me a 2.8MB file as my “master copy”. Checking my Dropbox “Camera Uploads” folder I found the same photo as expected had been renamed by Dropbox but unexpectedly had a different checksum and was over 1 megabyte smaller, the plot thickens!

The obvious next question was what is changing the file, so I extracted the same image file via email (sent as full resolution), iCloud and Photos on my MacBook each time it was the same size with a matching sha1 checksum. Uploading the master file to the free tier of Google Photos and then extracting it via Google Drive or the web UI did change the file but Google are upfront about that.

The Proof

I have created a github repo with all the photos I used in testing if you want to have a look at them yourself it’s here: https://github.com/TimJDFletcher/IMG_7082

Quality Change

I had a quick go at reproducing the same change in size of the image using GIMP and changing the JPEG compression level. I found that at 85% the file size was very close to the file size the both Google Photos and Dropbox produced. This is pretty crude test and is not to say this is the only compression that Google and Dropbox do.

Lessons Learned

The main lesson for me is that I should confirm how applications I rely on to move data work as advertised. I do understand why Dropbox re-compress photos as it gives a large saving in storage and bandwidth, I wish they were as upfront about this as Google are.

Google says they will optimize your photos, if you don’t like this then you can pay money to store the originals. Dropbox on the other hand say “Don’t worry about losing those once-in-a-lifetime shots, no matter what happens to your iPhone.”

Fixing the duplicates

Fixing the duplicates was fairly simple in the end, I just got a list of files uploaded from my iPhone and then deleted them from Google Drive using google-drive-ocamlfuse and a bit of shell script.