Why Does Dropbox Add a Unique ID to Every Photo?

Following on from my last post about Dropbox changing my photos, I noticed a new EXIF field of "Image Unique ID" embedded by Dropbox in the image.

This ID would allow Dropbox to track unique files across their storage estate to avoid duplication. Equally, it could be used to trace a cropped version posted online back to the original file and the account that uploaded it, especially if law enforcement turned up with legal papers and demanded access.

Think about leaked documents or protest photos: yes, it's good practice to strip the metadata out, but not everyone does.
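Stripping the metadata is a one-liner if you have exiftool installed; a minimal sketch (the filename is a placeholder):

# Wipe all metadata from the file; exiftool keeps the untouched
# original alongside as photo.jpg_original
exiftool -all= photo.jpg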

This again comes back to what Dropbox and its camera upload feature are doing, and whether it's documented anywhere.

Note that Google Photos does not embed any tracking data in the EXIF of the image I tested by uploading and downloading it.
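That test is easy to repeat yourself; a sketch, assuming exiftool is installed and original.jpg / google-copy.jpg are your before-and-after files:

# Diff the metadata of the original against the copy downloaded from
# Google Photos; any tag added in transit will show up here
diff <(exiftool -s original.jpg) <(exiftool -s google-copy.jpg)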

The Hash

The hash for IMG_7082 is 8af323e74def610b0000000000000000, which looks like a 128-bit hash with only the first 64 bits populated. I've tried a number of hash tools on various parts of the original but none of them match the unique ID. I have tested the pure image data, stripped of metadata, from both the original and the Dropbox-modified images.
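If you want to join the hunt, the sort of comparison I mean looks roughly like this; the Dropbox filename is a placeholder, and it assumes exiftool and the usual coreutils hash tools are installed:

# Pull the ImageUniqueID Dropbox wrote into the uploaded copy
ID=$(exiftool -s3 -ImageUniqueID "dropbox-copy.jpg")
echo "target: ${ID:0:16}"    # only the first 64 bits are populated

# Strip all metadata so only the image data is hashed, then try
# the first 64 bits of some common hashes against the target
exiftool -all= -o stripped.jpg IMG_7082.JPG
for tool in md5sum sha1sum sha256sum; do
    printf '%s: ' "$tool"
    "$tool" < stripped.jpg | cut -c1-16
done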

Reverse engineering the hash function isn't really the issue here; the real question is why this ID has been added at all.

Dropbox iPhone Camera Upload Changes Photos

When Google announced their new Photos tools I decided to give it a go and see what Google's machine learning could extract from my 83,292 photos stretching back 15 years. I'm sure you know that Google are offering "unlimited" and "free" storage for photos, so long as you allow them to optimize the files. I'm happy with the trade-off in quality, as I already manage an archive of full-resolution (or so I thought) photos via f-spot and have backup arrangements for it.

Dropbox Camera Upload

I have used the Dropbox Camera Upload feature for about 18 months to get photos off my iPhone and on to my various other devices and offsite backup server. Dropbox state that “When you open the app, photos and videos from your iPhone or iPad are saved to Dropbox at their original size and quality in a private Camera Uploads folder.”

This statement hides the fact that the Dropbox app re-compresses your photos before it uploads them. I found this out when I used the desktop backup client to seed Google Photos from my Dropbox camera folder, before activating the apps on my iPhone and iPad.

Google Photos checksums every photo before uploading to avoid duplication. When I enabled the Google Photos app on my iOS devices to upload directly from the iOS camera roll, the app started uploading all my photos again, leading to duplicated photos and a few gigs of wasted upload bandwidth. I wanted to understand why this happened and adapt my photo workflow to avoid it happening again.

Image checksums

First of all I extracted a single photo, IMG_7082, taken that day, directly from my iPhone over USB. I copied the file from the DCIM folder on the phone, giving me a 2.8MB file as my "master copy". Checking my Dropbox "Camera Uploads" folder I found that the same photo had, as expected, been renamed by Dropbox, but unexpectedly it had a different checksum and was over 1 megabyte smaller. The plot thickens!
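The comparison itself is trivial to reproduce; a sketch, with the Dropbox filename as a placeholder since Dropbox renames uploads to a timestamp:

# Compare size and SHA-1 of the master copy pulled over USB with the
# file Dropbox's camera upload produced
ls -l IMG_7082.JPG "2015-06-07 12.00.00.jpg"
sha1sum IMG_7082.JPG "2015-06-07 12.00.00.jpg"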

The obvious next question was what is changing the file, so I extracted the same image via email (sent at full resolution), iCloud, and Photos on my MacBook; each time it was the same size with a matching SHA-1 checksum. Uploading the master file to the free tier of Google Photos and then extracting it via Google Drive or the web UI did change the file, but Google are upfront about that.

The Proof

I have created a GitHub repo with all the photos I used in testing; if you want to have a look at them yourself, it's here: https://github.com/TimJDFletcher/IMG_7082

Quality Change

I had a quick go at reproducing the change in file size using GIMP and varying the JPEG compression level. I found that at 85% the file size was very close to what both Google Photos and Dropbox produced. This is a pretty crude test, and it isn't to say that re-compression is the only processing Google and Dropbox do.
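The same experiment is quicker on the command line than in GIMP; a sketch using ImageMagick instead, against the master file from the repo:

# Re-encode the master copy at JPEG quality 85 and compare sizes with
# the original; the result should land in the same ballpark as the
# files Google Photos and Dropbox produce
convert IMG_7082.JPG -quality 85 recompressed-85.jpg
ls -l IMG_7082.JPG recompressed-85.jpg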

Lessons Learned

The main lesson for me is that I should confirm that the applications I rely on to move data actually work as advertised. I do understand why Dropbox re-compress photos, as it gives a large saving in storage and bandwidth; I just wish they were as upfront about this as Google are.

Google say they will optimize your photos, and if you don't like this you can pay to store the originals. Dropbox, on the other hand, say "Don't worry about losing those once-in-a-lifetime shots, no matter what happens to your iPhone."

Fixing the duplicates

Fixing the duplicates was fairly simple in the end: I got a list of the files uploaded from my iPhone and then deleted them from Google Drive using google-drive-ocamlfuse and a bit of shell script.
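A sketch of the clean-up; the mount point, folder name, and file list are placeholders, but since google-drive-ocamlfuse presents Drive as a normal filesystem, plain rm does the job:

# Mount Google Drive via FUSE
mkdir -p ~/gdrive
google-drive-ocamlfuse ~/gdrive

# Delete every duplicate named in the list of files the iPhone uploaded
while read -r photo; do
    rm "$HOME/gdrive/Google Photos/$photo"
done < iphone-uploads.txt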

Fixing LDAP error 53 “Server is unwilling to perform”

Work has a pair of OpenLDAP servers in a standard master/slave synchronized setup. While preparing for some updates I checked that the LDAP servers were syncing correctly and discovered that the slave hadn't updated in over 6 months!
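A quick way to check that syncrepl is healthy is to compare contextCSN values between the two servers; a sketch, with hostnames and base DN as placeholders:

# The slave's contextCSN should match, or closely trail, the master's;
# a value that is months old stands out immediately
ldapsearch -x -H ldap://ldap-master.example.com -s base \
    -b dc=example,dc=com contextCSN
ldapsearch -x -H ldap://ldap-slave.example.com -s base \
    -b dc=example,dc=com contextCSN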

On the slave server /var/log/syslog contained the following errors:

slapd[PID]: do_syncrep2: rid=XXX LDAP_RES_SEARCH_RESULT (53) Server is unwilling to perform
slapd[PID]: do_syncrep2: rid=XXX (53) Server is unwilling to perform

Working through the Ubuntu server guide used to set up the pair of servers in the first place didn't shed any light on the problem. A fresh Ubuntu 14.04 slave in GCE showed the same problem, so at least I knew the problem was on the master server.

I finally got a clue from chapter 12 of "LDAP for Rocket Scientists", which suggested that the master server had "no global superior knowledge". This was enough to make me test removing the accesslog databases, which track LDAP transactions and allow slave servers to sync changes from the master.

In the end it was as simple as removing the databases from /var/lib/ldap/accesslog and letting slapd rebuild them after a restart. Note that, depending on the slapd config, this might be a different directory; check the setting for olcDbDirectory with this command:

sudo slapcat -b cn=config -a "(|(cn=config)(olcDatabase={2}hdb))"
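For the record, the fix itself boiled down to the following; the paths and service commands assume a default Ubuntu install, and moving the database aside rather than deleting it leaves a way back:

# Stop slapd, move the accesslog database out of the way, recreate the
# empty directory, and let slapd rebuild the database on startup
sudo service slapd stop
sudo mv /var/lib/ldap/accesslog /var/lib/ldap/accesslog.old
sudo mkdir /var/lib/ldap/accesslog
sudo chown openldap:openldap /var/lib/ldap/accesslog
sudo service slapd start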

Exchange Cumulative Update 8 and SSL certificates

This weekend, while I was patching and rebooting KVM systems for Venom, I took the opportunity to apply Microsoft's latest Exchange Cumulative Update 8 to work's Exchange server.

I ran the pre-upgrade checks and they picked up that my user wasn't in the correct AD groups for schema updates; once that was fixed the upgrade started without problems.

When the upgrade got to Section 10 – Mailbox role: Transport Services, it failed because a certificate had expired. Fine, just install an updated certificate; except by this point all of the Exchange management tools had been uninstalled, so you can't get to the certificate.

When you rerun the installer it detects the failed install, tries to resume, and fails at the same place. In the end I had to move the server's clock back two days to get the management tools to install so that the certificate could be replaced.

I've reviewed the Microsoft documentation and can't see any reference to this problem, and it wasn't detected by the pre-upgrade checks. I'm sure there is some PowerShell magic that could fix this, but at 1am on a Monday morning I wasn't all that interested in finding out!