I wonder if any of the companies that require unique uploads to the cloud service play fast and loose with compression. There is a large spectrum between hashing a whole file to see whether it already exists on the server and doing the same thing over pieces of a file. Full-file deduplication might clearly be against the rules, but what about a totally generic upload cache that only looks at parts of a file, checks whether each chunk has already been uploaded, and then copies it from the server-side cache instead of retransmitting it from the user? Seems like there is a lot of grey area between deduplication and simple optimized compression.
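
To make the "upload cache" idea concrete, here is a rough sketch of what chunk-level caching could look like. It is not how any particular service works. `ChunkCache`, `upload`, `reassemble`, and the tiny `CHUNK_SIZE` are all made up for illustration, and it assumes fixed-size chunks hashed with SHA-256, with an in-memory dict standing in for the server.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny on purpose so the demo is readable


class ChunkCache:
    """Hypothetical server-side cache keyed by chunk hash (a dict stands in for the server)."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def has(self, digest):
        return digest in self.chunks

    def put(self, digest, data):
        self.chunks[digest] = data

    def get(self, digest):
        return self.chunks[digest]


def upload(data, cache, chunk_size=CHUNK_SIZE):
    """Send only chunks the cache has not seen; return (manifest, bytes_sent)."""
    manifest = []   # ordered list of chunk digests that describes the file
    bytes_sent = 0  # bytes that actually had to travel from the user
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if not cache.has(digest):
            cache.put(digest, chunk)
            bytes_sent += len(chunk)
        manifest.append(digest)
    return manifest, bytes_sent


def reassemble(manifest, cache):
    """Rebuild the file on the server side from cached chunks."""
    return b"".join(cache.get(digest) for digest in manifest)


if __name__ == "__main__":
    cache = ChunkCache()
    _, first = upload(b"AAAABBBBCCCC", cache)
    manifest, second = upload(b"AAAAXXXXCCCC", cache)
    print(first, second)                  # bytes sent for each upload
    print(reassemble(manifest, cache))    # second file rebuilt from the cache
```

In this sketch the second upload only transmits the one chunk the cache has not seen before. Setting `chunk_size` to the file length or larger turns the same code into whole-file deduplication, which is the grey area in question: the mechanism is identical and only the granularity changes.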