An interesting spin on an old technique, private information retrieval (PIR), brought to Hadoop:
http://eprint.iacr.org/2012/398.pdf
"...Retrieval of previously outsourced data in a privacy-preserving manner is an important requirement in the face of an untrusted cloud provider. PIRMAP is the rst practical PIR mechanism suited to real-world cloud computing. In the case, where a cloud user wishes to privately retrieve large les from the untrusted cloud, PIRMAP is communication e cient. Designed for prominent MapReduce clouds, it leverages their parallelism and aggregation phases for maximum performance. Our analysis shows that PIRMAP is an order of magnitude more e cient than trivial PIR and introduces acceptable overhead over non-privacy-preserving data retrieval. Additionally, we have shown that our scheme can scale to cloud stores of up to 1 TB on Amazon's Elastic MapReduce service..."