Amazon S3 'public' buckets - Still leaking data...


There has been a lot in the news recently about Amazon S3 buckets exposing large quantities of personal data that should have been under lock and key. The leaks were caused by owners misconfiguring bucket privileges. Examples include the 'accidental' exposure of records on 200 million American voters, which included personal contact details (Gizmodo has a good read on the topic), and the leak of 14 million mobile telephone subscriber details by Verizon (check out The Register article).

Following multiple high-profile leaks, Amazon contacted all owners of publicly accessible buckets to remind them of the implications of having a public bucket (you physically have to switch a bucket from private to public). Yet, despite the warning, exposed buckets containing highly sensitive data are still being found on a daily basis.

Many researchers, both black hat and white hat, are hunting for public buckets, so I decided to jump into the white hat ring and see if I could find any.

What is S3?

Amazon Simple Storage Service (hence S3) is a cloud-based storage solution for both static and dynamic content. Many organisations use S3 to store backups, media, logs, data and static websites. Access controls can be applied to the bucket, to individual objects (files) and to groups of objects to manage permission levels (who has access). The default control is 'private', meaning an access token is required to access the object.

A public bucket is defined as one which lists its contents and allows access to the data. It is very straightforward to check if a bucket is public. For example, below is a public bucket:

open-bucket-web

And this is a private bucket:

access-denied-web

You simply need to navigate to a default Amazon S3 URL, such as https://s3-eu-west-1.amazonaws.com/BUCKET. Just replace BUCKET with a bucket name and see what comes up.
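The same check can be automated. The following is a minimal sketch, not the script used for this post: the function names are hypothetical, the endpoint is the one from the URL above, and it assumes Node.js 18 or later for the built-in fetch. It treats a 200 response containing a ListBucketResult document as public, a 403 as private and a 404 as a bucket that does not exist. Anything else, such as a redirect to another region, is reported as unknown.

```javascript
// Endpoint taken from the example URL in the text; other regions use other hosts.
const S3_ENDPOINT = 'https://s3-eu-west-1.amazonaws.com';

// Build the path-style URL for a bucket name (hypothetical helper).
function bucketUrl(name) {
  return `${S3_ENDPOINT}/${encodeURIComponent(name)}`;
}

// Classify a response from its HTTP status and XML body.
function classifyBucket(status, body) {
  if (status === 200 && body.includes('<ListBucketResult')) return 'public';
  if (status === 403) return 'private'; // typically an AccessDenied error
  if (status === 404) return 'missing'; // typically a NoSuchBucket error
  return 'unknown'; // redirects to another region, throttling, etc.
}

// Fetch the bucket URL and classify the result (needs network access).
async function checkBucket(name) {
  const res = await fetch(bucketUrl(name));
  return classifyBucket(res.status, await res.text());
}
```

Calling checkBucket('BUCKET') with a real bucket name returns a promise that resolves to one of the four labels.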

Finding the Buckets

It is important to note that not all public buckets contain personal data; many function as legitimate data stores (for hot linking etc). It is only in the rarest of cases that a misconfiguration or oversight by the owner leads to data being exposed. I wrote a quick and dirty script in NodeJS to search for public buckets using a dictionary of common words.
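The original script is not reproduced here, but a dictionary search along these lines might look roughly like the sketch below. Everything in it is an assumption for illustration: the function names, the suffix list and the delay between requests are hypothetical, and `check` stands for any async function that resolves to 'public' when a bucket lists its contents. Candidate names are filtered against the basic S3 naming rules (3 to 63 characters, lowercase letters, digits, dots and hyphens, starting and ending with a letter or digit).

```javascript
// Basic S3 bucket naming rules, expressed as a regular expression.
const VALID_NAME = /^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$/;

// Combine dictionary words with a few suffixes (illustrative only)
// and keep the results that are valid bucket names.
function candidateNames(words, suffixes = ['', '-backup', '-dev']) {
  const names = new Set();
  for (const word of words) {
    for (const suffix of suffixes) {
      const name = `${word.toLowerCase()}${suffix}`;
      if (VALID_NAME.test(name)) names.add(name);
    }
  }
  return [...names];
}

// Check names one at a time, pausing between requests to stay polite.
// `check` is supplied by the caller and resolves to a label such as 'public'.
async function scan(names, check, delayMs = 500) {
  const found = [];
  for (const name of names) {
    if ((await check(name)) === 'public') found.push(name);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return found;
}
```

Keeping the name generation separate from the network check makes it easy to try the word list offline before sending any requests.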

Here are four examples of what I found.

Customer database

A [niche] business selling goods would want to keep its customer database secure, protected from prying eyes and competitors. Well, I stumbled across this database, which contained a list of 'high value' customers.

example_data-web

I have obfuscated some sensitive information, but the image should give you an idea of the data available: unprotected and open to the world.

SPAM list

Amazon is used to host not just legal content, but also illegal content. I was a little surprised to find a set of spam emails and attachments in an open bucket.

spam-content-web

These attachments are used for phishing schemes, and contain links to malware.

Child incident reports

One of the most sensitive open buckets I found contained a list of incident reports related to children having accidents at some premises (falling over, tripping, cutting a hand etc).

incident_reports_web

These PDF files were uploaded to an open bucket, without any protection. Details of the incident, the child's name, the parents' names and some medical history were present in the files.

Source code and databases

The final example is that of a hotel in Indonesia, which has exposed its customer database along with the source code for its website, API server and mobile phone applications.

directory_web

While the customer data is present, of great concern is that the payment processor API and corresponding tokens are also present!

Conclusion

For those who use Amazon S3: secure those buckets, remove data from the public eye and, importantly, regularly check the permissions on your buckets. It has been a lot easier than expected to discover public buckets, and it is a shame that owners are not taking note of Amazon's repeated messages to secure their content.
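As a rough illustration of what a regular permission check could look for, the sketch below scans a bucket ACL for grants to the two S3 groups that represent everyone and any AWS account holder. It assumes the ACL has already been retrieved and has the shape of a GetBucketAcl response (a Grants array of Grantee and Permission entries); the function name is hypothetical. It only covers ACLs, so a bucket policy could still make a bucket public even when this returns nothing.

```javascript
// Group URIs S3 uses for "everyone" and "any authenticated AWS user".
const PUBLIC_GROUPS = [
  'http://acs.amazonaws.com/groups/global/AllUsers',
  'http://acs.amazonaws.com/groups/global/AuthenticatedUsers',
];

// Return the permissions (e.g. READ, WRITE) that an ACL grants to public groups.
// `acl` is assumed to look like { Grants: [{ Grantee: { URI }, Permission }] }.
function publicGrants(acl) {
  return (acl.Grants || [])
    .filter((g) => g.Grantee && PUBLIC_GROUPS.includes(g.Grantee.URI))
    .map((g) => g.Permission);
}
```

An empty result means the ACL itself grants nothing to the public groups; any entry in the result is worth a closer look.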

I have responsibly disclosed the buckets that should not be publicly accessible to their owners (and to Amazon).

Tags

amazon s3 data leak
Daniel Leightley


Hi, I'm Dan Leightley, a researcher based in the United Kingdom. I work with machine learning, computer vision and big data.
