~$ Whoops! That's some PII!
Tags: Information Security, Responsible Disclosure, Hacking, Reverse Engineering
Picture this: One year ago - on the last day of March 2023 - after work, a bored Maya is sitting in front of her computer.
Experienced Maya enthusiasts will immediately think: oh no. this can't be good
But you've read the title, so you may have some idea of where this is going.
In short, I stumbled onto the database of Castle Building Centres Group Ltd., a cooperative lumber/building materials buying group in Canada, found how it was made publicly accessible, and disclosed it to the administrators, who then fixed it.
But that's just the summary, so here's an in-depth retelling of the adventure:
Outline:
- Introduction to Bucket Hunting
- The legal bits
- The practice of responsible disclosure
- Finding an oddly recurring file
- Downloading and setting up the environment
- What I found
- The disclosure
- Timeline
Introduction to Bucket Hunting
You may have already heard the term Bucket Hunting before, but - if not - here's a small 101.
Essentially, due to the state of the Internet, as well as the multitude of services hosted upon it, developers and/or IT engineers find themselves in a situation where they need to store large volumes of content for the purposes of a service.
This brings about a problem: The more content you host on a system that serves a service, the more you risk bottlenecking it with ever-increasing disk I/O and network I/O requirements.
Additionally, if the host system were to become corrupted, data would be lost (whereas a system could quickly be spun up again, when it is necessary to do so).
The solution is to keep the principal service lean (in the sense that it only hosts itself and the materials to function) and host the generated data somewhere else.
This somewhere else could be a file system somewhere in a VPS, but this could lead to similar issues as before with indexing and I/O bottlenecks.
In come storage buckets: a solution provided by some cloud companies, wherein data (or 'objects') can be stored in containerized filesystems, which are then accessible by linking to the resource or using the provided API.
This solution, when implemented correctly, can save engineers a lot of time and cost.
However, certain security considerations must be taken into account, namely authentication (whether one needs to be logged in - in one way or another - to access the object) and authorization (whether one needs a specific set of permissions to access the object).
Because it turns out, you can index storage buckets.
In comes bucket hunting, where certain services like Grayhat Warfare regularly index available files on buckets, and let their users (free and paid) use the search features to find filenames containing certain words, or files of a certain type (e.g. .sql, .csv, etc.).
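To make "indexing a bucket" a bit more concrete: a publicly listable S3 bucket answers a plain GET on its root with an XML ListBucketResult document, and an indexer only has to parse it and filter the keys. Here's a minimal sketch of that parsing step. The bucket name, keys and sizes in the sample are made up for illustration, and this is not how Grayhat Warfare actually works internally (I have no idea how they do it).

```python
import xml.etree.ElementTree as ET

# Namespace used by S3 ListBucketResult documents.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Hypothetical listing: bucket name, keys and sizes are invented.
SAMPLE_LISTING = """
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>example-bucket</Name>
  <Contents><Key>images/logo.png</Key><Size>2048</Size></Contents>
  <Contents><Key>backups/example_dump.sql</Key><Size>1048576</Size></Contents>
  <Contents><Key>exports/report.csv</Key><Size>4096</Size></Contents>
</ListBucketResult>
""".strip()


def list_keys(listing_xml):
    """Return (key, size_in_bytes) pairs from a ListBucketResult document."""
    root = ET.fromstring(listing_xml)
    return [
        (
            entry.findtext("s3:Key", namespaces=S3_NS),
            int(entry.findtext("s3:Size", namespaces=S3_NS)),
        )
        for entry in root.findall("s3:Contents", S3_NS)
    ]


def interesting_keys(entries, extensions=(".sql", ".csv")):
    """Keep only the keys whose extension suggests structured data."""
    return [key for key, _size in entries if key.lower().endswith(extensions)]


if __name__ == "__main__":
    for key in interesting_keys(list_keys(SAMPLE_LISTING)):
        print(key)
```

If a bucket owner doesn't want to show up in such an index, the fix is on the listing side: no anonymous listing, no XML document to parse.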
The legal bits
Now obviously, downloading "someone else's" materials, such as a file or a database, could be portrayed as a criminal act, which is where you - the reader, as a potential party interested in such activities - should inform yourself.
In the case of Switzerland - where I reside - the following - needlessly gendered - legal aspects of the Swiss Criminal Code are enforced:
- Swiss Criminal Code Art. 143 - Unauthorised obtaining of data:
1. Any person who for his own or for another's unlawful gain obtains for himself or another data that is stored or transmitted electronically or in some similar manner and which is not intended for him and has been specially secured to prevent his access shall be liable to a custodial sentence not exceeding five years or to a monetary penalty.
2. The unauthorised obtaining of data to the detriment of a relative or family member is prosecuted only on complaint.
- Swiss Criminal Code Art. 143 bis - Unauthorised access to a data processing system:
1. Any person who obtains unauthorised access by means of data transmission equipment to a data processing system that has been specially secured to prevent his access shall be liable on complaint to a custodial sentence not exceeding three years or to a monetary penalty.
2. Any person who markets or makes accessible passwords, programs or other data that he knows or must assume are intended to be used to commit an offence under paragraph 1 shall be liable to a custodial sentence not exceeding three years or to a monetary penalty.
- Swiss Criminal Code Art. 144 bis - Damage to data:
1. Any person who without authority alters, deletes or renders unusable data that is stored or transmitted electronically or in some other similar way shall be liable on complaint to a custodial sentence not exceeding three years or to a monetary penalty.
If the offender has caused major damage, a custodial sentence not exceeding five years or a monetary penalty shall be imposed. The offence is prosecuted ex officio.
2. Any person who manufactures, imports, markets, advertises, offers or otherwise makes accessible programs that he knows or must assume will be used for the purposes described in paragraph 1 above, or provides instructions on the manufacture of such programs shall be liable to a custodial sentence not exceeding three years or to a monetary penalty.
If the offender acts for commercial gain, a custodial sentence of from six months to ten years shall be imposed.
This vulnerability disclosure, as well as this post, do not qualify - in my honest understanding, as I am not a lawyer - as violations of these articles for the following reasons:
- Swiss Criminal Code Art. 143 - Unauthorised obtaining of data:
1. This data has not been specially secured to prevent my access as it was in a publicly accessible database backup file in an open AWS S3 bucket.
2. Castle Building Centres Group Ltd. is not a family member or other relative of mine.
- Swiss Criminal Code Art. 143 bis - Unauthorised access to a data processing system:
1. This data has not been specially secured to prevent my access as it was in a publicly accessible database backup file in an open AWS S3 bucket.
2. I have not marketed or made accessible any of the passwords, password hashes, codes, or any other data from this publicly accessible database backup file in an open AWS S3 bucket.
- Swiss Criminal Code Art. 144 bis - Damage to data:
1. I have not altered, deleted or rendered unusable any of the data stored on this publicly accessible database backup file in an open AWS S3 bucket.
2. No importing of programs was done to access the data stored on the publicly accessible database backup file in an open AWS S3 bucket.
The practice of responsible disclosure
Responsible disclosure is the art form of telling the vulnerable party that:
- They are vulnerable, and
- Why they are vulnerable, and
- How someone could exploit their vulnerability, and
- How this vulnerability was discovered, and
- If gracious, what next steps to take to resolve the vulnerability.
But to be fair, I might as well hand the microphone over to my good friend Lennaert, who gave a fantastic talk about this at BeerCon 2!
Finding an oddly recurring file
And so, one lovely evening on the 31st of March, 2023, I find myself bucket hunting.
And I find a file named database_backup_20220406.sql.
And another, named database_backup_20220405.sql.
And another, named database_backup_20220407.sql.
Oh, and it is important to mention that all of these files seem to be on the same bucket.
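Spotting this kind of pattern by eye works fine for three files, but it can also be sketched in a few lines: group filenames by the part before the date stamp, then check whether the dates are consecutive. The regex and helper names below are my own illustration and only assume the database_backup_YYYYMMDD.sql naming seen above.

```python
import re
from datetime import datetime, timedelta

# Anything ending in an underscore, an 8-digit date stamp and ".sql".
BACKUP_RE = re.compile(r"^(?P<prefix>.+_)(?P<date>\d{8})\.sql$")


def dated_series(filenames):
    """Group dated .sql files by prefix and return their sorted dates."""
    series = {}
    for name in filenames:
        match = BACKUP_RE.match(name)
        if not match:
            continue
        try:
            day = datetime.strptime(match.group("date"), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that are not a real calendar date
        series.setdefault(match.group("prefix"), []).append(day)
    return {prefix: sorted(days) for prefix, days in series.items()}


def is_daily(days):
    """True if the sorted dates are exactly one day apart."""
    return len(days) > 1 and all(
        later - earlier == timedelta(days=1)
        for earlier, later in zip(days, days[1:])
    )


if __name__ == "__main__":
    found = [
        "database_backup_20220406.sql",
        "database_backup_20220405.sql",
        "database_backup_20220407.sql",
    ]
    for prefix, days in dated_series(found).items():
        print(prefix, [d.isoformat() for d in days], "daily:", is_daily(days))
```

A run of consecutive dates under one prefix is a decent hint that something automated (a scheduled backup job, say) is writing to the bucket.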
Downloading and setting up the environment
It is important to note that I was using Grayhat Warfare from inside a virtual machine, of the Kali variety.
So I immediately performed some mild wget <URL>/database_backup_20220406.sql and almost instantly had a fun little SQL file available on my system.
Now, I could have just done raw queries on the SQL database, but it was already 01:00 the following morning, and I was being a bit lazy.
So, one short XAMPP stack download and install later, I had MySQL and phpMyAdmin running on localhost, which is a not-too-terrible quick and dirty way to view SQL databases.
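For a first look, importing the dump isn't strictly necessary: an SQL dump is plain text, so the table names can be pulled straight out of its CREATE TABLE statements. This is a rough sketch of that idea, not what I ran that night. The sample dump and both table names in it are invented (only the castlecore_ prefix comes from the real thing), and the regex assumes the backtick-quoted style that mysqldump typically produces.

```python
import re

# Matches "CREATE TABLE `name`" with optional IF NOT EXISTS and backticks.
CREATE_TABLE_RE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?`?(\w+)`?",
    re.IGNORECASE,
)

# Hypothetical dump excerpt: table and column names are made up.
SAMPLE_DUMP = """
DROP TABLE IF EXISTS `castlecore_example`;
CREATE TABLE `castlecore_example` (
  `id` int(11) NOT NULL,
  `label` varchar(255) DEFAULT NULL
);
CREATE TABLE IF NOT EXISTS `example_public` (
  `id` int(11) NOT NULL
);
"""


def table_names(dump_text):
    """Return table names in the order their CREATE TABLE statements appear."""
    return CREATE_TABLE_RE.findall(dump_text)


def split_by_prefix(names, prefix):
    """Separate names that start with the prefix from the rest."""
    with_prefix = [n for n in names if n.startswith(prefix)]
    without = [n for n in names if not n.startswith(prefix)]
    return with_prefix, without


if __name__ == "__main__":
    names = table_names(SAMPLE_DUMP)
    print(split_by_prefix(names, "castlecore_"))
```

A regex is no SQL parser, so anything exotic (quoted names containing spaces, schema-qualified names) would slip through, but for a quick inventory it tends to be enough.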
What I found
The database itself had a few interesting tidbits at first glance, namely the following tables:
From there, I saw two kinds of tables: the castlecore_* tables - which felt to me like the internal tables - versus the others, which looked like the more "public facing" datasets (as in, used to populate the website).
So I then decided to map the database fields in that collection, which got me a very messy DB structure (shown below), where not a single Foreign Key was explicitly defined (making a lot of the patchwork analysis quite haphazard).
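When no foreign keys are declared, relationships have to be guessed from naming conventions. One common heuristic, sketched here under heavy assumptions, is to treat a column called something_id as a probable reference to a table called something or somethings. The schema below is entirely hypothetical and is not Castle's actual structure.

```python
def guess_foreign_keys(schema):
    """Guess (table, column, referenced_table) triples from *_id column names.

    schema maps each table name to a list of its column names.
    """
    guesses = []
    for table, columns in schema.items():
        for column in columns:
            if not column.endswith("_id"):
                continue
            stem = column[: -len("_id")]
            # Try the singular stem first, then a naive plural.
            for candidate in (stem, stem + "s"):
                if candidate in schema and candidate != table:
                    guesses.append((table, column, candidate))
                    break
    return guesses


if __name__ == "__main__":
    # Hypothetical schema, for illustration only.
    example_schema = {
        "users": ["id", "email", "vendor_id"],
        "vendors": ["id", "name"],
        "orders": ["id", "user_id", "total"],
    }
    for table, column, target in guess_foreign_keys(example_schema):
        print(f"{table}.{column} -> {target}.id (guess)")
```

Every result is a guess that still needs checking against the actual rows, which is exactly why this kind of patchwork analysis gets haphazard.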
But, to be honest, database engineering critique is not the objective of this post.
What this database contained were tables which allowed me to describe the following data points for a majority of Castle's user ecosystem:
- Full names
- Email addresses
- User Type (Administrator, Member, Partner, VIP)
- Phone numbers
- Hashed & Salted passwords (likely bcrypt)
- Work Location
- Unique keys
And these weren't just small or local companies either, as notable "heavy hitters" included some large publicly traded companies, such as:
- Makita Canada Inc. ( link)
- IKO Industries Ltd. ( link)
- The Hillman Group Canada Ltd. ( link)
- Louisiana-Pacific Corporation ( link)
- Honeywell International ( link)
- etc.
===
It's important for me to state at this stage, that all of the partners listed above are publicly listed on Castle Building Centres Group Ltd.'s own website, on the "Partners" page (see here).
===
I'm not quite sure what rawPass is, beyond being a certain pattern, but I for sure hope it's not either the salt, the actual unsalted password, or the initial password. I would, however, not be surprised if it were...
Additionally, a column called code piqued my interest, but it could be anything (including a first-time login URL parameter stub for email confirmation, a TOTP code, etc.). What it wasn't was Base64.
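Both "likely bcrypt" and "not Base64" are format judgements, and format checks are easy to sketch. The helpers below are illustrative only: one tests for the usual 60-character bcrypt layout (version marker, two-digit cost, 53 characters of salt and hash), the other tests whether a string decodes as strict Base64. None of the sample values come from the actual database.

```python
import base64
import binascii
import re

# $2a$ / $2b$ / $2x$ / $2y$, two-digit cost, 22-char salt + 31-char hash.
BCRYPT_RE = re.compile(r"^\$2[abxy]?\$\d{2}\$[./A-Za-z0-9]{53}$")


def looks_like_bcrypt(value):
    """True if the string has the shape of a bcrypt hash."""
    return bool(BCRYPT_RE.match(value))


def looks_like_base64(value):
    """True if the non-empty string decodes as strictly valid Base64."""
    if not value:
        return False
    try:
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


if __name__ == "__main__":
    fake_hash = "$2b$12$" + "A" * 53  # shape only, not a real hash
    print(looks_like_bcrypt(fake_hash))
    print(looks_like_base64("aGVsbG8="))
    print(looks_like_base64("not base64!"))
```

Keep in mind that these checks only run one way: failing the Base64 test rules it out, but passing proves little, since plenty of ordinary strings (short hex values, for instance) happen to be valid Base64 too.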
Plus, I had the entire order list, price list, for the entire catalog, as well as their entire "Careers" page.
So most of this isn't exactly world ending in its own right, but thinking about it now, I could see a few use cases to exploit this data.
- Phishing pretexts:
Call the associated number, send an email, or send an invoice in the mail, in relation to any purchase listed, in order to fool the person on the other end.
- Pivot:
Although no raw credentials are located here (I think?), I now have access to email addresses used by administrators, which are likely in use to access the infrastructure. By using breach data, I could potentially find a reused password to successfully log in and cause havoc.
I'm not terribly creative, and I'm certain just the list of emails of 1377 enabled users (1478 total) across the 394 enabled vendors (944 total) could be juicy enough for any threat actor.
The disclosure
As soon as I saw that this dataset was online, and even more importantly, that a daily snapshot of it was being pushed to an unprotected S3 bucket, I resolved to disclose the issue, because hopefully Castle - or EspressLabs, the actual company that developed this platform - would listen.
Thankfully, the users table had a very handy permission column, which very quickly gave me the list of administrators and their email addresses.
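Finding who to notify boils down to a single filtered query. The sketch below uses Python's built-in sqlite3 as a stand-in for the MySQL instance, with an invented miniature users table. The email column name, the enabled column and the exact 'Administrator' value are assumptions for illustration (only the users table and permission column are from the dump), and the addresses are placeholders.

```python
import sqlite3


def admin_emails(conn):
    """Return the email addresses of users whose permission is Administrator."""
    rows = conn.execute(
        "SELECT email FROM users WHERE permission = ? ORDER BY email",
        ("Administrator",),
    )
    return [row[0] for row in rows]


# Hypothetical stand-in data: none of this comes from the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, permission TEXT, enabled INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("carol@example.com", "Administrator", 1),
        ("bob@example.com", "Member", 1),
        ("alice@example.com", "Administrator", 1),
        ("dave@example.com", "Partner", 0),
    ],
)

if __name__ == "__main__":
    print(admin_emails(conn))
```

The point of narrowing it down like this is to contact only the people who can actually fix the problem, rather than everyone in the table.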
So I, in my infinite wisdom (sarcasm), on a Friday at 18:00-ish local time on the 31st of March 2023, decided to email all these lovely folk the following:
To their credit, they responded within 30 minutes, with the following email, seemingly indicating that a botched cron job of some sort wasn't purging the bucket on backup.
So I checked to see that this was fixed, and rightly enough, I no longer had access to the bucket.
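Checking that a fix landed doesn't require downloading anything again: a HEAD request and its status code are enough. Here's a minimal sketch with Python's standard library. The function names are my own, and the mapping from status codes to verdicts is a simplification (S3 typically answers 403 once anonymous access is removed).

```python
import urllib.error
import urllib.request


def interpret_status(status):
    """Translate an HTTP status code into a rough accessibility verdict."""
    if status == 200:
        return "publicly readable"
    if status in (401, 403):
        return "access denied"
    if status == 404:
        return "not found"
    return f"unexpected status {status}"


def check_object(url, timeout=10):
    """Send a HEAD request (no body is downloaded) and interpret the result."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return interpret_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return interpret_status(err.code)


if __name__ == "__main__":
    # No network call here; these just show the verdicts.
    print(interpret_status(200))
    print(interpret_status(403))
```

Calling check_object with the original object URL before and after the fix would be expected to flip the verdict from "publicly readable" to "access denied".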
Job done! (But not entirely for them...)
Given the nature of the data, I did mention to them that they should probably reach out to the Canadian Data Protection Authority.
At this time, I have no idea whether they did, as the following email was our last correspondence.
Timeline
- 2023-03-31 22:00 UTC: Discovery of the database dump.
- 2023-03-31 22:10 UTC: Finish setting up fresh VM and downloading database dump.
- 2023-03-31 22:20 UTC: Begin analysis of database dump.
- 2023-03-31 23:40 UTC: Finish analysis of database dump.
- 2023-03-31 23:46 UTC: Send initial email to administrators.
- 2023-04-01 00:08 UTC: Receive reply from administrators.
- 2023-04-01 00:20 UTC: Confirm database dump no longer accessible.
- 2023-04-01 00:35 UTC: Send final reply.
- 2023-04-01 - 2023-07-01: Initial disclosure cooldown period.
- 2023-07-01 - 2023-10-15: ADHD brainholing of the idea to post about it.
- 2023-10-15 - 2023-10-16: Realize I don't know where the VM is anymore (I had mistakenly unlisted it in VMWare).
- 2023-10-16 - 2024-01-04: ADHD brainholing (yes, again) of the idea to post about it.
- 2024-01-04 - 2024-01-06: I remember I don't know where the VM is, and I get writer's block.
- 2024-01-06 - 2024-04-15: Writer's block and then ADHD brainholing (..., I know...) of the idea to post about it.
- 2024-04-15: Find the VM by accident.
- 2024-04-16 - 2024-04-17: Writing this post.
- 2024-04-16 - 2024-04-17: Getting people to review and suggest improvements to this post. Thank you Kenza, Mark and Ger!
- 2024-04-18: Publication.