Learn from the worst data breaches in history and what your security team can do to prevent similar incidents.
• Stolen credentials caused many of the biggest breaches in history. If attackers can log in, they don’t need to break in
• Misconfigured databases and unpatched vulnerabilities caused the largest breaches by record count. These are basic failures, not advanced attacks
• Detection gaps let attackers stay inside networks for months or years. Every day of undetected access multiplies the damage
• Credential theft and unpatched systems keep causing breaches decade after decade. The basics still aren’t covered
Nearly one trillion stolen identity records are now circulating on the dark web (SpyCloud). Behind that number are real incidents where real organizations failed to protect sensitive data. The 2025 IBM Cost of a Data Breach Report found that breaches now cost an average of $4.44 million.
What’s worse is that many of these breaches were preventable. The same root causes show up again and again: stolen credentials, unpatched vulnerabilities, and misconfigured databases. Teams keep making the same mistakes because they don’t learn from past incidents.
Once you see how Yahoo lost 3 billion accounts or how Equifax exposed 147 million Americans, you start spotting similar gaps in your own environment.
This guide breaks down 30 famous data breaches. For each one, you’ll learn what happened and what lessons apply to your organization today.
What Is a Data Breach?
Security teams deal with data breaches constantly, but the definition matters for legal and regulatory purposes.
A data breach is a security incident where unauthorized parties access sensitive information. Breaches expose personal data like names and passwords. Social Security numbers and financial records are common targets too. The exposed data ends up on dark web marketplaces where attackers use it against your organization.
There’s an important distinction between breaches and leaks. A breach involves unauthorized access, meaning someone bypassed security controls. A leak typically means data was exposed accidentally, for example through a misconfigured database. An exposure means data was publicly accessible, even if no one malicious found it.
These distinctions matter because they affect regulatory requirements and notification obligations. Remediation steps differ too. But for practical purposes, any sensitive data ending up where it shouldn’t be is bad news.
Debating definitions doesn’t help you. Let’s look at the root causes instead.
What Are the Most Common Causes of Data Breaches?
The same vulnerabilities appear in breach after breach. Attackers don’t need zero-days when basic security failures keep working.
Stolen credentials cause more breaches than any other vector. The Verizon 2025 DBIR found that 88% of breaches in its basic web application attacks pattern involved stolen credentials. Attackers get passwords from phishing and infostealer malware. Third-party breaches expose them too. When employees reuse passwords across services, one breach leads to many more.
Unpatched vulnerabilities gave attackers entry in some of the biggest breaches on record. The Equifax breach exploited a known Apache Struts vulnerability that had a patch available for two months. WannaCry ransomware spread through a Windows vulnerability Microsoft had patched 59 days earlier.
Misconfigured systems expose data without any hacking required. Elasticsearch and MongoDB databases get left open to the internet. Cloud storage buckets get set to public access. The CAM4 breach happened because of an exposed Elasticsearch cluster with no authentication.
Phishing attacks harvest credentials directly from employees who enter them on fake login pages. Advanced phishing kits capture session tokens in real time, bypassing MFA. IBM’s Cost of a Data Breach Report found phishing-originated breaches cost an average of $4.8 million.
Insider threats account for fewer incidents but often expose the most sensitive data. Insiders already have legitimate access, so they don’t trigger the same detection alerts as external attackers. The Alibaba/Taobao breach came from a developer at a consulting firm who scraped customer data for eight months.
Third-party compromises let attackers reach organizations through their vendors. The Target breach started with a compromised HVAC vendor. The MOVEit attacks in 2023 affected over 60 million people through a single supply chain vulnerability.
Now let’s look at specific data breach incidents. Each one offers lessons for security teams.
What Are the Biggest Data Breaches in History?
We’ve organized these data breach incidents by scale. First, the largest data breaches in history affecting over 1 billion records. Then major breaches between 100 million and 1 billion. We finish with incidents that changed how teams approach protection.
Mega Breaches: 1 Billion+ Records Exposed
These are the biggest data breaches in history by sheer volume of exposed data.
1. CAM4 (10.88 Billion Records)
Date: March 2020
Records Exposed: 10.88 billion
Data Types: Names, emails, sexual orientations, chat logs, payment information, device info, IP addresses
Root Cause: Misconfigured Elasticsearch database
CAM4 is an adult streaming platform that left an Elasticsearch database exposed without authentication. Security researchers discovered 7 terabytes of data sitting open on the internet. CAM4 took the database offline within 30 minutes of notification, but the exposure window is unknown.
The breach included extremely sensitive data. Chat transcripts and sexual orientations create serious blackmail risks for victims. Payment logs add sextortion potential on top of that. What’s striking about CAM4 is that despite being the largest breach by record count ever recorded, there’s no confirmed evidence the data was exploited before researchers discovered it. The window may have been short enough that criminals never found it.
Lesson: Asset inventory and configuration management matter. You can’t protect databases you don’t know exist. Elasticsearch clusters need authentication enabled by default. You also need processes to detect internet-exposed assets.
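One way to act on this is a recurring check over your asset inventory. The sketch below is a minimal illustration, not a scanner: the Asset record, its field names, and the sample hosts are all assumptions made up for this example. It only flags entries that are both internet-facing and unauthenticated.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    # Hypothetical inventory record; field names are assumptions for illustration.
    host: str
    service: str           # e.g. "elasticsearch", "mongodb"
    internet_facing: bool
    auth_enabled: bool


def find_exposed(assets):
    """Return assets reachable from the internet with no authentication."""
    return [a for a in assets if a.internet_facing and not a.auth_enabled]


# Sample inventory with invented hosts.
inventory = [
    Asset("search-01.internal.example", "elasticsearch", False, False),
    Asset("logs-02.example.com", "elasticsearch", True, False),
    Asset("db-03.example.com", "mongodb", True, True),
]

for asset in find_exposed(inventory):
    print(f"EXPOSED: {asset.service} on {asset.host}")
```

The check is only as good as the inventory feeding it, which is why discovery of unknown assets has to come first.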
2. Yahoo (3 Billion Accounts)
Date: August 2013 (disclosed 2016)
Records Exposed: 3 billion accounts
Data Types: Names, emails, phone numbers, dates of birth, hashed passwords, security questions
Root Cause: State-sponsored attack
Yahoo initially disclosed a breach affecting 1 billion accounts. Ten months later, they updated that number to 3 billion. That’s every Yahoo account that existed. The attackers were inside Yahoo’s network for three years before being discovered.
The timing was terrible. Yahoo was in the middle of selling its core business to Verizon for $4.8 billion. After the breach disclosures, that price dropped by $350 million.
This wasn’t Yahoo’s only major breach. A separate 2014 incident exposed 500 million accounts. In that one, state-sponsored attackers forged session cookies to bypass authentication entirely. Two Russian intelligence officers and two criminal hackers were later indicted over it.
Lesson: Detecting threats matters as much as preventing them. Attackers were inside Yahoo’s network for years. Continuous monitoring for unusual activity could have cut that dwell time dramatically. Forged cookies sidestep passwords and MFA alike, so protecting the keys that sign session cookies and managing sessions properly matter just as much.
3. National Public Data (2.9 Billion Records)
Date: December 2023 (disclosed August 2024)
Records Exposed: 2.9 billion
Data Types: Names, Social Security numbers, addresses, dates of birth, phone numbers
Root Cause: Unknown (company filed for bankruptcy)
National Public Data was a background check company that aggregated personal information from public sources and sold access to businesses. An attacker calling themselves “USDoD” initially listed the data for sale at $3.5 million. When no buyer emerged, another actor leaked the full dataset for free.
The breach exposed SSNs for potentially hundreds of millions of Americans. Even deceased individuals were included. The data makes identity theft possible at massive scale, and targeted social engineering becomes trivial with it. National Public Data filed for Chapter 11 bankruptcy in October 2024, partly due to litigation from the breach.
Lesson: Data aggregators create concentrated risk. When background check companies and credit bureaus get breached, the impact is catastrophic because they hold so much data in one place. Check how your third-party vendors handle sensitive data.
4. Aadhaar (1.1 Billion Records)
Date: 2018-2019 (multiple incidents)
Records Exposed: 1.1 billion
Data Types: Names, Aadhaar numbers (national ID), addresses, phone numbers, bank details
Root Cause: Exposed API endpoint
Aadhaar is India’s biometric identification system, covering over 90% of the population. The 12-digit ID numbers function like Social Security numbers for banking and utilities. Government services require them too.
A state-owned gas company called Indane left an API endpoint exposed that connected to the Aadhaar database. The API used a hardcoded access token. When decoded, the token literally translated to “INDAADHAARSECURESTATUS.” That’s not a security gap, that’s carelessness with a national ID system. No rate limiting meant attackers could enumerate all possible Aadhaar numbers.
The Indian government initially denied the breach despite multiple security researchers confirming the exposure. The data was being sold for about $7 on WhatsApp. It took months for the vulnerable API to be taken offline despite repeated notifications.
Lesson: API security requires authentication and rate limiting. Hardcoded tokens are a gift to attackers. Government denial doesn’t make breaches disappear. It just delays remediation while attackers exploit the data.
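To make the rate limiting point concrete, here is a minimal token bucket sketch. It is an illustration under stated assumptions, not how any real Aadhaar-connected service works: the class name, the capacity and refill values, and the per-client dictionary are all hypothetical. The clock is injectable so the behavior can be checked without waiting.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical per-client registry, keyed by API token or source IP.
buckets = {}


def allow_request(client_id, capacity=5, rate=1.0):
    bucket = buckets.setdefault(client_id, TokenBucket(capacity, rate))
    return bucket.allow()
```

A limiter like this slows enumeration of an ID space to a crawl, but it complements per-caller authentication rather than replacing it.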
5. Alibaba/Taobao (1.1 Billion Records)
Date: November 2019 (disclosed 2021)
Records Exposed: 1.1 billion
Data Types: Usernames, phone numbers
Root Cause: Web scraping via developer with elevated access
A software developer at a consulting company that provided services to Alibaba built a custom web crawler. The crawler scraped user data from Taobao, Alibaba’s massive e-commerce platform, for eight months before detection.
In China, phone numbers are tied to real identities because registration requires valid government ID. This makes phone number exposure particularly dangerous. The scraped data makes account takeover and social engineering easier. It also opens up doxxing.
The developer and a colleague received three-year prison sentences. While passwords weren’t exposed, the combination of usernames and phone numbers provides enough information for targeted attacks.
Lesson: Monitor contractors the same way you monitor employees. Elevated API access should trigger additional logging and anomaly detection. Eight months of scraping means nobody was watching data access patterns.
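Watching data access patterns can start very simply. The sketch below assumes a hypothetical daily access log of (principal, records_read) pairs and flags anyone who reads far more than the median peer. The log format, the account names, and the 10x multiplier are assumptions for illustration, not a tuned detection rule.

```python
from collections import Counter
from statistics import median


def flag_heavy_readers(access_log, multiplier=10):
    """access_log: iterable of (principal, records_read) tuples for one day.

    Flags principals whose daily total exceeds `multiplier` times the median total.
    """
    totals = Counter()
    for principal, records in access_log:
        totals[principal] += records
    if not totals:
        return []
    baseline = median(totals.values())
    return sorted(p for p, n in totals.items() if n > multiplier * baseline)


# Invented sample data: one account reads orders of magnitude more than its peers.
sample_log = [
    ("alice", 120),
    ("bob", 90),
    ("contractor-7", 50000),
    ("carol", 150),
    ("alice", 30),
]

print(flag_heavy_readers(sample_log))
```

A median baseline is used here because a single heavy scraper would drag a mean upward and hide itself.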
Major Data Breaches: 100 Million to 1 Billion Records
These famous data breaches hit hundreds of millions of people and reshaped how companies handle security.
6. LinkedIn (700 Million Users)
Date: June 2021
Records Exposed: 700 million profiles
Data Types: Email addresses, phone numbers, physical addresses, professional history, geolocation data
Root Cause: API scraping
An attacker using the name “TomLiner” scraped approximately 90% of LinkedIn’s user base by exploiting the platform’s API. LinkedIn initially claimed this wasn’t a breach because no private data was accessed. But the scraped data included information users reasonably expected to be protected.
This was actually LinkedIn’s second major incident. In 2012, attackers breached LinkedIn and stole 164 million credentials. The passwords were hashed with SHA1 without salting. That made them easy to crack.
Lesson: API rate limiting and abuse detection are essential. Just because data is technically accessible doesn’t mean scraping at scale is acceptable. Affected users don’t care whether you call it a “breach” or “scraping.” The exposure is what matters.
7. Facebook (533 Million Users)
Date: 2019 (data leaked publicly April 2021)
Records Exposed: 533 million profiles
Data Types: Phone numbers, Facebook IDs, names, locations, birthdates, email addresses
Root Cause: Contact importer feature exploitation
Attackers exploited Facebook’s contact importer feature, designed to help users find friends. By uploading massive sets of phone numbers, attackers could see which numbers matched Facebook accounts. They then queried those accounts to pull profile information.
Facebook didn’t notify users about the exposure because, as they put it, the data had been “widely distributed” and couldn’t be identified to notify specific users. The vulnerability was patched in August 2019.
Lesson: Features designed for user convenience often create security risks. The contact importer was legitimate functionality weaponized at scale. Abuse detection needs to catch automated exploitation of otherwise normal features.
8. Marriott/Starwood (500 Million Records)
Date: 2014-2018 (disclosed November 2018)
Records Exposed: 500 million guest records
Data Types: Names, addresses, phone numbers, email, passport numbers, payment card details, Starwood Preferred Guest numbers
Root Cause: Compromised Starwood network (pre-acquisition)
Attackers compromised the Starwood hotel chain’s reservation system in 2014. When Marriott acquired Starwood in 2016, they inherited the compromised network without knowing it. The breach wasn’t discovered until September 2018 when Marriott’s security tool flagged suspicious database queries.
For four years, attackers had access to guest records. The breach included encrypted payment card data, but the encryption keys may have been compromised too. Passport numbers were also exposed. That creates identity theft risks beyond typical breaches.
UK regulators fined Marriott £18.4 million. The US Federal Trade Commission required Marriott to implement a full security overhaul.
Lesson: When you acquire a company, you acquire their security problems. Marriott inherited Starwood’s breach because nobody checked for existing compromises. Any acquisition should include breach detection as part of integration.
9. Yahoo (500 Million Accounts)
Date: Late 2014 (disclosed September 2016)
Records Exposed: 500 million accounts
Data Types: Names, email addresses, phone numbers, dates of birth, hashed passwords, security questions
Root Cause: State-sponsored attack using forged cookies
This is Yahoo’s other major breach, separate from the 2013 incident. Attackers stole Yahoo’s cryptographic keys used to sign session cookies. With those keys, they could mint valid authentication tokens for any account without knowing the password. They specifically targeted accounts of interest to Russian intelligence, like journalists and government officials.
Two FSB officers and two criminal attackers were indicted. One was arrested in Canada and extradited to the US. The others remain in Russia.
Lesson: Session management is as important as password security. Cookie-based attacks bypass password controls entirely. Session tokens need proper expiration and secure storage. Anomaly detection for unusual access patterns adds another layer.
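The sketch below shows why the signing key is the crown jewel in this kind of design. It is a simplified, hypothetical token format (user id, expiry, HMAC-SHA256 signature), not Yahoo’s actual cookie scheme. The function names, the 15-minute lifetime, and the "|" separator are assumptions for illustration.

```python
import hashlib
import hmac
import time


def issue_token(user_id, key, ttl_seconds=900, now=None):
    """Return 'user_id|expiry|signature', signed with HMAC-SHA256."""
    current = time.time() if now is None else now
    payload = f"{user_id}|{int(current + ttl_seconds)}"
    sig = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"


def verify_token(token, key, now=None):
    """Return the user id if the signature is valid and the token is unexpired."""
    current = time.time() if now is None else now
    try:
        user_id, expiry, sig = token.rsplit("|", 2)
    except ValueError:
        return None
    expected = hmac.new(key, f"{user_id}|{expiry}".encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(sig, expected):
        return None
    if not expiry.isdigit() or int(expiry) < current:
        return None
    return user_id
```

Note that verify_token never looks at a password. Anyone holding the key can call issue_token for any user, so the key needs restricted storage and rotation, and short expiry limits how long a stolen or forged token stays useful.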
10. Adult Friend Finder (412 Million Accounts)
Date: October 2016
Records Exposed: 412 million accounts across six databases
Data Types: Usernames, email addresses, passwords (SHA1 hashed), dates of last visits, browser information
Root Cause: Local file inclusion vulnerability
FriendFinder Networks operated several adult dating sites. Attackers exploited a local file inclusion vulnerability that allowed code execution on the web server. They extracted 20 years of data from Adultfriendfinder.com and Cams.com, among other properties.
Over 15 million “deleted” accounts were still in the database. Passwords were hashed with SHA1, which is trivial to crack. The sensitive nature of the sites made the breach particularly damaging. Victims faced real extortion risks.
Lesson: Data retention policies matter. Keeping 20 years of data, including “deleted” accounts, massively expanded the breach impact. Weak password hashing (SHA1 without salt) meant exposed credentials were easily cracked.
11. MySpace (360 Million Accounts)
Date: 2008 (disclosed 2016)
Records Exposed: 360 million accounts
Data Types: Email addresses, usernames, SHA1 hashed passwords
Root Cause: Unknown
This breach happened around 2008 but wasn’t publicly known until 2016 when the data appeared for sale on the dark web. The seller asked for six bitcoins, about $3,000 at the time.
MySpace’s password hashing was embarrassingly weak. They only hashed the first ten characters of passwords, converted to lowercase, with no salt. Cracking was trivial.
Lesson: Even defunct platforms create ongoing risk. Users who reused their MySpace passwords on other services stayed vulnerable years later. Proper password hashing with unique salts is essential, and there’s no statute of limitations on breach discovery.
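The difference is easy to see side by side. The first function below reproduces the weak scheme as described above (first ten characters, lowercased, unsalted SHA-1). The second is a minimal sketch of salted hashing using PBKDF2 from Python’s standard library. The iteration count and salt length are assumptions chosen for illustration; tune them for your own hardware and policy.

```python
import hashlib
import hmac
import os


def weak_hash(password):
    """The scheme described above: first ten characters, lowercased, unsalted SHA-1."""
    return hashlib.sha1(password[:10].lower().encode()).hexdigest()


def hash_password(password, iterations=600_000):
    """Salted PBKDF2-HMAC-SHA256. Returns (salt, derived_key)."""
    salt = os.urandom(16)  # unique random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, key


def check_password(password, salt, key, iterations=600_000):
    """Re-derive the key with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, key)


# Two different passwords collapse to the same value under the weak scheme.
print(weak_hash("Password12345") == weak_hash("password12XYZ"))
```

With a unique salt, two users with the same password get different stored values, so cracking one reveals nothing about the other.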
12. Equifax (147 Million Records)
Date: May-July 2017 (disclosed September 2017)
Records Exposed: 147 million Americans
Data Types: Names, Social Security numbers, birth dates, addresses, driver’s license numbers
Root Cause: Unpatched Apache Struts vulnerability (CVE-2017-5638)
Equifax is one of the three major credit bureaus. Attackers exploited a vulnerability in Apache Struts that had been patched two months earlier. An expired SSL certificate prevented Equifax’s intrusion detection from seeing the data exfiltration for 76 days.
The breach exposed sensitive financial data for nearly half of all American adults. The Equifax breach looks like carelessness, not sophistication. Equifax’s response made things worse. Its official Twitter account repeatedly directed users to a lookalike site instead of the real breach notification website. Congressional hearings followed. Executives departed. Costs exceeded $1.38 billion.
For a deeper analysis, see our full Equifax data breach case study.
Lesson: Patching delays have catastrophic consequences. Two months of exposure to a known vulnerability led to one of the worst breaches in history. Security basics like certificate management and network segmentation would have contained the damage.
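A patch deadline only helps if something checks it. The sketch below assumes a hypothetical list of findings exported from a vulnerability scanner and reports unpatched items that are past a service-level deadline. The record fields, the system names, the dates, and the 30-day SLA are all invented for illustration; only the CVE identifier comes from the incident above.

```python
from datetime import date


def overdue_patches(findings, today, sla_days=30):
    """Return (system, cve, days_open) for unpatched findings past the SLA.

    findings: list of dicts with 'system', 'cve', 'patch_released' (date), 'patched' (bool).
    """
    overdue = []
    for f in findings:
        days_open = (today - f["patch_released"]).days
        if not f["patched"] and days_open > sla_days:
            overdue.append((f["system"], f["cve"], days_open))
    return overdue


# Hypothetical findings; dates are illustrative, not the real incident timeline.
findings = [
    {"system": "dispute-portal", "cve": "CVE-2017-5638",
     "patch_released": date(2024, 1, 10), "patched": False},
    {"system": "intranet-wiki", "cve": "CVE-2017-5638",
     "patch_released": date(2024, 1, 10), "patched": True},
]

for system, cve, days in overdue_patches(findings, today=date(2024, 3, 10)):
    print(f"{system}: {cve} unpatched for {days} days")
```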
13. Change Healthcare (100+ Million Records)
Date: February 2024
Records Exposed: 100+ million (potentially one-third of Americans)
Data Types: Health records, insurance claims, Social Security numbers, financial information
Root Cause: Compromised credentials, lack of MFA
Change Healthcare processes insurance claims for most of American healthcare. ALPHV/BlackCat ransomware affiliates used stolen credentials to access a Citrix portal that lacked multi-factor authentication.
The attack disrupted healthcare operations across the US. Pharmacies couldn’t process prescriptions. Providers couldn’t submit claims. UnitedHealth Group, Change Healthcare’s parent company, paid a $22 million ransom. They later reported the breach may have hit 100 million people.
Lesson: MFA on remote access is non-negotiable. A single portal without MFA led to one of the largest healthcare data breaches ever. Critical infrastructure needs security controls that match its importance.
14. Experian/Court Ventures (200 Million Records)
Date: 2013 (disclosed 2014)
Records Exposed: 200 million
Data Types: Names, addresses, Social Security numbers, dates of birth, financial information
Root Cause: Inadequate acquisition due diligence
A Vietnamese national named Hieu Minh Ngo purchased access to Experian’s databases through Court Ventures, a data broker Experian had acquired. Ngo posed as a private investigator in Singapore to gain access. He ran an identity theft service selling SSN lookups for years before being caught.
This exposed a major flaw in data broker acquisitions. Experian inherited Court Ventures’ lax customer vetting practices without proper security review. Ngo eventually got 13 years in prison.
Lesson: Vet who gets access to your data. Any company selling access to sensitive data needs strict customer vetting, and that vetting has to be repeated for customers inherited through an acquisition.
15. Adobe (153 Million Records)
Date: October 2013
Records Exposed: 153 million user records
Data Types: Email addresses, encrypted passwords, password hints, usernames
Root Cause: Network intrusion and poor password storage
Attackers breached Adobe’s network and stole both customer data and source code for products including Acrobat and ColdFusion. The password storage was particularly problematic. Adobe used 3DES encryption instead of proper hashing. Every password was encrypted with the same key, so identical passwords produced identical ciphertext.
Password hints were stored in plaintext alongside the encrypted passwords. Security researchers could often guess passwords just by reading the hints, and one guessed password revealed every account that shared it.
Lesson: Password hashing matters. Encryption is not a substitute for proper password hashing with unique salts. Storing password hints in plaintext defeats the purpose of encrypting passwords.
16. eBay (145 Million Records)
Date: February-March 2014 (disclosed May 2014)
Records Exposed: 145 million users
Data Types: Names, encrypted passwords, email addresses, physical addresses, phone numbers, dates of birth
Root Cause: Compromised employee credentials
Attackers compromised a small number of employee login credentials and used that access to reach the user database. The breach went undetected for over two months. eBay was also criticized for its slow disclosure. The intrusion began in late February, but users weren’t notified until May.
Financial information was stored separately and wasn’t compromised. But the combination of names and addresses opened the door to identity theft. Birthdates made targeted phishing easier too.
Lesson: Employee credential monitoring is critical. Attackers used valid credentials to access systems without triggering alerts. Faster detection and disclosure protect users from extended exposure windows.
17. Canva (137 Million Records)
Date: May 2019
Records Exposed: 137 million user accounts
Data Types: Usernames, email addresses, names, cities, countries, bcrypt-hashed passwords
Root Cause: Unknown vulnerability exploited by GnosticPlayers
An attacker known as “GnosticPlayers” breached the Australian design platform and tried to sell the data. Canva detected the attack in progress and interrupted it before all data was exfiltrated. They forced password resets for affected users.
The passwords were hashed with bcrypt, a strong algorithm that makes cracking difficult. Some users who authenticated via Google had their tokens exposed. Canva’s rapid response contained the damage.
Lesson: Strong cryptographic protections keep users safe even when breaches happen. Canva’s use of bcrypt meant most passwords stayed secure. Catching attacks in progress allows faster response.
18. Heartland Payment Systems (130 Million Cards)
Date: 2008 (disclosed January 2009)
Records Exposed: 130 million credit and debit card numbers
Data Types: Card numbers, expiration dates, cardholder names
Root Cause: SQL injection leading to malware installation
Attackers used SQL injection to install malware on Heartland’s payment processing systems. The malware captured card data as it passed through for processing. At the time, this was the largest payment card breach ever recorded.
Albert Gonzalez, who also orchestrated the TJX breach, was convicted and sentenced to 20 years in federal prison. The incident pushed the industry toward end-to-end encryption for payment processing.
Lesson: Input validation prevents SQL injection. Payment systems require encryption of data in transit, not just at rest. This breach accelerated adoption of PCI DSS security standards.
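Alongside input validation, the standard defense is to keep user input out of the SQL text entirely by using parameterized queries. The sketch below uses Python’s built-in sqlite3 module with an invented merchants table. It illustrates the general technique and is not a reconstruction of Heartland’s systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE merchants (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO merchants (name) VALUES (?)", [("acme",), ("globex",)])


def find_unsafe(name):
    # Vulnerable: user input is concatenated into the SQL text.
    query = f"SELECT id, name FROM merchants WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(name):
    # Parameterized: the value is passed separately and never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM merchants WHERE name = ?", (name,)
    ).fetchall()


payload = "x' OR '1'='1"
print(len(find_unsafe(payload)))  # every row comes back
print(len(find_safe(payload)))    # no merchant has that literal name
```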
Notable Cyber Security Breaches That Changed Practices
These breaches may have exposed fewer records, but they changed how security teams approach protection.
19. MOVEit (60+ Million Affected)
Date: May-June 2023
Records Exposed: 60+ million across 2,500+ organizations
Data Types: Varied by victim (PII, financial data, health records)
Root Cause: Zero-day SQL injection in MOVEit Transfer (CVE-2023-34362)
The Cl0p ransomware gang exploited a zero-day vulnerability in Progress Software’s MOVEit file transfer application. They compromised thousands of organizations in a single campaign. Government agencies were hit. So were banks and healthcare providers.
This was a supply chain attack at scale. Organizations that used MOVEit became victims even though their own security wasn’t breached. British Airways was hit. So were the BBC and Shell. Numerous US government agencies were affected too.
Lesson: Your security doesn’t matter if your vendors aren’t secure. MOVEit victims had solid security themselves. Know what software your organization runs and watch it for new vulnerabilities.
20. Target (40 Million Payment Cards)
Date: November-December 2013
Records Exposed: 40 million payment cards, 70 million customer records
Data Types: Payment card numbers, CVVs, PINs, names, addresses, phone numbers, email addresses
Root Cause: Third-party HVAC vendor compromise
Attackers compromised Fazio Mechanical Services, an HVAC vendor that had network access for billing purposes. From there, they moved laterally to Target’s point-of-sale systems and deployed memory-scraping malware.
Target’s security team in India actually detected the intrusion and alerted US staff. The alerts were ignored. What’s maddening about Target is that the detection worked. The people receiving the alerts didn’t. The breach was eventually discovered when the Secret Service notified Target that stolen cards were being used for fraud.
For the complete story, see our Target data breach analysis.
Lesson: Third-party access requires network segmentation. An HVAC vendor shouldn’t have any path to payment systems. Alert fatigue leads to ignored warnings. Security teams need processes to prioritize and act on detections.
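Segmentation rules can be tested like any other requirement. The sketch below models allowed zone-to-zone flows as a graph and asks whether any path exists from a vendor zone to the payment zone. The zone names and both rule sets are hypothetical and do not describe Target’s actual network.

```python
from collections import deque


def reachable(allowed_flows, source, target):
    """Breadth-first search over allowed flows. True if any path links source to target."""
    seen, queue = {source}, deque([source])
    while queue:
        zone = queue.popleft()
        if zone == target:
            return True
        for nxt in allowed_flows.get(zone, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Flat network: the vendor portal reaches the corporate LAN, which reaches everything.
flat = {"vendor-portal": ["corp-lan"], "corp-lan": ["pos-network", "billing"]}

# Segmented: the vendor portal can reach billing only.
segmented = {"vendor-portal": ["billing"], "corp-lan": ["pos-network"]}

print(reachable(flat, "vendor-portal", "pos-network"))
print(reachable(segmented, "vendor-portal", "pos-network"))
```

Running a check like this against exported firewall rules turns "vendors can’t reach payment systems" from an assumption into something you can verify after every rule change.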
21. Home Depot (56 Million Payment Cards)
Date: April-September 2014
Records Exposed: 56 million payment cards, 53 million email addresses
Data Types: Payment card numbers, customer email addresses
Root Cause: Stolen vendor credentials and custom malware
Attackers used credentials stolen from a third-party vendor to access Home Depot’s network. They deployed custom point-of-sale malware across self-checkout systems in US and Canadian stores. The malware ran for five months before detection.
Home Depot’s breach came just months after Target’s. Despite the Target incident making headlines, Home Depot hadn’t rolled out many of the same controls that could have prevented a similar attack.
See our detailed Home Depot data breach breakdown.
Lesson: Industry breaches should trigger immediate security reviews. Home Depot had months of warning from Target’s incident but didn’t adequately respond. Self-checkout systems were a different attack surface than traditional registers.
22. T-Mobile (37 Million Customers)
Date: Late 2022 - January 2023
Records Exposed: 37 million current customers
Data Types: Names, billing addresses, email, phone numbers, dates of birth, account numbers
Root Cause: Exploited API vulnerability
This was T-Mobile’s eighth major breach since 2018. Attackers exploited a vulnerability in one of T-Mobile’s APIs to pull customer data. The breach began in late November 2022 and wasn’t detected until January 2023.
Previous T-Mobile breaches included a 2021 incident affecting 76 million people and a 2020 breach targeting employee email accounts. At some point you have to ask whether it’s a security problem or a leadership problem.
Lesson: Repeated breaches point to systemic security failures. Securing APIs means locking down authentication and watching for abuse. Companies that experience multiple breaches need to rethink their security architecture.
23. 23andMe (7 Million Profiles)
Date: October 2023
Records Exposed: 6.9 million genetic profiles
Data Types: Genetic ancestry data, names, birth years, family tree information
Root Cause: Credential stuffing attack
Attackers used credentials leaked in other breaches to access 23andMe accounts. Because 23andMe’s “DNA Relatives” feature lets users share information with genetic matches, compromising one account exposed data from many relatives.
The genetic data included ancestry percentages and ethnic background. Health predispositions were exposed too. This information can’t be changed like a password. That’s what makes genetic breaches uniquely dangerous.
Lesson: Credential stuffing attacks exploit password reuse across services. Features that share data between users make breach impact worse. Genetic and biometric data creates permanent risks that never go away.
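Credential stuffing tends to look different from a user mistyping a password: one source fails against many different accounts. The sketch below is a minimal detector over a hypothetical list of failed login events. The event format, the sample IP addresses, and the threshold of 20 accounts are assumptions for illustration.

```python
from collections import defaultdict


def suspected_stuffing(failed_logins, min_accounts=20):
    """failed_logins: iterable of (source_ip, username) pairs.

    Returns {ip: distinct_account_count} for IPs that failed against many accounts.
    """
    accounts_by_ip = defaultdict(set)
    for ip, username in failed_logins:
        accounts_by_ip[ip].add(username)
    return {
        ip: len(users)
        for ip, users in accounts_by_ip.items()
        if len(users) >= min_accounts
    }


# Invented events: one IP tries 25 accounts, another retries a single account.
events = [("203.0.113.9", f"user{i}") for i in range(25)]
events += [("198.51.100.4", "alice")] * 5

print(suspected_stuffing(events))
```

Attackers who spread attempts across many IPs will evade a per-IP rule, so this is a starting signal to combine with MFA and breached-password checks.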
24. PowerSchool (Unknown Millions)
Date: December 2024 - January 2025
Records Exposed: Millions of students and teachers (exact number unknown)
Data Types: Names, addresses, Social Security numbers, medical records, grades, disciplinary records
Root Cause: Compromised support credentials
PowerSchool provides student information systems to 16,000 school districts. Attackers used compromised credentials to access the PowerSource customer support portal, then used a maintenance tool to export student and teacher data.
The breach exposed decades of historical data, including records of students who graduated years ago. Some districts reported exposure of medical information and Social Security numbers. Disciplinary records leaked too.
Lesson: Support portals and admin tools need the same security as production systems. Maintenance access that can bulk export data requires extra controls and monitoring. Keeping historical data makes breaches worse.
25. AT&T (73 Million Customers)
Date: 2019 (disclosed 2024)
Records Exposed: 73 million current and former customers
Data Types: Names, email addresses, mailing addresses, phone numbers, Social Security numbers, dates of birth
Root Cause: Unknown
AT&T initially denied the data was theirs when it appeared on the dark web in 2021. In March 2024, they confirmed the breach after security researchers demonstrated the data was authentic. The dataset covered about 7.6 million current and 65.4 million former account holders, and it included Social Security numbers.
AT&T reset passcodes for 7.6 million current customers as a precaution. The source of the breach remains publicly unknown.
Lesson: Breach investigation and disclosure take time, but denial delays user protection. Build processes to quickly validate whether leaked data is authentic.
26. Uber (57 Million Users)
Date: October 2016 (disclosed November 2017)
Records Exposed: 57 million riders and drivers
Data Types: Names, email addresses, phone numbers, driver’s license numbers (for drivers)
Root Cause: Exposed AWS credentials in GitHub repository
Attackers found AWS credentials in a private GitHub repository used by Uber engineers. They used those credentials to access an Amazon S3 bucket containing rider and driver information.
Instead of disclosing the breach, Uber paid the attackers $100,000 through their bug bounty program and had them sign NDAs. This cover-up led to criminal charges against Uber’s former security chief, who was convicted of obstruction.
Lesson: Secret sprawl in code repositories is a common vulnerability. Credential scanning and secrets management are essential. Covering up breaches creates legal liability and damages trust far more than disclosure.
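Credential scanning can begin with a pattern match run before code is committed. The sketch below looks for strings shaped like AWS access key IDs, which start with AKIA (long-term keys) or ASIA (temporary keys) followed by 16 uppercase letters or digits. It is a minimal illustration. The function name and sample text are hypothetical, and the key shown is the placeholder AWS uses in its own documentation.

```python
import re

# Shape of an AWS access key ID: a 4-letter prefix plus 16 uppercase alphanumerics.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def scan_text(text):
    """Return (line_number, match) pairs for strings shaped like AWS access key IDs."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in AWS_ACCESS_KEY_ID.finditer(line):
            hits.append((lineno, m.group()))
    return hits


sample = 'region = "us-east-1"\naws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\n'

for lineno, key_id in scan_text(sample):
    print(f"line {lineno}: possible AWS access key ID {key_id}")
```

A pattern like this catches only one credential type. Dedicated secret scanners cover many more formats, and any key that was ever committed should be rotated rather than just deleted.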
27. Capital One (106 Million Records)
Date: March-July 2019
Records Exposed: 106 million credit card applications
Data Types: Names, addresses, phone numbers, email, dates of birth, income, credit scores, Social Security numbers
Root Cause: Misconfigured web application firewall
A former Amazon Web Services employee exploited a misconfigured firewall to access Capital One’s cloud infrastructure. She used server-side request forgery (SSRF) to obtain AWS credentials and extract data from S3 buckets.
The attacker, Paige Thompson, was caught after bragging about the hack on social media and Slack. She was convicted in 2022. Capital One paid $80 million in regulatory penalties.
Lesson: Cloud security misconfigurations create massive exposure. SSRF attacks against metadata endpoints remain common. Security monitoring should catch unusual data access patterns.
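One common mitigation for SSRF against metadata endpoints is to refuse outbound requests to internal addresses. The sketch below is an illustration under assumptions, not a description of Capital One’s environment: the helper names are hypothetical, and it only covers the address check. The cloud metadata endpoint 169.254.169.254 is a link-local address, so it is caught by the same test as other internal ranges.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_internal_address(host_ip: str) -> bool:
    """True for loopback, private, and link-local addresses."""
    # Drop an IPv6 scope suffix such as "%eth0" before parsing.
    ip = ipaddress.ip_address(host_ip.split("%")[0])
    return ip.is_loopback or ip.is_private or ip.is_link_local


def is_safe_to_fetch(url: str) -> bool:
    """Reject URLs whose host resolves to any internal address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False  # Unresolvable hosts are rejected, not fetched.
    return all(not is_internal_address(info[4][0]) for info in infos)


print(is_safe_to_fetch("http://169.254.169.254/latest/meta-data/"))
```

A check like this has to run against the address actually used for the connection, since a hostname can resolve differently between validation and fetch.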
28. Sina Weibo (538 Million Accounts)
Date: March 2020
Records Exposed: 538 million accounts
Data Types: Names, usernames, genders, locations, phone numbers
Root Cause: Allegedly from a phone-to-account lookup feature
Sina Weibo (now just Weibo) is one of China’s largest social media platforms. An attacker sold the data for about $250, which suggests they didn’t understand its value.
Weibo initially claimed the data came from publicly available information through a phone number lookup feature. They later admitted that if users reused passwords, the exposed data could be used to compromise other accounts.
Lesson: Features that map phone numbers to accounts create scraping opportunities. Phone numbers in China are tied to real identities. That makes their exposure particularly sensitive.
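Rate limiting is one way to make a phone-to-account lookup harder to scrape in bulk. The following is a rough sketch of a sliding-window limiter, assuming lookups can be attributed to a client identifier. The class name LookupLimiter and the limits used are hypothetical, and nothing here reflects how Weibo’s feature was built.

```python
from collections import defaultdict, deque


class LookupLimiter:
    """Allow at most max_lookups per client within window_seconds."""

    def __init__(self, max_lookups: int, window_seconds: float) -> None:
        self.max_lookups = max_lookups
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # client_id -> recent timestamps

    def allow(self, client_id: str, now: float) -> bool:
        recent = self._history[client_id]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_lookups:
            return False
        recent.append(now)
        return True


limiter = LookupLimiter(max_lookups=3, window_seconds=60)
results = [limiter.allow("client-a", now=t) for t in (0, 1, 2, 3, 61)]
print(results)  # [True, True, True, False, True]
```

Per-client limits slow a single scraper. Attackers who rotate identities need additional controls, such as global lookup budgets.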
29. Snowflake Campaign / Ticketmaster (560 Million Records)
Date: April-June 2024
Records Exposed: 560 million Ticketmaster customers, plus 160+ other organizations
Data Types: Names, emails, phone numbers, payment information (Ticketmaster); varied by victim
Root Cause: Stolen credentials from infostealer malware, no MFA on Snowflake accounts
A group tracked as UNC5537 used credentials stolen by infostealer malware to log into Snowflake cloud accounts belonging to over 160 companies. Most of these accounts had no MFA enabled. The attackers didn’t hack Snowflake itself. They logged in as customers using passwords that infostealers had harvested, some dating back to 2020.
Ticketmaster was the biggest victim. ShinyHunters listed 560 million customer records for sale at $500,000. Santander, Advance Auto Parts, Neiman Marcus, and LendingTree were also hit. Snowflake responded by making MFA mandatory for all accounts going forward. Hard to argue that shouldn’t have been the default from day one.
Lesson: Cloud platforms are only as secure as the credentials protecting them. Infostealer infections on individual employee and contractor machines gave attackers the keys to 160+ companies. MFA on cloud accounts isn’t optional. Old stolen credentials stay dangerous for years.
30. AT&T Call Records (110 Million Customers)
Date: April 2024 (disclosed July 2024)
Records Exposed: ~110 million customers
Data Types: Call and text metadata (numbers called, call duration, cell tower IDs)
Root Cause: Stolen Snowflake credentials, no MFA
Part of the same Snowflake campaign as Ticketmaster. Attackers accessed AT&T’s Snowflake environment for 11 days in April 2024 and stole six months of call and text records for nearly all AT&T wireless customers.
The stolen data didn’t include names or Social Security numbers. But call metadata reveals who you talk to and how often, plus roughly where you are. That’s enough for surveillance or blackmail. AT&T reportedly paid about $370,000, negotiated down from a $1 million demand, in exchange for a video of the attackers deleting the data.
Lesson: Metadata is sensitive even without names attached. Call records reveal relationships and patterns that attackers can exploit. This was AT&T’s second major breach disclosure in 2024. The source of the first remains publicly unknown, but this one traced back to stolen credentials on an account without MFA.
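To see why metadata alone is revealing, consider a small sketch that builds a profile from nothing but call records. The records, phone numbers, and tower IDs below are invented for illustration, and the profile function is hypothetical.

```python
from collections import Counter

# Hypothetical records: (caller, callee, duration_seconds, cell_tower_id).
# No names or message content, only metadata.
records = [
    ("555-0100", "555-0111", 620, "tower-17"),
    ("555-0100", "555-0111", 540, "tower-17"),
    ("555-0100", "555-0122", 35, "tower-04"),
    ("555-0100", "555-0111", 710, "tower-17"),
]


def profile(caller: str, rows: list) -> dict:
    """Summarize who a number contacts most and where it usually is."""
    mine = [r for r in rows if r[0] == caller]
    contacts = Counter(r[1] for r in mine)
    towers = Counter(r[3] for r in mine)
    return {
        "top_contact": contacts.most_common(1)[0],
        "usual_tower": towers.most_common(1)[0],
    }


print(profile("555-0100", records))
```

Four rows are enough to identify a closest contact and a usual location. Six months of records for one subscriber would show far more.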
What Do These Data Breach Examples Have in Common?
Thirty breaches, and the same mistakes keep showing up.
Credential monitoring is the continuous process of scanning dark web marketplaces and stealer logs for your organization’s exposed credentials. Breach compilations get scanned too. Monitoring lets security teams force password resets before attackers can use the leaked credentials to log in.
Stolen credentials caused many attacks. Whether through phishing, credential stuffing, or infostealer malware, attackers logged in rather than breaking in. The Snowflake campaign showed how credentials harvested by infostealers, some of them years old, can compromise 160+ companies when MFA is missing. Monitoring for compromised credentials catches exposure before exploitation.
Detection took too long. The Marriott breach went undetected for four years. Yahoo attackers operated for three years. Equifax missed 76 days of data exfiltration. Every day of undetected access multiplies the damage.
Third-party relationships created exposure. Target and Home Depot were both compromised through vendors. MOVEit victims faced supply chain compromise at scale. Third-party risk management is essential because your security is only as strong as your weakest vendor.
Basic security failures caused catastrophic breaches. Unpatched vulnerabilities and misconfigured databases show up repeatedly. So do exposed credentials in code repositories and missing MFA. Advanced attacks are rare. Basic failures are common.
Data retention made things worse. MySpace was still holding credentials for long-abandoned accounts. PowerSchool kept historical student records. Adult Friend Finder’s “deleted” accounts were still sitting there when attackers found them. In each case, the company kept data longer than necessary.
How Can You Prevent Data Breaches?
To prevent breaches, address the patterns that appear across these incidents.
Monitor for compromised credentials continuously. When employee credentials appear in third-party breaches or stealer logs, reset exposed passwords before they’re exploited. Compromised credential monitoring detects exposure across dark web sources.
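At its simplest, the matching step behind credential monitoring compares leaked entries against the accounts you still have active. The sketch below assumes a dump in "email:password" form, which is one common layout but not the only one. The function name exposed_accounts and the sample data are hypothetical.

```python
def exposed_accounts(leak_lines, domain, active_accounts):
    """Return active accounts whose email appears in a leak dump.

    leak_lines: iterable of "email:password" strings (assumed format).
    domain: the organization's email domain, e.g. "example.com".
    active_accounts: set of lowercase email addresses still in use.
    """
    suffix = "@" + domain.lower()
    exposed = set()
    for line in leak_lines:
        email, sep, _password = line.strip().partition(":")
        email = email.lower()
        # Skip malformed lines, other domains, and deprovisioned accounts.
        if sep and email.endswith(suffix) and email in active_accounts:
            exposed.add(email)
    return sorted(exposed)


dump = ["Alice@example.com:hunter2", "bob@other.org:pw", "carol@example.com:pw", "garbage"]
active = {"alice@example.com", "dave@example.com"}
print(exposed_accounts(dump, "example.com", active))  # ['alice@example.com']
```

The output is the list of accounts that need a forced password reset. The leaked passwords themselves never need to be stored.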
Enforce MFA everywhere. Change Healthcare fell because a portal lacked MFA. The Snowflake campaign hit 160+ companies for the same reason. Microsoft has reported that MFA blocks more than 99.9% of account compromise attacks. Prioritize phishing-resistant methods like FIDO2 for high-value accounts.
Patch vulnerabilities promptly. The Equifax breach exploited a vulnerability that had been public for two months. WannaCry exploited a flaw that Microsoft had patched 59 days earlier. Establish patching SLAs that prioritize internet-facing systems and known-exploited vulnerabilities.
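The 59-day figure for WannaCry is simple date arithmetic, and the same calculation can drive an SLA check. The sketch below uses the public dates for the MS17-010 patch release and the WannaCry outbreak. The is_overdue helper and the 30-day SLA are hypothetical values chosen for illustration.

```python
from datetime import date

# Microsoft released MS17-010 on March 14, 2017.
# WannaCry began spreading on May 12, 2017.
patch_released = date(2017, 3, 14)
outbreak = date(2017, 5, 12)
exposure_days = (outbreak - patch_released).days
print(exposure_days)  # 59


def is_overdue(released: date, today: date, sla_days: int) -> bool:
    """True when a patch has been available longer than the SLA allows."""
    return (today - released).days > sla_days


# With a hypothetical 30-day SLA, the patch was overdue well before the outbreak.
print(is_overdue(patch_released, outbreak, sla_days=30))  # True
```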
Segment networks and limit blast radius. Target’s HVAC vendor shouldn’t have had any path to payment systems. Assume breaches will happen and architect systems so compromising one component doesn’t expose everything.
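Segmentation rules can be tested by asking whether any chain of allowed connections links an untrusted zone to a sensitive one. The sketch below runs a breadth-first search over a hypothetical allow-list of flows. The zone names are invented and do not describe Target’s actual network.

```python
from collections import deque

# Hypothetical allow-list of network flows: zone -> zones it may connect to.
allowed_flows = {
    "vendor-portal": {"billing-app"},
    "billing-app": {"corp-lan"},
    "corp-lan": {"payment-systems"},
    "payment-systems": set(),
}


def can_reach(flows: dict, source: str, target: str) -> bool:
    """Breadth-first search over allowed flows, following multi-hop paths."""
    seen = {source}
    queue = deque([source])
    while queue:
        zone = queue.popleft()
        if zone == target:
            return True
        for nxt in flows.get(zone, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


print(can_reach(allowed_flows, "vendor-portal", "payment-systems"))  # True
```

No single rule here connects the vendor portal to payment systems, yet a three-hop path exists. Checking reachability rather than individual rules is what exposes it.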
Manage third-party risk actively. Assess vendor security before granting access. Monitor for vendor breaches that could affect your organization. Limit vendor access to only what’s necessary.
Minimize data retention. Don’t keep data you don’t need. Deleted accounts should actually be deleted. Historical records should have retention limits. Less data stored means less data exposed in a breach.
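A retention policy only helps if something enforces it. As a minimal sketch, assuming each stored record carries a deleted flag and a last_active timestamp (both field names are hypothetical), a purge job could apply a rule like this:

```python
from datetime import datetime, timedelta


def should_purge(record: dict, now: datetime, max_age: timedelta) -> bool:
    """Purge accounts the user deleted, or that exceed the retention limit."""
    if record["deleted"]:
        return True
    return now - record["last_active"] > max_age


now = datetime(2025, 1, 1)
limit = timedelta(days=365 * 3)  # hypothetical three-year retention limit
records = [
    {"id": 1, "deleted": True, "last_active": datetime(2024, 12, 1)},
    {"id": 2, "deleted": False, "last_active": datetime(2019, 6, 1)},
    {"id": 3, "deleted": False, "last_active": datetime(2024, 11, 15)},
]
to_purge = [r["id"] for r in records if should_purge(r, now, limit)]
print(to_purge)  # [1, 2]
```

The appropriate limit depends on legal and business requirements, so the three-year value is only a placeholder.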
Detect threats faster. Years of undetected access turned many of these breaches from bad to catastrophic. Invest in detection that catches unusual activity before attackers reach their objectives.
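One basic form of anomaly detection compares today’s activity against an account’s own history. The sketch below flags a daily data-access volume that sits far above the historical mean. The threshold of three standard deviations and the sample numbers are assumptions for illustration, and real detection pipelines use richer baselines.

```python
from statistics import mean, stdev


def is_unusual(history: list, today: float, threshold: float = 3.0) -> bool:
    """Flag today's volume if it is more than `threshold` standard
    deviations above the historical mean."""
    if len(history) < 2:
        return False  # Not enough data to form a baseline.
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today > mu
    return (today - mu) / sigma > threshold


# Hypothetical rows read per day by one service account.
baseline = [120, 135, 110, 140, 125]
print(is_unusual(baseline, 130))   # False
print(is_unusual(baseline, 5000))  # True
```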
Conclusion
Billions of exposed records. Billions of dollars in damages. But every incident on this list contains lessons you can apply today.
The same vulnerabilities show up repeatedly. Stolen credentials and unpatched systems cause breach after breach. So do misconfigured databases and third-party compromises. Teams that address these basics prevent the majority of incidents.
How fast you detect a breach controls the damage. Attackers who stay hidden for months or years cause far more harm than those caught quickly. Continuous monitoring for compromised credentials shortens the window of exposure. So do anomaly detection and misconfiguration scanning.
Most breaches on this list were preventable with security controls that existed at the time. The real question is whether you roll out those controls before attackers exploit the gaps.
Check your organization’s dark web exposure to see if employee credentials are already circulating on criminal marketplaces.
Data Breach Examples FAQ
What Are the Biggest Data Breach Examples?
The biggest examples include Yahoo (3 billion accounts) and National Public Data (2.9 billion records). These breaches exposed passwords and Social Security numbers on a massive scale. Root causes ranged from unpatched vulnerabilities to misconfigured databases.
What Is the Most Common Type of Data Breach?
Credential theft is the most common type of data breach. IBM X-Force found that 30% of intrusions used valid credentials as the initial access vector. Phishing attacks and infostealer malware are the top sources. Third-party breaches also expose credentials when employees reuse passwords.
What Are the Main Types of Data Breaches?
The main types are external attacks and insider threats. Accidental exposure through misconfigured databases is a third category. External attacks account for most breaches. But insider incidents often expose the most sensitive data because insiders already have access.
What Is the Largest Data Breach in History?
The CAM4 breach in 2020 exposed 10.88 billion records. That’s the largest breach by record count. The Yahoo breach in 2013 affected 3 billion accounts and is the largest by user impact. National Public Data’s 2024 breach exposed 2.9 billion records including Social Security numbers.
What Counts as a Data Breach?
A data breach occurs when unauthorized parties access sensitive information. This includes external attackers breaking in and employees leaking data. Accidental exposure through misconfigured systems also counts. Data that is accessed but not exfiltrated still qualifies as a breach and may trigger notification requirements.
How Do You Know If Your Data Was Breached?
Check if your email appears in known breaches using lookup tools. Watch for signs like unexpected password reset emails and unfamiliar login alerts. You can use dark web monitoring to detect when employee credentials appear in breaches or stealer logs.