Data Breach Examples: The Biggest Security Incidents

Learn from the worst data breaches in history and what your security team can do to prevent similar incidents.

• Stolen credentials caused many of the biggest breaches in history. If attackers can log in, they don’t need to break in
• Misconfigured databases and unpatched vulnerabilities caused the largest breaches by record count. These are basic failures, not advanced attacks
• Detection gaps let attackers stay inside networks for months or years. Every day of undetected access multiplies the damage
• Credential theft and unpatched systems have caused breaches in every period from 2008 to 2025. The basics still aren’t covered

Over 343 billion credentials have been leaked and are now circulating on the dark web. Behind that number are real incidents where real organizations failed to protect sensitive data. The 2025 IBM Cost of a Data Breach Report found that breaches now cost an average of $4.44 million.

What’s worse is that many of these breaches were preventable. The same root causes appear again and again: stolen credentials, unpatched vulnerabilities, and misconfigured databases. Teams keep making the same mistakes because they don’t learn from past incidents.

Studying data breach examples isn’t just an academic exercise. It’s how security teams identify blind spots in their own defenses. When you understand how Yahoo lost 3 billion accounts or how Equifax exposed 147 million Americans, you can spot similar vulnerabilities in your environment.

This guide breaks down 30 famous data breaches, from massive incidents like Yahoo and CAM4 to smaller ones that changed security practices. For each one, you’ll learn what happened and what lessons apply to your organization today.

What Is a Data Breach?

Security teams deal with data breaches constantly, but the definition matters for legal and regulatory purposes.

A data breach is a security incident where unauthorized parties access sensitive information. Breaches expose personal data like names and passwords. Social Security numbers and financial records are common targets too. The exposed data ends up on dark web marketplaces where attackers use it to target your organization.

There’s an important distinction between breaches and leaks. A breach involves unauthorized access, meaning someone bypassed security controls. A leak typically means data was exposed accidentally, like a misconfigured database. An exposure means data was publicly accessible, even if no one malicious found it.

These distinctions matter because they affect regulatory requirements and notification obligations. Remediation steps differ too. But for practical purposes, any sensitive data ending up where it shouldn’t be is bad news.

Understanding why breaches happen is more useful than debating definitions. Let’s look at the root causes.

What Are the Most Common Causes of Data Breaches?

The same vulnerabilities appear in breach after breach. Attackers don’t need zero-days when basic security failures keep working.

Stolen credentials cause more breaches than any other vector. The Verizon 2025 DBIR found that 88% of breaches in its basic web application attacks category involved stolen credentials. Attackers get passwords from phishing and infostealer malware. Third-party breaches expose them too. When employees reuse passwords across services, one breach leads to many more.

Unpatched vulnerabilities gave attackers entry in some of the biggest breaches on record. The Equifax breach exploited a known Apache Struts vulnerability that had a patch available for two months. WannaCry ransomware spread through a Windows vulnerability Microsoft had patched 59 days earlier.
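The patch gap is simple date arithmetic, and tracking it per system is a useful metric. A minimal sketch, assuming the publicly reported dates of March 14, 2017 for Microsoft's patch and May 12, 2017 for the WannaCry outbreak (neither date is stated above); the function name is illustrative:

```python
from datetime import date


def exposure_days(patch_released: date, exploited: date) -> int:
    """Days a fix was available before attackers used the flaw."""
    return (exploited - patch_released).days


# Assumed dates: Microsoft's patch release and the WannaCry outbreak.
wannacry_gap = exposure_days(date(2017, 3, 14), date(2017, 5, 12))
print(wannacry_gap)  # 59
```

The same calculation, run against your own patch inventory, shows which systems have been exposed to a known fix the longest.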

Misconfigured systems expose data without any hacking required. Elasticsearch and MongoDB databases get left open to the internet. Cloud storage buckets get set to public access. Security groups allow traffic from anywhere. The CAM4 breach happened because of an exposed Elasticsearch cluster with no authentication.

Phishing attacks harvest credentials directly from employees who enter them on fake login pages. Advanced phishing kits capture session tokens in real-time, bypassing MFA. IBM’s Cost of a Data Breach Report found phishing-originated breaches cost an average of $4.8 million.

Insider threats account for fewer incidents but often expose the most sensitive data. Insiders already have legitimate access, so they don’t trigger the same detection alerts as external attackers. The Alibaba/Taobao breach came from a developer at a consulting firm who scraped customer data for eight months.

Third-party compromises let attackers reach organizations through their vendors. The Target breach started with a compromised HVAC vendor. The MOVEit attacks in 2023 affected over 60 million people through a single supply chain vulnerability.

Now let’s examine specific data breach incidents. Each one offers lessons for security teams.

What Are the Biggest Data Breaches in History?

We’ve organized these data breach incidents by scale. First, the largest data breaches in history affecting over 1 billion records. Then major data breaches between 100 million and 1 billion. We finish with cyber security breaches that changed how teams approach protection.

Mega Breaches: 1 Billion+ Records Exposed

These are the biggest data breaches in history by sheer volume of exposed data.

1. CAM4 (10.88 Billion Records)

Date: March 2020
Records Exposed: 10.88 billion
Data Types: Names, emails, sexual orientations, chat logs, payment information, device info, IP addresses
Root Cause: Misconfigured Elasticsearch database

CAM4 is an adult streaming platform that left an Elasticsearch database exposed without authentication. Security researchers discovered 7 terabytes of data sitting open on the internet. CAM4 took the database offline within 30 minutes of notification, but the exposure window is unknown.

The breach included extremely sensitive data. Chat transcripts and sexual orientations create serious blackmail risks for victims. Payment logs add sextortion potential. Criminals regularly exploit this type of data. Although this is the largest breach by record count, there’s no confirmed evidence that anyone exploited the data before researchers found it.

Lesson: Asset inventory and configuration management matter. You can’t protect databases you don’t know exist. Elasticsearch clusters need authentication enabled by default. You also need processes to detect internet-exposed assets.
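One way to detect this class of exposure is to probe your own hosts without credentials and see what comes back. The sketch below is a rough illustration, not a scanner: it assumes an unauthenticated cluster answers its root URL with a JSON document containing `cluster_name`, and that a secured one answers 401 or 403. The function names are hypothetical, and it should only be pointed at systems you own.

```python
import json
import urllib.error
import urllib.request


def classify_response(status: int, body: str) -> str:
    """Interpret the reply to an unauthenticated GET on the cluster root."""
    if status in (401, 403):
        return "auth-required"
    if status == 200:
        try:
            info = json.loads(body)
        except ValueError:
            return "unknown"
        # Assumption: an open cluster describes itself in its root document.
        if isinstance(info, dict) and "cluster_name" in info:
            return "open"
    return "unknown"


def probe(url: str, timeout: float = 5.0) -> str:
    """Send one request with no credentials. Use only on hosts you own."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return classify_response(resp.status, body)
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before its parent class URLError.
        return classify_response(err.code, "")
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

Running `probe` against every address in your asset inventory on a schedule turns "we think authentication is on" into something you can verify.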

2. Yahoo (3 Billion Accounts)

Date: August 2013 (disclosed 2016)
Records Exposed: 3 billion accounts
Data Types: Names, emails, phone numbers, dates of birth, hashed passwords, security questions
Root Cause: State-sponsored attack

Yahoo initially disclosed a breach affecting 1 billion accounts. Ten months later, they updated that number to 3 billion, meaning every Yahoo account that existed. The attackers were inside Yahoo’s network for three years before being discovered.

The timing was terrible. Yahoo was in the middle of selling its core business to Verizon for $4.8 billion. After the breach disclosure, that price dropped by $350 million.

This was the second major Yahoo breach to become public. A separate 2014 incident, disclosed a few months earlier, exposed 500 million accounts. In that case, state-sponsored attackers manufactured session cookies to bypass authentication entirely, and two Russian intelligence officers and two attackers were later indicted.

Lesson: Detecting threats matters as much as preventing them. Attackers were inside Yahoo’s network for years. Continuous monitoring for anomalous activity could have shortened that dwell time dramatically. Also, proper session management, including protecting whatever is used to issue cookies, could have limited cookie-based authentication bypass.

3. National Public Data (2.9 Billion Records)

Date: December 2023 (disclosed August 2024)
Records Exposed: 2.9 billion
Data Types: Names, Social Security numbers, addresses, dates of birth, phone numbers
Root Cause: Unknown (company filed for bankruptcy)

National Public Data was a background check company that aggregated personal information from public sources and sold access to businesses. An attacker calling themselves “USDoD” initially listed the data for sale at $3.5 million. When no buyer emerged, another actor leaked the full dataset for free.

The breach exposed SSNs for potentially hundreds of millions of Americans. Even deceased individuals were included. The data makes identity theft and fraud possible at massive scale. With this data, targeted social engineering is trivial. National Public Data filed for Chapter 11 bankruptcy in October 2024, partly due to litigation from the breach.

Lesson: Data aggregators create concentrated risk. When background check companies and credit bureaus get breached, the impact is catastrophic because they hold so much data in one place. Check how your third-party vendors handle sensitive data.

4. Aadhaar (1.1 Billion Records)

Date: 2018-2019 (multiple incidents)
Records Exposed: 1.1 billion
Data Types: Names, Aadhaar numbers (national ID), addresses, phone numbers, bank details
Root Cause: Exposed API endpoint

Aadhaar is India’s biometric identification system, covering over 90% of the population. The 12-digit ID numbers function like Social Security numbers for banking and utilities. Government services require them too.

A state-owned gas company called Indane left an API endpoint exposed that connected to the Aadhaar database. The API used a hardcoded access token. When decoded, the token literally translated to “INDAADHAARSECURESTATUS.” No rate limiting meant attackers could enumerate all possible Aadhaar numbers.

The Indian government initially denied the breach despite multiple security researchers confirming the exposure. The data was being sold for approximately $7 on WhatsApp. It took months for the vulnerable API to be taken offline despite multiple notifications.

Lesson: API security requires authentication and rate limiting. Monitoring is essential too. Hardcoded tokens are a critical vulnerability. Government denial doesn’t make breaches disappear. It just delays remediation while attackers exploit the data.
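Rate limiting is what makes enumerating an ID space impractical. As a rough sketch of the idea, here is a per-client token bucket; the class name, parameters, and injectable clock are all illustrative choices, not taken from any particular gateway product.

```python
import time


class TokenBucketLimiter:
    """Allow short bursts, then cap each client at a steady request rate."""

    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.burst = burst
        self.clock = clock  # injectable so the limiter can be tested
        self._state = {}    # client_id -> (tokens_left, last_seen)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        tokens, last = self._state.get(client_id, (float(self.burst), now))
        # Refill in proportion to elapsed time, never above the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._state[client_id] = (tokens - 1.0, now)
            return True
        self._state[client_id] = (tokens, now)
        return False


limiter = TokenBucketLimiter(rate_per_sec=1.0, burst=3)
print(limiter.allow("client-a"))  # True: the first request fits in the burst
```

At one lookup per second, walking a 12-digit ID space from a single client would take far longer than any attacker is willing to wait, and the rejected requests are themselves a signal worth alerting on.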

5. Alibaba/Taobao (1.1 Billion Records)

Date: November 2019 (disclosed 2021)
Records Exposed: 1.1 billion
Data Types: Usernames, phone numbers
Root Cause: Web scraping via developer with elevated access

A software developer at a consulting company that provided services to Alibaba built a custom web crawler. The crawler scraped user data from Taobao, Alibaba’s massive e-commerce platform, for eight months before detection.

In China, phone numbers are tied to real identities because registration requires valid government ID. This makes phone number exposure particularly dangerous. The scraped data makes account takeover and social engineering easier. Doxxing is another risk.

The developer and a colleague received three-year prison sentences. While passwords weren’t exposed, the combination of usernames and phone numbers provides enough information for targeted attacks.

Lesson: Monitor contractors the same way you monitor employees. Elevated API access should trigger additional logging and anomaly detection. Eight months of scraping means nobody was watching data access patterns.
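Watching data access patterns can start very simply: count how many distinct records each client touches in a sliding window. The sketch below assumes you already log one event per record read; the class name and thresholds are hypothetical and would need tuning against real traffic.

```python
from collections import defaultdict, deque


class ScrapeDetector:
    """Flag clients that read too many distinct records in a time window."""

    def __init__(self, window_seconds: float, max_distinct: int):
        self.window = window_seconds
        self.max_distinct = max_distinct
        self._events = defaultdict(deque)  # client -> (timestamp, record_id)

    def record(self, client: str, record_id: str, timestamp: float) -> bool:
        """Log one access; return True if the client should be flagged."""
        events = self._events[client]
        events.append((timestamp, record_id))
        # Drop accesses that have aged out of the window.
        while events and events[0][0] <= timestamp - self.window:
            events.popleft()
        distinct = {rid for _, rid in events}
        return len(distinct) > self.max_distinct
```

A normal user revisits a handful of records. A crawler touches thousands of new ones, so even a generous threshold separates the two.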

Major Data Breaches: 100 Million to 1 Billion Records

These famous data breaches affected hundreds of millions of people and reshaped how companies handle security.

6. LinkedIn (700 Million Users)

Date: June 2021
Records Exposed: 700 million profiles
Data Types: Email addresses, phone numbers, physical addresses, professional history, geolocation data
Root Cause: API scraping

An attacker using the name “TomLiner” scraped approximately 90% of LinkedIn’s user base by exploiting the platform’s API. LinkedIn initially claimed this wasn’t a breach because no private data was accessed. But the scraped data included information users reasonably expected to be protected.

This was actually LinkedIn’s second major incident. In 2012, attackers breached LinkedIn and stole 164 million credentials. The passwords were hashed with SHA1 without salting, making them easy to crack.

Lesson: API rate limiting and abuse detection are essential. Just because data is technically accessible doesn’t mean scraping at scale is acceptable. The distinction between “breach” and “scraping” matters less to affected users than the exposure itself.

7. Facebook (533 Million Users)

Date: April 2019
Records Exposed: 533 million profiles
Data Types: Phone numbers, Facebook IDs, names, locations, birthdates, email addresses
Root Cause: Contact importer feature exploitation

Attackers exploited Facebook’s contact importer feature, which was designed to help users find friends. By uploading massive sets of phone numbers, attackers could see which numbers matched Facebook accounts. They then queried those accounts to extract profile information.

Facebook didn’t notify users about the exposure. The company said the data had been “widely distributed” and that it couldn’t be sure which users to notify. The vulnerability was patched in August 2019.

Lesson: Features designed for user convenience often create security risks. The contact importer was legitimate functionality weaponized at scale. Abuse detection needs to catch automated exploitation of otherwise normal features.

8. Marriott/Starwood (500 Million Records)

Date: 2014-2018 (disclosed November 2018)
Records Exposed: 500 million guest records
Data Types: Names, addresses, phone numbers, email, passport numbers, payment card details, Starwood Preferred Guest numbers
Root Cause: Compromised Starwood network (pre-acquisition)

Attackers compromised the Starwood hotel chain’s reservation system in 2014. When Marriott acquired Starwood in 2016, they inherited the compromised network without knowing it. The breach wasn’t discovered until September 2018 when Marriott’s security tool flagged suspicious database queries.

For four years, attackers had access to guest records. The breach included encrypted payment card data, but the encryption keys may have been compromised too. Passport numbers were also exposed, creating identity theft risks beyond typical breaches.

UK regulators fined Marriott £18.4 million. The US Federal Trade Commission required Marriott to implement a full security overhaul.

Lesson: M&A due diligence must include a thorough security assessment. Marriott inherited Starwood’s breach because they didn’t detect the existing compromise. Any acquisition should include breach detection as part of integration planning.

9. Yahoo (500 Million Accounts)

Date: Late 2014 (disclosed September 2016)
Records Exposed: 500 million accounts
Data Types: Names, email addresses, phone numbers, dates of birth, hashed passwords, security questions
Root Cause: State-sponsored attack using forged cookies

This is Yahoo’s other major breach, separate from the 2013 incident. Attackers stole Yahoo’s cryptographic keys used to sign session cookies. With those keys, they could mint valid authentication tokens for any account without knowing the password. They specifically targeted accounts of interest to Russian intelligence, including journalists and government officials.

Two FSB officers and two criminal attackers were indicted. One was arrested in Canada and extradited to the US. The others remain in Russia.

Lesson: Session management is as critical as password security. Cookie-based attacks bypass password controls entirely. Session tokens need proper expiration and secure storage. Anomaly detection for unusual access patterns adds another layer.
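To see why stolen signing keys are so damaging, it helps to look at how a signed session token works. The following is a simplified sketch using an HMAC and an expiry timestamp. It is not Yahoo’s actual cookie format, and the function names and token layout are invented for illustration.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional


def issue_token(key: bytes, user_id: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Create a token whose payload is bound to the key by an HMAC."""
    issued = time.time() if now is None else now
    payload = f"{user_id}|{int(issued + ttl_seconds)}".encode("utf-8")
    sig = hmac.new(key, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode("ascii") + "."
            + base64.urlsafe_b64encode(sig).decode("ascii"))


def verify_token(key: bytes, token: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the token is authentic and unexpired."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
        user_id, expires = payload.decode("utf-8").rsplit("|", 1)
        expires_at = int(expires)
    except (ValueError, UnicodeDecodeError):
        return None
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return None  # forged or corrupted
    current = time.time() if now is None else now
    if current >= expires_at:
        return None  # expired
    return user_id
```

Notice that `issue_token` never asks for a password. Anyone who holds `key` can mint a valid token for any user, which is why signing keys belong in tightly controlled storage, need a rotation plan, and why short expiry times limit how long a minted token stays useful.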

10. Adult Friend Finder (412 Million Accounts)

Date: October 2016
Records Exposed: 412 million accounts across six databases
Data Types: Usernames, email addresses, passwords (SHA1 hashed), dates of last visits, browser information
Root Cause: Local file inclusion vulnerability

FriendFinder Networks operated several adult dating sites. Attackers exploited a local file inclusion vulnerability that allowed code execution on the web server. They extracted 20 years of data from Adultfriendfinder.com and Cams.com, among other properties.

Over 15 million “deleted” accounts were still in the database. Passwords were hashed with SHA1, which is trivial to crack. The sensitive nature of the sites made the breach particularly damaging for victims facing extortion risks.

Lesson: Data retention policies matter. Keeping 20 years of data, including “deleted” accounts, massively expanded the breach impact. Weak password hashing (SHA1 without salt) meant exposed credentials were easily cracked.

11. MySpace (360 Million Accounts)

Date: 2008 (disclosed 2016)
Records Exposed: 360 million accounts
Data Types: Email addresses, usernames, SHA1 hashed passwords
Root Cause: Unknown

This breach happened around 2008 but wasn’t publicly known until 2016 when the data appeared for sale on the dark web. The seller asked for six bitcoins, about $3,000 at the time.

MySpace’s password hashing was particularly weak. They only hashed the first ten characters of passwords, converted to lowercase, with no salt. This made cracking trivial.

Lesson: Even defunct platforms create ongoing risk. Users who reused their MySpace passwords on other services remained vulnerable years later. Proper password hashing with unique salts is essential, and there’s no statute of limitations on breach discovery.
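The effect of that hashing scheme is easy to demonstrate. This sketch reproduces the scheme as described above (lowercase, first ten characters, unsalted SHA1). The exact order of operations in MySpace’s code is an assumption, and the passwords are made up.

```python
import hashlib


def weak_hash(password: str) -> str:
    """Scheme as described above: lowercase, keep 10 chars, unsalted SHA1."""
    return hashlib.sha1(password.lower()[:10].encode("utf-8")).hexdigest()


long_password = weak_hash("CorrectHorseBatteryStaple")
short_guess = weak_hash("correcthorZZZ")
print(long_password == short_guess)  # True
```

A 25-character passphrase ends up no stronger than its first ten lowercase letters, and with no salt, every user who picked the same prefix shares one hash that only has to be cracked once.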

12. Equifax (147 Million Records)

Date: May-July 2017 (disclosed September 2017)
Records Exposed: 147 million Americans
Data Types: Names, Social Security numbers, birth dates, addresses, driver’s license numbers
Root Cause: Unpatched Apache Struts vulnerability (CVE-2017-5638)

Equifax is one of the three major credit bureaus. Attackers exploited a vulnerability in Apache Struts that had been patched two months earlier. An expired SSL certificate prevented Equifax’s intrusion detection from seeing the data exfiltration for 76 days.

The breach exposed sensitive financial data for nearly half of all American adults. Equifax’s response made things worse. Its official Twitter account repeatedly directed users to a look-alike site instead of the real breach notification page. Congressional hearings followed. Executives departed. Costs exceeded $1.38 billion.

For a deeper analysis, see our full Equifax data breach case study.

Lesson: Patching delays have catastrophic consequences. Two months of exposure to a known vulnerability led to one of the worst breaches in history. Security basics like certificate management and network segmentation would have limited the damage.

13. Change Healthcare (100+ Million Records)

Date: February 2024
Records Exposed: 100+ million (potentially one-third of Americans)
Data Types: Health records, insurance claims, Social Security numbers, financial information
Root Cause: Compromised credentials, lack of MFA

Change Healthcare processes insurance claims for most of American healthcare. ALPHV/BlackCat ransomware affiliates used stolen credentials to access a Citrix portal that lacked multi-factor authentication.

The attack disrupted healthcare operations nationwide. Pharmacies couldn’t process prescriptions. Providers couldn’t submit claims. UnitedHealth Group, Change Healthcare’s parent company, paid a $22 million ransom. They later reported the breach may have affected 100 million people.

Lesson: MFA on remote access is non-negotiable. A single portal without MFA led to one of the largest healthcare breaches ever. Critical infrastructure needs security controls that match its importance.
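For readers who haven’t looked inside an authenticator app, a second factor is not exotic. Below is a compact sketch of time-based one-time passwords in the style of RFC 6238, using only the standard library. It is illustrative only: a real deployment should rely on a maintained library or identity provider, and the secret shown is the RFC’s public test value, not a recommendation.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Derive a one-time code from a shared secret and the current time."""
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in HOTP
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


print(totp(b"12345678901234567890", 59, digits=8))  # 94287082 (RFC 6238 test vector)
```

Because the code changes every 30 seconds and depends on a secret the attacker doesn’t have, a password bought from a breach dump is not enough on its own to log in to the portal.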

14. Experian/Court Ventures (200 Million Records)

Date: 2013 (disclosed 2014)
Records Exposed: 200 million
Data Types: Names, addresses, Social Security numbers, dates of birth, financial information
Root Cause: Inadequate acquisition due diligence

A Vietnamese national named Hieu Minh Ngo purchased access to Experian’s databases through Court Ventures, a data broker Experian had acquired. Ngo posed as a private investigator in Singapore to gain access. He ran an identity theft service selling SSN lookups for years before being caught.

This exposed a major flaw in data broker acquisitions. Experian inherited Court Ventures’ lax customer vetting practices without proper security review. Ngo was eventually sentenced to 13 years in prison.

Lesson: Vet who gets access to your data. Experian inherited Court Ventures’ compromised access controls without proper review. Any company selling access to sensitive data needs strict customer vetting.

15. Adobe (153 Million Records)

Date: October 2013
Records Exposed: 153 million user records
Data Types: Email addresses, encrypted passwords, password hints, usernames
Root Cause: Network intrusion and poor password storage

Attackers breached Adobe’s network and stole both customer data and source code for products including Acrobat and ColdFusion. The password storage was particularly problematic. Adobe used 3DES encryption instead of proper hashing, and every password was encrypted with the same key, so identical passwords produced identical ciphertext.

Password hints were stored in plaintext alongside the encrypted passwords. Security researchers could often guess passwords by analyzing the hints. Poor cryptographic choices amplified the damage.

Lesson: Password hashing matters. Encryption is not a substitute for proper password hashing with unique salts. Storing password hints in plaintext defeats the purpose of encrypting passwords.
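As a rough illustration of what “proper hashing with unique salts” means in practice, here is a sketch using PBKDF2 from Python’s standard library. The iteration count is an assumed work factor to be tuned for your hardware, and many teams would choose bcrypt, scrypt, or Argon2 instead. The function names are illustrative.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated per password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

Because every user gets a different salt, two people with the same password end up with different stored values, so an attacker can’t group accounts by ciphertext and use the plaintext hints to solve them in bulk.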

16. eBay (145 Million Records)

Date: February-March 2014 (disclosed May 2014)
Records Exposed: 145 million users
Data Types: Names, encrypted passwords, email addresses, physical addresses, phone numbers, dates of birth
Root Cause: Compromised employee credentials

Attackers compromised a small number of employee login credentials and used that access to reach the user database. The intrusion began in late February and went undetected for over two months. eBay was also criticized for slow disclosure: it detected the breach in early May but didn’t notify users until later that month.

Financial information was stored separately and wasn’t compromised. But the combination of names and addresses enabled identity theft. Birthdates made targeted phishing easier too.

Lesson: Employee credential monitoring is critical. Attackers used valid credentials to access systems without triggering alerts. Faster detection and disclosure protects users from extended exposure windows.

17. Canva (137 Million Records)

Date: May 2019
Records Exposed: 137 million user accounts
Data Types: Usernames, email addresses, names, cities, countries, bcrypt-hashed passwords
Root Cause: Unknown vulnerability exploited by GnosticPlayers

An attacker known as “GnosticPlayers” breached the Australian design platform and attempted to sell the data. Canva detected the attack in progress and interrupted it before all data was exfiltrated. They forced password resets for affected users.

The passwords were hashed with bcrypt, a strong algorithm that makes cracking difficult. Some users who authenticated via Google had their tokens exposed. Canva’s rapid response limited the damage.

Lesson: Strong cryptographic protections keep users safe even when breaches occur. Canva’s use of bcrypt meant most passwords remained secure. Detecting attacks in progress allows faster response.

18. Heartland Payment Systems (130 Million Cards)

Date: 2008 (disclosed January 2009)
Records Exposed: 130 million credit and debit card numbers
Data Types: Card numbers, expiration dates, cardholder names
Root Cause: SQL injection leading to malware installation

Attackers used SQL injection to install malware on Heartland’s payment processing systems. The malware captured card data as it passed through for processing. At the time, this was the largest payment card breach ever recorded.

Albert Gonzalez, who also orchestrated the TJX breach, was convicted and sentenced to 20 years in federal prison. The Heartland breach pushed the industry toward end-to-end encryption for payment processing.

Lesson: Input validation prevents SQL injection. Payment systems require encryption of data in transit, not just at rest. This breach accelerated adoption of PCI DSS security standards.
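Alongside input validation, the standard defense against SQL injection is parameterized queries. The sketch below contrasts the two approaches using an in-memory SQLite database. The `merchants` table and its contents are invented for the example and have nothing to do with Heartland’s actual systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE merchants (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO merchants (name) VALUES (?)",
                 [("alpha",), ("beta",)])


def find_unsafe(name: str):
    # Vulnerable: user input is spliced directly into the SQL text.
    query = f"SELECT id, name FROM merchants WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(name: str):
    # Parameterized: the driver treats the input strictly as data.
    return conn.execute(
        "SELECT id, name FROM merchants WHERE name = ?", (name,)
    ).fetchall()


payload = "x' OR '1'='1"
print(len(find_unsafe(payload)))  # 2: the injected condition matches every row
print(find_safe(payload))         # []: no merchant has that literal name
```

The unsafe version lets the attacker rewrite the query. The safe version can only ever look up a name, no matter what string is supplied.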

Notable Cyber Security Breaches That Changed Practices

These breaches may have exposed fewer records, but they changed how security teams approach protection.

19. MOVEit (60+ Million Affected)

Date: May-June 2023
Records Exposed: 60+ million across 2,500+ organizations
Data Types: Varied by victim (PII, financial data, health records)
Root Cause: Zero-day SQL injection in MOVEit Transfer (CVE-2023-34362)

The Cl0p ransomware gang exploited a zero-day vulnerability in Progress Software’s MOVEit file transfer application. They compromised hundreds of organizations simultaneously. Government agencies were hit. So were banks and healthcare providers.

This was a supply chain attack at scale. Organizations that used MOVEit became victims even though their own security wasn’t breached. British Airways was hit. So were BBC and Shell. Numerous US government agencies were affected too.

Lesson: Supply chain security requires continuous vendor assessment. Even if your security is solid, your vendors can expose you. Know what software your organization runs. Monitor it for vulnerabilities.

20. Target (40 Million Payment Cards)

Date: November-December 2013
Records Exposed: 40 million payment cards, 70 million customer records
Data Types: Payment card numbers, CVVs, PINs, names, addresses, phone numbers, email addresses
Root Cause: Third-party HVAC vendor compromise

Attackers compromised Fazio Mechanical Services, an HVAC vendor that had network access for billing purposes. From there, they moved laterally to Target’s point-of-sale systems and deployed memory-scraping malware.

Target’s security team in India actually detected the intrusion and alerted US staff. The alerts were ignored. The breach was eventually discovered when the Secret Service notified Target that stolen cards were being used for fraud.

For the complete story, see our Target data breach analysis.

Lesson: Third-party access requires network segmentation. An HVAC vendor shouldn’t have any path to payment systems. Alert fatigue leads to ignored warnings. Security teams need processes to prioritize and act on detections.

21. Home Depot (56 Million Payment Cards)

Date: April-September 2014
Records Exposed: 56 million payment cards, 53 million email addresses
Data Types: Payment card numbers, customer email addresses
Root Cause: Stolen vendor credentials and custom malware

Attackers used credentials stolen from a third-party vendor to access Home Depot’s network. They deployed custom point-of-sale malware across self-checkout systems in US and Canadian stores. The malware operated for five months before detection.

Home Depot’s breach came just months after Target’s. Despite the Target incident making headlines, Home Depot hadn’t implemented many of the same controls that could have prevented a similar attack.

See our detailed Home Depot data breach breakdown.

Lesson: Industry breaches should trigger immediate security assessments. Home Depot had months of warning from Target’s incident but didn’t adequately respond. Self-checkout systems represented a different attack surface than traditional registers.

22. T-Mobile (37 Million Customers)

Date: Late 2022 - January 2023
Records Exposed: 37 million current customers
Data Types: Names, billing addresses, email, phone numbers, dates of birth, account numbers
Root Cause: Exploited API vulnerability

This was T-Mobile’s eighth major breach since 2018. Attackers exploited a vulnerability in one of T-Mobile’s APIs to extract customer data. The breach began in late November 2022 and wasn’t detected until January 2023.

Previous T-Mobile breaches included a 2021 incident affecting 76 million people and a 2020 breach targeting employee email accounts.

Lesson: Repeated breaches indicate systemic security failures. Securing APIs means locking down authentication and watching for abuse. Companies that experience multiple breaches need to rethink their security architecture.

23. 23andMe (7 Million Profiles)

Date: October 2023
Records Exposed: 6.9 million genetic profiles
Data Types: Genetic ancestry data, names, birth years, family tree information
Root Cause: Credential stuffing attack

Attackers used credentials leaked in other breaches to access 23andMe accounts. Because 23andMe’s “DNA Relatives” feature lets users share information with genetic matches, compromising one account exposed data from many relatives.

The genetic data included ancestry percentages and ethnic background. Health predispositions were exposed too. This information can’t be changed like a password. That makes genetic breaches uniquely dangerous.

Lesson: Credential stuffing attacks exploit password reuse across services. Features that share data between users amplify breach impact. Genetic and biometric data creates unique, permanent risks.
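Credential stuffing has a recognizable shape in login logs: one source failing against many different accounts. A minimal detection sketch, assuming login events are available as `(source_ip, username, success)` tuples; the function name and threshold are hypothetical.

```python
from collections import defaultdict


def stuffing_suspects(events, max_accounts: int = 5):
    """Return source IPs with failed logins against too many distinct accounts.

    events: iterable of (source_ip, username, success) tuples.
    """
    failed_accounts = defaultdict(set)
    for source_ip, username, success in events:
        if not success:
            failed_accounts[source_ip].add(username)
    return sorted(ip for ip, users in failed_accounts.items()
                  if len(users) > max_accounts)
```

A user who mistypes their own password ten times touches one account. A stuffing run touches hundreds, which is why counting distinct usernames works better than counting raw failures. Attackers who spread attempts across many IPs call for additional signals, so treat this as one layer rather than a complete defense.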

24. PowerSchool (Unknown Millions)

Date: December 2024 - January 2025
Records Exposed: Millions of students and teachers (exact number unknown)
Data Types: Names, addresses, Social Security numbers, medical records, grades, disciplinary records
Root Cause: Compromised support credentials

PowerSchool provides student information systems to 16,000 school districts. Attackers used compromised credentials to access the PowerSource customer support portal, then used a maintenance tool to export student and teacher data.

The breach exposed decades of historical data, including records of students who graduated years ago. Some districts reported exposure of medical information and Social Security numbers. Disciplinary records leaked too.

Lesson: Support portals and administrative tools need the same security as production systems. Maintenance access that can bulk export data requires additional controls and monitoring. Keeping historical data makes breaches worse.

25. AT&T (73 Million Customers)

Date: 2019 (disclosed 2024)
Records Exposed: 73 million current and former customers
Data Types: Names, email addresses, mailing addresses, phone numbers, Social Security numbers, dates of birth
Root Cause: Unknown

AT&T initially denied the data was theirs when it appeared on the dark web in 2021. In March 2024, they confirmed the breach after security researchers demonstrated the data was authentic. The affected accounts belonged to about 7.6 million current customers and 65 million former customers.

AT&T reset passcodes for 7.6 million current customers as a precaution. The source of the breach remains publicly unknown.

Lesson: Breach investigation and disclosure take time, but denial delays user protection. Build processes to quickly validate whether leaked data is authentic.

26. Uber (57 Million Users)

Date: October 2016 (disclosed November 2017)
Records Exposed: 57 million riders and drivers
Data Types: Names, email addresses, phone numbers, driver’s license numbers (for drivers)
Root Cause: Exposed AWS credentials in GitHub repository

Attackers found AWS credentials in a private GitHub repository used by Uber engineers. They used those credentials to access an Amazon S3 bucket containing rider and driver information.

Instead of disclosing the breach, Uber paid the attackers $100,000 through their bug bounty program and had them sign NDAs. This cover-up led to criminal charges against Uber’s former security chief, who was convicted of obstruction.

Lesson: Secret sprawl in code repositories is a common vulnerability. Credential scanning and secrets management are essential. Covering up breaches creates legal liability and damages trust more than disclosure.
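Credential scanning can start with simple pattern matching. The sketch below looks for strings shaped like AWS access key IDs (a four-letter prefix followed by 16 uppercase letters or digits). The regular expression is a simplified assumption rather than an official specification, dedicated scanners cover many more secret types, and the sample key is the placeholder used in AWS documentation.

```python
import re

# Assumed shape: "AKIA" (long-term) or "ASIA" (temporary) plus 16 characters.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_aws_key_ids(text: str):
    """Return (line_number, match) pairs for every suspected key ID."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in AWS_ACCESS_KEY_ID.finditer(line):
            hits.append((number, match.group(0)))
    return hits


sample = 'region = "us-east-1"\naws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\n'
print(find_aws_key_ids(sample))  # [(2, 'AKIAIOSFODNN7EXAMPLE')]
```

Running a check like this as a pre-commit hook or in CI catches keys before they land in a repository, private or not.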

27. Capital One (106 Million Records)

Date: March-July 2019
Records Exposed: 106 million credit card applications
Data Types: Names, addresses, phone numbers, email, dates of birth, income, credit scores, Social Security numbers
Root Cause: Misconfigured web application firewall

A former Amazon Web Services employee exploited a misconfigured firewall to access Capital One’s cloud infrastructure. She used server-side request forgery (SSRF) to obtain AWS credentials and extract data from S3 buckets.

The attacker, Paige Thompson, was caught after bragging about the hack on social media and Slack. She was convicted in 2022. Capital One paid $80 million in regulatory penalties.

Lesson: Cloud security misconfigurations create massive exposure. SSRF attacks against metadata endpoints remain common. Security monitoring should detect unusual data access patterns.
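One common SSRF mitigation is to validate where a server-side request is about to go before making it. The sketch below is an assumption-laden illustration, not a description of Capital One’s environment: the hypothetical helper `is_safe_outbound_url` rejects any URL whose host resolves to an address that is not globally routable, which covers the link-local cloud metadata address `169.254.169.254`, loopback, and private ranges.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_safe_outbound_url(url: str) -> bool:
    """Return True only if every address the host resolves to is globally routable."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        # IP literals resolve locally; hostnames trigger a DNS lookup.
        results = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for result in results:
        # result[4] is the socket address tuple; its first item is the IP string.
        raw_ip = result[4][0].split("%")[0]  # drop any IPv6 zone suffix
        if not ipaddress.ip_address(raw_ip).is_global:
            return False
    return bool(results)


print(is_safe_outbound_url("http://169.254.169.254/latest/meta-data/"))
```

This check alone does not stop DNS rebinding, where a hostname resolves to a safe address during validation and a different one during the actual request. Pinning the request to the validated IP and restricting access to the metadata service at the platform level are complementary controls.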

28. Sina Weibo (538 Million Accounts)

Date: March 2020
Records Exposed: 538 million accounts
Data Types: Names, usernames, genders, locations, phone numbers
Root Cause: Allegedly from a phone-to-account lookup feature

Sina Weibo (now just Weibo) is one of China’s largest social media platforms. An attacker sold the data for approximately $250, suggesting they didn’t fully understand its value.

Weibo initially claimed the data came from publicly available information through a phone number lookup feature. They later admitted that if users reused passwords, the exposed data could be used to compromise other accounts.

Lesson: Features that map phone numbers to accounts create scraping opportunities. Phone numbers in China are tied to real identities, making their exposure particularly sensitive.

29. NetEase (235 Million Accounts)

Date: October 2015
Records Exposed: 235 million accounts
Data Types: Email addresses, plaintext passwords
Root Cause: Unknown vulnerability in email services

NetEase operates popular Chinese email services. Attackers reportedly exploited unknown vulnerabilities in its 163.com and 126.com webmail applications. Unlike in most breaches, the leaked passwords were in plaintext.

NetEase has consistently denied the breach occurred despite numerous users confirming their actual passwords appeared in the leaked data.

Lesson: Plaintext password storage is inexcusable. Even denied breaches create real-world risk. Users should assume their credentials are compromised and use unique passwords everywhere.

30. Dubsmash (162 Million Accounts)

Date: December 2018
Records Exposed: 162 million accounts
Data Types: Usernames, email addresses, hashed passwords, geolocations, phone numbers
Root Cause: Unknown

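Dubsmash was a video messaging app whose stolen data was sold alongside data from 15 other breached services on the Dream Market dark web marketplace. The passwords were hashed with PBKDF2, a deliberately slow key-derivation function that makes brute-force cracking expensive when configured with enough iterations.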

Dubsmash acknowledged the breach but never explained how it happened. Reddit purchased Dubsmash in 2020 and subsequently shut it down.

Lesson: Strong hashing protects passwords but doesn’t prevent the breach itself. App acquisitions should include security assessments. Users lose protection when acquired services shut down.
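To make the hashing point concrete, here is a minimal sketch of salted PBKDF2 password storage using Python’s standard library. It is not Dubsmash’s implementation, whose parameters are not described here. The iteration count, salt length, and function names are assumptions for illustration and should be tuned to current guidance and your hardware.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 600_000  # assumption: adjust to current guidance and your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a PBKDF2-HMAC-SHA256 digest; generate a random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
```

Storing the salt next to the digest is expected. The salt only needs to be unique per password so that identical passwords do not produce identical hashes.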

What Do These Data Breach Examples Have in Common?

Looking across all these data breach incidents, clear patterns emerge that security teams can use to prioritize defenses.

Credential monitoring is the continuous process of scanning dark web marketplaces, stealer logs, and breach compilations for your organization’s exposed credentials. It lets security teams force password resets before attackers can use the leaked credentials to log in.

Stolen credentials enabled many attacks. Whether through phishing, credential stuffing, or passwords exposed in third-party breaches, attackers logged in rather than breaking in. Monitoring for compromised credentials catches exposure before exploitation.

Detection took too long. The Marriott breach went undetected for four years. Yahoo attackers operated for three years. Equifax missed 76 days of data exfiltration. Every day of undetected access multiplies the damage.

Third-party relationships created exposure. Target and Home Depot were both compromised through vendors. MOVEit victims faced supply chain compromise at scale. Third-party risk management is essential because your security is only as strong as your weakest vendor.

Basic security failures caused catastrophic breaches. Unpatched vulnerabilities and misconfigured databases appear repeatedly. So do exposed credentials in code repositories and missing MFA. Advanced attacks are rare. Basic failures are common.

Data retention amplified impact. MySpace’s years of old account data made that breach worse. PowerSchool’s historical student records did the same. Adult Friend Finder’s “deleted” accounts were still there when attackers found them. Companies kept data longer than necessary.

How Can You Prevent Data Breaches?

To prevent breaches, address the patterns that appear across these incidents.

Monitor for compromised credentials continuously. When employee credentials appear in third-party breaches or stealer logs, reset exposed passwords before they’re exploited. Compromised credential monitoring detects exposure across dark web sources.
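The core matching step behind this kind of monitoring can be sketched in a few lines. The example below assumes a leak arrives as `email:password` lines, which is a common but not universal format, and the function name, domain, and sample data are all hypothetical. It returns the corporate accounts that need a forced reset.

```python
def exposed_accounts(leak_lines, corporate_domain):
    """Return sorted, de-duplicated corporate emails found in email:password lines."""
    domain = corporate_domain.lower()
    exposed = set()
    for line in leak_lines:
        email, separator, _password = line.strip().partition(":")
        if not separator:
            continue  # skip lines that are not in email:password form
        local_part, at_sign, host = email.lower().rpartition("@")
        if at_sign and local_part and host == domain:
            exposed.add(email.lower())
    return sorted(exposed)


# Hypothetical leak sample; real sources vary widely in format and quality.
leak = [
    "Alice@Example.com:hunter2",
    "bob@other.org:letmein",
    "malformed line",
    "alice@example.com:Summer2024!",
]
print(exposed_accounts(leak, "example.com"))
```

In practice the output would feed an identity system that forces resets and revokes active sessions. The leaked passwords themselves should be handled as sensitive data and never logged.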

Enforce MFA everywhere. The Change Healthcare breach happened because a portal lacked MFA. Microsoft has reported that MFA blocks more than 99.9% of account compromise attacks. Prioritize phishing-resistant methods like FIDO2 for high-value accounts.

Patch vulnerabilities promptly. The Equifax breach exploited a vulnerability that had been public, with a fix available, for about two months. WannaCry exploited a flaw that Microsoft had patched 59 days earlier. Establish patching SLAs that prioritize internet-facing systems and known-exploited vulnerabilities.
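The patch-gap arithmetic is simple enough to automate. This sketch computes how long a fix has been available and whether that exceeds an SLA. The WannaCry dates are the publicly reported ones (Microsoft’s fix on March 14, 2017 and the outbreak on May 12, 2017); the 30-day SLA and the function names are assumptions for illustration.

```python
from datetime import date


def patch_age_days(patch_released: date, as_of: date) -> int:
    """Number of days a fix has been available as of a given date."""
    return (as_of - patch_released).days


def exceeds_sla(patch_released: date, as_of: date, sla_days: int) -> bool:
    """True when an unpatched system has been exposed longer than the SLA allows."""
    return patch_age_days(patch_released, as_of) > sla_days


# Publicly reported WannaCry timeline: fix released, then outbreak.
wannacry_gap = patch_age_days(date(2017, 3, 14), date(2017, 5, 12))
print(wannacry_gap)  # 59
print(exceeds_sla(date(2017, 3, 14), date(2017, 5, 12), sla_days=30))  # True
```

Running the same calculation across an asset inventory, sorted by exposure and by whether the flaw is known to be exploited, gives a simple priority queue for patching.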

Segment networks and limit blast radius. Target’s HVAC vendor shouldn’t have had any path to payment systems. Assume breaches will happen and architect systems so compromising one component doesn’t expose everything.

Manage third-party risk actively. Assess vendor security before granting access. Monitor for vendor breaches that could affect your organization. Limit vendor access to only what’s necessary.

Minimize data retention. Don’t keep data you don’t need. Deleted accounts should actually be deleted. Historical records should have retention limits. Less data stored means less data exposed in a breach.
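A retention policy only helps if something enforces it. The sketch below shows one way to select accounts for permanent removal: those soft-deleted longer ago than a grace period, and those inactive beyond a retention limit. The field names, the 30-day grace period, and the two-year inactivity limit are hypothetical choices, not a recommendation for any particular regulation.

```python
from datetime import datetime, timedelta, timezone


def accounts_due_for_purge(accounts, now, inactive_days=730, deleted_grace_days=30):
    """Return IDs of accounts that should be permanently removed."""
    inactive_cutoff = now - timedelta(days=inactive_days)
    deleted_cutoff = now - timedelta(days=deleted_grace_days)
    due = []
    for account in accounts:
        deleted_at = account.get("deleted_at")
        if deleted_at is not None:
            # User asked for deletion: purge once the grace period has passed.
            if deleted_at <= deleted_cutoff:
                due.append(account["id"])
        elif account["last_active"] <= inactive_cutoff:
            # Never deleted, but dormant past the retention limit.
            due.append(account["id"])
    return due


utc = timezone.utc
now = datetime(2025, 1, 1, tzinfo=utc)
accounts = [
    {"id": 1, "last_active": datetime(2024, 12, 1, tzinfo=utc), "deleted_at": None},
    {"id": 2, "last_active": datetime(2020, 1, 1, tzinfo=utc), "deleted_at": None},
    {"id": 3, "last_active": datetime(2024, 9, 1, tzinfo=utc),
     "deleted_at": datetime(2024, 10, 1, tzinfo=utc)},
    {"id": 4, "last_active": datetime(2024, 12, 1, tzinfo=utc),
     "deleted_at": datetime(2024, 12, 20, tzinfo=utc)},
]
print(accounts_due_for_purge(accounts, now))  # [2, 3]
```

A scheduled job built on a selector like this also needs to reach backups and analytics copies, since those are often where “deleted” records survive.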

Detect threats faster. Years of undetected access turned many of these breaches from bad to catastrophic. Invest in detection capabilities that catch anomalous activity before attackers achieve their objectives.

Conclusion

These 30 data breach examples represent some of the worst data breaches ever recorded. Together they account for billions of exposed records and billions of dollars in damages. But every incident contains lessons that security teams can apply today.

The same vulnerabilities appear repeatedly. Stolen credentials and unpatched systems cause breach after breach. So do misconfigured databases and third-party compromises. Teams that address these basics prevent the majority of incidents.

Detection speed determines impact. Attackers who stay hidden for months or years cause far more damage than those caught quickly. Continuous monitoring for compromised credentials shortens the window of exposure. So do anomaly detection and misconfiguration scanning.

Prevention is possible. Most breaches in this list were preventable with security controls that existed at the time. The question is whether you implement those controls before attackers exploit the gaps.

Check your organization’s dark web exposure to see if employee credentials are already circulating on criminal marketplaces.

Data Breach Examples FAQ

What are the biggest data breach examples?

The biggest examples include Yahoo (3 billion accounts) and National Public Data (2.9 billion records). These breaches exposed passwords and Social Security numbers on a massive scale. Root causes ranged from unpatched vulnerabilities to misconfigured databases.

What is the most common type of data breach?

Credential theft is the most common type of data breach. IBM X-Force found that 30% of intrusions used valid credentials as the initial access vector. Phishing attacks and infostealer malware are the top sources. Third-party breaches also expose credentials when employees reuse passwords.

What are the main types of data breaches?

The main types are external attacks and insider threats. Accidental exposure through misconfigured databases is a third category. External attacks account for most breaches. But insider incidents often expose the most sensitive data because insiders already have access.

What is the largest data breach in history?

The CAM4 breach in 2020 exposed 10.88 billion records, making it the largest breach by record count. The Yahoo breach in 2013 affected 3 billion accounts, making it the largest by user impact. National Public Data’s 2024 breach exposed 2.9 billion records including Social Security numbers.

What counts as a data breach?

A data breach occurs when unauthorized parties access sensitive information. This includes external attackers breaking in and employees leaking data. Accidental exposure through misconfigured systems also counts. Even if data is accessed but not exfiltrated, it still qualifies as a breach and may trigger notification requirements.

How can you tell if your data has been breached?

Check if your email appears in known breaches using lookup tools. Watch for signs like unexpected password reset emails and unfamiliar login alerts. You can use dark web monitoring to detect when employee credentials appear in breaches or stealer logs.
