This is a list of two things: (1) security incidents that involved exposed secrets, and (2) security research on secret detection.
-
2025-11-19: Secret detectors miss many real secrets due to false positive mitigations
Lewis Ardern from Semgrep details a number of cases where the false positive mitigations that secret detectors use cause them to miss live secrets. In particular, regular expressions that require nearby keywords or non-word-character anchors are significant culprits. Many of the secrets they found were exposed in public GitHub repositories, yet were not detected by any popular tool they tried, including GitHub's own secret scanning.
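As a rough illustration of how those mitigations can backfire, the sketch below compares three patterns for an AWS-style access key ID. The patterns and sample strings are hypothetical and are not taken from any particular tool or from the Semgrep write-up; the key used is the placeholder value from AWS documentation.

```python
import re

# Hypothetical detector rules, written only for this illustration.
ANCHORED = re.compile(r"\bAKIA[0-9A-Z]{16}\b")          # word-boundary anchors
KEYWORD_GATED = re.compile(r"aws.{0,20}?AKIA[0-9A-Z]{16}", re.IGNORECASE)  # needs "aws" nearby
UNANCHORED = re.compile(r"AKIA[0-9A-Z]{16}")            # no mitigation at all

SAMPLE_KEY = "AKIAIOSFODNN7EXAMPLE"  # AWS documentation placeholder, not a live key

samples = [
    f'aws_access_key_id = "{SAMPLE_KEY}"',  # keyword nearby, clean boundaries
    f"cloud_id={SAMPLE_KEY}",               # no "aws" keyword anywhere
    f"creds_{SAMPLE_KEY}_prod",             # "_" is a word character, so \b does not match
]


def hits(pattern, text):
    """Return True if the pattern finds a candidate secret in the text."""
    return pattern.search(text) is not None


for text in samples:
    print(text, hits(ANCHORED, text), hits(KEYWORD_GATED, text), hits(UNANCHORED, text))
```

The unanchored rule catches all three samples, at the cost of more false positives on real code. Each mitigation in this sketch misses at least one sample.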
-
2025-10-29: Ernst & Young accidentally expose secrets in 4TB database backup file
Researchers at Neo Security came across a 4TB SQL Server backup file in an Azure storage bucket. Through some investigation they discovered that the bucket belonged to Ernst & Young, and that it contained secrets.
-
2025-10-28: Tata Motors leaks AWS keys granting access to 70TB of data
Eaton Zveare discovered two AWS keys in the frontend TypeScript code for a parts ordering website of Tata Motors. The keys granted access to hundreds of S3 buckets containing 70TB of sensitive information, including customer lists and invoices.
-
2025-07-31: Finding secrets with a combination of regex and a fine-tuned BERT-based model
SpecterOps shares details of DeepPass2, their second iteration of a system designed to detect hardcoded secrets, particularly freeform passwords. DeepPass2 combines regex-based secrets detection from Nosey Parker with a custom fine-tuned BERT-based language model (xlm-RoBERTa-base).
-
2025-07-29: The TP-Link Archer C50 Router Firmware Includes Hard-Coded Encryption Key
The end-of-life TP-Link Archer C50 router firmware embeds a key used for encrypting its configuration files. This hardcoded secret would allow an attacker access to admin credentials and wifi passwords.
-
2025-07-14: xAI API Key Leaked on GitHub
An API key allowing access to at least 52 xAI LLMs was exposed in a public GitHub repository. The person responsible, an employee at DOGE, had been granted access to sensitive databases at the U.S. Social Security Administration, the Treasury and Justice departments, and the Department of Homeland Security.
-
2025-06-10: Lean and Mean: How We Fine-Tuned a Small Language Model for Secret Detection in Code
Erez Harush and Daniel Lazarev of Wiz give details of a "small" language model (Llama 3.2 1B) that they fine-tuned for generic secrets detection. They note that secret detection using big hosted LLMs would be time- and cost-prohibitive at realistic scale. They share some details of their training data preparation: they used public files from GitHub, eliminated near duplicates, and used an ensemble of LLMs to detect and label likely secrets from within those files. Their end result is a smaller secrets detection model that can run in a few seconds per file on a regular CPU, and scores >80% on both precision and recall.
-
2025-04-22: How I made $64k from deleted files — a bug bounty story
Sharon Brizinov details how he ran a secrets detection campaign at scale against numerous bug bounty programs, netting $64k. His insight was to scan files in Git history that had been deleted, and were not readily accessible unless you went out of your way looking for them.
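A minimal sketch of the general idea, assuming a local clone and the standard `git` command-line tool: list files that some commit deleted, then read each one from the parent of the deleting commit. This is not the author's tooling; the function names are hypothetical, and the parser assumes simple paths that Git prints unquoted.

```python
import re
import subprocess

# Full SHA-1 (40 hex) or SHA-256 (64 hex) commit IDs.
HASH_RE = re.compile(r"[0-9a-f]{40}|[0-9a-f]{64}")


def parse_deletions(log_output):
    """Parse `git log --diff-filter=D --name-only --pretty=format:%H` output
    into (deleting_commit, path) pairs."""
    pairs = []
    commit = None
    for line in log_output.splitlines():
        if not line:
            continue
        if HASH_RE.fullmatch(line):
            commit = line            # a new commit header
        elif commit is not None:
            pairs.append((commit, line))  # a path deleted by that commit
    return pairs


def list_deleted_files(repo):
    """Return (commit, path) pairs for every file deletion reachable from any ref."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--all", "--diff-filter=D",
         "--name-only", "--pretty=format:%H"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_deletions(out)


def read_deleted_file(repo, commit, path):
    """Read the file as it existed just before `commit` deleted it."""
    # The file is gone in `commit` itself, so look it up in the first parent.
    return subprocess.run(
        ["git", "-C", repo, "show", f"{commit}^:{path}"],
        check=True, capture_output=True,
    ).stdout
```

The bytes returned by `read_deleted_file` can then be passed to whatever secret scanner is in use.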
-
2024-10-29: Partial passwords for Colorado voting systems accidentally exposed in a spreadsheet on the Department of State website.
-
2024-10-20: Internet Archive compromised after hackers discover hardcoded Zendesk credentials in GitLab repositories.
A hacker found an exposed GitLab configuration file that included Git credentials, allowing them to access non-public source code, which included additional credentials. These credentials included Zendesk support system credentials that exposed more than 800k support tickets. This is the second time in October that the Internet Archive was hacked.
-
Idan Ben Ari at Cybernari uses canary tokens to investigate how quickly tokens are detected and used when placed in various public places. The places include GitHub, GitLab, Bitbucket, DockerHub, HTTP servers, FTP servers, Pastebin, JSFiddle, PyPI, npm, S3, and GCS. Half the tokens were compromised, sometimes within seconds or minutes of publishing, though some places averaged days before compromise.
-
The token was found within a Python bytecode file (a .pyc file) and did not correspond to what was included in source code. More details are shared on the PyPI blog.
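Compiled bytecode can keep string constants that the current source no longer contains, so scanning only .py files can miss them. The sketch below shows one way to pull string constants out of a .pyc file with the standard library. It assumes the CPython 3.7+ layout (a 16-byte header followed by a marshalled code object) and that the file was produced by the same Python version that runs the script. The function names are hypothetical.

```python
import marshal
import types


def iter_string_constants(code):
    """Yield every string constant in a code object, including nested functions."""
    for const in code.co_consts:
        if isinstance(const, str):
            yield const
        elif isinstance(const, bytes):
            # Decode byte constants losslessly so they can be scanned as text.
            yield const.decode("latin-1")
        elif isinstance(const, types.CodeType):
            # Functions, classes, and comprehensions are nested code objects.
            yield from iter_string_constants(const)
        elif isinstance(const, (tuple, frozenset)):
            # Literal tuples and sets are stored as composite constants.
            for item in const:
                if isinstance(item, str):
                    yield item


def strings_in_pyc(data):
    """Return the string constants stored in the raw bytes of a .pyc file."""
    # Skip the 16-byte header (magic number, flags, and source mtime/size or hash).
    code = marshal.loads(data[16:])
    return list(iter_string_constants(code))
```

Only load bytecode from sources you are prepared to parse: the `marshal` module is not designed to be safe against malicious input. The resulting strings can be fed to the same rules used for source files.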
-
2024-06-23: Phantom Secrets: Undetected Secrets Expose Major Corporations
Yakir Kadkoda and Ilay Goldman of Aqua Security investigate how to find additional content in Git repositories that is missed when you do a simple git clone. As part of this research they demonstrate several cases of real secrets that had previously been missed and that allowed unintended access to several systems.
-
2024-04-15: 50k smart locks from Chirp Systems vulnerable to remote unlock using hardcoded credentials
-
2024-04-14: Finding secrets in "lost" commits on GitLab
Richard Finlay Tweed discovered that by viewing GitLab's Events API, you can discover "lost" commit hashes from force push events, and with that, recover the content.
A code implementation of the idea is available as a project on GitHub.
-
2024-04-09: Microsoft exposes passwords in inadvertently exposed Azure storage bucket
-
2024-03-24: Ultimate guide to secrets in Lambda
AJ Stuyvenberg shares a detailed guide on various approaches to dealing with secrets in AWS Lambda.
-
2024-03-19: Misconfigured Firebase instances leak 19 million plaintext passwords
-
2024-03-07: Private links discovered to be publicly exposed in multiple URL-scanning services
-
2024-02-29: GitHub enables by default the blocking of pushes containing hardcoded credentials in public repositories
-
2024-02-29: GitLab releases an open-source tool to detect secrets exposed in videos
The tool, available on GitLab, runs an OCR system on each frame of a video, then performs approximate regex matching to detect secrets while accounting for errors in the OCR process.
-
2024-02-21: TruffleHog now avoids triggering known AWS Canary Tokens when performing credential validation
-
2024-02-17: BMW exposes secret access keys and internal data via Azure bucket misconfiguration.
-
2024-01-30: Twilio Segment integrates with secret scanning from GitHub, GitLab, and others
-
2024-01-26: Mercedes-Benz inadvertently exposes a privileged GitHub Enterprise token in a public GitHub repository
This was also discussed on Hacker News.
-
2024-01-17: Microsoft credentials for Toyota Tsusho Insurance Broker India leaked via insurance calculator website
A security researcher at Eaton found exposed Microsoft credentials for Toyota Tsusho Insurance Broker India on an insurance calculator website for Eicher Motors. The insurance calculator website allowed anyone to have it send an email. When this functionality was exploited, the HTTP response included server error log messages that exposed additional credentials. The researcher was able to access the sending email account, and found all previous emails that had been sent, comprising more than 25GiB of sensitive data.
-
2023-12-04: Cyber Av3ngers gang hacks industrial controllers across multiple US states
An Iranian threat group hacked into a Pennsylvania water authority pump station controller, Pittsburgh’s Full Pint Beer brewery, four other utilities, and a public aquarium. The targets involved Unitronics PLCs, which were compromised using the default port (20256) and credentials (1111).
-
2023-12-04: Bar Lanyado from Lasso Security found more than 1500 valid exposed HuggingFace API tokens using the search functionality on GitHub and HuggingFace
Some of the tokens had write permission, including for high-profile models such as Bloom, Meta-Llama, and Pythia. These leaked credentials could be used for several supply chain attacks, including backdooring models, poisoning training data, and stealing confidential models and data. This was additionally covered here.
-
2023-12-04: GitHub now scans discussions for secrets
This is an extension of their earlier changes to scan issues for secrets.
-
2023-11-21: The Ticking Supply Chain Attack Bomb of Exposed Kubernetes Secrets
Yakir Kadkoda and Assaf Morag of Aqua Security describe finding numerous valid credentials for Kubernetes clusters in public GitHub repositories. Such secrets go undetected by other tools, probably because they are base64-encoded in YAML files by default.
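One way a scanner could account for this is to decode base64-looking tokens and run its rules on the decoded text as well. The sketch below is an assumption about how that might look, not a description of Aqua's method or of any existing tool. The rule, function names, and manifest are hypothetical, and the key is the AWS documentation placeholder.

```python
import base64
import binascii
import re

# Runs of base64 alphabet characters long enough to be worth decoding.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
# Example rule only: AWS-style access key IDs.
SECRET_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")


def decoded_candidates(text):
    """Yield the UTF-8 text of every token that decodes cleanly as base64."""
    for match in B64_CANDIDATE.finditer(text):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            yield raw.decode("utf-8")
        except (binascii.Error, ValueError):
            # Not valid base64, or not text once decoded; skip it.
            continue


def scan(text):
    """Apply the rule to the raw text and to each decoded base64 candidate."""
    findings = SECRET_PATTERN.findall(text)
    for decoded in decoded_candidates(text):
        findings.extend(SECRET_PATTERN.findall(decoded))
    return findings


# A hypothetical Kubernetes Secret manifest with a base64-encoded value.
encoded = base64.b64encode(b"AKIAIOSFODNN7EXAMPLE").decode("ascii")
manifest = (
    "apiVersion: v1\n"
    "kind: Secret\n"
    "metadata:\n"
    "  name: demo\n"
    "data:\n"
    f"  access_key: {encoded}\n"
)

print(SECRET_PATTERN.findall(manifest))  # rule applied to the raw file only
print(scan(manifest))                    # rule applied after decoding as well
```

Decoding every candidate adds work and can raise the false positive rate, so a real scanner would likely restrict this to files that look like Kubernetes manifests.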
-
2023-11-14: All the Small Things: Azure CLI Leakage and Problematic Usage Patterns
Aviad Hahami of Palo Alto Networks writes up some simple but high-impact secret exposure vulnerabilities in GitHub Actions when using the azure/login action. That action would leak the environment to the build log in certain cases, making secrets visible to anyone who looked at the log.
-
2023-11-13: Uncovering thousands of unique secrets in PyPI packages
Tom Forbes follows up on his earlier independent research on finding AWS keys in PyPI packages, this time working with GitGuardian, and finds 768 provably live secrets.
-
2023-11-08: GitHub offers an LLM-powered secrets scanner with its Advanced Security package, currently in beta
-
2023-11-08: GitHub offers a generative AI-powered regex generator with its Advanced Security package, currently in beta
-
2023-10-26: GitHub scans NPM packages for exposed secrets
-
2023-08-17: PyPI is a GitHub Secrets Scanning partner, and automatically revokes exposed PyPI tokens that are found
In 15 months, PyPI has revoked over 500 exposed tokens.
-
2023-01-06: I scanned every package on PyPi and found 57 live AWS keys
Tom Forbes scanned every PyPI package for AWS keys, tested each one for validity, and found 57 live keys. Cleverly, the automation that he wrote runs in GitHub Actions, and automatically commits any valid keypair it finds to a public GitHub repository, causing the AWS/GitHub secret scanning integration to invalidate those credentials. This is a bold move, as it could break live applications. He released the application that runs in GitHub Actions under the MIT license. Additionally, he released another repository that has the contents of the PyPI JSON API for all packages, updated every 12 hours. This could be a useful launch point for other people investigating PyPI in bulk.