Notes on Secrets

This is a list of two things: (1) security incidents that involved exposed secrets, and (2) security research on secret detection.

2025-11-19: Secret detectors miss many real secrets due to false positive mitigations

Lewis Ardern from Semgrep details a number of cases where the false positive mitigations that secret detectors use cause them to miss live secrets. In particular, regular expressions that require nearby keywords or non-word-character anchors are significant culprits. Many of the secrets they found were exposed in public GitHub repositories, and none of the popular tools they tried detected them, including GitHub's own secret scanning.
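
A minimal sketch of that failure mode, not taken from the article: the token format (`tok_` plus 20 alphanumerics), the keyword list, and the sample strings are all hypothetical. It compares a "strict" pattern that demands a nearby keyword and word-boundary anchors against a "loose" pattern that only checks the token shape.

```python
import re

# Hypothetical token shape, for illustration only: "tok_" + 20 alphanumerics.
TOKEN = r"tok_[A-Za-z0-9]{20}"

# Strict: needs a keyword shortly before the token and word boundaries around it.
STRICT = re.compile(r"(?i)(?:secret|token|key)\W{1,5}\b(" + TOKEN + r")\b")

# Loose: matches the token shape wherever it appears.
LOOSE = re.compile(TOKEN)


def find(pattern, text):
    """Return the matched tokens (group 1 if the pattern has one, else group 0)."""
    return [m.group(m.lastindex or 0) for m in pattern.finditer(text)]


secret = "tok_a1B2c3D4e5F6g7H8i9J0"

with_keyword = f'token = "{secret}"'                       # keyword + clean boundaries
no_keyword = f"https://example.test/login?auth={secret}"   # no keyword nearby
glued = f"key=prefix_{secret}_v2"                          # no word boundary around token

for sample in (with_keyword, no_keyword, glued):
    print(find(STRICT, sample), find(LOOSE, sample))
```

The strict pattern only reports the first sample; the loose one reports all three, at the cost of more false positives on real data.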
2025-10-29: Ernst & Young accidentally expose secrets in 4TB database backup file

Researchers at Neo Security came across a 4TB SQL Server backup file in an Azure storage bucket. Through some investigation they discovered that the bucket belonged to Ernst & Young, and that it contained secrets.
2025-10-28: Tata Motors leaks AWS keys granting access to 70TB of data

Eaton Zveare discovered two AWS keys in the frontend TypeScript code for a parts ordering website of Tata Motors. The keys granted access to hundreds of S3 buckets containing 70TB of sensitive information, including customer lists and invoices.
2025-07-31: Finding secrets with a combination of regex and a fine-tuned BERT-based model

SpecterOps shares details of DeepPass2, their second iteration of a system designed to detect hardcoded secrets, particularly freeform passwords. DeepPass2 combines regex-based secrets detection from Nosey Parker with a custom fine-tuned BERT-based language model (xlm-RoBERTa-base).
2025-07-29: The TP-Link Archer C50 Router Firmware Includes Hard-Coded Encryption Key

The end-of-life TP-Link Archer C50 router firmware embeds a key used for encrypting its configuration files. This hardcoded secret would allow an attacker access to admin credentials and wifi passwords.
2025-07-14: xAI API Key Leaked on GitHub

An API key allowing access to at least 52 xAI LLMs was exposed in a public GitHub repository. The person responsible, an employee at DOGE, had been granted access to sensitive databases at the U.S. Social Security Administration, the Treasury and Justice departments, and the Department of Homeland Security.
2025-06-10: Lean and Mean: How We Fine-Tuned a Small Language Model for Secret Detection in Code

Erez Harush and Daniel Lazarev of Wiz give details of a "small" language model (Llama 3.2 1B) that they fine-tuned for generic secrets detection. They note that secret detection using big hosted LLMs would be time- and cost-prohibitive at realistic scale. They share some details of their training data preparation: they used public files from GitHub, eliminated near duplicates, and used an ensemble of LLMs to detect and label likely secrets from within those files. Their end result is a smaller secrets detection model that can run in a few seconds per file on a regular CPU, and scores >80% on both precision and recall.
2025-04-22: How I made $64k from deleted files — a bug bounty story

Sharon Brizinov details how he ran a secrets detection campaign at scale against numerous bug bounty programs, netting $64k. His insight was to scan files in Git history that had been deleted, and were not readily accessible unless you went out of your way looking for them.
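
A rough sketch of the general idea, not the author's tooling: list files that some commit deleted, then read each one from the parent of the deleting commit. The function names are hypothetical, and the parser assumes plain paths (git quotes unusual characters in paths, which this sketch does not handle).

```python
import subprocess


def parse_deleted_paths(log_output):
    """Parse `git log --diff-filter=D --name-only --pretty=format:'commit %H'`
    output into (commit, path) pairs."""
    pairs = []
    commit = None
    for line in log_output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("commit "):
            commit = line.split(" ", 1)[1]
        elif commit is not None:
            pairs.append((commit, line))
    return pairs


def deleted_file_versions(repo_dir):
    """Yield (commit, path, content) for every file deleted anywhere in history."""
    log = subprocess.run(
        ["git", "-C", repo_dir, "log", "--all", "--diff-filter=D",
         "--name-only", "--pretty=format:commit %H"],
        capture_output=True, text=True, check=True,
    ).stdout
    for commit, path in parse_deleted_paths(log):
        # The file still exists in the parent of the commit that deleted it.
        blob = subprocess.run(
            ["git", "-C", repo_dir, "show", f"{commit}^:{path}"],
            capture_output=True,
        )
        if blob.returncode == 0:
            yield commit, path, blob.stdout
```

Each yielded `content` can then be handed to whatever secret detector is in use.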
2024-10-29: Partial passwords for Colorado voting systems accidentally exposed in a spreadsheet on the Department of State website.
2024-10-20: Internet Archive compromised after hackers discover hardcoded Zendesk credentials in GitLab repositories.

A hacker found an exposed GitLab configuration file that included Git credentials, allowing them to access non-public source code, which included additional credentials. These credentials included Zendesk support system credentials that exposed more than 800k support tickets. This is the second time in October that the Internet Archive was hacked.
2024-08-15: Palo Alto Networks' Unit 42 researchers discover extortion campaign that gains initial access to victim networks by finding exposed secrets in accidentally exposed .env files.
2024-08-08: What's the worst place to leave your secrets? Research into what happens to AWS credentials that are left in public places.

Idan Ben Ari at Cybernari used canary tokens to investigate how quickly tokens are detected and used when placed in various public places. The places include GitHub, GitLab, Bitbucket, DockerHub, HTTP servers, FTP servers, Pastebin, JSFiddle, PyPI, npm, S3, and GCS. Half the tokens were compromised, sometimes within seconds or minutes of publishing, though some places averaged days before compromise.
2024-07-24: American Megatrends International leaks its Secure Boot platform key in a public GitHub repository, affecting over 900 models.
2024-07-08: Researchers at JFrog find a privileged GitHub access token for the PyPI, Python, and Python Software Foundation organizations.

The token was found within a Python bytecode file (a .pyc file) and did not correspond to what was included in source code. More details are shared on the PyPI blog.
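
To illustrate why compiled artifacts are worth scanning (a sketch, not how JFrog found the token): a string constant survives compilation as raw bytes, so a byte-level regex still finds it. The token below is fake, and the `ghp_` plus 36 alphanumerics shape is an assumption about classic GitHub personal access tokens.

```python
import marshal
import re

# Assumed shape of a classic GitHub personal access token.
GITHUB_PAT = re.compile(rb"ghp_[A-Za-z0-9]{36}")


def scan_bytes(data):
    """Return token-shaped strings found in raw bytes."""
    return [m.group(0).decode("ascii") for m in GITHUB_PAT.finditer(data)]


# Fake token, for demonstration only.
fake_token = "ghp_" + "A1b2" * 9

# Compile a module that embeds the token. A .pyc file is a short header
# followed by the marshalled code object, so this mimics its payload.
code = compile(f"TOKEN = {fake_token!r}\n", "config.py", "exec")
pyc_payload = marshal.dumps(code)

print(scan_bytes(pyc_payload))
```

The same `scan_bytes` call works on the bytes of a real `.pyc` file read with `open(path, "rb")`.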
2024-06-23: Phantom Secrets: Undetected Secrets Expose Major Corporations

Yakir Kadkoda and Ilay Goldman of Aqua Security investigate how to find additional content in Git repositories that is missed when you do a simple git clone. As part of this research they demonstrate several cases of real secrets that they found that had been missed previously, which allowed unintended access to several systems.
2024-04-15: 50k smart locks from Chirp Systems vulnerable to remote unlock using hardcoded credentials
2024-04-14: Finding secrets in "lost" commits on GitLab

Richard Finlay Tweed discovered that by viewing GitLab's Events API, you can discover "lost" commit hashes from force push events, and with that, recover the content.

A code implementation of the idea is available as a project on GitHub.
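
As a loose sketch of the first step only (this is not the linked project's code): given event objects already fetched from the Events API and parsed from JSON, collect every commit hash mentioned in push data. The `push_data`, `commit_from`, and `commit_to` field names are assumptions about the response shape and should be checked against the GitLab API documentation.

```python
def candidate_commit_hashes(events):
    """Collect unique commit hashes mentioned in push events, in first-seen order.

    `events` is a list of dicts parsed from the API response. The field names
    used here are assumed, not confirmed by the source.
    """
    seen = []
    for event in events:
        push = event.get("push_data") or {}
        for field in ("commit_from", "commit_to"):
            sha = push.get(field)
            if sha and sha not in seen:
                seen.append(sha)
    return seen
```

Hashes from this list that are no longer reachable from any branch or tag are the "lost" commits worth fetching and scanning.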
2024-04-09: Microsoft exposes passwords in inadvertently exposed Azure storage bucket
2024-04-01: Samsung inadvertently exposes a privileged GitHub Enterprise token in a Jenkins instance that was unintentionally exposed to the Internet
2024-03-24: Ultimate guide to secrets in Lambda

AJ Stuyvenberg shares a detailed guide on various approaches to dealing with secrets in AWS Lambda.
2024-03-19: Misconfigured Firebase instances leak 19 million plaintext passwords
2024-03-07: Private links discovered to be publicly exposed in multiple URL-scanning services
2024-02-29: GitHub enables by default the blocking of pushes containing hardcoded credentials in public repositories
2024-02-29: GitLab releases an open-source tool to detect secrets exposed in videos

The tool, available on GitLab, runs an OCR system on each frame of a video, then performs approximate regex matching to detect secrets while accounting for errors in the OCR process.
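
One simple way to tolerate OCR errors, which is not necessarily what the GitLab tool does: widen each literal character of a known prefix into a class of look-alike characters. The confusable groups and the `glpat-` plus 20 characters token shape are assumptions for illustration, and the token is fake.

```python
import re

# Illustrative groups of characters that OCR tends to confuse (an assumption).
CONFUSABLE = ["O0", "Il1", "S5", "B8", "Z2"]


def ocr_tolerant(literal):
    """Turn a literal string into a regex fragment accepting look-alike characters."""
    parts = []
    for ch in literal:
        group = next((g for g in CONFUSABLE if ch in g), None)
        parts.append("[" + group + "]" if group else re.escape(ch))
    return "".join(parts)


BODY = r"[A-Za-z0-9_\-]{20}"
EXACT = re.compile("glpat-" + BODY)
TOLERANT = re.compile(ocr_tolerant("glpat-") + BODY)

# Simulated OCR output in which the "l" of the prefix was read as "1".
ocr_text = "export TOKEN=g1pat-Ab3dEf6hIj9kLm2nOp5q"

print(EXACT.findall(ocr_text))     # the exact pattern finds nothing
print(TOLERANT.findall(ocr_text))  # the widened pattern finds the garbled token
```

A true approximate matcher (edit-distance based) would also handle dropped or inserted characters, which this substitution-only sketch does not.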
2024-02-21: TruffleHog now avoids triggering known AWS Canary Tokens when performing credential validation
2024-02-17: BMW exposes secret access keys and internal data via Azure bucket misconfiguration.
2024-02-01: Football Australia hardcodes AWS keypairs into its HTML source, granting access to 127 S3 buckets containing PII, source code, and other non-public data
2024-01-30: Twilio Segment integrates with secret scanning from GitHub, GitLab, and others
2024-01-26: Mercedes-Benz inadvertently exposes a privileged GitHub Enterprise token in a public GitHub repository

This was also discussed on Hacker News.
2024-01-17: Microsoft credentials for Toyota Tsusho Insurance Broker India leaked via insurance calculator website

A security researcher at Eaton found exposed Microsoft credentials for Toyota Tsusho Insurance Broker India on an insurance calculator website for Eicher Motors. The insurance calculator website allowed anyone to have it send an email. When this functionality was exploited, the HTTP response included server error log messages that exposed additional credentials. The researcher was able to access the sending email account, and found all previous emails that had been sent, comprising more than 25GiB of sensitive data.
2023-12-04: Cyber Av3ngers gang hacks industrial controllers across multiple US states

An Iranian threat group hacked into a Pennsylvanian water authority pump station controller, Pittsburgh’s Full Pint Beer brewery, 4 other utilities, and a public aquarium. The targets involved Unitronics PLCs, which were compromised using the default port (20256) and credentials (1111).
2023-12-04: Bar Lanyado from Lasso Security found more than 1500 valid exposed HuggingFace API tokens using the search functionality on GitHub and HuggingFace

Some of the tokens had write permission, including for high-profile models such as Bloom, Meta-Llama, and Pythia. These leaked credentials could be used for several supply chain attacks, including backdooring models, poisoning training data, and stealing confidential models and data. This was additionally covered here.
2023-12-04: GitHub now scans discussions for secrets

This is an extension of their earlier changes to scan issues for secrets.
2023-11-21: The Ticking Supply Chain Attack Bomb of Exposed Kubernetes Secrets

Yakir Kadkoda and Assaf Morag of Aqua Security describe finding numerous valid credentials for Kubernetes clusters in public GitHub repositories. Such secrets go undetected by other tools, probably because they are base64-encoded in YAML files by default.
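
A small sketch of why base64 encoding hides secrets from pattern-based tools, and one way around it: decode candidate values before matching. The manifest is made up, the key is AWS's documented example access key ID, and the line-based parsing is a simplification that assumes flat `key: value` lines rather than full YAML.

```python
import base64
import binascii
import re

AWS_KEY_ID = re.compile(r"AKIA[A-Z0-9]{16}")

# Flat "key: value" lines whose value looks like base64 (a simplification).
KEY_VALUE = re.compile(
    r"^[ \t]*([\w.\-]+):[ \t]*([A-Za-z0-9+/]{8,}={0,2})[ \t]*$", re.MULTILINE
)


def decoded_values(yaml_text):
    """Yield (key, decoded_text) for values that decode as base64 UTF-8 text."""
    for m in KEY_VALUE.finditer(yaml_text):
        try:
            decoded = base64.b64decode(m.group(2), validate=True).decode("utf-8")
        except (binascii.Error, ValueError):
            continue  # not valid base64, or not text
        yield m.group(1), decoded


encoded = base64.b64encode(b"AKIAIOSFODNN7EXAMPLE").decode("ascii")
manifest = f"""apiVersion: v1
kind: Secret
metadata:
  name: demo
data:
  aws_access_key_id: {encoded}
"""

raw_hits = AWS_KEY_ID.findall(manifest)
decoded_hits = [(k, v) for k, v in decoded_values(manifest) if AWS_KEY_ID.search(v)]
print(raw_hits, decoded_hits)
```

Scanning the raw manifest finds nothing, while scanning the decoded values finds the key.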
2023-11-14: All the Small Things: Azure CLI Leakage and Problematic Usage Patterns

Aviad Hahami of Palo Alto Networks writes up some simple but high-impact secret exposure vulnerabilities in GitHub Actions when using the azure/login action. That action would leak the environment to the build log in certain cases, making secrets visible to anyone who would look at the log.
2023-11-13: Uncovering thousands of unique secrets in PyPI packages

Tom Forbes followed up on his earlier independent research on finding AWS keys in PyPI packages, this time with GitGuardian, and found 768 provably live secrets.
2023-11-08: GitHub offers an LLM-powered secrets scanner with its Advanced Security package, currently in beta
2023-11-08: GitHub offers a generative AI-powered regex generator with its Advanced Security package, currently in beta
2023-10-26: GitHub scans NPM packages for exposed secrets
2023-08-17: PyPI is a GitHub Secrets Scanning partner, and automatically revokes exposed PyPI tokens that are found

In 15 months, PyPI has revoked over 500 exposed tokens.
2023-01-06: I scanned every package on PyPi and found 57 live AWS keys

Tom Forbes scanned every PyPI package for AWS keys, tested each one for validity, and found 57 live keys. Cleverly, the automation that he wrote runs in GitHub Actions, and automatically commits any found valid keypair to a public GitHub repository, causing the AWS/GitHub secret scanning integration to invalidate those credentials. This is a bold move, as it could break live applications. He released the application that runs in GitHub Actions under the MIT license. Additionally, he released another repository that has the contents of the PyPI JSON API for all packages, updated every 12 hours. This could be a useful launch point for other people investigating PyPI in bulk.
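
A minimal sketch of the scanning step alone, not his released application: open a source distribution tarball and look for AWS access key IDs in every file. The helper names and the demo archive are hypothetical, the key is AWS's documented example, and the validity check and repository commit described above are deliberately left out.

```python
import io
import re
import tarfile

# Long-term AWS access key IDs start with "AKIA" and are 20 characters long.
AWS_KEY_ID = re.compile(rb"AKIA[A-Z0-9]{16}")


def scan_sdist(fileobj):
    """Return (member_name, key_id) pairs for key-shaped strings in a tarball."""
    hits = []
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            for m in AWS_KEY_ID.finditer(handle.read()):
                hits.append((member.name, m.group(0).decode("ascii")))
    return hits


def make_demo_sdist():
    """Build a tiny in-memory .tar.gz with one file that embeds the example key."""
    payload = b'AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"\n'
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        info = tarfile.TarInfo("demo-0.1/settings.py")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


print(scan_sdist(make_demo_sdist()))
```

A key ID on its own proves little; as in the original work, a matching secret key and a live validity check are what turn a hit into a finding.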
