Using AI/ML to Find and Remediate Enterprise Secrets in Code & Document Sharing Platforms
Abstract: We introduce a new challenge to the software development community: 1) leveraging AI to accurately detect and flag secrets in code and on document sharing platforms frequently used by developers, such as Confluence, and 2) automatically remediating the detections (e.g., by suggesting password vault functionality). This is a challenging and mostly unaddressed task. Existing methods leverage heuristics and regular expressions, which can be very noisy and therefore increase toil for developers. The next step, modifying the code itself to automatically remediate a detection, is a complex task. We introduce two baseline AI models that have good detection performance and propose an automatic mechanism for remediating secrets found in code, opening up the study of this task to the wider community.
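To make the noise problem with heuristic detection concrete, the following is a minimal sketch of a regular-expression scanner combined with a Shannon-entropy filter. It is not the paper's baseline model or the rule set of any particular tool; the pattern `ASSIGNMENT`, the helper names, and the threshold of 3.0 bits per character are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical rule for illustration only: a secret-looking key name
# followed by a quoted value. Real scanners use far larger rule sets.
ASSIGNMENT = re.compile(
    r"""(?i)(password|passwd|secret|api_key|token)\w*\s*[:=]\s*["']([^"']+)["']"""
)


def shannon_entropy(value: str) -> float:
    """Return the Shannon entropy of `value` in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_candidates(text: str, min_entropy: float = 3.0) -> list[dict]:
    """Return every regex hit, marking whether it passes the entropy filter.

    `min_entropy` is an assumed threshold, not a recommended value.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in ASSIGNMENT.finditer(line):
            value = match.group(2)
            entropy = shannon_entropy(value)
            hits.append(
                {
                    "line": lineno,
                    "key": match.group(1),
                    "value": value,
                    "entropy": entropy,
                    "flagged": entropy >= min_entropy,
                }
            )
    return hits


if __name__ == "__main__":
    sample = 'password = "changeme"\napi_key = "sk_9fK2xQ7LmZ81pRt4VbN6"\n'
    for hit in find_candidates(sample):
        print(hit["line"], hit["key"], round(hit["entropy"], 2), hit["flagged"])
```

In this sketch the regular expression alone matches both lines, while the entropy filter keeps only the random-looking key and drops the low-entropy password. Neither setting is satisfactory on its own: the first reports placeholders and test values, and the second misses weak but real passwords. That trade-off is the kind of noise the abstract attributes to heuristic methods.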