
Using AI/ML to Find and Remediate Enterprise Secrets in Code & Document Sharing Platforms

Published 3 Jan 2024 in cs.SE and cs.AI (arXiv:2401.01754v1)

Abstract: We introduce a new challenge to the software development community: 1) leveraging AI to accurately detect and flag secrets in code and on popular document sharing platforms that are frequently used by developers, such as Confluence, and 2) automatically remediating the detections (e.g. by suggesting password vault functionality). This is a challenging and mostly unaddressed task. Existing methods rely on heuristics and regular expressions, which can be very noisy and therefore increase toil for developers. The next step, modifying the code itself to automatically remediate a detection, is a complex task. We introduce two baseline AI models that have good detection performance and propose an automatic mechanism for remediating secrets found in code, opening up the study of this task to the wider community.
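
To make the noise problem with regular-expression detection concrete, here is a minimal Python sketch of a heuristic scanner. It is not the paper's method or any specific tool's rule set: the two patterns, the `find_candidates` helper, and the sample text are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical rules for illustration only; real scanners ship far larger rule sets.
PATTERNS = {
    # Shape of an AWS access key ID: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A secret-looking variable name assigned a quoted string literal.
    "generic_assignment": re.compile(
        r"(?i)\b(password|passwd|secret|api_key|token)\b\s*[:=]\s*['\"]([^'\"]+)['\"]"
    ),
}


def find_candidates(text):
    """Return (line_number, rule_name, matched_text) for every regex hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((lineno, name, match.group(0)))
    return hits


# Assumed sample input: one plausible secret, one placeholder,
# one documentation-style example key, and one safe environment lookup.
SAMPLE = '''\
password = "hunter2-prod-9f8a"
password = "<your password here>"
aws_key = "AKIAIOSFODNN7EXAMPLE"
token = os.environ["API_TOKEN"]
'''

if __name__ == "__main__":
    for lineno, rule, matched in find_candidates(SAMPLE):
        print(f"line {lineno}: {rule}: {matched}")
```

On this sample the scanner reports lines 1, 2, and 3. Only line 1 looks like a real secret; line 2 is a placeholder and line 3 is a documentation-style example value, so two of the three hits would need manual triage. This is the kind of noise that motivates a learned classifier over the regex candidates.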
