
Unbundle-Rewrite-Rebundle: Runtime Detection and Rewriting of Privacy-Harming Code in JavaScript Bundles

Published 1 May 2024 in cs.CR | (2405.00596v3)

Abstract: This work presents Unbundle-Rewrite-Rebundle (URR), a system for detecting privacy-harming portions of bundled JavaScript code and rewriting that code at runtime to remove the privacy-harming behavior without breaking the surrounding code or the overall application. URR is a novel solution to the problem of JavaScript bundles, where websites pre-compile multiple code units into a single file, making it impossible for content filters and ad-blockers to differentiate between desired and unwanted resources. Where traditional content filtering tools rely on URLs, URR analyzes code at the abstract syntax tree (AST) level and replaces harmful AST sub-trees with alternatives that preserve both privacy and functionality. We present an open-sourced implementation of URR as a Firefox extension and evaluate it against JavaScript bundles generated by the most popular bundling system (Webpack) deployed on the Tranco 10k. We measure URR's precision (1.00), recall (0.95), and speed (0.43s per script) when detecting and rewriting three representative privacy-harming libraries often included in JavaScript bundles, and find URR to be an effective approach to a large and growing blind spot unaddressed by current privacy tools.
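
To make the idea of AST-level matching and replacement concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy ESTree-like tree built from plain dictionaries, a hypothetical `structural_hash` that fingerprints a sub-tree by node types and shape only (so that renamed identifiers in a minified bundle still match), and a hypothetical `rewrite` that swaps matching sub-trees for a stub. The helper names, the hashing scheme, and the example functions are all assumptions made for illustration.

```python
import hashlib

def structural_hash(node):
    """Fingerprint a sub-tree by node types and shape, ignoring names and literal values."""
    if isinstance(node, list):
        text = "[" + ",".join(structural_hash(child) for child in node) + "]"
    elif isinstance(node, dict):
        # Only child nodes (dicts/lists) contribute; scalar fields such as "name" are skipped.
        fields = [
            key + ":" + structural_hash(value)
            for key, value in sorted(node.items())
            if isinstance(value, (dict, list))
        ]
        text = node.get("type", "?") + "{" + ";".join(fields) + "}"
    else:
        text = "leaf"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def rewrite(node, replacements):
    """Return a copy of the tree where sub-trees with a known fingerprint are replaced."""
    if isinstance(node, list):
        return [rewrite(child, replacements) for child in node]
    if not isinstance(node, dict):
        return node
    make_stub = replacements.get(structural_hash(node))
    if make_stub is not None:
        return make_stub(node)
    return {key: rewrite(value, replacements) for key, value in node.items()}

# Small builders for the toy tree (hypothetical helpers, ESTree-style field names).
def ident(name):
    return {"type": "Identifier", "name": name}

def func(name, param, statements):
    return {"type": "FunctionDeclaration", "id": ident(name), "params": [ident(param)],
            "body": {"type": "BlockStatement", "body": statements}}

def call_stmt(callee, arg):
    return {"type": "ExpressionStatement",
            "expression": {"type": "CallExpression", "callee": ident(callee),
                           "arguments": [ident(arg)]}}

def return_stmt(arg):
    return {"type": "ReturnStatement", "argument": ident(arg)}

def strip_body(node):
    """Stub: keep the function's name and parameters but empty its body."""
    return {**node, "body": {"type": "BlockStatement", "body": []}}

# Known unwanted function, as it might look in its unminified library source.
known = func("track", "data", [call_stmt("sendBeacon", "data")])

# A "bundle" holding a benign helper and the same function after identifier renaming.
bundle = {"type": "Program", "body": [
    func("h", "x", [return_stmt("x")]),
    func("t", "a", [call_stmt("s", "a")]),
]}

rewritten = rewrite(bundle, {structural_hash(known): strip_body})
```

In this sketch the stub preserves the function's (minified) name and parameters so that callers elsewhere in the bundle still resolve, which mirrors the stated goal of removing unwanted behavior without breaking the surrounding code. A real system would need fingerprints that tolerate more aggressive minifier transformations than simple renaming.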
