Unbundle-Rewrite-Rebundle: Runtime Detection and Rewriting of Privacy-Harming Code in JavaScript Bundles
Abstract: This work presents Unbundle-Rewrite-Rebundle (URR), a system for detecting privacy-harming portions of bundled JavaScript code and rewriting that code at runtime to remove the privacy-harming behavior without breaking the surrounding code or the overall application. URR is a novel solution to the problem of JavaScript bundles, where websites pre-compile multiple code units into a single file, making it impossible for content filters and ad-blockers to differentiate between desired and unwanted resources. Where traditional content filtering tools rely on URLs, URR analyzes the code at the abstract syntax tree (AST) level and replaces harmful AST sub-trees with alternatives that preserve both privacy and functionality. We present an open-source implementation of URR as a Firefox extension and evaluate it against JavaScript bundles generated by the most popular bundling system (Webpack) and deployed on the Tranco 10k. We evaluate URR by precision (1.00), recall (0.95), and speed (0.43 s per script) when detecting and rewriting three representative privacy-harming libraries often included in JavaScript bundles, and find URR to be an effective approach to a large and growing blind spot unaddressed by current privacy tools.
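The abstract describes detection and replacement at the AST level without giving details. The following is a minimal sketch of that general idea, not URR's actual algorithm. It assumes hand-built ESTree-style dictionaries in place of real parser output, and every name in it (`structural_hash`, `rewrite`, the helper constructors, the `signatures` table) is hypothetical. Sub-trees are hashed by shape only, ignoring identifier names and literal values, so that a renamed (minified) copy of a known function still matches.

```python
import hashlib

# Fields ignored while hashing, so renamed/minified code keeps the same signature.
IGNORED = {"Identifier": {"name"}, "Literal": {"value", "raw"}}

def _sha(text):
    return hashlib.sha256(text.encode()).hexdigest()

def structural_hash(node):
    """Hash an ESTree-like node by its shape (assumed scheme, not URR's)."""
    if isinstance(node, list):
        return _sha("[" + ",".join(structural_hash(c) for c in node) + "]")
    if isinstance(node, dict):
        node_type = node.get("type", "")
        skip = IGNORED.get(node_type, set()) | {"type"}
        parts = [node_type]
        for key in sorted(node):
            if key not in skip:
                parts.append(key + ":" + structural_hash(node[key]))
        return _sha("|".join(parts))
    return _sha(repr(node))  # primitives such as operators or flags

def rewrite(node, signatures):
    """Return a copy of the tree with every matching sub-tree replaced."""
    if isinstance(node, list):
        return [rewrite(c, signatures) for c in node]
    if isinstance(node, dict):
        build_replacement = signatures.get(structural_hash(node))
        if build_replacement is not None:
            return build_replacement(node)
        return {k: rewrite(v, signatures) for k, v in node.items()}
    return node

# Tiny constructors for hand-built AST fragments (illustrative only).
def ident(name):
    return {"type": "Identifier", "name": name}

def literal(value):
    return {"type": "Literal", "value": value}

def member(obj, prop):
    return {"type": "MemberExpression", "object": obj,
            "property": prop, "computed": False}

def func(name, returned):
    body = [{"type": "ReturnStatement", "argument": returned}]
    return {"type": "FunctionDeclaration", "id": ident(name), "params": [],
            "body": {"type": "BlockStatement", "body": body}}

# Reference copy of a "harmful" function, as it might appear unminified.
known = func("getFingerprint", member(ident("navigator"), ident("userAgent")))

# The replacement keeps the bundle's own function name so callers still resolve.
signatures = {
    structural_hash(known): lambda n: func(n["id"]["name"], literal("")),
}

# A toy "bundle": the harmful function under a minified name, plus benign code.
bundle = {"type": "Program", "body": [
    func("a", member(ident("navigator"), ident("userAgent"))),
    func("b", literal(42)),
]}

cleaned = rewrite(bundle, signatures)
```

Because this toy signature ignores all identifier names, any function of the same shape would also match; a practical system would need a more discriminating signature than the one assumed here.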