HDNA: A graph-based change detection in HTML pages (Deface Attack Detection)
Abstract: In this paper, a new approach called HDNA (HTML DNA) is introduced for analyzing and comparing Document Object Model (DOM) trees in order to detect differences in HTML pages. This method assigns an identifier to each HTML page based on its structure, which proves particularly useful for detecting variations caused by server-side updates, user interactions, or potential security risks. The process involves preprocessing the HTML content, generating a DOM tree, and calculating the disparities between two or more trees. By assigning weights to the nodes, valuable insights about their hierarchical importance are obtained. The effectiveness of the HDNA approach has been demonstrated in identifying changes in DOM trees even when dynamically generated content is involved. Not only does this method benefit web developers, testers, and security analysts by offering a deeper understanding of how web pages evolve, but it also helps ensure the functionality and performance of web applications. Additionally, it enables detection of and response to vulnerabilities that may arise from modifications in DOM structures. As the web ecosystem continues to evolve, HDNA proves to be a useful tool for individuals engaged in web development, testing, or security analysis.
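The pipeline summarized in the abstract (parse the HTML, build a tree, weight the nodes, compare trees) can be illustrated with a minimal sketch. This is not the HDNA algorithm itself, whose details the abstract does not give. The names `PathCollector`, `weighted_profile`, and `structural_distance`, the 1/depth weighting, and the normalized distance are all assumptions chosen for illustration, using only Python's standard `html.parser` module.

```python
from collections import Counter
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class PathCollector(HTMLParser):
    """Records the root-to-node tag path of every element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.stack = []   # tags of the currently open elements
        self.paths = []   # one tuple of tags per element seen

    def handle_starttag(self, tag, attrs):
        self.paths.append(tuple(self.stack) + (tag,))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: record the node, open nothing.
        self.paths.append(tuple(self.stack) + (tag,))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; stray closing tags are ignored.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def weighted_profile(html):
    """Map each tag path to a weight. Assumption: a node at depth d weighs 1/d,
    so changes near the root count more than changes deep in the tree."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    profile = Counter()
    for path in collector.paths:
        profile[path] += 1.0 / len(path)
    return profile


def structural_distance(html_a, html_b):
    """Return 0.0 for identical structure, up to 1.0 for no shared tag paths."""
    a, b = weighted_profile(html_a), weighted_profile(html_b)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    diff = sum(abs(a[p] - b[p]) for p in set(a) | set(b))
    return diff / total


original = "<html><body><h1>Hi</h1><p>text</p></body></html>"
same_structure = "<html><body><h1>Bye</h1><p>other</p></body></html>"
defaced = "<html><body><script>x</script><iframe></iframe></body></html>"

print(structural_distance(original, same_structure))
print(structural_distance(original, defaced))
```

In this sketch, text-only edits leave the distance at zero because only element structure is profiled, while replacing elements changes the score. A real monitor would also need a threshold for deciding when a nonzero distance signals defacement rather than a routine update, which this example does not attempt.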