Papers
Topics
Authors
Recent
Search
2000 character limit reached

Reverse Engineering Structure and Semantics of Input of a Binary Executable

Published 22 May 2024 in cs.CR | (2405.14052v1)

Abstract: Knowledge of the input format of binary executables is important for finding bugs and vulnerabilities, such as generating data for fuzzing or manual reverse engineering. This paper presents an algorithm to recover the structure and semantic relations between fields of the input of binary executables using dynamic taint analysis. The algorithm improves upon prior work by not just partitioning the input into consecutive bytes representing values but also identifying syntactic components of structures, such as atomic fields of fixed and variable lengths, and different types of arrays, such as arrays of atomic fields, arrays of records, and arrays with variant records. It also infers the semantic relations between fields of a structure, such as count fields that specify the count of an array of records or offset fields that specify the start location of a variable-length field within the input data. The algorithm constructs a C/C++-like structure to represent the syntactic components and semantic relations. The algorithm was implemented in a prototype system named ByteRI 2.0. The system was evaluated using a controlled experiment with synthetic subject programs and real-world programs. The subject programs were created to accept a variety of input formats that mimic syntactic components and selected semantic relations found in conventional data formats, such as PE, PNG, ZIP, and CSV. The results show that ByteRI 2.0 correctly identifies the syntactic elements and their grammatical structure, as well as the semantic relations between the fields for both synthetic subject programs and real-world programs. The recovered structures, when used as a generator, produced valid data that was acceptable for all the synthetic subject programs and some of the real-world programs.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 1 tweet with 0 likes about this paper.