Tools to find similarity of various common procedures across compilers and versions

Researcher:

Categories:

Computer Science & Electrical Engineering

The Technology

We address the problem of finding similar procedures in stripped binaries (machine-language executables with symbol information removed). Previous solutions cannot find similarity between binaries compiled with different compilers (a compiler translates programmer-written high-level source code into machine code), or between binaries that differ because of code patching. Previous approaches that do apply to this problem yield low accuracy and a high number of false matches (i.e., two procedures are flagged as similar although they are not). This is mostly due to their syntactic approach (i.e., looking at the code's form and not its meaning).

The novel technology is a computer-implemented method for estimating the similarity of binary records comprising executable code. The method converts a first binary record and a second binary record to a first intermediate representation (IR) and a second IR, respectively. It decomposes each IR into a plurality of strands, which are partial dependent chains of program instructions. For each strand of the first IR, it calculates a probability score of having an equivalent counterpart in the second IR by comparing that strand to one or more strands of the second IR. Each probability score is then adjusted according to a significance value calculated for the strand. Finally, the adjusted probability scores are aggregated into a similarity score that defines the functional similarity between the first IR and the second IR.
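The pipeline described above (decompose into strands, score each strand, weight by significance, aggregate) can be illustrated with a minimal Python sketch. It is only a toy under stated assumptions, not the actual implementation. The `Instr` tuple is a hypothetical stand-in for a real IR. Strand equivalence is approximated by comparing strands after variable renaming, which is a simple syntactic check and not a true semantic comparison. The `background` frequency table and the `default_freq` value are invented for illustration.

```python
from collections import namedtuple
from math import log

# Hypothetical toy IR: `dest` is assigned the result of `op` applied to `args`.
# Variables are strings; constants are ints.
Instr = namedtuple("Instr", "dest op args")

def extract_strands(block):
    """Split a basic block into strands by backward slicing on data dependencies."""
    unused = set(range(len(block)))
    strands = []
    while unused:
        last = max(unused)
        needed = set(block[last].args)  # values this strand still has to define
        strand = [block[last]]
        unused.discard(last)
        for i in range(last - 1, -1, -1):
            if block[i].dest in needed:
                strand.append(block[i])
                needed.discard(block[i].dest)
                needed |= set(block[i].args)
                unused.discard(i)
        strands.append(list(reversed(strand)))
    return strands

def canonical(strand):
    """Rename variables in order of first appearance so register names do not matter."""
    names = {}
    def rename(x):
        return names.setdefault(x, f"v{len(names)}") if isinstance(x, str) else x
    return tuple(
        (ins.op, tuple(rename(a) for a in ins.args), rename(ins.dest))
        for ins in strand
    )

def similarity(query, target, background, default_freq=0.01):
    """Aggregate significance-weighted strand match scores into a value in [0, 1].

    Assumption: `background` maps a canonical strand to how often it appears in
    arbitrary code. Common strands get a low weight, rare ones a high weight.
    """
    target_forms = {canonical(s) for s in extract_strands(target)}
    total = matched = 0.0
    for strand in extract_strands(query):
        form = canonical(strand)
        p = 1.0 if form in target_forms else 0.0  # toy stand-in for a probability
        weight = -log(background.get(form, default_freq))
        total += weight
        matched += p * weight
    return matched / total if total else 0.0

# Same computation with different register names and instruction order.
query = [Instr("t1", "add", ("a", 1)), Instr("t2", "mul", ("t1", 2)),
         Instr("r", "load", ("p",))]
target = [Instr("x", "load", ("q",)), Instr("y", "add", ("b", 1)),
          Instr("z", "mul", ("y", 2))]
print(similarity(query, target, background={}))  # prints 1.0
```

In this sketch a strand that matches but is very common (for example, a bare load) contributes little to the final score, while a rare matching strand contributes a lot. That weighting is the role the significance value plays in the description above.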

Advantages

Higher accuracy as compared to existing solutions

Applications and Opportunities

Finding vulnerable code in binaries of unknown origins
Finding code clones to allow for code re-use
Finding plagiarism in source code

Business Development Contacts

Dr. Arkadiy Morgenshtein

Director of Business Development, ICT
