Author

Logan Cannan

Date of Award

2025

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Engineering

Committee Chair

Tommy Morris

Committee Member

Rhonda Gaede

Committee Member

David Coe

Committee Member

Tathagata Mukherjee

Committee Member

Leon Jololian

Committee Member

Bramwell Brizendine

Research Advisor

Tommy Morris

Subject(s)

Binary principle (Linguistics), Machine learning, Malware (Computer software)--Classification, Natural language processing (Computer science)

Abstract

Machine-learning-assisted binary analysis is an area of great interest in cybersecurity research. Training accurate machine learning models requires methods of binary lifting, which can require binaries to be translated through an intermediate language representation. This dissertation postulates that different intermediate language representations change the performance characteristics of these machine learning models. It takes published machine learning frameworks as controls, modifies their input methodology to include different intermediate language representation transforms, and performs direct comparisons of model performance to ascertain the bias in performance caused by different intermediate languages. This research enables machine learning engineers focused on binary analysis tasks to build models knowing how the characteristics of intermediate languages may bias output performance.
