Date of Award
2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Engineering
Committee Chair
Tommy Morris
Committee Member
Rhonda Gaede
Committee Member
David Coe
Committee Member
Tathagata Mukherjee
Committee Member
Leon Jololian
Committee Member
Bramwell Brizendine
Research Advisor
Tommy Morris
Subject(s)
Binary principle (Linguistics), Machine learning, Malware (Computer software)--Classification, Natural language processing (Computer science)
Abstract
Machine-learning-assisted binary analysis is an area of great interest in cybersecurity research. Training accurate machine learning models requires methods of binary lifting, which can require binaries to be translated through an intermediate language representation. This dissertation postulates that different intermediate language representations change the performance characteristics of these machine learning models. It takes published machine learning frameworks as controls, modifies their input methodology to include different intermediate language representation transforms, and directly compares model performance to ascertain the performance bias caused by different intermediate languages. This research enables machine learning engineers focused on binary analysis tasks to build models knowing how the characteristics of intermediate languages may bias output performance.
Recommended Citation
Cannan, Logan, "Lost in translation: the impact of intermediate language representations on machine learning applications" (2025). Dissertations. 465.
https://louis.uah.edu/uah-dissertations/465