Date of Award
2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Science
Committee Chair
Vineetha Menon
Committee Member
Jerome Baudry
Committee Member
Huaming Zhang
Committee Member
Jacob Hauenstein
Committee Member
Jeremy C. Smith
Research Advisor
Vineetha Menon
Subject(s)
Computational biology, Proteins--Conformation, Artificial intelligence--Data processing, Deep learning (Machine learning)
Abstract
Drug development is a lengthy, expensive process with a high failure rate. The urgency brought on by the COVID-19 pandemic accelerated the integration of artificial intelligence (AI) and machine learning (ML) into drug discovery to increase target specificity, reduce toxicity, and optimize formulation strategies. Building on this momentum, this dissertation introduces novel AI/ML-driven frameworks for protein conformation selection and classification, addressing critical challenges in modern drug discovery. Designing such frameworks is difficult because most real-world biomedical datasets suffer from class imbalance, which can significantly skew model training and result in biased models with poor prediction accuracy for the minority class. Biomedical datasets also tend to have small sample sizes, which can complicate drug discovery by causing drug candidate conformations to be misclassified. To address these challenges, this dissertation presents a series of AI/ML data-driven methodologies that can work with small, class-imbalanced datasets while maintaining model performance.
This dissertation presents multiple AI/ML data-driven frameworks aimed at: i) addressing the class imbalance issue in biomedical data, particularly in identifying potential binding protein conformations in datasets where non-binding conformations outnumber binding conformations; ii) using data-driven approaches to select physicochemical features of potential binding protein conformations, which could help identify unique physicochemical descriptors that could play a pivotal role in the binding capability of a protein conformation and also reduce the dimensionality of the dataset, allowing this work to be carried out on a personal computer rather than a supercomputer; iii) maximizing the prediction accuracy for binding and non-binding protein conformations; and iv) utilizing a Multi-modal Framework for Integrating Local and Global Descriptors via Graph Convolutional Networks. The novel AI/ML methodologies introduced aim to provide a thorough understanding of the selection and prediction processes for binding protein conformations, which are crucial for drug discovery applications. The research outcomes from this work can help streamline the development of new drugs and directly improve the efficiency, cost-effectiveness, and speed with which novel therapeutic agents are brought to market.
Recommended Citation
Gupta, Shivangi, "Novel AI/ML-based frameworks for protein conformation selection in drug discovery applications" (2025). Dissertations. 456.
https://louis.uah.edu/uah-dissertations/456