Applications of Bayesian Network Models in studying Acute Myeloid Leukemia (AML)
Date
2016-05
Authors
Agrahari, Rupesh
Abstract
My thesis aims to design a computational model that analyzes gene expression data to improve cancer diagnosis, specifically for Acute Myeloid Leukemia (AML), an aggressive type of blood cancer. As part of a team of researchers in the Oncinfo Lab, I used Bayesian networks (BNs) to model gene expression data. A BN is a probabilistic graphical model in which random variables are represented by the nodes of a Directed Acyclic Graph (DAG), and the edges of the DAG model the conditional dependencies between those variables. We first grouped similar genes together using an established clustering method, Weighted Gene Co-Expression Network Analysis (WGCNA). For each cluster of genes, we used principal component analysis (PCA) to compute a single summary value per sample, called an eigengene. Eigengenes were represented by the nodes of the BN, and dependencies among the eigengenes were modeled by its edges. The rationale for using a BN in this framework is that it can model gene expression and its dependencies, enabling us to use probability theory to make scientific predictions. We applied the BN model to distinguish AML patients from patients with another type of hematological malignancy. I classified patients using a cross-validation technique and tested the performance on an independent dataset. The model was trained on a dataset of 366 samples and evaluated on a test dataset of 74 samples. Prediction accuracy was 93.5% on the training dataset and 84% on the test dataset. Further improvements to the methodology are required to raise its accuracy and make it appropriate for clinical use.
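The eigengene step described in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the function name `module_eigengene`, the synthetic data, and the sign convention are assumptions made for illustration, and the sketch uses plain NumPy rather than the WGCNA package. It takes the expression matrix of one gene cluster (samples by genes), standardizes each gene, and returns the first principal component score of each sample.

```python
import numpy as np


def module_eigengene(expr):
    """Summarize one gene cluster by its first principal component.

    expr: array of shape (n_samples, n_genes) holding the expression
    values of the genes in a single cluster (hypothetical input layout).
    Returns an array of shape (n_samples,), one eigengene value per sample.
    """
    # Standardize each gene to zero mean and unit variance.
    centered = expr - expr.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant genes
    z = centered / std

    # PCA via singular value decomposition: z = U @ diag(S) @ Vt.
    # The first principal component scores are U[:, 0] * S[0].
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0] * s[0]

    # The sign of a principal component is arbitrary; as an assumed
    # convention, align it with the average standardized expression.
    if np.dot(eigengene, z.mean(axis=1)) < 0:
        eigengene = -eigengene
    return eigengene


# Synthetic example: five genes that share one underlying signal.
rng = np.random.default_rng(0)
shared = rng.normal(size=20)
expr = np.column_stack(
    [shared + 0.1 * rng.normal(size=20) for _ in range(5)]
)
eg = module_eigengene(expr)
```

In a pipeline like the one described, a vector of this kind would be computed for every cluster, and the resulting eigengenes would serve as the variables (nodes) of the Bayesian network.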
Keywords
Bayesian network, Co-expression analysis, Predictive model, Acute Myeloid Leukemia, Myelodysplastic syndrome, Eigengene, Cross validation
Citation
Agrahari, R. (2016). Applications of Bayesian network models in studying Acute Myeloid Leukemia (AML) (Unpublished thesis). Texas State University, San Marcos, Texas.