2. Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology and Oregon Clinical and Translational Research Institute, Oregon Health & Science University, 3181 S.W. Sam Jackson Park Rd., Portland, OR 97239-3098.
Background: Reconstructability Analysis (RA) has been used to detect epistasis in genomic data; in that work, even the simplest RA models (variable-based models without loops) gave performance superior to two other methods. A follow-on theoretical study showed that RA also offers higher-resolution models, namely variable-based models with loops and state-based models, likely to be even more effective in modeling epistasis, and also described several mathematical approaches to classifying types of epistasis.
Methods: The present paper extends this second study by discussing a non-standard use of RA: the analysis of epistasis in quantitative as opposed to nominal variables; such quantitative variables are, for example, encountered in genetic characterizations of gene expression, e.g., eQTL data. Three methods are investigated for applying variable- and state-based RA to quantitative dependent variables: (i) k-systems analysis, which treats continuous function values as pseudofrequencies, (ii) b-systems analysis, which derives continuous values from binned DVs using expected value calculations, and (iii) u-systems analysis, which treats continuous function values as pseudo-utilities subject to a lottery. These methods are demonstrated and compared on synthetic data.
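The three treatments of a quantitative dependent variable can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the shift-and-normalize step for k-systems, the min-max rescaling for u-systems, and the use of one representative value per bin for b-systems are all assumptions made for illustration.

```python
import numpy as np

def k_systems_distribution(f):
    """k-systems idea: read function values as pseudo-frequencies.
    Assumption: shift so all values are non-negative, then normalize to sum to 1."""
    f = np.asarray(f, dtype=float)
    shifted = f - min(f.min(), 0.0)   # no shift if values are already non-negative
    total = shifted.sum()
    if total == 0:
        raise ValueError("function values sum to zero; cannot normalize")
    return shifted / total

def b_systems_expected_value(cond_probs, bin_values):
    """b-systems idea: recover a continuous value per IV state as the expected
    value over DV bins. cond_probs[i, j] = p(bin j | IV state i);
    bin_values[j] is a hypothetical representative value for bin j."""
    return np.asarray(cond_probs, dtype=float) @ np.asarray(bin_values, dtype=float)

def u_systems_lottery(f):
    """u-systems idea: read function values as pseudo-utilities.
    Assumption: min-max rescale to [0, 1] and interpret the result as the
    probability of 'winning' a two-outcome lottery in each IV state."""
    f = np.asarray(f, dtype=float)
    span = f.max() - f.min()
    if span == 0:
        raise ValueError("constant function; utilities are undefined")
    u = (f - f.min()) / span
    return np.stack([u, 1.0 - u], axis=-1)   # columns: p(win), p(lose)

# Hypothetical expression levels for four genotype states
f = [2.0, 4.0, 6.0, 8.0]
k_dist = k_systems_distribution(f)
u_dist = u_systems_lottery(f)
b_vals = b_systems_expected_value([[0.5, 0.5], [0.0, 1.0]], [1.0, 3.0])
```

In each case the continuous values are mapped to a probability distribution, so that the usual variable- and state-based RA model searches can be applied to it.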
Results: The three methods of k-, b-, and u-systems analyses, both variable-based and state-based, are then applied to a published SNP dataset. A preliminary search is done with b-systems analysis, followed by more refined k- and u-systems searches. The analyses suggest candidates for epistatic interactions that affect the level of gene expression. As in the synthetic data studies, state-based RA is more powerful than variable-based RA.
Conclusions: While the previous RA studies looked at epistasis in nominal (or discretized) data, this paper shows that RA can also analyze epistasis in quantitative expression data without discretizing these data. Since RA can also model epistasis in frequency distributions and detect linkage disequilibrium, its successful application here to continuous functions shows that it offers a flexible methodology for the analysis of genomic interaction effects.
Keywords: epistasis, gene-gene interactions, gene expression, eQTL, Reconstructability Analysis, information theory, graphical models, OCCAM, bioinformatics, k-systems analysis, u-systems analysis, function decomposition, state-based modeling.