Research Seminar

Integration of -omics data through maximum likelihood based simultaneous component analysis


Robert van den Berg


KU Leuven

Abstract: The joint analysis of multiset or multiblock data, which consist of multiple blocks containing different types of data and sharing the object or variable mode, is becoming more widespread across scientific disciplines. Examples of such joint analyses in, for instance, systems biology include: (i) the analysis of metabolite concentrations measured with complementary analytical chemical measurement platforms, in which the experimental mode is shared; and (ii) the analysis of gene expression combined with protein-promoter interaction data, in which the variable mode (genes) is shared. One of the goals of such joint analyses is to identify the structural mechanisms underlying the different data blocks and their interrelations. Different approaches for the analysis of multiblock or multiset data are available. We focus here on the family of simultaneous component analysis (SCA) models.

A simultaneous analysis of multiblock data is not straightforward, as various factors can influence the outcome of the analysis. These factors include differences between the blocks in dimensionality, homogeneity, and error level. Recently, a maximum likelihood SCA (MxLSCA) method was developed to improve the recovery of the structural mechanisms underlying multiblock data, especially in the presence of between-block differences in error level. In an earlier simulation study, we found that MxLSCA outperformed SCA in terms of recovery of the structure underlying coupled data sets.

This earlier study, however, was based on fully artificial data and pertained to data blocks coupled via the variable mode. Here, we present the results of a new simulation study in which (i) the data blocks were coupled via the experimental mode, and (ii) real-life microbial metabolomics data blocks were used to generate simulated data sets with a realistic correlation structure. In this new study, MxLSCA again outperformed SCA-P in recovering the true data structures.

This is joint work of Robert van den Berg, Iven Van Mechelen, Tom Wilderjans, Katrijn Van Deun, Henk Kiers, and Age Smilde.
Date: Tue Apr 21, 12:15 pm - 1:15 pm
Place: room 02.51 (Department of Psychology, Tiensestraat 102, 3000 Leuven)