The Sachs dataset consists of measurements of human T cells from 14 experiments. In each of these experiments, external influence was applied to certain phospho-proteins and -lipids, whose abundances are the variables of the dataset. In addition, the variables affect each other, making graphical models a useful approach for the analysis.
After a short overview and analysis of the pooled data, vine copula mixture models (VCMMs), introduced by Sahin and Czado (2021), are used to find substructures in the Sachs dataset. A particular focus is on the choice of initial conditions, such as the initial clustering and the candidate families of marginal distributions and copulas. The VCMMs do find substructures in the data, which are discussed in their biological context, for example in terms of their origin in the different experiments.
Finally, the focus turns to the causal analysis of the data. In every experiment, and thus in every single observation, external influence was applied to the measurements. It is therefore of interest to build models from which all external influences are removed. To do this, D-Vine regression models are fitted on subsets of the data in which specific variables were not influenced. Because of tail dependence, the choice of copula families also makes a major difference to the results. While Gaussian models have commonly been used in previous research, results for models with other parametric copula families are presented here as well.