In the analysis of trends in health outcomes, an ongoing issue is how to
separate and estimate the effects of age, period, and cohort. As these 3 variables
are perfectly collinear by definition, regression coefficients in a general linear
model are not unique. In this tutorial, we review why identification is a problem,
and how this problem may be tackled using partial least squares and principal
components regression analyses. Both methods produce regression coefficients that
fulfill the same collinearity constraint as the variables age, period, and cohort.
We show that, because the constraint imposed by partial least squares and principal
components regression is inherent in the mathematical relation among the 3
variables, these methods yield more interpretable results. We use one dataset from a
Taiwanese health-screening program to illustrate how to use partial least squares
regression to analyze trends in body height with 3 continuous variables for
age, period, and cohort. We then use another dataset of hepatocellular carcinoma
mortality rates for Taiwanese men to illustrate how to use partial least squares
regression to analyze tables with aggregated data. We use the second dataset to show
the relation between the intrinsic estimator, a recently proposed method for
age-period-cohort analysis, and partial least squares regression. We also show that
the inclusion of all indicator variables provides a more consistent approach. R code
for our analyses is provided in the eAppendix.
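
The identification problem summarized above can be written out explicitly. The following is a minimal sketch, assuming continuous variables coded so that cohort equals period minus age; the symbols $\beta_A$, $\beta_P$, $\beta_C$, and $t$ are notation introduced here for illustration and are not taken from the eAppendix.

```latex
% Assumed coding: cohort = period - age, so A - P + C = 0 for every subject.
\begin{align*}
C &= P - A, \\
E(y) &= \beta_0 + \beta_A A + \beta_P P + \beta_C C \\
     &= \beta_0 + (\beta_A + t) A + (\beta_P - t) P + (\beta_C + t) C
        \quad \text{for any real } t, \\
\text{since}\quad t A - t P + t C &= t\,(A - P + C) = 0 .
\end{align*}
% Coefficients orthogonal to the null direction (1, -1, 1) satisfy
\begin{equation*}
\beta_A - \beta_P + \beta_C = 0
\quad\Longleftrightarrow\quad
\beta_P = \beta_A + \beta_C ,
\end{equation*}
% which mirrors the relation P = A + C among the variables themselves.
```

Under these assumptions, every value of $t$ gives identical fitted values, so least squares alone cannot single out one set of coefficients. The second display is one way to read the statement that the coefficients fulfill the same collinearity constraint as the variables.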