A gene-based test of association using canonical correlation analysis

Bioinformatics. 2012 Mar 15;28(6):845-50. doi: 10.1093/bioinformatics/bts051. Epub 2012 Jan 31.

Abstract

Motivation: Canonical correlation analysis (CCA) measures the association between two sets of multidimensional variables. We reasoned that CCA could provide an efficient and powerful approach for both univariate and multivariate gene-based tests of association without the need for permutation testing.

Results: Compared with a commonly used permutation-based approach, CCA (i) is faster; (ii) has an appropriate type-I error rate for normally distributed quantitative traits; (iii) provides comparable power for small to medium-sized genes (<100 kb); (iv) provides greater power when the causal variants are uncommon; and (v) provides considerably less power for larger genes (≥100 kb) when the causal variants have a broad minor allele frequency (MAF) spectrum. Application to a GWAS of leukocyte levels identified SAFB and a histone gene cluster as novel putative loci harboring multiple independent variants regulating lymphocyte and neutrophil counts.
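To make the idea concrete, the following is a minimal sketch of how a CCA-based gene test can yield an analytic p-value without permutation. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the test statistic, so the sketch assumes the textbook choice of Wilks' lambda with Rao's F approximation. The function names `canonical_correlations` and `cca_gene_test` are hypothetical, and the sketch assumes unrelated individuals, no covariates, and a genotype matrix of full column rank (for example, after removing SNPs in perfect linkage disequilibrium). For a single trait, the statistic reduces to the overall F-test of a multiple regression of the trait on all SNPs in the gene.

```python
import numpy as np
from scipy import stats


def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and the columns of Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the two centred column spaces; the singular
    # values of Qx' Qy are the canonical correlations.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)


def cca_gene_test(genotypes, traits):
    """Gene-based test: Wilks' lambda with Rao's F approximation (assumed choice).

    genotypes: (n, p) array of allele counts for the p SNPs in one gene.
    traits:    (n,) or (n, q) array of quantitative traits.
    Returns (F statistic, numerator df, denominator df, p-value).
    """
    X = np.asarray(genotypes, dtype=float)
    Y = np.asarray(traits, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    n, p = X.shape
    q = Y.shape[1]

    rho = canonical_correlations(X, Y)
    wilks = np.prod(1.0 - rho ** 2)

    # Rao's F approximation to the distribution of Wilks' lambda.
    denom = p * p + q * q - 5
    s = np.sqrt((p * p * q * q - 4) / denom) if denom > 0 else 1.0
    m = n - 1 - (p + q + 1) / 2.0
    df1 = p * q
    df2 = m * s - p * q / 2.0 + 1.0
    lam = wilks ** (1.0 / s)
    f_stat = (1.0 - lam) / lam * df2 / df1
    p_value = stats.f.sf(f_stat, df1, df2)
    return f_stat, df1, df2, p_value


if __name__ == "__main__":
    # Simulated data (purely illustrative): 500 individuals, 8 SNPs, MAF 0.3,
    # with two SNPs influencing a single normally distributed trait.
    rng = np.random.default_rng(0)
    snps = rng.binomial(2, 0.3, size=(500, 8)).astype(float)
    trait = 0.3 * snps[:, 0] + 0.3 * snps[:, 3] + rng.normal(size=500)
    f_stat, df1, df2, p_value = cca_gene_test(snps, trait)
    print(f"F({df1}, {df2:.0f}) = {f_stat:.2f}, p = {p_value:.3g}")
```

Because the p-value comes from an F distribution rather than from repeated reshuffling of the phenotype, the cost per gene is a single QR decomposition and a small SVD. Passing a two-column `traits` array gives the multivariate version of the same test.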

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Frequency
  • Genome-Wide Association Study*
  • Humans
  • Multivariate Analysis*
  • Quantitative Trait, Heritable