Study Progress on Rare Variants Association Studies and Its Application in Livestock

Cite this article

MIAO Jian, CHANG Tian-peng, SHI Xin-ping, XIA Jiang-wei, GAO Hui-jiang, LI Jun-ya. Study Progress on Rare Variants Association Studies and Its Application in Livestock[J]. Acta Veterinaria et Zootechnica Sinica, 2017, 48(7): 1173-1180.

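MIAO Jian¹,², CHANG Tian-peng¹, SHI Xin-ping¹, XIA Jiang-wei¹, GAO Hui-jiang¹, LI Jun-ya¹

1. Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China;
2. College of Animal Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China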

Received: 2017-02-16

Funding: National Natural Science Foundation of China (31472079)

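Author biography: MIAO Jian (1993-), male, from Changzhou, Jiangsu, master's student, mainly engaged in research on animal genetics, breeding and reproduction, E-mail: miaojian6363@163.com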

Corresponding authors: GAO Hui-jiang, researcher, E-mail: gaohj111@sina.com
LI Jun-ya, researcher, E-mail: lijunya@caas.cn

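Abstract: Over the past decade, genome-wide association studies (GWAS) have successfully identified thousands of associations between common variants and common diseases (traits). Nevertheless, the problem of missing heritability has drawn increasingly wide attention. Because GWAS is designed to detect associations between common variants and phenotypes, rare variants have become one important explanation for the missing heritability. Advances in sequencing technology have made it possible to study associations between rare variants and complex diseases (traits). A series of rare variant association study (RVAS) methods have been proposed and applied to human complex diseases, but few such studies have been carried out in livestock and poultry. This review first summarizes the sequence kernel association test (SKAT), a representative RVAS method, and its family of extensions. It then describes two approaches commonly used to increase power in rare variant studies: extreme phenotype sampling and meta-analysis. Next, it discusses methods for conducting RVAS with chip data: genotype imputation and rare haplotype association analysis. Finally, it considers the prospects for applying rare variant association analysis in livestock and poultry.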

Keywords: rare variant association analysis; extreme phenotype sampling; meta-analysis; genotype imputation; rare haplotype

Study Progress on Rare Variants Association Studies and Its Application in Livestock

MIAO Jian¹,², CHANG Tian-peng¹, SHI Xin-ping¹, XIA Jiang-wei¹, GAO Hui-jiang¹, LI Jun-ya¹

1. Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China;
2. College of Animal Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China



Acta Veterinaria et Zootechnica Sinica 2017, Vol. 48 Issue (7): 1173-1180. DOI: 10.11843/j.issn.0366-6964.2017.07.001