Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci.
Journal: 2010/June - Nature Genetics
ISSN: 1546-1718
Abstract:
To identify new genetic risk factors for rheumatoid arthritis, we conducted a genome-wide association study meta-analysis of 5,539 autoantibody-positive individuals with rheumatoid arthritis (cases) and 20,169 controls of European descent, followed by replication in an independent set of 6,768 rheumatoid arthritis cases and 8,806 controls. Of 34 SNPs selected for replication, 7 new rheumatoid arthritis risk alleles were identified at genome-wide significance (P < 5 x 10(-8)) in an analysis of all 41,282 samples. The associated SNPs are near genes of known immune function, including IL6ST, SPRED2, RBPJ, CCR6, IRF5 and PXK. We also refined associations at two established rheumatoid arthritis risk loci (IL2RA and CCL21) and confirmed the association at AFF3. These new associations bring the total number of confirmed rheumatoid arthritis risk loci to 31 among individuals of European ancestry. An additional 11 SNPs replicated at P < 0.05, many of which are validated autoimmune risk alleles, suggesting that most represent genuine rheumatoid arthritis risk alleles.
Nat Genet 42(6): 508-514

+61 authors

METHODS

Sample collections

Case-control collections are listed in Table 1 and described in detail in our previous studies6,7,11,13,19. Collections were composed entirely of individuals of self-described European ancestry, and all cases either met the 1987 American College of Rheumatology (ACR) criteria for diagnosis of RA50 or were diagnosed by board-certified rheumatologists. All RA cases were further limited to anti-CCP+ patients, or RF+ patients if anti-CCP status data were missing. The BRASS RA samples have been used in our previous 100K GWAS6, but here are presented for the first time genotyped with the Affymetrix 6.0 array. In the current study, controls were matched to BRASS RA cases using principal components analysis from GWAS data from three separate studies: controls from a multiple sclerosis GWAS31, controls from an age-related macular degeneration GWAS51, and controls from a myocardial infarction GWAS52. WTCCC collection controls included the 1958 birth and National Blood Service cohorts as well as non-autoimmune disease cases (bipolar disorder, cardiovascular disease, hypertension and type 2 diabetes)15. All GWAS collections except WTCCC were restricted to control subjects matching cases using principal components analysis of GWAS data. All replication sample collections were composed of epidemiologically and/or geographically matched cases and controls, except the NARAC II collection, which was case-control matched based on genotypes at a set of ancestry-informative markers as previously described11.
The eight replication samples included: (1) CCP positive cases and controls from Halifax and Toronto (CANADA-II)13; (2) RF positive Dutch cases from Groningen and Nijmegen, together with geographically matched controls53; (3) CCP positive Dutch cases and controls collected from the greater Amsterdam region (GENRA)54; (4) North American RF positive cases and controls matched on gender, age, and grandparental country of origin from the Genomics Collaborative Initiative (GCI)4; (5) CCP or RF positive Dutch cases and controls from Leiden University Medical Center (LUMC)5; (6) CCP positive cases drawn from North American clinics and controls from the New York Cancer Project (together this collection is called NARAC-II)7; (7) CCP or RF positive cases recruited at multiple sites in the United Kingdom by the United Kingdom Rheumatoid Arthritis Genetics (UKRAG) collaboration9; and (8) CCP or RF positive cases identified by chart review from the Nurses Health Study (NHS) and matched controls based on age, gender, menopausal status, and hormone use55. We used available SNP data from this and previous studies to identify genetically identical samples from the same country13; we assumed these represented duplicated individuals and removed them. Institutional review boards at each collection site approved the study, and all individuals gave their informed consent.

Genotyping

The BRASS GWAS collection was genotyped on the Affymetrix Genechip 6.0 platform at the Broad Institute (Boston, USA). All other GWAS collections were genotyped as previously described7,13,19. Genotype data for GWAS samples from RA and other disease studies were obtained with permission from the investigators and/or disease consortia. Additional shared control GWAS genotype data were obtained from the NIMH (USA) through a formal application and approval process (part of the BRASS collection), and from the Illumina iControls database (NARAC III). For each GWAS collection, quality control (QC) was implemented in cases and in each control cohort separately, and then again in the merged collection data. QC steps included filtering SNPs and individuals with >5% missing data, followed by filtering SNPs with minor allele frequency (MAF) <1% and a Chi-squared test of Hardy Weinberg equilibrium PHWE<10. For the WTCCC collection, genotyped on the older Affymetrix 500K platform, we implemented more stringent QC (>1% missing data, MAF<1%, PHWE<10). We then used individual-pairwise identity-by-state estimates to remove occasional related and potentially contaminated samples. Data processing and QC were performed in PLINK56. Additional details are described in the Supplementary Note online.
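The per-SNP filters described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the PLINK implementation the authors used: the function names hwe_pvalue and passes_snp_qc are hypothetical, the defaults mirror the 5% missingness and 1% MAF filters, and the Hardy-Weinberg threshold is a required argument that the caller must supply.

```python
import math


def hwe_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-squared test of Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: test undefined, left to the MAF filter
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-squared variable with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))


def passes_snp_qc(n_aa, n_ab, n_bb, n_missing, hwe_threshold,
                  max_missing=0.05, min_maf=0.01):
    """Return True if a SNP survives the missingness, MAF and HWE filters."""
    called = n_aa + n_ab + n_bb
    total = called + n_missing
    if called == 0:
        return False
    if n_missing / total > max_missing:  # too much missing data
        return False
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    if min(freq_a, 1.0 - freq_a) < min_maf:  # minor allele too rare
        return False
    return hwe_pvalue(n_aa, n_ab, n_bb) >= hwe_threshold
```

For the WTCCC collection the same function would be called with max_missing=0.01, matching the more stringent filter described in the text.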

The 34 SNPs chosen for replication, as well as proxy SNPs, were directly genotyped in each of eight collections (Table 1 and Supplementary Note online). Canada samples were genotyped on the Sequenom iPlex platform at University of Toronto, Mount Sinai Hospital and University Health Network (Toronto, Canada); Dutch and GENRA samples were genotyped on the Sequenom iPlex platform at the Broad Institute; UKRAG samples were genotyped on the Sequenom iPlex platform at The University of Manchester, UK; GCI and LUMC samples were genotyped by kinetic PCR at Celera (Alameda California, USA); NARAC II samples were genotyped on the Sequenom iPlex platform at the NIH (USA); and NHS samples were genotyped on the Biotrove OpenArray platform at the Channing Laboratory, Harvard Medical School (Boston, USA). QC filter criteria for SNPs in each replication or validation collection were 10% missing data, 1% MAF and PHWE<10. If a given SNP failed in genotyping or QC in a collection, a proxy SNP (r2>0.8) with the least missing data (if available) was used. See Supplementary Note online for details.
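The proxy rule in the last step can be written as a short selection function. The sketch below assumes each candidate proxy is described by a tuple of (SNP ID, r2 with the failed SNP, missing-data rate); that data layout and the name choose_proxy are illustrative assumptions, not part of the published pipeline.

```python
def choose_proxy(candidates, min_r2=0.8):
    """Return the ID of the proxy with r2 > min_r2 and the least missing data.

    candidates: iterable of (snp_id, r2_with_failed_snp, missing_rate) tuples.
    Returns None when no candidate exceeds the r2 threshold.
    """
    eligible = [c for c in candidates if c[1] > min_r2]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c[2])[0]
```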

Genome-wide association analyses

To address population stratification and remove outliers in our GWAS for BRASS, Canada, EIRA, NARAC I, and NARAC III, we performed principal component (PC) analysis using EIGENSTRAT57. For BRASS, Canada, NARAC I, and NARAC III we further removed poorly matched controls based on case-control Euclidean distances calculated from five PCs11. Once matched, imputation was conducted on GWAS genotype data for each GWAS collection separately, using the IMPUTE software16 and haplotype-phased HapMap Phase 2 European CEU founders as a reference panel. Imputation yielded posterior genotype probabilities as well as imputation quality scores at SNPs not genotyped with a minor allele frequency >=1% in HapMap CEU.
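One way to picture the removal of poorly matched controls is as a distance filter in PC space. The exact matching rule is described in the cited earlier study and is not reproduced here, so the sketch below makes its own assumptions: a control is kept if its nearest case lies within a cutoff, and both the cutoff max_distance and the function name are hypothetical.

```python
import math


def drop_poorly_matched_controls(case_pcs, control_pcs, max_distance):
    """Keep controls whose nearest case lies within max_distance in PC space.

    case_pcs, control_pcs: dicts mapping sample ID -> sequence of PC
    coordinates (five PCs in the text above).
    max_distance: hypothetical Euclidean-distance cutoff chosen by the analyst.
    """
    if not case_pcs:
        return {}
    kept = {}
    for control_id, coords in control_pcs.items():
        nearest = min(math.dist(coords, case) for case in case_pcs.values())
        if nearest <= max_distance:
            kept[control_id] = coords
    return kept
```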

We conducted logistic regression analysis for each SNP in each GWAS collection, to estimate regression coefficients β and z-scores for allele counts (additive model), which were then genomic-control corrected17. λGC values for genotyped SNPs only versus all SNPs (Supplementary Table 5 online) verified that logistic regression controlled for any deflation in the distribution of association test results18. We then conducted a meta-analysis to combine results across datasets for 2,554,714 SNPs with high quality genotype data in one or more collections (see Supplementary Note online) by summing inverse variance-weighted β and z-scores18, and again genomic-control corrected our results. We also conducted Cochran’s Q test for heterogeneity across collections, using the β coefficients for each collection for which results were available for a given SNP. Detailed description of all analyses and results is provided in Supplementary Note online.
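The combination step can be illustrated with a fixed-effect inverse variance-weighted meta-analysis, together with the genomic control inflation factor and Cochran's Q statistic. This is a minimal sketch, assuming per-collection β estimates and standard errors are already available for one SNP; the function names are hypothetical, flooring λGC at 1 is a common convention adopted here as an assumption, and the exact procedure the authors used is given in their Supplementary Note.

```python
import math
from statistics import median

# Median of a chi-squared distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(z_scores):
    """Inflation factor: median observed chi-squared over its null median."""
    return median(z * z for z in z_scores) / CHI2_1DF_MEDIAN


def gc_correct_z(z, lam):
    """Shrink a z-score by sqrt(lambda); lambda below 1 is treated as 1."""
    return z / math.sqrt(max(lam, 1.0))


def inverse_variance_meta(betas, ses):
    """Fixed-effect meta-analysis of one SNP across collections.

    Returns (beta, se, z, two_sided_p, cochran_q, q_degrees_of_freedom).
    """
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P-value
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return beta, se, z, p, q, len(betas) - 1
```

Under the null hypothesis of no heterogeneity, Q is compared against a chi-squared distribution with the returned number of degrees of freedom (number of collections minus one).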

Replication analysis

Replication and combined analyses followed the GWAS meta-analysis: after matching in the NARAC II collection (Table 1)11, and removing spurious duplicate samples12, logistic regression was used to test for association and inverse variance-weighted z-scores were summed across collections. Replication association tests were one-tailed, for the same allele being risk or protective as in the GWAS meta-analysis.
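A one-tailed test of this kind asks only whether the replication effect points in the same direction as the GWAS effect. The sketch below assumes both statistics are coded for the same allele and that the replication z-score is approximately standard normal under the null; the function name is hypothetical.

```python
import math


def one_tailed_replication_p(gwas_beta, replication_z):
    """One-tailed P-value for a replication effect in the GWAS direction.

    gwas_beta: effect estimate from the GWAS meta-analysis.
    replication_z: z-score from the replication analysis, same coded allele.
    """
    direction = 1.0 if gwas_beta >= 0 else -1.0
    signed_z = direction * replication_z
    # Upper-tail probability of a standard normal: P(Z >= signed_z)
    return 0.5 * math.erfc(signed_z / math.sqrt(2))
```

A replication effect in the opposite direction therefore yields a P-value above 0.5, however large its magnitude.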

Acknowledgments

RMP is supported by grants from the NIH (R01-AR057108, R01-AR056768, and U54 RR020278), a private donation from the Fox Trot Fund, the William Randolph Hearst Fund of Harvard University, the American College of Rheumatology ‘Within Our Reach’ campaign, and a Career Award for Medical Scientists from the Burroughs Wellcome Fund. SR is supported by an NIH Career Development Award (1K08AR055688–01A1) and an American College of Rheumatology Bridge Grant. The Broad Institute Center for Genotyping and Analysis is supported by grant U54 RR020278 from the National Center for Research Resources. The BRASS Registry is supported by a grant from Crescendo and Biogen-Idec. NARAC is supported by the NIH (NO1-AR-2–2263 and RO1 AR44422). LAC is supported by the NIH (R01 AI065841 and 5-M01-RR-00079). The Nurses Health Study is supported by NIH grants P01 CA87969, CA49449, CA67262, CA50385, AR049880–06 and AR47782. This research was also supported in part by the Intramural Research Program of the National Institute of Arthritis, Musculoskeletal and Skin Diseases of the National Institutes of Health. This research was also supported in part by grants to KAS from the Canadian Institutes for Health Research (MOP79321 and IIN - 84042) and the Ontario Research Fund (RE01061) and by a Canada Research Chair. Genotyping of UKRAG samples was supported by the Arthritis Research campaign arc grant reference number 17552 and by the Manchester Biomedical Research Centre and Manchester Academy of Health Sciences. CW was funded by the Netherlands Organization for Scientific Research (VICI grant 918.66.620). We acknowledge the help of Ben A.C. Dijkmans, Dirkjan van Schaardenburg, A. Salvador Peña, Paul L. Klarenbeek, Zhuoli Zhang, Mike T Nurmohammed, Willem F Lems, Rob R.J. van de Stadt, Wouter H. Bos, Jenny Ursum, Margret G.M. Bartelds, Daniëlle M. Gerlag, Marleen G.H. van der Sande, Carla A. Wijbrandts, and Marieke M.J. Herenius in gathering GENRA patient samples and data. We thank the Myocardial Infarction Genetics Consortium (MIGen) study for the use of genotype data from their healthy controls in our study. The MIGen study was funded by the U.S. National Institutes of Health and National Heart, Lung, and Blood Institute’s STAMPEED genomics research program R01HL087676 and a grant from the National Center for Research Resources. We thank Johanna Seddon Progression of AMD Study, AMD Registry Study, Family Study of AMD, The US Twin Study of AMD, and the Age-Related Eye Disease Study (AREDS) for use of genotype data from their healthy controls in our study. We thank David Hafler and the Multiple Sclerosis collaborative for use of genotype data from their healthy controls recruited at Brigham and Women’s Hospital.

Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital, Boston, Massachusetts, 02115, USA
Broad Institute, Cambridge, Massachusetts, 02142 USA
Center for Human Genetic Research, Massachusetts General Hospital, Boston, Massachusetts, 02114, USA
Genetics and Genomics Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases, US National Institutes of Health, Bethesda, Maryland 20892, USA
Arthritis Research Campaign (arc)–Epidemiology Unit, Stopford Building, The University of Manchester, Manchester M13 9PT, United Kingdom
Celera, Alameda, California 94502, USA
Dept of Medicine, University of Toronto, Mount Sinai Hospital and University Health Network, Toronto, Ontario M5G 1X5, Canada
Institute of Environmental Medicine, Karolinska Institutet, Stockholm 171 77, Sweden
University of Texas M.D. Anderson Cancer Center, Houston, Texas 77030, USA
Rosalind Russell Medical Research Center for Arthritis, Department of Medicine, University of California, San Francisco, California 94143, USA
Laboratory of Immunogenetics, Department of Pathology, VU University Medical Center, 1007 MB Amsterdam, The Netherlands
Department of Neurology, Center for Neurologic Diseases, Brigham and Women’s Hospital, Boston, MA 02115, USA
NIHR-Leeds Musculoskeletal Biomedical Research Unit, Leeds Institute of Molecular Medicine, University of Leeds, LS9 7TF, United Kingdom
University of Oxford Institute of Musculoskeletal Sciences, Botnar Research Centre, Oxford OX3 7LD, United Kingdom
Musculoskeletal and Genetics Section, Division of Applied Medicine, University of Aberdeen, AB25 2ZD, United Kingdom
Department of Rheumatology, Leiden University Medical Centre, 2333 ZA Leiden, The Netherlands
The Feinstein Institute for Medical Research, North Shore-Long Island Jewish Health System, Manhasset, New York 11030, USA
Rheumatology Unit, Department of Medicine, Karolinska Institutet at Karolinska University Hospital Solna, Stockholm 171 76, Sweden
Genome Institute of Singapore, Singapore 138672
Rowe Program in Genetics, University of California at Davis, Davis, California 95616, USA
Clinical and Academic Rheumatology, Kings College Hospital NHS Foundation Trust, Denmark Hill, London SE5 9RS, United Kingdom
Clinical Immunology and Rheumatology, Academic Medical Center, University of Amsterdam, Amsterdam 1105AZ, The Netherlands
Department of Rheumatology, VU University Medical Center, 1007 MB Amsterdam, The Netherlands
School of Medicine & Biomedical Sciences, Sheffield University, Sheffield S10 2JF, United Kingdom
Jan van Breemen Institute, 1056 AB Amsterdam, The Netherlands
Sanquin Research Landsteiner Laboratory, Academic Medical Center, University of Amsterdam, 1006 AD Amsterdam, The Netherlands
Roche Diagnostics, Pleasanton, CA 94588 USA
Department of Rheumatology and Clinical Immunology, University Medical Center Groningen and University of Groningen, Groningen, the Netherlands
Department of Genetics, University Medical Center Groningen and University of Groningen, Groningen, the Netherlands
Division of Genetics, Brigham and Women’s Hospital, Boston, Massachusetts, 02115, USA
Department of Human Genetics, Radboud University Nijmegen Medical Centre, PO Box 9101, 6500HB Nijmegen, The Netherlands
Department of Rheumatology, Radboud University Nijmegen Medical Centre, PO Box 9101, 6500HB Nijmegen, The Netherlands

Abstract

To identify novel genetic risk factors for rheumatoid arthritis (RA), we conducted a genome-wide association study (GWAS) meta-analysis of 5,539 autoantibody positive RA cases and 20,169 controls of European descent, followed by replication in an independent set of 6,768 RA cases and 8,806 controls. Of 34 SNPs selected for replication, 7 novel RA risk alleles were identified at genome-wide significance (P<5×10^-8) in analysis of all 41,282 samples. The associated SNPs are near genes of known immune function, including IL6ST, SPRED2, RBPJ, CCR6, IRF5, and PXK. We also refined the risk alleles at two established RA risk loci (IL2RA and CCL21) and confirmed the association at AFF3. These new associations bring the total number of confirmed RA risk loci to 31 among individuals of European ancestry. An additional 11 SNPs replicated at P<0.05, many of which are validated autoimmune risk alleles, suggesting that most represent bona fide RA risk alleles.

RA is a common autoimmune disease that affects up to 1% of the general adult population worldwide1. Approximately two-thirds of patients are seropositive for rheumatoid factor (RF) or anti-cyclic citrullinated peptide (anti-CCP) autoantibodies2. Genetic studies in autoantibody positive RA among subjects of European ancestry have identified multiple major histocompatibility complex (MHC) region alleles and 25 confirmed RA risk alleles in 23 non-MHC loci3–15. These alleles explain about 23% of the genetic burden of RA11, indicating that additional alleles remain to be discovered.

To identify novel RA risk alleles, we conducted a GWAS meta-analysis of 5,539 autoantibody positive RA cases and 20,169 controls of European ancestry (Table 1). This study expands upon our previous GWAS meta-analysis of 3,393 cases and 12,462 controls11 by (a) adding a new GWAS dataset of 483 RA cases recruited from the Boston area (Brigham Rheumatoid Arthritis Sequential Study, BRASS) and 1,449 shared controls, (b) adding 513 cases and 431 controls recruited from Sweden (Epidemiological Investigation of Rheumatoid Arthritis, EIRA), (c) incorporating a recently published GWAS of 2,418 cases and 4,504 controls recruited from North America (Canada and North American Rheumatoid Arthritis Consortium-III, NARAC-III)13, and (d) adding additional shared controls to the NARAC-III dataset. For each of six GWAS case-control collections, we removed SNPs and individuals that failed quality control, matched case-control samples using principal components analysis (PCA) to minimize bias due to stratification, and imputed16 genome-wide to infer genotypes at additional central European (CEU) HapMap SNPs. We used logistic regression to conduct a GWAS for 2.56 million SNPs in each collection, corrected for residual inflation using genomic control17, and combined results across collections by inverse variance-weighted meta-analysis (PGWAS)18. We found little evidence of systematic bias, as indicated by the genomic control inflation factor λGC (1.04)17, quantile-quantile (Q–Q) plots, results for markers highly differentiated across Europe19, and comparisons with an alternative analysis using PCA eigenvectors as covariates (Supplementary Figures 1–4; Supplementary Table 1). See Methods and Supplementary Note for full details of the analysis.

Table 1

Patient collections.

Case Collection | Control Collection | Geographical Origin | Case Antibody Status | Cases | Controls | Genotyping Platform | Case-Control Stratification Correction

Meta-analysis (5,539 Cases; 20,169 Controls)
Brigham Rheumatoid Arthritis Sequential Study (BRASS) | Shared controls | Boston, USA | 100% CCP+ | 483 | 1449 | Affymetrix 6.0 | GWAS data PC matching
CANADA | CANADA and Shared controls | Toronto, Canada | 100% CCP+ | 589 | 1472 | Illumina 370K | GWAS data PC matching
Epidemiological Investigation of Rheumatoid Arthritis (EIRA) | EIRA | Sweden | 100% CCP+ | 1173 | 1089 | Illumina 317K | Epidemiologically matched & GWAS data PC matching
North American Rheumatoid Arthritis Consortium (NARAC) I | Shared controls | North America | 100% CCP+ | 867 | 1041 | Illumina 550K | GWAS data PC matching
NARAC III | Shared controls | North America | 100% CCP+ | 902 | 4510 | Illumina 317K | GWAS data PC matching
Wellcome Trust Case Control Consortium (WTCCC) | Shared controls from WTCCC | United Kingdom | 100% RF+ or CCP+ | 1525 | 10608 | Affymetrix 500K | Geographically matched

Replication (6,768 Cases; 8,806 Controls)
CANADA II | CANADA II | Toronto & Halifax, Canada | 100% CCP+ | 1076 | 1269 | Sequenom iPlex | Geographically matched
Dutch | Dutch | The Netherlands | 100% RF+ | 718 | 697 | Sequenom iPlex | Geographically matched
Genetics Network Rheumatology Amsterdam (GENRA) | GENRA | Amsterdam, The Netherlands | 100% CCP+ | 519 | 1155 | Sequenom iPlex | Geographically matched
Genomics Collaborative Initiative (GCI) | GCI | North America | 100% RF+ | 461 | 460 | Kinetic PCR | Epidemiologically matched
Leiden University Medical Center (LUMC) | LUMC | Leiden, The Netherlands | 100% RF+ or CCP+ | 310 | 544 | Kinetic PCR | Geographically matched
NARAC II | Shared controls | North America | 100% CCP+ | 462 | 693 | Sequenom iPlex | Ancestry informative marker data matching
United Kingdom Rheumatoid Arthritis Genetics (UKRAG) | UKRAG | United Kingdom | 100% RF+ or CCP+ | 2906 | 3494 | Sequenom iPlex | Geographically matched
Nurses Health Study (NHS) | NHS | North America | 100% RF+ or CCP+ | 316 | 494 | Biotrove OpenArray | Epidemiologically matched

Meta-analysis of GWAS results for six collections (top panel) was used to identify SNPs for replication in eight collections (bottom panel). For each collection, we list the source of controls, geographic origin, autoantibody status of cases, numbers of cases and controls, genotyping platform (GWAS collection microarray platform, and replication and validation/prediction collection direct-genotyping technology), and the strategy used to correct for case-control population stratification. See Methods and Supplementary Note for details.

Our GWAS meta-analysis found support for RA risk loci previously confirmed among individuals of European ancestry, consistent with power for our study design (Table 2, Supplementary Figure 5). Four of the 25 confirmed non-MHC risk alleles obtained PGWAS<5×10^-8 (PTPN22, CTLA4, TNFAIP3, and CD40 loci); the remaining 21 confirmed alleles obtained PGWAS≤0.002. We found modest evidence for association at PADI4 (rs2240340, PGWAS=0.01, odds ratio (OR)=1.06) and FCRL3 (rs3761959, PGWAS=0.001, OR=1.08), but not CD244 (rs3753389, PGWAS=0.26, OR=1.03), loci implicated in RA patients of Asian ancestry3,20,21 (Supplementary Table 1 online).

Table 2

Previously known RA risk associated SNPs in Europeans.

Locus | SNP ID | Gene(s) | Minor allele | MAF | OR (95% CI) | PGWAS | Power (α = 10^-6) | Power (α = 5×10^-8)
1p36 | rs3890745* | TNFRSF14 | C | 0.32 | 0.89 (0.85,0.94) | 3.6×10^-6 | 0.45 | 0.24
1p13 | rs2476601 | PTPN22 | A | 0.10 | 1.94 (1.81,2.08) | 9.1×10^-74 | 1 | 1
1p13 | rs11586238 | CD2,CD58 | G | 0.24 | 1.13 (1.07,1.19) | 1.0×10^-5 | 0.46 | 0.26
1q23 | rs12746613* | FCGR2A | T | 0.12 | 1.13 (1.06,1.21) | 0.0004 | 0.11 | 0.04
1q31 | rs10919563* | PTPRC | A | 0.13 | 0.88 (0.82,0.94) | 0.0002 | 0.10 | 0.03
2p16 | rs13031237 | REL | T | 0.37 | 1.13 (1.07,1.18) | 7.9×10^-7 | 0.67 | 0.45
2q11 | rs10865035* | AFF3 | A | 0.47 | 1.12 (1.07,1.17) | 2.0×10^-6 | 0.55 | 0.33
2q32 | rs7574865 | STAT4 | T | 0.22 | 1.16 (1.10,1.23) | 2.9×10^-7 | 0.77 | 0.58
2q33 | rs1980422 | CD28 | C | 0.24 | 1.12 (1.06,1.18) | 5.2×10^-5 | 0.32 | 0.15
2q33 | rs3087243 | CTLA4 | A | 0.44 | 0.87 (0.83,0.91) | 1.2×10^-8 | 0.89 | 0.74
4q27 | rs6822844 | IL2,IL21 | T | 0.18 | 0.90 (0.84,0.95) | 0.0007 | 0.08 | 0.02
6p21 | rs6910071 | HLA-DRB1 (*0401 tag) | G | 0.22 | 2.88 (2.73,3.03) | <10^-299 | 1 | 1
6q21 | rs548234 | PRDM1 | C | 0.33 | 1.10 (1.05,1.16) | 9.7×10^-5 | 0.21 | 0.08
6q23 | rs10499194 | TNFAIP3 | T | 0.27 | 0.91 (0.87,0.96) | 0.0007 | 0.11 | 0.03
6q23 | rs6920220 | TNFAIP3 | A | 0.22 | 1.22 (1.16,1.29) | 8.9×10^-13 | 1 | 0.99
6q23 | rs5029937 | TNFAIP3 | T | 0.04 | 1.40 (1.24,1.58) | 7.5×10^-8 | 0.95 | 0.86
6q25 | rs394581* | TAGAP | C | 0.30 | 0.91 (0.87,0.96) | 0.0006 | 0.13 | 0.05
8p23 | rs2736340 | BLK | T | 0.25 | 1.12 (1.07,1.18) | 1.5×10^-5 | 0.34 | 0.17
9p13 | rs2812378* | CCL21 | G | 0.34 | 1.10 (1.05,1.16) | 0.0001 | 0.21 | 0.09
9q33 | rs3761847 | TRAF1,C5 | G | 0.43 | 1.13 (1.08,1.18) | 2.1×10^-7 | 0.70 | 0.49
10p15 | rs2104286 | IL2RA | C | 0.27 | 0.92 (0.87,0.97) | 0.002 | 0.05 | 0.02
10p15 | rs4750316 | PRKCQ | C | 0.19 | 0.87 (0.82,0.92) | 2.0×10^-6 | 0.41 | 0.21
11p12 | rs540386* | TRAF6 | T | 0.14 | 0.88 (0.83,0.94) | 0.0003 | 0.13 | 0.05
12q13 | rs1678542* | KIF5A,PIP4K2C | G | 0.38 | 0.91 (0.87,0.96) | 0.0002 | 0.20 | 0.08
20q13 | rs4810485 | CD40 | T | 0.25 | 0.85 (0.80,0.90) | 2.8×10^-9 | 0.88 | 0.73
22q12 | rs3218253* | IL2RB | A | 0.26 | 1.09 (1.03,1.15) | 0.002 | 0.07 | 0.02

GWAS meta-analysis results for previously known SNPs associated with RA risk among European populations. Listed are the chromosome, SNP ID, and candidate gene(s) in the region. The minor allele and frequency (positive strand in HapMap release 22, frequency in control subjects), odds ratio (95% confidence interval), and association P-value are derived from our GWAS meta-analysis. Most SNPs have achieved P<5×10^-8 in combined analysis from previous studies; those with an asterisk (*) have been validated by replication in independent samples, but may not have attained P<5×10^-8 in any single study. SNPs with strong evidence of association in Asians (including PADI4, rs2240340, PGWAS = 0.01; FCRL3, rs3761959, PGWAS = 0.001; and CD244, rs3753389, PGWAS = 0.26) or suggestive evidence of association in Europeans are shown in Supplementary Table 1 online. Power was calculated based on the odds ratios from our meta-analysis at α = 10^-6 (a threshold for selecting SNPs for replication in the current study) and 5×10^-8 (genome-wide significance). Details of the power calculations can be found in the Supplementary Materials online.

We used three criteria to select 34 independent SNPs for replication after removing confirmed RA risk alleles. First, we selected all 11 SNPs with PGWAS<10−6. Second, we selected 9 SNPs (of 116 tested) with PGWAS<0.0001 that were identified as functionally related to known RA risk loci by GRAIL22, a bioinformatic analysis that identifies connections among genes in published abstracts (Supplementary Figure 6 online). Third, we selected 14 SNPs (of 104 tested) with PGWAS<0.01 that were previously known to be associated with other autoimmune diseases23–31, as we found evidence for enrichment of autoimmune SNPs in our RA GWAS (Supplementary Figure 7 online). See Supplementary Note online for additional details about our selection of SNPs for replication.

These 34 SNPs were genotyped in an independent set of 6,768 autoantibody positive RA cases and 8,806 matched controls of European ancestry (Table 1). As in our GWAS, all cases were seropositive for disease-specific autoantibodies (anti-CCP or RF). For each SNP, we tested for replication of the GWAS meta-analysis association by calculating a one-tailed P-value (Preplication) for the same allele conferring risk in both analyses. Additionally, we conducted overall association tests across all 41,282 samples (GWAS meta-analysis plus replication, Poverall), and considered Poverall<5×10−8 to be a reproducible level of significance for RA risk association.
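The one-tailed replication test and the combined test described above can be sketched as follows. This is a minimal illustration only: it assumes each analysis is summarized by a log odds ratio and its standard error, and it combines them with a fixed-effect inverse-variance weighted average, which may differ from the statistic actually used in the study (see Methods). The function names and the numeric inputs are hypothetical.

```python
import math


def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def one_tailed_replication_p(beta_gwas, beta_rep, se_rep):
    """One-tailed P for the replication effect lying in the GWAS direction."""
    direction = 1.0 if beta_gwas >= 0 else -1.0
    return normal_sf(direction * beta_rep / se_rep)


def fixed_effect_combine(betas, ses):
    """Inverse-variance weighted log odds ratio, its SE, and a two-sided P."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    p = 2.0 * normal_sf(abs(beta / se))
    return beta, se, p


# Hypothetical log odds ratios and standard errors (not values from the study).
beta_gwas, se_gwas = math.log(1.14), 0.027
beta_rep, se_rep = math.log(1.18), 0.026

print(one_tailed_replication_p(beta_gwas, beta_rep, se_rep))
print(fixed_effect_combine([beta_gwas, beta_rep], [se_gwas, se_rep]))
```

A replication effect in the opposite direction to the GWAS effect yields a one-tailed P above 0.5 under this scheme, so it can never count as replication regardless of its magnitude.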

Of the 34 SNPs tested, 12 obtained a Bonferroni-corrected Preplication<0.0015 (=0.05/34), and an additional 9 obtained Preplication<0.05. Ten of these 21 SNPs achieved genome-wide significance in the combined analysis, Poverall<5×10−8, indicating that these are validated RA risk alleles (Table 3). The other 11 SNPs had Preplication<0.05, but Poverall>5×10−8, indicating highly suggestive but not genome-wide significant association (Table 4). Results for all 34 SNPs in replication and combined analyses are shown in Supplementary Table 2 online.
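The thresholds in this paragraph can be expressed as a small classification rule. The sketch below is illustrative: the function name and the category labels are our own, while the thresholds (0.05, 0.05/34 and 5×10−8) and the three example P-value pairs are taken from the text and from Tables 3 and 4.

```python
def classify_snp(p_replication, p_overall, n_tests=34,
                 alpha=0.05, genome_wide=5e-8):
    """Label a SNP by its replication and combined-analysis P-values."""
    bonferroni = alpha / n_tests  # 0.05 / 34, about 0.0015
    passes_bonferroni = p_replication < bonferroni
    if p_replication < alpha and p_overall < genome_wide:
        label = "validated"
    elif p_replication < alpha:
        label = "suggestive"
    else:
        label = "not replicated"
    return label, passes_bonferroni


# (Preplication, Poverall) pairs as listed in Tables 3 and 4.
examples = {
    "rs874040": (3.0e-11, 1.0e-16),
    "rs26232": (0.004, 4.1e-08),
    "rs7155603": (0.001, 1.1e-07),
}
for snp, (p_rep, p_all) in examples.items():
    print(snp, classify_snp(p_rep, p_all))
```

Note that rs26232 is validated by the combined analysis even though its replication P-value alone does not pass the Bonferroni threshold.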

Table 3

Newly validated RA risk alleles.

SNP ID | Chr | Position | Gene(s) | Major allele | Minor allele | PGWAS | GWAS OR | GWAS MAF (case) | GWAS MAF (control) | Preplication | Replication OR (95% CI) | Replication MAF (case) | Replication MAF (control) | Poverall | Cochran Q P

Novel RA and novel autoimmune risk loci
rs934734 | 2p14 | 65,507,237 | SPRED2 | A | G | 3.2E-07 | 1.13 | 0.52 | 0.49 | 0.0002 | 1.13 (1.06,1.21) | 0.53 | 0.51 | 5.3E-10 | 0.95
rs6859219 | 5q11 | 55,474,337 | ANKRD55, IL6ST | C | A | 2.5E-09 | 0.78 | 0.18 | 0.21 | 0.0002 | 0.85 (0.78,0.93) | 0.19 | 0.22 | 9.6E-12 | 0.19
rs26232 | 5q21 | 102,624,619 | C5orf13 | C | T | 4.3E-07 | 0.88 | 0.29 | 0.32 | 0.004 | 0.93 (0.88,0.98) | 0.30 | 0.32 | 4.1E-08 | 0.82

Novel RA, previously implicated autoimmune risk loci
rs13315591 | 3p14 | 58,531,881 | PXK | T | C | 3.7E-07 | 1.29 | 0.10 | 0.09 | 0.002 | 1.13 (1.04,1.23) | 0.09 | 0.08 | 4.6E-08 | 0.12
rs874040 | 4p15 | 25,784,466 | RBPJ | G | C | 1.9E-07 | 1.14 | 0.33 | 0.30 | 3.0E-11 | 1.18 (1.12,1.24) | 0.34 | 0.30 | 1.0E-16 | 0.57
rs3093023 | 6q27 | 167,504,701 | CCR6 | G | A | 3.3E-07 | 1.13 | 0.47 | 0.43 | 4.5E-06 | 1.11 (1.06,1.16) | 0.46 | 0.43 | 1.5E-11 | 0.42
rs10488631 | 7q32 | 128,188,134 | IRF5 | T | C | 2.8E-06 | 1.19 | 0.13 | 0.11 | 1.2E-06 | 1.25 (1.14,1.37) | 0.13 | 0.10 | 4.2E-11 | 0.60

Associations at known RA risk loci
rs11676922 | 2q11 | 100,265,458 | AFF3 | A | T | 6.9E-07 | 1.12 | 0.49 | 0.46 | 1.1E-09 | 1.15 (1.10,1.20) | 0.48 | 0.45 | 1.0E-14 | 0.50
rs951005 | 9p13 | 34,733,681 | CCL21 | A | G | 5.4E-07 | 0.84 | 0.14 | 0.16 | 6.7E-05 | 0.87 (0.81,0.93) | 0.13 | 0.15 | 3.9E-10 | 0.61
rs706778 | 10p15 | 6,138,955 | IL2RA | C | T | 7.9E-08 | 1.14 | 0.44 | 0.40 | 1.5E-05 | 1.11 (1.06,1.17) | 0.43 | 0.40 | 1.4E-11 | 0.36

GWAS, replication and combined meta-analysis results for SNPs that achieve genome-wide significance for association with RA risk. Listed are the rs ID for each SNP, the chromosomal/cytological band, position in human genome build 36, candidate genes in the region (selected by GRAIL analysis, or manually based on immunological function; see text and Supplementary Note), and major and minor alleles (positive strand in HapMap release 22, major/minor based on frequency in GWAS controls). The association P-value, odds ratio (OR) with respect to minor allele (95% confidence interval for replication analysis) and minor allele frequencies (MAF) in cases and controls are listed for our GWAS and replication analyses. For the combined analysis, the overall association P-value and the P-value for Cochran’s Q test for heterogeneity are listed.

Implicated in other autoimmune diseases: PXK is associated with SLE (rs6445975, r2 = 0.15 with rs13315591); RBPJ is associated with T1D (rs10517086, r2 = 1 with rs874040); CCR6 is associated with Crohn’s disease (rs2301436, r2 = 0.48 with rs3093023); and IRF5 SNP rs10488631 is associated with SLE.
Previously implicated in rheumatoid arthritis: AFF3 SNP rs11676922 has recently been reported (Barton et al. 2009); the previously reported CCL21 SNP is rs2812378 (r2 = 0.06 with rs951005); and the previously reported IL2RA SNP is rs2104286 (r2 = 0.25 with rs706778).
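To make the Table 3 columns concrete, the sketch below shows one common way to recover a standard error from a reported odds ratio and 95% confidence interval, and to compute Cochran's Q for heterogeneity between two effect estimates. It is an illustration under stated assumptions, not the authors' procedure: the study's Q test may have been computed across all individual collections rather than two strata, and the GWAS standard error used here is a hypothetical value because Table 3 does not report a GWAS confidence interval. The replication odds ratio and interval for rs874040 are taken from Table 3.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(or_value, ci_low, ci_high):
    """Log odds ratio and its standard error from an OR and 95% CI."""
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def cochran_q(betas, ses):
    """Cochran's Q: weighted squared deviations from the pooled estimate."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))


def q_p_value_two_studies(q):
    """P-value for Q with 1 degree of freedom (exactly two studies)."""
    return math.erfc(math.sqrt(q / 2))


# Replication OR and 95% CI for rs874040 as listed in Table 3.
beta_rep, se_rep = log_or_and_se(1.18, 1.12, 1.24)
# GWAS OR from Table 3; the standard error is a hypothetical placeholder.
beta_gwas, se_gwas = math.log(1.14), 0.027

q = cochran_q([beta_gwas, beta_rep], [se_gwas, se_rep])
print(q, q_p_value_two_studies(q))
```

Because published intervals are rounded to two decimals, a standard error recovered this way is approximate, and P-values derived from it will not match the tabulated values exactly.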

Table 4

Suggestive RA risk alleles.

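SNP ID | Chr | Pos (HG18) | Gene(s) (a) | Major allele | Minor allele | PGWAS | GWAS OR | GWAS MAF (case) | GWAS MAF (control) | Preplication | Replication OR (95% CI) | Replication MAF (case) | Replication MAF (control) | Poverall | Cochran Q P
rs7543174 (a) | 1q21 | 151,340,745 | IL6R | T | C | 7.9E-05 | 1.13 | 0.18 | 0.16 | 0.01 | 1.07 (1.01,1.13) | 0.19 | 0.18 | 1.2E-05 | 0.06
rs840016 (a) | 1q24 | 164,140,328 | CD247 | C | T | 3.6E-05 | 0.90 | 0.39 | 0.42 | 0.006 | 0.92 (0.86,0.98) | 0.38 | 0.40 | 1.6E-06 | 0.62
rs13119723 (b) | 4q27 | 123,575,918 | IL2, IL21 | A | G | 0.001 | 0.89 | 0.13 | 0.15 | 6.7E-05 | 0.87 (0.81,0.93) | 0.15 | 0.17 | 6.8E-07 | 0.46
rs11594656 (b) | 10p15 | 6,162,015 | IL2RA | T | A | 0.0002 | 0.90 | 0.23 | 0.25 | 0.04 | 0.95 (0.90,1.00) | 0.24 | 0.25 | 0.0001 | 0.86
rs2793108 (b) | 10p11 | 31,419,111 | ZEB1 | T | C | 0.002 | 0.93 | 0.40 | 0.43 | 0.001 | 0.93 (0.89,0.98) | 0.41 | 0.43 | 1.4E-05 | 0.73
rs3184504 (b) | 12q24 | 110,347,328 | SH2B3 | T | C | 0.004 | 0.93 | 0.49 | 0.49 | 0.0002 | 0.92 (0.88,0.96) | 0.48 | 0.49 | 6.0E-06 | 0.08
rs7155603 (a) | 14q24 | 75,030,289 | BATF | A | G | 1.0E-05 | 1.16 | 0.21 | 0.19 | 0.001 | 1.12 (1.04,1.20) | 0.23 | 0.21 | 1.1E-07 | 0.74
rs8045689 (a) | 16p11 | 28,895,770 | CD19, NFATC2IP | T | C | 5.3E-05 | 1.14 | 0.32 | 0.30 | 0.01 | 1.06 (1.01,1.12) | 0.32 | 0.30 | 2.4E-05 | 0.35
rs2872507 (b) | 17q12 | 35,294,289 | IKZF3 | G | A | 4.7E-05 | 1.10 | 0.49 | 0.47 | 0.002 | 1.08 (1.02,1.14) | 0.48 | 0.46 | 9.4E-07 | 0.69
rs11203203 (b) | 21q22 | 42,709,255 | UBASH3A | G | A | 2.5E-05 | 1.11 | 0.39 | 0.37 | 0.02 | 1.07 (1.00,1.14) | 0.38 | 0.37 | 3.8E-06 | 0.49
rs5754217 (b) | 22q11 | 20,264,229 | UBE2L3 | G | T | 0.0007 | 1.10 | 0.22 | 0.19 | 0.01 | 1.07 (1.01,1.13) | 0.22 | 0.21 | 4.8E-05 | 0.79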

GWAS, replication and combined meta-analysis results for SNPs with highly suggestive associations with RA risk (defined as P<0.05 in our replication samples). As in Table 3, listed for each SNP are the rs ID, chromosomal location and position, candidate gene(s), and major and minor alleles, GWAS and replication analysis P-value, odds ratio and case/control minor allele frequencies, and the combined analysis P-values for association and for Cochran’s Q test for heterogeneity.

(a) Selected for replication based on GRAIL Ptext score.
(b) Selected for replication based on PGWAS < 0.01 and validated autoimmune disease association: rs13119723, IL2/IL21, RA and Celiac; rs11594656, IL2RA, T1D and MS; rs2793108, ZEB1, T1D; rs3184504, SH2B3, Celiac and T1D; rs2872507, IKZF3, Crohn’s; rs11203203, UBASH3A, T1D; rs5754217, UBE2L3, SLE; the ZEB1 SNP was from the May 2009 release of the online T1D database (www.T1Dbase.org).
Autoimmune disease associations for SNPs other than those tested for replication in the present study (see text, Fig. 2 and Supplementary Note): CD247, Crohn’s; IL2/IL21, Celiac, RA and T1D; IL2RA, MS, RA and T1D; BATF, T1D; CD19/NFATC2IP, T1D; IKZF3, T1D; UBASH3A, Crohn’s and T1D.

Three of the 10 newly validated RA risk loci have not been implicated in any genetic studies of RA or other autoimmune disease. The SNPs are located at chromosome 2p14 (rs934734, Poverall=5.3×10−10), 5q11 (rs6859219, Poverall=9.6×10−12) and 5q21 (rs26232, Poverall=4.1×10−8). Replication sample odds ratios (for minor alleles) were 1.13, 0.85 and 0.93, respectively (Table 3).

Although no single gene can be declared causal, we labeled RA risk loci with the names of the most compelling candidate gene(s) from each region of linkage disequilibrium (LD) based upon GRAIL analysis and/or knowledge of RA pathogenesis2,22. At 2p14, the most significant SNP (rs934734) is located within intron 1 of the sprouty-related, EVH1 domain containing 2 (SPRED2) gene (Figure 1A), which has been shown to regulate CD45+ hematopoietic cells via the Ras/MAP kinase pathway32. At 5q11, the most significant SNP (rs6859219) is located in ANKRD55, an ankyrin repeat domain-containing gene of unknown function (Figure 1B). A more compelling immunological candidate, interleukin 6 signal transducer (IL6ST), lies ~150 kilobases (kb) proximal but outside the region of LD with associated SNPs. The IL6ST protein product, gp130, functions as a part of the receptor complex for the inflammatory cytokine IL633. At 5q21, there is no obvious biological candidate gene; SNP rs26232 lies within the intron of predicted gene C5orf30 (Figure 1C).

Figure 1. Association with RA risk across 4 loci

Regional association plots show strength of association (-Log(P-value)) versus chromosomal position (kilobases, kb) for all SNPs across 1 Megabase (Mb) regions centered on the newly validated SNPs (labeled). PGWAS values are plotted with red/white diamonds for all SNPs, shaded by the degree of LD (r2, see legend) with the validated SNP (larger red diamond). Poverall in combined analysis of GWAS and replication collections is plotted with the blue diamond. Local recombination rates estimated from HapMap CEU (centi-Morgans per Mb, blue line) are plotted against the secondary y-axis, showing recombination hotspots across the region. Labeled green arrows below the plots indicate genes and their orientations. (a) 2p14, SPRED2 locus. (b) 5q11, IL6ST-ANKRD55 locus. (c) 5q21, C5orf13 locus. (d) 10p15, IL2RA locus.

Our study provides the first convincing evidence that 4 loci implicated in other autoimmune diseases are also associated with risk of RA. Three of the 4 SNPs were selected for replication based on obtaining PGWAS<10−6, regardless of their previously reported associations with autoimmune disease. These 3 SNPs are located at chromosome 3p14 (rs13315591, near PXK, Poverall=4.6×10−8), 4p15 (rs874040, near RBPJ, Poverall=1.0×10−16) and 6q27 (rs3093023, in CCR6, Poverall=1.5×10−11). The 3p14 SNP lies 187 kb proximal to a SNP associated with systemic lupus erythematosus (SLE)25; the SLE SNP, intronic in the PXK gene, is only weakly associated with RA risk in our study (rs6445975, PGWAS = 0.03, r2 = 0.15 and D’ = 0.75 with rs13315591). The 4p15/RBPJ SNP is in complete LD (r2 = 1) with a SNP associated with risk of type 1 diabetes (T1D, rs10517086)23, and the same allele confers risk in both diseases. The 6q27/CCR6 SNP rs3093023 is in LD with a SNP associated with Crohn’s disease24, which is only weakly associated with RA risk in our study (Crohn’s SNP rs2301436, PGWAS = 0.045, r2 = 0.48 and D’ = 0.80).

The IRF5 SNP rs10488631 (PGWAS=2.8×10−6), chosen because of its association with SLE34,35, was convincingly associated with autoantibody positive RA in our study (Preplication=1.2×10−6, Poverall=4.2×10−11). Graham et al.36 proposed that the complex SLE risk associations in IRF5 are explained by three independent groups of SNPs, each consisting of SNPs in tight LD with each other. In our dataset, these groups are represented by rs10488631 (group 1), rs729302 (group 2) and rs4728142 (group 3). In addition to the group 1 SNP, rs10488631, we found evidence for association with group 3 SNP rs4728142 (PGWAS=7.2×10), which is in LD with a variant that alters IRF5 polyadenylation and expression36. However, we found no evidence for association for group 2 (rs729302, PGWAS=0.15). Conditional analyses indicated that the group 1 and group 3 effects are independent (Supplementary Table 3 online). We note that group 3 SNP rs3807306 has been suggested to be associated with autoantibody negative RA37, and our results suggest that this SNP is associated with autoantibody positive RA (PGWAS=4.5×10).

Our GWAS meta-analysis identified new alleles exhibiting independent effects from previously implicated RA risk alleles at two other loci, CCL21 (ref. 11; rs951005, Poverall=3.9×10−10) and IL2RA (ref. 15; rs706778, Poverall=3.3×10). Our new CCL21 SNP has little LD with the previous RA risk-associated SNP rs2812378 (ref. 11; PGWAS=1.6×10; r2=0.06 and D’=0.83). Conditional analyses indicated that the associations are partly independent (Supplementary Table 3 online): the new SNP rs951005 remains significant (P = 1.7×10) conditional on the previously known SNP rs2812378, and rs2812378 remains nominally significant (P = 0.01) conditional on rs951005.

SNPs at IL2RA are known to be associated with multiple autoimmune diseases including RA15, T1D23 and multiple sclerosis (MS)38,39. The IL2RA SNP identified in our GWAS (rs706778, Poverall=3.3×10) is in partial LD with a SNP previously implicated in RA, T1D and MS (rs2104286, r2=0.25 and D’=1)15 and another SNP implicated in T1D and MS (rs11594656; r2=0.24 and D’=1)23. In fact, T1D risk is conferred by a haplotype tagged by rs2104286(T)-rs11594656(T)38. In our GWAS dataset, rs706778 correlated strongly with the T1D risk haplotype (r2=0.83), and conditional and haplotype analyses demonstrated that risk of RA is better explained by this haplotype [rs2104286(T)-rs11594656(T)] than by rs706778 (Supplementary Tables 3 and 4 online).
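The pairing of a modest r2 with D’ = 1 reported above is a general property of the two LD measures: D’ reaches 1 whenever at least one of the four possible haplotypes is absent, while r2 also depends on how closely the allele frequencies match. The sketch below computes both from a haplotype frequency and two allele frequencies. The function name and the example frequencies are hypothetical and are not estimates for the IL2RA SNPs; the function assumes both SNPs are polymorphic.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic SNPs.

    p_ab is the frequency of the haplotype carrying allele A at the first
    SNP and allele B at the second; p_a and p_b are the allele frequencies.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical case: every haplotype carrying B (20%) also carries A (40%),
# so one of the four possible haplotypes is never observed.
d, d_prime, r2 = ld_measures(p_ab=0.20, p_a=0.40, p_b=0.20)
print(d, d_prime, r2)
```

In this hypothetical case D’ is 1 while r2 is well below 1, the same qualitative pattern as the r2 = 0.25, D’ = 1 relationship between rs706778 and rs2104286.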

A SNP at the AFF3 locus has been previously implicated in RA14, with equivocal evidence for association with T1D23,40. Our study provides definitive evidence that this locus is associated with risk of autoantibody positive RA (rs11676922, Poverall=1.0×10−14, OR=1.14).

Eighteen of the 34 SNPs that we tested in replication (based on the criteria above) are in LD with validated autoimmune risk alleles (Figure 2). Of these, 5 demonstrated genome-wide (Poverall<5×10−8) and 7 showed suggestive (Preplication<0.05) evidence of association in our study; only 6 showed no evidence of association in replication. While not meant as a complete comparison of all known RA and other autoimmune disease risk alleles, these results underscore the emerging overlap in genetic bases of RA and other autoimmune diseases41,42. Although additional replication in large sample collections is required to confirm the suggestive associations, many likely represent true RA risk alleles.

Figure 2. Previously validated autoimmune SNPs tested in our replication study

Eighteen SNPs tested in our replication samples were in LD (defined as r2 > 0.3) with a validated autoimmune risk allele. Of these, 5 were validated as RA risk alleles in our study (Poverall < 5×10−8, innermost circle), 6 were suggestive associations (Preplication <0.05 but Poverall > 5×10−8), and 6 demonstrated no evidence of association in our replication samples (Preplication ≥0.05). For the 12 SNPs with suggestive or no evidence of association, each SNP is plotted by the strength of association with RA risk in the replication samples; those closer to the inner circle have more significant Preplication. All of the RA risk alleles confer risk in the same direction as the validated autoimmune risk alleles (when the same allele or a near perfect proxy was tested). We include the following as “autoimmune” diseases in our study, listed on the outside of the circle, although these reflect diseases along the autoimmune-inflammatory spectrum: systemic lupus erythematosus (SLE), celiac disease, Crohn’s disease, multiple sclerosis (MS), psoriasis, and type 1 diabetes (T1D); other autoimmune diseases are not included (e.g., autoimmune thyroiditis). The haplotype tagged by the IL2RA SNP, rs706778, is associated with T1D and MS38,39; the SH2B3 SNP is associated with both T1D and Celiac disease23,30. The CCR6 SNP rs3093023 is in partial LD with a SNP associated with Crohn’s disease (rs2301436, r2 = 0.48)24. The AFF3 SNP has an equivocal association with T1D (where the associated SNP is rs9653442; ref. 40). The IL2-IL21 SNP tested in our study, rs13119723, is in LD (r2=0.67) with a SNP previously implicated in both Celiac disease and RA (rs6822844)10,29, but only in partial LD (r2=0.09) with a T1D SNP (rs4505848)23. The CD19-NFATC2IP SNP tested in our study, rs8045689, was selected because of GRAIL; it is in partial LD (r2=0.38) with a SNP associated with T1D (rs4788084)23. The TNIP1 SNP in our study, rs6889239, is in strong LD with an SLE SNP (rs7708392, r2=0.91)49, but not in LD with another TNIP1 SNP associated with psoriasis (rs17728338, r2<0.01)26. The ZEB1 SNP (rs2793108) was from the May 2009 release of T1DBase, although this SNP did not appear in a subsequent publication23. The PXK SLE SNP was tested in this study but not shown, as it is in weak LD with the RA risk SNP (r2 = 0.15 between rs6445975 and rs13315591); the RA SNP was selected because of PGWAS<10−6, not because of its association with another autoimmune disease. Note that there are SNPs associated with RA and other autoimmune diseases not shown; we only include those SNPs tested as part of the current study.

Several of the SNP associations further implicate T-cells in RA pathogenesis. RBPJ (recombination site binding protein J; also known as CSL) is a transcription factor within the Notch signaling pathway. RBPJ-deficient mice have no T-cell development, while early B-cell development is maintained43. CCR6 is a cell surface protein that distinguishes Th17 cells from other CD4+ helper T-cells44. Synoviocytes from arthritic joints of mice and RA patients produce CCL20, a CCR6 ligand. Intriguingly, anti-CCR6 monoclonal antibodies substantially inhibit mouse arthritis, suggesting that CCR6 could be a therapeutic target in RA44. CD247 (Preplication=0.006, Poverall=1.6×10−6; Table 4) encodes the T-cell receptor zeta chain, a subunit of the T-cell receptor-CD3 complex that when mutated causes an inflammatory arthritis in mice45,46. The zeta chain plays an important role in coupling antigen recognition to several intracellular signal-transduction pathways. These results build upon knowledge of T-cell activation-differentiation in RA pathogenesis evidenced by genetic associations at HLA-DRB1 (presents antigens to T-cells), PTPN22 (alters T-cell thresholds), CTLA4 (co-stimulator for T-cell activation), IL2RA/CD25 (mediates IL2-dependent T-cell responses), and STAT4 (transcription factor in Th1 cell differentiation), among others.

In conclusion, we find convincing evidence for association with risk of autoantibody positive RA at 10 loci, seven of which represent novel risk loci for RA. We estimate that the now >30 validated non-MHC RA risk alleles explain 3.9% of the total disease variance, with 0.67% due to the new RA risk alleles reported here. It is clear that additional risk alleles remain to be identified, as genetic discoveries explain only 16% of disease variance (including an estimated 12% for the MHC47), whereas more than half the risk of autoantibody positive RA is thought to be genetic47,48. The number of SNPs with suggestive evidence of association in our study (Table 4) further indicates that many more common risk alleles with modest effect sizes remain to be discovered. In addition to common variants, the roles of rare variants, copy number variants and epigenetic modifications will need to be explored with newer genomic technologies.

Footnotes

Links

EIGENSTRAT software: http://genepath.med.harvard.edu/~reich/EIGENSTRAT.htm

IMPUTE and SNPTEST software: https://mathgen.stats.ox.ac.uk/impute/impute.html

GRAIL software: http://www.broadinstitute.org/mpg/grail/

GWAS meta-analysis results: http://www.broadinstitute.org/ftp/pub/rheumatoid_arthritis/Stahl_etal_2010NG

Author contributions

Study design: RMP, SR and EAS. Analysis: EAS (lead), SR, FASK, RC. Sample procurement and data generation: JW, KAS, PKG, LK, NAS, MEW, CW, MJHC, NdV, PPT, EWK, REMT, TWJH, ABB (leads); EFR, GX, SE, BPT, YL, AZ, AH, CG, LA, CIA, KGA, AB, JB, EB, NPB, JJC, J Coblyn, KHC, LAC, JBAC, J Cui, PIWdB, PLDJ, BD, PE, EF, PH, LJH, DLK, XK, ATL, XL, PM, AWM, LP, MDP, TRDJR, DMR, MS, MFS, SS, WT, AHMvdH-vM, IEvdH-B, CEvdS, PLCMvR, AGW, GJW, PW, BIRAC and YEAR consortia. Writing: RMP, EAS (leads); SR, FASK (primary contributors); JW, KAS, PKG, LK, NAS, MEW, CW, MJHC, NdV, PPT, EWK, REMT, TWJH, ABB, EFR, GX, SE, BPT, YL, AZ, AH, CG, LA, CIA, KGA, AB, JB, EB, NPB, JJC, J Coblyn, KHC, LAC, JBAC, J Cui, PIWdB, PLDJ, BD, PE, EF, PH, LJH, DLK, XK, ATL, XL, PM, AWM, LP, MDP, TRDJR, DMR, MS, MFS, SS, WT, AHMvdH-vM, IEvdH-B, CEvdS, PLCMvR, AGW, GJW, PW.

The authors declare no competing financial interests.

References

1. Silman AJ, Pearson JE. Epidemiology and genetics of rheumatoid arthritis. Arthritis Res. 2002;4(Suppl 3):S265–72.
2. Klareskog L, Catrina AI, Paget S. Rheumatoid arthritis. Lancet. 2009;373:659–72.
3. Suzuki A, et al. Functional haplotypes of PADI4, encoding citrullinating enzyme peptidylarginine deiminase 4, are associated with rheumatoid arthritis. Nat Genet. 2003;34:395–402.
4. Begovich AB, et al. A missense single-nucleotide polymorphism in a gene encoding a protein tyrosine phosphatase (PTPN22) is associated with rheumatoid arthritis. Am J Hum Genet. 2004;75:330–7.
5. Kurreeman FA, et al. A candidate gene approach identifies the TRAF1/C5 region as a risk factor for rheumatoid arthritis. PLoS Med. 2007;4:e278.
6. Plenge RM, et al. Two independent alleles at 6q23 associated with risk of rheumatoid arthritis. Nat Genet. 2007;39:1477–1482.
7. Plenge RM, et al. TRAF1-C5 as a Risk Locus for Rheumatoid Arthritis – A Genomewide Study. N Engl J Med. 2007;357:1199–209.
8. Remmers EF, et al. STAT4 and the risk of rheumatoid arthritis and systemic lupus erythematosus. N Engl J Med. 2007;357:977–86.
9. Thomson W, et al. Rheumatoid arthritis association at 6q23. Nat Genet. 2007;39:1431–1433.
10. Zhernakova A, et al. Novel association in chromosome 4q27 region with rheumatoid arthritis and confirmation of type 1 diabetes point to a general risk locus for autoimmune diseases. Am J Hum Genet. 2007;81:1284–8.
11. Raychaudhuri S, et al. Common variants at CD40 and other loci confer risk of rheumatoid arthritis. Nat Genet. 2008;40:1216–23.
12. Raychaudhuri S, et al. Genetic variants at CD28, PRDM1 and CD2/CD58 are associated with rheumatoid arthritis risk. Nat Genet. 2009;41:1313–8.
13. Gregersen PK, et al. REL, encoding a member of the NF-kappaB family of transcription factors, is a newly defined risk locus for rheumatoid arthritis. Nat Genet. 2009;41:820–3.
14. Barton A, et al. Identification of AF4/FMR2 family, member 3 (AFF3) as a novel rheumatoid arthritis susceptibility locus and confirmation of two further pan-autoimmune susceptibility genes. Hum Mol Genet. 2009;18:2518–22.
15. Barton A, et al. Rheumatoid arthritis susceptibility loci at chromosomes 10p15, 12q13 and 22q13. Nat Genet. 2008;40:1156–9.
16. Marchini J, Howie B, Myers S, McVean G, Donnelly P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat Genet. 2007;39:906–13.
17. Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999;55:997–1004.
18. de Bakker PI, et al. Practical aspects of imputation-driven meta-analysis of genome-wide association studies. Hum Mol Genet. 2008;17:R122–8.
19. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447:661–78.
20. Suzuki A, et al. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet. 2008;40:1224–9.
21. Kochi Y, et al. A functional variant in FCRL3, encoding Fc receptor-like 3, is associated with rheumatoid arthritis and several autoimmunities. Nat Genet. 2005;37:478–85.
22. Raychaudhuri S, et al. Identifying relationships among genomic disease regions: predicting genes at pathogenic SNP associations and rare deletions. PLoS Genet. 2009;5:e1000534.
23. Barrett JC, et al. Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nat Genet. 2009
24. Barrett JC, et al. Genome-wide association defines more than 30 distinct susceptibility loci for Crohn’s disease. Nat Genet. 2008;40:955–62.
25. Harley JB, et al. Genome-wide association scan in women with systemic lupus erythematosus identifies susceptibility variants in ITGAM, PXK, KIAA1542 and other loci. Nat Genet. 2008;40:204–10.
26. Nair RP, et al. Genome-wide scan reveals association of psoriasis with IL-23 and NF-kappaB pathways. Nat Genet. 2009;41:199–204.
27. de Cid R, et al. Deletion of the late cornified envelope LCE3B and LCE3C genes as a susceptibility factor for psoriasis. Nat Genet. 2009;41:211–5.
28. Hom G, et al. Association of systemic lupus erythematosus with C8orf13-BLK and ITGAM-ITGAX. N Engl J Med. 2008;358:900–9.
29. van Heel DA, et al. A genome-wide association study for celiac disease identifies risk variants in the region harboring IL2 and IL21. Nat Genet. 2007;39:827–9.
30. Hunt KA, et al. Newly identified genetic risk variants for celiac disease related to the immune response. Nat Genet. 2008;40:395–402.
31. De Jager PL, et al. Meta-analysis of genome scans and replication identify CD6, IRF8 and TNFRSF1A as new multiple sclerosis susceptibility loci. Nat Genet. 2009;41:776–82.
32. Nobuhisa I, et al. Spred-2 suppresses aorta-gonad-mesonephros hematopoiesis by inhibiting MAP kinase activation. J Exp Med. 2004;199:737–42.
33. Ernst M, Jenkins BJ. Acquiring signalling specificity from the cytokine receptor gp130. Trends Genet. 2004;20:23–32.
34. Graham RR, et al. A common haplotype of interferon regulatory factor 5 (IRF5) regulates splicing and expression and is associated with increased risk of systemic lupus erythematosus. Nat Genet. 2006;38:550–555.
35. Sigurdsson S, et al. Polymorphisms in the tyrosine kinase 2 and interferon regulatory factor 5 genes are associated with systemic lupus erythematosus. Am J Hum Genet. 2005;76:528–37.
36. Graham RR, et al. Three functional variants of IFN regulatory factor 5 (IRF5) define risk and protective haplotypes for human lupus. Proc Natl Acad Sci U S A. 2007;104:6758–63.
37. Sigurdsson S, et al. Association of a haplotype in the promoter region of the interferon regulatory factor 5 gene with rheumatoid arthritis. Arthritis Rheum. 2007;56:2202–2210.
38. Dendrou CA, et al. Cell-specific protein phenotypes for the autoimmune locus IL2RA using a genotype-selectable human bioresource. Nat Genet. 2009;41:1011–5.
39. Maier LM, et al. IL2RA genetic heterogeneity in multiple sclerosis and type 1 diabetes susceptibility and soluble interleukin-2 receptor production. PLoS Genet. 2009;5:e1000322.
40. Todd JA, et al. Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes. Nat Genet. 2007;39:857–64.
41. Zhernakova A, van Diemen CC, Wijmenga C. Detecting shared pathogenesis from the shared genetics of immune-related diseases. Nat Rev Genet. 2009;10:43–55.
42. Plenge RM. Recent progress in rheumatoid arthritis genetics: one step towards improved patient care. Curr Opin Rheumatol. 2009;21:262–71.
43. Rothenberg EV, Moore JE, Yui MA. Launching the T-cell-lineage developmental programme. Nat Rev Immunol. 2008;8:9–21.
44. Hirota K, et al. Preferential recruitment of CCR6-expressing Th17 cells to inflamed joints via CCL20 in rheumatoid arthritis and its animal model. J Exp Med. 2007;204:2803–12.
45. Wucherpfennig KW, Call MJ, Deng L, Mariuzza R. Structural alterations in peptide-MHC recognition by self-reactive T cell receptors. Curr Opin Immunol. 2009
46. Sakaguchi N, et al. Altered thymic T-cell selection due to a mutation of the ZAP-70 gene causes autoimmune arthritis in mice. Nature. 2003;426:454–60.
47. MacGregor AJ, et al. Characterizing the quantitative genetic contribution to rheumatoid arthritis using data from twins. Arthritis Rheum. 2000;43:30–7.
48. van der Woude D, et al. Quantitative heritability of anti-citrullinated protein antibody-positive and anti-citrullinated protein antibody-negative rheumatoid arthritis. Arthritis Rheum. 2009;60:916–23.
49. Gateva V, et al. A large-scale replication study identifies TNIP1, PRDM1, JAZF1, UHRF1BP1 and IL10 as risk loci for systemic lupus erythematosus. Nat Genet. 2009;41:1228–33.
50. Arnett FC, et al. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum. 1988;31:315–24.
51. Neale BM, et al. A Genome-Wide Scan of Advanced Age-Related Macular Degeneration Suggests a Novel Role of Lipase-C. 2009 in review.
52. Kathiresan S, et al. Genome-wide association of early-onset myocardial infarction with single nucleotide polymorphisms and copy number variants. Nat Genet. 2009;41:334–41.
53. Coenen MJ, et al. Common and different genetic background for rheumatoid arthritis and coeliac disease. Hum Mol Genet. 2009;18:4195–203.
54. Wijbrandts CA, et al. The clinical response to infliximab in rheumatoid arthritis is in part dependent on pretreatment tumour necrosis factor alpha expression in the synovium. Ann Rheum Dis. 2008;67:1139–44.
55. Costenbader KH, Chang SC, De Vivo I, Plenge R, Karlson EW. PTPN22, PADI-4 and CTLA-4 genetic polymorphisms and risk of rheumatoid arthritis in two longitudinal cohort studies: evidence of gene-environment interactions with heavy cigarette smoking. Arthritis Res Ther. 2008;10:R52.
56. Purcell S, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559–75.
57. Price AL, et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38:904–9.