PLoS Genetics. May/31/2011; 7(6)

Published online Jun/29/2011

PMID: 21738480

PMC: 3128114

doi: 10.1371/journal.pgen.1002113

Multiple Loci Are Associated with White Blood Cell Phenotypes

Michael A. Nalls

David J. Couper

Toshiko Tanaka

Frank J. A. van Rooij

Qiong Yang+85 authors

Abstract

White blood cell (WBC) count is a common clinical measure from complete blood count assays, and it varies widely among healthy individuals. Total WBC count and its constituent subtypes have been shown to be moderately heritable, with the heritability estimates varying across cell types. We studied 19,509 subjects from seven cohorts in a discovery analysis, and 11,823 subjects from ten cohorts for replication analyses, to determine genetic factors influencing variability within the normal hematological range for total WBC count and five WBC subtype measures. Cohort specific data was supplied by the CHARGE, HeamGen, and INGI consortia, as well as independent collaborative studies. We identified and replicated ten associations with total WBC count and five WBC subtypes at seven different genomic loci (total WBC count—6p21 in the HLA region, 17q21 near ORMDL3, and CSF3; neutrophil count—17q21; basophil count- 3p21 near RPN1 and C3orf27; lymphocyte count—6p21, 19p13 at EPS15L1; monocyte count—2q31 at ITGA4, 3q21, 8q24 an intergenic region, 9q31 near EDG2), including three previously reported associations and seven novel associations. To investigate functional relationships among variants contributing to variability in the six WBC traits, we utilized gene expression- and pathways-based analyses. We implemented gene-clustering algorithms to evaluate functional connectivity among implicated loci and showed functional relationships across cell types. Gene expression data from whole blood was utilized to show that significant biological consequences can be extracted from our genome-wide analyses, with effect estimates for significant loci from the meta-analyses being highly corellated with the proximal gene expression. In addition, collaborative efforts between the groups contributing to this study and related studies conducted by the COGENT and RIKEN groups allowed for the examination of effect homogeneity for genome-wide significant associations across populations of diverse ancestral backgrounds.

Author Summary

WBC traits are highly variable, moderately heritable, and commonly assayed as part of clinical complete blood count (CBC) examinations. The counts of constituent cell subtypes comprising the WBC count measure are assayed as part of a standard clinical WBC differential test. In this study we employed meta-analytic techniques and identified ten associations with WBC measures at seven genomic loci in a large sample set of over 31,000 participants. Cohort specific data was supplied by the CHARGE, HeamGen, and INGI consortia, as well as independent collaborative studies. We confirm previous associations of WBC traits with three loci and identified seven novel loci. We also utilize a number of additional analytic methods to infer the functional relatedness of independently implicated loci across WBC phenotypes, as well as investigate direct functional consequences of these loci through analyses of genomic variation affecting the expression of proximal genes in samples of whole blood. In addition, subsequent collaborative efforts with studies of WBC traits in African-American and Japanese cohorts allowed for the investigation of the effects of these genomic variants across populations of diverse continental ancestries.

Introduction

The WBC count, a classic marker of immune or inflammatory response, varies substantially among healthy individuals. The counts of constituent cell subtypes comprising the WBC count measure are assayed as part of a standard clinical WBC differential test. While the WBC count and WBC differential count are often obtained to assess for evidence of infection or underlying inflammation, prospective epidemiologic studies have consistently linked higher WBC counts, within the clinically designated normal range, along with other inflammatory markers, to increased risk of coronary artery disease, cancer, and total mortality [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]. Studies are often not consistent on the specific WBC subpopulations involved, but granulocytes, in general, and neutrophils, in particular, are most often implicated in these observations [11], [12]. In the general population, the total WBC count is also directly associated with many cardiovascular disease risk factors, such as higher blood pressure, cigarette smoking, adiposity, lower socioeconomic status, and higher levels of plasma inflammatory markers [13].

WBC counts are also moderately heritable [14], with heritability estimates varing from approximately 0.14 to 0.4 across total leukocyte count and cell subtypes, as assessed in a Sardinian population, with the highest heritability estimates for monocyte counts [14]. In addition, the substantially lower neutrophil count and total WBC count in African Americans compared to European-ancestry individuals seems to be at least partially explained by a regulatory variant in the Duffy Antigen Receptor for Chemokine (DARC) gene, which accounts for ∼20% of total variation in the measures [15], [16]. Recent studies have sought to investigate the common genetic variants associated with several blood count traits in European-ancestry and Japanese individuals, but have not focused specifically on the multiple cell types comprising the total WBC count measurement [17], [18], [19], [20].

In the current study, we sought to identify and replicate common genetic variants that influence normal variation in six WBC phenotypes commonly measured in the clinical setting and in population studies that comprise the CHARGE Consortium [21], the HaemGen Consortium [18] and independent collaborative studies. We utilized genome-wide association study (GWAS) data and meta-analytic techniques to identify 10 genome-wide significant loci in a study of over 31,000 individuals (Table 1), examining variants possibly associated with total WBC count, three granulocyte phenotypes (neutrophil, basophil and eosinophil counts), and two non-granulocyte phenotypes (lymphocytes and monocytes). We also investigated the shared functional connectivity of identified loci across phenotypes with regards to both known biological pathways and nearby gene expression effects. As previous research has shown strong selective pressures at the locus identified to affect WBC counts in African American populaftions, we examined the possibility of recent selective effects at significantly associated loci in European ancestry individuals [15], [16]. Additional efforts were made in collaboration with RIKEN and COGENT investigators to identify homogenous associations across populations of diverse ancestral backgrounds.

10.1371/journal.pgen.1002113.t001

Table 1

Descriptive statistics of contributing cohorts.

Discovery PhaseTotal WBCGranulocytesNon-granulocytesCovariates

Study	AGES	ARIC	BLSA	FHS	Health ABC	InChianti	RS
WBC count	6.010 (1.793)	5.933 (1.400)	5.441 (1.099)	4.07 (0.235)	6.166 (1.373)	5.965 (1.260)	6.487 (1.496)
Basophils	0.029 (0.025)	0.025 (0.033)	0.012 (0.015)	N.A.	0.060 (0.031)	0.026 (0.019)	N.A.
Eosinophils	0.207 (0.147)	0.104 (0.103)	0.174 (0.093)	N.A.	0.173 (0.102)	0.171 (0.091)	N.A.
Neutrophils	3.511 (1.295)	3.646 (1.118)	3.149 (0.868)	N.A.	3.658 (1.010)	3.626 (1.020)	N.A.
Lymphocytes	1.726 (0.943)	1.814 (0.479)	1.636 (0.445)	N.A.	1.741 (0.580)	1.834 (0.527)	2.500 (0.781)
Monocytes	0.538 (0.177)	0.344 (0.139)	0.406 (0.145)	N.A.	0.534 (0.150)	0.310 (0.091)	N.A.
% Female	58.0	53.2	48.7	51.2	47.1	57.0	59.5
% Current smoker	12.7	21.1	2.7	42.2	4.3	17.7	22.6
Age in years	76.4 (5.5)	54.3 (5.7)	66.8 (13.9)	35.9 (10.4)	75.7 (2.8)	68.1 (15.3)	69.1 (9.0)
Sample Sizes
Total N	3217	4846	337	3909	1075	1014	5111

Replication PhaseTotal WBCGranulocytesNon-granulocytesCovariates

Study	HVH	INGI-CARL	INGI-FVG	INGI-VB	KORAF3	KORAF4	RSII	SORBS	TwinsUK	UKBS1
WBC count	6.932 (2.076)	7.176 (2.900)	5.867 (1.414)	6.250 (1.242)	6.781 (1.671)	5.904 (1.638)	6.468 (1.388)	5.204 (1.159)	5.926 (1.530)	7.255 (1.464)
Basophils	0.068 (0.034)	N.A.	N.A.	0.029 (0.014)	N.A.	N.A.	N.A.	0.027 (0.014)	N.A.	0.100 (0.001)
Eosinophils	0.204 (0.158)	N.A.	N.A.	0.185 (0.100)	N.A.	N.A.	N.A.	0.150 ( 0.082)	N.A.	0.182 (0.092)
Neutrophils	4.193 (1.586)	N.A.	N.A.	3.444 (0.917)	N.A.	N.A.	N.A.	2.853 (0.837)	N.A.	4.268 (1.105)
Lymphocytes	1.932 (0.799)	2.492 (0.682)	2.083 (0.141)	2.141 (0.549)	N.A.	N.A.	2.236 (0.568)	1.749 (0.531)	N.A.	2.293 (0.593)
Monocytes	0.534 (0.189)	0.415 (0.346)	0.786 (0.232)	0.451 (0.126)	N.A.	N.A.	0.455 (0.164)	0.424 (0.119)	N.A.	0.446 (0.129)
% Female	53.0	58.2	59.4	57.0	50.6	51.2	57.2	40.1	100.0	52.0
% Current smoker	8.4	25.5	21.7	14.1	13.3	14.7	16.1	12.9	N.A.	N.A.
Age in years	68.1 (9.1)	44.2 (20.0)	48.3 (19.3)	54.8 (18.0)	62.5 (10.1)	60.9 (8.9)	67.8 (7.1)	47.9 (16.0)	50.9 (13.4)	43.4 (12.4)
Sample Sizes
Total N	403	510	1045	1405	1640	1813	1518	785	1466	1238

The numbers above are thousands of cells per milliliter of blood. All values are presented as mean (SD), except where % indicated.

Results

In the discovery meta-analysis of 19,509 subjects from seven cohorts, we identified 11 genome-wide significant associations with six white cell phenotypes (total WBC, neutrophil, basophil, eosinophil, lymphocyte and monocyte counts, see Table 1, Table S1, Figure 1). Further, we found strong evidence for replication of 10 of the 11 trait-locus associations in 11,823 independent samples from 10 GWAS cohorts who contributed summary statistics for SNPs of interest. The discovery analysis results (Figure 1 and Figures S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 for details), and the results of replication testing for the 10 replicated trait-loci are summarized in Table 2. These results are presented in greater detail for all genome-wide significant SNPs in Table S2.

10.1371/journal.pgen.1002113.g001

Figure 1

Results of discovery phase meta-analyses for white blood cell phenotypes.

Red regions denote loci containing SNPs reaching genome-wide significance at p-values<5.00E-08. Blue regions denote suggestive loci containing SNPs with p-values between 5.00E-08 and 1.00E-07.

10.1371/journal.pgen.1002113.t002

Table 2

Summary results from discovery and replication phase analyses for all genome-wide significant and successfully replicated loci.

				Discovery Phase			Replication Phase			Joint Estimates
Phenotype	SNP	Effect Allele	Alternative Allele	Beta	SE	P-value	Beta	SE	P-value	Beta	SE	P-value	Effect Heterogeneity P-value	CHR	Cytoband	BP	Genes +/−100 kb from most significant SNP in locus	Variance explained (adjusted r2) per trait
WBC	rs2517524	a	c	0.017	0.003	2.64E-09	0.011	0.003	2.50E-05	0.013	0.002	1.46E-12	0.443	6	6p21.33	31133692	MUC21, HCG22, C6orf15, CDSN, PSORS1C1, PSORS1C2 and CCHCR1	0.070
	rs4794822	t	c	0.028	0.003	3.23E-23	0.018	0.002	1.31E-14	0.022	0.002	1.72E-34	0.053	17	17q21.1	35410238	GSDMB, ORMDL3, GSDMA, PSMD3, CSF3, MED24, SNORD124, THRA and NR1D1
Neutrophils	rs8078723	t	c	−0.043	0.004	2.84E-23	−0.036	0.006	4.99E-10	−0.041	0.004	3.16E-31	0.358	17	17q21.1	35420405	GSDMB, ORMDL3, GSDMA, PSMD3, CSF3, MED24, SNORD124, THRA and NR1D1	0.067
Basophils	rs4328821	a	g	0.010	0.002	2.58E-08	0.010	0.002	8.40E-06	0.010	0.001	8.72E-13	0.218	3	3q21.3	129799125	LOC90246, C3orf27 and RPN1	0.021
Lymphocytes	rs2524079	a	g	0.014	0.003	1.85E-08	0.018	0.005	2.03E-04	0.015	0.002	2.93E-11	0.667	6	6p21.33	31348700	PSORS1C3, HCG27, HLA-C and HLA-B	0.058
	rs11878602	a	c	−0.016	0.003	3.42E-09	−0.014	0.004	9.69E-04	−0.015	0.002	8.43E-12	0.975	19	19p13.11	16416153	EPS15L1, CALR3, C19orf44 and CHERP
Monocytes	rs1449263	t	c	0.036	0.005	6.71E-14	0.037	0.006	2.13E-10	0.036	0.004	5.21E-23	0.999	2	2q31.3	182027546	ITGA4, CERKL	0.088
	rs9880192	c	g	−0.028	0.005	1.35E-08	−0.027	0.006	2.08E-05	−0.028	0.004	1.23E-12	0.888	3	3q21.3	129780259	GATA2, LOC90246, C3orf27 and RPN1
	rs1991866	c	g	−0.032	0.005	4.58E-11	−0.037	0.006	7.51E-10	−0.034	0.004	1.79E-19	0.747	8	8q24.21	130693287	NA
	rs10980800	t	c	−0.044	0.006	1.13E-14	−0.039	0.007	7.03E-09	−0.042	0.004	4.26E-22	0.625	9	9q31.3	112955726	EDG2

SNPs included are the most significant SNP per genomic region of interest to aid in clarity.

Total WBC count was associated with two independent loci in the discovery phase of analyses; the first locus was on chromosome 6p21 encompassing a region from 31,131,127 bp to 31,161,846 bp near HLA and PSORS1 gene families, and the second locus on chromosome 17q21 from 35,345,186 bp to 35,470,048 bp near candidate genes ORMDL3 and CSF3. Both loci showed independent replication (Table 2). Neutrophil count was associated with a 196,381 bp region on chromosme 17q21 containing 46 genome-wide significant SNPs from the discovery phase. This region overlaps the locus on chromosome 17q21 identified for the total WBC count phenotype and showed positive association in replication testing at all but 2 SNPs. Basophil count was associated with one SNP, rs4328821 on chromosome 3q21 near RPN1 and C3orf27. This SNP showed a significant positive association between basophil count and minor allele dosage (minor allele frequency 0.110, p-value in discovery phase 2.58E-08, p-value in replication phase 8.40E-06). No regions showed genome-wide significance for association with eosinophil count.

Lymphocyte count was associated with two loci, with one locus on chromosome 6p21 overlapping with the chromosome 6p21 total WBC count locus. The second locus associated with lymphocyte counts is on chromosome 19p13, from 16,32,871 bp to 16,429,197 bp at EPS15L1 and was successfully replicated. Monocyte count was associated with the largest number of independent hits for any of the traits, with five loci identified in the discovery analysis, four of which showed significant associations in replication testing. These four loci include two intergenic regions (>100 kb to nearest known genes) on chromosome 8q24 (from 130,672,817 to 130,693,287) and chromosome 9q31 (112,917,232 to 113,073,157). We also identified a novel association on chromosome 2q31 (182,027,546 to 182,036,459) near ITGA4, and a single genome-wide significant SNP on chromosome 3q21, which is located 18,866 bp from the SNP significantly associated with basophil count (r² = 0.076, D′ = 0.841). The monocyte- associated locus on chromosome 1q22 contained only one genome-wide significant SNP which failed to replicate and is not included in Table 2 (denoted by rs17131683, which exhibited a replication p-value of 0.770 but a consistent negative effect associated with the A allele).

Many of the loci described in this report contain genome-wide significant SNPs spread throughout much of each locus suggestive of either extensive LD or multiple association signals at each locus. For loci described in Table S2 (except those containing less than 3 genome-wide significant SNPs), conditional analyses were conducted using the allele dosage of the most significant SNP per locus as a covariate in a subset of discovery cohorts (AGES, ARIC, BLSA, Health ABC and InChianti). Statistical models were identical to those used in the discovery analyses except for the additional SNP covariates. No SNPs analyzed remained significant after correcting for 147 tests, showing that only one signal of association exists per locus. The complete results for these analyses are evident in Table S3. As each locus accounts for only one unique signal per trait, adjusted r² estimates were calculated for each trait across loci within this subset of cohorts, and may be found in Table 2.

To assess the independence of associated SNPs from the total WBC count measurement, all genome-wide significant SNP associations for white cell subtypes (Table S2) were re-analyzed as per the discovery phase analysis methods adjusting for total WBC count as an additional covariate in a subset of discovery cohorts (AGES, ARIC, BLSA, Health ABC and InChianti). After Bonferroni correction for 97 independent tests at least one SNP per locus remained significant, suggesting some independence from the total WBC measure in the SNP associations. The extended results of this analysis and a table of r² values for the phenotypes of interest based on the same subset of discovery cohorts may be found in Table S4 and Table S5.

These results demonstrate a high degree of relatedness across traits, with individual loci affecting multiple WBC traits, that may be pleiotropic to some degree or due to the biological relatedness of the traits. Neutrophils are the most abundant WBC subtype, and the locus on chromosome 17q21 associated with both total WBC count and neutrophil count independently based on conditional analyses described above. In this region, 38 SNPs were common to both traits in the discovery analysis, and these SNPs showed identical directions of effects across both traits. In the chromosome 3q21 between C3orf27 and RPN1, rs9880192 was associated with monocyte count and rs4328821 was associated with basophil count, suggesting pleiotropy in this region. The region on chromosome 6p21 near the PSORS1 family of genes as well as HLA-C and HLA-B contains associated SNPs with both total WBC count and lymphocyte count, although, interestingly, spatially overlapping SNPs failed to replicate in the other trait, further suggesting independence of effects seen in conditional analyses of all loci.

To further examine the functional connectivity of these loci across white blood cell phenotypes, the Gene Relationships Among Implicated Loci package (GRAIL, http://www.broad.mit.edu/mpg/grail/) was utilized to mine PubMed archives using textual analyses of known functional associations to identify concurrent effects and relationships across phenotypes [22]. In brief, GRAIL incorporates functional annotations from text mining related to specific genomic loci, usually genome-wide association study results, to assess the functional inter-relatedness of genes in linkage disequilibrium with the regions of interest and construct networks of related genes sharing biological function. In our analyses, we utilized GRAIL to survey the functional relatedness of all regions containing significant results passing Bonferroni correction in both the discovery and replication phases. We identified four clusters of related genes with false-discovery rate adjusted p-values<0.05 out of the 49 gene clusters generated by the GRAIL package, which are described in Figure 2 and Table S6. All four clusters show significant interconnectivity between genes proximal to loci on chromosome 17q21 associated with total WBC and neutrophil counts and the chromosome 19p13 locus associated with lymphocyte count. The two most significant clusters also show relationships between genes proximal to the previously mentioned loci and candidate genes at the chromosome 8q24 region associated with monocyte counts. Genes at the chromosome 17q21 locus associated with both total WBC and neutrophil counts appeared in all significant pathways identified in the GRAIL analyses, suggesting biological connectivity across both granulocyte and non-granulocyte cell lineages. Candidate genes from the gasdermin (GSDML) and mediator complex subunit (MED) families were highly enriched in the significant gene clusters and comprised 37.5% of genes in these clusters. These results suggest shared biological pathways between these genes and cell proliferation among WBC subtypes associated with these genomic regions, although as emphasized in conditional analyses, these effects at these loci remain to some degree independent of the total WBC measure.

10.1371/journal.pgen.1002113.g002

Figure 2

GRAIL gene clusters represented graphically for all clusters reaching a FDR adjusted p-value <0.05.

The relative thickness of each line connecting nodes is indicative of a p-value closer to zero. Gene symbols are designated on each node, with the cytogenetic position following the colon. Individual gene clusters are color-coded.

We also examined possible functional consequences of individual SNP associations by analyzing whole blood genome-wide gene expression data from the InChianti study to identify associations between SNPs found to be significant in the meta-analysis and cis changes in gene expression. All SNPs significant in both the discovery and replication phases for all phenotypes were used in the expression analysis. Each SNP in this dataset which was within 500 kb of an expression probe was treated as a possible expression quantitative trait locus (eQTL). For 85 SNPs in our subset of significant meta-analysis results, we tested at total 741 candidate eQTL associations using multivariate linear regression. This analysis tested each SNP in the subset for an association with each expression probe within 500 kb. After Bonferroni correction for the 741 tests conducted, 36 SNPs in the chromosome 17q21 locus associated with total WBC count and neutrophil count in the meta-analysis were also significantly associated with cis-expression levels. In total, these 36 SNPs were associated with three expression probes, 2 probes tagging transcripts in the GSDML gene (probes ILMN_2347193 and ILMN_1666206) and another probe tagging a single probe in ORMD3L (probe ILMN_1662174), for a total of 103 signficicant eQTLs (Figure 3). Both probes in GSDML were highly correlated (r² = 0.582), although neither GSDML probe was strongly corellated with the probe of interest in the ORMD3L gene (r²<0.200). With each SNP's minor allele as the point of reference, all effect directions for significant meta-analysis and eQTL associations were concordant, showing a strong correlation between the effect sizes in the meta-analysis and eQTL analysis, suggesting that the effect of the identified SNPs on the corresponding WBC trait may be transcriptionally mediated. For example, the correlation of effect estimates between the eQTL and meta-analysis for neutrophil associated SNPs also associated with the ILMN_1666206 probe highly significant, with an r² of 0.898. For details of all eQTLs tested, please refer to Table S7 and the Methods and Materials section.

10.1371/journal.pgen.1002113.g003

Figure 3

The directionality and magnitude of effects for SNPs significant in both the meta-analysis and eQTL analysis are highly correlated (r² = 0.795).

Effect estimates in both the eQTL analysis and meta-analysis were standardized to effects based on dosages of the minor allele for each SNP. Meta-analysis effects are based on beta coefficients from regression, while eQTL effects are based on standardized betas. This figure only includes significant SNPs to improve clarity.

Previous studies of WBC genetic variation in African American populations have shown evidence of WBC associated loci in a region of high selective pressures, with the putative functional SNP rs2814778 being almost fixed in frequency in sub-Saharan African populations where malaria is endemic [15]. We therefore undertook an investigation of recent selective pressures acting upon SNPs identified associated with WBC phenotypes in European-ancestry populations. Evidence for selection for all 10 significant trait-loci from the meta-analysis was evaluated by mining HapMap2 CEU data. Integrated Haplotype Scores (iHS) were used to quantify selection at each locus based on homozygosity decay in extended haplotypes and are availible for download from the Haplotter website (http://hg-wen.uchicago.edu/selection) [23]. Selection was quantified by an absolute value of iHS>2 indicating strong selective pressures, and absolute iHS values ≤2 and ≥1.635 were interpreted as indicating recent-moderate selection, positive or negative iHS values indicated the direction of selection [24]. All replicated SNPs were evaluated for signatures of selection. One locus showed evidence of selection, and this was the lymphocyte associated locus on chromosome 19p13, from 16,336,388–16,429,197 bp. This locus on chromosome 19p13 showed evidence of selection for all SNPs analyzed in the replication phase. All SNPs in this locus appear to be under some degree of negative selection with 2 of the 11 SNPs being under strong negative selection and the rest under recent moderate selection. However, when testing the correlation between effect estimates and iHS at this locus using the ancestral allele as a reference for effect direction, no clear correlation between iHS and effect size was detected (r² = 0.226, p-value = 0.140). Full iHS annotation for replicated SNPs are shown in in Table S8.

As part of collaborative efforts with the COGENT and RIKEN groups, a coordinated exchange of summary statistics for SNP identified in Table 2 was organized from the parallel studies conducted by these groups. Random-effects meta-analysis techniques were utilized to identify effects that were consistent across studies of diverese ethnic backgrounds. While all joint effect estimates remained in consistent directions with those described in Table 2, heterogeneity of effects across the 3 ancestral populatons severely attenuated the strength of the associations for all but 2 of the genome-wide significant associations identified in this report. The associations at rs4794822 (WBC count) and rs11878602 (lymphocyte count) remained genome-wide significant. Consistent robust effects across ancestral populations at rs11878602 may lend some support to the recent-moderate negative selection at this SNP (iHS = −1.924) being associated with an increase in frequency of the derived A allele associated with decreasing monocyte counts. These results suggest that these SNPs may be very close to the functional variants associated with these effects, as well as exhibiting relatively consistent effects across multiple population ancestries with differing LD structures. The results of these analyses are detailed in Table 3.

10.1371/journal.pgen.1002113.t003

Table 3

Results of random-effects meta-analyses incorperating summary statistics from COGENT, RIKEN, and CHARGE meta-analyses for loci of interest.

		European Ancestry - CHARGE and HeamGen			African American Ancestry - COGENT Network			Japanese Ancestry - Riken Institute			Random Effects Meta-analysis
Phenotype	SNP	Beta	SE	P-value	Beta	SE	P-value	Beta	SE	P-value	Summary Effect	Summary SE	P-value	Heterogeneity Variance	Heterogeneity P-value
WBC	rs2517524	0.013	0.002	1.46E-12	0.007	0.004	0.058	0.002	0.003	0.437	0.008	0.004	0.029	3.19E-05	0.008
	rs4794822	0.022	0.002	1.72E-34	0.013	0.004	7.56E-04	0.019	0.003	1.25E-12	0.019	0.002	3.44E-15	9.74E-06	0.088
Neutrophils	rs8078723	−0.041	0.004	3.16E-31	−0.007	0.007	0.331	−0.034	0.005	5.60E-11	−0.028	0.009	1.19E-03	1.99E-04	1.82E-04
Basophils	rs4328821	0.010	0.001	8.72E-13	0.002	0.001	0.046	0.009	0.001	4.67E-25	0.007	0.003	6.37E-03	1.82E-05	3.11E-08
Lymphocytes	rs2524079	0.015	0.002	2.93E-11	0.008	0.005	0.097	0.003	0.004	0.394	0.009	0.004	0.025	3.63E-05	0.015
	rs11878602	−0.015	0.002	8.43E-12	−0.015	0.005	2.20E-03	−0.008	0.003	0.016	−0.013	0.002	1.84E-08	5.53E-06	0.224
Monocytes	rs1449263	0.036	0.004	5.21E-23	0.009	0.003	5.01E-03	0.041	0.006	2.92E-12	0.028	0.011	9.13E-03	3.34E-04	3.69E-10
	rs9880192	−0.028	0.004	1.23E-12	0.004	0.005	0.431	−0.038	0.011	6.74E-04	−0.020	0.013	0.125	4.43E-04	2.01E-07
	rs1991866	−0.034	0.004	1.79E-19	−0.005	0.003	0.138	−0.025	0.006	2.92E-05	−0.021	0.010	0.041	2.98E-04	4.73E-09
	rs10980800	−0.042	0.004	4.26E-22	−0.005	0.004	0.185	0.014	0.013	0.263	−0.012	0.016	0.428	6.75E-04	1.42E-11

Discussion

This meta-analysis has identified ten genome-wide significant trait-locus associations with WBC phenotypes spread across seven genomic loci. Of these trait-locus associations, seven associations represent novel findings, and three associations, across two genomic loci, confirm previously identified associations. The association of chromosome 17q21 near ORMDL3 and CSF3 associated in our study with total WBC count and neutrophil count, and the 9q31 locus associated with monocyte count have been previously demonstrated in European-ancestry and Japanese populations [25], [26]. The chromosome 3q21 locus near RPN1 and C3orf27 has been previously shown to be associated with eosinophil count, and is instead associated with related granulocyte cell measures of monocyte and basophil counts in this study, although we do identify suggestive p-values at ∼1.00E-4 to 1.00E-8 and the same direction of effect for the additional loci identified in Gubjartsson et al., 2010 [19], [27]. In addition, a number of previous GWAS have implicated the monocyte count associated locus at chromosome 8q24.21 as affecting height, renal function, serum protein levels, multiple sclerosis, glioma, leukemia and a number of other cancers [28]–[59]. Through conditional analyses and an analysis of functional relatedness, we have shown correlation among related traits and possible pleiotropic connectivity of these loci across phenotypes that represent measures of biologically related cellular lineages, as well as identified loci showing a direct association between allelic gene expression differences and variation in phenotypic measures.

The associations identified in this study are robust, and several have been previously identified in GWAS studies of immunologically relevant phenotypes, such as the association of celiac disease with the chromosome 2q31.1 locus containing ITGA4. ITGA4 encodes an alpha integrin subunit present on monocytes, lymphocytes, endothelial cells and erythrocytes that serves as an adhesion molecular receptor for VCAM-1, fibronectin. VCAM-1 is expressed at high levels on the vasculature of the bone marrow, and therefore alpha4integrin receptors may play a role in homing and recruitment of certain cell types during hematopoiesis [60], [61].

The overlapping loci on chromosome 17q21 associated with both total WBC and neutrophil counts constitute a single plausible candidate locus contributing to general variability in total WBC count and neutrophil count via hematopoietic mechanisms. These measures are highly correlated, and the estimated SNP effects are also likely correlated for this reason. From a functional perspective, the role of G-CSF, the CSF3 gene product, has been well-described in the biology of myeloid progenitor production and differentiation. This locus previously reached genome-wide significance in a joint analysis of discovery and replication cohorts of total WBC count in individuals of European ancestry [18]. The same locus, containing the genes PSMD3 and CSF3, was associated with neutrophil count in a cohort of Japanese participants [27]. This study of Japanese subjects also reported a significant genome-wide association with neutrophil count for three SNPs in a locus at chromosome 20, containing the gene PLCB4. Due to the lower minor allele frequency in European ancestry individuals of these three correlated SNPs (minor allele frequency = 0.076 in HapMap CEU samples versus 0.289 in HapMap JPT samples for rs2072910) and the marginal effect size detected in the original report, statistical power to detect this association was limited in our analysis [27].

Our data suggest that the locus on chromosome 17q21 has functional connectivity across white cell subtypes. Multiple genes at this locus appeared in all significant pathways identified in the GRAIL analyses showing a functional connectivity across both granulocyte and non-granulocyte cell lineages. Gene clusters detailed in the GRAIL analyses show significant functional clusters including genomic regions that are separately associated with granulocyte and non-granulocyte traits within the same cluster. However, the results of the GRAIL analyses may be influenced by both funding avenues and publication bias, as the classifications are based only on PubMed searchable published results.

Cis-eQTL effects on transcripts in the ORMDL3 and GSDML were shown to be highly correlated with allelic effects associated with neutrophil and total WBC count measures quantified in the meta-analysis. This suggests that variation in total WBC count and neutrophil counts are at least in part due to polymorphism-based regulation of gene expression. This regulation of gene expression by SNPs in this region may be related to some form of systemic immunological function, as variants in this region were also shown to be associated with expression of ORMDL3 in childhood asthma [62]. Interestingly, Okada et al. also showed significant eQTL associations between SNPs and transcripts at this locus, although their associations implicate regulation of PSMD3 as affecting neutrophil variation, rather than the exact transcript associations identified in this report [27]. This slight difference may be attributable to the larger sample size in our study, differences in population ancestry of the two studies, and the use of expression data derived from lymphoblastoid cell lines instead of whole blood. The CSF3 gene at this locus is a possible candidate contributing to effects across multiple cell subtypes at this locus, as functional studies have demonstrated that this gene creates a protein integral to the differentiation and functionality of granulocyte cells [63], [64], [65], [66]. One important caveat to our eQTL analysis is the lack of representation of the CSF3 gene on the expression array used.

The loci on chromosome 6p21 associated with total WBC and lymphocyte counts appear to be independent of each other as the lymphocyte association persists after adjustment for total WBC. While the closest replicated SNP associations are roughly 200 kb from each other, and a shared functional connectivity between these regions was not elucidated in the GRAIL analysis. At this locus, rs2524079 associated with lymphocyte count is in moderate LD with a number of SNPs in the periphery of the total WBC count locus, including rs2844619, a SNP significant in the total WBC count discovery phase analysis (D′ = 0.762, r² = 0.305, from HapMap Phase II CEU samples). The finding of multiple independent effects at a single locus has occurred in prior studies and include examples such as the finding of two independent signals within the PLAG1 locus associated with human height, suggesting a locus specific effect, in the current example affecting leucopoiesis or leukocyte homeostasis [48]. Our chromosome 6p21 WBC and lymphocyte loci, both harbor candidate genes that have been previously implicated as associated with phenotypes closely related to immunological function. The locus associated with total WBC count on chromosome 6p21 contains genes associated with follicular lymphoma (CHCG22), progression of HIV-1 infection (CDSN, PSORS1C1 and PSORS1C2) and psoriasis (CCHCR1, PSORS1C1 and PSORS1C2) [67], [68], [69]. This gene rich region includes HLA family genes, particularly HLA-B and HLA-C that are candidates within the lymphocyte associated locus, and actually overlap with the psoriasis candidate locus identified previously via linkage mapping studies and includes PSORS1C1 and PSORS1C2, showing the relatedness of these two loci on chromosome 6p21 [69], although conditional analyses adjusting for total white blood cell count validate these as primarily independent effects. The lymphocyte associated regions containing HLA-B and HLA-C, harbors two genes that have been implicated in multiple GWAS as modifiers of immunological responses, associated with IL-18 levels, HIV-1 control, vitiligo, multiple clerosis, and psoriasis [49], [68], [70], [71], [72], [73], [74], [75], [76].

The region associated with basophil and monocyte counts on chromosome 3q21 proximal to the GATA2 gene was previously described as associated with variation in eosinophil count in European and Asian ancestry populations [19]. This association with eosinophil count was not identified as genome-wide significant in our analyses, with multiple SNPs in the GATA2 region (GATA2+/−250 kb) approaching genome-wide significance, exhibiting p-values in the discovery meta-analysis including a regional minimum of 6.73E-07 (rs4328821 ). This is likely due to our decreased sample size for analyses of eosinophil count compared to the original report. However, our data show a significant association between this genomic region and basophil count, and basophils and eosinophils are both granulocytic cells with common lineage in WBC differentiation. GATA2 is a well-known transcription factor involved in maintenance of early hematopoietic cell pools and proximal hematopoietic pathways. In addition, this region proximal to GATA2 is also significantly associated with monocyte counts, showing overlapping associations across both granulocyte and non-granulocyte cell lineages and supporting the previously described functional role of GATA2 more proximally in the WBC differentiation process [77], [78]. The independence across granulocyte and non-granulocyte lineages is evident as both associations showed independent signals of association after adjustment for total WBC.

Our analyses have identified genomic loci associated with total WBC and constituent white cell subtypes in European ancestry cohorts. Our findings differ from the results of similar studies of African American and Asian ancestry populations. Population variation at previously identified total WBC count associated loci of DARC in African American cohorts and PLCB4 in Asian ancestry samples motivated our investigation of possible selection at the loci identified in this report, as allele frequencies of SNPs in DARC and PLCB4 differ across populations. This may be suggestive of recent selection. Our analyses of iHS statistics for genome-wide significant SNPs yielded only one locus under selection, with all SNPs investigated in this region being under negative selection in European ancestry populations. These SNPs on chromosome 19p13 associated with lymphocyte counts are proximal to candidate genes such as CHREP, which function in calcium homeostasis in lymphocytes, and mutations in the coding region of CALR3 are associated with familial hypertrophic cardiomyopathy [79], [80]. A thorough search of literature did not reveal any known selective factors associated with this locus. In addition, the fact that this locus remained genome-wide significant in random-effects modeling across diverse ancestral populations suggests a highly generalizable effect at this locus that may or may not be related to selective factors.

In conclusion, we have identified and replicated a set of 10 independent trait-locus associations influencing multiple related WBC traits, of which seven are novel associations. Integrative analyses of our association data and gene expression analyses support pleiotropic effects that will require further functional testing to clarify.

Materials and Methods

Ethics Statement

All participating studies conducted their research in accordance with their respective institutional scientific and ethical review boards. All human participants provided informed consent and all clinical investigation was conducted in accordance with the Declaration of Helsinki.

Study Methods

WBC counts were measured in 19,509 subjects in 7 discovery cohorts (The Rotterdam Study (RS), Framingham Heart Study(FHS), the NHLBI's Atherosclerosis Risk in Communities (ARIC) Study, the Age, Gene/Environment Susceptibility – Reykjavik Study (AGES) Study, Health Aging and Body Composition Study (Health ABC), the Baltimore Longitudinal Study of Aging (BLSA), and the Invecchiare in Chianti Study (InChianti)) and 11,823 subjects in 10 replication cohorts (the Sorbs, the Twins UK cohort (TwinsUK), Kooperative Gesundheitsforschung in der Region Augsburg (KORAF3 & KORAF4) and UK Blood Services Donor Panel 1 (UKBS1) studies, three of the Italian Network on Genetic Isolates (INGI) studies (Carlantino, Val Borbera and Friuli Venezia Giulia), the Rotterdam Study II (RSII) and the Heart and Vascular Health Study (HVH)). In order to study genetic factors affecting variation of these traits within normal ranges, each study excluded all participants with any WBC measure (total WBC or one of the 5 cell subtypes) outside of +/−2 standard deviations from the mean value for that trait.

WBC phenotypes were derived from data provided by fluorescence activated cell sorting technologies commonly employed in clinical and epidemiological studies to interrogate common hematological elements found in peripheral blood. Total WBC count was reported in thousands of cells per ml, and sub-type specific cell counts were calculated by multiplying the proportion of the WBC count comprised by each cell type by the total WBC measure. Any subject with a trait value greater than 2 SD from the corresponding mean of that trait in each cohort or missing data for any assayed phenotype were excluded from all analyses. Shapiro-Wilk tests of normality were implemented in the smallest discovery cohorts as the data was available at the time of study design (the InChianti study and the Baltimore Longitudinal Study of Aging, BLSA) to evaluate normality of the phenotypes for analyses. Raw values, natural log transformed and square-root transformed values for each phenotype of interest in these two studies were compared with regard to deviations from normality based assessment of the Shapiro-Wilk test statistic in the two studies. Based upon these reviews, a uniform analysis plan was established for conducting each study-level analyses, analysis, using either log transformation (total WBC count, neutrophil count, and monocyte count) or square-root transformation (basophil count, eosinophil count and lymphocyte count) in order to normalize the distributions of the phenotypic data.

At the study level, GWAS analyses were conducted on unrelated participants (except for FHS) of confirmed European ancestry based on either multi-dimensional scaling or principal components analyses, concordance between genotypic and self-reported gender, and successfully genotyped at >95% of attempted SNPs. SNPs were filtered based on criteria of minor allele frequency >0.01 (MAF), missingness per SNP<5% and Hardy-Weinberg equilibrium p-value>1.00E-7 (HWE, used to exclude poorly clustered genotypes). Participants and SNPs passing basic quality control were imputed to >2.4 million SNPs based on HapMapII haplotype data. All studies utilized multivariate linear regression to generate study level summary statistics for each phenotype, with allelic dosages at each SNP used as the independent variable and primary covariates of age at hematology assay, current smoking status and sex. Detailed descriptions of participating studies, their quality control practices and study-level analyses which may differ slightly from those described above are provided in Text S1.

To conduct meta-analyses, all studies submitted summary statistics from the study-level linear regression analyses for each phenotype. Meta-analyses were conducted using inverse-variance weighted fixed-effects models to combine beta coefficients and standard errors from study level regression results for each SNP to derive a combined p-value. Prior to discovery meta-analyses, SNPs were excluded if imputation quality metrics (equivalent to the squared correlation between proximal imputed and genotyped SNPs) were less than 0.30. Study level results were also corrected for genomic inflation factors (λ_GC) by incorporating study specific λ_GC estimates into the scaling of the standard errors (SE) of the regression coefficients by multiplying the SE by the square-root of the genomic inflation factor (see Table S1 for study and phenotype specific genomic inflation factors) [81]. Study specific genomic inflation factor estimates for all discovery cohorts were all <1.05 except for 1.12 in the Health ABC analysis of basophil count and 1.09 in the analysis of total white blood cell count in AGES (Table S1). No definitive cause of this inflation could be identified, and of particular interest, the genomic inflation factors for related traits in these two studies were within the normally accepted range. Meta-analyses were implemented using METAL and independently re-analyzed using R to confirm results [82].

We chose a priori to carry over all results from discovery meta-analyses at p-values<5.00E-08 to replication meta-analyses, excluding any SNPs with Cochran's Q test of heterogeneity p-values<0.01 or missing in more than 2 studies. These conservative exclusion criteria caused the exclusion of 6 of 167 genome-wide significant SNPs from replication analyses, and these SNPs did not constitute any new loci of interest. The final number of SNPs for replication analyses was then reduced to 161 candidate SNPs across all phenotypes. For replication meta-analyses of individual SNPs, each phenotype was analyzed separately using similar inverse-variance weighted meta-analyses as in the discovery stage analyses, although no genomic control was used. P-values for significant associations in the replication stage were corrected for the number of SNPs tested per phenotype using the standard Bonferroni correction for multiple testing (total WBC count corrected for 63 SNPs, with a significance threshold of p-value≤7.94E-4; neutrophil count corrected for 46 SNPs, with a significance threshold of p-value≤1.09E-3; basophil count corrected for 1 SNP, with a significance threshold of p-value≤0.05; lymphocyte count corrected for 14 SNPs, with a significance threshold of p-value≤3.57E-3; and monocyte count corrected for 37 SNPs, with a significance threshold of p-value≤1.35E-3). Of the 161 candidate SNPs included in the replication phase, 152 SNPs passed the trait-specific replication p-value thresholds. Ony one genome-wide significant locus failed to replicate, and this locus on chromosome 1q22 contained only one genome-wide significant SNP associated with monocyte count in the discovery phase.

Of the 152 successfully replicated associations, 109 SNPs were unique, since some SNPs were significant across multiple phenotypes. These replicated SNPs were analyzed in GRAIL to infer a possible biological connection between significant meta-analysis loci. GRAIL was used to mine textual data based on PubMed keywords to examine functional relatedness across phenotypes based on inferred biological interconectivity between genes proximal to meta-analysis results. SNP (rs) identifiers for these associated SNPs were input into the GRAIL webserver as a means of specifying genomic regions of interest in constructing query and seed regions to be analyzed. Genes for text mining of the functional datasource were identified using the LD structure of HapMap2 Release 22 CEU samples, gathering gene identifiers to search indexed abstracts from PubMed last curated on May 2010. Genes in regions of interest were clustered based on keyword similarity. These genes and clusters were then scored based on ranked similarity, adjusting for gene size, to generate p-values evaluating the strength of the functional interconnectivity of genes in the regions of interest. P-values for these functional clusters were then false-discovery rate adjusted (FDR) to correct for multiple testing, with the FDR adjusted p-value of 0.05 considered the threshold for significance.

For the eQTL analysis, 501 participants with complete genotyping data from the InChianti study were also successfully assayed on Illumina HT12v.3 genome-wide expression arrays using RNA isolated from whole blood. Quality control of the genome-wide expression data included the exclusion of probes with detection p-values>0.01 with missing data for greater than 5% of participants. Samples must have been assayed with at least 95% of the filtered probe sets passing quality control in order to be included in analyses.

5094 probes passed quality control and were subsequently cubic-spline normalized prior to analysis. In the investigation of possible cis-eQTL associations at regions of interest identified in the meta-analysis, all probes within 500 kb of successfully replicated SNPs from the meta-analysis were identified based on annotations from ReMOAT (http://www.compbio.group.cam.ac.uk/Resources/Annotation/) [83]. Thus, we tested 741 possible cis-eQTLs. Multivariate linear regression was implemented using PLINKv.1.07 [84], testing the dosage of minor alleles as a predictor of gene expression level for each probe. These linear regression models were adjusted for hybridization batch, amplification batches, sex, smoking, study site and age at baseline of study. The p-values generated by each analysis was corrected for the number of tests, with a minimum threshold of significance at p-value≤6.75E-05.

Supporting Information

Figure S1

Detailed association plot for the WBC locus at Chr6:31033692–31233692 bp. Locus specific plots showing top SNP per replicated locus +/−100 kilobases. SNPs in each region are color-coded based on linkage disequilibrium (r²) estimates from the CEU subset from HapMap Phase II: purple indicates reference SNP from meta-analysis, red indicates r²>0.8, orange indicates 0.6<r²≤0.8, green indicates 0.4<r²≤0.6, light blue indicates 0.2<r²≤0.4, and dark blue indicates r²≤0.2. Recombination rates estimated from the CEU HapMap Phase II data are included as a blue line in the background of the figure. Gene boundaries and exon positions are taken from RefSeq and UCSC Genome browser (build 36). Locus plots were generated using the LocusZoom Stand-alone package (http://genome.sph.umich.edu/wiki/LocusZoom_Standalone), incorporating the R packages Grid and Lattice, as well as the package New Fugue (http://genome.sph.umich.edu/wiki/New_Fugue) to estimate LD structure.

(PDF)

pgen.1002113.s001.pdfClick here for additional data file.

Figure S2

Detailed association plot for the WBC locus at Chr17:35310238–35510238 bp. Locus specific plots showing top SNP per replicated locus +/−100 kilobases. SNPs in each region are color-coded based on linkage disequilibrium (r²) estimates from the CEU subset from HapMap Phase II: purple indicates reference SNP from meta-analysis, red indicates r²>0.8, orange indicates 0.6<r²≤0.8, green indicates 0.4<r²≤0.6, light blue indicates 0.2<r²≤0.4, and dark blue indicates r²≤0.2. Recombination rates estimated from the CEU HapMap Phase II data are included as a blue line in the background of the figure. Gene boundaries and exon positions are taken from RefSeq and UCSC Genome browser (build 36). Locus plots were generated using the LocusZoom Stand-alone package (http://genome.sph.umich.edu/wiki/LocusZoom_Standalone), incorporating the R packages Grid and Lattice, as well as the package New Fugue (http://genome.sph.umich.edu/wiki/New_Fugue) to estimate LD structure.

(PDF)

pgen.1002113.s002.pdfClick here for additional data file.

Figure S3

Detailed association plot for the Neutrophil locus at Chr17:35306999–35506999 bp. Locus specific plots showing top SNP per replicated locus +/−100 kilobases. SNPs in each region are color-coded based on linkage disequilibrium (r²) estimates from the CEU subset from HapMap Phase II: purple indicates reference SNP from meta-analysis, red indicates r²>0.8, orange indicates 0.6<r²≤0.8, green indicates 0.4<r²≤0.6, light blue indicates 0.2<r²≤0.4, and dark blue indicates r²≤0.2. Recombination rates estimated from the CEU HapMap Phase II data are included as a blue line in the background of the figure. Gene boundaries and exon positions are taken from RefSeq and UCSC Genome browser (build 36). Locus plots were generated using the LocusZoom Stand-alone package (http://genome.sph.umich.edu/wiki/LocusZoom_Standalone), incorporating the R packages Grid and Lattice, as well as the package New Fugue (http://genome.sph.umich.edu/wiki/New_Fugue) to estimate LD structure.

(PDF)

pgen.1002113.s003.pdfClick here for additional data file.

Figure S4

Detailed association plot for the Basophil locus at Chr3:129699125–129899125 bp. Locus specific plots showing top SNP per replicated locus +/−100 kilobases. SNPs in each region are color-coded based on linkage disequilibrium (r²) estimates from the CEU subset from HapMap Phase II: purple indicates reference SNP from meta-analysis, red indicates r²>0.8, orange indicates 0.6<r²≤0.8, green indicates 0.4<r²≤0.6, light blue indicates 0.2<r²≤0.4, and dark blue indicates r²≤0.2. Recombination rates estimated from the CEU HapMap Phase II data are included as a blue line in the background of the figure. Gene boundaries and exon positions are taken from RefSeq and UCSC Genome browser (build 36). Locus plots were generated using the LocusZoom Stand-alone package (http://genome.sph.umich.edu/wiki/LocusZoom_Standalone), incorporating the R packages Grid and Lattice, as well as the package New Fugue (http://genome.sph.umich.edu/wiki/New_Fugue) to estimate LD structure.

(PDF)

pgen.1002113.s004.pdfClick here for additional data file.

Figure S5

Detailed association plot for the Lymphocyte locus at Chr6:31250153–31450153 bp. Locus specific plots showing top SNP per replicated locus +/−100 kilobases. SNPs in each region are color-coded based on linkage disequilibrium (r²) estimates from the CEU subset from HapMap Phase II: purple indicates reference SNP from meta-analysis, red indicates r²>0.8, orange indicates 0.6<r²≤0.8, green indicates 0.4<r²≤0.6, light blue indicates 0.2<r²≤0.4, and dark blue indicates r²≤0.2. Recombination rates estimated from the CEU HapMap Phase II data are included as a blue line in the background of the figure. Gene boundaries and exon positions are taken from RefSeq and UCSC Genome browser (build 36). Locus plots were generated using the LocusZoom Stand-alone package (http://genome.sph.umich.edu/wiki/LocusZoom_Standalone), incorporating the R packages Grid and Lattice, as well as the package New Fugue (http://genome.sph.umich.edu/wiki/New_Fugue) to estimate LD structure.

(PDF)

pgen.1002113.s005.pdfClick here for additional data file.

Figure S6

Detailed association plot for the Lymphocyte locus at Chr19:16309375–16509375 bp. Locus specific plots showing top SNP per replicated locus +/−100 kilobases. SNPs in each region are color-coded based on linkage disequilibrium (r²) estimates from the CEU subset from HapMap Phase II: purple indicates reference SNP from meta-analysis, red indicates r²>0.8, orange indicates 0.6<r²≤0.8, green indicates 0.4<r²≤0.6, light blue indicates 0.2<r²≤0.4, and dark blue indicates r²≤0.2. Recombination rates estimated from the CEU HapMap Phase II data are included as a blue line in the background of the figure. Gene boundaries and exon positions are taken from RefSeq and UCSC Genome browser (build 36). Locus plots were generated using the LocusZoom Stand-alone package (http://genome.sph.umich.edu/wiki/LocusZoom_Standalone), incorporating the R packages Grid and Lattice, as well as the package New Fugue (http://genome.sph.umich.edu/wiki/New_Fugue) to estimate LD structure.

(PDF)

pgen.1002113.s006.pdfClick here for additional data file.

Figure S7

Detailed association plot for the Monocyte locus at Chr2:181927546–182127546 bp. Locus specific plots showing top SNP per replicated locus +/−100 kilobases. SNPs in each region are color-coded based on linkage disequilibrium (r²) estimates from the CEU subset from HapMap Phase II: purple indicates reference SNP from meta-analysis, red indicates r²>0.8, orange indicates 0.6<r²≤0.8, green indicates 0.4<r²≤0.6, light blue indicates 0.2<r²≤0.4, and dark blue indicates r²≤0.2. Recombination rates estimated from the CEU HapMap Phase II data are included as a blue line in the background of the figure. Gene boundaries and exon positions are taken from RefSeq and UCSC Genome browser (build 36). Locus plots were generated using the LocusZoom Stand-alone package (http://genome.sph.umich.edu/wiki/LocusZoom_Standalone), incorporating the R packages Grid and Lattice, as well as the package New Fugue (http://genome.sph.umich.edu/wiki/New_Fugue) to estimate LD structure.

(PDF)

pgen.1002113.s007.pdfClick here for additional data file.

Figure S8

Detailed association plot for the Monocyte locus at Chr3:129680259–129880259 bp. Locus specific plots showing top SNP per replicated locus +/−100 kilobases. SNPs in each region are color-coded based on linkage disequilibrium (r²) estimates from the CEU subset from HapMap Phase II: purple indicates reference SNP from meta-analysis, red indicates r²>0.8, orange indicates 0.6<r²≤0.8, green indicates 0.4<r²≤0.6, light blue indicates 0.2<r²≤0.4, and dark blue indicates r²≤0.2. Recombination rates estimated from the CEU HapMap Phase II data are included as a blue line in the background of the figure. Gene boundaries and exon positions are taken from RefSeq and UCSC Genome browser (build 36). Locus plots were generated using the LocusZoom Stand-alone package (http://genome.sph.umich.edu/wiki/LocusZoom_Standalone), incorporating the R packages Grid and Lattice, as well as the package New Fugue (http://genome.sph.umich.edu/wiki/New_Fugue) to estimate LD structure.

(PDF)

pgen.1002113.s008.pdfClick here for additional data file.

Figure S9

Detailed association plot for the Monocyte locus at Chr8:130578550–130778550 bp. Locus specific plots showing top SNP per replicated locus +/−100 kilobases. SNPs in each region are color-coded based on linkage disequilibrium (r²) estimates from the CEU subset from HapMap Phase II: purple indicates reference SNP from meta-analysis, red indicates r²>0.8, orange indicates 0.6<r²≤0.8, green indicates 0.4<r²≤0.6, light blue indicates 0.2<r²≤0.4, and dark blue indicates r²≤0.2. Recombination rates estimated from the CEU HapMap Phase II data are included as a blue line in the background of the figure. Gene boundaries and exon positions are taken from RefSeq and UCSC Genome browser (build 36). Locus plots were generated using the LocusZoom Stand-alone package (http://genome.sph.umich.edu/wiki/LocusZoom_Standalone), incorporating the R packages Grid and Lattice, as well as the package New Fugue (http://genome.sph.umich.edu/wiki/New_Fugue) to estimate LD structure.

(PDF)

pgen.1002113.s009.pdfClick here for additional data file.

Figure S10

Detailed association plot for the Monocyte locus at Chr9:112855726–113055726 bp. Locus specific plots showing top SNP per replicated locus +/−100 kilobases. SNPs in each region are color-coded based on linkage disequilibrium (r²) estimates from the CEU subset from HapMap Phase II: purple indicates reference SNP from meta-analysis, red indicates r²>0.8, orange indicates 0.6<r²≤0.8, green indicates 0.4<r²≤0.6, light blue indicates 0.2<r²≤0.4, and dark blue indicates r²≤0.2. Recombination rates estimated from the CEU HapMap Phase II data are included as a blue line in the background of the figure. Gene boundaries and exon positions are taken from RefSeq and UCSC Genome browser (build 36). Locus plots were generated using the LocusZoom Stand-alone package (http://genome.sph.umich.edu/wiki/LocusZoom_Standalone), incorporating the R packages Grid and Lattice, as well as the package New Fugue (http://genome.sph.umich.edu/wiki/New_Fugue) to estimate LD structure.

(PDF)

pgen.1002113.s010.pdfClick here for additional data file.

Table S1

Genomic inflation factors λ_GC for discovery stage analyses. The λ_GC values were calculated inclusive of all SNPs analyzed, and these values were utilized as genomic control factors for the meta-analyses.

(PDF)

pgen.1002113.s011.pdfClick here for additional data file.

Table S2

Full discovery findings and replication results for SNPs of interest from the discovery stage of analyses.

(XLS)

pgen.1002113.s012.xlsClick here for additional data file.

Table S3

Results of fixed-effects meta-analyses incorperating a study-level adjustment for the most significant SNP per locus identified.

(TXT)

pgen.1002113.s013.txtClick here for additional data file.

Table S4

Results of fixed-effects meta-analyses incorperating a study-level adjustment for the total WBC measure.

(TXT)

pgen.1002113.s014.txtClick here for additional data file.

Table S5

Adjusted r² estimates across phenotypes.

(XLSX)

pgen.1002113.s015.xlsxClick here for additional data file.

Table S6

Gene based clustering from GRAIL. This includes all clusters evaluated.

(PDF)

pgen.1002113.s016.pdfClick here for additional data file.

Table S7

eQTL analysis results for all associations tested between SNPs and transcripts in a subset of 501 samples from the InCianti study.

(TXT)

pgen.1002113.s017.txtClick here for additional data file.

Table S8

iHS scores per SNP of interest.

(PDF)

pgen.1002113.s018.pdfClick here for additional data file.

Text S1

Study descriptions and additional information.

(DOC)

pgen.1002113.s019.docClick here for additional data file.

Footnotes

The authors have declared that no competing interests exist.

This research was made possible by NIA/NIH contract AG000932-02 (2009) Characterization of Normal Genomic Variability. This study utilized the high-performance computational capabilities of the Biowulf Linux cluster at the National Institutes of Health, Bethesda, MD [http://biowulf.nih.gov]. The Age, Gene/Environment Susceptibility Reykjavik Study is funded by NIH contract N01-AG-12100, the NIA Intramural Research Program, Hjartavernd (the Icelandic Heart Association), and the Althingi (the Icelandic Parliament). The Atherosclerosis Risk in Communities Study is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts N01-HC-55015, N01-HC-55016, N01-HC-55018, N01-HC-55019, N01-HC-55020, N01-HC-55021, N01-HC-55022, and grants R01HL087641, R01HL59367, and R01HL086694; National Human Genome Research Institute contract U01HG004402; and National Institutes of Health contract HHSN268200625226C. The authors thank the staff and participants of the ARIC study for their important contributions. Infrastructure was partly supported by Grant Number UL1RR025005, a component of the National Institutes of Health and NIH Roadmap for Medical Research. The National Heart, Lung, and Blood Institute's Framingham Heart Study is a joint project of the National Institutes of Health and Boston University School of Medicine and was supported by the National Heart, Lung, and Blood Institute's Framingham Heart Study (contract No. N01-HC-25195) and its contract with Affymetrix for genotyping services (contract No. N02-HL-6-4278). Analyses reflect the efforts and resource development from the Framingham Heart Study investigators participating in the SNP Health Association Resource (SHARe) project. A portion of this research was conducted using the Linux Cluster for Genetic Analysis (LinGA-II) funded by the Robert Dawson Evans Endowment of the Department of Medicine at Boston University School of Medicine and Boston Medical Center. The Health ABC Study was supported in part by the Intramural Research Program of the NIH, National Institute on Aging, NIA contracts N01AG62101, N01AG62103, and N01AG62106. The genome-wide association study was funded by NIA grant 1R01AG032098-01A1 to Wake Forest University Health Sciences and genotyping services were provided by the Center for Inherited Disease Research (CIDR). CIDR is fully funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, contract number HHSN268200782096C. The InChianti Study was supported as a “targeted project” (ICS 110.1RS97.71) by the Italian Ministry of Health, by the U.S. National Institute on Aging (Contracts N01-AG-916413, N01-AG-821336, 263 MD 9164 13, and 263 MD 821336), and in part by the Intramural Research Program, National Institute on Aging, National Institutes of Health, USA. David Melzer's participation was supported in part by NIH grant R01 AG24233-01. The Heart and Vascular Health (HVH) Study was funded by grants HL085251 and HL087652 from the National Heart, Lung and Blood Institute. The INGI cohorts and their investigators would like to acknowledge the regions of Friuli Venezia Giulia region and Fondo Trieste, and funding from Health Ministry project RF-FSR-2007-647201, Fondazione Compagnia di San Paolo and Fondazione Cassa di Risparmio di Alessandria. The KORA Augsburg studies were financed by the Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany, and supported by grants from the German Federal Ministry of Education and Research (BMBF). Part of this work was financed by the German National Genome Research Network (NGFN). This research was supported within the Munich Center of Health Sciences (MC Health) as part of LMUinnovativ. The GWAS database of the Rotterdam Study was funded through the Netherlands Organization of Scientific Research NWO (nr. 175.010.2005.011, 911.03.012) and the Research Institute for Diseases in the Elderly (RIDE). This study was supported by the Netherlands Genomics Initiative (NGI)/NWO project number 050 060 810 (Netherlands Consortium for Healthy Ageing). The Rotterdam Study is supported by the Erasmus Medical Center and Erasmus University, Rotterdam; the Netherlands organization for scientific research (NWO); the Netherlands Organization for the Health Research and Development (ZonMw); the Research Institute for Diseases in the Elderly (RIDE); the Netherlands Heart Foundation; the Ministry of Education, Culture, and Science; the Ministry of Health, Welfare, and Sports; the European Commission (DG XII); and the Municipality of Rotterdam. Janine F. Felix and Abbas Dehghan are supported by the Netherlands Organization for Scientific Research (NWO, VICI no. 918-76-619). The Twins UK study was funded by the Wellcome Trust, European Community's Seventh Framework Programme (FP7/2007–2013)/grant agreement HEALTH-F2-2008-201865-GEFOS and (FP7/2007–2013), ENGAGE project grant agreement HEALTH-F4-2007-201413, and the FP-5 GenomEUtwin Project (QLG2-CT-2002-01254). The study also receives support from the Dept of Health via the National Institute for Health Research (NIHR) comprehensive Biomedical Research Centre award to Guy's & St. Thomas' NHS Foundation Trust in partnership with King's College London. Tim D. Spector is an NIHR senior Investigator. The project also received support from a Biotechnology and Biological Sciences Research Council (BBSRC) project grant (G20234). Additional support includes the Wellcome Trust grant 076113/C/04/Z, the Juvenile Diabetes Research Foundation grant WT061858, and by the National Institute of Health Research of England. Nicole Soranzo is supported by Wellcome Trust (Core Grant Number 091746/Z/10/Z). The participation of Alex P. Reiner is supported by National Heart, Lung, and Blood Institute grant R01 HL-071862. Holger Prokisch's participation was supported by the German Federal Ministry of Education and Research (BMBF) funded German Center for Diabetes Research (DZD e.V.) and Systems Biology of Metabotypes (SysMBo #0315494A). Financial support was received from the German Research Council (KFO-152), IZKF (B27), and the German Diabetes Association. Inga Prokopenko and Vasiliki Lagou were partial funded through the European Community's Seventh Framework Programme (FP7/2007–2013), ENGAGE project, grant agreement HEALTH-F4-2007-201413. Acknowledgments from COGENT collaborators: COGENT and CARe: The authors acknowledge the essential role of the Continental Origins and Genetic Epidemiology Network (COGENT) Consortium in development and support of this manuscript. COGENT members participating in support of this manuscript include the WHI, ARIC, CARDIA, JHS, HANDLS, Health ABC, and GeneSTAR. CARe: The authors wish to acknowledge the support of the National Heart, Lung, and Blood Institute and the contributions of the research institutions, study investigators, field staff and study participants in creating this resource for biomedical research. The following parent studies contributed study data, ancillary study data, and DNA samples through the Broad Institute (N01-HC-65226) to create this genotype/phenotype data base for wide dissemination to the biomedical research community: Atherosclerosis Risk in Communities (ARIC): University of North Carolina at Chapel Hill (N01-HC-55015), Baylor Medical College (N01-HC-55016), University of Mississippi Medical Center (N01-HC-55021),University of Minnesota (N01-HC-55019), Johns Hopkins University (N01-HC-55020), University of Texas, Houston (N01-HC-55017), University of North Carolina (N01-HC-55018). Other NIH support contributing to the GWAS in ARIC are: R01HL087641, R01HL59367, R01HL86694, U01HG004402, and HHSN268200625226C. Coronary Artery Risk in Young Adults (CARDIA): University of Alabama at Birmingham (N01-HC-48047), University of Minnesota (N01-HC-48048), Northwestern University (N01-HC-48049), Kaiser Foundation Research Institute (N01-HC-48050), University of Alabama at Birmingham (N01-HC-95095), Tufts-New England Medical Center (N01-HC-45204), Wake Forest University (N01-HC-45205), Harbor-UCLA Research and Education Institute (N01-HC-05187), University of California, Irvine (N01-HC-45134, N01-HC-95100). Jackson Heart Study (JHS): Jackson State University (N01-HC-95170), University of Mississippi (N01-HC-95171), Tougaloo College (N01-HC-95172). Healthy Aging in Neighborhoods of Diversity across the Life Span Study (HANDLS): This research was supported by the Intramural Research Program of the NIH, National Institute on Aging and the National Center on Minority Health and Health Disparities (intramural project # Z01-AG000513 and human subjects protocol # 2009-149). Data analyses for the HANDLS study utilized the high-performance computational capabilities of the Biowulf Linux cluster at the National Institutes of Health, Bethesda, Maryland (http://biowulf.nih.gov). GeneSTAR: This research was supported by the National Heart, Lung, and Blood Institute (NHLBI) through the PROGENI (U01 HL72518) and STAMPEED (R01 HL087698-01) consortia. Additional support was provided by grants from the NIH/National Institute of Nursing Research (R01 NR08153), and the NIH/National Center for Research Resources (M01-RR000052) to the Johns Hopkins General Clinical Research Center. WHI: The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts N01WH22110, 24152, 32100-2, 32105-6, 32108-9, 32111-13, 32115, 32118-32119, 32122, 42107-26, 42129-32, and 44221. RIKEN collaborators: We would like to thank all the staff of the Laboratory for Statistical Analysis at RIKEN fortheir technical assistance. This study was supported by Ministry of Education, Culture, Sports, Science, and Technology, Japan. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgments

The authors acknowledge the essential role of the Cohorts for Heart and Aging Research inGenome Epidemiology (CHARGE) Consortium in development and support of this manuscript. CHARGE members include The Rotterdam Study (RS), Framingham Heart Study (FHS), Cardiovascular Health Study (CHS), the NHLBI's Atherosclerosis Risk in Communities (ARIC) Study, and the Age, Gene/Environment Susceptibility – Reykjavik Study (AGES) Study. The participation of studies representing the HeamGen Consortium aided in verifying the results from CHARGE, as well as further developing this manuscript, and include the Sorbs, the Twins UK cohort (TwinsUK), Kooperative Gesundheitsforschung in der Region Augsburg (KORAF3 & KORAF4), and UK Blood Services Donor Panel 1 (UKBS1) studies. Three of the Italian Network on Genetic Isolates (INGI) studies (Carlantino, Val Borbera, and Friuli Venezia Giulia) made valuable contributions to this research. The collaboration of studies such as the Health Aging and Body Composition Study (Health ABC), the Baltimore Longitudinal Study of Aging (BLSA), the Invecchiare in Chianti Study (InChianti), the Rotterdam Study II (RSII), and the Heart and Vascular Health Study (HVH) also played a vital role. Collaboration with the COGENT and RIKEN consortia facilitated additional research and collaboration enriching this project. We thank Dr. Michael Moorhouse, Pascal Arp, Mila Jhamai, Marijn Verkerk, and Sander Bervoets for their help in creating the genetic database. We thank the laboratory technicians Jeannette M. Vergeer-Drop, Bernadette H. M. van Ast-Copier, Andy A. L. J. van Oosterhout, Sue Ellen Mauricia, Andrea J. M. Vermeij-Verdoold, Els Halbmeijer-van der Plas, Debby M. S. Lont, and Hasna Kariouh for their help in phenotype assessment. We thank the staff from the Genotyping Facilities at the Wellcome Trust Sanger Institute for sample preparation, quality control, and genotyping led by Leena Peltonen and Panos Deloukas; Le Centre National de Geçnotypage, France, led by Mark Lathrop, for genotyping; Duke University, North Carolina, USA, led by David Goldstein, for genotyping; and the Finnish Institute of Molecular Medicine, Finnish Genome Center, University of Helsinki, led by Aarno Palotie. We would like to thank Knut Krohn (Microarray Core Facility of the Interdisciplinary Centre for Clinical Research, University of Leipzig) for the genotyping/analytical support and Joachim Thiery (Institute of Laboratory Medicine, Clinical Chemistry and Molecular Diagnostics, University of Leipzig) for clinical chemistry services. We thank Nigel W. Rayner (WTCHG, University of Oxford, UK) for the excellent bioinformatics support. The authors thank the WHI investigators and staff for their dedication, and the study participants for making the program possible. A listing of WHI investigators can be found at http://www.whiscience.org/publications/WHI_investigators_shortlist.pdf. The authors would also like to acknowledge Dr. Mark R. Cookson and Dr. Bryan J. Traynor for assistance manuscript formatting.

References

1. DaneshJCollinsRApplebyPPetoR1998Association of fibrinogen, C-reactive protein, albumin, or leukocyte count with coronary heart disease: meta-analyses of prospective studies.JAMA27914771482[PubMed][Google Scholar]
2. MadjidMAwanIWillersonJTCasscellsSW2004Leukocyte count and coronary heart disease: implications for risk assessment.J Am Coll Cardiol4419451956[PubMed][Google Scholar]
3. ShankarAWangJJRochtchinaEYuMCKeffordR2006Association between circulating white blood cell count and cancer mortality: a population-based cohort study.Arch Intern Med166188194[PubMed][Google Scholar]
4. RuggieroCMetterEJCherubiniAMaggioMSenR2007White blood cell count and mortality in the Baltimore Longitudinal Study of Aging.J Am Coll Cardiol4918411850[PubMed][Google Scholar]
5. Lloyd-JonesDMCamargoCAAllenLAGiuglianoRPO'DonnellCJ2003Predictors of long-term mortality after hospitalization for primary unstable angina pectoris and non-ST-elevation myocardial infarction.Am J Cardiol9211551159[PubMed][Google Scholar]
6. LeeCDFolsomARNietoFJChamblessLEShaharE2001White blood cell count and incidence of coronary heart disease and ischemic stroke and mortality from cardiovascular disease in African-American and White men and women: atherosclerosis risk in communities study.Am J Epidemiol154758764[PubMed][Google Scholar]
7. GrimmRHJrNeatonJDLudwigW1985Prognostic importance of the white blood cell count for coronary, cancer, and all-cause mortality.JAMA25419321937[PubMed][Google Scholar]
8. GillumRFMussolinoMEMadansJH2005Counts of neutrophils, lymphocytes, and monocytes, cause-specific mortality and coronary heart disease: the NHANES-I epidemiologic follow-up study.Ann Epidemiol15266271[PubMed][Google Scholar]
9. FriedmanGDKlatskyALSiegelaubAB1974The leukocyte count as a predictor of myocardial infarction.N Engl J Med29012751278[PubMed][Google Scholar]
10. BrownDWGilesWHCroftJB2001White blood cell count: an independent predictor of coronary heart disease mortality among a national cohort.J Clin Epidemiol54316322[PubMed][Google Scholar]
11. WheelerJGMussolinoMEGillumRFDaneshJ2004Associations between differential leucocyte count and incident coronary heart disease: 1764 incident cases from seven prospective studies of 30,374 individuals.Eur Heart J2512871292[PubMed][Google Scholar]
12. RanaJSBoekholdtSMRidkerPMJukemaJWLubenR2007Differential leucocyte count and the risk of future coronary artery disease in healthy men and women: the EPIC-Norfolk Prospective Population Study.J Intern Med262678689[PubMed][Google Scholar]
13. NietoFJSzkloMFolsomARRockRMercuriM1992Leukocyte count correlates in middle-aged adults: the Atherosclerosis Risk in Communities (ARIC) Study.Am J Epidemiol136525537[PubMed][Google Scholar]
14. PiliaGChenWMScuteriAOrruMAlbaiG2006Heritability of cardiovascular and personality traits in 6,148 Sardinians.PLoS Genet2e132doi:10.1371/journal.pgen.0020132[PubMed][Google Scholar]
15. ReichDNallsMAKaoWHAkylbekovaELTandonA2009Reduced neutrophil count in people of African descent is due to a regulatory variant in the Duffy antigen receptor for chemokines gene.PLoS Genet5e1000360doi:10.1371/journal.pgen.1000360[PubMed][Google Scholar]
16. NallsMAWilsonJGPattersonNJTandonAZmudaJM2008Admixture mapping of white cell count: genetic locus responsible for lower white blood cell count in the Health ABC and Jackson Heart studies.Am J Hum Genet828187[PubMed][Google Scholar]
17. GaneshSKZakaiNAvan RooijFJSoranzoNSmithAV2009Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium.Nat Genet4111911198[PubMed][Google Scholar]
18. SoranzoNSpectorTDManginoMKuhnelBRendonA2009A genome-wide meta-analysis identifies 22 loci associated with eight hematological parameters in the HaemGen consortium.Nat Genet4111821190[PubMed][Google Scholar]
19. GudbjartssonDFBjornsdottirUSHalapiEHelgadottirASulemP2009Sequence variants affecting eosinophil numbers associate with asthma and myocardial infarction.Nat Genet41342347[PubMed][Google Scholar]
20. KamataniYMatsudaKOkadaYKuboMHosonoNGenome-wide association study of hematological and biochemical traits in a Japanese population.Nat Genet42210215[PubMed][Google Scholar]
21. PsatyBMO'DonnellCJGudnasonVLunettaKLFolsomAR2009Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium: Design of prospective meta-analyses of genome-wide association studies from 5 cohorts.Circ Cardiovasc Genet27380[PubMed][Google Scholar]
22. RaychaudhuriSPlengeRMRossinEJNgACPurcellSM2009Identifying relationships among genomic disease regions: predicting genes at pathogenic SNP associations and rare deletions.PLoS Genet5e1000534doi:10.1371/journal.pgen.1000534[PubMed][Google Scholar]
23. VoightBFKudaravalliSWenXPritchardJK2006A map of recent positive selection in the human genome.PLoS Biol4e72doi:10.1371/journal.pbio.0040072[PubMed][Google Scholar]
24. HindorffLASethupathyPJunkinsHARamosEMMehtaJP2009Potential etiologic and functional implications of genome-wide association loci for human diseases and traits.Proc Natl Acad Sci U S A10693629367[PubMed][Google Scholar]
25. KamataniYMatsudaKOkadaYKuboMHosonoN2010Genome-wide association study of hematological and biochemical traits in a Japanese population.Nat Genet42210215[PubMed][Google Scholar]
26. FerreiraMAHottengaJJWarringtonNMMedlandSEWillemsenG2009Sequence variants in three loci influence monocyte counts and erythrocyte volume.Am J Hum Genet85745749[PubMed][Google Scholar]
27. OkadaYKamataniYTakahashiAMatsudaKHosonoN2010Common variations in PSMD3-CSF3 and PLCB4 are associated with neutrophil count.Hum Mol Genet1920792085[PubMed][Google Scholar]
28. 2009Genome-wide association study identifies new multiple sclerosis susceptibility loci on chromosomes 12 and 20.Nat Genet41824828[PubMed][Google Scholar]
29. BeatyTHMurrayJCMarazitaMLMungerRGRuczinskiIA genome-wide association study of cleft lip with and without cleft palate identifies risk variants near MAFB and ABCA4.Nat Genet42525529[PubMed][Google Scholar]
30. BirnbaumSLudwigKUReutterHHermsSSteffensM2009Key susceptibility locus for nonsyndromic cleft lip with or without cleft palate on chromosome 8q24.Nat Genet41473477[PubMed][Google Scholar]
31. Crowther-SwanepoelDBroderickPDi BernardoMCDobbinsSETorresMCommon variants at 2q37.3, 8q24.21, 15q21.3 and 16q24.1 influence chronic lymphocytic leukemia risk.Nat Genet42132136[PubMed][Google Scholar]
32. CuiROkadaYJangSGKuJLParkJGCommon variant in 6q26-q27 is associated with distal colon cancer in an Asian population.Gut[Google Scholar]
33. DuboisPCTrynkaGFrankeLHuntKARomanosJMultiple common variants for celiac disease influencing immune gene expression.Nat Genet42295302[PubMed][Google Scholar]
34. EastonDFPooleyKADunningAMPharoahPDThompsonD2007Genome-wide association study identifies novel breast cancer susceptibility loci.Nature44710871093[Google Scholar]
35. EelesRAKote-JaraiZAl OlamaAAGilesGGGuyM2009Identification of seven new prostate cancer susceptibility loci through a genome-wide association study.Nat Genet4111161121[PubMed][Google Scholar]
36. EelesRAKote-JaraiZGilesGGOlamaAAGuyM2008Multiple newly identified loci associated with prostate cancer susceptibility.Nat Genet40316321[PubMed][Google Scholar]
37. Enciso-MoraVBroderickPMaYJarrettRFHjalgrimHA genome-wide association study of Hodgkin's lymphoma identifies new susceptibility loci at 2p16.1 (REL), 8q24.21 and 10p14 (GATA3).Nat Genet4211261130[Google Scholar]
38. FerreiraRCPan-HammarstromQGrahamRRGatevaVFontanGAssociation of IFIH1 and other autoimmunity risk alleles with selective IgA deficiency.Nat Genet42777780[PubMed][Google Scholar]
39. FletcherOJohnsonNOrrNHoskingFJGibsonLJNovel breast cancer susceptibility locus at 9q31.2: results of a genome-wide association study.J Natl Cancer Inst103425435[PubMed][Google Scholar]
40. FrankeAMcGovernDPBarrettJCWangKRadford-SmithGLGenome-wide meta-analysis increases to 71 the number of confirmed Crohn's disease susceptibility loci.Nat Genet4211181125[Google Scholar]
41. GoodeELChenevix-TrenchGSongHRamusSJNotaridouMA genome-wide association study identifies susceptibility loci for ovarian cancer at 2q31 and 8q24.Nat Genet42874879[PubMed][Google Scholar]
42. GrantSFWangKZhangHGlabersonWAnnaiahK2009A genome-wide association study identifies a locus for nonsyndromic cleft lip with or without cleft palate on 8q24.J Pediatr155909913[PubMed][Google Scholar]
43. GudmundssonJSulemPGudbjartssonDFBlondalTGylfasonA2009Genome-wide association and replication studies identify four variants associated with prostate cancer susceptibility.Nat Genet4111221126[PubMed][Google Scholar]
44. GudmundssonJSulemPManolescuAAmundadottirLTGudbjartssonD2007Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24.Nat Genet39631637[PubMed][Google Scholar]
45. HansonRLCraigDWMillisMPYeattsKAKobesS2007Identification of PVT1 as a candidate gene for end-stage renal disease in type 2 diabetes using a pooling-based genome-wide single nucleotide polymorphism association study.Diabetes56975983[PubMed][Google Scholar]
46. KiemeneyLASulemPBesenbacherSVermeulenSHSigurdssonAA sequence variant at 4p16.3 confers susceptibility to urinary bladder cancer.Nat Genet42415419[PubMed][Google Scholar]
47. KiemeneyLAThorlaciusSSulemPGellerFAbenKK2008Sequence variant on 8q24 confers susceptibility to urinary bladder cancer.Nat Genet4013071312[PubMed][Google Scholar]
48. Lango AllenHEstradaKLettreGBerndtSIWeedonMNHundreds of variants clustered in genomic loci and biological pathways affect human height.Nature467832838[PubMed][Google Scholar]
49. MelzerDPerryJRHernandezDCorsiAMStevensK2008A genome-wide association study identifies protein quantitative trait loci (pQTLs).PLoS Genet4e1000072doi:10.1371/journal.pgen.1000072[PubMed][Google Scholar]
50. RothmanNGarcia-ClosasMChatterjeeNMalatsNWuXA multi-stage genome-wide association study of bladder cancer identifies multiple susceptibility loci.Nat Genet42978984[PubMed][Google Scholar]
51. SheteSHoskingFJRobertsonLBDobbinsSESansonM2009Genome-wide association study identifies five susceptibility loci for glioma.Nat Genet41899904[PubMed][Google Scholar]
52. TakataRAkamatsuSKuboMTakahashiAHosonoNGenome-wide association study identifies five new susceptibility loci for prostate cancer in the Japanese population.Nat Genet42751754[PubMed][Google Scholar]
53. TenesaAFarringtonSMPrendergastJGPorteousMEWalkerM2008Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21.Nat Genet40631637[PubMed][Google Scholar]
54. ThomasGJacobsKBYeagerMKraftPWacholderS2008Multiple loci identified in a genome-wide association study of prostate cancer.Nat Genet40310315[PubMed][Google Scholar]
55. TomlinsonIWebbECarvajal-CarmonaLBroderickPKempZ2007A genome-wide association scan of tag SNPs identifies a susceptibility variant for colorectal cancer at 8q24.21.Nat Genet39984988[PubMed][Google Scholar]
56. TomlinsonIPWebbECarvajal-CarmonaLBroderickPHowarthK2008A genome-wide association study identifies colorectal cancer susceptibility loci on chromosomes 10p14 and 8q23.3.Nat Genet40623630[PubMed][Google Scholar]
57. TurnbullCAhmedSMorrisonJPernetDRenwickAGenome-wide association study identifies five new breast cancer susceptibility loci.Nat Genet42504507[PubMed][Google Scholar]
58. YeagerMOrrNHayesRBJacobsKBKraftP2007Genome-wide association study of prostate cancer identifies a second risk locus at 8q24.Nat Genet39645649[PubMed][Google Scholar]
59. ZankeBWGreenwoodCMRangrejJKustraRTenesaA2007Genome-wide association scan identifies a colorectal cancer susceptibility locus on chromosome 8q24.Nat Genet39989994[PubMed][Google Scholar]
60. SimmonsPJMasinovskyBLongeneckerBMBerensonRTorok-StorbB1992Vascular cell adhesion molecule-1 expressed by bone marrow stromal cells mediates the binding of hematopoietic progenitor cells.Blood80388395[PubMed][Google Scholar]
61. AvecillaSTHattoriKHeissigBTejadaRLiaoF2004Chemokine-mediated interaction of hematopoietic progenitors with the bone marrow vascular niche is required for thrombopoiesis.Nat Med106471[PubMed][Google Scholar]
62. MoffattMFKabeschMLiangLDixonALStrachanD2007Genetic variants regulating ORMDL3 expression contribute to the risk of childhood asthma.Nature448470473[PubMed][Google Scholar]
63. O'MahonyDSPhamUIyerRHawnTRLilesWC2008Differential constitutive and cytokine-modulated expression of human Toll-like receptors in primary neutrophils, monocytes, and macrophages.Int J Med Sci518[PubMed][Google Scholar]
64. ValenteJFAlexanderJWLiBGNoelJGCusterDA2002Effect of in vivo infusion of granulocyte colony-stimulating factor on immune function.Shock172329[PubMed][Google Scholar]
65. NagataSTsuchiyaMAsanoSKaziroYYamazakiT1986Molecular cloning and expression of cDNA for human granulocyte colony-stimulating factor.Nature319415418[PubMed][Google Scholar]
66. SaitoTUsuiNAsaiODobashiNYanoS2007Elevated serum levels of human matrix metalloproteinase-9 (MMP-9) during the induction of peripheral blood stem cell mobilization by granulocyte colony-stimulating factor (G-CSF).J Infect Chemother13426428[PubMed][Google Scholar]
67. SkibolaCFBracciPMHalperinECondeLCraigDW2009Genetic variants at 6p21.33 are associated with susceptibility to follicular lymphoma.Nat Genet41873875[PubMed][Google Scholar]
68. FellayJGeDShiannaKVColomboSLedergerberB2009Common genetic variation and the control of HIV-1 in humans.PLoS Genet5e1000791doi:10.1371/journal.pgen.1000791[PubMed][Google Scholar]
69. ValdimarssonH2007The genetic basis of psoriasis.Clin Dermatol25563567[PubMed][Google Scholar]
70. CaponFBijlmakersMJWolfNQuarantaMHuffmeierU2008Identification of ZNF313/RNF114 as a novel psoriasis susceptibility gene.Hum Mol Genet1719381945[PubMed][Google Scholar]
71. De JagerPLJiaXWangJde BakkerPIOttoboniL2009Meta-analysis of genome scans and replication identify CD6, IRF8 and TNFRSF1A as new multiple sclerosis susceptibility loci.Nat Genet41776782[PubMed][Google Scholar]
72. LimouSLe ClercSCoulongesCCarpentierWDinaC2009Genomewide association study of an AIDS-nonprogression cohort emphasizes the role played by HLA genes (ANRS Genomewide Association Study 02).J Infect Dis199419426[PubMed][Google Scholar]
73. LiuYHelmsCLiaoWZabaLCDuanS2008A genome-wide association study of psoriasis and psoriatic arthritis identifies new disease loci.PLoS Genet4e1000041doi:10.1371/journal.pgen.1000041[PubMed][Google Scholar]
74. NairRPDuffinKCHelmsCDingJStuartPE2009Genome-wide scan reveals association of psoriasis with IL-23 and NF-kappaB pathways.Nat Genet41199204[PubMed][Google Scholar]
75. PelakKGoldsteinDBWalleyNMFellayJGeD2010Host determinants of HIV-1 control in African Americans.J Infect Dis20111411149[PubMed][Google Scholar]
76. QuanCRenYQXiangLHSunLDXuAE2010Genome-wide association study for vitiligo identifies susceptibility loci at 6q27 and the MHC.Nat Genet42614618[PubMed][Google Scholar]
77. TsaiFYOrkinSH1997Transcription factor GATA-2 is required for proliferation/survival of early hematopoietic cells and mast cell formation, but not for erythroid and myeloid terminal differentiation.Blood8936363643[PubMed][Google Scholar]
78. LabbayeCValtieriMBarberiTMecciaEMasellaB1995Differential expression and functional role of GATA-2, NF-E2, and GATA-1 in normal adult hematopoiesis.J Clin Invest9523462358[PubMed][Google Scholar]
79. O'RourkeFALaPlanteJMFeinsteinMB2003Antisense-mediated loss of calcium homoeostasis endoplasmic reticulum protein (CHERP; ERPROT213-21) impairs Ca2+ mobilization, nuclear factor of activated T-cells (NFAT) activation and cell proliferation in Jurkat T-lymphocytes.Biochem J373133143[PubMed][Google Scholar]
80. ChiuCTeboMInglesJYeatesLArthurJW2007Genetic screening of calcium regulation genes in familial hypertrophic cardiomyopathy.J Mol Cell Cardiol43337343[PubMed][Google Scholar]
81. de BakkerPIFerreiraMAJiaXNealeBMRaychaudhuriS2008Practical aspects of imputation-driven meta-analysis of genome-wide association studies.Hum Mol Genet17R122128[PubMed][Google Scholar]
82. WillerCJLiYAbecasisGR2010METAL: Fast and efficient meta-analysis of genomewide association scans.Bioinformatics[Google Scholar]
83. Barbosa-MoraisNLDunningMJSamarajiwaSADarotJFRitchieME2010A re-annotation pipeline for Illumina BeadArrays: improving the interpretation of gene expression data.Nucleic Acids Res38e17[PubMed][Google Scholar]
84. PurcellSNealeBTodd-BrownKThomasLFerreiraMA2007PLINK: a tool set for whole-genome association and population-based linkage analyses.Am J Hum Genet81559575[PubMed][Google Scholar]

Multiple Loci Are Associated with White Blood Cell Phenotypes

Abstract

Author Summary

Introduction

Results

Results of discovery phase meta-analyses for white blood cell phenotypes.

GRAIL gene clusters represented graphically for all clusters reaching a FDR adjusted p-value <0.05.

The directionality and magnitude of effects for SNPs significant in both the meta-analysis and eQTL analysis are highly correlated (r2 = 0.795).

Discussion

Materials and Methods

Ethics Statement

Study Methods

Supporting Information

Footnotes

Acknowledgments

References

The directionality and magnitude of effects for SNPs significant in both the meta-analysis and eQTL analysis are highly correlated (r² = 0.795).