Nature genetics. Oct/31/2013; 45(11): 1274-1283

Published online Oct/5/2013

PMID: 24097068

PMC: 3838666

doi: 10.1038/ng.2797

Discovery and Refinement of Loci Associated with Lipid Levels

Martin L. Buchkovich+250 authors

Abstract

Low-density lipoprotein (LDL) cholesterol, high-density lipoprotein (HDL) cholesterol, triglycerides, and total cholesterol are heritable, modifiable, risk factors for coronary artery disease. To identify new loci and refine known loci influencing these lipids, we examined 188,578 individuals using genome-wide and custom genotyping arrays. We identify and annotate 157 loci associated with lipid levels at P < 5×10⁻⁸, including 62 loci not previously associated with lipid levels in humans. Using dense genotyping in individuals of European, East Asian, South Asian, and African ancestry, we narrow association signals in 12 loci. We find that loci associated with blood lipids are often associated with cardiovascular and metabolic traits including coronary artery disease, type 2 diabetes, blood pressure, waist-hip ratio, and body mass index. Our results illustrate the value of genetic data from individuals of diverse ancestries and provide insights into biological mechanisms regulating blood lipids to guide future genetic, biological, and therapeutic research.

Introduction

Blood lipids are heritable, modifiable, risk factors for coronary artery disease (CAD)^1,2, a leading cause of death³. Human genetic studies of lipid levels can identify targets for new therapies for cholesterol management and prevention of heart disease, and can complement animal studies^4,5. Studies of naturally occurring genetic variation can proceed through large-scale association analyses focused on unrelated individuals or through investigation of Mendelian forms of dyslipidemia in families⁶. We previously identified 95 loci associated with blood lipids, accounting for ~10-12% of the total trait variance⁴ and showed that variants with small effects can point to pathways and therapeutic targets that enable clinically-important changes in blood lipids^4,7.

Here, we report on studies of naturally occurring variation in 188,578 European-ancestry individuals and 7,898 non-European ancestry individuals. Our analyses identify 157 loci associated with lipid levels at P < 5×10⁻⁸, including 62 new loci. Thirty of the 62 loci do not include genes implicated in lipid biology by previous literature. We tested lipid-associated SNPs for association with mRNA expression levels, carried out pathway analyses to uncover relationships between loci, and compared the locations of lipid-associated SNPs with those of genes and other functional elements in the genome. These results provide direction for biological and therapeutic research into risk factors for CAD.

Results

Novel loci associated with blood lipid levels

We examined subjects of European ancestry, including 94,595 individuals from 23 studies genotyped with GWAS arrays⁴ and 93,982 individuals from 37 studies genotyped with the Metabochip array⁸ (Supplementary Table 1 and Supplementary Fig. 1). The Metabochip includes variants representing promising loci from our previous GWAS (14,886 SNPs) and from GWAS of other CAD risk factors and related traits (50,459 SNPs), variants from the 1000 Genomes Project⁹ and focused resequencing¹⁰ efforts in 64 previously associated loci (28,923 SNPs), and fine-mapping variants in 181 loci associated with other traits (93,308 SNPs). In cases where Metabochip and GWAS array data were available for the same individuals, we used Metabochip data to ensure key variants were directly genotyped, rather than imputed.

We excluded individuals known to be on lipid lowering medications and evaluated the additive effects of each SNP on blood lipid levels after adjusting for age and sex. Genomic control values¹¹ for the initial meta-analyses were 1.10–1.15, low for a sample of this size, indicating that population stratification should have only a minor impact on our results (Supplementary Fig. 2). After genomic control correction, 157 loci associated with blood lipid levels were identified (P < 5×10⁻⁸), including 62 new loci (Tables 1A-D, Figure 1, Supplementary Tables 2 and 3). Loci were >1 Mb apart and nearly independent (r² < 0.10). Of the 62 novel loci, 24 demonstrated the strongest evidence of association with HDL cholesterol, 15 with LDL cholesterol, 8 with triglyceride levels, and 15 with total cholesterol (Supplementary Fig. 3). Several of these loci were validated by a similar extension based on GLGC GWAS results¹².

The effects of newly identified loci were generally smaller than in earlier GWAS (Supplementary Fig. 4). For the 62 newly identified variants, the trait variance explained in the Framingham offspring was 1.6% for HDL cholesterol, 2.1% for triglycerides, 2.4% for LDL cholesterol, and 2.6% for total cholesterol.

Overlap of genetic discoveries and prior knowledge

To investigate connections between our new loci and known lipid biology, we first catalogued genes within 100 kb of the peak associated SNPs and searched PubMed and OMIM for occurrences of these gene names and their aliases in the context of relevant keywords. After manual curation, we identified at least one strong candidate in 32 of the 62 loci (52%) (Supplementary Table 4). For the remaining 30 loci, we found no literature support for the role of a nearby gene on blood lipid levels. This search highlighted genes whose connections to lipid metabolism have been extensively documented in mouse models (such as VLDLR¹³ and LRPAP1¹³) and human cell lines (such as VIM¹⁴), as well as candidates whose connection to lipid levels is more recent, such as VEGFA. For the latter, recent studies of VEGFB have suggested that vascular endothelial growth factors have an unexpected role in the targeting of lipids to peripheral tissues¹⁵, which we corroborate by associating variants near VEGFA with blood triglyceride and HDL levels.

Multiple types of evidence supported several literature candidates (Supplementary Table 2). For example, VLDLR is categorized by Gene Ontology¹⁶ in the retinoid X nuclear receptor (RXR) activation pathway, which also includes genes (APOB, APOE, CYP7A1, APOA1, HNF1A, HNF4A) in previously implicated loci⁴. However, since these additional sources of evidence build on overlapping knowledge, they are not truly independent.

To estimate the probability of finding ≥32 literature supported candidates after automated search and manual review of results, we repeated our text-mining literature search using 100 permutations of SNPs matched for allele frequency, distance to the nearest gene, and number of linkage disequilibrium proxies. To approximate hand-curation of the text-mining results, we focused on genes implicated by 3 or more publications (25 in observed data, 8.7 on average in control SNP sets, P = 8×10⁻⁸).

Pathway analyses

We performed a gene-set enrichment analysis, using MAGENTA¹⁷, to evaluate over-representation of biological pathways among associated loci. Across the 157 loci, MAGENTA identified 71 enriched pathways. These pathways included at least one gene in 20 of our newly identified loci (Supplementary Table 5). Examples include DAGLB (connected to previously associated loci by genes in the triglyceride lipase activity pathway), INSIG2 (connected by the cholesterol and steroid metabolic process pathways), AKR1C4 (connected by the steroid metabolic process and bile acid biosynthesis pathways), VLDLR (connected by the retinoid X receptor activation and lipid transport pathways, among others), PPARA, ABCB11, and UGT1A1 (three genes assigned to pathways implicated in activation of nuclear hormone receptors, which play an important role in lipid metabolism through the transcriptional regulation of genes in sterol metabolic pathways¹⁸). Among the 16 loci where literature review and pathway analysis both suggested a candidate, the predictions overlapped 14 times (Supplementary Table 2; by chance, we expect 6.6 overlapping predictions, P = 1×10⁻⁵).

Protein-protein interactions

We assessed evidence for physical interactions between proteins encoded near our associated SNPs using DAPPLE¹⁹. We found an excess of direct protein-protein interactions for genes in loci associated with LDL (10 interactions, P = 0.0002), HDL (8 interactions, P = 0.002), and total cholesterol (6 interactions, P = 0.017), but not for triglycerides (2 interactions, P = 0.27) (Supplementary Fig. 5). Most of the interactions involved genes at known loci (such as the interaction network connecting PLTP, APOE, APOB, and LIPC) or highlighted the same genes as literature and pathway analyses (such as those connecting VLDLR, APOE, APOB, CETP, and LPL). Among novel loci, we identified a link between AKT1 and GSK3B. GSK3B has been shown to play a role in energy metabolism²⁰ and its activity is regulated by AKT1 through phosphorylation²¹. Literature review also supported a role in blood lipid levels for these two genes.

Regulation of gene expression by associated variants

Many complex trait associated variants act through the regulation of gene expression. We examined whether our 62 novel variants were associated with expression levels of nearby genes in liver, omental fat, or subcutaneous fat. Fifteen were associated with expression of a nearby transcript with P < 5×10⁻⁸ (Supplementary Table 6) and, in seven, the lipid-associated variant was in strong disequilibrium with the strongest expression-quantitative trait locus (eQTL) for the region (r² > 0.8). In three of these loci, literature search also prioritized candidate genes. In all three, eQTL analysis and literature review identified the same candidate (DAGLB, SPTLC3, and PXK, P = 0.05). For the remaining four loci (near RBM5, ADH5, TMEM176A, and GPR146), analysis of expression levels identified candidates that were not supported by literature or pathway analyses.

Coding variation

In some loci where previous coding variant association studies were inconclusive, we now find convincing evidence of association, demonstrating the benefits of the large sample sizes achievable by collaboration. For example, in the APOH locus²², our most strongly associated variant is rs1801689 (APOH C325G, P = 1×10⁻¹¹ for LDL cholesterol). Overall, at 15 of the 62 new loci, there is at least one nonsynonymous variant within 100 kb and in strong (r² > 0.8) linkage disequilibrium with the index SNP (Supplementary Table 7; 18 loci with no restrictions on distance). This ~30% overlap between associated loci and coding variation is similar to that in other complex traits⁹. Unexpectedly, in the 11 loci where a candidate was suggested by literature review and by coding variation, the two coincided seven times (P = 0.03 compared to expected chance overlap of 3.8 times); thus, agreement between literature and coding variation was less significant than for eQTL and pathway analysis or protein-protein interactions.

Overlap between association signals and regulators of transcription in liver

Despite our efforts, 18 of the 62 new loci remain without prioritized candidate genes. The liver is an important hub of lipid biosynthesis and there is evidence that lipid loci might be associated with changes in gene regulation in liver cells²³. Using ENCODE data²³, we evaluated whether associated SNPs overlapped experimentally annotated functional elements identified in HepG2 cells, a commonly used model of human hepatocytes. To determine significance, we generated 100,000 lists of permuted SNPs, matched for minor allele frequency, distance to the nearest gene, and number of SNPs in r² > 0.8 (described in Methods). In HepG2 cells, lipid-associated SNPs were enriched in eight of the 15 functional chromatin states defined by Ernst et al.²⁴ (P < 1×10⁻⁵; Supplementary Table 8). The strongest enrichment was in regions with “strong enhancer activity” (3.7-fold enrichment, P = 2×10⁻²⁵; Supplementary Table 9). In the other eight cell types examined by Ernst et al., no more than three functional chromatin states showed evidence for enrichment (and, when present, enrichment was weaker).

We proceeded to investigate the overlap between lipid loci and functional marks in HepG2 cells in more detail (Supplementary Table 9). Notable regulatory elements showing significant overlap with lipid loci included histone marks associated with active regulatory regions (H3K27ac, P = 3×10⁻²⁰; H3K9ac, P = 3×10⁻²²), promoters (H3K4me3, P = 2×10⁻¹⁵, H3K4me2, P = 8×10⁻¹²), transcribed regions (H3K36me3, P = 4×10⁻¹⁴), indicators of open chromatin (FAIRE, P = 5×10⁻⁹; DNase, P = 2×10⁻⁴), and regions that interact with transcription factors HNF4A (P = 6×10⁻¹⁰) and CEBP/B (P = 1×10⁻⁵). Overall, 56 of our 62 new loci contained at least one SNP that overlaps a functional mark²⁴ and/or chromatin state²³ highlighted in Supplementary Table 9, including all but 3 of the loci where no candidates were suggested by literature review or analyses of pathways, coding variation, or gene expression (Supplementary Table 10).

Initial fine-mapping of 65 lipid-associated loci

Previous fine-mapping of five LDL-associated lipid loci found that variants showing the strongest association were often substantially different in frequency and effect size from those identified in GWAS¹⁰. Metabochip genotypes enabled us to carry out an initial fine-mapping analysis for 65 loci: 60 selected for fine-mapping based on our previous study⁴ and 5 nominated for fine-mapping because of association to other traits.

For each of these loci, we identified the most strongly associated Metabochip variant and evaluated whether it (a) reached genome-wide significant evidence for association (to avoid chance fluctuations in regions where the signal was relatively weak) and (b) was different from the GWAS index SNP in terms of frequency and effect size (operationalized to r² < 0.8 with the GWAS index SNP). In the European samples, fine-mapping identified eight loci where the fine-mapping signal was clearly different from the GWAS signal (Supplementary Table 11). The two largest differences were at the loci near PCSK9 (top GWAS variant with minor allele frequency f = 0.24 and P = 9×10⁻²⁴; fine-mapping variant with f = 0.03, P = 2×10⁻¹³⁶) and APOE (GWAS variant f = 0.20, P = 3×10⁻⁴⁴, fine-mapping variant f = 0.07, P = 3×10⁻⁶⁵¹), consistent with Sanna et al¹⁰. Large differences were also observed near LRP4 (GWAS f = 0.17, P = 8×10⁻¹⁴; fine-mapping f = 0.35, P = 1×10⁻²⁶), IGF2R (GWAS f = 0.16, P = 7×10⁻⁹; fine-mapping f = 0.37, P = 2×10⁻¹³), NPC1L1 (GWAS f = 0.27, P = 2×10⁻⁵; fine-mapping f = 0.24, P = 1×10⁻¹²), ST3GAL4 (GWAS f = 0.26, P = 2×10⁻⁶; fine-mapping f = 0.07, P = 6×10⁻¹¹), MED1 (GWAS f = 0.37, P = 3×10⁻⁵; fine-mapping f = 0.24, P = 2×10⁻¹⁰), and COBLL1 (GWAS f = 0.12, P = 2×10⁻⁶; fine-mapping f = 0.11, P = 6×10⁻⁹). Thus, although the large changes observed by Sanna et al¹⁰ after fine-mapping are by no means unique, they are not typical. Except for the R46L variant in PCSK9, the variants showing strongest association in fine-mapped loci all had minor allele frequency > .05.
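The two-part rule above (genome-wide significance plus r² < 0.8 with the GWAS index SNP) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and r² is approximated here as the squared Pearson correlation of genotype counts, which is an assumption about how linkage disequilibrium was measured.

```python
def r_squared(geno_a, geno_b):
    """Squared Pearson correlation between two vectors of allele counts (0/1/2)."""
    n = len(geno_a)
    mean_a = sum(geno_a) / n
    mean_b = sum(geno_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    return cov * cov / (var_a * var_b)


def is_distinct_signal(p_fine, r2_with_index, p_threshold=5e-8, r2_threshold=0.8):
    """Apply criteria (a) and (b): significant, and not a close proxy of the index SNP."""
    return p_fine < p_threshold and r2_with_index < r2_threshold
```

For instance, a fine-mapping variant with P = 2×10⁻¹³⁶ (the PCSK9 value quoted above) and a hypothetical r² of 0.1 with the index SNP would be flagged as distinct, whereas a variant with r² = 0.95 would not, however small its P-value.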

We also attempted fine-mapping in African (N=3,263), East Asian (N=1,771), and South Asian (N=4,901) ancestry samples. Despite comparatively small samples, ancestry-specific analyses identified SNPs clearly distinct from the original GWAS variant in five loci (Supplementary Table 11). These were: APOE, consistent with European ancestry analyses above; three loci where differences in linkage disequilibrium between populations enabled fine-mapping in African (SORT1, LDLR) or East Asian (APOA5) ancestry samples; and CETP, where an African-specific variant was present. For CETP, SORT1, and APOA5, results are consistent with other fine-mapping and functional studies^7,25,26.

Association of lipid loci with metabolic and cardiovascular traits

To evaluate the role of the 157 loci identified here on related traits, we evaluated the most strongly associated SNPs for each locus in genetic studies of coronary artery disease (CAD, N=114,590 including 37,653 cases)^27,28, type 2 diabetes (T2D, N=47,117 including 8,130 cases)²⁹, body mass index (BMI, N=123,865 individuals)³⁰ and waist-hip ratio (WHR, N=77,167 individuals)³¹, systolic and diastolic blood pressure (SBP and DBP, N=69,395 individuals)³², and fasting glucose (N=46,186 non-diabetics)³³. We observed an excess of SNPs nominally associated (P < 0.05) with all these traits: a 5.1 fold excess for CAD (40 nominally significant loci, P = 2×10⁻¹⁹), a 4.1 fold excess for BMI (32 loci, P = 1×10⁻¹¹), a 3.7 fold excess for DBP (29 loci, P = 1×10⁻⁹), a 3.4 fold excess for WHR (27 loci, P = 1×10⁻⁹), a 2.5 fold excess for SBP (20 loci, P = 1×10⁻⁴), a 2.3 fold excess for T2D (18 loci, P = 0.001), and a 2.2 fold excess for fasting glucose (17 loci, P = 3×10⁻³) (Supplementary Table 12). Interestingly, among the novel loci, we observed greater overlap with BMI, SBP, and DBP (9 overlapping loci each) than with CAD (8 overlapping loci). Among new loci, the two SNPs showing strongest association to CAD map near RBM5 (rs2013208, P_HDL = 9×10⁻¹², P_CAD = 7×10⁻⁵) and CMTM6 (rs7640978, P_LDL = 1×10⁻⁸, P_CAD = 4×10⁻⁴).
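The fold excesses quoted above are consistent with dividing the number of nominally significant loci by the number expected under the null (157 × 0.05 ≈ 7.85). The sketch below reproduces that arithmetic. The binomial upper tail is included only as one plausible way to attach a P-value; this section does not state which test was used, so the tail probability should not be expected to reproduce the reported P-values.

```python
from math import comb


def fold_excess(n_hits, n_loci, alpha=0.05):
    """Observed count of nominally significant loci over the count expected by chance."""
    return n_hits / (n_loci * alpha)


def binomial_upper_tail(n_hits, n_loci, alpha=0.05):
    """P(X >= n_hits) for X ~ Binomial(n_loci, alpha). Assumed test, not the paper's."""
    return sum(
        comb(n_loci, k) * alpha ** k * (1 - alpha) ** (n_loci - k)
        for k in range(n_hits, n_loci + 1)
    )


# 40 of 157 loci nominally associated with CAD
print(round(fold_excess(40, 157), 1))  # 5.1
```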

We tested whether the LDL-, total cholesterol- or triglyceride-increasing allele, or HDL-decreasing allele was associated with increased risk of cardiovascular disease or related metabolic outcomes; the direction of effect of each locus was categorized according to the primary association signal at the locus, as in Tables 1A-D. We observed association with increased CAD risk (104/149, P = 1×10⁻⁶), SBP (96/155, P = 2.7×10⁻³) and WHR adjusted for BMI (92/154, P = 0.019). There were many instances where a single locus was associated with many traits. These included variants near FTO, consistent with previous reports³⁴; near VEGFA (associated with triglyceride levels, CAD, T2D, SBP, and DBP), near SLC39A8 (associated with HDL cholesterol, BMI, SBP, and DBP), and near MIR581 (associated with HDL cholesterol, BMI, T2D, and DBP). In some cases, like FTO, a strong association with BMI or another phenotype generates weaker association signals for other metabolic traits³⁴. In other cases, like SORT1, a primary effect on lipid levels may mediate secondary association with other traits, like CAD⁷.
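Counts such as 104 of 149 loci sharing a direction of effect can be assessed with a sign test against the null of 50% concordance. The following is a hedged sketch of an exact one-sided binomial sign test; the text does not say whether the reported P-values are one-sided or two-sided, or whether an exact test was used, so this is an illustration of the idea rather than a reproduction of the analysis.

```python
from math import comb


def sign_test_one_sided(n_concordant, n_total):
    """Exact P(X >= n_concordant) for X ~ Binomial(n_total, 0.5)."""
    tail = sum(comb(n_total, k) for k in range(n_concordant, n_total + 1))
    return tail / 2 ** n_total  # exact integer arithmetic, then one float division


# 104 of 149 lipid loci with the risk-increasing direction for CAD
p_cad = sign_test_one_sided(104, 149)
```

Doubling the one-sided value (capped at 1) gives a two-sided P-value if no direction is specified in advance.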

Association of individual lipids with coronary artery disease

Epidemiological studies consistently show high total cholesterol and LDL cholesterol levels are associated with increased risk of CAD, whereas high HDL cholesterol levels are associated with reduced risk of CAD³⁵. In genetic studies, the connection between LDL cholesterol and CAD is clear, whereas the results for HDL cholesterol levels are more equivocal^36-38. In our data, trait increasing alleles at the loci showing strongest association with LDL cholesterol (31 loci), triglycerides (30 loci), or total cholesterol (38 loci) were associated with increased risk of CAD (P = 2×10⁻¹², P = 2×10⁻¹⁶, and P = 0.006, respectively). Conversely, trait decreasing alleles at loci showing the strongest association with HDL cholesterol (64 loci) were associated with increased CAD risk with P = 0.02. When we focused on loci uniquely associated with LDL cholesterol (12 loci where P > .05 for other lipids), triglycerides (6 loci), or HDL cholesterol (14 loci), only the LDL association remained significant (P = 0.03).

To better explore how associations with individual lipid levels related to CAD risk, we used linear regression to test whether association with lipid levels could predict impact on CAD risk. In this analysis, the effect on CAD of 149 lipid loci (CAD results were not available for 8 SNPs) was correlated with LDL (Pearson r=0.74, P = 7×10⁻⁶) and triglyceride (Pearson r=0.46, P = 0.02) effect sizes, but not HDL effect sizes (Pearson r=−9×10⁻⁴, P = 0.99; Supplementary Fig. 6). Since most variants affect multiple lipid fractions (Figure 1), dissecting the relationship between lipid level and CAD effects requires multivariate analysis. In a companion manuscript, we use multivariate analysis and detailed examination of triglyceride associated loci to show that increased LDL and triglyceride levels, but not HDL, appear causally related to CAD risk.

Evidence for additional loci, not yet reaching genome-wide significance

To evaluate evidence for loci not yet reaching genome-wide significance, we compared direction of effect in GWAS and Metabochip analyses of non-overlapping samples, outside the 157 genome-wide significant loci. Among independent variants (r² < 0.1) with P < 0.1 in the GWAS-only analysis, a significant excess were concordant in direction of effect for HDL (62.9% of 1,847 SNPs, P < 10⁻¹⁶), LDL (58.6% of 1,730 SNPs, P < 10⁻¹⁶), triglyceride levels (59.1% of 1,783 SNPs, P < 10⁻¹⁶), and total cholesterol (61.0% of 1,904 SNPs, P < 10⁻¹⁶), suggesting many additional loci to be discovered in future studies.

Discussion

Molecular understanding of the genes and pathways that modify blood lipid levels in humans will facilitate the design of new therapies for cardiovascular and metabolic disease. This understanding can be gained from studies of model organisms, in vitro experiments, bioinformatic analyses, and human genetic studies. Here, we demonstrate association between blood lipid levels and 62 new loci, bringing the total number of lipid-associated loci to 157 (see Tables 1A-D and Figure 1). All but one of the loci identified here include protein-coding genes within 100 kb of the SNP showing strongest association. While 38 of the 62 new loci include genes whose role in blood lipid levels is supported by literature review or analysis of curated pathway databases, the remainder include only genes whose role in blood lipid levels has not been documented.

In total, there are 240 genes within 100 kb of one of our 62 new lipid-associated loci – providing a daunting challenge for future functional studies. Prioritizing on the basis of literature review, pathway analysis, regulation of mRNA expression levels, and protein altering variants suggests that 70 genes in 44 of the 62 new loci might be the focus of the first round of functional studies (summarized in Supplementary Table 2). While we found significant overlap, different sources of prioritization sometimes disagreed. This result suggests that truly understanding causality will be very challenging. The Supplementary Note includes an interpreted digest of genes highlighted by our study. Clearly, a range of approaches will be needed to follow up these findings. To illustrate possibilities, consider U.S. Patent Application #20,090,036,394 disclosing that, in the mouse, knockout of Gpr146 modifies blood lipid levels. Here, we show that variants near the human homologue of this gene, GPR146, are associated with levels of total cholesterol – providing an added incentive for studies of GPR146 inhibitors in humans. GPR146 encodes a G-protein coupled receptor – an attractive pharmaceutical target – so it is tempting to speculate that, one day, pharmaceutical inhibition of GPR146 may modify cholesterol levels and reduce risk of heart disease.

Each locus typically includes many strongly associated (and potentially causal) variants. Our fine-mapping results illustrate how genetic analysis of large samples and individuals of diverse ancestry can help focus the search for causal variants. In our fine-mapping analysis of 65 lipid-associated loci, we were able to separate the strongest signal in a region from the prior GWAS signal in 12 instances. In three of these 12 instances, fine-mapping was enabled by analysis of a few thousand African or East Asian ancestry individuals, whereas in the remaining instances, fine-mapping was possible through examination of nearly 100,000 European ancestry samples. A more detailed fine-mapping exercise, including imputation of variants from emerging very large reference panels, may help refine the location of additional signals.

Lipid-associated loci were strongly associated with CAD, T2D, BMI, SBP, and DBP. In univariate analyses, we found that impact on LDL and triglycerides both predicted association with CAD, but impact on HDL did not. In a more detailed multivariate investigation, a companion manuscript shows that our data are consistent with the hypothesis that both LDL and triglycerides, but not HDL, are causally related to CAD risk. HDL, LDL, and triglyceride levels summarize aggregate levels of different lipid particles, each with potentially distinct consequences for CAD risk. We evaluated association of our loci with lipid subfractions in 2,900 individuals from the Framingham Heart Study (Supplementary Table 13, Supplementary Fig. 7) and with sphingolipids, which are components of lipid membranes in cells, in 4,034 individuals from five samples of European ancestry³⁹ (Supplementary Table 14). The results suggest HDL-associated variants can have a markedly different impact on these sub-phenotypes. For example, among HDL loci, variants near LIPC were strongly associated with plasmalogen levels (P < 10⁻⁴⁰), variants near ABCA1 were associated with sphingomyelin levels (P < 10⁻⁵), and variants near CETP – which show the strongest association with HDL cholesterol overall – were associated with neither of these. Detailed genetic dissection of these sub-phenotypes in larger samples could lead to functional groupings of HDL-associated variants that reconcile the results of genetic studies (which show no clear connection between HDL cholesterol-associated variants and CAD risk) and epidemiologic studies (which show clear association between plasma HDL levels and CAD risk).

In summary, we report the largest genetic association study of blood lipid levels yet conducted. The large number of loci identified, the many candidate genes they contain, and the diverse proteins they encode generate new leads and insights into lipid biology. It is our hope that the next round of genetic studies will build on these results, using new sequencing, genotyping, and imputation technologies to examine rare loss-of-function alleles and other variants of clear functional impact to accelerate the translation of these leads into mechanistic insights and improved treatments for CAD.

Online Methods

Samples studied

We collected summary statistics for Metabochip SNPs from 45 studies. Among these, 37 studies consisted primarily of individuals of European ancestry (see Supplementary Table 1 and Supplementary Note for details), including both population-based studies and case-control studies of CAD and T2D. Another 8 studies consisted primarily of individuals with non-European ancestry: two studies of South Asian descent, AIDHS/SDS (N=1,516) and PROMIS (N=3,385); two studies of East Asian descent, CLHNS (N=1,771) and TAI-CHI (N=7,044); and five studies of recent African ancestry, MRC/UVRI GPC (N=1,687) from Uganda, SEY (N=426) from the Caribbean, and FBPP (N=1,614, TG results unavailable), GXE (N=397), and SPT (N=838) from the United States (more details in Supplementary Table 1 and Supplementary Note).

Genotyping

We genotyped 196,710 genetic variants prioritized on the basis of prior GWAS for cardiovascular and metabolic phenotypes using the Illumina iSelect Metabochip⁸ genotyping array. To design the Metabochip, we used our previous GWAS of ~100,000 individuals⁴ to prioritize 5,023 SNPs for HDL cholesterol, 5,055 for LDL cholesterol, 5,056 for triglycerides, and 938 for total cholesterol. These independent SNPs represent most loci with P < .005 in our original GWAS for HDL cholesterol, LDL cholesterol and triglycerides and with P < .0005 for total cholesterol. An additional 28,923 SNPs were selected for fine-mapping of 65 previously identified lipid loci. The Metabochip also included 50,459 SNPs prioritized based on GWAS of non-lipid traits and 93,308 SNPs selected for fine-mapping of loci associated with non-lipid traits (5 of these loci were associated with blood lipids by the analyses described here).

Phenotypes

Blood lipid levels were typically measured after > 8 hours of fasting. Individuals known to be on lipid-lowering medication were excluded when possible. LDL cholesterol levels were directly measured in 10 studies (24% of total study individuals) and estimated using the Friedewald formula⁴⁰ in the remaining studies. Trait residuals within each study cohort were adjusted for age, age², and sex, and then quantile normalized. Explicit adjustments for population structure using principal component⁴¹ or mixed model approaches⁴² were carried out in 24 studies (35% of individuals); all studies were adjusted using genomic control prior to meta-analysis¹¹. In studies ascertained on diabetes or CVD status, cases and controls were analyzed separately (Supplementary Table 1). All meta-analyses were limited to a single ancestral group (e.g. European only).
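The final normalization step described above can be illustrated with a rank-based inverse normal transformation. This is a minimal sketch that assumes the covariate-adjusted residuals are already available. The function name, the rank offset of 0.5, and the average-rank handling of ties are illustrative assumptions; individual studies may have used different conventions.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Replace each value by the standard normal quantile of its rank.

    Ties receive the average of the ranks they span. The offset is an assumed
    convention: rank r out of n maps to the quantile (r - offset) / n.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - offset) / n) for r in ranks]


# Hypothetical residuals after adjustment for age, age squared, and sex
residuals = [0.3, -1.2, 2.5, 0.0, 0.9]
transformed = inverse_normal_transform(residuals)
```

The transformed values keep the ordering of the residuals but follow an approximately standard normal distribution, which limits the influence of outlying lipid measurements on the regression.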

Primary statistical analysis

Individual SNP association tests were performed using linear regression with the inverse normal transformed trait values as the dependent variable and the expected allele count for each individual as the independent variable. These analyses were performed using PLINK (26 samples, 53% of the total number of individuals), SNPTEST (4 samples, 20% of individuals), EMMAX (9 samples, 14% of individuals), Merlin (4 samples, 9% of individuals), GENABEL (1 sample, 3% of individuals), and MMAP (1 sample, 1% of individuals) (Supplementary Table 1).
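As a rough illustration of the per-SNP test described above, the sketch below fits a simple linear regression of the transformed trait on the expected allele count and returns the effect estimate, its standard error, and the resulting test statistic. It is not a substitute for the packages listed above: it ignores relatedness and other features those tools handle, and the function name and example values are hypothetical.

```python
import math


def snp_association(dosages, trait):
    """Ordinary least squares of trait on allele dosage: returns (beta, se, z)."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, trait))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    # Residual variance uses n - 2 degrees of freedom (intercept and slope).
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(dosages, trait))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se, beta / se


# Hypothetical dosages (expected allele counts) and transformed trait values
beta, se, z = snp_association([0, 1, 2, 0, 1, 2], [-1.0, 0.1, 0.9, -0.9, -0.1, 1.1])
```

The statistic `beta / se` is what a sample-size-weighted meta-analysis would combine across studies, after aligning the effect allele.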

Meta-analysis

Meta-analysis was performed using the Stouffer method^43,44, with weights proportional to the square root of the sample size for each sample. To correct for inflated test statistics due to potential population stratification, we first applied genomic control to each sample and then repeated the procedure with initial meta-analysis results. For GWAS samples, we used all available SNPs when estimating the median test statistic and inflation factor λ. For Metabochip samples, we used a subset of SNPs (N = 7,168) that had P-values > 0.50 for all lipid traits in the original GWAS, expecting that the majority of these would not be associated with lipids and would behave as null variants in the Metabochip samples. Signals were considered to be novel if they reached a P-value < 5×10⁻⁸ in the combined GWAS and Metabochip meta-analysis and were >1 Mb away from the nearest previously described lipid locus and other novel loci. We used only European samples for the discovery of novel genome-wide significant loci. The non-European samples were meta-analyzed and examined only for fine-mapping analyses.
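The genomic control and weighting steps described above can be sketched as follows. The inflation factor λ is taken as the median observed chi-square statistic divided by the median of a 1-d.f. chi-square distribution (about 0.455), and study Z-scores are combined with weights proportional to the square root of sample size. Function names are hypothetical, and leaving statistics unchanged when λ < 1 is an assumed convention, not something stated in the text.

```python
import math
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 d.f.


def genomic_control_lambda(z_scores):
    """Inflation factor: median of Z^2 over its expected value under the null."""
    return median(z * z for z in z_scores) / CHI2_1DF_MEDIAN


def apply_genomic_control(z_scores, lam):
    """Deflate Z-scores by sqrt(lambda); no change if lambda < 1 (assumption)."""
    scale = math.sqrt(max(lam, 1.0))
    return [z / scale for z in z_scores]


def stouffer(z_by_study, n_by_study):
    """Combine per-study Z-scores for one SNP with sqrt(N) weights."""
    weights = [math.sqrt(n) for n in n_by_study]
    numerator = sum(w * z for w, z in zip(weights, z_by_study))
    denominator = math.sqrt(sum(w * w for w in weights))
    z = numerator / denominator
    p = 2 * NormalDist().cdf(-abs(z))  # two-sided P-value
    return z, p
```

In the procedure described above, λ would be estimated per study (from all SNPs for GWAS samples, or from the 7,168 presumed-null SNPs for Metabochip samples), applied before combining, and then estimated and applied once more on the combined statistics.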

Quality control

To flag potentially erroneous analyses, we carried out a series of quality control steps. Average standard errors for association statistics from each study were plotted against study sample size to identify outlier studies. We inspected allele frequencies to ensure all analyses used the same strand assignment of alleles. We evaluated whether reported statistics and allelic effects were consistent with published findings for known loci. Genomic control values for study specific analyses were inspected, and all were <1.20. Finally, within each study, we excluded variants for which the minor allele was observed <7 times.

Proportion of trait variance explained

We estimated the increase in trait variance explained by novel loci in the Framingham cohort (N=7,132) using three models for each trait residual: 1) lead and secondary SNPs from the previously published loci⁴; 2) previously published lipid loci plus newly reported loci; and 3) newly reported loci. We regressed lipid residuals on these sets of SNPs using the lme kinship package in R.

Initial automated review of the published literature

An initial list of candidates within each locus was generated with Snipper (http://csg.sph.umich.edu/boehnke/snipper/) and then subjected to manual review. For each locus, Snipper first generates a list of nearby genes and then checks for the co-occurrence of the corresponding gene names and selected search terms (“cholesterol”, “lipids”, “HDL”, “LDL”, or “triglycerides”) in published literature and OMIM. We supplemented this approach with traditional literature searches using PubMed and Google.

Generating permuted sets of non-associated SNPs

To estimate the expected chance overlap between literature searches and our loci, we generated lists of permuted SNPs. To generate these lists, we first identified all non-associated lipid SNPs (P > 0.10 for any of the 4 lipid traits) and created bins based on 3 statistics: minor allele frequency, distance to the nearest gene, and number of SNPs with r² > 0.8. For each index SNP, we identified 500 non lipid-associated SNPs that fell within the same 3 bins and randomly selected one SNP for each permuted list.
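
The matching procedure above can be sketched as follows. This is an illustrative sketch only: it assumes each non-associated SNP has already been assigned a bin key of the form (MAF bin, gene-distance bin, LD-proxy bin), and the function names, the seed, and the SNP identifiers in the usage note are hypothetical.

```python
import random
from collections import defaultdict

def build_bins(null_snps):
    """Group non-associated SNPs by their (MAF, gene distance, LD proxy) bin key."""
    bins = defaultdict(list)
    for snp_id, key in null_snps.items():
        bins[key].append(snp_id)
    return bins

def permuted_lists(index_keys, bins, n_lists, pool_size=500, seed=0):
    """Build n_lists permuted SNP lists, one matched null SNP per index SNP."""
    rng = random.Random(seed)
    pools = []
    for key in index_keys:
        candidates = bins.get(key, [])
        if not candidates:
            raise ValueError(f"no matched null SNPs for bin {key}")
        # Pool of up to pool_size matched SNPs for this index SNP.
        pools.append(rng.sample(candidates, min(pool_size, len(candidates))))
    # Each permuted list draws one SNP from every index SNP's pool.
    return [[rng.choice(pool) for pool in pools] for _ in range(n_lists)]
```

For example, `permuted_lists([(0, 0, 0), (1, 2, 0)], build_bins({"rsA": (0, 0, 0), "rsB": (0, 0, 0), "rsC": (1, 2, 0)}), n_lists=5)` returns five lists whose second element is always `"rsC"`, because that is the only SNP sharing the second index SNP's bins.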

Pathway analyses

To investigate if lipid-associated variants overlapped previously annotated pathways, we used gene set enrichment analysis (GSEA), as implemented in MAGENTA¹⁷ using the meta-analysis of all studies, including GWAS and Metabochip SNPs. Briefly, MAGENTA first assigns SNPs to a given gene when within 110 kb upstream or 40 kb downstream of transcript boundaries. The most significant SNP P-value within this interval is then adjusted for confounders (gene size, marker density, LD) to create a gene association score. When the same SNP is assigned to multiple genes, only the gene with the lowest score is kept for downstream analyses. Subsequently, MAGENTA attaches pathway terms to each gene using several annotation resources, including GO, PANTHER, Ingenuity, and KEGG. Finally, the genes are ranked on their gene association score, and a modified GSEA test is used to test the null hypothesis that all gene score ranks above a given rank cutoff are randomly distributed with regard to a given pathway term (and compared to multiple randomly sampled gene sets of identical size). We evaluated enrichment by using a rank cutoff of 5% of the total number of genes. A minimum of 10,000 gene set permutations were performed, and up to 1,000,000 permutations for GSEA P-values below 1×10⁻⁴.
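
The first MAGENTA step, assigning SNPs to genes and taking the most significant P-value per gene, can be sketched roughly as below. This is not MAGENTA's code: the data layout, function names, and strand handling are assumptions made for illustration, and the subsequent adjustment for gene size, marker density, and LD is omitted.

```python
UPSTREAM_BP = 110_000    # window upstream of the transcript start
DOWNSTREAM_BP = 40_000   # window downstream of the transcript end

def gene_window(start, end, strand):
    """Return (low, high) coordinates of the assignment window for one gene."""
    if strand == "+":
        return start - UPSTREAM_BP, end + DOWNSTREAM_BP
    # On the minus strand, upstream lies at higher coordinates.
    return start - DOWNSTREAM_BP, end + UPSTREAM_BP

def best_p_per_gene(genes, snps):
    """genes: name -> (chrom, start, end, strand); snps: list of (chrom, pos, p).

    Returns the smallest SNP P-value inside each gene's window; genes with
    no SNP in their window are left out.
    """
    best = {}
    for name, (chrom, start, end, strand) in genes.items():
        low, high = gene_window(start, end, strand)
        p_values = [p for c, pos, p in snps if c == chrom and low <= pos <= high]
        if p_values:
            best[name] = min(p_values)
    return best
```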

We used the Disease Association Protein–Protein Link Evaluator package (DAPPLE; http://www.broadinstitute.org/mpg/dapple/dapple.php) to examine evidence for protein-protein interaction networks connecting genes across different lipid loci. This analysis included the 62 novel loci as well as the 95 previously known loci; we focus our discussion on pathways that included one or more genes from novel loci.

Cis-expression quantitative trait locus analysis

To determine whether lipid-associated SNPs might act as cis-regulators of nearby genes, we examined association with expression levels of 39,280 transcripts in 960 human liver samples, 741 human omental fat samples, and 609 human subcutaneous fat samples. Tissue samples were collected postmortem or during surgical resection from donors; tissue collection, DNA and RNA isolation, expression profiling, and genotyping were performed as described⁴⁵. MACH was used to obtain imputed genotypes for ~2.6 million SNPs in the HapMap release 22 for each of the samples. We examined the correlation between each of the 62 new index SNPs and all transcripts within 500 kb of the SNP position, performing association analyses as previously described⁴⁶.

Functional annotation of associated variants

We attempted to identify lipid-associated SNPs that fall in important regulatory domains. We initially created a list of all potentially causal variants by selecting index SNPs at loci identified in this study or in Teslovich et al⁴. We then selected any variant in strong linkage disequilibrium (r² > 0.8 from 1000 Genomes or HapMap) with each index SNP. We compared the position of the index SNPs and their proxies to previously described functional marks^23,24. To assess the expected overlap with functional marks, we created 100,000 permuted sets of non-associated SNPs (see above) and evaluated permuted SNP lists for overlap with functional domains. We estimated a P-value for each functional domain as the proportion of permuted sets with an equal or greater number of loci overlapping functional domains (for large P-values). For small P-values we used a normal approximation to the empirical overlap distribution to estimate P-values.
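
The two P-value estimates described above can be written as a short sketch. It assumes `permuted_counts` holds, for one functional domain, the number of overlapping loci in each permuted set; the function names and the use of the population standard deviation are assumptions, not details given in the text.

```python
from statistics import NormalDist, mean, pstdev

def empirical_p(observed, permuted_counts):
    """Proportion of permuted sets with overlap equal to or greater than observed."""
    hits = sum(1 for count in permuted_counts if count >= observed)
    return hits / len(permuted_counts)

def normal_approx_p(observed, permuted_counts):
    """Upper-tail P-value from a normal fit to the permuted overlap counts."""
    mu = mean(permuted_counts)
    sd = pstdev(permuted_counts)
    if sd == 0:
        raise ValueError("permuted counts have zero variance")
    # cdf of the negated z-score gives the upper tail without 1 - cdf rounding loss.
    return NormalDist().cdf((mu - observed) / sd)
```

The normal approximation matters when the observed overlap exceeds every permuted set: the empirical estimate is then 0, whereas the fitted tail still yields a small non-zero P-value.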

Association with lipid subfractions

Lipoprotein fractions for Women’s Genome Health Study (WGHS) samples (N = 23,170) were measured using the LipoProtein-II assay (Liposcience Inc., Raleigh, NC), and those for Framingham Heart Study Offspring samples (N = 2,900) were measured with the LipoProtein-I assay (Liposcience Inc., Raleigh, NC)⁴⁷. Additional information on subfraction measurements can be found in Supplementary Fig. 7. Log transformations were used for non-normalized traits. All models were adjusted for age, sex, and PCs. The genetic association analysis of WGHS used SNP genotypes imputed from the HapMap r22 CEU reference panel using MACH. Of the 23,170 WGHS participants, 16,730 (72.2%) had fasted for 8 hours prior to blood draw.

Supplementary Material

NIHMS524703-supplement-1.pdf

FIGURE 1

Overlap between loci associated with different lipid traits

This Venn Diagram illustrates the number of loci that show association with multiple lipid traits. The number of loci primarily associated with only one trait is listed in parentheses after the trait name and the locus name is listed below in italics. Loci that show association with two or more traits are shown in the appropriate section.

TABLE 1A

Novel Loci Primarily Associated with HDL Cholesterol Obtained from Joint GWAS and Metabochip Meta-analysis

Locus	MarkerName	Chr	hg19 Position (Mb)	Associated trait(s)	MAF	Minor/major Allele	Effect of A1	Joint N (in 1000s)	Joint P-value
PIGV-NR0B2	rs12748152	1	27.14	HDL, LDL, TG	.09	T/C	−.051/.050/.037	187/173/178	1×10⁻¹⁵/3×10⁻¹²/1×10⁻⁹
HDGF-PMVK	rs12145743	1	156.70	HDL	.34	G/T	.020	181	2×10⁻⁸
ANGPTL1	rs4650994	1	178.52	HDL	.49	G/A	.021	187	7×10⁻⁹
CPS1	rs1047891	2	211.54	HDL	.33	A/C	−.027	182	9×10⁻¹⁰
ATG7	rs2606736	3	11.40	HDL	.39	C/T	.025	129	5×10⁻⁸
SETD2	rs2290547	3	47.06	HDL	.20	A/G	−.030	187	4×10⁻⁹
RBM5	rs2013208	3	50.13	HDL	.50	T/C	.025	170	9×10⁻¹²
STAB1	rs13326165	3	52.53	HDL	.21	A/G	.029	187	9×10⁻¹¹
GSK3B	rs6805251	3	119.56	HDL	.39	T/C	.020	186	1×10⁻⁸
C4orf52	rs10019888	4	26.06	HDL	.18	G/A	−.027	187	5×10⁻⁸
FAM13A	rs3822072	4	89.74	HDL	.46	A/G	−.025	187	4×10⁻¹²
ADH5	rs2602836	4	100.01	HDL	.44	A/G	.019	187	5×10⁻⁸
RSPO3	rs1936800	6	127.44	HDL, TG^a	.49	C/T	.020/−.020	187/168	3×10⁻¹⁰/3×10⁻⁸
DAGLB	rs702485	7	6.45	HDL	.45	G/A	.024	187	7×10⁻¹²
SNX13	rs4142995	7	17.92	HDL	.38	T/G	−.026	165	9×10⁻¹²
IKZF1	rs4917014	7	50.31	HDL	.32	G/T	.022	187	1×10⁻⁸
TMEM176A	rs17173637	7	150.53	HDL	.12	C/T	−.036	184	2×10⁻⁸
MARCH8-ALOX5	rs970548	10	46.01	HDL, TC	.26	C/A	.026/−.026	187/187	2×10⁻¹⁰/8×10⁻⁹
OR4C46	rs11246602	11	51.51	HDL	.15	C/T	.034	176	2×10⁻¹⁰
KAT5	rs12801636	11	65.39	HDL	.23	A/G	.024	187	3×10⁻⁸
MOGAT2-DGAT2	rs499974	11	75.46	HDL	.19	A/C	−.026	187	1×10⁻⁸
ZBTB42-AKT1	rs4983559	14	105.28	HDL	.40	G/A	.020	184	1×10⁻⁸
FTO	rs1121980	16	53.81	HDL, TG	.43	A/G	−.020/−.021	186/155	7×10⁻⁹/3×10⁻⁸
HAS1	rs17695224	19	52.32	HDL	.26	A/G	−.029	185	2×10⁻¹³

Chr, chromosome; MAF, minor allele frequency; A1, minor allele; A2, major allele. Effect sizes are given with respect to the minor allele (A1) in SD units. For loci associated with two or more traits at genome-wide significance, the trait corresponding to the strongest P-value is listed first. At one locus, the secondary trait was most strongly associated with a different SNP:

^a rs719726 (within 1 Mb of rs1936800, r² = 0.74).

TABLE 1B

Novel Loci Primarily Associated with LDL Cholesterol Obtained from Joint GWAS and Metabochip Meta-analysis

Locus	MarkerName	Chr	hg19 Position (Mb)	Associated trait(s)	MAF	Minor/major Allele	Effect of A1	Joint N (in 1000s)	Joint P-value
ANXA9-CERS2	rs267733	1	150.96	LDL	.16	G/A	−.033	165	5×10⁻⁹
EHBP1	rs2710642	2	63.15	LDL	.35	G/A	−.024	173	6×10⁻⁹
INSIG2	rs10490626	2	118.84	LDL, TC^b	.08	A/G	−.051/.042	173/184	2×10⁻¹²/6×10⁻⁹
LOC84931	rs2030746	2	121.31	LDL, TC	.40	T/C	.021/.020	173/187	9×10⁻⁹/4×10⁻⁸
FN1	rs1250229	2	216.30	LDL	.27	T/C	−.024	173	3×10⁻⁸
CMTM6	rs7640978	3	32.53	LDL, TC	.09	T/C	−.039/−.038	172/186	1×10⁻⁸
ACAD11	rs17404153	3	132.16	LDL, HDL^c	.14	T/G	−.034/.028	172/187	2×10⁻⁹/5×10⁻⁹
CSNK1G3	rs4530754	5	122.86	LDL, TC	.46	G/A	−.028/−.023	173/187	4×10⁻¹²/2×10⁻⁹
MIR148A	rs4722551	7	25.99	LDL, TG^d, TC	.20	C/T	.039/.029/.023	173/187/178	4×10⁻¹⁴/9×10⁻¹¹/7.0×10⁻⁹
SOX17	rs10102164	8	55.42	LDL, TC	.21	A/G	.032/.030	173/187	4×10⁻¹¹/5×10⁻¹¹
BRCA2	rs4942486	13	32.95	LDL	.48	T/C	.024	172	2×10⁻¹¹
APOH-PRXCA	rs1801689	17	64.21	LDL	.04	C/A	.103	111	1×10⁻¹¹
SPTLC3	rs364585	20	12.96	LDL	.38	A/G	−.025	172	4×10⁻¹⁰
SNX5	rs2328223	20	17.85	LDL	.21	C/A	.03	171	6×10⁻⁹
MTMR3	rs5763662	22	30.38	LDL	.04	T/C	.077	163	1×10⁻⁸

Chr, chromosome; MAF, minor allele frequency; A1, minor allele; A2, major allele. Effect sizes are given with respect to the minor allele (A1) in SD units. For loci associated with two or more traits at genome-wide significance, the trait corresponding to the strongest P-value is listed first. At three loci, secondary traits were most strongly associated with different SNPs:

^b rs17526895 (within 1 Mb of rs10490626, r² = 0.98);

^c rs13076253 (within 1 Mb of rs17404153, r² = 0.00);

^d rs4719841 (within 1 Mb of rs4722551, r² = 0.10).

TABLE 1C

Novel Loci Primarily Associated with Total Cholesterol Obtained from Joint GWAS and Metabochip Meta-analysis

Locus	MarkerName	Chr	hg19 Position (Mb)	Associated trait(s)	MAF	Minor/major Allele	Effect of A1	Joint N (in 1000s)	Joint P-value
ASAP3	rs1077514	1	23.77	TC	.15	C/T	−0.03	184	6×10⁻⁹
ABCB11	rs2287623	2	169.83	TC	.41	G/A	0.027	184	4×10⁻¹²
FAM117B	rs11694172	2	203.53	TC	.25	G/A	0.028	187	2×10⁻⁹
UGT1A1	rs11563251	2	234.68	TC, LDL	.12	T/C	0.037/0.034	187/173	1×10⁻⁹/5×10⁻⁸
PXK	rs13315871	3	58.38	TC	.10	A/G	−0.036	187	4×10⁻⁸
KCNK17	rs2758886	6	39.25	TC	.30	A/G	0.023	187	3×10⁻⁸
HBS1L	rs9376090	6	135.41	TC	.28	T/C	−0.025	187	3×10⁻⁹
GPR146	rs1997243	7	1.08	TC	.16	G/A	0.033	183	3×10⁻¹⁰
VLDLR	rs3780181	9	2.64	TC, LDL	.08	G/A	−0.044/−0.044	186/172	7×10⁻¹⁰/2×10⁻⁹
VIM-CUBN	rs10904908	10	17.26	TC	.43	G/A	0.025	187	3×10⁻¹¹
PHLDB1	rs11603023	11	118.49	TC	.42	T/C	0.022	187	1×10⁻⁸
PHC1-A2ML1	rs4883201	12	9.08	TC	.12	G/A	−0.035	187	2×10⁻⁹
DLG4	rs314253	17	7.09	TC, LDL	.37	C/T	−0.023/−0.024	184/170	3×10⁻¹⁰/3×10⁻¹⁰
TOM1	rs138777	22	35.71	TC	.36	A/G	0.021	185	5×10⁻⁸
PPARA	rs4253772	22	46.63	TC, LDL^e	.11	T/C	0.032/−0.031	185/171	1×10⁻⁸/3×10⁻⁸

^e rs4253776 (within 1 Mb of rs4253772, r² = 0.95).

TABLE 1D

Novel Loci Primarily Associated with Triglycerides Obtained from Joint GWAS and Metabochip Meta-analysis

Locus	MarkerName	Chr	hg19 Position (Mb)	Associated trait(s)	MAF	Minor/major Allele	Effect of A1	Joint N (in 1000s)	Joint P-value
LRPAP1	rs6831256	4	3.47	TG, TC^f, LDL^f	.42	G/A	0.026/−0.022/−0.025	177/173/187	2×10⁻¹²/1×10⁻¹⁰/2×10⁻⁸
VEGFA	rs998584	6	43.76	TG, HDL	.49	A/C	0.029/−0.026	175/184	3×10⁻¹⁵/2×10⁻¹¹
MET	rs38855	7	116.36	TG	.47	G/A	−0.019	178	2×10⁻⁸
AKR1C4	rs1832007	10	5.25	TG	.18	G/A	−0.033	178	2×10⁻¹²
PDXDC1	rs3198697	16	15.13	TG	.43	T/C	−0.020	176	2×10⁻⁸
MPP3	rs8077889	17	41.88	TG	.22	C/A	0.025	176	1×10⁻⁸
INSR	rs7248104	19	7.22	TG	.42	A/G	−0.022	176	5×10⁻¹⁰
PEPD	rs731839	19	33.90	TG, HDL	.35	G/A	0.022/−0.022	176/185	3×10⁻⁹/3×10⁻⁹

Chr, chromosome; MAF, minor allele frequency; A1, minor allele; A2, major allele. Effect sizes are given with respect to the minor allele (A1) in SD units. For loci associated with two or more traits at genome-wide significance, the trait corresponding to the strongest P-value is listed first. At one locus, secondary traits were most strongly associated with a different SNP:

^f rs6818397 (within 1 Mb of rs6831256, r² = 0.18).

Footnotes

URLs

Summary results for our studies are available. We hope that they will facilitate continued research into the genetics of blood lipid levels and, eventually, help identify improved treatments for CAD. To browse the full result set, go to http://www.sph.umich.edu/csg/abecasis/lipids2013/

Disclosures

CHS

Bruce Psaty serves on the DSMB of a clinical trial funded by the manufacturer (Zoll), and he serves on the Steering Committee of the Yale Open-Data Project funded by Medtronic.

CoLaus

Peter Vollenweider received an unrestricted grant from GSK to build the CoLaus study.

deCODE

Authors affiliated with deCODE Genetics/Amgen, a biotechnology company, are employees of deCODE Genetics/Amgen.

GLACIER

Inês Barroso and spouse own stock in GlaxoSmithKline and Incyte Ltd.

AUTHOR CONTRIBUTIONS

Writing and Analysis Group

G.R.A., M.B., L.A.C., P.D., P.W.F., S.K., K.L.M., E.I., G.M.P., S.S.R., S.R., M.S.S., E.M.S., S.S., C.J.W. (Lead). E.M.S. and S.S. performed meta-analysis, and E.M.S., S.S., G.M.P., M.B., J.C., S.G., A.G., and S.K. performed bioinformatics analyses. E.M.S. and S.S. prepared the tables, figures and supplementary material. C.J.W. led the analysis and bioinformatics efforts. E.I. and K.M. led the biological interpretation of results. C.J.W. and G.R.A. wrote the manuscript. All analysis and writing group authors extensively discussed the analysis, results, interpretation and presentation of results.

All authors contributed to the research and reviewed the manuscript.

Design, management and coordination of contributing cohorts

(ADVANCE) T.L.A.; (AGES Reykjavik study) T.B.H., V.G.; (AIDHS/SDS) D.K.S.; (AMC-PAS) P.D., G.K.H.; (Amish GLGC) A.R.S.; (ARIC) E.B.; (B58C-WTCCC & B58C-T1DGC) D.P.S.; (B58C-Metabochip) C.M.L., C.P., M.I.M.; (BLSA) L.F.; (BRIGHT) P.B.M.; (CHS) B.M.P., J.I.R.; (CLHNS) A.B.F., K.L.M., L.S.A.; (CoLaus) P.V.; (deCODE) K.S., U.T.; (DIAGEN) P.E.S., S.R.B.; (DILGOM) S.R.; (DPS) M.U.; (DR’s EXTRA) R.R.; (EAS) J.F.P.; (EGCUT (Estonian Genome Center of University of Tartu)) A.M.; (ELY) N.W.; (EPIC) N.W., K.K.; (EPIC_N_OBSET GWAS) E.H.Young; (ERF) C.M.V.; (ESS (Erasmus Stroke Study)) P.J.K.; (Family Heart Study FHS) I.B.B.; (FBPP) A.C., R.S.C., S.C.H.; (FENLAND) R.L., N.W.; (FIN-D2D 2007) A.K., L.M.; (FINCAVAS) M.K.; (Framingham) L.A.C., S.K., J.M.O.; (FRISCII) A.S., L.W.; (FUSION GWAS) K.L.M., M.B.; (FUSION stage 2) F.S.C., J.T., J.S.; (GenomEUTwin) J.B.W., N.G.M., K.O.K., V.S., J.K., A.J., D.I.B., N.P., T.D.S.; (GLACIER) P.W.F.; (Go-DARTS) A.D.M., C.N.P.; (GxE/Spanish Town) B.O.T., C.A.M., F.B., J.N.H., R.S.C.; (HUNT2) K.H.; (IMPROVE) U.D., A.H., E.T., S.E.H.; (InCHIANTI) S.B.; (KORAF4) C.G.; (LifeLines) B.H.W.; (LOLIPOP) J.S.K., J.C.C.; (LURIC) B.O.B., W.M.; (MDC) L.C.G., S.K.; (METSIM) J.K., M.L.; (MICROS) P.P.P.; (MORGAM) D.A., J.F.; (MRC/UVRI GPC GWAS) P.K., G.A., J.S., E.H.Y.; (MRC National Survey of Health & Development) D.K.; (NFBC1986) M-R.J.; (NSPHS) U.G.; (ORCADES) H.C.; (PARC) Y.I.C., R.M.K., J.I.R.; (PIVUS) E.I., L.L.; (PROMIS) J.D., P.D., D.S.; (Rotterdam Study) A.H., A.G.U.; (SardiNIA) G.R.A.; (SCARFSHEEP) A.H., U.D.; (SEYCHELLES) M.B., M.B., P.B.; (SUVIMAX) P.M.; (Swedish Twin Reg.) E.I., N.L.P.; (TAICHI) T.L.A., Y.I.C., C.A.H., T.Q., J.I.R., W.H.S.; (THISEAS) G.D., P.D.; (Tromsø) I.N.; (TWINGENE) U.D., E.I.; (ULSAM) E.I.; (Whitehall II) A.H., M.K.

Genotyping of contributing cohorts

(ADVANCE) D.A.; (AIDHS/SDS) L.F.B., M.L.G.; (AMC-PAS) P.D., G.K.H.; (B58C-WTCCC & B58C-T1DGC) W.L.M.; (B58C-Metabochip) M.I.M.; (BLSA) D.H.; (BRIGHT) P.B.M.; (CHS) J.I.R.; (DIAGEN) N.N., G.M.; (DILGOM) A.P.; (DR’s EXTRA) T.A.L.; (EAS) J.F.W.; (EGCUT (Estonian Genome Center of University of Tartu)) T.E.; (EPIC) P.D.; (EPIC_N_SUBCOH GWAS) I.B.; (ERF) C.M.V.; (ESS (Erasmus Stroke Study)) C.M.V.; (FBPP) A.C., G.B.E.; (FENLAND) M.S.S.; (FIN-D2D 2007) A.J.S.; (FINCAVAS) T.L.; (Framingham) J.M.O.; (FUSION stage 2) L.L.B.; (GLACIER) I.B.; (Go-DARTS) C.G., C.N.P., M.I.M.; (IMPROVE) A.H.; (KORAF3) H.G., T.I.; (KORAF4) N.K.; (LifeLines) C.W.; (LOLIPOP) J.S.K., J.C.C.; (LURIC) M.E.K.; (MDC) B.F.V., R.D.; (MICROS) A.A.H.; (MORGAM) L.T., P.B.; (MRC/UVRI GPC GWAS) M.S.S.; (MRC National Survey of Health & Development) A.W., D.K., K.K.O.; (NFBC1986) A-L.H., M.J, M.M., P.E., S.V.; (NSPHS and FRISCII) Å.J.; (ORCADES) H.C.; (PARC) M.O.G., M.R.J., J.I.R.; (PIVUS) E.I., L.L.; (PROMIS) P.D., K.S.; (Rotterdam Study) A.G.U., F.R.; (SardiNIA) R.N.; (SCARFSHEEP) B.G., R.J.S.; (SEYCHELLES) F.M., G.B.E.; (Swedish Twin Reg.) E.I., N.L.P.; (TAICHI) D.A., T.L.A., E.K., T.Q., L.L.W.; (THISEAS) P.D.; (TWINGENE) A.H., E.I.; (ULSAM) E.I.; (WGHS) D.I.C., P.M.R.; (Whitehall II) A.H., C.L., M.K., M.K.

Phenotype definition of contributing cohorts

(ADVANCE) C.I.; (AGES Reykjavik study) T.B.H., V.G.; (AIDHS/SDS) L.F.B.; (AMC-PAS) J.J.K.; (Amish GLGC) A.R.S., B.D.M.; (B58C-WTCCC & B58C-T1DGC) D.P.S.; (B58C-Metabochip) C.P., E.H.; (BRIGHT) P.B.M.; (CHS) B.M.P.; (CoLaus) P.V.; (deCODE) G.I.E., H.H., I.O.; (DIAGEN) G.M.; (DILGOM) K.S.; (DPS) J.L.; (DR’s EXTRA) P.K.; (EAS) J.L.B.; (EGCUT (Estonian Genome Center of University of Tartu)) A.M.; (EGCUT (Estonian Genome Center of University of Tartu)) K.F.; (ERF and Rotterdam Study) A.H.; (ERF) C.M.V.; (ESS (Erasmus Stroke Study)) E.G.V., H.M.D., P.J.K.; (FBPP) A.C., R.S.C., S.C.H.; (FINCAVAS) T.V.N.; (Framingham) S.K., J.M.O.; (GenomEUTwin: MZGWA) J.B.W.; (GenomEUTwin-FINRISK) V.S.; (GenomEUTwin-FINTWIN) J.K., K.H.; (GenomEUTwin-GENMETS) A.J.; (GenomEUTwin-NLDTWIN) G.W.; (Go-DARTS) A.S.D., A.D.M., C.N.P., L.A.D.; (GxE/Spanish Town) C.A.M., F.B.; (IMPROVE) U.D., A.H., E.T.; (KORAF3) C.M.; (KORAF4) A.D.; (LifeLines) L.J.; (LOLIPOP) J.S.K., J.C.C.; (LURIC) H.S.; (MDC) L.C.G.; (METSIM) A.S.; (MORGAM) G.C.; (MRC/UVRI GPC GWAS) R.N.N.; (MRC National Survey of Health & Development) D.K.; (NFBC1986) A.R., A-L.H., A.P., M-R.J.; (NSPHS and FRISCII) Å.J.; (NSPHS) U.G.; (ORCADES) S.H.W.; (PARC) Y.I.C., R.M.K.; (PIVUS) E.I., L.L.; (PROMIS) D.F.F.; (Rotterdam Study) A.H.; (SCARFSHEEP) U.D., B.G.; (SEYCHELLES) M.B., M.B., P.B.; (Swedish Twin Reg.) E.I., N.L.P.; (TAICHI) H.C., C.A.H., Y.H., E.K., S.L., W.H.S.; (THISEAS) G.D., M.D.; (Tromsø) T.W.; (TWINGENE) U.D., E.I.; (ULSAM) E.I.; (WGHS) P.M.R.; (Whitehall II) M.K.

Primary analysis from contributing cohorts

(ADVANCE) L.L.W.; (AIDHS/SDS) R.S.; (AMC-PAS) S.K.; (Amish GLGC) J.R.O., M.E.M.; (ARIC) K.A.V.; (B58C-Metabochip) C.M.L., E.H., T.F.; (B58C-WTCCC & B58C-T1DGC) D.P.S.; (BLSA) T.T.; (BRIGHT) T.J.; (CLHNS) Y.W.; (CoLaus) J.S.B.; (deCODE) G.T.; (DIAGEN) A.U.J.; (DILGOM) M.P.; (EAS) R.M.F.; (DPS) A.U.J.; (DR’s EXTRA) A.U.J.; (EGCUT (Estonian Genome Center of University of Tartu)) E.M., K.F., T.E.; (ELY) D.G.; (EPIC) K.S., D.G.; (EPIC_N_OBSET GWAS) E.Y., C.L.; (EPIC_N_SUBCOH GWAS) N.W.; (ERF) A.I.; (ESS (Erasmus Stroke Study)) C.M.V., E.G.V.; (EUROSPAN) A.D.; (Family Heart Study FHS) I.B.B., M.F.F.; (FBPP) A.C., G.B.E.; (FENLAND) T.P., C.P.; (FENLAND GWAS) J.H.Z., J.L.; (FIN-D2D 2007) A.U.J.; (FINCAVAS) L.L.; (Framingham) L.A.C., G.M.P.; (FRISCII and NSPHS) Å.J.; (FUSION stage 2) T.M.T.; (GenomEUTwin-FINRISK) J.K.; (GenomEUTwin-FINTWIN) K.H.; (GenomEUTwin-GENMETS) I.S.; (GenomEUTwin-SWETWIN) P.K.M.; (GenomEUTwin-UK-TWINS) M.M.; (GLACIER) D.S.; (GLACIER) P.W.F.; (Go-DARTS) C.N.P., L.A.D.; (GxE/Spanish Town) C.D.P.; (HUNT) A.U.J.; (IMPROVE) R.J.S.; (InCHIANTI) T.T.; (KORAF3) M.M.; (KORAF4) A.P.; (LifeLines) I.M.N.; (LOLIPOP) W.Z.; (LURIC) M.E.K.; (MDC) B.F.V.; (MDC) P.F., R.D.; (METSIM) A.U.J.; (MRC/UVRI GPC GWAS) R.N.N.; (MRC National Survey of Health & Development) A.W., J.L.; (NFBC1986) M.K., I.S., S.K.S.; (NSPHS and FRISCII) Å.J.; (PARC) X.L.; (PIVUS) C.S., E.I.; (PROMIS) J.D., D.F.F., K.S.; (Rotterdam Study) A.I.; (SardiNIA) C.S., J.L.B., S.S.; (SCARFSHEEP) R.J.S.; (SEYCHELLES) G.B.E., M.B.; (SUVIMAX) T.J.; (Swedish Twin Reg.) C.S., E.I.; (TAICHI) D.A., T.L.A., H.C., M.G., C.A.H., T.Q., L.L.W.; (THISEAS) S.K.; (Tromsø) A.U.J.; (TWINGENE) A.G., E.I.; (ULSAM) C.S., E.I., S.G.; (WGHS) D.I.C.; (Whitehall II) S.S.

ACKNOWLEDGEMENTS

We especially thank the >196,000 volunteers who participated in our study. Detailed acknowledgement of funding sources is provided in the supplementary online material.

References

1. Kannel WB, Dawber TR, Kagan A, Revotskie N, Stokes J 3rd. Factors of risk in the development of coronary heart disease--six year follow-up experience. The Framingham Study. Annals of Internal Medicine. 1961; 55: 33-50.
2. Castelli WP. Cholesterol and lipids in the risk of coronary artery disease--the Framingham Heart Study. Canadian Journal of Cardiology. 1988; 4 Suppl A: 5A-10A.
3. Lloyd-Jones D. Heart disease and stroke statistics--2010 update: a report from the American Heart Association. Circulation. 2010; 121: e46-e215.
4. Teslovich TM. Biological, clinical and population relevance of 95 loci for blood lipids. Nature. 2010; 466: 707-13.
5. Barter PJ, Rye KA. Cholesteryl ester transfer protein (CETP) inhibition as a strategy to reduce cardiovascular risk. Journal of Lipid Research. 2012.
6. Rahalkar AR, Hegele RA. Monogenic pediatric dyslipidemias: classification, genetics and clinical spectrum. Molecular Genetics and Metabolism. 2008; 93: 282-94.
7. Musunuru K. From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature. 2010; 466: 714-9.
8. Voight BF. The Metabochip, a Custom Genotyping Array for Genetic Studies of Metabolic, Cardiovascular, and Anthropometric Traits. PLoS Genetics. 2012 (in press).
9. The 1000 Genomes Project. A map of human genome variation from population scale sequencing. Nature. 2010; 467.
10. Sanna S. Fine mapping of five loci associated with low-density lipoprotein cholesterol detects variants that double the explained heritability. PLoS Genet. 2011; 7: e1002198.
11. Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999; 55: 997-1004.
12. Asselbergs FW. Large-scale gene-centric meta-analysis across 32 studies identifies multiple lipid loci. Am J Hum Genet. 2012; 91: 823-38.
13. Welch CL. Genetic regulation of cholesterol homeostasis: chromosomal organization of candidate genes. Journal of Lipid Research. 1996; 37: 1406-21.
14. Sarria AJ, Panini SR, Evans RM. A functional role for vimentin intermediate filaments in the metabolism of lipoprotein-derived cholesterol in human SW-13 cells. Journal of Biological Chemistry. 1992; 267: 19455-63.
15. Hagberg CE. Vascular endothelial growth factor B controls endothelial fatty acid uptake. Nature. 2010; 464: 917-21.
16. Ashburner M. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics. 2000; 25: 25-9.
17. Segre AV, Groop L, Mootha VK, Daly MJ, Altshuler D. Common inherited variation in mitochondrial genes is not enriched for associations with type 2 diabetes or related glycemic traits. PLoS Genet. 2010; 6.
18. Fitzgerald ML, Moore KJ, Freeman MW. Nuclear hormone receptors and cholesterol trafficking: the orphans find a new home. J Mol Med (Berl). 2002; 80: 271-81.
19. Rossin EJ. Proteins encoded in genomic regions associated with immune-mediated disease physically interact and suggest underlying biology. PLoS Genet. 2011; 7: e1001273.
20. Plyte SE, Hughes K, Nikolakaki E, Pulverer BJ, Woodgett JR. Glycogen synthase kinase-3: functions in oncogenesis and development. Biochimica et Biophysica Acta. 1992; 1114: 147-62.
21. Toker A, Cantley LC. Signalling through the lipid products of phosphoinositide-3-OH kinase. Nature. 1997; 387: 673-6.
22. Kaprio J, Ferrell RE, Kottke BA, Kamboh MI, Sing CF. Effects of polymorphisms in apolipoproteins E, A-IV, and H on quantitative traits related to risk for cardiovascular disease. Arteriosclerosis and Thrombosis. 1991; 11: 1330-48.
23. Ernst J. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011; 473: 43-9.
24. The ENCODE Project Consortium. A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 2011; 9: e1001046.
25. Buyske S. Evaluation of the metabochip genotyping array in African Americans and implications for fine mapping of GWAS-identified loci: the PAGE study. PLoS ONE. 2012; 7: e35651.
26. Palmen J. The functional interaction on in vitro gene expression of APOA5 SNPs, defining haplotype APOA52, and their paradoxical association with plasma triglyceride but not plasma apoAV levels. Biochimica et Biophysica Acta. 2008; 1782: 447-52.
27. Schunkert H. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease. Nature Genetics. 2011; 43: 333-8.
28. The Coronary Artery Disease (C4D) Consortium. A genome-wide association study in Europeans and South Asians identifies five new loci for coronary artery disease. Nature Genetics. 2011; 43: 339-44.
29. Voight BF. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nature Genetics. 2010; 42: 579-89.
30. Speliotes EK. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nature Genetics. 2010; 42: 937-48.
31. Heid IM. Meta-analysis identifies 13 new loci associated with waist-hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nature Genetics. 2010; 42: 949-60.
32. Ehret GB. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature. 2011; 478: 103-9.
33. Dupuis J. New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nature Genetics. 2010; 42: 105-16.
34. Freathy RM. Common variation in the FTO gene alters diabetes-related metabolic traits to the extent expected given its effect on BMI. Diabetes. 2008; 57: 1419-26.
35. Clarke R. Cholesterol fractions and apolipoproteins as risk factors for heart disease mortality in older men. Archives of Internal Medicine. 2007; 167: 1373-8.
36. Willer CJ. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nature Genetics. 2008; 40: 161-9.
37. Voight BF. Plasma HDL cholesterol and risk of myocardial infarction: a mendelian randomisation study. Lancet. 2012.
38. Frikke-Schmidt R. Association of loss-of-function mutations in the ABCA1 gene with high-density lipoprotein cholesterol levels and risk of ischemic heart disease. JAMA. 2008; 299: 2524-32.
39. Demirkan A. Genome-wide association study identifies novel loci associated with circulating phospho- and sphingolipid concentrations. PLoS Genet. 2012; 8: e1002490.
40. Friedewald WT, Levy RI, Fredrickson DS. Estimation of the concentration of low-density lipoprotein cholesterol in plasma, without use of the preparative ultracentrifuge. Clinical Chemistry. 1972; 18: 499-502.
41. Price AL. Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics. 2006; 38: 904-9.
42. Kang HM. Variance component model to account for sample structure in genome-wide association studies. Nature Genetics. 2010; 42: 348-54.
43. Stouffer SA, Suchman EA, DeVinney LC, Star SA, Williams RMJ. Adjustment During Army Life. 1949. Princeton University Press, Princeton, NJ.
44. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010; 26: 2190-1.
45. Keating BJ. Concept, design and implementation of a cardiovascular gene-centric 50 k SNP array for large-scale genomic association studies. PLoS ONE. 2008; 3: e3583.
46. Schadt EE. Mapping the genetic architecture of gene expression in human liver. PLoS Biol. 2008; 6: e107.
47. Chasman DI. Forty-three loci associated with plasma lipoprotein size, concentration, and cholesterol content in genome-wide analysis. PLoS Genet. 2009; 5: e1000730.