MicroRNA Related Polymorphisms and Breast Cancer Risk
Abstract
Genetic variations, such as single nucleotide polymorphisms (SNPs) in microRNAs (miRNAs) or in miRNA binding sites, may affect miRNA-dependent regulation of gene expression, which has been implicated in various cancers, including breast cancer, and may alter individual susceptibility to cancer. We investigated associations between miRNA-related SNPs and breast cancer risk. First, we evaluated 2,196 SNPs in a case-control study combining nine genome-wide association studies (GWAS). Second, we further investigated 42 SNPs with suggestive evidence for association using 41,785 cases and 41,880 controls from 41 studies included in the Breast Cancer Association Consortium (BCAC). Combining the GWAS and BCAC data within a meta-analysis, we estimated main effects on breast cancer risk as well as risks for estrogen receptor (ER) and age-defined subgroups. Five miRNA binding site SNPs were significantly associated with breast cancer risk: rs1045494 (odds ratio (OR) 0.92; 95% confidence interval (CI): 0.88–0.96), rs1052532 (OR 0.97; 95% CI: 0.95–0.99), rs10719 (OR 0.97; 95% CI: 0.94–0.99), rs4687554 (OR 0.97; 95% CI: 0.95–0.99), and rs3134615 (OR 1.03; 95% CI: 1.01–1.05), located in the 3′ UTR of CASP8, HDDC3, DROSHA, MUSTN1, and MYCL1, respectively. DROSHA belongs to the miRNA machinery genes and has a central role in initial miRNA processing. The remaining genes are involved in different molecular functions, including apoptosis and gene expression regulation. Further studies are warranted to elucidate whether the miRNA binding site SNPs are the causative variants for the observed risk effects.
Introduction
Breast cancer is the most common cancer in women and a leading cause of cancer mortality [1]. Inherited genetic variation has been associated with the initiation, development and progression of breast cancer. Studies on twins have suggested that hereditary predisposing factors are involved in up to one third of all breast cancers [2]. Many genetic loci have been associated with breast cancer risk and collectively explain approximately 35% of the familial risk [3], [4]. The largest genetic association study of breast cancer to date identified 41 novel low-penetrance susceptibility loci [4] by selecting nearly 30,000 SNPs from a meta-analysis of nine genome-wide association (GWA) studies and genotyping them using 41,785 cases and 41,880 controls of European ancestry from studies in the Breast Cancer Association Consortium (BCAC). These 41 susceptibility loci probably represent the tip of the iceberg, and additional SNPs from the combined GWAS might explain a similar fraction of familial risk to that attributed to the already identified loci [4].
Mature miRNAs are 20–23-nucleotide, single-stranded RNA molecules that play a crucial role in gene expression regulation for many cellular processes, including differentiation potential and development pattern. MiRNAs undergo a stepwise maturation process involving an array of miRNA machinery components. Drosha and DGCR8 mediate the cleavage of long primary miRNA transcripts (pri-miRNAs) into shorter pre-miRNAs in the nucleus [5], [6]. The pre-miRNAs are then transported to the cytoplasm, where they are further cleaved by Dicer to produce mature miRNAs [7]. MiRNAs interact by pairing with the 3′ untranslated region (UTR), and also within the coding region and 5′ UTR of the corresponding mRNAs, leading to mRNA destabilization, cleavage or translation repression. More effective mRNA destabilization is achieved when a miRNA targets the 3′ UTR rather than other mRNA regions [8]–[10]. An individual miRNA may regulate approximately 100 distinct mRNAs, and together more than 1000 human miRNAs are believed to modulate more than half of the mRNA species encoded in the genome [11], [12]. Additionally, most mRNAs possess binding sites for miRNAs [13]. MiRNAs are involved in tumorigenesis in that they can be either oncogenic when tumor suppressor genes are targeted, or genomic guardians (tumor suppressor miRNAs) when oncogenes are targeted [14]. Additionally, it has been suggested that they may modulate both metastasis [15] and chemotherapy resistance [16]. MiRNAs have also been shown to have altered expression levels in tumors compared to normal tissue and between tumor subtypes in breast cancer among other carcinoma types [17]–[19]. SNPs may affect miRNA machinery genes or miRNA activity; however, SNPs can also create, abolish or modify miRNA binding sites in their binding regions. Polymorphisms in miRNA binding sites have been studied in regard to the risk of several cancers [20], including breast cancer [21]–[23].
These studies have found evidence for association of miRNA related SNPs and cancer risk, but the study sample sizes have been relatively small.
In this study, we investigate associations between miRNA-related polymorphisms and breast cancer risk by using a meta-analysis of nine GWAS and subsequent genotyping of top hits using 41,785 cases and 41,880 controls of European ancestry from the BCAC. To our knowledge, this is thus far the largest investigation of associations between miRNA-related polymorphisms and breast cancer susceptibility.
Materials and Methods
SNP selection and genotyping
SNPs in mature or pre-miRNAs, in genes of the miRNA machinery and in 3′ UTR regions of protein coding genes with a potential effect on miRNA binding were systematically searched from Ensembl (hg18/build36) and Patrocles databases [24]. Additionally, tagging SNPs for these variants (r2 ≥ 0.8) were identified utilizing the public HapMap SNP database. By this in silico approach we identified altogether 147,801 candidate SNPs and 12,550 tagging SNPs. These SNPs were then overlaid with those from the combined GWAS from the BCAC [4], and altogether 2,196 SNPs were present (either genotyped or imputed) in the combined GWAS. These SNPs were genotyped with Illumina or Affymetrix arrays, as described previously [25]–[32]. The combined GWAS data were imputed for all scans using HapMap version 2 CEU as a reference in similar fashion to that presented by Michailidou and colleagues [4], with the exception that the HapMap version 2 release 21 was used at the time the overlay was performed. Analysis using a 1-degree-of-freedom trend test of these 2,196 SNPs in the combined GWAS indicated some evidence of association with breast cancer risk for 44 SNPs (p<0.09). Notably, the combined GWAS included imputed data generated using HapMap version 2 release 21 (based on NCBI build 35 (dbSNP b125)), whereas the results presented here for the combined GWAS are based on imputation using HapMap version 2 release 22 (based on NCBI build 36 (dbSNP b126)). In release 22, a number of SNPs were excluded due to mapping inconsistencies in build 35 relative to build 36. Hence, the estimates from the combined GWAS may differ slightly from the initial association analysis. The 44 SNPs (including 30 candidate and 14 tagging SNPs) were genotyped on additional samples in the BCAC using the custom Illumina Infinium array (iCOGS), which included a total of 211,155 SNPs as described previously. A detailed description of the quality control process for the combined GWAS and iCOGS genotyping data is presented in [4].
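The tagging criterion above (r2 ≥ 0.8) can be illustrated with a minimal Python sketch. It assumes phased haplotype counts for a pair of biallelic SNPs, as a HapMap-style reference panel would provide; the function names and the counts are hypothetical and are not taken from the study.

```python
def ld_r2(n_ab, n_aB, n_Ab, n_AB):
    """Squared correlation (r2) between two biallelic SNPs.

    Arguments are counts of the four possible haplotypes, where A/a are the
    alleles of the first SNP and B/b the alleles of the second SNP.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    p_A = (n_Ab + n_AB) / n          # frequency of allele A at SNP 1
    p_B = (n_aB + n_AB) / n          # frequency of allele B at SNP 2
    p_AB = n_AB / n                  # frequency of the A-B haplotype
    d = p_AB - p_A * p_B             # linkage disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r2 is undefined when a SNP is monomorphic")
    return d * d / denom


def is_tag(r2, threshold=0.8):
    """True if r2 meets the tagging threshold used for SNP selection."""
    return r2 >= threshold


# Hypothetical haplotype counts (not study data).
perfect = ld_r2(60, 0, 0, 60)    # the two SNPs always travel together
partial = ld_r2(55, 5, 5, 55)    # some recombinant haplotypes present

print(f"perfect LD: r2 = {perfect:.3f}, tag = {is_tag(perfect)}")
print(f"partial LD: r2 = {partial:.3f}, tag = {is_tag(partial)}")
```

In this illustration the second pair falls below the 0.8 threshold and would not be accepted as a tag for the functional SNP.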
Of the 42 SNPs that passed quality control [4], two were located in miRNA genes (one candidate SNP located in pre-miRNA hsa-miR-2110 and one tag SNP tagging a mature hsa-mir-548l variant), and four SNPs were located in miRNA machinery genes (SMAD5, SND1, CNOT4 and DROSHA). The genotyped DROSHA SNP tags the 3′ UTR miRNA binding site variant in the DROSHA gene. The remaining 38 candidate or tag SNPs were located in, or tagged to a predicted miRNA binding site in the 3′ UTR of protein coding genes. All 42 SNPs are described in Table 1. The workflow of the SNP selection in different stages is illustrated in Figure 1.
| Functional SNP (tag SNP, r2) | Chr | Position | Coding | Gene | miRNA | SNP effect^a |
| rs17091403 | 10 | 115923895 | GA | hsa-miR-2110 | | |
| rs13447640 (rs1805360, r2 = 1) | 11 | 93866677 | GA | hsa-mir-548l | | |
| rs3764941 | 5 | 135497426 | AC | SMAD5 | | |
| rs17151639 | 7 | 127425052 | AG | SND1 | | |
| rs17480616 | 7 | 134773600 | CG | CNOT4 | | |
| rs10719 | 5 | 31437204 | GA | DROSHA | hsa-miR-1298 | AC |
| rs2550303 | 16 | 54953111 | AG | AMFR | hsa-miR-577 | AC |
| rs7513934 | 1 | 52590776 | GA | CC2D1B | hsa-miR-384/hsa-miR-577 | CNC |
| rs1128226 | 7 | 21908194 | AC | CDCA7L | hsa-miR-548g | AC |
| rs3796133 | 3 | 100000533 | GA | DCBLD2 | hsa-miR-624* | AC |
| rs7441 | 12 | 90063806 | GA | DCN | hsa-miR-135b* | AC |
| rs1803439 | 21 | 37807312 | AG | DYRK1A | hsa-miR-550 | AC |
| rs3797 | 15 | 27199858 | AG | FAM189A1 | hsa-miR-570 | AC |
| rs7130622 | 11 | 128186721 | AC | FLI1 | hsa-miR-138-2* | AC |
| rs1052532 | 15 | 89275240 | AG | HDDC3 | hsa-miR-1224-3p/hsa-miR-1260/hsa-miR-1280 | AC |
| rs7040123 | 9 | 7160742 | AG | KDM4C | hsa-miR-154*/hsa-miR-487a | AC |
| rs1062225 | 10 | 49313232 | AG | MAPK8 | hsa-miR-203 | AC |
| rs41739 | 7 | 116224740 | AG | MET | hsa-miR-576-5p | AC |
| rs702681 | 5 | 56253786 | AG | MIER3 | hsa-miR-196a* | AC |
| rs3134615 | 1 | 40134653 | CA | MYCL1 | hsa-miR-1827 | ANC |
| rs2304669 | 2 | 238830402 | AG | PER2 | hsa-miR-885-3p | AC |
| rs13422 | 17 | 15074900 | AC | PMP22 | hsa-miR-29b-1* | AC |
| rs7562391 | 2 | 201444411 | AC | PPIL3 | hsa-miR-493*/hsa-miR-499-3p | AC |
| rs7520333 | 1 | 40862837 | AG | RIMS3 | hsa-let-7d/hsa-let-7e | CNC |
| rs739692 | 18 | 53178524 | GA | ST8SIA3 | hsa-miR-96/hsa-miR-1271/hsa-miR-182 | AC |
| rs1058450 | 4 | 120200088 | GA | SYNPO2 | hsa-miR-183 | AC |
| rs4351800 | 11 | 7446395 | CA | SYT9 | hsa-miR-544 | AC |
| rs12438324 | 15 | 55366808 | AG | TCF12 | hsa-miR-591 | AC |
| rs12869870 | 13 | 99415306 | GA | ZIC5 | hsa-miR-34a/hsa-miR-34c-5p/hsa-miR-449a/hsa-miR-449b | AC |
| rs9990 (rs1444418, r2 = 1) | 10 | 64230476 | AG | ADO | hsa-miR-512-5p/hsa-miR-510 | AC |
| rs757537 (rs4705870, r2 = 1) | 5 | 132187033 | GA | ANKRD43 | hsa-miR-320a/hsa-miR-320b/hsa-miR-320c/hsa-miR-320d | AC |
| rs3774729 (rs2037119, r2 = 0.943) | 3 | 63969919 | GA | ATXN7 | hsa-miR-1206 | AC |
| rs1045487 (rs1045494, r2 = 1) | 2 | 201860026 | AG | CASP8 | hsa-miR-938 | AC |
| rs7288826 (rs8140217, r2 = 1) | 22 | 37547947 | GA | CBX6 | hsa-miR-1207-5p | AC |
| rs17569034 (rs17512204, r2 = 0.835) | 2 | 118449301 | GA | CCDC93 | hsa-miR-1178 | AC |
| rs3205281 (rs7674744, r2 = 1) | 4 | 78874296 | GA | CNOT6L | hsa-miR-643/hsa-miR-297 | AC |
| rs13005 (rs9473, r2 = 0.964) | 10 | 13727177 | GA | FRMD4A | hsa-miR-548m | AC |
| rs3809831 (rs3809828, r2 = 1) | 17 | 7187575 | GA | KCTD11 | hsa-miR-892b | AC |
| rs6445538 (rs4687554, r2 = 1) | 3 | 52839175 | AG | MUSTN1 | hsa-miR-891b | AC |
| rs7818 (rs9371201, r2 = 0.875) | 6 | 150186694 | GA | PCMT1 | hsa-miR-595 | AC |
| rs9844202 (rs7635553, r2 = 1) | 3 | 168646064 | GA | SERPINI2 | hsa-miR-1272 | AC |
| rs2271565 (rs7086917, r2 = 1) | 10 | 49867441 | AC | WDFY4 | hsa-miR-657/hsa-miR-214/hsa-miR-15a/hsa-miR-16/hsa-miR-15b/hsa-miR-195/hsa-miR-424/hsa-miR-497 | AC |
Tag SNPs used in the analysis are given in parentheses along with the r2 value relative to the functional SNP.
^a According to Patrocles prediction; AC = abolishes conserved binding site, ANC = abolishes non-conserved binding site, CNC = creates non-conserved binding site (target sites are considered conserved if they are shared by at least one primate, one rodent and one nonprimate/nonrodent mammal [24]).
Study sample
The combined GWAS included nine breast cancer studies totalling 10,052 cases and 12,575 controls of European ethnic background. Details and study-specific subject numbers are presented in Table S1. Since the GWAS were limited to patients of European ethnic background we further utilized 41,785 cases ascertained for their first primary, invasive breast cancer and 41,880 controls of European ancestry from 41 BCAC studies genotyped using the iCOGS array (Table S2). For a subgroup analysis of ER negative and ER positive cases, as well as cases aged less than 50 years at diagnosis, we included all the cases for which the respective data were available. The ER subgroup analysis was based on 702 ER negative cases and 2,019 ER positive cases from five GWAS studies and 7,200 ER negative cases from 40 BCAC studies and 26,302 ER positive cases from 34 BCAC studies. The analysis of cases aged less than 50 years at diagnosis was based on 3,470 cases from three GWAS studies and 9,483 cases from 35 BCAC studies. All participating studies conform to the Declaration of Helsinki and were approved by the respective ethical review boards and ethics committees (Tables S1 and S2), and all participants in these studies had provided written consent for the research.
Statistical methods
We used logistic regression to estimate per-allele log-odds ratios and standard errors, including the study as a covariate. We also included principal components as covariates in order to correct for potential hidden population structure. In the GWAS, for two studies (UK2 and HEBCS) the estimates were adjusted for the first three principal components, and in the iCOGS analysis we used the first six principal components and an additional component to reduce inflation for the LMBC study, as described previously [4]. Subgroup analyses were carried out for ER negative and positive subgroups and for the group aged less than 50 years at diagnosis. For meta-analysis, we combined the estimates from the combined GWAS and iCOGS with a fixed effects model using the inverse variance weighted method. In the meta-analysis, the subjects involved in both the combined GWAS and iCOGS (n = 1,880) were only taken into account once. To adjust the P-values for multiple testing, we used the Benjamini–Hochberg correction. The adjusted P-values are shown in Table 2 along with the nominal P-values. In the text we report the nominal P-values. The statistical analyses were conducted using the R 2.14.0 statistical computing environment (http://www.r-project.org/).
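The inverse variance weighted fixed effects combination described above can be sketched as follows. This is an illustration in Python rather than the R code used in the study. It assumes that the standard error of each log odds ratio can be approximated from the rounded 95% confidence limits published in Table 2 (here for rs702681), and it ignores the handling of subjects present in both data sets, so it only approximates the reported result. The function names are hypothetical.

```python
import math

Z95 = 1.96  # normal quantile for a two-sided 95% confidence interval


def log_or_and_se(odds_ratio, lower, upper):
    """Approximate the log odds ratio and its standard error from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return beta, se


def fixed_effect_meta(estimates):
    """Inverse variance weighted fixed effects estimate.

    `estimates` is a list of (beta, se) pairs; returns the pooled (beta, se).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    return beta, se


# rs702681, rounded values as printed in Table 2.
gwas = log_or_and_se(1.07, 1.02, 1.11)
icogs = log_or_and_se(1.06, 1.04, 1.09)

beta, se = fixed_effect_meta([gwas, icogs])
combined_or = math.exp(beta)
ci = (math.exp(beta - Z95 * se), math.exp(beta + Z95 * se))

print(f"combined OR = {combined_or:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f})")
```

With these rounded inputs the pooled estimate agrees, to two decimals, with the combined GWAS + iCOGS odds ratio and confidence interval listed for rs702681 in Table 2. P-values recomputed from rounded inputs should not be expected to match the published ones.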
| SNP | Chr | Position | coding1 | GWAS OR (95%CI)2 | GWAS P3 | iCOGS OR (95% CI)2 | iCOGS P3 | Combined GWAS + iCOGS OR (95% CI)2 | Combined GWAS + iCOGS P3(BH corrected P)4 | Gene |
| rs702681 | 5 | 56253786 | AG | 1.07 (1.02–1.11) | 3.92×10−3 | 1.06 (1.04–1.09) | 2.76×10−8 | 1.06 (1.04–1.08) | 3.88×10−10 (1.63×10−8) | MIER3 |
| rs1045494 | 2 | 201860026 | AG | 0.90 (0.81–1.00) | 4.74×10−2 | 0.92 (0.88–0.96) | 4.47×10−4 | 0.92 (0.88–0.96) | 5.94×10−5 (1.25×10−3) | CASP8 |
| rs1052532 | 15 | 89275240 | AG | 0.94 (0.90–0.98) | 7.94×10−3 | 0.97 (0.95–0.99) | 1.47×10−2 | 0.97 (0.95–0.99) | 7.78×10−4 (1.09×10−2) | HDDC3 |
| rs10719 | 5 | 31437204 | GA | 0.92 (0.88–0.97) | 8.79×10−4 | 0.98 (0.95–1.00) | 5.32×10−2 | 0.97 (0.94–0.99) | 1.35×10−3 (1.42×10−2) | DROSHA |
| rs4687554 | 3 | 52839175 | AG | 0.94 (0.90–0.99) | 1.23×10−2 | 0.97 (0.95–1.00) | 2.39×10−2 | 0.97 (0.95–0.99) | 1.71×10−3 (1.44×10−2) | MUSTN1 |
| rs3134615 | 1 | 40134653 | CA | 1.04 (0.99–1.09) | 9.97×10−2 | 1.03 (1.00–1.05) | 2.09×10−2 | 1.03 (1.01–1.05) | 5.07×10−3 (3.55×10−2) | MYCL1 |
| rs7635553 | 3 | 168646064 | GA | 0.89 (0.83–0.95) | 9.73×10−4 | 0.98 (0.95–1.01) | 1.98×10−1 | 1.00 (0.97–1.04) | 9.24×10−3 (5.54×10−2) | SERPINI2 |
| rs3796133 | 3 | 100000533 | GA | 1.18 (1.08–1.29) | 4.18×10−4 | 1.01 (0.97–1.06) | 5.74×10−1 | 1.04 (1.00–1.09) | 3.93×10−2 (1.45×10−1) | DCBLD2 |
| rs4351800 | 11 | 7446395 | CA | 1.04 (1.00–1.08) | 4.48×10−2 | 1.01 (0.99–1.03) | 1.98×10−1 | 1.02 (1.00–1.04) | 4.15×10−2 (1.45×10−1) | SYT9 |
| rs17512204 | 2 | 118449301 | GA | 1.06 (0.98–1.14) | 1.20×10−1 | 1.03 (0.99–1.06) | 1.63×10−1 | 1.03 (1.00–1.07) | 5.22×10−2 (1.57×10−1) | CCDC93 |
| rs3809828 | 17 | 7187575 | GA | 1.17 (1.06–1.28) | 1.97×10−3 | 1.01 (0.97–1.05) | 5.22×10−1 | 0.99 (0.95–1.03) | 7.93×10−2 (2.22×10−1) | KCTD11 |
| rs7441 | 12 | 90063806 | GA | 1.11 (1.03–1.20) | 8.70×10−3 | 1.01 (0.97–1.05) | 5.98×10−1 | 1.03 (0.99–1.06) | 1.04×10−1 (2.57×10−1) | DCN |
| rs7086917 | 10 | 49867441 | AC | 0.96 (0.93–1.00) | 6.35×10−2 | 0.99 (0.97–1.01) | 4.38×10−1 | 0.99 (0.97–1.00) | 1.29×10−1 (3.01×10−1) | WDFY4 |
| rs7040123 | 9 | 7160742 | AG | 1.11 (0.99–1.23) | 7.59×10−2 | 1.02 (0.97–1.07) | 5.14×10−1 | 1.00 (0.95–1.04) | 1.79×10−1 (3.74×10−1) | KDM4C |
| rs7674744 | 4 | 78874296 | GA | 0.94 (0.89–0.99) | 2.83×10−2 | 0.99 (0.97–1.02) | 6.91×10−1 | 1.01 (0.98–1.03) | 1.81×10−1 (3.74×10−1) | CNOT6L |
| rs12438324 | 15 | 55366808 | AG | 0.87 (0.79–0.97) | 1.01×10−2 | 1.00 (0.94–1.05) | 8.69×10−1 | 1.02 (0.98–1.07) | 1.87×10−1 (3.74×10−1) | TCF12 |
| rs17151639 | 7 | 127425052 | AG | 0.96 (0.92–1.01) | 1.09×10−1 | 0.99 (0.97–1.02) | 5.66×10−1 | 1.00 (0.98–1.02) | 2.19×10−1 (4.18×10−1) | SND1 |
| rs17480616 | 7 | 134773600 | CG | 0.87 (0.72–1.04) | 1.27×10−1 | 0.99 (0.93–1.04) | 6.39×10−1 | 0.98 (0.92–1.03) | 3.70×10−1 (5.98×10−1) | CNOT4 |
| rs7513934 | 1 | 52590776 | GA | 1.04 (1.00–1.08) | 7.98×10−2 | 1.00 (0.98–1.02) | 9.99×10−1 | 1.01 (0.99–1.02) | 4.37×10−1 (6.34×10−1) | CC2D1B |
| rs2304669 | 2 | 238830402 | AG | 0.96 (0.91–1.02) | 1.86×10−1 | 1.00 (0.97–1.02) | 8.17×10−1 | 0.99 (0.97–1.02) | 4.38×10−1 (6.34×10−1) | PER2 |
| rs1058450 | 4 | 120200088 | GA | 0.96 (0.91–1.01) | 1.33×10−1 | 1.00 (0.97–1.02) | 9.28×10−1 | 1.01 (0.98–1.03) | 4.59×10−1 (6.43×10−1) | SYNPO2 |
The SNPs with consistent odds ratios in combined GWAS and iCOGS analysis are shown. (Results for all 42 SNPs are presented in Table S3.)
^1 Build 36 position.
^2 Per-allele odds ratio for the minor allele relative to the major allele.
^3 1-df P-trend.
^4 1-df P-trend adjusted for multiple testing by the Benjamini–Hochberg correction method.
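The Benjamini–Hochberg adjustment referred to in footnote 4 can be sketched in Python as below (the study itself used R). The function `bh_adjust` is a generic, hypothetical implementation. The second part assumes that the seven smallest nominal P-values in Table 2 occupy ranks 1 to 7 among the 42 tested SNPs, in which case the adjusted value is simply P × 42 / rank.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Small generic example (not study data).
example = bh_adjust([0.01, 0.04, 0.03, 0.005])

# Seven smallest nominal P-values from Table 2, assumed to hold ranks 1-7 of 42.
M_TESTS = 42
top_nominal = [3.88e-10, 5.94e-5, 7.78e-4, 1.35e-3, 1.71e-3, 5.07e-3, 9.24e-3]
scaled = [p * M_TESTS / rank for rank, p in enumerate(top_nominal, start=1)]

for p, q in zip(top_nominal, scaled):
    print(f"nominal {p:.2e} -> adjusted {q:.2e}")
```

Under that assumption the scaled values agree with the BH corrected P-values printed in Table 2 after rounding. Table 2 lists only 21 of the 42 SNPs, so ranks further down the table cannot be read off from it directly.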
Results
For the 42 SNPs we successfully genotyped, estimates of association from the combined GWAS and from iCOGS analysis are shown in Table S3. Twenty-one SNPs showed consistent associations with breast cancer risk in the combined GWAS and in iCOGS analysis; results from the meta-analysis are shown in Table 2. The most significantly associated SNP, rs702681 (OR 1.06 [95% CI 1.04–1.08]; P = 3.9×10−10), is located in the 3′ UTR of MIER3, close to the known breast cancer susceptibility gene MAP3K1. The SNP rs702681 is located at the same 5q11.2 locus as the previously published risk SNP rs889312 [33] (correlation r2 = 0.3). When the two SNPs were analysed in the same logistic regression model, the association with rs889312, but not that with rs702681, remained nominally statistically significant, suggesting that rs702681 is unlikely to be the causal SNP at this locus. The five SNPs with significant novel associations from the meta-analysis (P ≤ 5.07×10−3 and adjusted P ≤ 3.55×10−2 after correction for multiple testing) were rs1045494 (OR 0.92 [95% CI 0.88–0.96]; P = 5.90×10−5), rs1052532 (OR 0.97 [95% CI 0.95–0.99]; P = 7.78×10−4), rs10719 (OR 0.97 [95% CI 0.94–0.99]; P = 1.35×10−3), rs4687554 (OR 0.97 [95% CI 0.95–0.99]; P = 1.71×10−3) and rs3134615 (OR 1.03 [95% CI 1.01–1.05]; P = 5.07×10−3), located in the 3′ UTR of Caspase-8 (CASP8), HD Domain Containing 3 (HDDC3), DROSHA, Musculoskeletal, Embryonic Nuclear Protein 1 (MUSTN1) and V-Myc Myelocytomatosis Viral Oncogene Homolog 1 (MYCL1), respectively (Table 2). SNP rs1045494 tags the hsa-miR-938 binding site SNP rs1045487 (r2 = 1.0) of CASP8, and the SNP rs1052532 in HDDC3 is predicted to abolish the binding site for hsa-miR-1224-3p. The SNP rs10719 is predicted to abolish the hsa-miR-1298 binding site in the 3′ UTR of DROSHA. SNP rs4687554 tags the hsa-miR-891b binding site SNP rs6445538 (r2 = 1.0) of MUSTN1, and rs3134615 is located at the binding site of hsa-miR-1827 of MYCL1.
There was no evidence for heterogeneity in the per-allele OR for any SNP. The per study per allele ORs for these five miRNA binding site SNPs from the combined GWAS along with per-SNP heterogeneity variance P-values are shown in Figure S1 and from the iCOGS in Figure S2. Next we analysed the SNPs by ER status-defined subtype, and for cases aged less than 50 years at diagnosis, for risk associations in the meta-analysis of combined GWAS and iCOGS (Tables S4, S5 and S6). These analyses did not reveal any additional significant results. For rs1045494 in CASP8, rs4687554 in MUSTN1 and rs3134615 in MYCL1 (OR 1.03 [95%CI 1.01–1.05]; P = 7.75×10−4) a more significant association with breast cancer risk was found for the ER positive subgroup than in the main analysis, but the result from the test for heterogeneity by ER status was not significant (data not shown). All associations were estimated using an additive inheritance model. Dominant and recessive models did not improve the estimates (data not shown).
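The three inheritance models mentioned above differ only in how a genotype is coded before it enters the regression. The Python sketch below illustrates the coding with hypothetical function names. The genotype-specific odds ratios it prints are what a per-allele (log-additive) estimate implies arithmetically; they are not values reported by the study.

```python
def encode(minor_allele_count, model):
    """Code a genotype (0, 1 or 2 copies of the minor allele) for a given model."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor allele count must be 0, 1 or 2")
    if model == "additive":      # each copy contributes equally on the log-odds scale
        return minor_allele_count
    if model == "dominant":      # one copy is enough for the full effect
        return int(minor_allele_count >= 1)
    if model == "recessive":     # effect only with two copies
        return int(minor_allele_count == 2)
    raise ValueError(f"unknown model: {model}")


def implied_genotype_or(per_allele_or, minor_allele_count):
    """Odds ratio versus the major-allele homozygote implied by a per-allele OR."""
    return per_allele_or ** minor_allele_count


for model in ("additive", "dominant", "recessive"):
    print(model, [encode(g, model) for g in (0, 1, 2)])

# Per-allele OR of rs1045494 from the combined analysis (Table 2).
het_or = implied_genotype_or(0.92, 1)
hom_or = implied_genotype_or(0.92, 2)
print(f"implied OR: heterozygote {het_or:.2f}, minor-allele homozygote {hom_or:.2f}")
```

Under the additive coding, a per-allele odds ratio of 0.92 therefore corresponds to an odds ratio of roughly 0.85 for carriers of two minor alleles.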
Discussion
We investigated associations between genetic variation in miRNAs, in the genes of the miRNA machinery and in the miRNA binding sites and the risk of breast cancer. We identified several SNPs that are predicted to abolish an miRNA binding site and that are significantly associated with breast cancer risk. Previous studies investigating miRNA-related SNPs, especially in miRNA binding sites, have included predefined sets of genes. Nicoloso and colleagues investigated 38 previously identified breast cancer risk SNPs and found two to modify miRNA binding sites in TGFB1 and XRCC1 in vitro [23]. Neither of these was included in our data set. Liang and colleagues investigated 134 potential miRNA binding sites in cancer-related genes and found six miRNA binding site SNPs that were associated with ovarian cancer risk [34].
In the meta-analysis of combined GWAS and iCOGS for main effects, for four of the five most significant miRNA binding site SNPs, the minor allele was associated with a decreased breast cancer risk. The minor allele of SNP rs3134615 in the 3′ UTR of MYCL1 was associated with an increased breast cancer risk. All five of the most significant miRNA binding site SNPs are located in the 3′ UTR and have been predicted to abolish the miRNA binding site. The defect in miRNA-mediated regulation would be expected to lead to an increase in the translation of the corresponding encoded protein. The five genes whose regulation may be affected by the miRNA-associated SNPs include the pro-apoptotic gene CASP8, HDDC3, the miRNA biogenesis master regulator DROSHA, the MYC-family member MYCL1 and MUSTN1. CASP8 is involved in apoptosis in breast cancer cells [35], and many studies have reported polymorphisms in this gene to be associated with risks for several cancers [36], [37] including breast cancer [38], [39], indicating the importance of CASP8 in tumor development. SNP rs1045494 studied here is located close to the coding region SNP rs1045485 that has been previously shown to have a stronger protective effect [38], [40], [41]. Interestingly, Michailidou and colleagues reported this SNP as having only weak evidence for an association (P = 0.0013 in combined GWAS and iCOGS) [4], but these two SNPs (rs1045485 and rs1045494) are not correlated (r2 = 0.001 in Caucasian population). Neither is rs1045494 correlated with the more strongly associated rs1830298 SNP, identified through fine-mapping of the region (r2 = 0.02) [42]. SNP rs1045494 tags SNP rs1045487 (r2 = 1.0), which is predicted to abolish the hsa-miR-938 binding site and thus may affect CASP8 expression. There is very little reported evidence on the involvement of HDDC3 or hsa-miR-1224-3p in cancer, indicating a novel association with risk. HDDC3 has been suggested to be involved in the starvation response [43].
The HDDC3 gene is expressed at higher levels by several different tumor types, including breast tumors, than by normal tissue [44]. DROSHA is a miRNA master regulator. It is a member of the RNase III enzyme family, belongs to the miRNA biogenesis pathway and is the core nuclease that processes pri-miRNAs into pre-miRNAs in the nucleus [5], [6]. The SNP rs10719 in the 3′ UTR of DROSHA is predicted to abolish the hsa-miR-1298 binding site. Hsa-miR-1298 is predicted to target DROSHA by the Patrocles prediction as well as by the TargetScan [45] and PITA [46] prediction algorithms. Recently, a small Korean study reported another SNP, rs644236, tagging the SNP rs10719 (r2 = 0.955 in CEU population and r2 = 0.876 in Asian population (combined CHB and JPT)), to be associated with elevated breast cancer risk [47]. When taking into account the opposite major and minor alleles in the Asian and European populations for SNPs rs644236 and rs10719, this result is in concordance with our results, where both the combined GWAS and the iCOGS analysis consistently indicated an association of the minor allele of SNP rs10719 with reduced breast cancer risk. We also found the minor allele of SNP rs3134615 in the 3′ UTR of MYCL1 to be associated with an increased risk. MYCL1 (L-MYC) belongs to the same family of transcription factors as the known proto-oncogene MYC (C-MYC) and they share a high degree of structural similarity [48]. The MYCL1 gene has previously been reported to be amplified and overexpressed in ovarian cancer [49]. A case-control study by Xiong and colleagues reported SNP rs3134615 to be significantly associated with increased risk of small cell lung cancer [50]. SNP rs3134615 was predicted by Patrocles to abolish the hsa-miR-1827 binding site. This has also been suggested by functional studies in which MYCL1 was found to be the target of hsa-miR-1827 and the SNP rs3134615 was also found to increase MYCL1 expression [50].
The evidence from functional studies is consistent with our finding that SNP rs3134615 might increase breast cancer risk. MUSTN1 has been shown to be involved in the development and regeneration of the musculoskeletal system [51]. Thus far no evidence of association between MUSTN1 and breast cancer has been reported, but the MUSTN1 gene is expressed in the mammary glands [52].
Since only a small fraction of miRNA binding sites has been experimentally validated, we selected SNPs that had been computationally predicted to affect miRNA binding sites. For our original SNP selection we used the Patrocles database, which contains predicted miRNA binding sites and also compiles perturbation predictions of SNP effects. There is a multitude of prediction programs, and their performance has been evaluated [53]. Witkos and colleagues found target prediction algorithms that utilize orthologous sequence alignment, like Patrocles, to be the most reliable.
The follow-up of the 42 miRNA-related SNPs identified five significant associations with breast cancer risk. Although the individual risk effects were subtle, considering that we could only investigate a small proportion of our initial in silico data set of miRNA-related SNPs (over 140,000 SNPs), this may suggest that genetic polymorphisms affecting miRNA regulation could have a considerable combined effect on breast cancer risk.
It should be noted that, until fine mapping studies are carried out for these loci, it is not clear whether these miRNA-related SNPs are the variants responsible for the observed associations.
This comprehensive analysis of miRNA related polymorphisms using a large two stage study of women with European ancestry provides evidence for miRNA related SNPs being potential modulators of breast cancer risk.
Supporting Information
Figure S1. Forest plots for the five most significant miRNA binding site SNPs from the combined GWAS. Squares indicate the estimated per-allele OR for the minor allele in Europeans. The horizontal lines indicate 95% confidence limits. The vertical blue dashed lines indicate clipping of the confidence intervals for presentation purposes. The area of the square is inversely proportional to the variance of the estimate. The diamond indicates the estimated per-allele OR from the combined analysis.
(PDF)
Figure S2. Forest plots for the five most significant miRNA binding site SNPs from the iCOGS. Squares indicate the estimated per-allele OR for the minor allele in Europeans. The horizontal lines indicate 95% confidence limits. The vertical blue dashed lines indicate clipping of the confidence intervals for presentation purposes. The area of the square is inversely proportional to the variance of the estimate. The diamond indicates the estimated per-allele OR from the combined analysis.
(PDF)
Table S1. A description of each GWAS study, number of subjects and genotyping platform used in combined GWAS.
(DOC)
Table S2. A description of each BCAC study with subjects of European origin in iCOGS.
(DOC)
Table S3. Frequencies and effect sizes of the 42 SNPs in the main analysis; combined GWAS and iCOGS.
(DOC)
Table S4. Results for SNPs in the GWAS and iCOGS separately and combined GWAS+iCOGS analysis for ER negative subgroup.
(DOC)
Table S5. Results for SNPs in the GWAS and iCOGS separately and combined GWAS+iCOGS analysis for ER positive subgroup.
(DOC)
Table S6. Results for SNPs in the GWAS and iCOGS separately and combined GWAS+iCOGS analysis for cases less than 50 years at diagnosis.
(DOC)
Acknowledgments
We thank all the individuals who took part in these studies and all the researchers, study staff, clinicians and other health care providers, technicians and administrative staff who have enabled this work to be carried out. The HEBCS thanks Dr. Karl von Smitten and RN Irja Erkkilä for their help with the HEBCS data and samples. The ABCFS thanks Maggie Angelakos, Judi Maskiell and Gillian Dite. The OFBCR thanks Teresa Selander, Nayana Weerasooriya and Gord Glendon. The ABCS would like to acknowledge Ellen van der Schoot for DNA of controls. The BBCC thanks Silke Landrith, Sonja Oeser, Matthias Rübner. The BBCS thanks Eileen Williams, Elaine Ryder-Mills and Kara Sargus. The BIGGS thanks Niall McInerney, Gabrielle Colleran, Andrew Rowan and Angela Jones. The BSUCH thanks Peter Bugert and the Medical Faculty, Mannheim. The CGPS thanks the staff and participants of the Copenhagen General Population Study, and Dorthe Uldall Andersen, Maria Birna Arnadottir, Anne Bank, Dorthe Kjeldgård Hansen for excellent technical assistance. The CNIO-BCS acknowledge the support of Nuria Álvarez, Daniel Herrero, Primitiva Menendez and the Human Genotyping-CEGEN Unit (CNIO). The DFBBCS thanks Margreet Ausems, Christi van Asperen, Senno Verhoef, and Rogier van Oldenburg for providing samples from their Clinical Genetic centers. We also thank Pascal Arp, Mila Jhamai, Marijn Verkerk, Lizbeth Herrera and Marjolein Peters for their help in creating the GWAS database, and Karol Estrada and Maksim V. Struchalin for their support in creation and analysis of imputed data. The authors are grateful to the study participants, the staff from the Rotterdam Study and the participating general practitioners and pharmacists. The ESTHER thanks Hartwig Ziegler, Sonja Wolf and Volker Hermann, Katja Butterbach. The GC-HBOC would like to thank the following persons for providing additional information and samples: Prof. Dr. Norbert Arnold, Dr. Sabine Preissler-Adams, Dr. Monika Mareeva-Varon, Dr. 
Dieter Niederacher, Prof. Dr. Brigitte Schlegelberger, Dr. Clemens Mül, Heide Hellebrand, and Stefanie Engert. The HMBCS thanks Peter Hillemanns, Hans Christiansen and Johann H. Karstens. The KBCP thanks Eija Myöhänen and Helena Kemiläinen. kConFab/AOCS wish to thank Heather Thorne, Eveline Niedermayr, all the kConFab research nurses and staff, the heads and staff of the Family Cancer Clinics, and the Clinical Follow Up Study for their contributions to this resource, and the many families who contribute to kConFab. The LMBC thanks Gilian Peuteman, Dominiek Smeets, Thomas Van Brussel and Kathleen Corthouts. The MARIE would like to thank Alina Vrieling, Katharina Buck, Ursula Eilber, Muhabbet Celik, and Sabine Behrens. The MBCSG thanks Siranoush Manoukian, Bernard Peissel and Daniela Zaffaroni of the Fondazione IRCCS Istituto Nazionale dei Tumori (INT); Bernardo Bonanni, Irene Feroce and Angela Maniscalco of the Istituto Europeo di Oncologia (IEO) and the personnel of the Cogentech Cancer Genetic Test Laboratory. The MTLGEBCS gratefully acknowledge the assistance of Lesley Richardson and Marie-Claire Goulet in conducting the study. We would like to thank Martine Tranchant (Cancer Genomics Laboratory, CHU de Québec Research Center), Marie-France Valois, Annie Turgeon and Lea Heguy (McGill University Health Center, Royal Victoria Hospital; McGill University) for DNA extraction, sample management and skillful technical assistance. J.S. is Chairholder of the Canada Research Chair in Oncogenetics. The OBCS thanks Meeri Otsukka and Kari Mononen. The ORIGO thanks E. Krol-Warmerdam, and J. Blom for patient accrual, administering questionnaires, and managing clinical information. The LUMC survival data were retrieved from the Leiden hospital-based cancer registry system (ONCDOC) with the help of Dr. J. Molenaar. The OSU thanks Robert Pilarksi and Charles Shapiro, who were instrumental in the formation of the OSU Breast Cancer Tissue Bank. 
We thank the Human Genetics Sample Bank for processing of samples. OSU Columbus area control specimens were provided by the Ohio State University's Human Genetics Sample Bank. The PBCS thanks Mark Sherman, Neonila Szeszenia-Dabrowska, Beata Peplonska, Witold Zatonski, Pei Chao and Michael Stagner. The RBCS thanks Petra Bos, Jannet Blom, Ellen Crepin, Anja Nieuwlaat, Annette Heemskerk and the Erasmus MC Family Cancer Clinic. The SBCS thanks Sue Higham, Ian Brock, Sabapathy Balasubramanian, Helen Cramp and Dan Connley. The SEARCH thanks the SEARCH and EPIC-Norfolk teams. The iCOGS study would not have been possible without the contributions of the following: Qin Wang (BCAC), Andrew Berchuck (OCAC), Rosalind A. Eeles, Ali Amin Al Olama, Zsofia Kote-Jarai, Sara Benlloch (PRACTICAL), Antonis Antoniou, Lesley McGuffog and Ken Offit (CIMBA), Andrew Lee, and Ed Dicks, Craig Luccarini and the staff of the Centre for Genetic Epidemiology Laboratory, Anna Gonzalez-Neira and the staff of the CNIO genotyping unit, Daniel C. Tessier, Francois Bacot, Daniel Vincent, Sylvie LaBoissière and Frederic Robidoux and the staff of the McGill University and Génome Québec Innovation Centre, and the staff of the Copenhagen DNA laboratory, and Julie M. Cunningham, Sharon A. Windebank, Christopher A. Hilker, Jeffrey Meyer and the staff of Mayo Clinic Genotyping Core Facility.
Consortia members
GENICA Network. Hiltrud Brauch, Wing-Yee Lo, Christina Justenhoven: Dr. Margarete Fischer-Bosch-Institute of Clinical Pharmacology, Stuttgart, and University of Tübingen, Germany. Yon-Dschun Ko, Christian Baisch: Department of Internal Medicine, Evangelische Kliniken Bonn gGmbH, Johanniter Krankenhaus, Bonn, Germany. Hans-Peter Fischer: Institute of Pathology, University of Bonn, Bonn, Germany. Ute Hamann: Molecular Genetics of Breast Cancer, Deutsches Krebsforschungszentrum (DKFZ), Heidelberg, Germany. Thomas Brüning, Beate Pesch, Sylvia Rabstein, Anne Lotz: Institute of the Ruhr University Bochum (IPA), Bochum, Germany. Volker Harth: Institute for Occupational Medicine and Maritime Medicine, University Medical Center Hamburg-Eppendorf, Germany.
kConFab Investigators. See http://www.kconfab.org/Organisation/Members.aspx
AOCS. See http://www.aocstudy.org/org_coll.asp
References
- 1. et al. (2011) Global cancer statistics. CA Cancer J Clin 61: 69–90.
- 2. et al. (2000) Environmental and heritable factors in the causation of cancer—analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med 343: 78–85.
- 3. et al. (2012) Genome-wide association analysis identifies three new breast cancer susceptibility loci. Nat Genet 44: 312–318.
- 4. et al. (2013) Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nat Genet 45: 352–361e351.
- 5. (2004) Processing of primary microRNAs by the Microprocessor complex. Nature 432: 231–235.
- 6. et al. (2003) The nuclear RNase III Drosha initiates microRNA processing. Nature 425: 415–419.
- 7. et al. (2001) A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal RNA. Science 293: 834–838.
- 8. et al. (2004) Organization of the teicoplanin gene cluster in Actinoplanes teichomyceticus. Microbiology 150: 95–102.
- 9. (2008) Mechanisms of post-transcriptional regulation by microRNAs: are the answers in sight? Nat Rev Genet 9: 102–114.
- 10. (2011) MicroRNAs: Processing, Maturation, Target Recognition and Regulatory Functions. Mol Cell Pharmacol 3: 83–92.
- 11. (2010) The widespread regulation of microRNA biogenesis, function and decay. Nat Rev Genet 11: 597–610.
- 12. (2012) miRNAs in human cancer. Methods Mol Biol 822: 295–306.
- 13. (2009) Most mammalian mRNAs are conserved targets of microRNAs. Genome Res 19: 92–105.
- 14. (2013) MicroRNAs in Human Cancer. Adv Exp Med Biol 774: 1–20.
- 15. et al. (2013) microRNA-200c modulates the epithelial-to-mesenchymal transition in human renal cell carcinoma metastasis. Oncol Rep 30: 643–650.
- 16. et al. (2010) Involvement of miR-326 in chemotherapy resistance of breast cancer through modulating expression of multidrug resistance-associated protein 1. Biochem Pharmacol 79: 817–824.
- 17. (2013) Prognostic microRNA/mRNA signature from the integrated analysis of patients with invasive breast cancer. Proc Natl Acad Sci U S A 110: 7413–7417.
- 19. Comprehensive molecular portraits of human breast tumours. Nature 490: 61–70.
- 20. et al. (2008) Polymorphisms within micro-RNA-binding sites and risk of sporadic colorectal cancer. Carcinogenesis 29: 579–584.
- 21. et al. (2009) An miR-502-binding site single-nucleotide polymorphism in the 3'-untranslated region of the SET8 gene is associated with early age of breast cancer onset. Clin Cancer Res 15: 6292–6300.
- 22. (2010) Single nucleotide polymorphisms in miRNA binding sites and miRNA genes as breast/ovarian cancer risk modifiers in Jewish high-risk women. Int J Cancer 127: 589–597.
- 23. et al. (2010) Single-nucleotide polymorphisms inside microRNA target sites influence tumor susceptibility. Cancer Res 70: 2789–2798.
- 24. (2010) Patrocles: a database of polymorphic miRNA-mediated gene regulation in vertebrates. Nucleic Acids Res 38: D640–651.
- 25. et al. (2003) Familial risks, early-onset breast cancer, and BRCA1 and BRCA2 germline mutations. J Natl Cancer Inst 95: 448–457.
- 26. et al. (2006) Inconsistent association between the STK15 F31I genetic polymorphism and breast cancer risk. J Natl Cancer Inst 98: 1014–1018.
- 27. et al. (2007) A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nat Genet 39: 870–874.
- 28. et al. (2006) Association of the CASP10 V410I variant with reduced familial breast cancer risk and interaction with the CASP8 D302H variant. Carcinogenesis 27: 606–609.
- 29. et al. (2008) Risk of different histological types of postmenopausal breast cancer by type and regimen of menopausal hormone therapy. Int J Cancer 123: 933–941.
- 30. et al. (2011) A combined analysis of genome-wide association studies in breast cancer. Breast Cancer Res Treat 126: 717–727.
- 31. et al. (2010) NordicDB: a Nordic pool and portal for genome-wide control data. Eur J Hum Genet 18: 1322–1326.
- 32. et al. (2010) Genome-wide association study identifies five new breast cancer susceptibility loci. Nat Genet 42: 504–507.
- 33. et al. (2007) Genome-wide association study identifies novel breast cancer susceptibility loci. Nature 447: 1087–1093.
- 34. et al. (2010) Genetic variants in MicroRNA biosynthesis pathways and binding sites modify ovarian cancer risk, survival, and treatment response. Cancer Res 70: 9765–9776.
- 35. (2000) Interferon-gamma treatment elevates caspase-8 expression and sensitizes human breast tumor cells to a death receptor-induced mitochondria-operated apoptotic program. Cancer Res 60: 5673–5680.
- 36. et al. (2011) Genome-wide association study identifies three new melanoma susceptibility loci. Nat Genet 43: 1108–1113.
- 38. et al. (2007) A common coding variant in CASP8 is associated with breast cancer risk. Nat Genet 39: 352–358.
- 39. et al. (2011) Genetic polymorphisms and breast cancer risk: evidence from meta-analyses, pooled analyses, and genome-wide association studies. Breast Cancer Res Treat 127: 309–324.
- 40. et al. (2004) Association of a common variant of the CASP8 gene with reduced risk of breast cancer. J Natl Cancer Inst 96: 1866–1869.
- 41. et al. (2005) Re: Association of a common variant of the CASP8 gene with reduced risk of breast cancer. J Natl Cancer Inst 97: 1012; author reply 1012–1013.
- 43. et al. (2010) A metazoan ortholog of SpoT hydrolyzes ppGpp and functions in starvation responses. Nat Struct Mol Biol 17: 1188–1194.
- 44. et al. (2008) Systematic bioinformatic analysis of expression levels of 17,330 human genes across 9,783 samples from 175 types of healthy and pathological tissues. Genome Biol 9: R139.
- 45. (2005) Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120: 15–20.
- 46. (2007) The role of site accessibility in microRNA target recognition. Nat Genet 39: 1278–1284.
- 47. et al. (2011) Common genetic polymorphisms of microRNA biogenesis pathway genes and risk of breast cancer: a case-control study in Korea. Breast Cancer Res Treat 130: 939–951.
- 48. et al. (1988) L-myc cooperates with ras to transform primary rat embryo fibroblasts. Mol Cell Biol 8: 2668–2673.
- 49. et al. (2003) Amplification and overexpression of the L-MYC proto-oncogene in ovarian carcinomas. Am J Pathol 162: 1603–1610.
- 50. et al. (2011) Genetic variation in an miRNA-1827 binding site in MYCL1 alters susceptibility to small-cell lung cancer. Cancer Res 71: 5175–5181.
- 51. (2004) Molecular cloning and characterization of Mustang, a novel nuclear protein expressed during skeletal development and regeneration. FASEB J 18: 52–61.
- 52. et al. (2012) Gene Expression Atlas update—a value-added database of microarray and sequencing-based functional genomics experiments. Nucleic Acids Res 40: D1077–1081.
- 53. (2011) Practical Aspects of microRNA Target Prediction. Curr Mol Med 11: 93–109.
