The Molecular Taxonomy of Primary Prostate Cancer.
Journal: 2016/February - Cell
ISSN: 1097-4172
Abstract:
There is substantial heterogeneity among primary prostate cancers, evident in the spectrum of molecular abnormalities and its variable clinical course. As part of The Cancer Genome Atlas (TCGA), we present a comprehensive molecular analysis of 333 primary prostate carcinomas. Our results revealed a molecular taxonomy in which 74% of these tumors fell into one of seven subtypes defined by specific gene fusions (ERG, ETV1/4, and FLI1) or mutations (SPOP, FOXA1, and IDH1). Epigenetic profiles showed substantial heterogeneity, including an IDH1 mutant subset with a methylator phenotype. Androgen receptor (AR) activity varied widely and in a subtype-specific manner, with SPOP and FOXA1 mutant tumors having the highest levels of AR-induced transcripts. 25% of the prostate cancers had a presumed actionable lesion in the PI3K or MAPK signaling pathways, and DNA repair genes were inactivated in 19%. Our analysis reveals molecular heterogeneity among primary prostate cancers, as well as potentially actionable molecular defects.
Cell 163(4): 1011-1025

Introduction

Prostate cancer is the second most common cancer in men and the fourth most common tumor type worldwide (Ferlay et al., 2013). It is estimated that in 2015, prostate cancer will be diagnosed in 220,800 men in the United States alone and that 27,540 will die of their disease (Siegel et al., 2015). Multiple genetic and demographic factors, including age, family history, genetic susceptibility, and race, contribute to the high incidence of prostate cancer (Al Olama et al., 2014).

In the current era of PSA screening, nearly 90% of prostate cancers are clinically localized at the time of their diagnosis (Penney et al., 2013). The clinical behavior of localized prostate cancer is highly variable – while some men will have aggressive cancer leading to metastasis and death from the disease, many others will have indolent cancers that are cured with initial therapy or may be safely observed. Multiple risk stratification systems have been developed, combining the best currently available clinical and pathological parameters (such as Gleason score, PSA levels, clinical and pathological staging); however, these tools still do not adequately predict outcome (Cooperberg et al., 2009, D’Amico et al., 1998, Kattan et al., 1998). Further risk stratification using molecular features could potentially help distinguish indolent from aggressive prostate cancer.

Molecular and genetic profiles are increasingly being used to subtype cancers of all types and to guide selection of more precisely targeted therapeutic interventions. Several recent studies have explored the molecular basis of primary prostate cancer and have identified multiple recurrent genomic alterations that include mutations, DNA copy-number changes, rearrangements, and gene fusions (Baca et al., 2013, Barbieri et al., 2012, Berger et al., 2011, Lapointe et al., 2007, Pflueger et al., 2011, Taylor et al., 2010, Tomlins et al., 2007, Wang et al., 2011). The most common alterations in prostate cancer genomes are fusions of androgen-regulated promoters with ERG and other members of the ETS family of transcription factors. In particular, the TMPRSS2-ERG fusion is the most common molecular alteration in prostate cancer (Tomlins et al., 2005), being found in between 40 and 50% of prostate tumor foci, translating to more than 100,000 cases annually in the United States (Tomlins et al., 2009). Nevertheless, among treated prostate cancers, and despite extensive study, patients with fusion-bearing tumors do not appear to have a significantly different prognosis following prostatectomy than those without (Gopalan et al., 2009, Pettersson et al., 2012). Prostate cancers also have varying degrees of DNA copy-number alteration; indolent and low-Gleason tumors have few alterations, whereas more aggressive primary and metastatic tumors have extensive burdens of copy-number alteration genome-wide (Taylor et al., 2010, Hieronymus et al., 2014, Lalonde et al., 2014). In contrast, somatic point mutations are less common in prostate cancer than in most other solid tumors. The most frequently mutated genes in primary prostate cancers are SPOP, TP53, FOXA1, and PTEN (Barbieri et al., 2012). 
Only recently has the spectrum of epigenetic changes in prostate cancer genomes been explored (Börno et al., 2012, Friedlander et al., 2012, Kim et al., 2011, Kobayashi et al., 2012, Mahapatra et al., 2012).

Importantly, no studies have comprehensively integrated diverse omics data types to assess the robustness of previously defined subtypes and potentially prognostic alterations. Here, to gain further insight into the molecular-genetic heterogeneity of primary prostate cancer and to establish a molecular taxonomy of the disease for future diagnostic, prognostic, and therapeutic stratification, the TCGA Network has comprehensively characterized 333 primary prostate cancers using seven genomic platforms. This analysis reveals novel molecular features that provide a better understanding of this disease and suggest potential therapeutic strategies.

Results

Cohort and platforms

The cohort of primary prostate cancers analyzed resulted from extensive pathologic, analytical, and quality control review, yielding 333 tumors from 425 available cases. Images of frozen tissue were evaluated by multiple expert genitourinary pathologists and cases were excluded if no tumor cells were identifiable in the sample, or if there was evidence of significant RNA degradation (Figure S1, Supplementary Methods). For the subset of cases reviewed by two pathologists, tumor cellularity estimates were within 20% of each other in 71% of cases. In total, 78% of Gleason scores were concordant within one grade of the secondary pattern (Supplementary Methods). Moreover, due to the challenge of acquiring primary prostate cancer specimens of high tumor cellularity, we also performed a multi-platform analysis of tumor content, estimating tumor purity with analytical approaches utilizing both DNA (Carter et al., 2012, Prandi et al., 2014) and RNA (Quon et al., 2013, Ahn et al., 2013) sequencing data. The molecular and pathologic estimates are presented in Table S1A and Figure S1. The clinical and pathological characteristics of the final cohort are presented in Table 1. The average follow-up time following radical prostatectomy was just under two years, which precluded outcomes analysis due to the long natural history of primary prostate cancer.

Table 1. Cohort Characteristics

Clinical feature
Age                              61 (43–76)
Pre-operative PSA                7.4 (1.6–87.0)
Gleason Score
  3+3                            65
  3+4                            102
  4+3                            78
  ≥ 8                            88
Tumor Cellularity (pathology)
  <20%                           7
  21–40%                         40
  41–60%                         84
  61–80%                         115
  81–100%                        87
Pathologic Stage
  pT2a/b                         18
  pT2c                           111
  pT3a                           110
  pT3b                           82
  pT4                            6
PSA Recurrence
  Yes                            33
  No [a]                         248
  Not Available                  47
Margin Status
  Positive                       69
  Negative                       193
  Not Available                  71
Ethnicity
  Caucasian                      270
  African descent                43
  Asian                          8
  Not Available                  12

[a] Either no evidence of recurrence or insufficient follow-up

We characterized isolated biomolecules from these 333 tumor samples using four platforms: whole-exome sequencing for somatic mutations, array-based methods for profiling both somatic copy-number changes and DNA methylation, and messenger RNA (mRNA) sequencing. We also performed microRNA (miRNA) sequencing on 330 of these samples, reverse-phase protein array (RPPA) on 152 samples, and low-pass and high-pass whole-genome sequencing (WGS) on 100 and 19 tumor/normal pairs, respectively (Supplementary Methods). For 19 samples, non-malignant adjacent prostate samples were also examined for DNA methylation and RNA/miRNA expression analyses.

The molecular taxonomy of primary prostate cancer

Previous studies indicate that many genomically distinct subsets of prostate cancer exist. These are driven in some cases by frequent events, such as androgen-regulated fusions of ERG and other ETS-family members, or recurrent SPOP mutations, and in other cases by less common genomic aberrations. Given the comprehensive nature of our data, we sought to unify these disparate findings to establish a molecular taxonomy of primary disease that integrates results from somatic mutations, gene fusions, somatic copy-number alterations (SCNA), gene expression, and DNA methylation. We first performed unsupervised clustering of data from each molecular platform as well as integrative clustering using iCluster (Shen et al., 2009). These analyses uncovered both known and novel associations, with 74% of all tumors being assignable to one of seven molecular classes based on distinct oncogenic drivers: fusions involving 1) ERG, 2) ETV1, 3) ETV4, or 4) FLI1 (46%, 8%, 4%, and 1%, respectively), or mutations in 5) SPOP, 6) FOXA1, or 7) IDH1 (11%, 3%, and 1%, respectively) (Figures 1 and S2, Table S1A).
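
The seven-class assignment described above can be illustrated with a simple priority rule. The sketch below is not the TCGA pipeline: the function names, the input layout, and the precedence order (classes checked in the order they are listed in the text, so a tumor with both SPOP and FOXA1 mutations is labeled SPOP) are assumptions made for illustration, and the toy calls are invented.

```python
from collections import Counter

# Order in which class-defining lesions are checked (assumed precedence,
# following the order in which the subtypes are listed in the text).
SUBTYPE_ORDER = ["ERG", "ETV1", "ETV4", "FLI1", "SPOP", "FOXA1", "IDH1"]
FUSION_GENES = {"ERG", "ETV1", "ETV4", "FLI1"}


def assign_subtype(fusions, mutations):
    """Return the first matching subtype, or "other" if none applies.

    fusions   -- set of ETS genes with a fusion call
    mutations -- set of genes carrying a class-defining mutation
    """
    for gene in SUBTYPE_ORDER:
        calls = fusions if gene in FUSION_GENES else mutations
        if gene in calls:
            return gene
    return "other"


def subtype_fractions(tumors):
    """Fraction of tumors per subtype for a list of (fusions, mutations)."""
    counts = Counter(assign_subtype(f, m) for f, m in tumors)
    return {name: n / len(tumors) for name, n in counts.items()}


# Toy cohort (hypothetical calls, not TCGA data).
toy_cohort = [
    ({"ERG"}, set()),
    ({"ERG"}, {"IDH1"}),
    (set(), {"SPOP", "FOXA1"}),  # SPOP is checked before FOXA1
    (set(), set()),              # no class-defining lesion
]
print(subtype_fractions(toy_cohort))
```

A fixed precedence keeps every tumor in exactly one class, which is what allows the class percentages to be summed into a single assignable fraction.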

Figure 1. The molecular taxonomy of primary prostate cancer

Comprehensive molecular profiling of 333 primary prostate cancer samples revealed seven genomically distinct subtypes, defined (top to bottom) by ERG fusions (46%), ETV1/ETV4/FLI1 fusions or overexpression (8%, 4%, 1%, respectively), or by SPOP (11%), FOXA1 (3%), and IDH1 (1%) mutations. A subset of these subtypes was correlated with clusters computationally derived from the individual characterization platforms (somatic copy-number alterations, methylation, mRNA, microRNA, and protein levels from reverse phase protein arrays). The heatmap shows DNA copy-number for all cases, with chromosomes shown from left to right. Regions of loss are indicated by shades of blue, and gains by shades of red.

See also Figures S1, S2, S3, S4, S5, S6, S7, and Tables S1A, S1B, S1E, S2.

In total, 53% of tumors were found to have ETS-family gene fusions (ERG, ETV1, ETV4, and FLI1) after analysis with two complementary algorithms (Sboner et al., 2010, Wang et al., 2010) (see Methods). While TMPRSS2 was the most frequent fusion partner in all ETS fusions, we identified fusions with other previously described androgen-regulated 5′ partner genes, including SLC45A3 and NDRG1 (Table S5). We also identified several tumors that overexpressed full-length ETS transcripts that were mutually exclusive with ETS fusions (12 ETV1 high tumors, 6 ETV4, and 2 FLI1) (Table S5). ETS overexpression in these cases could possibly be mediated via epigenetic mechanisms or cryptic translocations of the entire gene locus to a transcriptionally active neighborhood. In the one case with elevated ETV1 full-length expression studied by whole-genome sequencing, we identified a cryptic genomic rearrangement 3′ of the ETV1 locus with a region on chromosome 14 near the MIPOL1 gene adjacent to FOXA1. This event is similar to previously described ETV1 translocations in LNCaP and MDA-PCa2b cell lines and in patient samples (Tomlins et al., 2007, Gasi et al., 2011). Overall, while fusions in the four genes were mostly mutually exclusive, three tumors showed evidence for fusions involving more than one of these genes (Table S5). Given that histologically defined single tumor foci have been shown to be rarely composed of different ETS fusion positive clones (Cooper et al., 2015, Kunju et al., 2014, Pflueger et al., 2011), it is likely these cases reflect convergent phenotypic evolution in clonally heterogeneous tumors. Tumors defined by SPOP mutations were mutually exclusive with all ETS fusion-positive cases, though four of the SPOP-mutant tumors also possessed FOXA1 mutations. In all four of these tumors, both the SPOP and FOXA1 mutations were clonal, indicating they are present in the same tumor cells.

Beyond the class-defining lesions, there were multiple patterns of both known and novel concurrent alterations in key prostate cancer genes. The former included the preponderance of PTEN deletions in ERG fusion-positive cases (Taylor et al., 2010). Similarly, SPOP mutations have previously been found to occur in approximately 10% of clinically localized prostate cancers, were mutually exclusive of tumors defined by ETS rearrangements, and may designate a distinct molecular class of disease based primarily on distinctive SCNA profiles (including deletion of CHD1, 6q, and 2q) (Barbieri et al., 2012, Blattner et al., 2014). Beyond reaffirming these known patterns, our taxonomy revealed new relationships and subtypes. Specifically, the SPOP-mutant/CHD1-deleted subset of prostate cancers had notable molecular features, including elevated levels of DNA methylation, homogeneous gene expression patterns, as well as frequent overexpression of SPINK1 mRNA, supporting SPOP mutation as a key feature in the molecular taxonomy of prostate cancer. Interestingly, mRNA, copy-number, and methylation profiles were similar in tumors with FOXA1 mutations and those with SPOP mutations. Furthermore, we identified a new genomically distinct subtype of prostate cancer defined by hotspot mutations in IDH1, described in greater depth below.

Despite this detailed molecular taxonomy of primary prostate cancers, 26% of all tumors studied appeared to be driven by still-occult molecular abnormalities or by one or more frequent alterations that co-occur with the genomically defined classes. Some of these tumors showed a high burden of copy-number alterations or DNA hypermethylation. Enrichment analysis indicated that this subset of tumors was enriched for mutations in TP53, KDM6A, and KMT2D; deletions of chromosomes 6 and 16; and amplifications of chromosomes 8 (spanning MYC) and 11 (CCND1) (Table S2). To characterize this group further, we performed whole genome sequencing of 19 tumor specimens and their matched normal tissues, a subset of which had high tumor cellularity but still lacked DNA copy-number alterations or any known or presumed driver lesions. Interestingly, no occult driver abnormalities or highly recurrent regulatory mutations were identified, such as the TERT promoter mutation common to many other tumor types (Khurana et al., 2013). Therefore, a significant (up to 26%) subset of primary prostate cancers of both good and poor clinical prognosis (including those with Gleason scores of >8) are driven by as yet unexplained molecular alterations.

mRNA clusters were tightly correlated with ETS fusion status, where mRNA cluster 1 consisted primarily of ETS-negative tumors and mRNA clusters 2 and 3 were split among ETS fusion-positive tumors (Figure 1, Figure S4). miRNA clustering showed a similar pattern, revealing a general difference in miRNA expression between ETS-positive and negative tumors (Figure 1, Figure S6). Clustering of RPPA data identified three distinct subgroups, with cluster 3 exhibiting elevated PI3K/AKT, MAP-Kinase and receptor tyrosine kinase activity (Figure S7A). The cluster was not enriched, however, in genomic alterations in these pathways, and in general there was little correlation of increased pathway activity (as measured by phospho-AKT and other downstream phospho-proteins) with the frequent genomic alterations in the pathways (see the example of PTEN deletions in Figure S7B).

Recurrently altered genes and their patterns across subtypes

The overall mutational burden of the cohort, inferred from whole-exome sequencing, was 0.94 mutations per megabase (median, range 0.04 – 28 per megabase), which corresponds to 19 non-synonymous mutations per tumor genome (median; 13–25, 25th and 75th percentiles respectively). This is consistent with prior exome and genome-scale sequencing results for localized prostate cancers (Barbieri et al., 2012, Baca et al., 2013), and is lower than the mutational burden of metastatic prostate cancers (Gundem et al., 2015, Grasso et al., 2012, Robinson et al., 2015). These results reaffirm that prostate cancer possesses a lower mutational burden than many other epithelial tumor types that are not associated with a strong exogenous mutagen (Alexandrov et al., 2013, Lawrence et al., 2013). Prior exome sequencing of 112 prostate cancers identified 12 recurrently mutated genes through focused assessment of point mutations and short insertions and deletions (Barbieri et al., 2012). By comparison, mutational significance analysis of these 333 tumor-normal pairs by MutSigCV (Lawrence et al., 2013, 2014) identified 13 significantly mutated genes (q-value < 0.1), seven of which had not been previously identified (Figure 2 and Tables S1B–S1C). Among the significantly mutated genes, SPOP, TP53, FOXA1, PTEN, MED12, and CDKN1B were previously identified as recurrently mutated. Additional clinically relevant genes were identified with lower mutation frequencies; these included genes within canonical kinase signaling pathways (BRAF, HRAS, AKT1), the beta-catenin pathway (CTNNB1), and the DNA repair pathway (ATM). The rate of BRAF mutations (2.4%) seen in this study is higher than previously reported; these include several known activating mutations, but curiously not the canonical V600E hotspot. We identified no BRAF fusions, which had previously been reported in a subset of clinically advanced prostate cancer (Palanisamy et al., 2010). 
NKX3-1, previously implicated in familial prostate cancer syndromes and often found to be deleted, was also somatically mutated in this cohort (1% of tumors). While its functional significance is unknown, ZMYM3, an epigenetic regulatory protein not previously implicated in prostate cancer but infrequently mutated in Ewing sarcomas (Tirode et al., 2014) and various pediatric cancers (Huether et al., 2014), was also recurrently mutated (2% of tumors). Genes with known biological relevance that were mutated at frequencies just below the threshold of significance (q-value < 0.01) included KMT2C (MLL3), KMT2D (MLL4), APC, IDH1, and PIK3CA (Figure 2 and Tables S1B–S1C). Mutations in the tumor suppressor genes KMT2C, KMT2D, and APC were mostly truncating; the IDH1 and PIK3CA mutations occurred in previously characterized hotspots and thus may have therapeutic relevance for those occasional tumors with these mutations.
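
The mutational burden quoted above is a rate, mutations per megabase of sequence, summarized across the cohort by its median. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the 30 Mb covered territory and the mutation counts in the toy data are illustrative placeholders, not values from this study.

```python
from statistics import median


def mutations_per_mb(n_mutations, covered_bases):
    """Mutation burden as mutations per megabase of covered sequence."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return n_mutations / (covered_bases / 1_000_000)


def cohort_median_burden(samples):
    """Median burden over (n_mutations, covered_bases) pairs."""
    return median(mutations_per_mb(n, b) for n, b in samples)


# Toy samples: (mutation count, covered bases). Hypothetical values.
toy_samples = [
    (15, 30_000_000),
    (30, 30_000_000),
    (60, 30_000_000),
]
print(cohort_median_burden(toy_samples))
```

Reporting the median rather than the mean keeps a few hypermutated tumors at the top of the range from dominating the cohort summary.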

Figure 2. Recurrent alterations in primary prostate cancer

The spectrum and type of recurrent alterations and genes (mutations, fusions, deletions, and overexpression) in the cohort are shown (left to right) grouped by the molecular subtypes defined in Figure 1. On the right, the statistical significance of individual mutant genes (MutSig q-value) is shown. Mutations in IDH1, PIK3CA, RB1, KMT2D, CHD1, BRCA2, and CDK12 are also shown, despite their not being statistically significant. SPINK1 overexpression is shown for reference.

See also Tables S1B, S1C, S1D, S1E.

Notwithstanding these key somatic mutations, the most frequent molecular abnormalities involved chromosomal arm-level copy-number alterations (Taylor et al., 2010). These alterations included recurrent genomic gains of chromosome 7 and 8q and heterozygous losses of 8p, 13q, 16q and 18 (Figure S3A). Significance analysis of recurrent focal DNA copy-number alterations revealed 20 amplifications and 35 deletions (q-value < 0.25, GISTIC 2.0; Figure S3A and Table S1D). Recurrent focal amplifications included those spanning known oncogenes such as CCND1 (11q13.2, 2%), MYC (8q24.21, 8%), as well as FGFR1 and WHSC1L1 (8p11.23, 8%). Recurrent focal deletions were much more common: homozygous deletions spanning the PTEN locus occurred at one of the highest rates of any tumor type studied thus far (15%). Focal deletions of the region between the TMPRSS2 and ERG genes on 21q22.3, which result in TMPRSS2-ERG fusions, were unique to prostate cancers as expected. Other focal deletions include those spanning tumor suppressors TP53 (17p13.1), CDKN1B (12p13.1), MAP3K1 (5q11.2), FANCD2 (3p26), as well as SPOPL (2q22.1) and the complex locus spanning FOXP1/RYBP/SHQ1 (3p13). MAP3K7 (6q12-22) was also frequently deleted, along with deletion of CHD1 (5q15-q21); co-deletion of these loci has been associated with aggressive ETS-negative prostate cancer (Kluth et al., 2013, Rodrigues et al., 2015).

As the pattern and extent of SCNAs in prostate cancer genomes have been associated with probability of disease recurrence and metastasis in primary prostate cancers (Taylor et al., 2010, Hieronymus et al., 2014, van Dekken et al., 2004, Paris et al., 2004), we sought to identify similar structure in the burden of SCNAs by performing hierarchical clustering of arm-level alterations. We identified three major groups of prostate cancers, one with mostly unaltered genomes (hereafter referred to as quiet), a second group encompassing 50% of all tumors with an intermediate level of SCNAs, and a third group with a high burden of arm level genomic gains and losses (Figure S3B–S3C). While a formal outcome analysis was not possible due to the limited clinical follow-up available for this cohort, the subset of tumors with the greatest burden of SCNAs had significantly higher Gleason scores and PSA levels than the other two groups (Figure S3B–S3D). The tumors in this group also had significantly higher tumor cellularity (Figure S3C).
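
To make the idea of an arm-level SCNA burden concrete, the sketch below computes the fraction of chromosome arms called as gained or lost and bins tumors into three groups. It substitutes fixed cutoffs for the hierarchical clustering used in the analysis, so it is a simplified stand-in rather than the method itself. The function names, the cutoff values, and the toy arm calls are all hypothetical.

```python
def arm_burden(arm_calls):
    """Fraction of chromosome arms with a gain (+1) or loss (-1).

    arm_calls -- dict mapping arm name to -1 (loss), 0 (neutral), +1 (gain)
    """
    if not arm_calls:
        raise ValueError("no arm calls supplied")
    altered = sum(1 for call in arm_calls.values() if call != 0)
    return altered / len(arm_calls)


def burden_group(fraction, low=0.05, high=0.30):
    """Bin a burden fraction into quiet / intermediate / high.

    The cutoffs are hypothetical placeholders, not values from the study.
    """
    if fraction <= low:
        return "quiet"
    if fraction <= high:
        return "intermediate"
    return "high"


# Toy tumor with gains of 7p, 7q, 8q and loss of 8p (illustrative only).
toy_arms = {
    "1p": 0, "1q": 0, "7p": 1, "7q": 1, "8p": -1,
    "8q": 1, "13q": 0, "16q": 0, "18p": 0, "18q": 0,
}
print(arm_burden(toy_arms), burden_group(arm_burden(toy_arms)))
```

Unlike clustering, fixed cutoffs ignore which arms are altered, so this only approximates the quiet, intermediate, and high-burden grouping.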

Epigenetic changes define molecularly distinct subtypes of prostate cancer

Integrative analysis of genetic and epigenetic changes revealed a diversity of DNA methylation changes that defined molecularly distinct subsets of primary prostate cancer (Figure 3). Unsupervised hierarchical clustering of the most variably hypermethylated CpGs identified four epigenetically distinct groups of prostate cancers (Figure S5A–S5B). When integrated with the molecular taxonomy defined above, we found a number of striking associations. Among these was a notable pattern within ERG fusion-positive tumors. Specifically, while nearly two-thirds of all ERG fusion-positive tumors belonged to an unsupervised cluster with only moderately elevated DNA methylation (DNA methylation cluster 3), the remaining ERG fusion-positive tumors comprised a distinct hypermethylated cluster (cluster 1) that was almost exclusively associated with ERG fusions. On average, this cluster contained twice the number of hypermethylated loci as DNA methylation cluster 3 (Figure S5A), and the epigenetic patterns were largely distinct from those of ETV1 and ETV4 fusion-positive tumors, which showed more heterogeneous methylation. What drives these epigenetically distinct groups of ETS fusion-positive tumors is unknown, but there is considerable diversity in their DNA methylation profiles that may reflect altered epigenetic silencing (Figure S5A–S5B). Together, these results support further ETS fusion-based subtyping of disease, but also reveal a greater molecular and likely biological diversity among ERG fusion-positive tumors than previously appreciated. Likewise, these results are consistent with in vivo mouse modeling and expression profiling studies that suggest important molecular and clinicopathological differences between ERG and non-ERG ETS fusion positive tumors (Baena et al., 2013, Tomlins et al., 2015).
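
Clustering of this kind starts from a selected set of loci. One plausible way to pick hypermethylated CpGs, shown here purely as an assumption, is to rank loci by the difference between the mean tumor beta-value and the mean normal beta-value and keep the top N. The function name, the ranking criterion, and the locus identifiers are hypothetical and are not taken from the study's methods.

```python
from statistics import mean


def top_hypermethylated(tumor_beta, normal_beta, n):
    """Return the n locus IDs with the largest tumor-minus-normal mean beta.

    tumor_beta, normal_beta -- dicts mapping locus ID to a list of
    beta-values (0 to 1), one value per sample.
    """
    shared = [locus for locus in tumor_beta if locus in normal_beta]
    delta = {
        locus: mean(tumor_beta[locus]) - mean(normal_beta[locus])
        for locus in shared
    }
    return sorted(shared, key=lambda locus: delta[locus], reverse=True)[:n]


# Toy beta-values for three made-up loci.
toy_tumor = {"cg_a": [0.9, 0.8], "cg_b": [0.2, 0.3], "cg_c": [0.6, 0.7]}
toy_normal = {"cg_a": [0.1, 0.1], "cg_b": [0.2, 0.2], "cg_c": [0.5, 0.5]}
print(top_hypermethylated(toy_tumor, toy_normal, 2))
```

The selected loci would then form the feature matrix passed to an unsupervised clustering step.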

Figure 3. Hypermethylation is common across primary prostate cancer

A. Primary prostate cancers show diverse methylation changes compared to normal prostate samples (left). Unsupervised clustering was performed on the beta-values of the 5,000 most hypermethylated loci, and the results mapped to the genomic subtypes. ERG-positive tumors had a high diversity of methylation changes, with a distinct subgroup (cluster 1) nearly unique to this group. SPOP and FOXA1 mutant tumors also exhibited global hypermethylation. B. IDH1-mutant prostate cancers, which are associated with younger age, are among the most hypermethylated tumors, as in Glioblastoma (GBM) and Acute Myeloid Leukemia (AML).

See also Figure S4 and Table S1F.

SPOP and FOXA1-mutant tumors exhibited homogeneous epigenetic profiles. These tumors belonged almost exclusively to DNA methylation cluster 2, a group that also contained a majority of the ETV1 and ETV4 but not ERG-positive tumors. Lastly, the IDH1-mutant tumors were notable given their strongly elevated levels of genome-wide DNA hypermethylation (Figure S5B). While of low incidence, these IDH1 R132-mutant tumors defined a distinct subgroup of what appears to be early onset prostate cancer (Figure 3B) that possesses fewer DNA copy-number alterations (see Figure 1) or other canonical genomic lesions that are common to most other prostate cancers. IDH1 and IDH2 mutations have been associated with a DNA methylation phenotype in other tumor types, most notably in gliomas (Noushmehr et al., 2010) and acute myeloid leukemias (AML, Figueroa et al., 2010). Curiously, IDH1-mutant prostate cancers possessed even greater levels of genome-wide hypermethylation than do either glioma or AML IDH1-mutant tumors (Figure 3B). After further investigating DNA methylation differences between IDH-mutant and wild type tumors among prostate cancers, gliomas, and AMLs, we found that hypermethylated loci were specific to the cancer type rather than IDH mutants (Figure S5F).

Integrating these epigenetic data with mRNA expression levels, we identified 164 genes that were epigenetically silenced in subsets of the cohort (Figure S5C, Table S1F). These silenced genes were significantly enriched for genes previously found to be differentially expressed in prostate cancer; specifically, genes that are down-regulated in metastatic prostate cancer (Chandran et al., 2007) and genes involved in prostate organ development (Schaeffer et al., 2008) (q-value < 2.0·10). These 164 silenced genes displayed heterogeneous frequencies of epigenetic silencing across the cohort. For example, SHF, FAXDC2, GSTP1, ZNF154, and KLF8 were epigenetically silenced in almost all tumors (>85%) whereas STAT6 was silenced predominantly in ETS fusion-positive tumors and not in SPOP and IDH1-mutant tumors. Conversely, HEXA was silenced preferentially in SPOP-mutant tumors compared to ERG fusion-positive tumors (86.5 versus 14.5% respectively, p<5.4·10). Consistent with their increased DNA hypermethylation, the IDH1-mutant prostate tumors also possessed the greatest number of epigenetically silenced genes among all prostate tumors (Table S1F).

AR activity is variable in primary prostate cancers

The androgen receptor (AR) regulates normal prostate development as well as critical growth and survival programs in prostate carcinoma. Primary prostate cancer is androgen dependent, and androgen activity is a central axis in prostate cancer pathogenesis, driving the creation and overexpression of most ETS fusion genes (Lin et al., 2011, Mani et al., 2009, Tomlins et al., 2005). However, the extent to which individual primary prostate cancers differ in androgen sensitivity or dependence is unknown, and the issue has translational implications because AR targeting is therapeutically important. To address these questions, we sought to infer the AR output of tumors by calculating an AR activity score from the expression pattern of 20 genes that are experimentally validated AR transcriptional targets (Hieronymus et al., 2006). This score suggested that a broad spectrum of AR activity exists across all prostate tumors as well as between genomic subtypes (Figure 4A). Although ETS fusion genes are under AR control, the ETS fusion-positive groups had variable AR transcriptional activity. In contrast, we found that tumors with SPOP or FOXA1 mutations had the highest AR transcriptional activity of all genotypically distinct subsets of prostate cancer (p = 1.1·10 and 0.04, respectively, t-test). Consistent with this, SPOP mutations have been previously implicated in androgen signaling in model systems, since both AR and AR coactivators are substrates deregulated by SPOP mutation (Geng et al. 2013, An et al., 2014, Geng et al. 2014), providing a possible explanation for the associated increase in AR activity seen in this subtype of prostate cancers.
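
A signature-based activity score of this kind can be sketched as follows. The exact formula is not given in this section, so the version below assumes a common construction: standardize each target gene across the cohort and sum the z-scores per tumor. The function names are hypothetical, the expression values are invented, and the two genes are used only as examples of AR-regulated genes.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of expression values across tumors."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def ar_activity_scores(expression, target_genes):
    """Sum of per-gene z-scores over the target genes, one score per tumor.

    expression -- dict mapping gene name to a list of values, with
    tumors in the same order for every gene.
    """
    per_gene = [zscores(expression[gene]) for gene in target_genes]
    return [sum(column) for column in zip(*per_gene)]


# Toy expression matrix for three tumors (illustrative values only).
toy_expression = {
    "KLK3": [1.0, 2.0, 3.0],
    "TMPRSS2": [2.0, 4.0, 6.0],
}
toy_scores = ar_activity_scores(toy_expression, ["KLK3", "TMPRSS2"])
print(toy_scores)
```

Standardizing each gene first prevents highly expressed targets from dominating the sum, so the score reflects coordinated induction across the signature.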

Figure 4. The diversity of androgen receptor activity in primary prostate cancer

A. Androgen receptor activity, as inferred by the induction of AR target genes, was significantly increased in SPOP and FOXA1 mutant tumors when compared to normal prostate or ERG-positive tumors. This increase in activity cannot be fully explained by AR mRNA or protein levels. B. Multiple known AR splice variants were detected in benign prostate (left) and primary prostate cancer (right), with the AR-V7 variant detected in 50% of tumors. C. Real-time qPCR comparison of AR-V7 in 74 tumor samples (grey) and 5 adjacent-normal samples (blue). D. FOXA1 missense mutations were clustered in the forkhead domain, mostly in residues that do not form contacts with DNA (see also the 3-D structure in panel E).

While AR transcriptional output is a proxy for ligand-driven AR activity in many tumors, AR transcript variants have been described that encode truncated AR proteins that lack the ligand-binding domain and hence are capable of activating AR target genes in the absence of androgens (Dehm et al., 2008, Watson et al., 2010). Using RNA sequencing reads that spanned the splice junctions unique to each AR variant, we quantified the expression of these AR transcript variants. This analysis revealed that several AR splice variants, most notably AR-V7, can be detected at low levels in primary tumors and, in a few cases, in adjacent benign prostate tissue (Figure 4B), and we validated these expression levels with qPCR (Figure 4C). However, their expression was not associated with differential expression of known AR target genes or with the seven previously defined genomic subtypes. Most detected splice forms were truncated after the DNA-binding domain by the presence of a cryptic exon rather than by skipping those exons encoding the ligand-binding domain. Truncated AR splice variants were previously assumed to be expressed primarily in metastatic castration-resistant prostate cancers, where, at least for AR-V7, their presence was associated with resistance to hormone therapy (Antonarakis et al., 2014). Hence, our finding that they are expressed in hormone-naive primary prostate cancers is notable.

In prostate cancers, the degree of AR pathway output is controlled not only by AR mRNA and protein expression levels, but also by expression of and mutations in AR cofactors (Heemers et al., 2007). It is therefore notable that FOXA1 was recurrently mutated in our cohort, as it is a pioneering transcription factor that targets AR and has a demonstrated role in prostate cancer oncogenesis (Jin et al., 2013). We identified FOXA1 mutations in 4% of the primary prostate cancers studied here, which is similar to the mutation frequency observed previously (Barbieri et al., 2012, Grasso et al., 2012) (Figure 4A). While a subset of these mutations were present in tumors that also possessed SPOP mutations and had elevated levels of AR output, FOXA1 mutations were mutually exclusive with all other alterations that define the genomic subclasses described here. While there were some truncating mutations near the C-terminus and the C-terminal part of the forkhead domain, the majority of the mutations found here and in other prostate cancer cohorts were missense mutations that primarily affect the winged-helix DNA binding domain of FOXA1. Curiously, these mutations do not directly alter FOXA1 DNA-binding residues (Figure 4D–E), a pattern similar to the FOXA1 mutations recently found in lobular breast cancers (TCGA, unpublished), which suggests that the impact of FOXA1 mutations has less to do with altering DNA binding than with disrupting or altering interactions with other chromatin-bound cofactors.

Clinically actionable DNA repair defects in primary prostate cancers

Prior data indicate that several DNA repair pathways are disrupted in a subset of prostate cancers (Karanika et al., 2014, Pritchard et al., 2014). Moreover, the PARP inhibitor olaparib is effective in some patients with prostate cancer (Mateo et al., 2014). Here, we found inactivation of several DNA repair genes that collectively affected 19% of patients (Figure 5A). While we found only one inactivating BRCA1 germline mutation, a frameshift at V923 caused by a 4bp deletion (Clinvar RCV000083190.3), BRCA2 inactivation affected 3% of tumors, including both germline and somatic truncating mutations. All six BRCA2 germline mutations were K3326*, a C-terminal truncating mutation with debated functional impact but increased prevalence in several tumor types (Farrugia et al., 2008, Martin et al., 2005, Delahaye-Sourdeix et al., 2015). Two additional tumors possessed focal BRCA2 homozygous deletions that were accompanied by very low BRCA2 transcript expression. Four tumors (1%) possessed either loss-of-function mutations or homozygous deletion of CDK12, a gene that has been implicated in DNA repair by regulating expression levels of several DNA damage response genes (Blazek et al., 2011) and is recurrently mutated in metastatic prostate cancer (Grasso et al., 2012). ATM, an apical kinase of the DNA damage response, which is activated by the Mre11 complex and mediates downstream checkpoint signaling, was affected by a nonsense mutation in one case and by a likely kinase-dead hotspot N2875 mutation in two cases. FANCD2 was similarly affected by diverse uncommon lesions including a truncating mutation in one tumor, homozygous deletion in two tumors, and focal heterozygous losses in 6% of the cohort (Figure 5B). RAD51C (3%) was affected by focal DNA losses, most of which were heterozygous. Finally, it was notable that heterozygous losses of BRCA2 (13q13.1) almost always coincided with concurrent loss of the distant RB1 tumor suppressor gene (13q14.2) (Figure S3D). 
The observation that nearly 20% of primary prostate cancers bear genomic defects involving DNA repair pathways is remarkably consistent with the recently announced TOPARP-A Phase II trial results in patients with metastatic castration-resistant prostate cancer indicating that clinical responses to the PARP inhibitor olaparib likely occurred in the subgroup of tumors bearing defects in DNA repair genes (Mateo et al., 2014, Robinson et al., 2015).
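The collective 19% figure is a union across genes: a tumor is counted once even when more than one repair gene is altered. A minimal sketch of that tally follows; the function name, sample identifiers, and toy cohort are hypothetical and are not taken from the TCGA data.

```python
def fraction_affected(alterations_by_gene, cohort_size):
    """Fraction of the cohort altered in at least one of the listed genes."""
    affected = set()
    for samples in alterations_by_gene.values():
        affected |= set(samples)  # union, so co-altered tumors count once
    return len(affected) / cohort_size


# Hypothetical toy data: sample IDs are invented for illustration only.
toy_alterations = {
    "BRCA2": {"S01", "S02"},
    "ATM": {"S02", "S03"},   # S02 carries lesions in both genes
    "CDK12": {"S04"},
}
toy_fraction = fraction_affected(toy_alterations, cohort_size=20)
```

Summing per-gene frequencies instead would overstate the total whenever lesions co-occur in the same tumor.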

Figure 5. Alterations in clinically relevant pathways

A. Alterations in DNA repair genes were common in primary prostate cancer, affecting almost 20% of samples through mutations or deletions in BRCA2, BRCA1, CDK12, ATM, FANCD2, or RAD51C. B. Focal deletions of FANCD2 were found in 7% of samples and were associated with reduced mRNA expression of FANCD2. C. The RAS or PI-3-Kinase pathways were altered in about a quarter of tumors, mostly through deletion or mutation of PTEN, but also through rare mutations in other pathway members. D. AKT1 mutations were found in three samples. Two of them were the known activating E17K, and the third one affected the D323 residue, which is adjacent to E17 in the protein structure. E. One of the observed PIK3CB mutations, E552K, is paralogous to the known activating E545K mutation in PIK3CA, and the RAC1 Q61 and RRAS2 Q72 mutations are paralogous to the Q61 mutations in KRAS. F. BRAF mutations were found in 2% of samples, mostly in known non-V600E hotspots in the kinase domain.

See also Figure S3.

Clinically actionable lesions in PI3K and Ras signaling

The long tail of the frequency distribution of molecular abnormalities is particularly notable among primary prostate cancers. Beyond PTEN, which was deleted or mutated in 17% of the cohort, various driver mutations in effectors of PI3K signaling were present at low incidence (Figure 5C). PIK3CA, which encodes the 110 kDa catalytic subunit of phosphatidylinositol 3-kinase, was mutated in six tumors, including one case possessing coincident activating mutations (E542A and N345I), both of which appeared to be subclonal. Four other tumors carried known activating hotspot mutations (E545K, Q546K, N345I, C420R), and one carried a mutation of unknown function (E474A). Focal PIK3CA amplification with associated mRNA overexpression occurred in ~1% of cases. Interestingly, PIK3CB was mutated in two tumors that also possessed coincident homozygous deletions of PTEN, both of which were clonal. PIK3CB E552K was found in one tumor at a residue paralogous to the canonical PIK3CA helical domain E545K mutant, and is presumably activating (Figure 5E). As PTEN-deleted tumors are likely PIK3CB-dependent due to the feedback inhibition of PIK3CA, co-existent PTEN loss and PIK3CB mutation may elevate PI3K pathway output and may indicate a set of tumors in which combined PI3K and androgen signaling inhibition could be effective (Schwartz et al., 2015). Among other lesions that drive PI3K signaling, AKT1 was mutated in three tumors. Two tumors had the known E17K hotspot mutation, while another encoded a D323Y mutation. Whereas E17K is the most common hotspot in AKT1 across human cancer, the D323Y variant is uncommon, having been identified previously in one lung adenocarcinoma (Cancer Genome Atlas Research Network, 2014) and one urothelial bladder cancer (Guo et al., 2013).
Nevertheless, while distant linearly from the activating E17K hotspot, in three dimensions this D323Y kinase domain mutant directly abuts the PH-domain containing E17K (Figure 5D), and has been described as potentially activating (Parikh et al., 2012).

We also identified known or presumed driver mutations in several genes of the MAPK pathway; together with the PI3K pathway lesions described above, these affected about a quarter of the tumors (Figure 5C). HRAS was mutated in four tumors, of which three were Q61R hotspot mutations. Two mutations arose in other Ras-family small GTPases. While both RAC1 Q61R and RRAS2 Q72L occurred only once each, they affected residues paralogous to the RAS Q61 hotspot (Figure 5E) (Chang et al., 2015). We also identified eight BRAF mutations, though, curiously, none were the common V600E mutation that is prevalent in cutaneous melanomas, thyroid cancers, and many other tumor types. Five BRAF mutations are likely activating, including known hotspots (K601E, G469A, L597R), two of which confer sensitivity to MEK inhibitors (Dahlman et al., 2012, Bowyer et al., 2014). Another mutation was a likely activating in-frame three-amino-acid deletion at K601 (Figure 5F), while the final mutation (F468C) affected the residue adjacent to the known G469 hotspot. Together, these findings reveal a long tail of low-incidence, potentially actionable predicted driver mutations present across the molecular taxonomy of prostate cancer.

Comparison with metastatic prostate cancer

To put these results in context, we compared our findings with those from a recently published cohort of 150 castration-resistant metastatic prostate cancer samples (Robinson et al., 2015). The analysis revealed some similarities and many differences between primary and treated metastatic disease. Although the overall burden of copy-number alterations and mutations was significantly higher in the metastatic samples (Figure 6A), consistent with previous findings (Taylor et al., 2010, Grasso et al., 2012), the primary and metastatic samples were remarkably similar in their subtype distribution, with the exception that the metastatic data set contained no IDH1-mutant tumors (Figure 6B). We compared the frequencies of all recurrently altered genes described in both studies and found that, similar to the overall burden of genomic alterations (Figure 6A), many genes and pathways had increased alteration rates in the metastatic samples (Figure 6C, Table S3). Androgen receptor signaling was more frequently altered in the metastatic samples, most often by amplification or mutation of AR, events that were essentially absent in primary samples. Interestingly, SPOP mutations were somewhat less frequent in the metastatic samples (8% vs 11% in the primary samples). DNA repair and PI3K pathway alterations were more frequent in the metastatic samples, as were mutations or deletions of TP53, RB1, KMT2C and KMT2D. Notably, we found no focal, clonal MYCL amplifications, which were recently described in primary prostate cancer (Boutros et al., 2015), in either data set or in a separate set of 63 untreated prostate cancer samples (Hovelson et al., 2015).
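A difference in alteration frequency between two cohorts, such as the SPOP rates above, can be checked with a two-sided Fisher's exact test on a 2x2 table. The sketch below is illustrative only and is not the analysis performed in the study; the counts (about 37 of 333 primary and 12 of 150 metastatic tumors) are back-calculated from the rounded percentages and are therefore approximate.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x altered tumors in the first cohort.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, p)


# Approximate counts (assumption): rows are primary and metastatic cohorts,
# columns are SPOP-mutant and SPOP-wild-type tumors.
p_spop = fisher_exact_two_sided(37, 296, 12, 138)
```

The small tolerance in the comparison guards against floating-point ties between equally likely tables.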

Figure 6. Comparison of primary with metastatic prostate cancer

A. Metastatic prostate cancer samples have more copy-number alterations (top panel, measured as fraction of genome altered) and mutations (bottom panel). B. The relative distribution of main subtypes (ERG, ETV1/4, FLI1, SPOP, FOXA1, IDH1, other) is similar in primary and metastatic samples. C. The alteration frequencies of several genes and pathways are higher in metastatic samples. The upper bar for each gene indicates the alteration frequency in primary samples, the lower bar for metastatic samples. The most notable differences in alteration frequencies involve the Androgen Receptor pathway, the PI3K pathway, and TP53.

See also Table S3.

Cohort and platforms

The cohort of primary prostate cancers analyzed resulted from extensive pathologic, analytical, and quality control review, yielding 333 tumors from 425 available cases. Images of frozen tissue were evaluated by multiple expert genitourinary pathologists and cases were excluded if no tumor cells were identifiable in the sample, or if there was evidence of significant RNA degradation (Figure S1, Supplementary Methods). For the subset of cases reviewed by two pathologists, tumor cellularity estimates were within 20% of each other in 71% of cases. In total, 78% of Gleason scores were concordant within one grade of the secondary pattern (Supplementary Methods). Moreover, due to the challenge of acquiring primary prostate cancer specimens of high tumor cellularity, we also performed a multi-platform analysis of tumor content, estimating tumor purity with analytical approaches utilizing both DNA (Carter et al., 2012, Prandi et al., 2014) and RNA (Quon et al., 2013, Ahn et al., 2013) sequencing data. The molecular and pathologic estimates are presented in Table S1A and Figure S1. The clinical and pathological characteristics of the final cohort are presented in Table 1. The average follow-up time following radical prostatectomy was just under two years, which precluded outcomes analysis due to the long natural history of primary prostate cancer.

Table 1

Cohort Characteristics

Clinical feature                      Value
Age                                   61 (43–76)
Pre-operative PSA                     7.4 (1.6–87.0)
Gleason Score
  3+3                                 65
  3+4                                 102
  4+3                                 78
  ≥ 8                                 88
Tumor Cellularity (pathology)
  <20%                                7
  21–40%                              40
  41–60%                              84
  61–80%                              115
  81–100%                             87
Pathologic Stage
  pT2a/b                              18
  pT2c                                111
  pT3a                                110
  pT3b                                82
  pT4                                 6
PSA Recurrence
  Yes                                 33
  No [a]                              248
  Not Available                       47
Margin Status
  Positive                            69
  Negative                            193
  Not Available                       71
Ethnicity
  Caucasian                           270
  African descent                     43
  Asian                               8
  Not Available                       12

[a] Either no evidence of recurrence or insufficient follow-up

We characterized isolated biomolecules from these 333 tumor samples using four platforms: whole-exome sequencing for somatic mutations, array-based methods for profiling both somatic copy-number changes and DNA methylation, and messenger RNA (mRNA) sequencing. We also performed microRNA (miRNA) sequencing on 330 of these samples, reverse-phase protein array (RPPA) on 152 samples, and low-pass and high-pass whole-genome sequencing (WGS) on 100 and 19 tumor/normal pairs, respectively (Supplementary Methods). For 19 samples, non-malignant adjacent prostate samples were also examined for DNA methylation and RNA/miRNA expression analyses.

The molecular taxonomy of primary prostate cancer

Previous studies indicate that many genomically distinct subsets of prostate cancer exist. These are driven in some cases by frequent events, such as androgen-regulated fusions of ERG and other ETS-family members, or recurrent SPOP mutations, and in other cases by less common genomic aberrations. Given the comprehensive nature of our data, we sought to unify these disparate findings to establish a molecular taxonomy of primary disease that integrates results from somatic mutations, gene fusions, somatic copy-number alterations (SCNA), gene expression, and DNA methylation. We first performed unsupervised clustering of data from each molecular platform as well as integrative clustering using iCluster (Shen et al., 2009). These analyses uncovered both known and novel associations, with 74% of all tumors being assignable to one of seven molecular classes based on distinct oncogenic drivers: fusions involving 1) ERG, 2) ETV1, 3) ETV4, or 4) FLI1 (46, 8, 4, and 1% respectively), or mutations in 5) SPOP, 6) FOXA1, or 7) IDH1 (11, 3, and 1% respectively) (Figures 1 and S2, Table S1A).
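The seven-class assignment can be pictured as a simple rule applied to each tumor's class-defining lesions. The sketch below is a hypothetical illustration rather than the procedure used in the study; in particular, the priority order (the order in which the classes are listed above) is an assumption made here to resolve tumors carrying more than one defining lesion.

```python
# Assumed priority order for this sketch: the order in which the seven
# classes are listed in the text. This is not stated as a rule by the source.
SUBTYPE_ORDER = ["ERG", "ETV1", "ETV4", "FLI1", "SPOP", "FOXA1", "IDH1"]


def assign_subtype(defining_lesions):
    """Return the first class-defining lesion present, or 'other' if none."""
    for gene in SUBTYPE_ORDER:
        if gene in defining_lesions:
            return gene
    return "other"


# Hypothetical tumors, each described by its set of class-defining lesions.
example_calls = [
    assign_subtype({"ERG"}),
    assign_subtype({"SPOP", "FOXA1"}),  # resolved to SPOP under the assumed order
    assign_subtype(set()),              # no defining lesion found
]
```

Under this ordering a tumor with both SPOP and FOXA1 mutations is placed in the SPOP class, which is one way to reconcile a FOXA1 class frequency lower than the overall FOXA1 mutation frequency.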

Figure 1. The molecular taxonomy of primary prostate cancer

Comprehensive molecular profiling of 333 primary prostate cancer samples revealed seven genomically distinct subtypes, defined (top to bottom) by ERG fusions (46%), ETV1/ETV4/FLI1 fusions or overexpression (8%, 4%, 1%, respectively), or by SPOP (11%), FOXA1 (3%), and IDH1 (1%) mutations. A subset of these subtypes was correlated with clusters computationally derived from the individual characterization platforms (somatic copy-number alterations, methylation, mRNA, microRNA, and protein levels from reverse phase protein arrays). The heatmap shows DNA copy-number for all cases, with chromosomes shown from left to right. Regions of loss are indicated by shades of blue, and gains by shades of red.

See also Figures S1, S2, S3, S4, S5, S6, S7, and Tables S1A, S1B, S1E, S2.

In total, 53% of tumors were found to have ETS-family gene fusions (ERG, ETV1, ETV4, and FLI1) after analysis with two complementary algorithms (Sboner et al., 2010, Wang et al., 2010) (see Methods). While TMPRSS2 was the most frequent fusion partner in all ETS fusions, we identified fusions with other previously described androgen-regulated 5′ partner genes, including SLC45A3 and NDRG1 (Table S5). We also identified several tumors that overexpressed full-length ETS transcripts that were mutually exclusive with ETS fusions (12 ETV1 high tumors, 6 ETV4, and 2 FLI1) (Table S5). ETS overexpression in these cases could possibly be mediated via epigenetic mechanisms or cryptic translocations of the entire gene locus to a transcriptionally active neighborhood. In the one case with elevated ETV1 full-length expression studied by whole-genome sequencing, we identified a cryptic genomic rearrangement 3′ of the ETV1 locus with a region on chromosome 14 near the MIPOL1 gene adjacent to FOXA1. This event is similar to previously described ETV1 translocations in LNCaP and MDA-PCa2b cell lines and in patient samples (Tomlins et al., 2007, Gasi et al., 2011). Overall, while fusions in the four genes were mostly mutually exclusive, three tumors showed evidence for fusions involving more than one of these genes (Table S5). Given that histologically defined single tumor foci have been shown to be rarely composed of different ETS fusion positive clones (Cooper et al., 2015, Kunju et al., 2014, Pflueger et al., 2011), it is likely these cases reflect convergent phenotypic evolution in clonally heterogeneous tumors. Tumors defined by SPOP mutations were mutually exclusive with all ETS fusion-positive cases, though four of the SPOP-mutant tumors also possessed FOXA1 mutations. In all four of these tumors, both the SPOP and FOXA1 mutations were clonal, indicating they are present in the same tumor cells.

Beyond the class-defining lesions, there were multiple patterns of both known and novel concurrent alterations in key prostate cancer genes. The former included the preponderance of PTEN deletions in ERG fusion-positive cases (Taylor et al., 2010). Similarly, SPOP mutations have previously been found to occur in approximately 10% of clinically localized prostate cancers, were mutually exclusive of tumors defined by ETS rearrangements, and may designate a distinct molecular class of disease based primarily on distinctive SCNA profiles (including deletion of CHD1, 6q, and 2q) (Barbieri et al., 2012, Blattner et al., 2014). Beyond reaffirming these known patterns, our taxonomy revealed new relationships and subtypes. Specifically, the SPOP-mutant/CHD1-deleted subset of prostate cancers had notable molecular features, including elevated levels of DNA methylation, homogeneous gene expression patterns, as well as frequent overexpression of SPINK1 mRNA, supporting SPOP mutation as a key feature in the molecular taxonomy of prostate cancer. Interestingly, mRNA, copy-number, and methylation profiles were similar in tumors with FOXA1 mutations and those with SPOP mutations. Furthermore, we identified a new genomically distinct subtype of prostate cancer defined by hotspot mutations in IDH1, described in greater depth below.

Despite this detailed molecular taxonomy of primary prostate cancers, 26% of all tumors studied appeared to be driven by still-occult molecular abnormalities or by one or more frequent alterations that co-occur with the genomically defined classes. Some of these tumors showed a high burden of copy-number alterations or DNA hypermethylation. Enrichment analysis indicated that this subset of tumors was enriched for mutations in TP53, KDM6A, and KMT2D; deletions of chromosomes 6 and 16; and amplifications of chromosomes 8 (spanning MYC) and 11 (CCND1) (Table S2). To characterize this group further, we performed whole genome sequencing of 19 tumor specimens and their matched normal tissues, a subset of which had high tumor cellularity but still lacked DNA copy-number alterations or any known or presumed driver lesions. Interestingly, no occult driver abnormalities or highly recurrent regulatory mutations were identified, such as the TERT promoter mutation common to many other tumor types (Khurana et al., 2013). Therefore, a significant (up to 26%) subset of primary prostate cancers of both good and poor clinical prognosis (including those with Gleason scores of >8) are driven by as yet unexplained molecular alterations.

mRNA clusters were tightly correlated with ETS fusion status, where mRNA cluster 1 consisted primarily of ETS-negative tumors and mRNA clusters 2 and 3 were split among ETS fusion-positive tumors (Figure 1, Figure S4). miRNA clustering showed a similar pattern, revealing a general difference in miRNA expression between ETS-positive and negative tumors (Figure 1, Figure S6). Clustering of RPPA data identified three distinct subgroups, with cluster 3 exhibiting elevated PI3K/AKT, MAP-Kinase and receptor tyrosine kinase activity (Figure S7A). The cluster was not enriched, however, in genomic alterations in these pathways, and in general there was little correlation of increased pathway activity (as measured by phospho-AKT and other downstream phospho-proteins) with the frequent genomic alterations in the pathways (see the example of PTEN deletions in Figure S7B).

Recurrently altered genes and their patterns across subtypes

The overall mutational burden of the cohort, inferred from whole-exome sequencing, was 0.94 mutations per megabase (median, range 0.04 – 28 per megabase), which corresponds to 19 non-synonymous mutations per tumor genome (median; 13–25, 25th and 75th percentiles respectively). This is consistent with prior exome and genome-scale sequencing results for localized prostate cancers (Barbieri et al., 2012, Baca et al., 2013), and is lower than the mutational burden of metastatic prostate cancers (Gundem et al., 2015, Grasso et al., 2012, Robinson et al., 2015). These results reaffirm that prostate cancer possesses a lower mutational burden than many other epithelial tumor types that are not associated with a strong exogenous mutagen (Alexandrov et al., 2013, Lawrence et al., 2013). Prior exome sequencing of 112 prostate cancers identified 12 recurrently mutated genes through focused assessment of point mutations and short insertions and deletions (Barbieri et al., 2012). By comparison, mutational significance analysis of these 333 tumor-normal pairs by MutSigCV (Lawrence et al., 2013, 2014) identified 13 significantly mutated genes (q-value < 0.1), seven of which had not been previously identified (Figure 2 and Tables S1B–S1C). Among the significantly mutated genes, SPOP, TP53, FOXA1, PTEN, MED12, and CDKN1B were previously identified as recurrently mutated. Additional clinically relevant genes were identified with lower mutation frequencies; these included genes within canonical kinase signaling pathways (BRAF, HRAS, AKT1), the beta-catenin pathway (CTNNB1), and the DNA repair pathway (ATM). The rate of BRAF mutations (2.4%) seen in this study is higher than previously reported; these include several known activating mutations, but curiously not the canonical V600E hotspot. We identified no BRAF fusions, which had previously been reported in a subset of clinically advanced prostate cancer (Palanisamy et al., 2010). 
NKX3-1, previously implicated in familial prostate cancer syndromes and often found to be deleted, was also somatically mutated in this cohort (1% of tumors). While its functional significance is unknown, ZMYM3, an epigenetic regulatory protein not previously implicated in prostate cancer but infrequently mutated in Ewing sarcomas (Tirode et al., 2014) and various pediatric cancers (Huether et al., 2014), was also recurrently mutated (2% of tumors). Genes with known biological relevance that were mutated at frequencies just below the threshold of significance (q-value < 0.1) included KMT2C (MLL3), KMT2D (MLL4), APC, IDH1, and PIK3CA (Figure 2 and Tables S1B–S1C). Mutations in the tumor suppressor genes KMT2C, KMT2D, and APC were mostly truncating; the IDH1 and PIK3CA mutations occurred in previously characterized hotspots and thus may have therapeutic relevance for those occasional tumors with these mutations.
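The per-megabase mutational burden quoted earlier in this section is a ratio of mutation count to the number of sequenced bases, summarized across tumors by the median. A minimal sketch with invented numbers follows; the covered territory of 30 Mb per exome and the mutation counts are assumptions for illustration, not values from the study.

```python
from statistics import median


def mutations_per_megabase(n_mutations, covered_bases):
    """Mutation count divided by the covered territory in megabases."""
    return n_mutations / (covered_bases / 1_000_000)


# Hypothetical (mutation count, covered bases) pairs for three tumors.
toy_tumors = [(12, 30_000_000), (30, 30_000_000), (45, 30_000_000)]
toy_rates = [mutations_per_megabase(n, bases) for n, bases in toy_tumors]
toy_median_rate = median(toy_rates)
```

Reporting the median rather than the mean keeps the summary robust to the occasional hypermutated tumor at the top of the range.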

Figure 2. Recurrent alterations in primary prostate cancer

The spectrum and type of recurrent alterations and genes (mutations, fusions, deletions, and overexpression) in the cohort are shown (left to right) grouped by the molecular subtypes defined in Figure 1. On the right, the statistical significance of individual mutant genes (MutSig q-value) is shown. Mutations in IDH1, PIK3CA, RB1, KMT2D, CHD1, BRCA2, and CDK12 are also shown, despite their not being statistically significant. SPINK1 overexpression is shown for reference.

See also Tables S1B, S1C, S1D, S1E.

Notwithstanding these key somatic mutations, the most frequent molecular abnormalities involved chromosomal arm-level copy-number alterations (Taylor et al., 2010). These alterations included recurrent genomic gains of chromosome 7 and 8q and heterozygous losses of 8p, 13q, 16q and 18 (Figure S3A). Significance analysis of recurrent focal DNA copy-number alterations revealed 20 amplifications and 35 deletions (q-value < 0.25, GISTIC 2.0; Figure S3A and Table S1D). Recurrent focal amplifications included those spanning known oncogenes such as CCND1 (11q13.2, 2%), MYC (8q24.21, 8%), as well as FGFR1 and WHSC1L1 (8p11.23, 8%). Recurrent focal deletions were much more common: homozygous deletions spanning the PTEN locus occurred at one of the highest rates of any tumor type studied thus far (15%). Focal deletions of the region between the TMPRSS2 and ERG genes on 21q22.3, which result in TMPRSS2-ERG fusions, were unique to prostate cancers as expected. Other focal deletions included those spanning tumor suppressors TP53 (17p13.1), CDKN1B (12p13.1), MAP3K1 (5q11.2), FANCD2 (3p26), as well as SPOPL (2q22.1) and the complex locus spanning FOXP1/RYBP/SHQ1 (3p13). MAP3K7 (6q12-22) was also frequently deleted, along with CHD1 (5q15-q21); co-deletion of these loci has been associated with aggressive ETS-negative prostate cancer (Kluth et al., 2013, Rodrigues et al., 2015).

As the pattern and extent of SCNAs in prostate cancer genomes have been associated with probability of disease recurrence and metastasis in primary prostate cancers (Taylor et al., 2010, Hieronymus et al., 2014, van Dekken et al., 2004, Paris et al., 2004), we sought to identify similar structure in the burden of SCNAs by performing hierarchical clustering of arm-level alterations. We identified three major groups of prostate cancers, one with mostly unaltered genomes (hereafter referred to as quiet), a second group encompassing 50% of all tumors with an intermediate level of SCNAs, and a third group with a high burden of arm level genomic gains and losses (Figure S3B–S3C). While a formal outcome analysis was not possible due to the limited clinical follow-up available for this cohort, the subset of tumors with the greatest burden of SCNAs had significantly higher Gleason scores and PSA levels than the other two groups (Figure S3B–S3D). The tumors in this group also had significantly higher tumor cellularity (Figure S3C).
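To make the notion of arm-level SCNA burden concrete, the sketch below counts altered chromosome arms per tumor and bins tumors into three tiers by fixed cut-offs. This thresholding is a deliberately simpler stand-in for, not a reproduction of, the hierarchical clustering described above; the function names and cut-off values are hypothetical.

```python
def scna_burden(arm_calls):
    """Count arms called as gained (+1) or lost (-1); 0 means unaltered."""
    return sum(1 for call in arm_calls.values() if call != 0)


def burden_group(n_altered, quiet_max=1, intermediate_max=8):
    # Cut-offs are hypothetical, chosen only to illustrate three tiers.
    if n_altered <= quiet_max:
        return "quiet"
    if n_altered <= intermediate_max:
        return "intermediate"
    return "high"


# Hypothetical tumor with losses of 8p and 13q and a gain of 8q.
toy_calls = {"8p": -1, "8q": 1, "13q": -1, "1p": 0}
toy_group = burden_group(scna_burden(toy_calls))
```

A clustering approach would instead group tumors by the similarity of their full arm-level profiles, so tumors with equal counts but different arms could fall into different groups.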

Epigenetic changes define molecularly distinct subtypes of prostate cancer

Integrative analysis of genetic and epigenetic changes revealed a diversity of DNA methylation changes that defined molecularly distinct subsets of primary prostate cancer (Figure 3). Unsupervised hierarchical clustering of the most variably hypermethylated CpGs identified four epigenetically distinct groups of prostate cancers (Figure S5A–S5B). When integrated with the molecular taxonomy defined above, we found a number of striking associations. Among these was a notable pattern within ERG fusion-positive tumors. Specifically, while nearly two-thirds of all ERG fusion-positive tumors belonged to an unsupervised cluster with only moderately elevated DNA methylation (DNA methylation cluster 3), the remaining ERG fusion-positive tumors comprised a distinct hypermethylated cluster (cluster 1) that was almost exclusively associated with ERG fusions. On average, this cluster contained twice the number of hypermethylated loci as DNA methylation cluster 3 (Figure S5A), and the epigenetic patterns were largely distinct from those of ETV1 and ETV4 fusion-positive tumors, which showed more heterogeneous methylation. What drives these epigenetically distinct groups of ETS fusion-positive tumors is unknown, but there is considerable diversity in their DNA methylation profiles that may reflect altered epigenetic silencing (Figure S5A–S5B). Together, these results support further ETS fusion-based subtyping of disease, but also reveal a greater molecular and likely biological diversity among ERG fusion-positive tumors than previously appreciated. Likewise, these results are consistent with in vivo mouse modeling and expression profiling studies that suggest important molecular and clinicopathological differences between ERG and non-ERG ETS fusion positive tumors (Baena et al., 2013, Tomlins et al., 2015).

Figure 3. Hypermethylation is common across primary prostate cancer

A. Primary prostate cancers show diverse methylation changes compared to normal prostate samples (left). Unsupervised clustering was performed on the beta-values of the 5,000 most hypermethylated loci, and the results mapped to the genomic subtypes. ERG-positive tumors had a high diversity of methylation changes, with a distinct subgroup (cluster 1) nearly unique to this group. SPOP and FOXA1 mutant tumors also exhibited global hypermethylation. B. IDH1-mutant prostate cancers, which are associated with younger age, are among the most hypermethylated tumors, as in Glioblastoma (GBM) and Acute Myeloid Leukemia (AML).

See also Figure S4 and Table S1F.

SPOP and FOXA1-mutant tumors exhibited homogeneous epigenetic profiles. These tumors belonged almost exclusively to DNA methylation cluster 2, a group that also contained a majority of the ETV1 and ETV4 but not ERG-positive tumors. Lastly, the IDH1-mutant tumors were notable given their strongly elevated levels of genome-wide DNA hypermethylation (Figure S5B). While of low incidence, these IDH1 R132-mutant tumors defined a distinct subgroup of what appears to be early onset prostate cancer (Figure 3B) that possesses fewer DNA copy-number alterations (see Figure 1) or other canonical genomic lesions that are common to most other prostate cancers. IDH1 and IDH2 mutations have been associated with a DNA methylation phenotype in other tumor types, most notably in gliomas (Noushmehr et al., 2010) and acute myeloid leukemias (AML, Figueroa et al., 2010). Curiously, IDH1-mutant prostate cancers possessed even greater levels of genome-wide hypermethylation than do either glioma or AML IDH1-mutant tumors (Figure 3B). After further investigating DNA methylation differences between IDH-mutant and wild type tumors among prostate cancers, gliomas, and AMLs, we found that hypermethylated loci were specific to the cancer type rather than IDH mutants (Figure S5F).

Integrating these epigenetic data with mRNA expression levels, we identified 164 genes that were epigenetically silenced in subsets of the cohort (Figure S5C, Table S1F). These silenced genes were significantly enriched for genes previously found to be differentially expressed in prostate cancer; specifically, genes that are down-regulated in metastatic prostate cancer (Chandran et al., 2007) and genes involved in prostate organ development (Schaeffer et al., 2008) (q-value < 2.0·10). These 164 silenced genes displayed heterogeneous frequencies of epigenetic silencing across the cohort. For example, SHF, FAXDC2, GSTP1, ZNF154, and KLF8 were epigenetically silenced in almost all tumors (>85%) whereas STAT6 was silenced predominantly in ETS fusion-positive tumors and not in SPOP and IDH1-mutant tumors. Conversely, HEXA was silenced preferentially in SPOP-mutant tumors compared to ERG fusion-positive tumors (86.5 versus 14.5% respectively, p<5.4·10). Consistent with their increased DNA hypermethylation, the IDH1-mutant prostate tumors also possessed the greatest number of epigenetically silenced genes among all prostate tumors (Table S1F).
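One way to picture an epigenetic silencing call is as a joint condition on methylation and expression: the locus is unmethylated in normal tissue, methylated in the tumor, and the gene is expressed well below the normal level. The sketch below encodes that idea; the thresholds, function names, and toy values are all assumptions for illustration and do not reproduce the criteria used in the study.

```python
def is_silenced(tumor_beta, tumor_expr, normal_betas, normal_exprs,
                beta_high=0.3, beta_low=0.1, expr_fraction=0.5):
    """Call silencing in one tumor (all thresholds are hypothetical)."""
    normal_mean_expr = sum(normal_exprs) / len(normal_exprs)
    normals_unmethylated = max(normal_betas) < beta_low
    return (normals_unmethylated
            and tumor_beta > beta_high
            and tumor_expr < expr_fraction * normal_mean_expr)


def silencing_frequency(tumors, normal_betas, normal_exprs):
    """Fraction of tumors, given as (beta, expression) pairs, called silenced."""
    calls = [is_silenced(b, e, normal_betas, normal_exprs) for b, e in tumors]
    return sum(calls) / len(calls)


# Hypothetical values for a single gene: four tumors and two normal samples.
toy_tumors = [(0.8, 1.0), (0.05, 10.0), (0.6, 2.0), (0.7, 9.0)]
toy_frequency = silencing_frequency(toy_tumors, [0.02, 0.05], [10.0, 12.0])
```

Computing this frequency within each genomic subtype would give per-subtype silencing rates of the kind compared above for STAT6 and HEXA.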

AR activity is variable in primary prostate cancers

The androgen receptor (AR) regulates normal prostate development as well as critical growth and survival programs in prostate carcinoma. Primary prostate cancer is androgen dependent, and androgen activity is a central axis in prostate cancer pathogenesis, driving the creation and overexpression of most ETS fusion genes (Lin et al., 2011, Mani et al., 2009, Tomlins et al., 2005). However, the extent to which individual primary prostate cancers differ in androgen sensitivity or dependence is unknown, and the issue has translational implications because AR targeting is therapeutically important. To address these questions, we sought to infer the AR output of tumors by calculating an AR activity score from the expression pattern of 20 genes that are experimentally validated AR transcriptional targets (Hieronymus et al., 2006). This score suggested that a broad spectrum of AR activity exists across all prostate tumors as well as between genomic subtypes (Figure 4A). Although ETS fusion genes are under AR control, the ETS fusion-positive groups had variable AR transcriptional activity. In contrast, we found that tumors with SPOP or FOXA1 mutations had the highest AR transcriptional activity of all genotypically distinct subsets of prostate cancer (p = 1.1·10 and 0.04, respectively, t-test). Consistent with this, SPOP mutations have been previously implicated in androgen signaling in model systems, since both AR and AR coactivators are substrates deregulated by SPOP mutation (Geng et al. 2013, An et al., 2014, Geng et al. 2014), providing a possible explanation for the associated increase in AR activity seen in this subtype of prostate cancers.
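The AR activity score described above can be illustrated with a minimal sketch. It assumes the score is the sum of per-gene z-scores across the signature genes; the text does not give the exact formula, so this is one plausible reading rather than the authors' implementation. The function name and the gene labels (GENE_A and so on) are hypothetical placeholders, not members of the published 20-gene signature.

```python
from statistics import mean, pstdev


def ar_activity_scores(expr, target_genes):
    """Return one score per sample: the sum of per-gene z-scores.

    expr: dict mapping gene name -> list of expression values, one per
          sample, in the same sample order for every gene.
    target_genes: names of the signature genes (hypothetical here).
    Assumption: z-scores are computed across the cohort for each gene.
    """
    genes = [g for g in target_genes if g in expr]
    if not genes:
        raise ValueError("none of the target genes are present in expr")
    n_samples = len(expr[genes[0]])
    scores = [0.0] * n_samples
    for gene in genes:
        values = expr[gene]
        if len(values) != n_samples:
            raise ValueError(f"{gene} has a different number of samples")
        mu = mean(values)
        sd = pstdev(values)
        if sd == 0:
            continue  # a constant gene carries no information; skip it
        for i, value in enumerate(values):
            scores[i] += (value - mu) / sd
    return scores


# Toy expression matrix with four samples (values are made up).
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [5.0, 5.0, 5.0, 5.0],  # constant, ignored by the score
}
toy_scores = ar_activity_scores(toy_expr, ["GENE_A", "GENE_B", "GENE_C"])
print([round(s, 2) for s in toy_scores])
```

Summing z-scores gives every signature gene equal weight regardless of its absolute expression level, which is why a constant gene is skipped rather than allowed to cause a division by zero.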

Figure 4. The diversity of androgen receptor activity in primary prostate cancer

A. Androgen receptor activity, as inferred by the induction of AR target genes, was significantly increased in SPOP and FOXA1 mutant tumors when compared to normal prostate or ERG-positive tumors. This increase in activity cannot be fully explained by AR mRNA or protein levels. B. Multiple known AR splice variants were detected in benign prostate (left) and primary prostate cancer (right), with the AR-V7 variant detected in 50% of tumors. C. Real-time qPCR comparison of AR-V7 in 74 tumor samples (grey) and 5 adjacent-normal samples (blue). D. FOXA1 missense mutations were clustered in the forkhead domain, mostly in residues that do not form contacts with DNA (see also the 3-D structure in panel E).

While AR transcriptional output is a proxy for ligand-driven AR activity in many tumors, AR transcript variants have been described that encode truncated AR proteins that lack the ligand-binding domain and hence are capable of activating AR target genes in the absence of androgens (Dehm et al., 2008, Watson et al., 2010). Using RNA sequencing reads that spanned the splice junctions unique to each AR variant, we quantified the expression of these AR transcript variants. This analysis revealed that several AR splice variants, most notably AR-V7, can be detected at low levels in primary tumors and, in a few cases, in adjacent benign prostate tissue (Figure 4B), and we validated these expression levels with qPCR (Figure 4C). However, their expression was not associated with differential expression of known AR target genes or with the seven previously defined genomic subtypes. Most detected splice forms were truncated after the DNA-binding domain by the presence of a cryptic exon rather than by skipping those exons encoding the ligand-binding domain. Truncated AR splice variants were previously assumed to be expressed primarily in metastatic castration-resistant prostate cancers, where, at least for AR-V7, their presence was associated with resistance to hormone therapy (Antonarakis et al., 2014). Hence, our finding that they are expressed in hormone-naive primary prostate cancers is notable.

In prostate cancers, the degree of AR pathway output is controlled not only by AR mRNA and protein expression levels, but also by expression of and mutations in AR cofactors (Heemers et al., 2007). It is therefore notable that FOXA1 was recurrently mutated in our cohort, as it is a pioneering transcription factor that targets AR and has a demonstrated role in prostate cancer oncogenesis (Jin et al., 2013). We identified FOXA1 mutations in 4% of the primary prostate cancers studied here, which is similar to the mutation frequency observed previously (Barbieri et al., 2012, Grasso et al., 2012) (Figure 4A). While a subset of these mutations were present in tumors that also possessed SPOP mutations and had elevated levels of AR output, FOXA1 mutations were mutually exclusive with all other alterations that define the genomic subclasses described here. While there were some truncating mutations near the C-terminus and the C-terminal part of the forkhead domain, the majority of the mutations found here and in other prostate cancer cohorts were missense mutations that primarily affect the winged-helix DNA binding domain of FOXA1. Curiously, these mutations do not directly alter FOXA1 DNA-binding residues (Figure 4D–E), a pattern similar to the FOXA1 mutations recently found in lobular breast cancers (TCGA, unpublished), which suggests that the impact of FOXA1 mutations has less to do with altering DNA binding than with disrupting or altering interactions with other chromatin-bound cofactors.

Clinically actionable DNA repair defects in primary prostate cancers

Prior data indicate that several DNA repair pathways are disrupted in a subset of prostate cancers (Karanika et al., 2014, Pritchard et al., 2014). Moreover, the PARP inhibitor olaparib is effective in some patients with prostate cancer (Mateo et al., 2014). Here, we found inactivation of several DNA repair genes that collectively affected 19% of patients (Figure 5A). While we found only one inactivating BRCA1 germline mutation, a frameshift at V923 caused by a 4 bp deletion (ClinVar RCV000083190.3), BRCA2 inactivation affected 3% of tumors, including both germline and somatic truncating mutations. All six BRCA2 germline mutations were K3326*, a C-terminal truncating mutation with debated functional impact but increased prevalence in several tumor types (Farrugia et al., 2008, Martin et al., 2005, Delahaye-Sourdeix et al., 2015). Two additional tumors possessed focal BRCA2 homozygous deletions that were accompanied by very low BRCA2 transcript expression. Four tumors (1%) possessed either loss-of-function mutations or homozygous deletion of CDK12, a gene that has been implicated in DNA repair by regulating expression levels of several DNA damage response genes (Blazek et al., 2011) and is recurrently mutated in metastatic prostate cancer (Grasso et al., 2012). ATM, an apical kinase of the DNA damage response, which is activated by the Mre11 complex and mediates downstream checkpoint signaling, was affected by a nonsense mutation in one case and by a likely kinase-dead hotspot N2875 mutation in two cases. FANCD2 was similarly affected by diverse uncommon lesions, including a truncating mutation in one tumor, homozygous deletion in two tumors, and focal heterozygous losses in 6% of the cohort (Figure 5B). RAD51C (3%) was affected by focal DNA losses, most of which were heterozygous. Finally, it was notable that heterozygous losses of BRCA2 (13q13.1) almost always coincided with concurrent loss of the distant RB1 tumor suppressor gene (13q14.2) (Figure S3D). The observation that nearly 20% of primary prostate cancers bear genomic defects involving DNA repair pathways is remarkably consistent with the recently announced TOPARP-A Phase II trial results in patients with metastatic castration-resistant prostate cancer, which indicate that clinical responses to the PARP inhibitor olaparib likely occurred in the subgroup of tumors bearing defects in DNA repair genes (Mateo et al., 2014, Robinson et al., 2015).

Figure 5. Alterations in clinically relevant pathways

A. Alterations in DNA repair genes were common in primary prostate cancer, affecting almost 20% of samples through mutations or deletions in BRCA2, BRCA1, CDK12, ATM, FANCD2, or RAD51C. B. Focal deletions of FANCD2 were found in 7% of samples and were associated with reduced mRNA expression of FANCD2. C. The RAS or PI-3-Kinase pathways were altered in about a quarter of tumors, mostly through deletion or mutation of PTEN, but also through rare mutations in other pathway members. D. AKT1 mutations were found in three samples. Two of them were the known activating E17K, and the third one affected the D323 residue, which is adjacent to E17 in the protein structure. E. One of the observed PIK3CB mutations, E552K, is paralogous to the known activating E545K mutation in PIK3CA, and the RAC1 Q61 and RRAS2 Q72 mutations are paralogous to the Q61 mutations in KRAS. F. BRAF mutations were found in 2% of samples, mostly in known non-V600E hotspots in the kinase domain.

See also Figure S3.

Clinically actionable lesions in PI3K and Ras signaling

The long tail of the frequency distribution of molecular abnormalities is particularly notable among primary prostate cancers. Beyond PTEN, which was deleted or mutated in 17% of the cohort, various driver mutations in effectors of PI3K signaling were present at low incidence (Figure 5C). PIK3CA, which encodes the 110 kDa catalytic subunit of phosphatidylinositol 3-kinase, was mutated in six tumors, including one case possessing coincident activating mutations (E542A and N345I), both of which appeared to be subclonal. Of the other five tumors, four carried known activating hotspot mutations (E545K, Q546K, N345I, C420R), while one had a mutation of unknown function (E474A). Focal PIK3CA amplification with associated mRNA overexpression occurred in ~1% of cases. Interestingly, PIK3CB was mutated in two tumors that also possessed coincident homozygous deletions of PTEN, both of which were clonal. PIK3CB E552K was found in one tumor at a residue paralogous to the canonical PIK3CA helical domain E545K mutant, and is presumably activating (Figure 5E). As PTEN-deleted tumors are likely PIK3CB-dependent due to the feedback inhibition of PIK3CA, co-existent loss of PTEN and mutation of PIK3CB may elevate PI3K pathway output and may indicate a set of tumors in which combined PI3K and androgen signaling inhibition could be effective (Schwartz et al., 2015). Among other lesions that drive PI3K signaling, AKT1 was mutated in three tumors. Two tumors had the known E17K hotspot mutation, while another encoded a D323Y mutation. Whereas E17K is the most common hotspot in AKT1 across human cancer, the D323Y variant is uncommon, having been identified previously in one lung adenocarcinoma (Cancer Genome Atlas Research Network, 2014) and one urothelial bladder cancer (Guo et al., 2013). Nevertheless, although D323 is distant from the activating E17K hotspot in the linear sequence, in three dimensions this kinase-domain residue directly abuts the PH domain containing E17 (Figure 5D), and the D323Y mutant has been described as potentially activating (Parikh et al., 2012).

We also identified known or presumed driver mutations in several other genes of the MAPK pathway, affecting 25% of the tumors (Figure 5C). HRAS was mutated in four tumors, of which three were Q61R hotspot mutations. Two mutations arose in other Ras-family small GTPases. While both RAC1 Q61R and RRAS2 Q72L occurred only once each, they affected residues paralogous to the RAS Q61 hotspot (Figure 5E) (Chang et al., 2015). We also identified eight BRAF mutations, though, curiously, none were the common V600E mutation that is prevalent in cutaneous melanomas, thyroid cancers, and many other tumor types. Five BRAF mutations are likely activating, including known hotspots (K601E, G469A, L597R), two of which confer sensitivity to MEK inhibitors (Dahlman et al., 2012, Bowyer et al., 2014). Another mutation was a likely activating in-frame 3 amino acid deletion at K601 (Figure 5F), while the final mutation (F468C) affected the residue adjacent to the known G469 hotspot. Together, these findings reveal a long tail of low-incidence, potentially actionable predicted driver mutations present across the molecular taxonomy of prostate cancer.

Comparison with metastatic prostate cancer

To put these results in context, we compared our findings with those from a recently published cohort of 150 castration-resistant metastatic prostate cancer samples (Robinson et al., 2015). The analysis revealed some similarities and many differences between primary and treated metastatic disease. Although the overall burden of copy-number alterations and mutations was significantly higher in the metastatic samples (Figure 6A), consistent with previous findings (Taylor et al., 2010, Grasso et al., 2012), the primary and metastatic samples were remarkably similar in their subtype distribution, with the exception that the metastatic data set contained no IDH1-mutant tumors (Figure 6B). We compared the frequencies of all recurrently altered genes described in both studies and found that, similar to the overall burden of genomic alterations (Figure 6A), many genes and pathways have increased alteration rates in the metastatic samples (Figure 6C, Table S3). Androgen receptor signaling was more frequently altered in the metastatic samples, most often by amplification or mutation of AR, events that were essentially absent in primary samples. Interestingly, SPOP mutations were somewhat less frequent in the metastatic samples (8% vs 11% in the primary samples). DNA repair and PI3K pathway alterations were more frequent in the metastatic samples, as were mutations or deletions of TP53, RB1, KMT2C and KMT2D. Interestingly, we found no focal, clonal MYCL amplifications, which were recently described in primary prostate cancer (Boutros et al., 2015), in either data set nor in a separate set of 63 untreated prostate cancer samples (Hovelson et al., 2015).
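A frequency comparison of this kind can be sketched with a two-sided Fisher's exact test on a 2x2 table. The text does not state which test was used, so the choice of test is an assumption. The SPOP counts below are approximations back-calculated from the rounded percentages and cohort sizes quoted above (11% of 333 and 8% of 150); they are not counts reported in the paper, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are cohorts, columns are altered / not altered. The p-value sums
    the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k):
        # Probability that exactly k of the col1 altered cases fall in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_observed = prob(a)
    p_value = 0.0
    for k in range(k_min, k_max + 1):
        p_k = prob(k)
        if p_k <= p_observed * (1 + 1e-9):  # tolerance for float rounding
            p_value += p_k
    return min(1.0, p_value)


# Approximate SPOP counts, reconstructed from rounded percentages (assumption).
primary_mutated, primary_total = 37, 333
metastatic_mutated, metastatic_total = 12, 150
p = fisher_exact_two_sided(
    primary_mutated, primary_total - primary_mutated,
    metastatic_mutated, metastatic_total - metastatic_mutated,
)
print(f"two-sided p = {p:.3f}")
```

When many genes are compared in this way, the resulting p-values would normally also be corrected for multiple testing.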

Figure 6. Comparison of primary with metastatic prostate cancer

A. Metastatic prostate cancer samples have more copy-number alterations (top panel, measured as fraction of genome altered) and mutations (bottom panel). B. The relative distribution of main subtypes (ERG, ETV1/4, FLI1, SPOP, FOXA1, IDH1, other) is similar in primary and metastatic samples. C. The alteration frequencies of several genes and pathways are higher in metastatic samples. The upper bar for each gene indicates the alteration frequency in primary samples, the lower bar for metastatic samples. The most notable differences in alteration frequencies involve the Androgen Receptor pathway, the PI3K pathway, and TP53.

See also Table S3.

Discussion

The comprehensive molecular analyses of primary prostate cancers presented here reveal highly diverse genomic, epigenomic, and transcriptomic patterns. Major subtypes could be defined by fusions of the ETS family genes ERG, ETV1, ETV4, or FLI1 and by mutations in SPOP, FOXA1, or IDH1. However, even within the groups there was significant diversity in DNA copy-number alterations, gene expression, and DNA methylation. The mutational heterogeneity mirrors the heterogeneous natural history of primary prostate cancers.
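The seven-subtype taxonomy can be expressed as a simple lookup over each tumor's defining alterations. The sketch below is illustrative only: the precedence order used when a tumor carries more than one defining lesion (for example SPOP together with FOXA1) is an assumption, and the function names and toy cohort are hypothetical.

```python
from collections import Counter

# Defining alterations, in an assumed order of precedence (fusions first).
SUBTYPE_ORDER = ["ERG", "ETV1", "ETV4", "FLI1", "SPOP", "FOXA1", "IDH1"]


def assign_subtype(alterations):
    """Return the first subtype whose defining lesion the tumor carries.

    alterations: set of gene names with a defining fusion or mutation.
    Tumors with none of the seven lesions are labeled "other".
    """
    for subtype in SUBTYPE_ORDER:
        if subtype in alterations:
            return subtype
    return "other"


def subtype_fractions(cohort):
    """Map each subtype label to its fraction of the cohort."""
    counts = Counter(assign_subtype(tumor) for tumor in cohort)
    return {label: n / len(cohort) for label, n in counts.items()}


# Toy cohort of four tumors (made-up data, not from the study).
toy_cohort = [{"ERG"}, {"SPOP", "FOXA1"}, set(), {"IDH1"}]
print(subtype_fractions(toy_cohort))
```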

Although the broad spectrum of copy-number alterations in tumors with ETS fusions has been previously characterized (Demichelis et al., 2009, Taylor et al., 2010), here we uncovered additional differences between the epigenetic profiles of those tumors. We found that ERG fusion-positive tumors can be subdivided into two methylation subtypes: one with lower levels of methylation, and one with a distinct spectrum of hypermethylation. Many genes were epigenetically silenced as a result of the hypermethylation in the latter tumors. While further studies will be required to determine which silencing events are linked to prostate cancer pathogenesis, the findings presented here reveal variability among what were previously considered to be genetically homogeneous prostate cancer subtypes.

We have also identified a distinct subgroup of tumors with IDH1 R132 mutations that is associated with younger age at diagnosis. Although IDH1 mutations have previously been identified in prostate cancer with a similar incidence (2.7%) (Ghiam et al., 2012, Kang et al., 2009), we show here that those tumors are all ETS fusion-negative and SPOP wildtype, have little SCNA burden, and possess elevated levels of genome-wide methylation. The levels of methylation observed in this methylator phenotype are higher than those observed in IDH1-mutant GBMs and AMLs. Consistent with our observations, a recently published clinical study of 117 prostate cancers identified a single IDH1-mutant prostate cancer from a 56-year-old patient that also lacked significant copy number alterations, ETS gene fusions, or driver mutations (Hovelson et al., 2015). Future studies in cohorts with sufficient clinical follow-up will be able to ask whether the IDH1-mutant prostate cancers are prognostically distinct, as noted for gliomas (Noushmehr et al., 2010) and AMLs (Mardis et al., 2009), and whether they are sensitive to newly developed IDH1-targeted therapeutics (Rohle et al., 2013).

Interestingly, 26% of the tumors in this study could not be characterized by one of the taxonomy-defining cardinal genomic alterations. The 26% were clinically and genomically heterogeneous, with some tumors exhibiting extensive DNA copy-number alterations and high Gleason scores indicative of poorer prognosis. About a third of them were genomically similar to SPOP and FOXA1 mutant tumors but lacked any canonical mutation (iCluster 1, methylation cluster 2, mRNA cluster 1); others were enriched for mutations of TP53, KDM6A, and KMT2D or specific SCNAs spanning MYC and CCND1. Many of the tumors had low Gleason score with few if any DNA copy-number alterations and a normal-like DNA methylation pattern. As previously reported, tumors with fewer genomic alterations were also more commonly Gleason score 6 tumors (38% in the ‘quiet’ class vs 8% in the class with the greatest burden of alterations).

Tumor cellularity, as assessed by pathology review, was lower among tumors with fewer SCNAs (one-sided Mann-Whitney test, p = 0.0002), indicating that the apparent lower burden of alterations in tumors with smaller volumes may be due in part to their tumor purities being lower. However, the lower cellularity of these tumors did not limit the detection of clonal molecular lesions since tumor cellularity between ETS fusion-positive and these fusion-negative tumors was not significantly different (two-sided Mann-Whitney test, p = 0.32). One must also keep in mind that this study was limited to a single tumor focus for each patient, even though the vast majority of primary prostate tumors are multifocal and molecular heterogeneity between different foci has been demonstrated (Cooper et al., 2015, Boutros et al., 2015, Lindberg et al., 2013). Such issues must be considered when designing new therapeutic approaches and biomarker panels for clinical use, as patients likely have more than one of these molecular subtypes present due to this commonly occurring tumor multifocality and molecular heterogeneity.
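The Mann-Whitney comparisons of tumor cellularity can be sketched as follows. This is a minimal version that uses midranks for ties and a tie-corrected normal approximation without continuity correction; the authors' software and settings are not stated, so this may differ in detail from what they ran. The cellularity values are made up for illustration.

```python
import math
from collections import Counter


def mann_whitney_u(x, y, alternative="two-sided"):
    """Return (U statistic for x, approximate p-value).

    alternative: "less" tests whether x tends to be smaller than y,
    "greater" the opposite, "two-sided" either direction.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted(list(x) + list(y))
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    u = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return u, 1.0  # all values identical: no evidence either way
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    if alternative == "less":
        p_value = cdf
    elif alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "two-sided":
        p_value = min(1.0, 2 * min(cdf, 1 - cdf))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return u, p_value


# Hypothetical cellularity estimates (%), recorded in 10% increments.
few_scna = [40, 50, 50, 60, 70]
many_scna = [60, 70, 80, 80, 90]
u_stat, p_one_sided = mann_whitney_u(few_scna, many_scna, alternative="less")
print(u_stat, round(p_one_sided, 4))
```

Because cellularity was recorded in 10% increments, ties are expected to be common, which is why the variance includes the tie correction.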

Primary prostate cancers exhibit a wide range of androgen receptor activity. This study demonstrates for the first time a direct association between mutations in SPOP or FOXA1 and increased AR-driven transcription in human prostate cancers. Further studies in preclinical models as well as in clinical trial settings will be required to understand the implications of variable AR activity in the contexts of chemoprevention and prostate cancer-directed treatment strategies (Mostaghel et al., 2010). Other, more immediately actionable opportunities for targeted therapy exist for the 19% of primary prostate cancers that have defects in DNA repair, and for the nearly equal number of cancers with altered key effectors of both PI3K and MAPK pathways. While the numbers of DNA repair defects found in organ-confined prostate tumors may be lower than those found in metastatic prostate cancer (Robinson et al., 2015), an increase in the number of such defects with disease progression suggests a possible advantage to targeting DNA repair-deficient tumors at an earlier stage of disease, perhaps at initial diagnosis. Such strategies may include preventing DNA damage as well as targeting deficient DNA repair (Ferguson et al., 2015). Alterations in the PI3K/MTOR pathway also play an important role: beyond the frequent inactivation of PTEN, we document rare activation of PIK3CA, PIK3CB, AKT1, and MTOR, and of several small GTPases, including HRAS, as well as BRAF. As DNA sequencing of tumor samples becomes more widely adopted earlier in the clinical care of cancer patients, such alterations may emerge as candidates for inclusion in clinical trials after front-line therapy.

In summary, our integrative assessment of 333 primary prostate cancers has confirmed previously defined molecular subtypes across multiple genomic platforms and identified novel alterations and subtype diversity. It provides a resource for continued investigation into the molecular and biological heterogeneity of the most common cancer in American men.

Experimental Procedures

Tumor and matched normal specimens were obtained from prostate cancer patients who provided informed consent and were approved for collection and distribution by local Institutional Review Boards. Blocks frozen in OCT were made of all tumors and of paired benign tissue when present. A 5 micron section was cut from both the top and bottom of the OCT block of 111 tumor cases and from the top or bottom only of the OCT block of 222 tumor cases. Out of 39 normal samples included in the freeze, 23 underwent pathology review, and prostate origin (i.e., no seminal vesicles) and absence of tumor and high grade prostate intraepithelial neoplasia (HGPIN) were confirmed. Tissue images were reviewed by eight genitourinary pathologists, who reported the primary and secondary Gleason patterns of cancer for each slide and estimates of tumor cellularity in 10% increments (from 0–100%). In case of discrepancies in Gleason score between the top and bottom sections, the Gleason score of the cancer in the section with the largest area of tumor was used. A subset of 54 cases was reviewed by two pathologists. Discrepancies that occurred between the two pathologists were reconciled by blind review by a third pathologist.

DNA, RNA, and protein were purified and distributed throughout the TCGA network. Samples with evidence for RNA degradation were excluded from the study (Supplementary Methods). In total, 333 primary tumors with associated clinicopathologic data were assayed on at least four molecular profiling platforms. Platforms included exome and whole genome DNA sequencing, RNA sequencing, miRNA sequencing, SNP arrays, DNA methylation arrays, and reverse phase protein arrays. Integrated multiplatform analyses were performed.

The data and analysis results can be explored through the Broad Institute FireBrowse portal (http://firebrowse.org/?cohort=PRAD), the cBioPortal for Cancer Genomics (http://www.cbioportal.org/study.do?cancer_study_id=prad_tcga_pub), TCGA Batch Effects (http://bioinformatics.mdanderson.org/tcgambatch/), Regulome Explorer (http://explorer.cancerregulome.org/) and Next-Generation Clustered Heat Maps (http://bioinformatics.mdanderson.org/TCGA/NGCHMPortal/). See also Supplemental Information and the TCGA publication page (https://tcga-data.nci.nih.gov/docs/publications/prad_2015/).

Acknowledgments

We are grateful to all the patients and families who contributed to this study. We thank Margi Sheth and Ina Felau for project management. Supported by the following grants from the United States National Institutes of Health: 5U24CA143799, 5U24CA143835, 5U24CA143840, 5U24CA143843, 5U24CA143845, 5U24CA143848, 5U24CA143858, 5U24CA143866, 5U24CA143867, 5U24CA143882, 5U24CA143883, 5U24CA144025, U54HG003067, U54HG003079, U54HG003273, and P30CA16672.

The Cancer Genome Atlas Research Network, The Cancer Genome Atlas Program Office, National Cancer Institute at NIH, 31 Center Drive, Bldg. 31, Suite 3A20, Bethesda, MD 20892, USA;
Corresponding authors: Nikolaus Schultz (schultz@cbio.mskcc.org), Massimo Loda (massimo_loda@dfci.harvard.edu), Chris Sander (chris@sanderlab.org)

Footnotes

Author Contributions

Project leaders: M. Loda and C. Sander; analysis leaders: N.S. and B.S.T.; project coordinators: I.F. and M. Sheth; data coordinators: J. Armenia and N.S.; supplement coordinator: J. Armenia; pathology review: M.I., M. Loda, V.E.R., B.R., M.A.R., P. Troncoso, L.D.T., and H.Y.; clinical data: J. Bowen, N.M.C., L.I., K.M.L., and T.M.L.; tumor cellularity analysis: J. Ahn, P.C.B., A.D.C., F.D., S.H., D.P., M.A.R., J. Shih, S.T., and W.W.; exome sequencing analysis: M. Gupta, C. Sougnez, and E.V.A.; whole-genome sequencing analysis: J. Armenia, Y.F., M.B.G., M. Gupta, E.K., L. Lochovsky, A. Pantazi, C. Sougnez, and E.V.A.; copy-number analysis: A.D.C., and J. Shih; mRNA expression and fusion analysis: A. Sboner, N.S., M.D.W., and C.-C.W.; androgen receptor analysis: J. Armenia, R.K.B., H.D., R.A.M., A.J.M., P.S.N., and N.S.; DNA methylation analysis: P.W.L., S.K.R., and H.S.; miRNA analysis: A.G.R.; RPPA analysis: R.A., W.L., Y.L., and G.B.M.; integrative analysis: A. Arora, A.D.C., D.I.H., L.I., N.S., R. Shen, B.S.T., and M.D.W.; batch effects analysis: R.A., A.K., K.-V.L., S. Ling, A.R., and J.N.W.; mRNA degradation analysis: A.K., K.-V.L., G.R., M.A.R., N.S., and M.D.W.; manuscript writing: C.E.B., C.C.B., P.R.C., Y.C., A.D.C., F.D., S.M.L., M. Loda, J. Simko, P.S.N., S.K.R., A.G.R., M.A.R., C. Sander, A. Sboner, N.S., B.S.T., S.A.T., E.V.A., and J.N.W.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
