Cell Rep 27(6): 1781-1793e4.

PMC: PMC6546434

PMID: 31067463

Dysregulation of EMT Drives the Progression to Clinically Aggressive Sarcomatoid Bladder Cancer

June Lee+16 authors

INTRODUCTION

Bladder cancer is the ninth most common cancer worldwide, affecting 430,000 people and resulting in 165,000 deaths annually (Ferlay et al., 2013). In the United States, it is the fourth most common cancer in men, with an estimated incidence of 81,000 cases in 2018 (Siegel et al., 2018). More than 90% of bladder cancers are urothelial carcinomas (UCs), which originate from precursor lesions in the epithelial layer of the bladder called the urothelium (Grignon et al., 2016). They progress along two distinct tracks, referred to as papillary and non-papillary, that represent clinically and molecularly different forms of the disease (Czerniak et al., 2016). Non-invasive papillary tumors have a high tendency for recurrence, which necessitates lifetime surveillance that is both intrusive and costly to the patient. Non-papillary carcinomas are clinically aggressive, exhibiting a high propensity for invasive growth, and a large proportion of them are lethal because of metastatic spread (Kamat et al., 2016). We showed that papillary tumors are almost exclusively of a luminal molecular subtype that recapitulates the expression pattern of markers characteristic of normal, intermediate, and terminal urothelial differentiation (Choi et al., 2014b). In contrast, non-papillary UCs are of a basal molecular subtype and exhibit an expression pattern of genes characteristic of the normal basal urothelial layer. Molecular subtyping has shown that invasive UCs can be almost equally divided into luminal and basal subtypes that have distinct clinical behaviors and responses to frontline chemotherapy (Cancer Genome Atlas Research Network, 2014; Choi et al., 2014b; Damrauer et al., 2014; Robertson et al., 2017). In addition to conventional UCs, many microscopically distinct bladder cancer variants have been described, and in general they are thought to develop via progression of conventional disease (Amin, 2009; Grignon et al., 2016). 
The most frequent of these variants are sarcomatoid, small cell, and micropapillary, all of which are clinically more aggressive than conventional UCs and require uniquely tailored therapeutic management, which is often unavailable (Amin, 2009; Grignon et al., 2016; Kamat et al., 2016).

In this report, we focus on one of these more common variants, referred to as sarcomatoid carcinoma (SARC) (Amin, 2009; Grignon et al., 2016). SARC represents, in various published series, 5%–15% of bladder cancer and frequently coexists with conventional UC (Lopez-Beltran et al., 1998; Sanfrancesco et al., 2016). Clinically, it has a predilection for early metastatic spread to distant organs, and it is associated with shorter survival compared with conventional UC (Amin, 2009; Lopez-Beltran et al., 1998; Sanfrancesco et al., 2016). Here, we report on the genome-wide characterization of bladder SARC, including its miRNA, gene expression, and whole-exome mutational profiles, which identified molecular features associated with its aggressive nature that may be relevant for the early detection and treatment of this highly lethal variant of bladder cancer.

RESULTS

Fifty-two paraffin-embedded SARC tissue samples and 84 invasive conventional bladder UC samples from the MD Anderson Cancer Center cohort were analyzed retrospectively. A cohort of 408 muscle-invasive bladder cancers in The Cancer Genome Atlas (TCGA) was used as a reference. The samples were characterized by clinical and pathological data as well as by several genomic platforms. Sufficient high-quality DNA was available for 13 SARC cases and 5 paired SARC and conventional UC cases for whole-exome sequencing. Gene expression profiling was performed on 28 of the cases using Illumina’s DASL platform, and the data were merged with those obtained from a cohort of 84 conventional UCs. Panel quantitative RT-PCR was used to analyze the miRNA expression levels of all 52 SARC samples and 80 conventional UC samples.

Mutational Signature

The mutational profile of conventional UC was characterized by significant levels of recurrent somatic mutations in 30 genes (Figure 1A). The 10 most frequently mutated genes in UC were TP53 (47%), ARID1A (25%), KDM6A (22%), PIK3CA (22%), RB1 (17%), EP300 (15%), FGFR3 (14%), STAG2 (14%), ELF3 (12%), and CREBBP (11%). The overall mutational landscapes of luminal and basal bladder UC were similar, but several mutated genes were distinctively enriched in specific molecular subtypes. Mutated FGFR3, ELF3, CDKN1A, and TSC1 genes were enriched in luminal tumors, whereas mutated TP53, RB1, and PIK3CA genes were enriched in basal tumors (Figure 1A). SARCs showed high overall mutational rates (median mutational frequency 259, interquartile range 174), and their significantly mutated genes were similar to those observed in conventional UC (Figure 1B; Table S1). However, the top three genes—TP53 (72%), PIK3CA (39%), and RB1 (39%)—were mutated at significantly higher frequencies in SARCs than they were in conventional UCs (p < 0.01). This suggests that SARC evolved from precursor conventional UC carrying these mutations and that mutations in these genes may drive the progression process. Several of the genes that are frequently mutated in conventional UC, including ARID1A, KDM6A, EP300, ELF3, and CREBBP, were not mutated in SARC, and these genes are involved in chromatin remodeling (Dutto et al., 2018; Ler et al., 2017; Skulte et al., 2014; Tang et al., 2013; Wang et al., 2017; Zheng et al., 2018). In general, as a group, chromatin-remodeling genes were not mutated in SARC. Instead, SARC carried frequent mutations of MYO1F (33%), CNGB1 (22%), FXR1 (22%), RBM5 (22%), SEMA3D (22%), TTBK2 (22%), and ZNF90 (22%), which are involved in cellular motility, RNA binding, transmembrane ion channeling, kinase activity, and signaling (Ahonen et al., 2013; Bechara et al., 2013; Bouskila et al., 2011; Hamm et al., 2016; Kim et al., 2006; Qian et al., 2015; Somma et al., 1991). 
The functional significance of the mutations of these genes for sarcomatoid progression remains unclear, but they are attractive candidates for future mechanistic studies. Interestingly, FGFR3 mutations, which were present in 14% of conventional UCs, were not present in SARC. Nearly all mutations present in the SARCs were also present in the paired precursor conventional UCs of the same patients, indicating that the SARC and the presumed precursor lesions were clonally related.

Figure 1.

Mutational Landscape of SARC

(A) Mutational landscape among the molecular subtypes of 408 muscle-invasive bladder cancers from the TCGA cohort showing the frequency of mutations in individual tumors and somatic mutations for significantly mutated genes. The frequencies of mutations of individual genes in the luminal and basal subtypes are shown on the left. Asterisks denote the genes with statistically significant mutation frequencies in the luminal and basal subtypes. Bars on the right show the numbers of specific substitutions for individual genes.

(B) Mutational landscape of 13 cases of SARC and 5 paired samples of precursor conventional UC showing the frequency of mutations in individual genes and somatic mutations for significantly mutated genes. The frequencies of mutations of individual genes are shown on the left. Bars on the right show the numbers of specific substitutions for individual genes.

(C) Composite bar graphs showing the distribution of all nucleotide substitutions in three sets of samples corresponding to the TCGA cohort, paired precursor conventional UC, and SARC.

(D) Proportion of single-nucleotide variants (SNVs) in specific nucleotide motifs for each category of substitution in three sets of samples as shown in (C).

(E) False discovery rate (FDR) for specific nucleotide motifs in three sets of samples as shown in (C).

(F) Average weight scores of mutagenesis patterns in three sets of samples as shown in (C).

(G) Weight scores of mutagenesis patterns in individual tumor samples of the TCGA cohort.

(H) Weight scores of mutagenesis patterns in SARCs and paired precursor conventional UCs.

(I) Statistical significance of mutagenesis patterns (p value) in SARC compared with conventional UC. For (E) and (I), p value was calculated using Wilcoxon rank-sum and Kruskal-Wallis tests, respectively.

Mechanisms of Mutagenesis

To further characterize the mutational process associated with progression from conventional UC to SARC, we examined six single-base substitutions (C > A, C > G, C > T, T > A, T > C, and T > G) in all cancer samples (Alexandrov et al., 2013; Nik-Zainal et al., 2012). The results revealed that SARCs were enriched with C > T mutations compared with conventional UCs, and this increase was already apparent in the precursor conventional UCs that were associated with SARCs (Figures 1C and 1D). Analyses of Sanger mutational signatures (Faltas et al., 2016) revealed the presence of six dominant signatures in the conventional UCs in the TCGA cohort: signatures 1, 2, 3 (BRCA1/2 mutagenesis), 13 (APOBEC), 19, and 30 (Figures 1F and 1G). Clustering separated the conventional tumors into two subsets (a and b) that were characterized by different levels of signature 13 (APOBEC) prevalence. In contrast, SARCs and paired precursor conventional UCs were characterized by the uniform dominance of signature 1, which was present in all SARC and precursor conventional UC samples (Figure 1H). In addition, clustering segregated SARCs into two subsets, and they were also characterized by different levels of APOBEC activity. Mutagenesis signatures 1 and 19 were significantly enriched in SARCs compared with conventional UC (Figure 1I). Overall, these data suggest that SARCs evolve from a distinct subset of conventional UCs.
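The six substitution classes listed above arise from collapsing the twelve possible base changes onto the pyrimidine (C or T) of the reference base pair, which is the usual convention in mutational signature work. The following is a minimal Python sketch of that bookkeeping, not the pipeline used in this study; the function names and the toy variant list are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")

def substitution_class(ref, alt):
    """Collapse a single-base change onto the pyrimidine strand."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: report the change as seen on the opposite strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"

def substitution_spectrum(variants):
    """Fraction of variants in each of the six classes."""
    counts = Counter(substitution_class(r, a) for r, a in variants)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in CLASSES}

# Hypothetical toy input: (reference base, alternate base) pairs
toy_variants = [("C", "T"), ("G", "A"), ("T", "C"), ("A", "C")]
spectrum = substitution_spectrum(toy_variants)
```

In this toy input, G>A is counted as C>T and A>C as T>G, so half of the variants fall in the C>T class.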

Gene Expression Profile

mRNA expression profiling revealed that more than 6,000 genes were differentially expressed between SARC and UC. We performed multiple unsupervised clustering analyses using all the differentially expressed genes in SARC and performed similar analyses using the top 100 and top 10 upregulated and downregulated genes. All these analyses separated SARC and UC into two distinct clusters (Figures S1A and S1B). One cluster contained conventional UC almost exclusively, whereas the other cluster contained most of the SARCs. Among the top upregulated genes in the SARC cluster were FAM101B (or RFLNB), UHRF1, and PHC2, all of which are members of the chromatin-remodeling superfamily (Benavente et al., 2014; Gay et al., 2011; Isono et al., 2005) (Figure S1B). The top downregulated genes included the differentiation-associated transcription factor, ELF3, and genes involved in terminal urothelial differentiation, such as uroplakins and cell adherence genes, as well as LAD1, a component of anchoring filaments in the basement membrane (Hogg et al., 1999). The median survival for the SARC patients (11 months) was significantly shorter than that of the patients with conventional UC (24 months) (p = 0.0326) (Figure S1C).

Intrinsic Molecular Subtypes

Several molecular studies, including our own, have divided bladder cancer into two molecular subtypes that preferentially express basal or luminal genes (Cancer Genome Atlas Research Network, 2014; Choi et al., 2014a; Damrauer et al., 2014; Robertson et al., 2017). To investigate whether the intrinsic molecular types of conventional bladder UC applied to SARC, we analyzed the luminal and basal gene expression signatures in SARC. We used a previously developed classifier that includes markers of luminal and basal types (Choi et al., 2014a). The set of conventional UC was separated into two major groups (Figure 2A). The first group, comprising 53 of the 84 samples (63%), was characterized by high mRNA expression levels of luminal markers such as KRT20, GATA3, uroplakins, ERBB2, ERBB3, PPARG, FOXA1, and XBP1 and was referred to as the luminal subtype. The remaining 31 conventional UC samples (37%) were characterized by high expression levels of basal markers such as CD44, CDH3, KRT5, KRT6, and KRT14 and were referred to as the basal subtype.

Figure 2.

Luminal and Basal Molecular Subtypes in Conventional UC and SARC

(A) The expression of luminal and basal markers in molecular subtypes of conventional UC (n = 84) and SARC (n = 28).

(B) Kaplan-Meier analysis of molecular subtypes of conventional UC and SARC and log-rank testing. A p value < 0.05 was considered statistically significant.

(C) The immunohistochemical expression of signature luminal and basal markers in representative luminal and basal cases of conventional UC and in representative basal and double-negative SARC. Scale bars indicate 40 μm.

Unlike the conventional UCs, all 28 SARCs had low mRNA expression levels of the luminal genes, which suggested that they developed from a basal subtype of precursor conventional UC (Figure 2A). In addition, nearly half of the SARCs (12 [43%]) were characterized by the retained expression of basal keratins, CD44, and P-cadherin (CDH3), whereas the remaining SARCs (16 [57%]) lacked expression of canonical basal and luminal markers, and we therefore referred to them as “double-negative.” This subset of SARCs shares the “mesenchymal” molecular phenotype with the claudin-low and TCGA cluster IV subtypes identified previously (Cancer Genome Atlas Research Network, 2014; Damrauer et al., 2014; Kardos et al., 2016). Survival analysis revealed that the double-negative mesenchymal SARCs were the most aggressive of the molecular subtypes (Figure 2B). Furthermore, the mean survival duration of patients with double-negative SARC (10 months) was shorter than that of patients with basal SARC (18 months); however, this difference was not statistically significant, probably because of the limited number of cases. We verified the expression patterns of the signature luminal marker (GATA3) and basal markers (p63, KRT5/6, KRT14) by immunohistochemistry using tissue microarrays containing the same cases (Figure 2C). The epithelial-basal SARCs were focally positive for basal markers such as p63, KRT5/6, and KRT14 and were negative for the signature luminal marker GATA3. In contrast, the purely mesenchymal double-negative SARCs were immunohistochemically negative for all luminal and basal markers, consistent with the RNA expression data.
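The survival comparisons above rest on Kaplan-Meier estimates. As a hedged illustration of how such a curve and its median are obtained, here is a minimal product-limit estimator in Python. The follow-up times and event flags are invented toy values, not patient data from this cohort, and the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time.

    times  -- follow-up in months
    events -- 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e)
        leaving = sum(1 for tt, _ in data if tt == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored cases both leave the risk set
        i += leaving
    return curve

def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

# Hypothetical toy cohort (months, event flag)
toy_times = [2, 4, 4, 6, 9, 12]
toy_events = [1, 1, 0, 1, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
```

Censored patients reduce the number at risk without lowering the curve, which is why small subgroups with few events, as in the basal versus double-negative comparison, give wide uncertainty.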

Canonical and Upstream Regulator Pathways

To identify candidate mechanisms underlying the progression to the SARC phenotype, we used the Ingenuity Pathway Analysis (IPA) upstream pathways and gene set enrichment analysis (GSEA) functions to identify signaling pathways associated with the gene expression patterns observed (Figures 3A–3D). SARCs were characterized by downregulation of G1/S checkpoint genes and upregulation of the INK4 family of cyclin-dependent kinase inhibitors (CDKN2A-D), consistent with our observation that RB1 mutations were more common in SARCs. Similarly, SARCs exhibited decreased p53 pathway activity, which was consistent with their high mutational rate of the p53 gene. SARCs were also characterized by loss of epithelial adherens genes and downregulation of p63 pathway activity complemented with activation of TGFβ and RhoA, all of which are characteristic of an epithelial-mesenchymal transition (EMT) permissive state.

Figure 3.

Enrichment of Canonical Pathways and Transcriptional Regulators in SARC Compared with Conventional UC

(A) The top ten canonical pathways dysregulated in SARC, as revealed by IPA and GSEA.

(B) The top ten upstream regulators altered in SARC, as revealed by IPA and GSEA.

(D) Expression pattern of epithelial adherens genes in molecular subtypes of conventional UC and SARC.

The dotted lines indicate the significance level. For (A) and (B), p values were calculated using Fisher’s exact test.

Immune Infiltrate

Immune checkpoint blockade is clinically active in about 15% of patients with advanced bladder cancer, in which response is associated with high tumor mutational burden (TMB), the “genomically unstable” luminal subtypes, and infiltration with activated cytotoxic T lymphocytes (Mariathasan et al., 2018; Sharma et al., 2016; Sjödahl et al., 2012). Given the relatively high mutational frequencies observed in SARCs, we characterized the patterns of immune-related gene expression in them (Figure 4). In conventional UCs, 11% (9 of 84) showed enrichment of an immune gene expression signature and clustered in the luminal molecular subtype. In contrast, 39% of the SARCs (9 of 28) demonstrated overexpression of immune signature genes and preferentially clustered in the mesenchymal double-negative subtype. The enrichment of the immune gene signature in SARC compared with conventional UC was confirmed by GSEA (Figure 5A). We additionally validated these findings by calculating an immune score and showing its increase predominantly in the double-negative, purely mesenchymal SARC (Figure 5B). More in-depth analysis of the immune infiltrate was performed using the CIBERSORT algorithm, which provided a quantitative assessment of 22 immune cell types (Figure 5C) (Newman et al., 2015). It confirmed the enrichment of immune infiltrate in SARC compared with conventional UC (Figure 5D). Most importantly, it identified a significant increase of CD8+ cytotoxic T lymphocytes in the double-negative, purely mesenchymal subtype, which is the most aggressive variant of SARC (Figure 5E). We also evaluated the expression signature of immune checkpoint ligands and their receptors (Sharma and Allison, 2015), including CD70 and CD27, CD80 and CTLA4, TNFSF9 and TNFRSF18, ADA and ADORA2A, and PDCD1G2 and CD274 (Figure 5F). In general, the overexpression of these genes was present in the same group of SARCs characterized by enrichment for the overexpression of immune signature genes. 
Of potential therapeutic significance, SARC exhibited the enhanced expression of programmed cell death ligand PD-L1 (CD274), and mRNA overexpression of PD-L1 was observed in more than 50% of the SARCs (15 of 28) (Figure 5G). This was confirmed by immunohistochemistry, which showed strong overexpression of the PD-L1 protein (Figure 5H), suggesting that immune checkpoint therapy may be an attractive therapeutic option for a subset of SARC patients.
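Comparisons of proportions such as those quoted above (for example, 9 of 84 conventional UCs versus 9 of 28 SARCs with an immune-enriched signature) are commonly assessed with Fisher's exact test, which the figure legends name for some panels. The sketch below is a plain-Python two-sided version shown for illustration only; the text does not state that this test was applied to these particular counts, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k first-column cases in row 1
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )

# Rows: conventional UC, SARC; columns: immune-enriched, not enriched
p_immune = fisher_exact_two_sided(9, 84 - 9, 9, 28 - 9)
```

The small relative tolerance in the comparison guards against floating-point ties when deciding which tables count as "at least as extreme".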

Figure 4.

Expression Pattern of Immune Cell Infiltrate in Molecular Subtypes of Conventional UC and SARC

Top to bottom: B cell, T cell, CD8, MacTH1, and dendritic cell expression clusters. Boxed areas identify samples with enrichment of immune cell infiltrate.

Figure 5.

Expression Patterns of Immune Checkpoint Genes in Conventional UC and SARC

(A) GSEA of immune cell infiltrate of SARC compared with conventional UC. A p value < 0.05 was considered statistically significant.

(B) Boxplot of immune scores calculated using the gene expression profile shown in Figure 4 for molecular subtypes of conventional UC and SARC.

(C) Heatmap of CIBERSORT scores for 22 immune cell types in molecular subtypes of conventional UC and SARC. A p value < 0.05 was considered statistically significant.

(D) Proportion of cases with significant CIBERSORT scores in conventional UC and SARC.

(E) Boxplot of CIBERSORT scores for CD8+ T cells in molecular subtypes of conventional UC and SARC.

(F) Expression of immune checkpoint genes in conventional UC and SARC in relation to their molecular subtypes. The black boxes indicate the cases with enrichment for immune profile shown in Figure 4.

(G) Boxplot of mRNA PD-L1 expression levels in conventional UC and SARC.

(H) Examples of immunohistochemical expression of PD-L1 in conventional UC and SARC. Scale bars indicate 40 μm.

For (A), p value was calculated using the standard GSEA method, for (B) and (E) using the Kruskal-Wallis test, for (D) using Fisher’s exact test, and for (G) using the Wilcoxon rank-sum test.

MicroRNA Expression Profile

Similar to the gene expression profile, microRNA (miRNA) expression levels were widely dysregulated in SARC compared with conventional UC. Nearly 200 individual miRNA species were either overexpressed or downregulated in SARC (Figure S2A). In unsupervised clustering, a small subset of differentially expressed miRNAs could bimodally separate SARC from conventional UC (Figure S2B). Although the biological significance of the miRNAs that were upregulated in SARCs is not clear, the entire miR-200 family was downregulated in the double-negative subset of SARCs (Figure S2B), and it is likely that it plays an important role in their progression. Members of the miR-200 family play well-established roles in the maintenance of epithelial phenotype by suppressing the transcriptional regulators of EMT of the SNAIL, TWIST, and ZEB families (Peinado et al., 2007).

Dysregulation of the EMT Network

Microscopically, SARCs comprise mixed lineages of purely mesenchymal cells and cells with at least partial retention of an epithelial phenotype, reflecting various degrees of EMT (Sung et al., 2013). We previously showed that TP63 controls the expression of high-molecular weight keratins (KRT5, KRT6, and KRT14) and suppresses EMT (Choi et al., 2014b; Tran et al., 2013). The central role of p63 in the maintenance of epithelial phenotype and EMT was confirmed in several variants of solid tumors (Soares and Zhou, 2018; Stacy et al., 2017; Tanaka et al., 2018). We therefore performed additional analyses to further characterize the role of EMT in SARC. Among the EMT transcriptional regulators of the SNAIL, TWIST, and ZEB families, SNAIL2 was significantly overexpressed in SARC compared with conventional UC. Correspondingly, E-cadherin (CDH1) and other epithelial markers, including claudin-1 (CLDN1) and tight junction protein 1 (TJP1), were downregulated in SARC (Figure 6A; Figures S3 and S4A). In solid tumors, the EMT permissive state may be controlled by p53 and RB1 (Jiang et al., 2011), consistent with the mutational and gene expression characteristics of SARCs. Dysregulation of several additional pathways may have a synergistic effect on the EMT permissive state, including the upregulation of TGFB1 and RhoA, which were among the top upregulated pathways in SARC by IPA and GSEA. In addition, p63 and its downstream target genes were downregulated in SARC (Figures 3A, 3B, and 6A). Importantly, SARC had widespread downregulation of miRNAs involved in EMT, including miR-100, the p63 targets miR-203 and miR-205, and all members of the miR-200 family (Tucci et al., 2012) (Figure 6A; Figure S4B). 
Because SARCs can be divided into two subgroups (i.e., SARCs that retain epithelial marker expression, reflecting partial EMT, and SARCs that are purely mesenchymal, reflecting complete EMT), we quantified the EMT levels across multiple tumor samples by calculating EMT scores on the basis of a 76-gene signature identified by Byers et al. (2013) (Figure 6B). This provided a quantitative assessment of the epithelial versus mesenchymal phenotype; positive EMT scores corresponded to the epithelial phenotype, whereas negative EMT scores reflected the mesenchymal phenotype. In general, the conventional UCs were characterized by positive EMT scores corresponding to their epithelial phenotype, whereas the basal and double-negative mesenchymal SARCs had intermediate and low EMT scores, reflecting their partial and complete EMT states (Figure 6C). Essentially identical results were obtained by GSEA using the 175-gene EMT signature developed by Yu et al. (2013). The enrichments of SARCs compared with conventional UCs, as well as SARCs of the basal (partial EMT) subtype compared with the double-negative (complete EMT) subtype, were significantly different (Figures S4C and S4D). Immunohistochemistry revealed that tumors within the epithelial SARC subset showed focal retention of p63 and E-cadherin, both of which are involved in the maintenance of epithelial differentiation, whereas double-negative, purely mesenchymal SARCs were negative for the epithelial markers, confirming these subtypes’ partial and complete EMT states, respectively (Figure 6D). Finally, the loss of E-cadherin expression was confirmed in selected SARC samples by western blotting (Figure 6E).
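To make the sign convention of the EMT score concrete, the sketch below computes a simplified stand-in score in Python: the mean z-score of a few epithelial genes minus the mean z-score of a few mesenchymal genes, so that positive values indicate an epithelial phenotype and negative values a mesenchymal one. This is not the 76-gene Byers et al. method used in the study; the six-gene panel, the expression values, and the function names are assumptions chosen only for illustration.

```python
from statistics import mean, pstdev

# Assumed illustrative panel, not the published 76-gene signature
EPITHELIAL = ["CDH1", "CLDN1", "TJP1"]
MESENCHYMAL = ["SNAI2", "ZEB1", "TWIST1"]

def zscores(values):
    """Standardize one gene's expression across samples."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]

def emt_scores(expr):
    """expr maps gene -> list of expression values, one per sample.

    Returns one score per sample: positive = epithelial-like,
    negative = mesenchymal-like (simplified stand-in score).
    """
    z = {g: zscores(v) for g, v in expr.items()}
    n_samples = len(next(iter(expr.values())))
    scores = []
    for i in range(n_samples):
        epi = mean([z[g][i] for g in EPITHELIAL])
        mes = mean([z[g][i] for g in MESENCHYMAL])
        scores.append(epi - mes)
    return scores

# Hypothetical log-scale expression for three samples:
# conventional UC-like, basal SARC-like, double-negative SARC-like
toy_expr = {
    "CDH1":   [9.0, 6.0, 2.0],
    "CLDN1":  [8.0, 5.5, 2.5],
    "TJP1":   [8.5, 6.0, 3.0],
    "SNAI2":  [2.0, 5.0, 9.0],
    "ZEB1":   [1.5, 4.0, 8.0],
    "TWIST1": [2.0, 4.5, 8.5],
}
toy_scores = emt_scores(toy_expr)
```

With these toy values the three samples are ordered from highest to lowest score, mirroring the epithelial, partial-EMT, and complete-EMT gradient described in the text.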

Figure 6.

Dysregulation of the EMT Regulatory Network in Molecular Subtypes of Conventional UC and SARC

(A) Expression patterns of representative genes in the EMT regulatory network.

(B) EMT scores in molecular subtypes of conventional UC and SARC.

(C) Boxplot of EMT scores in molecular subtypes of conventional UC and SARC. A p value < 0.05 was considered statistically significant.

(D) Examples of immunohistochemical expression of p63 and E-cadherin in conventional UC and SARC.

(E) Western blot documenting loss of E-cadherin expression in SARC.