A stratified genetic risk assessment for testicular cancer
Introduction
Testicular germ cell tumours (TGCT) constitute the most common cancer in young men in many countries. White race, previous TGCT, positive family history, infertility, cryptorchidism and testicular microlithiasis are known risk factors [reviewed in (McGlynn & Cook, 2009; Greene et al., 2010)]. Although both genetic and environmental factors contribute to TGCT development, the genetic component is strong. Sons of men with TGCT have a four- to sixfold increased risk of TGCT compared with the general population, and brothers of affected men have an 8- to 10-fold increased risk (Forman et al., 1992; Heimdal et al., 1996). Dizygotic and monozygotic twin brothers of men with TGCT are reported to have 37-fold and 76.5-fold elevated risks of TGCT, respectively (Swerdlow et al., 1997). Despite this strong familial relative risk, a genome-wide genetic linkage study of 179 families by the International Testicular Cancer Linkage Consortium did not uncover a major, highly penetrant gene predisposing to TGCT (Crockford et al., 2006). Therefore, it has been suggested that TGCT is a genetically complex trait.
Consistent with the idea that multiple loci contribute to TGCT risk, a candidate study has identified the chromosome Y AZFc region (gr/gr deletion) as a TGCT risk locus (OR = 3.2 and 2.1 in familial and sporadic TGCT, respectively) (Nathanson et al., 2005). In addition, three TGCT genome-wide association studies (GWAS) from the UK (Rapley et al., 2009; Turnbull et al., 2010) and the US (Kanetsky et al., 2009) have revealed several new risk variants. Single nucleotide polymorphisms (SNPs) with significant associations were identified in or near KITLG (ligand for the tyrosine kinase KIT), SPRY4 (an inhibitor of the mitogen-activated protein kinase pathway acting downstream of KITLG-KIT) (Kanetsky et al., 2009; Rapley et al., 2009) and BAK1 (also acting downstream of KITLG) (Rapley et al., 2009). Additional risk variants were discovered in or near DMRT1 (involved in sex determination), TERT and ATF7IP (both involved in telomere maintenance) (Turnbull et al., 2010). The TERT locus on chromosome 5 has previously been associated with various other cancers (McKay et al., 2008; Wang et al., 2008; Landi et al., 2009; Rafnar et al., 2009; Shete et al., 2009; Stacey et al., 2009; Petersen et al., 2010). A summary of recently uncovered variants is provided in Table 1. Notably, the calculated association effect size of variants in the region of KITLG (OR = 2.69) is the highest reported for any cancer to date (Chanock, 2009). Here, we used a statistical approach to examine whether TGCT risk variants are of clinical value. For this analysis, we simulated data based on published odds ratios (OR) and risk allele frequencies in a Caucasian population, assuming that alleles combine multiplicatively both within and between SNPs. Our analysis suggested that genetic testing may not be useful in the general population, but might be of value in a small group of men with underlying clinical risk factors.
Table 1
Summary of known testicular cancer risk loci
Locus | Implicated gene | Non-risk/risk allele | Risk allele frequency | Per allele odds ratio |
---|---|---|---|---|
rs2736100 | TERT | G/T | 0.49 | 1.33 |
rs4635969 | TERT/CLPTM1L | C/T | 0.20 | 1.54 |
rs755383 | DMRT1 | C/T | 0.63 | 1.37 |
rs2900333 | ATF7IP | T/C | 0.62 | 1.27 |
rs4624820 | SPRY4 | G/A | 0.55 | 1.37 |
rs210138 | BAK1 | A/G | 0.20 | 1.50 |
rs995030 | KITLG | A/G | 0.80 | 2.69 |
Y AZFc region | several | – | 0.013 | 2.10 |
Material and methods
Selection of genetic risk variants
We considered a genetic risk model with seven SNPs and one deletion (Table 1). The first GWAS published by investigators from the UK (Rapley et al., 2009) reported evidence that the two SNPs, rs995030 and rs1508595, in the KITLG locus were independently associated with disease risk. As these two SNPs are in high linkage disequilibrium (r = 0.85), we conservatively chose only rs995030 for the risk model. For the white population, the ORs of the risk alleles were estimated from the replication studies (Nathanson et al., 2005; Kanetsky et al., 2009; Rapley et al., 2009; Turnbull et al., 2010) to avoid the bias caused by the winner’s curse in GWAS. All genetic variants were in linkage equilibrium.
Receiver operating characteristic curves for genetic risk models
For the $i$th genetic variant, we let $\phi_i$ denote the per-allele OR and $G_i$ denote the genotype, that is, the number of risk alleles. Here, $G_1, \dots, G_7 \in \{0, 1, 2\}$ are the genotypes for the SNPs and $G_8 \in \{0, 1\}$ is the genotype for the deletion. We assume that the SNPs follow the Hardy-Weinberg equilibrium (HWE) law in the general population. As TGCTs are rare, the HWE law holds approximately in the control population. Hence, for the $i$th SNP with risk allele frequency $p_i$, the frequencies of the genotypes $\{0, 1, 2\}$ are $(1 - p_i)^2$, $2p_i(1 - p_i)$ and $p_i^2$. Under a multiplicative model, the relative risk for a given combination of genotypes at the eight loci is $\prod_{i=1}^{8} \phi_i^{G_i}$.
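The genotype model described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, the inputs are the allele frequencies and per-allele ORs from Table 1, and the deletion is assumed to be carried (G8 = 1) with the frequency listed in the table.

```python
from itertools import product

# (risk allele frequency p_i, per-allele odds ratio phi_i) taken from Table 1.
SNPS = [
    (0.49, 1.33),  # rs2736100, TERT
    (0.20, 1.54),  # rs4635969, TERT/CLPTM1L
    (0.63, 1.37),  # rs755383, DMRT1
    (0.62, 1.27),  # rs2900333, ATF7IP
    (0.55, 1.37),  # rs4624820, SPRY4
    (0.20, 1.50),  # rs210138, BAK1
    (0.80, 2.69),  # rs995030, KITLG
]
DELETION = (0.013, 2.10)  # Y AZFc region (gr/gr deletion)


def hwe_frequencies(p):
    """Genotype frequencies for 0, 1 and 2 risk alleles under HWE."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def locus_tables():
    """Per-locus lists of (genotype frequency, relative-risk factor phi**G)."""
    tables = []
    for p, phi in SNPS:
        tables.append([(f, phi ** g) for g, f in enumerate(hwe_frequencies(p))])
    p_del, phi_del = DELETION
    # Assumption: the deletion is either absent (G8 = 0) or present (G8 = 1).
    tables.append([(1 - p_del, 1.0), (p_del, phi_del)])
    return tables


def enumerate_genotypes():
    """Yield (population frequency, relative risk) for every genotype combination."""
    for combo in product(*locus_tables()):
        freq, rr = 1.0, 1.0
        for f, factor in combo:
            freq *= f   # independence across loci (linkage equilibrium)
            rr *= factor  # multiplicative model across loci
        yield freq, rr


combos = list(enumerate_genotypes())
print(len(combos))  # 4374 combinations (3**7 * 2)
print(sum(f for f, _ in combos))  # genotype frequencies should sum to about 1
```

Multiplying frequencies across loci relies on the stated linkage equilibrium between variants; correlated loci would need a joint genotype distribution instead.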
For a threshold $T$, a subject with genotypes $(G_1, \dots, G_8)$ is classified as a patient if his relative risk $\prod_{i=1}^{8} \phi_i^{G_i} > T$ and as a non-patient if $\prod_{i=1}^{8} \phi_i^{G_i} \le T$. The false-positive rate and the true-positive rate for the classifier are $X(T) = P\left(\prod_{i=1}^{8} \phi_i^{G_i} > T \mid \text{control}\right)$ and $Y(T) = P\left(\prod_{i=1}^{8} \phi_i^{G_i} > T \mid \text{case}\right)$. Hence, the false-positive rate and the true-positive rate are computed by summing the probabilities of $(G_1, \dots, G_8)$ satisfying $\prod_{i=1}^{8} \phi_i^{G_i} > T$ in the control population and in the case population, respectively. To compute the true-positive rate, we first derived the frequencies $q_i$ of the risk alleles in the patient population, assuming a multiplicative model and HWE, using the formula $q_i = p_i\phi_i/(1 - p_i + p_i\phi_i)$. As $T$ varies, $[X(T), Y(T)]$ traces the receiver operating characteristic (ROC) curve. The area under the curve (AUC) is computed numerically.
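A minimal Python sketch of this ROC construction follows. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the case allele frequency formula is also applied to the deletion, and the area is obtained with the trapezoidal rule as one possible numerical method. The printed AUC need not match the reported value to the last digit.

```python
from itertools import product

# (risk allele frequency p_i, per-allele OR phi_i, maximum allele count) from Table 1.
LOCI = [
    (0.49, 1.33, 2),   # rs2736100, TERT
    (0.20, 1.54, 2),   # rs4635969, TERT/CLPTM1L
    (0.63, 1.37, 2),   # rs755383, DMRT1
    (0.62, 1.27, 2),   # rs2900333, ATF7IP
    (0.55, 1.37, 2),   # rs4624820, SPRY4
    (0.20, 1.50, 2),   # rs210138, BAK1
    (0.80, 2.69, 2),   # rs995030, KITLG
    (0.013, 2.10, 1),  # Y AZFc deletion, treated as present/absent (assumption)
]


def genotype_freqs(p, max_count):
    """HWE frequencies for a SNP, or absent/present frequencies for the deletion."""
    if max_count == 2:
        return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]
    return [1 - p, p]


def case_allele_freq(p, phi):
    """Risk allele frequency among patients: q = p*phi / (1 - p + p*phi)."""
    return p * phi / (1 - p + p * phi)


def joint_rows():
    """Return (relative risk, control probability, case probability) per combination."""
    per_locus = []
    for p, phi, max_count in LOCI:
        controls = genotype_freqs(p, max_count)
        cases = genotype_freqs(case_allele_freq(p, phi), max_count)
        per_locus.append(
            [(phi ** g, controls[g], cases[g]) for g in range(max_count + 1)]
        )
    rows = []
    for combo in product(*per_locus):
        rr = p_control = p_case = 1.0
        for factor, c, k in combo:
            rr *= factor
            p_control *= c
            p_case *= k
        rows.append((rr, p_control, p_case))
    return rows


def roc_points(rows):
    """Accumulate (X, Y) = (false-positive, true-positive) rates as T is lowered.

    Combinations with tied relative risk add collinear intermediate points,
    which leave the trapezoidal area unchanged.
    """
    points = [(0.0, 0.0)]
    fpr = tpr = 0.0
    for _, p_control, p_case in sorted(rows, key=lambda r: r[0], reverse=True):
        fpr += p_control
        tpr += p_case
        points.append((fpr, tpr))
    return points


def area_under_curve(points):
    """Trapezoidal area under (x, y) points ordered by increasing x."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


points = roc_points(joint_rows())
auc = area_under_curve(points)
print(f"AUC = {auc:.3f}")
```

Because the model is discrete, the exact enumeration over all genotype combinations is cheap and avoids simulation noise.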
Testicular cancer risk prediction
We assume that the 5-year absolute risk is proportional to the relative risk, $R(G_1, \dots, G_8) = R_0 \prod_{i=1}^{8} \phi_i^{G_i}$, where $R_0$ is the baseline 5-year absolute risk for a subject with no risk allele. For each 5-year age category, the absolute risk for the population can be expressed as:
$$R_{\text{pop}} = R_0 \sum_{(G_1, \dots, G_8)} \prod_{i=1}^{8} p_i(G_i)\, \phi_i^{G_i},$$
where $p_i(G_i)$ is the genotype frequency under HWE, and the sum is over all $3^7 \times 2 = 4374$ combinations of $(G_1, \dots, G_8)$. For each 5-year age category, we obtained from SEER the absolute risk for general population white males, and then derived the baseline 5-year risk $R_0$ from the above equation. Let $A$ be the set of genotype combinations $(G_1, \dots, G_8)$ with the relative genetic risk in the top 1% of the screened population. Then the average 5-year absolute risk for this group is calculated as $R_0 \sum_{A} \prod_{i=1}^{8} p_i(G_i)\, \phi_i^{G_i} \big/ \sum_{A} \prod_{i=1}^{8} p_i(G_i)$. The average risk for the top 5, 10 and 20% of the population with the highest risk was computed using the same procedure.
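The calibration of the baseline risk and the averaging over the highest-risk group can be sketched as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, the population risk of 0.087% is the 5-year figure quoted in the Results for White men aged 30–34 years, and the genotype combination that straddles the percentile cut-off is assumed to be included fractionally. Because these details are assumptions, the output need not reproduce the reported estimates.

```python
from itertools import product

# (risk allele frequency p_i, per-allele OR phi_i, maximum allele count) from Table 1.
LOCI = [
    (0.49, 1.33, 2), (0.20, 1.54, 2), (0.63, 1.37, 2), (0.62, 1.27, 2),
    (0.55, 1.37, 2), (0.20, 1.50, 2), (0.80, 2.69, 2),
    (0.013, 2.10, 1),  # Y AZFc deletion, treated as present/absent (assumption)
]

POPULATION_5YR_RISK = 0.087  # percent; White men aged 30-34 years (from the Results)


def genotype_rows():
    """Return (relative risk, population frequency) for all genotype combinations."""
    per_locus = []
    for p, phi, max_count in LOCI:
        if max_count == 2:
            freqs = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]  # HWE
        else:
            freqs = [1 - p, p]
        per_locus.append([(phi ** g, f) for g, f in enumerate(freqs)])
    rows = []
    for combo in product(*per_locus):
        rr = freq = 1.0
        for factor, f in combo:
            rr *= factor
            freq *= f
        rows.append((rr, freq))
    return rows


def baseline_risk(rows, population_risk):
    """Solve R_pop = R0 * sum(freq * rr) for the baseline risk R0."""
    return population_risk / sum(rr * freq for rr, freq in rows)


def mean_risk_top_fraction(rows, r0, fraction):
    """Average absolute risk among the `fraction` of men with the highest relative risk."""
    remaining = fraction
    weighted_rr = 0.0
    for rr, freq in sorted(rows, reverse=True):
        take = min(freq, remaining)  # boundary combination is included fractionally
        weighted_rr += take * rr
        remaining -= take
        if remaining <= 0:
            break
    return r0 * weighted_rr / fraction


rows = genotype_rows()
r0 = baseline_risk(rows, POPULATION_5YR_RISK)
for fraction in (0.01, 0.05, 0.10, 0.20):
    risk = mean_risk_top_fraction(rows, r0, fraction)
    print(f"top {fraction:.0%}: mean 5-year risk = {risk:.3f}%")
```

The same routine can be rerun with the SEER risk of any other 5-year age category, since only `POPULATION_5YR_RISK` changes between categories.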
Results
Testicular cancer risk variants discriminate retrospectively between cases and controls
Using published GWAS and candidate gene data (Nathanson et al., 2005; Kanetsky et al., 2009; Rapley et al., 2009; Turnbull et al., 2010) and ROC curve analysis (Gail, 2008; Park et al., 2010; Wacholder et al., 2010), we estimated the AUC as a measure of discrimination between TGCT cases and controls. Random classification of case and control subjects corresponds to an AUC of 50%, whereas perfect classification corresponds to an AUC of 100%. We estimated that a TGCT risk model that included seven SNPs and the gr/gr deletion (Table 1) would have an AUC of 69.2%. This suggests that about 69.2% of the time, a randomly selected patient with TGCT has a higher estimated risk than a randomly selected control subject. Notably, this discriminative power is considerably higher than that calculated for other common cancers such as breast, prostate and colon cancer (Gail, 2008; Park et al., 2010; Wacholder et al., 2010).
Testicular cancer risk loci and risk prediction
Based on US cancer incidence rates from 2005 to 2007, ~0.49% of White men born today will be diagnosed with cancer of the testis at some time during their lifetime (1 in 204 men). For US Black men, the lifetime risk is considerably lower (0.08% or 1 in 1250 men) (http://seer.cancer.gov). As most of the men included in the published GWAS were White, our analysis focused on White men. Using published GWAS and candidate study data (Table 1) (Nathanson et al., 2005; Kanetsky et al., 2009; Rapley et al., 2009; Turnbull et al., 2010) and a multiplicative model, we estimated that White men in the top 1% of genetic risk (i.e. on average, 1 of 100 individuals is in the top 1% of genetic risk) had an average relative risk that was ~10.5-fold greater than that of the general White male population. In White men aged 30–34 years who were in the top 1% of genetic risk, the estimated 5-year risk of TGCT increased from 0.087% in the general population to 0.91% (Fig. 1a and b).
Figure 1
Estimated 5-year risk (%) of developing testicular germ-cell tumours in different age groups. The estimated risk is modestly increased in men with cryptorchidism (panel a), or with an elevated genetic risk (panel b). In contrast, the estimated risk is strongly increased in men with both cryptorchidism and elevated genetic risk (panel c).
Risk assessment based on clinical risk factors of testicular cancer and risk SNPs
We next addressed the question of whether the magnitude of the predicted risk (and the clinical utility) can be improved by incorporating both genetic and clinical risk factors. A previous meta-analysis estimated that the relative risk of TGCT among men with cryptorchidism was 4.8 (95% CI: 4.0–5.7) (Dieckmann & Pichlmeier, 2004). According to a recent cohort study, male factor infertility conferred a significantly increased risk of testicular cancer (standardized incidence ratio 2.8, 95% CI: 1.5–4.8) (Walsh et al., 2009). However, it is currently unknown whether the recently identified TGCT risk loci are associated with TGCT clinical risk factors such as cryptorchidism. Notably, a recent nationwide cohort study from Denmark did not support the hypothesis of shared heritability of cryptorchidism and TGCT (Schnack et al., 2010). Under the assumption that the genetic and clinical risk factors are independent, we estimated that White men aged 30–34 years with a history of cryptorchidism who were also in the top 1% of genetic risk had an estimated 5-year risk of 4.4%, as opposed to 0.087% in the general population (Fig. 1c). We calculated that the lifetime risk increased from ~0.5% (average risk) to ~26% in men who had cryptorchidism and who were also in the top 1% of genetic risk (data not shown). This represents an ~50.5-fold increased risk.
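The combined estimate follows from multiplying the two relative risks under the independence assumption. The short Python check below uses only the rounded figures quoted in the text (variable names are illustrative); with these rounded inputs the product is about 50.4, close to the reported ~50.5-fold increase.

```python
# Rounded inputs quoted in the text; variable names are illustrative only.
baseline_5yr_risk = 0.087   # percent, White men aged 30-34 years, general population
rr_cryptorchidism = 4.8     # meta-analysis estimate (Dieckmann & Pichlmeier, 2004)
rr_genetic_top1 = 10.5      # average relative risk in the top 1% of genetic risk

# Independence assumption: the relative risks multiply.
combined_rr = rr_cryptorchidism * rr_genetic_top1
combined_5yr_risk = baseline_5yr_risk * combined_rr

print(round(combined_rr, 1))        # 50.4
print(round(combined_5yr_risk, 1))  # 4.4 (percent)
```

If the genetic and clinical factors were correlated or interacted, this simple product would no longer hold, which is why the independence assumption matters for the 4.4% figure.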
Discussion
In general, genomic risk prediction based on combinations of risk SNPs identified through GWAS has limited clinical utility to date (Pharoah et al., 2008; Kraft & Hunter, 2009; Kraft et al., 2009; Manolio, 2010). Given that TGCTs are highly treatable tumours, a 5-year cancer risk of 0.91% in men within the top 1% of genetic risk may not justify a general population genetic risk assessment. However, under the assumption that clinical factors are not correlated with TGCT risk variants, and do not interact with the TGCT loci as contributors to TGCT risk, a stratified genetic risk assessment (Figs 1c and 2) may be worth exploring in men with these clinical risk factors. Other clinical risk factors such as family history and infertility could also be used in a stratified genetic risk assessment approach. However, extensive statistical modelling and additional genotyping of a large, systematically ascertained population characterized for relevant phenotypes such as cryptorchidism and infertility are required before accurate risk estimates of this kind are feasible. A risk prediction model that incorporates TGCT loci and clinical factors needs to be checked for calibration in independent cohort data. It has been suggested that before incorporating risk variants into individualized cancer risk assessment, the GWAS findings require validation in prospective studies to confirm the efficacy of these variants in predicting disease risk (Stadler et al., 2010).
Figure 2
Stratified genetic testing approach for testicular germ cell tumours
What can be done clinically once a prediction tool has been developed?
Risk prediction aims to identify men who are at elevated risk of developing disease. In contrast, a screening method may detect disease in men who are at increased risk. Notably, a recent evidence review for the U.S. Preventive Services Task Force (USPSTF) found no evidence in the published literature on the benefits or harms of screening for TGCT that would affect the USPSTF’s previous recommendation against screening (Lin & Sharangpani, 2010). Any clinical action that might result from the use of a newly developed TGCT prediction tool would require thorough verification through a clinical trial. The main goal of any TGCT screen is early detection of these tumours, resulting in less intense treatment with fewer acute and long-term side effects. The risk/benefit assessment of any TGCT screening strategy must account for the high cure rates of TGCT, even for advanced stage disease. Moreover, the screening should be tailored and should take into account the magnitude of the predicted risk, the precision of this estimate and, most importantly, the patients’ preferences and age. Currently available screening tools include testicular self-examination, serial trans-scrotal testicular ultrasounds and/or testicular biopsy. While testicular biopsy can detect carcinoma in situ (CIS), the surgery may be complicated by oedema and haematoma. The other screening modalities are not invasive, but cannot detect CIS. Immunocytological analysis of semen samples is a relatively new screening technique. Although this approach can detect CIS, the sensitivity of this test is still low.
Conclusions
We found that a TGCT risk model that included seven SNPs and the gr/gr deletion (Table 1) would have a discriminative power that is considerably higher than that calculated for other common cancers such as breast, prostate and colon cancer. Our results suggest that a stratified genetic risk assessment strategy may be useful for a small group of men with known clinical risk factors for TGCT. Importantly, additional research is required to determine to what extent clinical and genetic risk factors interact and correlate when contributing to disease risk.
Acknowledgments
This work was supported by the Intramural Research Program of the National Institutes of Health and the National Cancer Institute. We are grateful to Dr. Mitchell Gail for his insightful comments.
Summary
Three genome-wide association studies of testicular cancer have uncovered predisposition alleles in or near KITLG, BAK1, SPRY4, TERT, ATF7IP and DMRT1. We investigated whether testicular cancer-risk alleles can be utilized in the clinical setting. We employed the receiver operating characteristic curves for genetic risk models to measure the discriminatory power of a risk variant-based risk model, and found that the newly discovered variants provided a discriminatory power of 69.2%. This suggested that about 69.2% of the time, a randomly selected patient with testicular cancer had a higher estimated risk than the risk for a randomly selected control subject. Using a multiplicative model, we estimated that white men in the top 1% of genetic risk as defined by eight risk variants had a relative risk that was 10.5-fold greater than that for the general white male population. This risk differential does not appear to be clinically useful, given the relative rarity and highly curable nature of testicular germ cell tumour (TGCT). In the authors’ view, a stratified genetic risk assessment strategy might be useful, theoretically, for men who also have independent clinical risk factors for testicular cancer. Several established TGCT risk factors, such as cryptorchidism (RR = 4.8) and male infertility (SIR = 2.8) might prove useful in that context, but we currently do not know whether these testicular cancer-risk loci are associated with, or independent of, such clinical risk factors. More research is required before we can utilize testicular cancer-risk loci for clinically meaningful risk prediction.