Adaptive prediction model in prospective molecular signature-based clinical studies.
Journal: 2014/October - Clinical Cancer Research
ISSN: 1078-0432
Abstract:
Use of molecular profiles and clinical information can help predict which treatment would give the best outcome and survival for each individual patient, and thus guide optimal therapy, which offers great promise for the future of clinical trials and practice. High prediction accuracy is essential for selecting the best treatment plan. The gold standard for evaluating the prediction models is prospective clinical studies, in which patients are enrolled sequentially. However, there is no statistical method using this sequential feature to adapt the prediction model to the current patient cohort. In this article, we propose a reweighted random forest (RWRF) model, which updates the weight of each decision tree whenever additional patient information is available, to account for the potential heterogeneity between training and testing data. A simulation study and a lung cancer example are used to show that the proposed method can adapt the prediction model to current patients' characteristics, and, therefore, can improve prediction accuracy significantly. We also show that the proposed method can identify important and consistent predictive variables. Compared with rebuilding the prediction model, the RWRF updates a well-tested model gradually, and all of the adaptive procedure/parameters used in the RWRF model are prespecified before patient recruitment, which are important practical advantages for prospective clinical studies.
Clin Cancer Res 20(3): 531-539

Adaptive prediction model in prospective molecular-signature-based clinical studies

Introduction

The goal of molecular-signature-based medicine is to use patients’ molecular profiles and clinical information to predict their clinical outcomes, such as survival, before treatment and thereby select the best possible therapy, which can greatly improve the efficacy and reduce the toxicity of treatments. Recent studies have shown both its feasibility and its challenges. For example, gene expression profiles have been used to predict disease prognosis (1–4) and responses to treatments (5, 6) in multiple types of cancer. The conventional strategy for developing such prediction models is to build a model (also called a predictive signature) from one dataset (called a training set), and then validate the model using one or several independent datasets (called testing or validation sets) (7–10). In this circumstance, testing the model prospectively, i.e., using prospective studies as the testing dataset, is the most objective and unbiased approach. A major challenge in developing a clinically useful predictive signature is the heterogeneity between the training and testing datasets, which may be caused by differences in patient cohorts and experimental procedures. In addition, the testing dataset in a prospective study is always collected after the training set, so potential batch effects associated with profiling experiments may also contribute to the heterogeneity. One feature of a prospective study is that patients are usually recruited sequentially, so clinical outcomes from the earlier patients accumulate during the study. The conventional approaches use a fixed prediction model, built on the training data only, throughout the entire study. Intuitively, such an approach can be less efficient for patients enrolling later in the testing set, because information on patients who enrolled earlier is not used.
It is desirable to have a rigorous and pre-specified mechanism, so that the information accumulated from earlier patients in the study can be used to update the prediction model for subsequent patients. In this paper, we develop an adaptive prediction methodology to address these heterogeneity and efficiency issues, by using the accumulated information in the testing cohort to validate and update the prediction model. There are a few related issues for molecular-signature-based clinical studies, such as sample size and study design. We acknowledge the importance of these issues, but refer readers to other published studies for further discussion (11–14).

The random forest (RF) prediction model (15) is an ensemble learning method that uses classification trees as the base classifier. For high-dimensional data, RF has comparable or superior performance compared with alternatives (16, 17). In this paper, we introduce a re-weighted random forest (RWRF) model, which gives different weights to the decision trees in the prediction model. The weights are adjusted during the clinical study, using the information accumulated from earlier patients. By doing so, the prediction model adapts to the patient cohort in the current study, and the prediction performance will improve. Figure 1 illustrates the rationale of our approach. Before the study starts, a prediction model (Model 1) is built based on training data. When the first patient is enrolled into the study, Model 1 is used for prediction. Model 1 is also applied to the second patient. Suppose that the clinical outcome (resistant or sensitive to the treatment) of the first patient is available before the third patient is enrolled. This information is used to update Model 1 by adjusting the weight of each classification tree. The classification trees that correctly predict the outcome of the first patient have their weights increased, and those that predict incorrectly have their weights decreased. When the third patient enters the study, the prediction is made based on the updated model (Model 2). This evaluation and updating process continues whenever new information becomes available throughout the entire study. The prediction is always made by the newest model, which has been updated using all available information. In this method, adaptation refers to updating the prediction model, using available patient information, to improve the prediction accuracy in the new cohort.

Figure 1. Illustration of adaptive prediction models for molecular-signature-based clinical studies.

Recent studies (18, 19) have shown the benefit of using weighted approaches to combine classifiers. Wolpert (20) developed a stacking method using a weighted average. Pan et al. (21) demonstrated a more general scheme using input-dependent weights. The goal of those methods is to combine different classifiers. The proposed approach shares a similar spirit, but goes beyond the existing approaches by continuously adapting the prediction model to the new patient cohort. In the proposed RWRF method, the weights are updated using newly available information, so the prediction model is adjusted to the patients’ characteristics in the new cohort.

A motivating example

Lung cancer is the leading cause of death from cancer in the United States, with a 5-year survival rate of 15% (2), and non-small-cell lung cancer (NSCLC) accounts for up to 85% of lung cancer deaths (22). The goal of treating late-stage NSCLC with chemotherapy is to prolong patients’ survival time with limited toxicity. Current first-line chemotherapy options for patients with advanced NSCLC, such as the combination of a platinum-based agent with Paclitaxel, Gemcitabine, Vinorelbine, or Docetaxel, have substantial toxicity and limited clinical efficacy (23). Gefitinib (Iressa, ZD1839; AstraZeneca, Wilmington, DE) is an orally active epidermal growth factor receptor (EGFR) tyrosine kinase inhibitor, and has been approved by the Food and Drug Administration (FDA) to treat advanced NSCLC. Four phase I studies have shown that Gefitinib is generally well tolerated (23–26). NSCLC patients’ responses to Gefitinib are very diverse: some patients recover completely from the advanced cancer, but others do not respond to the treatment at all. Therefore, identifying the subgroup of patients who will respond to Gefitinib has tremendous clinical benefit for NSCLC treatment. A promising approach to identifying the Gefitinib-sensitive patient subgroup is to use genomic profiling to predict tumor sensitivity to Gefitinib. However, it is challenging to develop Gefitinib response predictive signatures using patients’ molecular profiles, since few patients have been treated with Gefitinib and also have frozen tumor samples available for molecular profiling. Alternatively, the predictive profiles can be generated using cell-line models (in vitro) as a “short cut” (27). For decades, human immortal cancer cell lines have constituted an accessible, easily usable set of biological models with which to investigate cancer biology and to explore the potential efficacy of anticancer drugs.
Cancer cell lines have become valuable sources for studying responses to new therapeutic drugs, because cell line responses to any treatment can be tested in a laboratory right after the drug development process. To develop a Gefitinib response gene signature, the University of Texas Lung Specialized Program of Research Excellence (SPORE) collected a large amount of data on drug response to Gefitinib and on gene expression for 86 NSCLC cell lines. The Gefitinib response of each cell line was measured by 50% inhibition concentration (IC50) values using the MTS assay, and each cell line can be categorized as sensitive or resistant to Gefitinib based on its IC50 value. The expression of approximately 43,000 probes in the 86 NSCLC cell lines, along with 59 primary tumor samples from NSCLC patients, was measured using Affymetrix U133AB GeneChips. The goal of the study is to develop a predictive signature using the cell line data, and then test the signature on independent datasets of primary tumors from lung cancer patients.

We first checked whether the expression profiles from primary tumors differ from those from cell lines. The gene expression data were processed using the robust multichip average (RMA) approach and quantile–quantile normalization (28). All gene expression values were log2-transformed. Average values were used for different probe sets corresponding to the same gene. Figure 2 shows the hierarchical clustering result for NSCLC cell lines and lung cancer patients’ primary tumors, based on gene expression profiles. The cell lines and patient samples were clearly separated into different clusters, indicating that differences remain even after stringent normalization. These differences between training (cell line) data and testing (patient) data need to be taken into account in prediction models. This motivated us to develop an adaptive prediction method that accounts for the difference between testing and training data by gradually adjusting the prediction model using available information.

Figure 2. Hierarchical clustering after quantile-quantile normalization. Gene expression profiles from patients are labeled in green and those from cell lines in red. Even after normalization, the patient and cell line samples form different clusters, indicating differences between patient and cell line gene expression.

Method

Random forest

Random forests are classification and regression methods based on growing an ensemble of many randomized classification trees. For completeness, we give a brief review of the method here. We denote by x the input variables and by y the outcome variable in a training dataset of size N. For the drug response example, x is the gene expression profile, and y is the drug response status (y = 1 for sensitive cases and y = −1 for resistant cases) for lung cancer cell lines. The random forest algorithm proceeds as follows:

1. Randomly select N samples from the original training set with replacement (the bootstrap sample).
2. Construct a tree-based classifier $f_b(x)$ using the bth bootstrap sample. In the drug response example, $f_b(x)$ is a binary function ($f_b(x) = 1$ for the predicted sensitive cases and $f_b(x) = -1$ for the predicted resistant cases).
3. Repeat steps 1 and 2 B times.
4. Determine the final classifier of the random forest model by the majority vote of all B trees, so that the prediction is based on:

$$f_{rf}(x) = \mathrm{sign}\left(\frac{1}{B}\sum_{b=1}^{B} f_b(x)\right) \qquad (1)$$

The random forest model predicts the probability that a new observation is sensitive to the treatment. The prediction was dichotomized into a binary variable using 0.5 as the cut-off (i.e., a new observation with a predicted probability of being sensitive greater than 0.5 was predicted as sensitive, and otherwise as resistant).
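The four steps and Equation 1 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: to stay self-contained it replaces full classification trees with one-split decision stumps, and the names `fit_stump`, `fit_forest`, `predict_forest` and the parameters `n_trees` and `n_try` are hypothetical.

```python
import random

def fit_stump(X, y, rng, n_try=3):
    """Fit a one-split 'tree' (decision stump) on a random subset of features.

    Simplified stand-in for a full classification tree; labels are +1 / -1.
    """
    n_features = len(X[0])
    best = None
    for j in rng.sample(range(n_features), min(n_try, n_features)):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                # Predict `sign` above the threshold and `-sign` otherwise.
                errors = sum(1 for row, label in zip(X, y)
                             if (sign if row[j] > t else -sign) != label)
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda x: sign if x[j] > t else -sign

def fit_forest(X, y, n_trees=25, seed=0):
    """Steps 1-3: draw a bootstrap sample and fit one tree, n_trees times."""
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], rng))
    return trees

def predict_forest(trees, x):
    """Step 4 / Equation 1: sign of the average tree vote (ties go to -1)."""
    mean_vote = sum(tree(x) for tree in trees) / len(trees)
    return 1 if mean_vote > 0 else -1
```

Using an odd number of trees avoids exact ties in the vote.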

Reweighted random forest (RWRF) for adaptive prediction

The classifiers from the conventional random forest models are constructed based on the training data only. In this study, we develop a reweighted random forest method to incorporate the information generated from earlier patients in the clinical study, in order to account for the potential heterogeneity between training and testing data, and hence improve prediction performance.

Suppose that the total number of patients in a prospective clinical study is M. In our proposed method, the prediction for the kth patient is based on a weighted average of the classification trees:

$$f_i(x_k) = \frac{1}{\sum_{b=1}^{B} w_{b,i}} \sum_{b=1}^{B} w_{b,i}\, f_b(x_k) \qquad (2)$$

where $x_k$ is the gene expression data of patient k, and i indexes the subjects whose clinical outcomes are available when patient k is enrolled in the study (i ≤ k). $f_b(x)$ denotes the bth classification tree, and its weight in the new prediction model is $w_{b,i}$. Here, $f_i(x_k)$ is the prediction model for patient k. In the proposed model, the weight of each individual tree is determined by its performance on the previous patients whose clinical outcomes are available. At the beginning of the study, the weights are set to be equal, i.e., $w_{1,1} = w_{2,1} = \dots = w_{B,1} = 1$, so $f_1(x)$ is equivalent to $f_{rf}(x)$, the standard RF. When the clinical outcome of patient i+1 is available, the weights are adjusted according to the prediction performance of each individual tree by:

$$w_{b,i+1} = w_{b,i}\, e^{\alpha I[y_{i+1} = f_b(x_{i+1})]} \qquad (3)$$

where $I[\cdot]$ is an indicator function, which equals 1 for a correct prediction and 0 otherwise, and α is a positive constant that determines the learning speed. If a tree predicts the outcome of patient i+1 correctly, then its relative weight, $w_{b,i+1} / \sum_{b=1}^{B} w_{b,i+1}$, in the prediction model for later patients, $f_{i+1}(x_k)$, $k \ge i$, will increase, and vice versa. Intuitively, the model gives more weight to the trees with good prediction performance in all previous samples (both in the training and available testing datasets).
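Equations 2 and 3 translate directly into a small class. The sketch below is illustrative only: trees are assumed to be plain callables returning +1 or −1, the class name `ReweightedForest` is hypothetical, and the default `alpha=0.1` is an arbitrary placeholder rather than a value taken from this article.

```python
import math

class ReweightedForest:
    """Weighted vote over fixed trees, with a multiplicative weight update."""

    def __init__(self, trees, alpha=0.1):
        self.trees = list(trees)                 # callables: x -> +1 or -1
        self.alpha = alpha                       # learning speed (assumed value)
        self.weights = [1.0] * len(self.trees)   # equal weights at study start

    def score(self, x):
        """Equation 2: weighted average of the tree votes, in [-1, 1]."""
        total = sum(self.weights)
        return sum(w * tree(x) for w, tree in zip(self.weights, self.trees)) / total

    def predict(self, x):
        return 1 if self.score(x) > 0 else -1

    def update(self, x, y):
        """Equation 3: multiply by exp(alpha) the weight of every tree that
        predicted this patient's observed outcome y correctly."""
        self.weights = [w * math.exp(self.alpha) if tree(x) == y else w
                        for w, tree in zip(self.weights, self.trees)]
```

Because Equation 2 normalizes by the sum of the weights, leaving the weight of an incorrect tree unchanged still lowers its relative weight.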

To evaluate the role of each variable (i.e., the importance of the expression of a single gene) in the prediction models, we define $c_{j,i}$ as the contribution of variable (gene) j in prediction model i:

$$c_{j,i} = \sum_{b=1}^{B} w_{b,i}\, q_{j,b} \qquad (4)$$

where $q_{j,b}$ is the frequency with which variable j appears in the bth classification tree, and $w_{b,i}$ is the relative weight of the bth classification tree in prediction model i. In the RWRF model, the contributions of variables change as the study goes on, and we define an adaptive score (AS) for gene j as

$$AS_j = \frac{c_{j,M}}{c_{j,1}} \qquad (5)$$

where $c_{j,M}$ is the contribution of gene j in the final model, and $c_{j,1}$ is the contribution of gene j in the initial model.
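As a small worked sketch of Equations 4 and 5, the following Python functions compute contributions from relative tree weights and then the adaptive score. The function names and the `freq[b][j]` layout (frequency of variable j in tree b) are assumptions made for illustration; a variable that never appears in any tree would give a zero initial contribution and is not handled here.

```python
def contributions(weights, freq):
    """Equation 4: c_j = sum over trees of (relative weight) * q_{j,b}.

    weights[b] is the raw weight of tree b; freq[b][j] is how often
    variable j appears in tree b.
    """
    total = sum(weights)
    n_vars = len(freq[0])
    return [sum(w / total * q[j] for w, q in zip(weights, freq))
            for j in range(n_vars)]

def adaptive_scores(initial_weights, final_weights, freq):
    """Equation 5: ratio of final to initial contribution for each variable."""
    c_first = contributions(initial_weights, freq)
    c_last = contributions(final_weights, freq)
    return [last / first for last, first in zip(c_last, c_first)]

# Two trees, two variables: tree 0 uses only variable 0, tree 1 only variable 1.
freq = [[1, 0], [0, 1]]
scores = adaptive_scores([1.0, 1.0], [3.0, 1.0], freq)
# Tree 0 gained weight, so variable 0 scores above 1 and variable 1 below 1.
```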

Simulation studies

We used simulation studies to evaluate the performance of the proposed RWRF method. We simulated 100 samples in the training set, and 200 patients enrolled in the study sequentially as the testing set. The outcome variable of the simulation study is sensitivity or resistance to treatment, and there are 50 variables. Among those variables, 10 are “good” predictors, defined as genes whose expression is associated with the outcome in both the training set and the testing set. These genes are real biomarkers. Another 10 are “unstable” predictors, which were simulated to be associated with the outcome only in the training set, but not in the testing set. Such unstable predictors are commonly seen in gene signature studies. For example, suppose that a gene’s expression level is associated with patient outcomes and the gene is identified as a predictor using the training set, but the measurement of its expression level in the testing set is very noisy because of technical problems; it is then not a good predictor in the testing set. In this study, we proposed the adaptive prediction model to minimize the impact of the unstable predictors on the prediction performance. The variables are summarized in Supplementary Table 1. If a variable is a predictor (with a check mark in Supplementary Table 1), then it was simulated from a normal distribution N(0,1) for sensitive cases and N(1,1) for resistant cases. Otherwise, it was simulated from N(µ,1), where µ is a random variable from the Uniform(0,1) distribution, for both sensitive and resistant cases. In this study, the variables X1,…,X10 are good predictors, X11,…,X20 are unstable predictors, and X21,…,X50 are noise predictors (i.e., not associated with the outcome in the training set).
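The simulation design above can be sketched as follows. Two details are not specified in the text and are assumptions of this sketch: outcomes are drawn as sensitive or resistant with equal probability, and µ is drawn once per non-predictive variable per dataset. The function name `simulate` and the seeds are hypothetical.

```python
import random

def simulate(n, predictive, n_vars=50, seed=0):
    """Draw n samples; variables whose index is in `predictive` follow N(0,1)
    for sensitive (+1) and N(1,1) for resistant (-1) cases, all others N(mu,1)."""
    rng = random.Random(seed)
    # Assumption: one mu ~ Uniform(0,1) per variable, shared by all samples.
    mu = [rng.uniform(0, 1) for _ in range(n_vars)]
    X, y = [], []
    for _ in range(n):
        label = rng.choice([1, -1])  # assumption: equal class probabilities
        row = []
        for j in range(n_vars):
            if j in predictive:
                mean = 0.0 if label == 1 else 1.0
            else:
                mean = mu[j]
            row.append(rng.gauss(mean, 1.0))
        X.append(row)
        y.append(label)
    return X, y

GOOD = set(range(0, 10))       # X1..X10: predictive in training and testing
UNSTABLE = set(range(10, 20))  # X11..X20: predictive in training only

X_train, y_train = simulate(100, GOOD | UNSTABLE, seed=1)
X_test, y_test = simulate(200, GOOD, seed=2)
```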

For simplicity, in the simulation, we assumed that the patients’ clinical outcomes are available immediately after the prediction and treatment, which is close to real practice when the response time is short and accrual rate is low. In simulation studies, we built the RF using training data and predicted the cases in the testing data one by one, using the proposed reweighted random forest approach. The simulations were repeated 1000 times, and the prediction accuracy is the mean accuracy across the 1000 simulations.
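The one-by-one evaluation described here, in which each patient is predicted with the current weights before that patient's outcome is used for the update, might look like the loop below. It is a sketch under the same simplifying assumption as the simulation (outcomes available immediately); the name `run_sequential` and the default `alpha` are hypothetical.

```python
import math

def run_sequential(trees, X_test, y_test, alpha=0.1):
    """Predict each patient with the current weights, then update the weights
    once that patient's outcome is known. Returns a 0/1 list of correct calls."""
    weights = [1.0] * len(trees)
    correct = []
    for x, y in zip(X_test, y_test):
        votes = [tree(x) for tree in trees]
        score = sum(w * v for w, v in zip(weights, votes)) / sum(weights)
        prediction = 1 if score > 0 else -1
        correct.append(int(prediction == y))
        # Reward the trees that agreed with the observed outcome.
        weights = [w * math.exp(alpha) if v == y else w
                   for w, v in zip(weights, votes)]
    return correct
```

Averaging the returned list position by position over repeated simulations gives a per-patient accuracy curve of the kind plotted against enrollment order.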

Figure 3(A) shows the prediction accuracy versus the number of patients enrolled in the study. The prediction accuracy increases as the study goes on. At the beginning of the study, the overall prediction accuracy is 0.82, and the accuracy increases as the prediction model gradually adapts to the testing data. The improvement is fast at the beginning of the study and slows down as the model approaches full adaptation. The overall accuracy at the end of the study is 0.92, a significant increase over the accuracy at the beginning. Figure 3(A) also compares the performance of RWRF with the learning curve (i.e., how prediction accuracy increases as the number of training samples increases) and shows that the accuracy improvement from RWRF is much faster than that of the learning curve.

Figure 3. (A) Prediction accuracy of the RWRF model at different stages of the study. The accuracy for patient k is the mean accuracy across 1000 simulations for the kth patient enrolled in the study. The circles are the mean accuracy for the kth patient, and the red line is the smoothed lowess curve for the accuracy of the RWRF model. The blue line represents the learning curve (i.e., how prediction accuracy increases as the number of training samples increases). For the learning curve, the x-axis is the number of samples added to the training set. For example, the starting accuracy is 0.82, which corresponds to 100 training samples (the original sample size); when x (the number of training samples added to the original 100) is 50 (i.e., the total size of the training set is 100+50=150), the accuracy is 0.83; when x increases to 200 (a total training set size of 100+200), the accuracy increases to 0.84. (B) Box plots of the adaptive scores for good predictors and unstable predictors in the prediction models. At the end of the study, the contributions from the good predictors increased, while those from unstable predictors decreased.

We also used the receiver operating characteristic (ROC) curves to summarize the average prediction performance in 1000 simulations at different time points, and compared the performance at the beginning and the end of the study. The predicted probability score was used to determine the ROC curves. Sensitivity is the proportion of sensitive cases that were predicted correctly by the prediction model, and specificity is the proportion of resistant cases that were predicted correctly by the model. Supplementary Figure 1 shows the ROC curves for standard RF (black), the RWRF at the beginning of the study (red) and at the end of the study (blue). At the beginning, the RWRF has equal weights for the classification trees, so it has similar performance to the traditional RF. The prediction performance at the end of the study (with area under the curve (AUC) =0.94) improves significantly (p value <0.001) beyond the traditional RF (AUC=0.91) as the prediction model fully adapts to the testing data set.
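For readers who want the definitions in executable form, the sketch below computes sensitivity and specificity at a cut-off and the AUC from predicted probability scores. It uses the rank-based (pairwise comparison) form of the AUC, which is a standard equivalent of the area under the ROC curve; the function names are hypothetical and the article does not state which software was used.

```python
def sensitivity_specificity(y_true, scores, cutoff=0.5):
    """Sensitivity: fraction of sensitive (+1) cases predicted as sensitive.
    Specificity: fraction of resistant (-1) cases predicted as resistant."""
    pred = [1 if s > cutoff else -1 for s in scores]
    pos = [p for t, p in zip(y_true, pred) if t == 1]
    neg = [p for t, p in zip(y_true, pred) if t == -1]
    sens = sum(1 for p in pos if p == 1) / len(pos)
    spec = sum(1 for p in neg if p == -1) / len(neg)
    return sens, spec

def auc(y_true, scores):
    """Probability that a random sensitive case scores above a random
    resistant case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```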

For variable j, we calculated the adaptive score (defined in Equation 5) to check the contribution changes between the initial model and the final model. Figure 3 (B) shows the adaptive score box-plots for both the good and the unstable predictors. As expected, it shows that the contributions from good predictors increased, and those from unstable predictors decreased as the prediction model adapted to the testing data set. It indicates that the model adapted to the new data by increasing the contributions from the good predictors and decreasing those from unstable predictors.

Analysis of Lung Cancer Data

We now return to our motivating example, where the cell line gene expression and drug sensitivity data were used to predict the clinical response to Gefitinib treatment in lung cancer patients. In this study, we applied the RWRF model to account for the differences between cell line (training) and patient (testing) data. A training set of 86 cell lines was used to build the prediction model, and its performance was evaluated using an independent testing set of 59 patients with lung cancer. Our goal was to develop a prediction model that can work in prospective clinical studies, where patients are enrolled sequentially and the training and testing sets differ because they come from different types of samples. As the clinical outcome of each earlier patient became available, the weights of the classification trees with correct predictions increased, and the weights of the trees with incorrect predictions decreased. By adjusting the weights, the RF built from training data can gradually adapt to testing data. Figure 4 illustrates the flowchart of the adaptive prediction model in a clinical setting.

Figure 4. Flowchart of the adaptive prediction models. The prediction model was built on the cell line data, and used to predict the tumor response of cancer patients. As the clinical outcome of a patient becomes available, the information is used to update the prediction model.

In the current study, the patients’ information was collected retrospectively. To simulate a prospective study, we randomly assigned an enrollment date to each patient and used this information to test the proposed method. Previous studies (29, 30) have demonstrated that a patient with an EGFR mutation is likely to respond to Gefitinib. Because the patients’ responses were not available, we used EGFR mutation status, which is a major surrogate biomarker for the response to Gefitinib therapy, to evaluate the performance of the proposed model. In this study, we assume that a patient with an EGFR mutation will respond to Gefitinib therapy. We repeated the procedure 2000 times to derive the average performance at different time points of the study.

To account for the fact that the patients’ gene expression data (testing data) are collected sequentially in prospective clinical studies, we normalized the gene expression data as follows. First, the training data (cell line gene expression data) were normalized using quantile normalization (28); then each new (patient) gene expression profile was normalized one at a time to have the same distribution as the training data. In this fashion, we normalized the training and testing data in the same way as far as possible, without assuming that the testing data arrive all at once. The genes with low variability (standard error less than 1) in the cell line data were removed, and the remaining 1473 genes were used as predictors to build the RF model.
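One common way to normalize a single new sample against an already quantile-normalized training set is to map its values, rank by rank, onto the training reference distribution. The sketch below shows that idea; it is an assumption about how such a step could be done, not a description of the exact procedure used in the study, and the function names are hypothetical. Tied values in the new sample are ranked in index order.

```python
def reference_distribution(train_samples):
    """Mean of the sorted expression values across training samples,
    giving one reference value per rank."""
    sorted_samples = [sorted(s) for s in train_samples]
    n = len(sorted_samples)
    return [sum(vals) / n for vals in zip(*sorted_samples)]

def normalize_to_reference(sample, reference):
    """Replace each gene's value with the reference value of the same rank,
    so the new sample has the same distribution as the training data."""
    order = sorted(range(len(sample)), key=lambda g: sample[g])
    out = [0.0] * len(sample)
    for rank, g in enumerate(order):
        out[g] = reference[rank]
    return out

# Reference built once from training data, then applied to each new patient.
reference = reference_distribution([[1, 2, 3], [3, 5, 4]])
normalized = normalize_to_reference([10, 0, 5], reference)
```

Because the reference is fixed before recruitment, each patient can be normalized on arrival without waiting for the rest of the cohort.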

Figure 5 (A) shows the improvement of the overall accuracy at different stages of the study. The accuracy at the beginning of the study was 0.74, and increased to 0.84 when the number of accumulated patients reached 40. Figure 5 (B) presents the ROC curves for the standard RF, as well as the starting and ending performance of the RWRF model. The ROC curves of the RWRF represent the average performance over 2000 simulations. As expected, the performance at the beginning of the study is the same as that of the standard RF. As the study went on, the weights of the classification trees were adjusted using the information accumulated from earlier patients. The performance of the final model improved significantly over that of the initial model, as the prediction model adapted to the patient data.

Figure 5. (A) The prediction accuracy vs. the number of patients enrolled in the study. The circles are the estimated accuracy obtained by averaging across 2000 simulation runs, and the red line is the smoothed lowess curve. The accuracy at the beginning of the trial was 0.74 and increased as patients accumulated. The accuracy saturated at 0.84 when the number of patients was about 40. (B) ROC curves for the prediction of patients' response to Gefitinib treatment. The black, red and blue lines represent the standard RF, the starting performance of the re-weighted RF and the ending performance of the re-weighted RF, respectively. As expected, the re-weighted RF starts with the same performance as the standard RF, and the prediction performance has significantly improved by the end of the study.

We also examined the genes with increased adaptive scores in the prediction model to see whether they are cancer-related genes. Thirteen genes have adaptive scores greater than 30, i.e., more than a 30-fold increase in their contributions to prediction. As summarized in Supplementary Table 2, eleven of the 13 genes have been shown to be associated with cancer diagnosis and prognosis in other clinical studies. For example, ZFHX1B is an important transcriptional repressor in the EGFR pathway (31), which is the pathway targeted by Gefitinib treatment; THBS1 (Homo sapiens thrombospondin 1) is a tumorigenesis gene associated with prognosis and drug response in many types of cancer (32); and CD24 (Homo sapiens CD24 antigen: small cell lung carcinoma cluster 4 antigen) is a prognostic marker of survival in NSCLC (33) and other cancer types. On the other hand, most genes with decreased weight (indicating that they are unstable predictors) were found to be differentially expressed between cell line data and patient data. For example, IFITM2 had the greatest reduction in weight among all the genes, and was differentially expressed between cell lines and patients (p value < 0.0001). These results indicate that the proposed model increases the weights of good predictors and decreases the weights of unstable predictors, as expected. The ability to identify the genes that are important for translational study, while automatically diminishing the effect of the unstable predictors, is a key advantage of having the new prediction model “evolve” smoothly from the original model built from the cell line data.

Discussion

In practice, it is challenging to use a model developed from one data source to predict outcomes in another data source. This is especially difficult in molecular-signature-based clinical studies, because microarray datasets are variable and the expression measurements tend to differ from dataset to dataset. In this paper, we proposed an RWRF model to account for the differences between training data and testing data. A key feature of the proposed method is that it uses the outcomes from earlier patients in the clinical study to adapt the prediction model to the current study cohort.

The procedure for updating the weights in our proposed method was inspired by AdaBoost (34), which uses a multiplicative rule to update the weights. In AdaBoost, the weights are attached to observations and are adjusted to put more weight on the observations misclassified in the previous iteration. In each iteration, the successive classifier therefore becomes more focused on the observations misclassified by the previous one (35). In our proposed method, the weights are instead attached to the classification trees in the RF and are adjusted to increase the weights of the trees that predicted previous patients correctly. The trees with correct predictions gain more weight in the successive classifier. The trees suitable for translation from cell line to patient thus increase in weight, which forces the classifier to adapt gradually from the cell line expression profile to the patient expression profile. Intuitively, the multiplicative rule allows classification trees that do not fit the new testing set to be gradually excluded from the model, because their weights decay exponentially toward zero.
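The multiplicative reweighting of trees described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the exact form of Equation 3 is not reproduced in this section, so the update used here (multiply the weight of every tree that predicted the latest patient correctly by exp(α), then renormalize) is an assumption, as are the hypothetical function names `update_tree_weights` and `weighted_vote`.

```python
import math


def update_tree_weights(weights, tree_votes, outcome, alpha=0.1):
    """Apply one multiplicative update after a patient's outcome is observed.

    weights    -- current weight of each tree (non-negative floats)
    tree_votes -- class label predicted by each tree for this patient
    outcome    -- observed class label for this patient
    alpha      -- learning speed (assumed role of the paper's alpha)
    """
    # Assumed rule: trees that were correct are multiplied by exp(alpha);
    # trees that were wrong keep their weight before renormalization.
    updated = [
        w * math.exp(alpha) if vote == outcome else w
        for w, vote in zip(weights, tree_votes)
    ]
    # Renormalize so the weights sum to 1. After renormalization the relative
    # weight of repeatedly wrong trees shrinks geometrically.
    total = sum(updated)
    return [w / total for w in updated]


def weighted_vote(weights, tree_votes, positive_label=1):
    """Return the weighted fraction of trees voting for the positive class."""
    positive = sum(w for w, v in zip(weights, tree_votes) if v == positive_label)
    return positive / sum(weights)


# Three equally weighted trees; the third one mispredicts the first patient.
weights = [1 / 3, 1 / 3, 1 / 3]
votes = [1, 1, 0]
new_weights = update_tree_weights(weights, votes, outcome=1, alpha=0.1)
```

With equal starting weights the weighted vote coincides with the ordinary RF majority vote, which matches the statement that the re-weighted RF starts out performing the same as the standard RF.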

Our proposed method is similar to the stacking method (20) for model averaging but serves a different purpose. In stacking, the final classifier is $\sum_{b=1}^{B} w_b^{st} f_b(x)$, and the stacking weights are determined by $w^{st} = \arg\min_{w} \sum_{i=1}^{m} \left[ y_i - \sum_{b=1}^{B} w_b f_b^{-i}(x_i) \right]^2$, where $f_b^{-i}(x_i)$ is the prediction for observation i made by the bth classifier using a dataset without the ith training observation (20). In other words, the stacking weights are determined by cross validation. The stacking method can outperform each individual classifier by weighting them appropriately. In signature-based clinical studies, however, cross validation is not an option because the patients are enrolled sequentially. In our proposed method, the weights are determined by the performance of each individual tree on previous patients in the clinical study, so that the classifier adapts to the current patients’ characteristics.
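For contrast with the sequential reweighting, the stacking criterion can be illustrated with a small least-squares sketch. It assumes the leave-one-out predictions of each base classifier have already been computed and are supplied as a matrix, and it solves the unconstrained normal equations in plain Python. The function names `solve_linear_system` and `stacking_weights` are hypothetical, and practical stacking implementations often add constraints (for example non-negative weights) that are omitted here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: base predictions are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def stacking_weights(loo_preds, y):
    """Least-squares stacking weights from leave-one-out predictions.

    loo_preds[i][b] is the prediction for observation i from base classifier b
    fitted without observation i; y[i] is the observed outcome.
    """
    m, n_models = len(y), len(loo_preds[0])
    # Normal equations: (P^T P) w = P^T y.
    gram = [
        [sum(loo_preds[i][j] * loo_preds[i][k] for i in range(m))
         for k in range(n_models)]
        for j in range(n_models)
    ]
    rhs = [sum(loo_preds[i][j] * y[i] for i in range(m)) for j in range(n_models)]
    return solve_linear_system(gram, rhs)


# Toy data: the first base classifier reproduces y exactly.
w = stacking_weights([[1, 1], [2, 0], [3, 1]], [1, 2, 3])
```

In this toy case all of the weight goes to the first classifier. The point of the contrast is that these weights need the full set of outcomes up front, whereas the reweighting in the proposed method only uses outcomes as they arrive.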

Our proposed method adjusts the weights in a manner similar to AdaBoost, and we used α = 0.1 to illustrate the idea. The parameter α in Equation 3 controls the speed of learning. If α is large (α > 1), the model will adapt quickly but may lose stability. On the other hand, if α is small (α < 0.1), the model adapts to new data slowly, but the prediction is relatively stable. As long as α has a moderate value (between 0.1 and 1), the model performs reasonably well. In practice, as with adaptive designs for clinical trials, it may be necessary to conduct extensive simulation studies to pick an α value that gives the best operating characteristics. In the original AdaBoost, α is a function of the prediction error, and in the RF prediction model the prediction accuracy can be estimated internally using the out-of-bag (OOB) estimator. We are therefore studying how to control the learning speed using the OOB estimate of prediction accuracy. If the prediction accuracy in the testing data is much lower than that in the training data, indicating a large difference between the training and testing data, then α should be large so that the prediction model adapts quickly. On the other hand, if the prediction accuracies in the training and testing data are close, then α should be small to decrease the learning speed and gain stability.
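A short worked calculation may help show why α acts as a learning speed. It assumes, purely for illustration, a multiplicative update in which a tree's weight is multiplied by exp(α) each time it predicts a patient correctly. This form is an assumption and may differ from Equation 3. Under it, if tree a is correct on k more patients than tree b, their weight ratio is:

```latex
\frac{w_a}{w_b} = e^{\alpha k},
\qquad
k = 10:\quad
\begin{cases}
\alpha = 0.01: & e^{0.1} \approx 1.11 \\
\alpha = 0.1:  & e^{1} \approx 2.72 \\
\alpha = 1:    & e^{10} \approx 2.2 \times 10^{4}
\end{cases}
```

With a small α the trees remain almost equally weighted after ten patients, whereas with α = 1 the better tree already dominates the vote, which is consistent with the trade-off between stability and speed of adaptation described above.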

To make use of newly acquired data in the prediction model, a tempting alternative to gradual adjustment is to rebuild the entire prediction model, using the newly acquired data as part of the training set, whenever new information is available. The major problem with totally rebuilding a prediction model for a clinical study is its instability. A prediction model must go through a series of tests and validation steps before it is used in genomic-signature-based clinical studies. These tests are time consuming, and we cannot afford to test and validate the model each time it is rebuilt. In practice, we would rather make gradual updates to a well-tested and validated prediction model than rebuild it entirely. Furthermore, in our proposed method, the new model and the initial model are based on the same set of genes, and the only difference is the weights of the classification trees, which makes the model more stable. Another advantage of this method is that it can automatically select genes that are important in the translational study from cell line to patient. The genes identified in the real data example have been shown to be important cancer genes and are of great interest for future biological studies. In addition, all of the adaptive procedures/parameters used in the RWRF model are predefined before recruiting the new cohort, which is an important practical advantage for prospective clinical studies using adaptive designs (36, 37).

It is worth noting that, because the adaptive prediction model improves the prediction accuracy gradually, patients who enter the study early are less likely to be correctly assigned to effective therapy than patients who enter late. All patients enrolled in the study benefit from the better accuracy of the adaptive prediction model, but patients enrolled early benefit less and patients enrolled late benefit more. One possible alternative to the proposed adaptive prediction model is to use an "adaptation" dataset, which contains molecular profiles and clinical outcomes for patients. This adaptation dataset would be used to adapt the model to new situations, but no patient would be assigned based on the model prediction until a predetermined threshold of acceptable predictive accuracy is reached.
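The gating idea behind an adaptation dataset can be sketched as a simple rule on the running accuracy. The sketch below is an illustration under stated assumptions rather than a procedure from the paper: the function name `first_eligible_patient`, the threshold value, and the minimum number of adaptation patients are all hypothetical and would need to be prespecified in a real protocol.

```python
def first_eligible_patient(correct_flags, threshold=0.8, min_patients=20):
    """Return the 1-based count of adaptation patients after which the running
    accuracy first reaches the threshold, or None if it never does.

    correct_flags -- sequence of booleans (or 0/1), in enrollment order, where
                     each entry says whether the model predicted that
                     adaptation patient correctly.
    """
    n_correct = 0
    for n, flag in enumerate(correct_flags, start=1):
        n_correct += int(flag)
        # Require a minimum sample size so that a few lucky early predictions
        # cannot trigger model-based treatment assignment.
        if n >= min_patients and n_correct / n >= threshold:
            return n
    return None


# Hypothetical run: accuracy reaches 0.75 after the fourth adaptation patient.
start = first_eligible_patient([0, 1, 1, 1, 1], threshold=0.75, min_patients=3)
never = first_eligible_patient([0, 0, 0, 0], threshold=0.75, min_patients=3)
```

Model-based assignment would begin only for patients enrolled after the returned count, so that early patients are not assigned on the basis of a model that has not yet adapted.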

In summary, the proposed RWRF model can effectively adapt the predictive models to current patients’ characteristics and therefore improve prediction accuracy significantly. The RWRF model provides a rigorous statistical framework with predefined procedures, in order to account for the potential heterogeneity between the training and testing cohorts. The method can facilitate using molecular signatures to predict the clinical outcomes of patients in prospective clinical studies.

Acknowledgements

This work was supported by NIH grants 5R01CA152301 to YX, 1R01CA172211 to GX and YX, 4R33DA027592 to GX, University of Texas SPORE in Lung Cancer (P50CA70907) to JDM and YX, and Cancer Prevention Research Institute of Texas award RP101251 to GX and YX.

Quantitative Biomedical Research Center, Department of Clinical Sciences, University of Texas Southwestern Medical Center.
Department of Biostatistics, School of Public Health, Yale University.
Simmons Cancer Center, University of Texas Southwestern Medical Center.
Department of Internal Medicine, University of Texas Southwestern Medical Center.
Department of Pharmacology, University of Texas Southwestern Medical Center.
Hamon Center for Therapeutic Oncology, University of Texas Southwestern Medical Center.
Corresponding Author: Yang Xie, MD, PhD, Quantitative Biomedical Research Center, Department of Clinical Sciences, Harold C. Simmons Comprehensive Cancer Center, UT Southwestern Medical Center, Dallas, TX 75390; Yang.Xie@UTSouthwestern.edu

Keywords: adaptive prediction model, high dimensional data, molecular signature, random forest, predicting clinical outcomes

Footnotes

Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

References

  • 1. Huang E, Cheng SH, Dressman H, Pittman J, Tsou MH, Horng CF, et al. Gene expression predictors of breast cancer outcomes. Lancet. 2003;361:1590–1596.
  • 2. Michiels S, Koscielny S, Hill C. Prediction of cancer outcome with microarrays: A multiple random validation strategy. Lancet. 2005;365:488–492.
  • 3. Nevins JR, Huang ES, Dressman H, Pittman J, Huang AT, West M. Towards integrated clinico-genomic models for personalized medicine: Combining gene expression signatures and clinical factors in breast cancer outcomes prediction. Hum Mol Genet. 2003;12(Spec No 2):R153–R157.
  • 4. Xie Y, Xiao G, Coombes KR, Behrens C, Solis LM, Raso G, et al. Robust gene expression signature from formalin-fixed paraffin-embedded samples predicts prognosis of non-small-cell lung cancer patients. Clin Cancer Res. 2011;17:5705–5714.
  • 5. Chang JC, Wooten EC, Tsimelzon A, Hilsenbeck SG, Gutierrez MC, Elledge R, et al. Gene expression profiling for the prediction of therapeutic response to docetaxel in patients with breast cancer. Lancet. 2003;362:362–369.
  • 6. Tang H, Xiao G, Behrens C, Schiller J, Allen J, Chow CW, et al. A 12-gene set predicts survival benefits from adjuvant chemotherapy in non-small cell lung cancer patients. Clin Cancer Res. 2013;19:1577–1586.
  • 7. Minna JD, Girard L, Xie Y. Tumor mRNA expression profiles predict responses to chemotherapy. J Clin Oncol. 2007;25:4329–4336.
  • 8. Xie Y, Minna JD. Predicting the future for people with lung cancer. Nat Med. 2008;14:812–813.
  • 9. Xie Y, Minna JD. Non-small-cell lung cancer mRNA expression signature predicting response to adjuvant chemotherapy. J Clin Oncol. 2010;28:4404–4407.
  • 10. Xie Y, Minna JD. A lung cancer molecular prognostic test ready for prime time. Lancet. 2012;379:785–787.
  • 11. Freidlin B, Simon R. Adaptive signature design: An adaptive clinical trial design for generating and prospectively testing a gene expression signature for sensitive patients. Clin Cancer Res. 2005;11:7872–7878.
  • 12. Sargent DJ, Conley BA, Allegra C, Collette L. Clinical trial designs for predictive marker validation in cancer treatment trials. J Clin Oncol. 2005;23:2020–2027.
  • 13. Simon R, Wang SJ. Use of genomic signatures in therapeutics development in oncology and other diseases. Pharmacogenomics J. 2006;6:166–173.
  • 14. Wang SJ. Biomarker as a classifier in pharmacogenomics clinical trials: A tribute to 30th anniversary of PSI. Pharm Stat. 2007;6:283–296.
  • 15. Breiman L. Random forests. Machine Learning. 2001;45:5–32.
  • 16. Diaz-Uriarte R, Alvarez de Andres S. Gene selection and classification of microarray data using random forest. BMC Bioinformatics. 2006;7:3.
  • 17. Wu B, Abbott T, Fishman D, McMurray W, Mor G, Stone K, et al. Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data. Bioinformatics. 2003;19:1636–1643.
  • 18. Huang X, Pan W, Han X, Chen Y, Miller LW, Hall J. Borrowing information from relevant microarray studies for sample classification using weighted partial least squares. Comput Biol Chem. 2005;29:204–211.
  • 19. Zhang Z, Chen D, Fenstermacher DA. Integrated analysis of independent gene expression microarray datasets improves the predictability of breast cancer outcome. BMC Genomics. 2007;8:331.
  • 20. Wolpert DH. Stacked generalization. Neural Networks. 1992;5:241–259.
  • 21. Pan W, Xiao G, Huang X. Using input dependent weights for model combination and model selection with multiple sources of data. Statistica Sinica. 2006;16:523–540.
  • 22. Tsuboi M, Ohira T, Saji H, Miyajima K, Kajiwara N, Uchida O, et al. The present status of postoperative adjuvant chemotherapy for completely resected non-small cell lung cancer. Ann Thorac Cardiovasc Surg. 2007;13:73–77.
  • 23. Herbst RS, Maddox AM, Rothenberg ML, Small EJ, Rubin EH, Baselga J, et al. Selective oral epidermal growth factor receptor tyrosine kinase inhibitor ZD1839 is generally well-tolerated and has activity in non-small-cell lung cancer and other solid tumors: Results of a phase I trial. J Clin Oncol. 2002;20:3815–3825.
  • 24. Baselga J, Rischin D, Ranson M, Calvert H, Raymond E, Kieback DG, et al. Phase I safety, pharmacokinetic, and pharmacodynamic trial of ZD1839, a selective oral epidermal growth factor receptor tyrosine kinase inhibitor, in patients with five selected solid tumor types. J Clin Oncol. 2002;20:4292–4302.
  • 25. Nakagawa K, Tamura T, Negoro S, Kudoh S, Yamamoto N, Takeda K, et al. Phase I pharmacokinetic trial of the selective oral epidermal growth factor receptor tyrosine kinase inhibitor gefitinib ('Iressa', ZD1839) in Japanese patients with solid malignant tumors. Ann Oncol. 2003;14:922–930.
  • 26. Ranson M, Hammond LA, Ferry D, Kris M, Tullo A, Murray PI, et al. ZD1839, a selective oral epidermal growth factor receptor-tyrosine kinase inhibitor, is well tolerated and active in patients with solid, malignant tumors: Results of a phase I trial. J Clin Oncol. 2002;20:2240–2250.
  • 27. van't Veer LJ, Bernards R. Enabling personalized cancer medicine through analysis of gene-expression patterns. Nature. 2008;452:564–570.
  • 28. Bolstad BM, Irizarry RA, Astrand M, Speed TP. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003;19:185–193.
  • 29. Minna JD, Gazdar AF, Sprang SR, Herz J. Cancer. A bull's eye for targeted lung cancer therapy. Science. 2004;304:1458–1461.
  • 30. Paez JG, Janne PA, Lee JC, Tracy S, Greulich H, Gabriel S, et al. EGFR mutations in lung cancer: Correlation with clinical response to gefitinib therapy. Science. 2004;304:1497–1500.
  • 31. Garcia JA. HIFing the brakes: Therapeutic opportunities for treatment of human malignancies. Sci STKE. 2006;2006:25.
  • 32. Dudek AZ, Mahaseth H. Circulating angiogenic cytokines in patients with advanced non-small cell lung cancer: Correlation with treatment response and survival. Cancer Invest. 2005;23:193–200.
  • 33. Kristiansen G, Schluns K, Yongwei Y, Denkert C, Dietel M, Petersen I. CD24 is an independent prognostic marker of survival in nonsmall cell lung cancer patients. Br J Cancer. 2003;88:231–236.
  • 34. Freund Y, Schapire RE. Experiments with a new boosting algorithm. 1996. pp. 148–156.
  • 35. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. New York: Springer-Verlag; 2001.
  • 36. McShane L, Cavenagh M, Lively T, Eberhard D, Bigbee W, Williams P, et al. Criteria for the use of omics-based predictors in clinical trials: Explanation and elaboration. BMC Medicine. 2013;11:220.
  • 37. McShane LM, Cavenagh MM, Lively TG, Eberhard DA, Bigbee WL, Williams PM, et al. Criteria for the use of omics-based predictors in clinical trials. Nature. 2013;502:317–320.