Short-term prediction of secondary progression in a sliding window: A test of a predicting algorithm in a validation cohort
Introduction
The outcome spectrum in multiple sclerosis (MS) is diverse; it spans from clinically monosymptomatic MS to MS-related death.1 By consensus, it is accepted that a conversion from the relapsing–remitting MS (RRMS) phase to the secondary progression (SP) phase marks the onset of continuous disability accrual.2,3 Several survival and proportional hazards models have provided medium- to long-range predictions of outcomes (e.g. SP transition or disability milestones), based on clinical markers that appear in the early phases of MS.4–10 Some studies found that the initial relapse phenotype could predict the long-term outcome;9,11 however, that effect was not confirmed in other reports.10 The relationship between relapses and disability remains controversial. Recent studies compared the accrual of disability in periods with or without relapses and found that relapses were associated with an increase in periods of disability worsening.12–14 However, this association was not confirmed when relapses were related to 1-year periods of sustained worsening or to successive periods defined with the Expanded Disability Status Scale (EDSS).15,16 Here, we examined the association between relapses, age and SP onset.
The Multiple Sclerosis Prediction Score (MSPS)17 is based on the same principle that was used for the Fracture Risk Assessment Tool (FRAX) online prediction of osteoporosis.18 The MSPS estimates, at 10 points per year, the immediate risk of conversion to an SP course. It is based on commonly available clinical data, including the severity of the most recent relapse. The MSPS model was derived from a longitudinal follow-up of the Gothenburg Incidence Cohort (GIC).19 In the present study, we aimed to validate this model with an essentially untreated Swedish cohort, the Uppsala MS cohort (UMS).
Methods
The UMS included patients in the Uppsala region who were registered in the Swedish National MS Registry (www.neuroreg.se).20,21 Uppsala joined the registry in 2001, and patient data for our study period were entered retrospectively, during 2001–2002. These data were retrieved from the medical records stored at the Uppsala University Neurology Department, including records from other hospitals on patients who migrated to the Uppsala region after disease onset. The Uppsala department had the highest frequency of registered hospital visits with MS attack information of any centre in Sweden. In the present study, all patients with RRMS in the GIC and UMS who fulfilled the Poser criteria were included.22 Patients were excluded when SP occurred before the second distinct attack. For the present study, patient data were included in a database (matrix) starting from the second attack (the diagnostic event) and ending at detection of SP, censoring due to death, or the last examination before study termination. From the UMS, we included patients with onset of MS from 1 January 1975 onwards, with follow-up terminating on 31 December 2000. We previously derived the MSPS from GIC data with 50 years of longitudinal follow-up. However, in the present validation study, the UMS data were aligned only with the first 25 years of follow-up of each GIC patient.
During this follow-up, no disease-modifying therapy (DMT) was used in the GIC, but in the UMS, a few patients received ‘first generation DMT’ (IFN-beta or glatiramer acetate) during 1996–1999 (covering 99/1762 patient-years). Definitions of SP and MS attacks were described previously17 (see text box).
Outcome data
Complete relapse information was available for 84% of patients in the GIC and >99% in the UMS. The proportions of attacks in each severity grade were similar in the UMS (grade 0: 21%, grade 1: 47%, grade 2: 32%) and the GIC (grade 0: 18%, grade 1: 43%, grade 2: 38%).
The MSPS was calculated yearly for each patient, based on current age, time since last relapse, and severity grade of the last relapse17 (Figure 1). The first MSPS was recorded on 1 January after the second attack, and the last 24 years later. Patients were excluded when they entered SP or were censored during the same calendar year as the second attack, or when they were censored during the following year. Based on these criteria, the validation included 145 patients in the UMS and 144 in the GIC. The timescale for SP was 1 year.
Statistical methods and primary statistical analysis
The derivation (GIC) and validation (UMS) cohorts were described in terms of the median time to SP, evaluated by Kaplan–Meier estimates (SPSS version 22). Established predictors for time to SP were tested by log-rank test.19
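The Kaplan–Meier median used to describe the cohorts can be illustrated with a minimal product-limit sketch. The study itself used SPSS; the function names and the follow-up data below are hypothetical and serve only to show how a median time to SP is read off the estimated curve.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each distinct event time.

    times  -- follow-up time per patient (years)
    events -- 1 if SP was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Handle all patients sharing the same follow-up time together.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def median_time(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical follow-up data (not taken from the GIC or UMS).
times = [3, 5, 5, 8, 10, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
print(median_time(curve))
```

Censored patients leave the risk set without lowering the curve, which is why the median can differ markedly from a naive median of observed transition times.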
To evaluate the MSPS, we constructed a matrix with yearly MSPS values (GIC n = 1902, UMS n = 1101) and observed outcomes (SP or not) during the following year (1-year test period). Next, periods within predetermined MSPS strata (<0.025, 0.025–0.05, etc.) were summed. In each stratum, the expected number of SP transitions was compared with the number of annual observed SP transitions. The discrepancies between expected and observed SP transitions in each stratum of the UMS were evaluated with chi-square analysis.
This study was approved by the Gothenburg Research Ethics Committee (DNR 2016-08-15).
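The stratification step can be sketched as follows. This is an illustration only: the pairs are hypothetical, the names `EDGES`, `LABELS` and `stratify` are not from the original work, and assigning a score that falls exactly on a boundary to the higher stratum is an assumption.

```python
from bisect import bisect_right

# Stratum boundaries as listed in the tables (upper bounds of the first five strata).
EDGES = [0.025, 0.05, 0.075, 0.10, 0.125]
LABELS = ["<0.025", "0.025-0.05", "0.05-0.075", "0.075-0.10", "0.10-0.125", ">0.125"]


def stratify(pairs):
    """Sum one-year periods, observed SP events and expected SP events per stratum.

    pairs -- iterable of (msps, sp) where msps is the yearly score and
             sp is 1 if SP occurred during the following year, else 0
    """
    table = [
        {"stratum": label, "periods": 0, "observed": 0, "expected": 0.0}
        for label in LABELS
    ]
    for msps, sp in pairs:
        # Assumption: a score exactly on a boundary goes to the higher stratum.
        row = table[bisect_right(EDGES, msps)]
        row["periods"] += 1
        row["observed"] += sp
        row["expected"] += msps
    return table


# Hypothetical patient-year pairs (not study data).
pairs = [(0.02, 0), (0.03, 0), (0.04, 1), (0.06, 0), (0.09, 1), (0.11, 0), (0.20, 1)]
table = stratify(pairs)
for row in table:
    print(row)
```

The expected count in a stratum is simply the sum of the yearly scores in that stratum, which is what the "Expected" columns of the result tables report.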
Results
Comparison of the cohorts
The Kaplan–Meier estimates (Figure 2) indicated that the UMS tended to experience a lower frequency of SP transitions than the GIC; the median time to SP from the second attack was longer in the UMS than in the GIC (15.0 vs. 11.5 years, p = 0.19), and the median age at SP tended to be higher in the UMS than in the GIC (50.9 vs. 46.0 years, p = 0.066) (Table 1). Four factors that previously predicted the onset and 5-year limit in the GIC (sex, polyfocality, efferent/afferent symptoms, and complete remission8) could not predict the time from the second attack to SP in the UMS.
Table 1.
Characteristics | UMS | GIC |
---|---|---|
At least a 2nd attack before the outcome (n) | 172 | 156 |
Females | 76% | 65% |
Mean age at 2nd attack, y (SD) | 33 (9.1) | 33 (9.9) |
Median time to SP, from 1st attack, y, KM (95%CI) | 16.7 (13.5–19.8) | 15.0 (12.7–17.3) |
Median time to SP, from 2nd attack, y, KM (95%CI) | 15.0 (10.9–19.1) | 11.5 (9.2–13.8) |
Median age at SP, y, KM (95%CI) | 50.9 (46.8–55.0) | 46.0 (43.3–47.7) |
Transition to SP during 25-year follow-up, KM | 66% | 69% |
Patients with SP or censored during the period from the 2nd attack to the first prediction point (January 1st), censored during the first follow-up year, or lacking attack information (n) | 27 | 12 |
Patients included in the validation years 0 to 24 (n) | 145 | 144 |
Transition to SP in years 0 to 24 (n) | 54 | 100 |
Median time to SP, from 2nd attack, y, KM (95%CI) | 14.5 (10.4–18.6) | 11.7 (9.8–13.6) |
Number of attacks during the 0–24 years from 2nd attack (including the 2nd attack) | ||
Attacks that did not fulfil the criteria (n) | 6 | 91 |
Attacks that fulfilled the criteria and were included in the validation (n) | 627 | 478 |
UMS = Uppsala MS cohort, GIC = Gothenburg Incidence Cohort, SP = secondary progression, KM = Kaplan–Meier estimate.
Internal validation of the prediction score
As described in a previous study17 the MSPS gave a prediction of the immediate risk of SP at time point t, for a particular patient, which was interpreted as the instantaneous risk intensity at time t. These risk intensities were expressed on a yearly scale; thus when the risk intensity was constant during the following year, the probability of SP during that year would be approximately equal to the risk intensity (slightly smaller, particularly with high risk intensity).
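The remark that the one-year probability is slightly smaller than the intensity follows from the standard constant-hazard relation. As a worked illustration (the symbol $\lambda$ for the yearly risk intensity is introduced here and is not part of the original notation), if the intensity stays constant at $\lambda$ throughout the year, then

$$P(\text{SP within 1 year}) = 1 - e^{-\lambda} \approx \lambda - \frac{\lambda^{2}}{2}.$$

For $\lambda = 0.025$ this gives $1 - e^{-0.025} \approx 0.0247$, and for $\lambda = 0.125$ it gives $1 - e^{-0.125} \approx 0.1175$, so the gap between intensity and probability widens as the intensity grows.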
To investigate whether it remained valid to use the immediate updated intensity as a prediction of the probability of disease incidence during the coming year, we re-evaluated the first 25 years of data from the GIC.17 For each patient-year combination $i$ (of the matrix), we designated the expected probability (the MSPS) as $e(i)$ and the observed SP transition as $n(i)$, equal to 1 or 0 depending on whether or not SP took place for that patient during the following year. Next, we sorted the $(e(i), n(i))$-pairs into six interval strata of MSPS scores, ordered from low to high $e(i)$, and we summed the components within each stratum $j$ to obtain the expected and observed numbers, $E(j)$ and $N(j)$ (Table 2).
Table 2.
MSPS-defined risk strata | One-year periods | Observed | Expected |
---|---|---|---|
<0.025 | 186 | 1 | 3.1 |
0.025–0.05 | 722 | 18 | 27.4 |
0.05–0.075 | 400 | 28 | 25.0 |
0.075–0.10 | 328 | 25 | 28.2 |
0.10–0.125 | 95 | 9 | 10.4 |
>0.125 | 171 | 19 | 26.5 |
TOTAL | 1902 | 100 | 120.6 |
The observed column indicates how many patients in a particular risk group (stratum) experienced SP during the following year; the expected column indicates the sum of the predicted probabilities that SP would occur during the following year, for the patient-years indicated in each stratum.
The total number of observed SP events (n = 100) was substantially smaller than expected (n = 120.6), particularly in the two largest risk groups. The ratio, k, of the total observed SPs/expected SPs, was calculated as follows:

$$k = \frac{N}{E} = \frac{100}{120.6} \approx 0.829$$

This ratio was used as a simple scale-calibration, and the results provided a better fit to the data. We illustrated this fit by sorting the calibrated pairs of expected and observed SPs in increasing order of the predicted probability $e(i)$ and displaying the partial sums of these components in a graph. The plotted curve was near the diagonal (Figure 3), which indicated that the scale-calibrated MSPS was valid for both the high- and low-risk strata.
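The scale-calibration and the cumulative diagnostic curve can be sketched as below. The factor `k` uses the totals reported in Table 2; the individual pairs and the function name `cumulative_curve` are hypothetical and only illustrate how the partial sums behind a plot such as Figure 3 could be formed.

```python
from itertools import accumulate

# Calibration factor from the Table 2 totals: observed / expected.
k = 100 / 120.6


def cumulative_curve(pairs, scale):
    """Partial sums of calibrated expected and observed SP events.

    pairs -- iterable of (e, n): predicted probability and 0/1 outcome
    scale -- multiplicative calibration factor applied to each e
    Returns a list of (cumulative expected, cumulative observed) points,
    with the pairs taken in increasing order of e.
    """
    ordered = sorted(pairs, key=lambda pair: pair[0])
    cum_expected = list(accumulate(scale * e for e, _ in ordered))
    cum_observed = list(accumulate(n for _, n in ordered))
    return list(zip(cum_expected, cum_observed))


# Hypothetical patient-year pairs (not study data).
pairs = [(0.09, 1), (0.02, 0), (0.20, 1), (0.04, 0), (0.06, 0)]
curve = cumulative_curve(pairs, k)
for point in curve:
    print(point)
```

With a well-calibrated score, the plotted points stay close to the diagonal over the whole range; a systematic drift on one side indicates over- or underestimation in that part of the risk range.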
After the 25-year censoring time point, nine SP events occurred during 534 patient-years in the GIC, which were not included in the present study. Most of these patient-years represented low-risk strata, which could partially explain the overestimation of the MSPS observed in the first part of the plot in Figure 3.
Evaluations of Gothenburg predictions on Uppsala data
We performed an external statistical validation of the Gothenburg-based method, and the Gothenburg-calibrated Gothenburg-based method, with the independent Uppsala data set.
We first evaluated the observed and expected SP incidences that occurred in the UMS during the coming year (Table 3).
Table 3.
MSPS-defined risk strata | One-year periods | Observed | Expected |
---|---|---|---|
<0.025 | 43 | 1 | 0.7 |
0.025–0.05 | 263 | 8 | 10.8 |
0.05–0.075 | 267 | 12 | 17.1 |
0.075–0.10 | 262 | 10 | 22.4 |
0.10–0.125 | 68 | 5 | 7.5 |
>0.125 | 198 | 18 | 31.7 |
TOTAL | 1101 | 54 | 90.2 |
The observed column indicates how many patients in a particular risk group (stratum) experienced SP during the following year; the expected column indicates the sum of the predicted probabilities that SP would occur during the following year, for the patient-years indicated in each stratum.
Next, we considered whether there was a simple way to test how well the observed and predicted (expected) values in Table 3 coincided. First, we assumed that all the 0–1-prediction experiments were fixed in both number and probabilities, that all were independent, and that the risks were small and perfectly estimated the probabilities. Then, with $N(j)$ and $E(j)$ denoting the observed and expected numbers of SP transitions in stratum $j$, we could approximate the standardised differences $(N(j) - E(j))/\sqrt{E(j)}$ as independent standardised normally distributed random variables and apply a chi-square goodness-of-fit test. Basically, the $N(j)$ would be approximately independent and Poisson-distributed with means $E(j)$. To reduce the amount of data, we amalgamated the two smallest and the two largest risk strata (Table 4).
Table 4.
MSPS-defined risk strata | Observed | Expected |
---|---|---|
<0.05 | 9 | 11.5 |
0.05–0.075 | 12 | 17.1 |
0.075–0.10 | 10 | 22.4 |
>0.10 | 23 | 39.2 |
TOTAL | 54 | 90.2 |
T = 15.62 | p-value = 0.0036 |
The observed column indicates how many patients in a particular risk group (stratum) experienced SP during the following year; the expected column indicates the sum of the predicted probabilities of SP occurrence during the following year, for the patient-years indicated in each stratum; the observed statistic T and the corresponding p-value were derived from a chi-square test. Here, we used 4 degrees of freedom.
The goodness-of-fit test statistic was calculated as follows:

$$T = \sum_{j=1}^{4} \frac{\left(N(j) - E(j)\right)^{2}}{E(j)} = 15.62$$

where $N(j)$ and $E(j)$ are the observed and expected numbers of SP transitions in stratum $j$, and the number of degrees of freedom was set to 4. The quite low p-value (0.0036, Table 4) indicated that the model did not fit the data well. Thus, the Gothenburg-based MSPS predictions tended to overestimate the risks.
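The arithmetic behind Table 4 can be reproduced with a short sketch. The observed and expected counts are those of Table 4; the function names are illustrative, and the p-value uses the closed-form upper tail of the chi-square distribution with 4 degrees of freedom, $e^{-x/2}(1 + x/2)$, so that no statistics library is needed.

```python
import math

# Observed and expected SP transitions per stratum, from Table 4.
observed = [9, 12, 10, 23]
expected = [11.5, 17.1, 22.4, 39.2]


def chi_square_statistic(obs, exp):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over strata."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def chi2_sf_4df(x):
    """Upper-tail probability of a chi-square variable with 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)


T = chi_square_statistic(observed, expected)
p_value = chi2_sf_4df(T)
print(round(T, 2), round(p_value, 4))
```

Most of the statistic comes from the two highest strata, where the observed counts (10 and 23) fall well below the expected counts (22.4 and 39.2).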
Next, we asked whether the Gothenburg-calibrated method (multiplying the expected numbers by 0.829) would work better. The chi-square(4) statistic value was 7.1 (p = 0.069) for the Gothenburg-calibrated Uppsala data (no table reported). Again, the observed number of cases was much lower than predicted (54 observed vs. 75 predicted cases).
Another risk-scaling variable was the total number of cases divided by the expected number of cases (ratio = 0.599) in the Uppsala material (Table 5). Here, the values in the idealised independent Poisson approximation model were conditional on the total number of cases. This can be interpreted as the outcome of a multinomial distribution with probabilities proportional to the original expected numbers, for any unconditional model in which the parameters are proportional to the original prediction probabilities with an arbitrary proportionality parameter. This proportionality parameter is estimated by the ratio 54/90.2 = 0.599.
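As a sketch of this step, the Gothenburg calibration factor can be applied to the Table 4 expected counts and the goodness-of-fit statistic recomputed. The counts and the factor 0.829 are taken from the text; the variable names are illustrative, and only the statistic and the predicted total are computed here.

```python
# Observed and uncalibrated expected SP transitions per stratum, from Table 4.
observed = [9, 12, 10, 23]
expected = [11.5, 17.1, 22.4, 39.2]

K_GOTHENBURG = 0.829  # calibration factor derived from the GIC

# Scale every expected count by the Gothenburg factor.
calibrated = [K_GOTHENBURG * e for e in expected]

# Pearson statistic against the calibrated expectations.
T_calibrated = sum((o - c) ** 2 / c for o, c in zip(observed, calibrated))

predicted_total = sum(calibrated)
print(round(T_calibrated, 1), round(predicted_total))
```

The calibrated total of about 75 predicted cases still exceeds the 54 observed cases, which is why a cohort-specific scaling factor is considered next.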
Table 5.
MSPS-defined risk strata | Observed | Modified expected |
---|---|---|
<0.05 | 9 | 6.9 |
0.05–0.075 | 12 | 10.2 |
0.075–0.10 | 10 | 13.4 |
>0.10 | 23 | 23.5 |
TOTAL | 54 | 54 |
T′ = 1.83 | p-value = 0.61 |
The observed column indicates how many patients in a particular risk group (stratum) experienced SP during the following year; the modified expected column indicates the sum of the predicted probabilities of SP occurrence during the following year, for the patient-years indicated in each stratum, calibrated by multiplying by the factor k = 0.599; the observed statistic T′ and the corresponding p-value were derived from a chi-square test. Here, we used 3 degrees of freedom.
For this idealised Poisson model with a scaling factor, the relevant conditional chi-square test, now with 3 degrees of freedom, was calculated as follows:

$$T' = \sum_{j=1}^{4} \frac{\left(N(j) - k\,E(j)\right)^{2}}{k\,E(j)} = 1.83, \qquad k = \frac{54}{90.2} \approx 0.599$$

The resulting observed test statistic showed a good fit to the data (i.e. p-value = 0.61, which indicated a non-significant difference between the model and the data). Figure 4 shows a cumulative plot of the Uppsala-calibrated predictions for the Uppsala data.
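The Uppsala-calibrated test in Table 5 can be reproduced along the same lines. The counts come from Tables 4 and 5; the function name is illustrative, and the p-value uses the closed-form upper tail of the chi-square distribution with 3 degrees of freedom, $\operatorname{erfc}\!\left(\sqrt{x/2}\right) + \sqrt{2x/\pi}\,e^{-x/2}$.

```python
import math

# Observed and uncalibrated expected SP transitions per stratum, from Table 4.
observed = [9, 12, 10, 23]
expected = [11.5, 17.1, 22.4, 39.2]

# Cohort-specific scaling factor: total observed / total expected.
k = sum(observed) / sum(expected)

# "Modified expected" column of Table 5.
modified = [k * e for e in expected]

# Conditional chi-square statistic against the rescaled expectations.
t_prime = sum((o - m) ** 2 / m for o, m in zip(observed, modified))


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_3df(t_prime)
print(round(k, 3), [round(m, 1) for m in modified], round(t_prime, 2), round(p_value, 2))
```

One degree of freedom is lost relative to the uncalibrated test because the scaling factor is estimated from the same data, forcing the modified expected counts to sum to the observed total of 54.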
Remarks about the idealised modelling
In reality, both $N(j)$ and $E(j)$ are random and dependent between the strata. The convenient Poisson approximation used in the derivation of the MSPS has to be substituted by independent normal (0,1) approximations of the components $(N(j) - E(j))/\sqrt{E(j)}$ under the null hypothesis of a perfect time-updated prediction model that produces prediction probabilities $e(i)$ for person-year combinations $i$. This approximation works well for large numbers of patients, and it can be used to derive the approximate chi-square(4) distributions for the goodness-of-fit tests on the UMS validation cohort, using either the original Gothenburg-estimated or the Gothenburg-calibrated, Gothenburg-estimated prediction models. This normal approximation can also be used to derive the chi-square(3) distribution of T′ in a model where the prediction model is perfect up to an unknown positive scaling factor. However, we cannot derive that approximation for the conditional distribution using the idealised normal approximation; instead, we have to interpret the outcome as a result of the unconditional distribution of:

$$T' = \sum_{j} \frac{\left(N(j) - \frac{N}{E}\,E(j)\right)^{2}}{\frac{N}{E}\,E(j)}$$

where $N$ is the total number of SPs (observed), and $E$ is the total of expected SPs (sum of MSPSs).
Discussion
We tested whether our MSPS algorithm, which was derived from the GIC, could be generalised, based on the Swedish UMS validation cohort. We found that the expected number of SP transitions estimated with the MSPS corresponded well to the observed number in the UMS, after we calibrated it, based on a general trend towards lower rates of SP conversions in the UMS.
The MSPS is a novel type of predictor. It is based on a floating starting-point, with a timescale of 1 month, applied as a sliding window throughout the RRMS course.17 The starting point might, for example, be a patient’s visit or conceivably the start of a registry study with SP as the outcome. The average annual risk of a transition from RRMS to SP was previously estimated at approximately 4%.17 The MSPS identified a wide range of individual risks for SP, from <2.5% to >12.5% annually. This model apparently uncovered a basic relationship between relapses and progression, or rather, between age, severity, and relapse frequency and the onset of SP.
The strength of the MSPS was that it reduced complex data sets to a few readily available clinical parameters. Individual courses may alternate between periods of low and high risk, as defined by the MSPS, which presumably corresponded to periods of increasing or decreasing inflammatory activity. A limitation of the MSPS was that the EDSS was not included, due to the lack of sufficient EDSS values for the derivation process.
The main strength of the present validation procedure was the identical criteria in the derivation and validation cohorts, including the common relapse severity grade. However, the related structures of the derivation and validation cohorts might also be considered a limitation, because the generalizability of the MSPS remains to be demonstrated with data collected and studied outside Sweden and with different methods.
A limitation was the unavoidable subjectivity of timing of the SP; however, this parameter might have been more informative than the EDSS records, particularly in studies with few EDSS records available.14 A relationship between EDSS stage 4 and the onset of SP was described previously.25
A remaining challenge is the 25-year difference between the onsets of the two cohorts. There is a contemporary trend towards a milder course of MS, probably due to improved awareness and new diagnostic tests, which was described before the impact of the new disease-modifying MS drugs.26–29 This may explain why the calibration of the MSPS (0.599 × calculated risk) provided a better fit with the data. This change is not trivial, because it suggests a better prognosis for SP. The change might be related to the inclusion of more benign cases. Today all MS patients get disease-modifying MS drugs (DMDs). There is no consensus on whether the effect of DMDs is exerted by diminishing the relapse rate and associated progression, by lowering the propensity for entering SP, or by making SP slower and less disabling.
Practical conclusion
In this study, the MSPS identified a basic relapse-progression relationship. It allowed us to identify segments of the individual MS course that were associated with an increased risk of transition to SP. The web-based version (www.msprediction.com) is currently based on the calibration factor derived from the UMS, which was essentially an untreated cohort. If current MS therapy acts primarily by suppressing intermittent inflammatory episodes with a proportional reduction in the risk of subsequent SP, the MSPS may be valid in treated cohorts without further calibration. However, if there is a selective propensity for transition to SP in cohorts under highly effective therapy,30 further calibration may be required (estimated from total observed/MSPS data). The MSPS has potential that is much needed for controlling natural course confounders in registry studies and for dimensioning clinical trials. The use of the MSPS might considerably reduce the number of patients needed in a clinical trial.17
Conflict of Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Oluf Andersen received a research grant from Sanofi.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Gothenburg Multiple Sclerosis Society, Björnsson Foundation, Gothenburg.
Short abstract
Introduction
The Multiple Sclerosis Prediction Score (MSPS, www.msprediction.com) estimates, for any month during the course of relapsing–remitting multiple sclerosis (MS), the individual risk of transition to secondary progression (SP) during the following year.
Objective
Internal verification of the MSPS algorithm in a derivation cohort, the Gothenburg Incidence Cohort (GIC, n = 144) and external verification in the Uppsala MS cohort (UMS, n = 145).
Methods
Starting from their second relapse, patients were included and followed for 25 years. A matrix of MSPS values was created. From this matrix, a goodness-of-fit test and suitable diagnostic plots were derived to compare MSPS-calculated and observed outcomes (i.e. transition to SP).
Results
The median time to SP was slightly longer in the UMS than in the GIC (15 vs. 11.5 years; p = 0.19). The MSPS was calibrated with multiplicative factors: 0.599 for the UMS and 0.829 for the GIC. The calibrated MSPS provided a good fit between expected and observed outcomes (chi-square p = 0.61 for the UMS), indicating that the model was not rejected.
Conclusion
The results suggest that the MSPS has clinically relevant generalizability in new cohorts, provided that it is calibrated to the actual overall SP incidence in the cohort.
Acknowledgements
We are grateful to Professor Anders Odén for developing the methodology and for his earlier work implementing it in the Gothenburg cohort. We thank Leszek Stawiarz for his excellent help with the Swedish MS Registry, including the initial survey showing that Uppsala had the highest frequency of recorded relapses with severity indicators, and for his elegant solutions to problems originating from the registry database.
References
- 1. Tedeholm H, Skoog B, Lisovskaja V, et al. The outcome spectrum of multiple sclerosis: disability, mortality, and a cluster of predictors from onset. J Neurol 2015; 262: 1148–1163.
- 2. Kantarci OH, Weinshenker BG. Natural history of multiple sclerosis. Neurol Clin 2005; 23: 17–38.
- 3. Rovaris M, Confavreux C, Furlan R, et al. Secondary progressive multiple sclerosis: current knowledge and future challenges. Lancet Neurol 2006; 5: 343–354.
- 4. Bergamaschi R, Quaglini S, Trojano M, et al. Early prediction of the long term evolution of multiple sclerosis: the Bayesian Risk Estimate for Multiple Sclerosis (BREMS) score. J Neurol Neurosurg Psychiatry 2007; 78: 757–759.
- 5. Confavreux C, Vukusic S, Adeleine P. Early clinical predictors and progression of irreversible disability in multiple sclerosis: an amnesic process. Brain 2003; 126: 770–782.
- 6. Weinshenker BG, Rice GP, Noseworthy JH, et al. The natural history of multiple sclerosis: a geographically based study. 3. Multivariate analysis of predictive factors and models of outcome. Brain 1991; 114: 1045–1056.
- 7. Scalfari A, Neuhaus A, Daumer M, et al. Onset of secondary progressive phase and long-term evolution of multiple sclerosis. J Neurol Neurosurg Psychiatry 2014; 85: 67–75.
- 8. Runmarker B, Andersen O. Prognostic factors in a multiple sclerosis incidence cohort with twenty-five years of follow-up. Brain 1993; 116: 117–134.
- 9. Eriksson M, Andersen O, Runmarker B. Long-term follow up of patients with clinically isolated syndromes, relapsing-remitting and secondary progressive multiple sclerosis. Mult Scler 2003; 9: 260–274.
- 10. Manouchehrinia A, Zhu F, Piani-Meier D, et al. Predicting risk of secondary progression in multiple sclerosis: a nomogram. Mult Scler 2019; 25: 1102–1112.
- 11. Kalincik T, Manouchehrinia A, Sobisek L, et al. Towards personalized therapy for multiple sclerosis: prediction of individual treatment response. Brain 2017; 140: 2426–2443.
- 12. Scott TF, Gettings EJ, Hackett CT, et al. Specific clinical phenotypes in relapsing multiple sclerosis: the impact of relapses on long-term outcomes. Mult Scler Relat Disord 2016; 5: 1–6.
- 13. Koch-Henriksen N, Thygesen LC, Sorensen PS, et al. Worsening of disability caused by relapses in multiple sclerosis: a different approach. Mult Scler Relat Disord 2019; 32: 1–8.
- 14. Jokubaitis VG, Spelman T, Kalincik T, et al. Predictors of long-term disability accrual in relapse-onset multiple sclerosis. Ann Neurol 2016; 80: 89–100.
- 15. University of California SFMSET, Cree BAC, Hollenbach JA, et al. Silent progression in disease activity-free relapsing multiple sclerosis. Ann Neurol 2019; 85: 653–666.
- 16. Lizak N, Lugaresi A, Alroughani R, et al. Highly active immunomodulatory therapy ameliorates accumulation of disability in moderately advanced and advanced multiple sclerosis. J Neurol Neurosurg Psychiatry 2017; 88: 196–203.
- 17. Skoog B, Tedeholm H, Runmarker B, et al. Continuous prediction of secondary progression in the individual course of multiple sclerosis. Mult Scler Relat Disord 2014; 3: 584–592.
- 18. Kanis JA, Harvey NC, Johansson H, et al. FRAX update. J Clin Densitom 2017; 20: 360–367.
- 19. Skoog B, Runmarker B, Winblad S, et al. A representative cohort of patients with non-progressive multiple sclerosis at the age of normal life expectancy. Brain 2012; 135: 900–911.
- 20. Andersen O. From the Gothenburg Cohort to the Swedish Multiple Sclerosis Registry. Acta Neurol Scand Suppl 2012: 13–19.
- 21. Hillert J, Stawiarz L. The Swedish MS registry – clinical support tool and scientific resource. Acta Neurol Scand 2015; 132: 11–19.
- 22. Poser C, Paty D, Scheinberg L, et al. New diagnostic criteria for multiple sclerosis: guidelines for research protocols. Ann Neurol 1983; 13: 227–231.
- 23. Lublin FD, Reingold SC. Defining the clinical course of multiple sclerosis: results of an international survey. National Multiple Sclerosis Society (USA) Advisory Committee on Clinical Trials of New Agents in Multiple Sclerosis. Neurology 1996; 46: 907–911.
- 24. Schumacher GA, Beebe G, Kibler RF, et al. Problems of experimental trials of therapy in multiple sclerosis: report by the Panel on the Evaluation of Experimental Trials of Therapy in Multiple Sclerosis. Ann N Y Acad Sci 1965; 122: 552–568.
- 25. Lorscheider J, Buzzard K, Jokubaitis V, et al. Defining secondary progressive multiple sclerosis. Brain 2016; 139: 2395–2405.
- 26. Tremlett H, Paty D, Devonshire V. Disability progression in multiple sclerosis is slower than previously reported. Neurology 2006; 66: 172–177.
- 27. Kister I, Chamot E, Cutter G, et al. Increasing age at disability milestones among MS patients in the MSBase Registry. J Neurol Sci 2012; 318: 94–99.
- 28. Koch-Henriksen N, Laursen B, Stenager E, et al. Excess mortality among patients with multiple sclerosis in Denmark has dropped significantly over the past six decades: a population based study. J Neurol Neurosurg Psychiatry 2017; 88: 626–631.
- 29. Burkill S, Montgomery S, Hajiebrahimi M, et al. Mortality trends for multiple sclerosis patients in Sweden from 1968 to 2012. Neurology 2017; 89: 555–562.
- 30. Granqvist M, Boremalm M, Poorghobad A, et al. Comparative effectiveness of rituximab and other initial treatment choices for multiple sclerosis. JAMA Neurol 2018; 75: 320–327.