3 Committee discussion

The evaluation committee considered evidence submitted by Sanofi, a review of this submission by the external assessment group (EAG), and responses from stakeholders. See the committee papers for full details of the evidence.

The condition

Multiple myeloma

3.1

Multiple myeloma is an incurable, relapsing and remitting cancer of plasma cells. It is a chronic condition that affects how long people live and the quality of their lives. The patient experts emphasised that multiple myeloma is a highly individual and complex cancer that has a wide range of symptoms and varies in severity. They explained that the condition has a large psychological impact because of the constant possibility of relapse. With each relapse, the condition is more difficult to treat and the number of future treatment options becomes more limited. The patient experts added that the condition can also have a large impact on quality of life, affecting all aspects of life for people with the condition, and their family and carers. The committee acknowledged that multiple myeloma is a chronic, incurable, highly individual condition that can have a negative impact on quality of life for people with the condition, and their families and carers.

Treatment pathway

3.2

First-line treatment options for people with multiple myeloma depend on whether a stem cell transplant may be suitable. NICE recommends the following treatment options at first line when a stem cell transplant is not suitable:

thalidomide, cyclophosphamide and dexamethasone (see NICE's technology appraisal guidance on bortezomib and thalidomide for the first-line treatment of multiple myeloma [TA228])
bortezomib, cyclophosphamide and dexamethasone (Bor‑Cyclo‑Dex; see TA228)
bortezomib, melphalan and prednisone (Mel‑Bor‑Pred; see TA228)
lenalidomide and dexamethasone (Len‑Dex; see NICE's technology appraisal guidance on Len-Dex for previously untreated multiple myeloma [TA587])
daratumumab, lenalidomide and dexamethasone (Dar‑Len‑Dex; see NICE's technology appraisal guidance on Dar-Len-Dex for untreated multiple myeloma when a stem cell transplant is unsuitable [TA917]).

The clinical experts explained that multiple myeloma becomes resistant to subsequent lines of treatment. This means the most effective treatment should be used as early as possible in the treatment pathway to achieve the deepest response and to prolong remission. The company explained that the thalidomide combination is very rarely used in the NHS. It added that Dar‑Len‑Dex is standard care for NHS patients with newly diagnosed multiple myeloma when an autologous stem cell transplant (ASCT) is unsuitable. The clinical experts explained that Dar‑Len‑Dex is the most relevant comparator. They added that people whose condition is not suitable for Dar‑Len‑Dex would not generally be offered isatuximab plus bortezomib, lenalidomide and dexamethasone (Isa‑Bor‑Len‑Dex). But they highlighted that a few people who would currently have bortezomib-based regimens at first line for specific reasons such as renal failure may be offered Isa‑Bor‑Len‑Dex instead. The NHS England Cancer Drugs Fund clinical lead explained that, each year in the NHS, 2,400 people have Dar‑Len‑Dex and 300 people have Len‑Dex. They also explained that NHS England does not collect data on the number of people who have bortezomib-based regimens. At the first meeting, the committee noted that a few people are offered combination treatments other than Dar‑Len‑Dex in the NHS. But it concluded that Dar‑Len‑Dex was the most relevant comparator for Isa‑Bor‑Len‑Dex.

Clinical evidence

Key clinical trial: IMROZ

3.3

The clinical-effectiveness evidence for Isa‑Bor‑Len‑Dex came from IMROZ, a phase 3, multicentre, international, randomised, open-label study. It compared Isa‑Bor‑Len‑Dex (n=265) with bortezomib, lenalidomide and dexamethasone (Bor‑Len‑Dex; n=181) in adults with newly diagnosed multiple myeloma when an ASCT was unsuitable. IMROZ had an initiation phase comprising 4 cycles of 6 weeks of treatment. This was followed by a maintenance phase of 4‑week treatment cycles in which bortezomib was no longer used. In the maintenance phase, people randomised to the Bor‑Len‑Dex arm whose condition had progressed were allowed to cross over from Len‑Dex to Isa‑Len‑Dex.

The primary endpoint was progression-free survival (PFS) as assessed by an independent review committee. The EAG noted that the company had presented results using data from the 26 September 2023 data cut, at which time median follow up was 59.7 months. It explained that this data could be considered immature because few progression events had been observed during the follow-up period. So, median PFS had not been reached in the Isa‑Bor‑Len‑Dex arm. Median overall survival (OS) had not been reached in either trial arm. Clinical advice to the EAG was that people in IMROZ had similar demographic and disease characteristics to people seen in the NHS. So, the results of the trial were generalisable to NHS clinical practice. At the first meeting, the committee noted that the average age of people in IMROZ was 71.6 years, which was younger than would be expected for people in the NHS. One reason for this was that IMROZ excluded people who were over 80 years. The clinical experts explained that quadruplet combination treatments are usually only suitable for people who are fit enough to have them. They advised that a small proportion of people over 80 years would be considered fit enough to have Isa‑Bor‑Len‑Dex. They added that only about 60% of people currently offered Dar‑Len‑Dex would be considered fit enough to have Isa‑Bor‑Len‑Dex because of the additional treatment burden of quadruplet over triplet treatment combinations. At the first meeting, the committee concluded that the IMROZ population was younger and fitter than the NHS population. But it agreed that the relative effects from IMROZ are likely to be generalisable to NHS clinical practice. It considered whether the data from IMROZ was sufficiently mature. It concluded that, although the data was immature and this contributed to uncertainty in the survival analysis, the results of the trial were suitable for decision making.

Indirect treatment comparisons

3.4

The company considered a network meta-analysis (NMA) for the comparison with Dar‑Len‑Dex, Len‑Dex and Mel‑Bor‑Pred. The company explained that it had included SWOG S0777 to allow network connectivity but this may have introduced substantial biases to the NMA. This was because outcome data was not reported specifically for the transplant-ineligible subgroup relevant to this decision problem. Instead, it was categorised by:

age ('under 65 years' compared with '65 years and over')
intent to transplant ('yes' compared with 'no')
whether there was a transplant ('yes' compared with 'no').

The company proposed that the subgroup of people aged 65 years and over was the most appropriate proxy for a transplant-ineligible subgroup analysis. But it explained that, because transplant eligibility is not solely defined by age, this subgroup may not have fully represented the transplant-ineligible population. Also, 18 out of 91 people in this subgroup had an intention to have a transplant, but the number who went on to have a transplant in this age group was unknown. Randomisation was also not preserved between treatment arms within this subgroup because the trial was not stratified by age. This led to potential imbalances in patient characteristics between treatment arms, biasing the estimate of relative treatment effects. The company also explained that in SWOG S0777, bortezomib was administered intravenously rather than subcutaneously. This caused a large proportion (23%) of people to prematurely stop Bor‑Len‑Dex induction treatment because of neuropathy. This meant that the survival outcomes for the Bor‑Len‑Dex arm may have been underestimated and confounded by early treatment discontinuation.

The EAG agreed with the company that the NMA results were unlikely to be robust because of the inclusion of the non-randomised subgroup from SWOG S0777 data. So, the company explained that it instead chose to use an unanchored matching-adjusted indirect comparison (MAIC). The EAG noted that unanchored MAICs rely on strong assumptions, including that all prognostic factors and treatment-effect modifiers are adjusted for. It explained that bias from unmeasured confounding may have been introduced. It also explained that it was not possible to adjust for 3 out of 9 of the identified prognostic factors or treatment-effect modifiers (chromosomal abnormality 1q21+, serum lactate dehydrogenase [LDH] levels and frailty). The clinical experts advised that, of these, only frailty was potentially of concern because it is closely related to age. They added that this is an important prognostic variable and also determines potential eligibility for Isa‑Bor‑Len‑Dex and Dar‑Len‑Dex. The company explained that frailty is usually a composite measure involving variables such as age and Eastern Cooperative Oncology Group Performance Status (ECOG PS). It said that it had already adjusted for these in the analysis. The clinical experts agreed that these were components of frailty, but said that frailty was much broader and included several other variables such as comorbidities.
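To illustrate the general idea behind a MAIC, the sketch below reweights individual patient data from one trial so that the weighted mean of a single baseline characteristic matches the published mean from a comparator trial. This is not the company's analysis. The ages, the target mean, the single-covariate setup and the function names are all hypothetical, and a real MAIC would match several characteristics at once.

```python
import math

def maic_weights(values, target_mean, iterations=50):
    """Method-of-moments weights w_i = exp(alpha * (x_i - target_mean)).

    alpha is found by Newton's method so that the weighted mean of
    `values` equals `target_mean` (single covariate, illustration only).
    """
    centred = [v - target_mean for v in values]
    alpha = 0.0
    for _ in range(iterations):
        w = [math.exp(alpha * c) for c in centred]
        grad = sum(c * wi for c, wi in zip(centred, w))      # zero when balanced
        hess = sum(c * c * wi for c, wi in zip(centred, w))  # always positive here
        alpha -= grad / hess
    return [math.exp(alpha * c) for c in centred]

def effective_sample_size(weights):
    """Approximate number of independent patients left after weighting."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Hypothetical individual patient ages and a hypothetical comparator-trial mean.
ages = [62, 66, 70, 71, 73, 75, 78, 80]
target_mean_age = 73.0

weights = maic_weights(ages, target_mean_age)
weighted_mean_age = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
ess = effective_sample_size(weights)
```

Characteristics that cannot be included in the matching, such as frailty in the discussion above, remain unbalanced after weighting, which is one reason an unanchored comparison depends on strong assumptions.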

At the first meeting, the committee considered the relative merits of both indirect treatment comparison (ITC) approaches. It noted that it would prefer to use randomised evidence when available. It agreed that there was not enough justification to discount the possibility of using results from the NMA with the inclusion of SWOG S0777 data. It noted that the relative effect from SWOG S0777 was likely to be applicable to the transplant-ineligible population. The committee suggested to the company that it could be possible to preserve randomisation by using the no intent-to-transplant subgroup as a proxy for when an ASCT is unsuitable. The company responded that its decision to discount this possibility had been informed by clinical expert opinion. The clinical experts explained that intent to transplant is not the same as having an ASCT. This is because intent is not always based on fitness. So, it is not the same as transplant suitability because the intention may change over time after the initial clinical assessment. The EAG noted that an unanchored MAIC is less robust than an NMA. But it explained that the limitations of the MAIC were known, whereas the risk and direction of bias in the company's NMA were unknown. The committee expressed concern that use of the NMA results was not considered as part of the company's base case. It was particularly concerned about whether it was possible to preserve randomisation using the no intent-to-transplant subgroup as a proxy. The committee noted that randomisation was stratified based on intent to transplant. At the first meeting, the committee concluded that it would like the company to present results from an NMA using the SWOG S0777 data with randomisation preserved. It added that this should be done using both the intention-to-treat (ITT) population and the no intent-to-transplant subgroup as a proxy for the transplant-ineligible population.

At consultation, the company presented results from an updated NMA that used the ITT and no intent-to-transplant subgroups from the SWOG S0777 study. The exact results of the NMA are considered confidential by the company and cannot be reported here. The company highlighted that the NMA result, which compared the ITT population in SWOG S0777 with that in IMROZ, introduced bias. This was because there were substantial differences between the 2 populations in baseline characteristics (including age and lactate dehydrogenase levels) and in the formulation of bortezomib. Other key characteristics of the ITT population taken from SWOG S0777, such as frailty, were not reported. The company said that the difference in baseline characteristics of the 2 populations limited the comparability of different treatment effects. Also, the company highlighted that a proportion of people in the SWOG S0777 ITT population had an autologous stem cell transplant, which is a confounding factor. The EAG agreed that the baseline characteristics for the SWOG S0777 and IMROZ ITT populations were substantially different from each other. So, they would not produce a reliable estimate of treatment effectiveness between Isa‑Bor‑Len‑Dex and Dar‑Len‑Dex.

At the second meeting, the clinical experts confirmed that age and frailty would have affected the treatment-effectiveness estimates. They also noted that SWOG S0777 was outdated and may have recruited people with different characteristics. The company noted that the no intent-to-transplant subgroup may have included people eligible for a transplant but who chose not to have one. It also noted that the transplant-ineligible population in IMROZ was defined using clinical criteria that accounted for age, comorbidity and performance status. So, the no intent-to-transplant subgroup would have included everyone who could not have a transplant but also people who chose not to have one despite being eligible. The company said it was likely that there were key differences between the 2 populations, including in effect modifiers such as age, renal function and frailty. There were no baseline characteristics reported for the no intent-to-transplant subgroup. So, potential differences between the IMROZ trial population and the no intent-to-transplant subgroup in SWOG S0777 could not be assessed. The company concluded that the no intent-to-transplant subgroup was more appropriate than the ITT population to inform the NMA. But differences between the 2 populations remained because some people eligible for transplant had been included. The company opted for its original MAIC to inform the base case, which used a non-time-varying approach. The EAG agreed with the company that the no intent-to-transplant subgroup was a more suitable population to inform the NMA than the ITT population. But it thought that this introduced confounding that could not be quantified. The EAG raised concerns that this method was also unsuitable because no baseline characteristics were available for the no intent-to-transplant population.

The committee highlighted that none of the ITC methods yielded statistically significant results for OS. The PFS results from the updated NMA using randomised populations were similar to the PFS estimated from the MAIC. The exact results for OS and PFS are considered confidential by the company so cannot be reported here. The committee noted that it would be preferable to use an ITC that maintains randomisation, so it preferred the NMA methodology. But it noted substantial concerns about heterogeneity between the studies, for which adjustments could not be made because:

there is no data
there is a lack of baseline characteristics in the no intent-to-transplant population
the conditions for a valid indirect comparison were not met.

The committee concluded that its preferred analysis was a non-time-varying MAIC using a constant hazard ratio because of the limitations in using an NMA that maintains randomisation.

Clinical-effectiveness results

3.5

For the ITT population, death or disease progression occurred in 84 (31.7%) people in the Isa‑Bor‑Len‑Dex arm and 78 (43.1%) in the Bor‑Len‑Dex arm. Median follow up was 59.70 months. The hazard ratio was 0.596 (98.5% confidence interval [CI] 0.406 to 0.876; p=0.0005). This corresponded to a 40.4% reduction in the risk of disease progression or death with Isa‑Bor‑Len‑Dex compared with Bor‑Len‑Dex. The median PFS was not reached (NR) in the Isa‑Bor‑Len‑Dex arm and was 54.34 months (95% CI 45.207 to NR) in the Bor‑Len‑Dex arm. OS was a secondary endpoint. At median follow up, 69 (26.0%) people had died in the Isa‑Bor‑Len‑Dex arm and 59 (32.6%) had died in the Bor‑Len‑Dex arm. The hazard ratio for OS for Isa‑Bor‑Len‑Dex compared with Bor‑Len‑Dex was 0.776 (99.97% CI 0.407 to 1.480; p=0.0760). Results from the unanchored MAICs and the NMA for Dar‑Len‑Dex and Len‑Dex are considered confidential by the company and cannot be reported here. For Mel‑Bor‑Pred, the hazard ratio for the comparison with Isa‑Bor‑Len‑Dex for PFS was 0.20 (95% CI 0.15 to 0.27) and for OS was 0.50 (95% CI 0.37 to 0.67). For Bor‑Cyclo‑Dex, the hazard ratio for the comparison with Isa‑Bor‑Len‑Dex for PFS was 0.34 (95% CI 0.25 to 0.47) and for OS was 0.48 (95% CI 0.33 to 0.69).
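The percentages and the relative risk reduction quoted above follow directly from the reported counts and the PFS hazard ratio. The short check below reproduces that arithmetic. The variable and function names are illustrative only.

```python
# Counts and hazard ratio as reported in the text for the IMROZ ITT population.
n_isa, n_bor = 265, 181                  # people randomised to each arm
pfs_events_isa, pfs_events_bor = 84, 78  # progression or death events
deaths_isa, deaths_bor = 69, 59          # deaths at median follow up
hr_pfs = 0.596                           # PFS hazard ratio

def pct(events, n):
    """Percentage of people with an event, rounded to 1 decimal place."""
    return round(100 * events / n, 1)

pfs_pct_isa = pct(pfs_events_isa, n_isa)
pfs_pct_bor = pct(pfs_events_bor, n_bor)
os_pct_isa = pct(deaths_isa, n_isa)
os_pct_bor = pct(deaths_bor, n_bor)

# A hazard ratio of 0.596 corresponds to a (1 - 0.596) relative reduction
# in the hazard of progression or death.
risk_reduction_pct = round(100 * (1 - hr_pfs), 1)
```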

Economic model

Company's modelling approach

3.6

The company provided a partitioned survival model to estimate the cost effectiveness of Isa‑Bor‑Len‑Dex compared with Dar‑Len‑Dex and the other comparator combinations. The model included 3 health states: progression free (with subhealth states for on and off treatment), progressed disease and death. The probability of being in each health state was calculated using extrapolated PFS and OS curves. The model used a cycle length of 2 weeks with a half-cycle correction over a lifetime horizon of 29 years (the starting age in the model was 71.6 years). The OS rate was capped by the age and gender-matched general population mortality rate. Everyone was assumed to be dead at age 100 years. In each cycle, the PFS rate was capped by the OS rate for the same time period to ensure that PFS never exceeded OS. At the first meeting, the committee concluded that, overall, the company's model structure was acceptable for decision making. But it noted that the multiple myeloma treatment pathway is becoming increasingly complex, with more lines of treatment available. So, it noted that having a single progressed-disease health state was a simplification and may not fully reflect the current treatment pathway and quality of life in this health state. It also recalled that the starting age used in the model, based on the age in IMROZ, was younger than would be expected in NHS clinical practice (see section 3.3). So, it requested that the model be updated to include a starting age reflecting the NHS population and based on an appropriate source. Its preference was the age of people having Dar‑Len‑Dex in Systemic Anti-Cancer Therapy (SACT) data.
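The structure of a 3-state partitioned survival model can be sketched as follows. This is a minimal illustration and not the company's model. The constant hazards are hypothetical placeholders, and the general population cap is applied to the survival curve rather than to the mortality rate, which is a simplification. Only the 2-week cycle, the 29-year horizon and the capping of PFS by OS are taken from the text.

```python
import math

CYCLE_YEARS = 2 / 52   # 2-week cycle length, as described in the text
HORIZON_YEARS = 29     # lifetime horizon, as described in the text

# Hypothetical constant yearly hazards, chosen only for illustration.
HAZARD_PFS = 0.20
HAZARD_OS = 0.10
HAZARD_GENERAL_POP = 0.04  # placeholder for general population mortality

def survival(hazard, t_years):
    """Exponential survival function S(t) = exp(-hazard * t)."""
    return math.exp(-hazard * t_years)

def state_occupancy(t_years):
    """Proportion of the cohort in each health state at time t."""
    # OS cannot exceed general population survival (simplified cap).
    os_ = min(survival(HAZARD_OS, t_years), survival(HAZARD_GENERAL_POP, t_years))
    # PFS is capped by OS so the progressed state is never negative.
    pfs = min(survival(HAZARD_PFS, t_years), os_)
    return {"progression_free": pfs, "progressed": os_ - pfs, "dead": 1 - os_}

def life_years():
    """Undiscounted life years with a half-cycle (trapezoidal) correction."""
    n_cycles = round(HORIZON_YEARS / CYCLE_YEARS)
    total = 0.0
    for k in range(n_cycles):
        alive_start = 1 - state_occupancy(k * CYCLE_YEARS)["dead"]
        alive_end = 1 - state_occupancy((k + 1) * CYCLE_YEARS)["dead"]
        total += 0.5 * (alive_start + alive_end) * CYCLE_YEARS
    return total

occupancy_at_5_years = state_occupancy(5.0)
```

Because the progressed-disease proportion is simply OS minus PFS, the model has no memory of how long someone has been in that state, which is one reason a single progressed-disease state is a simplification.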

At consultation, the company updated the starting age in the model to 75 years. This value was taken from Djebbari et al., which reported the median age of a transplant-ineligible population in the NHS. The company highlighted that SACT data was not available for a transplant-ineligible population and the median age of people having Dar‑Len‑Dex is not reported in the literature. The Cancer Drugs Fund clinical lead said that the median age of people starting Dar‑Len‑Dex in the NHS was 77 years and that this age is higher than in IMROZ. The EAG highlighted that it was inappropriate to have a starting age in the model that differed from the starting age in IMROZ, unless the treatment-effectiveness estimates in the trial were adjusted to reflect the updated starting age in the model. The committee recalled that the average age of people in IMROZ was lower than that of people who would be having treatment in the NHS (see section 3.3). So, it preferred the starting age of 75 years in the model to reflect an NHS population.

Modelling PFS and OS

3.7

In its base case, the company modelled differences in PFS and OS between treatments based on extrapolated data from IMROZ Kaplan–Meier curves and the ITCs. The company jointly fitted distributions to MAIC-adjusted IMROZ data (Isa‑Bor‑Len‑Dex) and comparator (Dar‑Len‑Dex, Mel‑Bor‑Pred and Len‑Dex) trial OS and PFS Kaplan–Meier data. For Bor‑Cyclo‑Dex, the company also jointly fitted distributions to IMROZ ITT data for Isa‑Bor‑Len‑Dex. For Bor‑Cyclo‑Dex, it used PFS data and OS adjusted with inverse probability weighting. The company then estimated time-varying hazard ratios by comparing intervention and comparator survival estimates at different time points. These hazard ratios were applied to IMROZ ITT population OS and PFS distributions to generate survival estimates for people having Dar‑Len‑Dex, Mel‑Bor‑Pred, Len‑Dex and Bor‑Cyclo‑Dex. The company thought that the most appropriate distributions to generate long-term survival estimates were the Gompertz distribution for OS and the Gamma distribution for PFS. The EAG explained its view that the company's approach was overly complicated. It thought that the fitted distributions used to estimate time-varying hazard ratios could have been used directly in the company's model. It further explained that, at all time points, survival estimates based on IMROZ MAIC-adjusted OS and PFS data were lower than the survival estimates based on IMROZ ITT data. So, generating survival estimates based on IMROZ Isa‑Bor‑Len‑Dex ITT Kaplan–Meier data generated optimistic OS estimates for Isa‑Bor‑Len‑Dex. The EAG also noted that the company had fitted distributions to the full 68 months of available IMROZ data but that, after 60 months, the only events remaining were censoring events. The company agreed with the EAG that most events past 60 months were censoring events. But it explained that limiting analysis to 60 months did not take into account all the available evidence. This was particularly true in the case of the MAIA trial (Dar‑Len‑Dex compared with Len‑Dex in people with newly diagnosed multiple myeloma in whom an ASCT is unsuitable). Survival data from this trial was available for up to 100 months of follow up.

The company explained that conventional extrapolation techniques apply less emphasis to the 'tail' of data when there are fewer people at risk. Censoring people after 60 months for Isa‑Bor‑Len‑Dex only marginally changed the survival estimates for OS and PFS. This suggested that the tail did not introduce substantial uncertainty. The EAG disagreed with the company's preference for including data beyond 60 months, and noted that including this data contributed to overly optimistic OS estimates for Isa‑Bor‑Len‑Dex. So, at the clarification stage, it asked the company to provide analysis in which distributions were fitted only to the first 60 months of data. The company provided 2 new analyses in response to the EAG's request. Scenario A was limited to 60 months of data from IMROZ and MAIA, and was the analysis preferred by the EAG. Scenario B used the full 68‑month follow up of IMROZ and also included additional follow up from MAIA up to 100 months. But the company did not do what the EAG had requested, which was to fit separate distributions to the first 60 months of MAIC-adjusted IMROZ Isa‑Bor‑Len‑Dex data and comparator trial data and use these distributions directly in the economic model. Instead, the company maintained its original approach, but using the 60‑month data. The clinical experts explained that the high proportion of censoring events after 60 months in IMROZ added uncertainty to the survival analysis. They added that it is often preferable to use as much clinical trial data as is available. But they thought it was reasonable to exclude IMROZ data after 60 months from the analysis.

After the EAG's request at the clarification stage for new analyses up to 60 months, the company revised its selection of parametric curves for Isa‑Bor‑Len‑Dex. It preferred using Weibull for PFS and generalised gamma for OS. For PFS, the EAG explained that it thought that the Gompertz distribution was a better choice than Weibull. This was because it was similarly ranked and generated estimates that were more closely aligned to clinical expert opinion. For OS, the EAG noted that the Gompertz distribution was a better fit based on Akaike information criterion and Bayesian information criterion statistics. It also generated OS estimates that were closer to clinician landmark estimates. At the first meeting, the clinical experts noted that, because of the age of people at diagnosis, it was very difficult to validate estimates of PFS and OS. This was particularly difficult at the 20‑year timepoint, at which the model extrapolations were all overoptimistic. But, despite this caveat, the clinical experts thought that the clinical expert opinion provided by the company was reasonable. They also preferred the EAG's chosen distributions because of their closer alignment with these clinician landmark estimates. The EAG explained that the choice of extrapolation had a relatively small impact on cost effectiveness. This was particularly so when compared with the choice of whether to use the 60‑month or 68‑month analysis.
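For readers unfamiliar with the fit statistics mentioned above, the sketch below shows how Akaike and Bayesian information criteria are calculated from a fitted model's log-likelihood. Lower values indicate a better trade-off between fit and complexity. The log-likelihoods, parameter counts and sample size are hypothetical and are not taken from the company's or the EAG's analyses.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits to the same data: (log-likelihood, number of parameters).
candidate_fits = {
    "Gompertz": (-512.3, 2),
    "Weibull": (-513.1, 2),
    "generalised gamma": (-511.9, 3),
}
n_obs = 265  # hypothetical sample size used for the BIC penalty

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs))
    for name, (ll, k) in candidate_fits.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
```

Statistical fit only describes the observed period. As the discussion above shows, clinical plausibility of the long-term extrapolation also informs the choice of distribution.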

At the first meeting, the committee agreed with the EAG that the company's approach was overly complicated, and that the calculation of time-varying hazard ratios was an unnecessary step. The committee noted that the company's and EAG's approach to modelling OS and PFS was highly uncertain because it relied on the results from the unanchored MAIC. It recalled that it would have preferred to see ITC results from an NMA that maintained randomisation (see section 3.4). So, it was unable to conclude on the most appropriate OS and PFS parametric distribution, and whether 60 or 68 months of data should be used. The committee concluded that, because it had not seen an appropriate analysis to model PFS and OS, it could not state its preference. If appropriate, and data allows, the committee's preferred method to model OS and PFS for Isa‑Bor‑Len‑Dex and comparators would be to apply the hazard ratio generated from an NMA (which maintained randomisation) to an appropriate reference curve such as Dar‑Len‑Dex OS and PFS curves from MAIA or Dar‑Len‑Dex SACT data.
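Applying a constant hazard ratio to a reference curve relies on the proportional hazards relationship, in which the survival curve for the treatment of interest equals the reference survival curve raised to the power of the hazard ratio. The sketch below illustrates this. The Weibull parameters and the hazard ratio are hypothetical placeholders, not values from MAIA, IMROZ or any ITC.

```python
import math

def reference_survival(t_months, scale=80.0, shape=1.2):
    """Hypothetical Weibull reference curve: S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t_months / scale) ** shape))

def apply_hazard_ratio(s_ref, hazard_ratio):
    """Under proportional hazards, S_treatment(t) = S_reference(t) ** HR."""
    return s_ref ** hazard_ratio

hypothetical_hr = 0.8  # placeholder; values below 1 favour the new treatment

timepoints_months = [0, 12, 24, 60, 120]
reference_curve = [reference_survival(t) for t in timepoints_months]
treatment_curve = [apply_hazard_ratio(s, hypothetical_hr) for s in reference_curve]
```

A hazard ratio of 1 leaves the reference curve unchanged, which corresponds to an assumption of equal survival between the 2 treatments.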

At consultation, the company applied the hazard ratio generated from the MAIC to the Dar‑Len‑Dex curves from MAIA. Also, it used the same extrapolations for Dar‑Len‑Dex that were used in TA917. For OS, the generalised gamma reference curve informed the company's base case. For PFS, the gamma reference curve was used. A 68‑month IMROZ data cut was maintained in its analysis. The company said that the updated survival estimates aligned with the views of the clinical experts that it had consulted. The EAG highlighted that the decision to anchor the hazard ratios to MAIA or IMROZ should be based on which trial population is most similar to the NHS population. The committee heard that the average age of people in MAIA was slightly higher than in IMROZ. At the second committee meeting, the clinical experts confirmed that the updated survival and progression-free estimates predicted in the model were more aligned with what they expected for Dar‑Len‑Dex. The committee concluded that the company's approach was appropriate for extrapolating OS and PFS and that the hazard ratios should be applied to the Dar‑Len‑Dex curves from MAIA because the average age in MAIA is more reflective of people having treatment in the NHS.

OS benefit

3.8

The company explained that the MAIC results suggested an OS benefit for Isa‑Bor‑Len‑Dex compared with Dar‑Len‑Dex because the hazard ratio was less than 1. Based on Kaplan–Meier curves, the mortality rates for Isa‑Bor‑Len‑Dex and Dar‑Len‑Dex were:

equal for the first 12 months
possibly higher for Dar‑Len‑Dex between months 12 and 24
possibly higher for Isa‑Bor‑Len‑Dex between months 24 and 36, becoming equal again from month 36 onwards.

The EAG noted that the available clinical-effectiveness evidence was not sufficient to support the assumption that people having Isa‑Bor‑Len‑Dex live longer than people having Dar‑Len‑Dex. It also noted that this apparent fluctuation in mortality rates over time may have been a statistical artefact. This was because the OS hazard ratios presented by the company for this comparison were close to 1 and not statistically significantly different from 1.

At the first meeting, the company explained its view that the increase in mortality for people having Isa‑Bor‑Len‑Dex could be attributed to the 12 COVID‑19-related deaths in IMROZ. No COVID‑19-related deaths were recorded in MAIA because some of the trial was done before the pandemic. Had these COVID‑19-related deaths not occurred in IMROZ, it would have expected the survival of Isa‑Bor‑Len‑Dex to have remained above the Dar‑Len‑Dex OS curve. But the EAG noted that it was not correct to assume that these deaths were caused by COVID‑19. This was because only a positive COVID‑19 test had been recorded on the death certificates. It was unknown whether COVID‑19 was the cause of death. The EAG also noted that the number of deaths associated with a positive COVID‑19 test in IMROZ was low (n=12). So, it seemed unlikely that this fully explained the fluctuations. It also did not explain why mortality hazards from month 36 onwards appeared essentially identical for people having Isa‑Bor‑Len‑Dex and people having Dar‑Len‑Dex. The EAG noted that the survival differences appeared to be small. Despite this, the company's life year gain estimates accounted for about 30% of the total quality-adjusted life year (QALY) gain for Isa‑Bor‑Len‑Dex compared with Dar‑Len‑Dex. The EAG explained that these inconsistencies and uncertainties in the OS data suggested that it was unclear whether people having Isa‑Bor‑Len‑Dex live longer than people having Dar‑Len‑Dex. So, it preferred to set OS as equal between the 2 treatments. At the first meeting, the committee agreed with the EAG that it had not yet seen sufficient clinical evidence to support an OS benefit for Isa‑Bor‑Len‑Dex compared with Dar‑Len‑Dex.

At consultation, the company highlighted that adding bortezomib to Len‑Dex improved OS in the ITT and the no intent-to-transplant populations of the NMA. So, it maintained the assumption that Isa‑Bor‑Len‑Dex provided an OS benefit. This was because the MAIC provided a hazard ratio that was less than 1 and because of higher minimal residual disease negativity rates. The company highlighted that, of people who had Isa‑Bor‑Len‑Dex in IMROZ, 58.1% had minimal residual disease negativity and 46.7% had sustained minimal residual disease negativity for at least 12 months. It also claimed that minimal residual disease negativity is an established surrogate for PFS and OS in multiple myeloma and sustained minimal residual disease negativity is a predictor of long-term survival. The EAG highlighted that an OS improvement from adding bortezomib to Len‑Dex did not support an assumption that adding bortezomib to Isa‑Len‑Dex would improve OS in this cohort.

The committee said that it could not be assumed that adding bortezomib to a triplet therapy would have the same effect on OS as adding it to a doublet therapy. It also said that it would expect diminishing returns in survival benefit as the number of treatments in a combination increases. The committee highlighted that the US Food and Drug Administration (FDA) oncology advisory committee accepted minimal residual disease negativity as a suitable endpoint for clinical trials but had not said whether it was a suitable surrogate for OS. The company noted that, during the FDA advisory meeting, data was presented showing that 12‑month minimal residual disease negativity was prognostic of improved PFS, but not OS, for people with a new diagnosis. The clinical experts confirmed that minimal residual disease monitoring is not done routinely in the NHS. So, there is no possibility that knowledge of a minimal residual disease result would be associated with a quality-of-life benefit. The company confirmed that minimal residual disease negativity was not communicated to people in IMROZ. The clinical experts said that complete response was tested in the NHS, so this could bring a psychological benefit. The committee decided that it was plausible that Isa‑Bor‑Len‑Dex improves OS but that the current data may not support this. It concluded that the size of any OS benefit is highly uncertain.

Modelling of time to treatment discontinuation

3.9

To estimate the proportion of people having subsequent treatments, the company used IMROZ data after Isa‑Bor‑Len‑Dex and MAIA data after Dar‑Len‑Dex. It modelled PFS to be substantially longer for people having Isa‑Bor‑Len‑Dex compared with people having Dar‑Len‑Dex. But it also modelled time to treatment discontinuation (TTD) for Isa‑Bor‑Len‑Dex to be shorter than for Dar‑Len‑Dex. This resulted in a large difference between PFS and TTD for Isa‑Bor‑Len‑Dex but no substantial difference between PFS and TTD for Dar‑Len‑Dex. The durations are considered confidential by the company and cannot be reported here.

The EAG noted that both treatments are used until progression and the model assumed that the adverse-event profile of Dar‑Len‑Dex is less favourable than that of Isa‑Bor‑Len‑Dex. So, the EAG would have expected TTD for Isa‑Bor‑Len‑Dex to be longer than for Dar‑Len‑Dex. It explained that this had an impact on subsequent treatment usage and costs in the model. This was because it was assumed that some people stop treatment with Isa‑Bor‑Len‑Dex but do not have any further treatment until progression. At the first meeting, the clinical experts confirmed that some people may stop treatment because of reasons such as toxicity or a desire to avoid repeated hospital visits. They also confirmed that some people may have a deep and lasting response to Isa‑Bor‑Len‑Dex even after treatment has stopped. The EAG noted that IMROZ had a shorter follow-up than MAIA, so there was less time for people who had progressed to start a subsequent treatment. It suggested that the total cost associated with Isa‑Bor‑Len‑Dex could have been underestimated in the company's economic model if:

the difference between PFS and TTD modelled for Isa‑Bor‑Len‑Dex is not reflected in NHS practice
people who stop treatment with Isa‑Bor‑Len‑Dex have subsequent treatments before progression.

At the first meeting, the committee concluded that the TTD and PFS values used by the company were uncertain and may not reflect NHS practice. It also recalled there may be an additional treatment burden of quadruplet over triplet treatment combinations (see section 3.3). So, it would not expect Dar‑Len‑Dex to have a worse adverse-event profile than Isa‑Bor‑Len‑Dex. It requested that the company provide further justification for why:

there is such a large difference between TTD and PFS for Isa‑Bor‑Len‑Dex
there is such a small difference between TTD and PFS for Dar‑Len‑Dex
TTD was shorter for Isa‑Bor‑Len‑Dex than for Dar‑Len‑Dex.

At consultation, the company provided further clarification on why a longer PFS, in relation to TTD, was modelled for Isa‑Bor‑Len‑Dex compared with Dar‑Len‑Dex. The company did a MAIC to compare minimal residual disease negativity between Isa‑Bor‑Len‑Dex and Dar‑Len‑Dex using IMROZ and MAIA data. The results showed that Isa‑Bor‑Len‑Dex statistically significantly increased the odds of having minimal residual disease negativity compared with Dar‑Len‑Dex. The company highlighted that sustained minimal residual disease negativity beyond 12 months is an established surrogate for PFS in multiple myeloma. So, it thought that this supported the PFS benefit that was captured in the model.

The company presented results from a MAIC that compared TTD between Isa‑Bor‑Len‑Dex and Dar‑Len‑Dex. This showed that there was no statistically significant difference in treatment duration between Isa‑Bor‑Len‑Dex and Dar‑Len‑Dex. The company said that the MAIC comparing TTD between the 2 trials adjusted for treatment discontinuation because of adverse events. So, it thought that the tolerability of the 2 treatments was similar. The company thought that people with minimal residual disease negativity with Isa‑Bor‑Len‑Dex may choose to stop treatment before progression. The EAG maintained its concern about the assumption that the prolonged PFS benefit continues after people have stopped treatment with Isa‑Bor‑Len‑Dex. But it thought that the company had provided reasons to justify the assumption. It highlighted that time to next treatment would have been more informative to ensure the PFS benefit was not driven by having other subsequent treatments before progression.

At the second meeting, the company confirmed that time to next treatment and time to progression overlapped throughout IMROZ. The patient experts confirmed that they are informed of their condition's response to treatment in a qualitative way. The clinical experts said that, in a transplant-eligible population, people taking lenalidomide stop maintenance treatment because of toxicity and remain in remission because of a complete response. They also highlighted that trial data was beginning to reflect this and will inform how this combination is used in the transplant-ineligible population. The patient experts confirmed that people with multiple myeloma may stop treatment because of changes in personal circumstances that mean they are unable to attend hospital appointments. The committee accepted the company's base-case assumptions for PFS and TTD but decided that there were still uncertainties about the time between TTD and PFS.

Subsequent treatments

3.10

At consultation, the company provided updated subsequent treatments to include selinexor, bortezomib and dexamethasone (Sel‑Bor‑Dex) and teclistamab. The EAG highlighted that it was unclear why more people who had had Dar‑Len‑Dex had Sel‑Bor‑Dex as second-line treatment compared with people who had had Isa‑Bor‑Len‑Dex. The clinical experts confirmed that there was no justification for this. They thought that subsequent treatment distribution should be the same for people who had Dar‑Len‑Dex or Isa‑Bor‑Len‑Dex at first line. The committee concluded that the proportions of people having subsequent treatments should be equal for people who have had Isa‑Bor‑Len‑Dex or Dar‑Len‑Dex at first line.

Health-state utility values

3.11

The company did not use the post-progression health-state utility value from IMROZ to inform the economic model. The utility value is considered confidential by the company, so cannot be reported here. The company thought that the utility value was overly optimistic because health-related quality-of-life records were clustered after progression events. So, it did not take account of the full post-progression period. The company also explained that the utility value only accounted for second-line treatment. But it should have accounted for multiple relapses up to fourth line in the economic model. The company also explained its view that this utility value was not sufficiently robust because it was derived from relatively few people. For these reasons the company explained that it preferred to use the post-progression utility value of 0.557 from TA587 (that is, Len‑Dex for untreated multiple myeloma). But the EAG noted that the post-progression utility value in IMROZ was based on data from 97 people. It thought this was a large enough sample for the utility value to be robust. It also explained that the PFS utility value from TA587 was low compared with the PFS utility value from IMROZ. So, it follows that the post-progression utility value from TA587 was also potentially too low. The EAG also noted that TA587 was done 6 years ago, and that there were now many more effective treatments available. It explained that the company's economic model was too simplistic to account for utility values falling as people progressed through multiple relapses and subsequent lines of treatment. The EAG also explained that the oversimplification of assuming that everyone would have a post-progression utility value as low as that at the time of TA587 was not clinically plausible. It preferred to use the IMROZ post-progression utility value in the model.

At the first meeting, the clinical experts agreed that the post-progression utility value should capture health-related quality of life over the course of multiple lines of treatment. So, they thought that it would have been preferable to look at more recent trials that were more representative of treatments used in the NHS. The committee noted that the company had included utility values from the study by Hatswell et al. (2019) as a scenario in the model. At the first meeting, the committee agreed that the post-progression utility value from TA587 was low. This was because, at the time of that evaluation, fewer treatment options were available post-progression, which is not reflective of the current treatment pathway. So, the committee concluded that it was not appropriate to use utility values from TA587. It would prefer to use post-progression utility values from IMROZ, or treatment-independent progressed-disease utility values derived by applying a decrement based on Hatswell et al. to the IMROZ PFS utility value.

At consultation, the company applied a utility decrement of 0.030. This was the utility decrement between first- and second-line treatment from Hatswell et al. applied to the IMROZ PFS utility values. The company said that the model included a single post-progression survival health state that aggregated all subsequent lines of treatment. It explained that, if it had applied multiple line-specific decrements, this would have needed different assumptions around the time spent in each line of treatment than those by the comparator. The EAG agreed with the approach the company took. The committee thought that the company's approach was reasonable and that its methodology was appropriate for decision making.
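
As an illustration only, the single-decrement approach described above can be sketched as follows. The PFS utility used here is a hypothetical placeholder because the IMROZ value is confidential. The function name is also an assumption, and this is not the company's model code.

```python
# Decrement between first- and second-line treatment from Hatswell et al.,
# as reported in this section.
HATSWELL_DECREMENT = 0.030


def post_progression_utility(pfs_utility: float,
                             decrement: float = HATSWELL_DECREMENT) -> float:
    """Return one post-progression utility: the PFS utility minus one decrement.

    A single value is used because the model described here has a single
    post-progression health state covering all subsequent lines of treatment.
    """
    return pfs_utility - decrement


# Hypothetical PFS utility for illustration; the real IMROZ value is confidential.
example_pfs_utility = 0.700
print(round(post_progression_utility(example_pfs_utility), 3))
```

Applying separate decrements for each later line of treatment would need an extra input for the time spent in each line, which is the complication the company described.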

Costs

3.12

At consultation, a stakeholder stated that a cost code (SB12Z) for subcutaneous administration was applied in the model that was different to the cost code (N10AF) used in other multiple myeloma technology appraisals. The company said that SB12Z included nurse time (30 minutes) and chair time (up to 60 minutes), as per NHS guidelines. The chair time included time to observe people having the treatment and to administer the subcutaneous treatment. N10AF factors in the nurse time, but not chair time. The committee concluded that the cost code the company applied in the model was acceptable.

Cost-effectiveness estimates

Acceptable ICER

3.13

NICE's manual on health technology evaluations notes that, above a most plausible incremental cost-effectiveness ratio (ICER) of £20,000 per QALY gained, judgements about the acceptability of a technology as an effective use of NHS resources will take into account the degree of certainty around the ICER. The committee will be more cautious about recommending a technology if it is less certain about the ICERs presented. But it will also take into account other aspects including uncaptured health benefits. Because of confidential commercial prices for isatuximab, bortezomib and some comparator treatments, the ICERs are confidential and cannot be reported here. The committee noted the high level of uncertainty, specifically for an OS benefit for Isa‑Bor‑Len‑Dex compared with Dar‑Len‑Dex (see section 3.8). So, the committee concluded that an acceptable ICER would be towards the lower end of the range (£20,000 to £30,000 per QALY gained) NICE considers a cost-effective use of NHS resources.
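
For readers less familiar with the term, an ICER is the difference in costs divided by the difference in QALYs between 2 treatments. The sketch below shows this arithmetic and how a result sits against the £20,000 to £30,000 range. All cost and QALY inputs are hypothetical, because the actual figures are confidential, and the function names are illustrative assumptions.

```python
LOWER_THRESHOLD = 20_000  # pounds per QALY gained
UPPER_THRESHOLD = 30_000  # pounds per QALY gained


def icer(delta_cost: float, delta_qalys: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    if delta_qalys <= 0:
        # A ratio is only meaningful here when the new treatment gains QALYs.
        raise ValueError("delta_qalys must be positive for this simple sketch")
    return delta_cost / delta_qalys


def position_in_range(value: float) -> str:
    """Describe where an ICER sits relative to the £20,000 to £30,000 range."""
    if value < LOWER_THRESHOLD:
        return "below range"
    if value <= UPPER_THRESHOLD:
        return "within range"
    return "above range"


# Hypothetical inputs: £50,000 extra cost for 2.0 extra QALYs.
example = icer(delta_cost=50_000, delta_qalys=2.0)
print(example, position_in_range(example))
```

This only shows the arithmetic. The committee's judgement also reflects the uncertainty around the inputs and any uncaptured benefits, as described above.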

Committee's preferred assumptions

3.14

The committee noted its preferred assumptions, which were:

informing the economic model by the non-time-varying MAIC (see section 3.4)
a starting age of 75 years in the model (see section 3.6)
modelling of OS and PFS for Isa‑Bor‑Len‑Dex and comparators by applying the hazard ratio generated from the non-time-varying MAIC to the Dar‑Len‑Dex reference curve from MAIA (see section 3.7)
making the proportions of people having subsequent treatments equal for people who had Isa‑Bor‑Len‑Dex and Dar‑Len‑Dex (see section 3.10)
using post-progression utility values in the economic model derived by applying the utility decrement from Hatswell et al. (2019) to the IMROZ PFS utility values (see section 3.11).
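
The hazard-ratio step in the list above is commonly implemented under a proportional hazards assumption, in which the survival curve for the new treatment is the reference curve raised to the power of the hazard ratio. The sketch below illustrates that general technique. The exponential reference curve, its rate and the hazard ratio are all hypothetical, and the source does not confirm that the company's model was implemented this way.

```python
import math


def reference_survival(t_years: float, rate: float = 0.10) -> float:
    """Hypothetical exponential reference curve (a stand-in for a fitted curve)."""
    return math.exp(-rate * t_years)


def apply_hazard_ratio(s_ref: float, hazard_ratio: float) -> float:
    """Under proportional hazards, S_new(t) = S_ref(t) ** HR."""
    return s_ref ** hazard_ratio


# Hypothetical constant (non-time-varying) hazard ratio below 1,
# meaning a lower hazard than the reference treatment.
example_hr = 0.80
for year in (1, 5, 10):
    s_ref = reference_survival(year)
    s_new = apply_hazard_ratio(s_ref, example_hr)
    print(year, round(s_ref, 3), round(s_new, 3))
```

A non-time-varying hazard ratio means the same value is applied at every time point, so the relative difference in hazard between the 2 curves stays constant over the model's time horizon.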

Company and EAG cost-effectiveness estimates

3.15

The committee considered the cost effectiveness of Isa‑Bor‑Len‑Dex compared with Dar‑Len‑Dex and the other relevant comparators. Applying the committee's preferred assumptions, the ICER was towards the lower end of the range that NICE considers a cost-effective use of NHS resources (£20,000 to £30,000 per QALY gained). The exact ICERs cannot be reported here because some prices are confidential.

Conclusion

Recommendation

3.16

Using the committee's preferred assumptions, the cost-effectiveness estimates were within what NICE considers a cost-effective use of NHS resources. So, Isa‑Bor‑Len‑Dex can be used routinely across the NHS for untreated multiple myeloma in adults when a stem cell transplant is unsuitable.