3 Committee discussion

The evaluation committee considered evidence submitted by AstraZeneca, a review of this submission by the external assessment group (EAG), and responses from stakeholders. See the committee papers for full details of the evidence.

The condition

Details of condition

3.1

Hepatocellular carcinoma (HCC) is the most common form of liver cancer in England, accounting for 65% of primary liver cancer diagnoses in men and 34% of diagnoses in women in 2021. It is commonly associated with liver cirrhosis (scarring of the liver), which can be caused by viral infections such as hepatitis B or C, excessive alcohol intake, or other diseases that result in chronic inflammation of the liver. NHS Cancer Registration Statistics show there was a total of 3,021 new diagnoses of HCC in England in 2021. Symptoms of HCC include abdominal pain and swelling, loss of appetite, fatigue and jaundice. In advanced HCC, people may also experience confusion or disorientation due to hepatic encephalopathy. The patient organisation said these symptoms are distressing, debilitating and have a significant impact on quality of life. They can make it difficult for people to eat, breathe and function normally. The prognosis for HCC is poor, with only 38% of people still alive 1 year after their diagnosis. The patient expert explained that their HCC diagnosis had been devastating. The committee concluded that advanced or unresectable HCC has a severe effect on both quality and length of life.

Treatment pathway

Current treatment

3.2

The goal of treatment for advanced or unresectable HCC is to delay progression of the condition and prolong life. The treatment used depends on the location and stage of the cancer, and how well the liver functions. In the NHS, standard care for advanced or unresectable hepatocellular carcinoma is systemic therapy with atezolizumab plus bevacizumab, or a tyrosine kinase inhibitor (lenvatinib or sorafenib). The NHS England Cancer Drugs Fund lead confirmed that in the last year around 69% of people with HCC had first-line treatment with atezolizumab plus bevacizumab. Around 16% of people with HCC had lenvatinib and 15% had sorafenib. Atezolizumab plus bevacizumab is typically preferred over lenvatinib or sorafenib because of superior efficacy. But atezolizumab plus bevacizumab is not suitable for people with contraindications such as variceal bleeding, hypertension, renal dysfunction or tumour bleeding. Lenvatinib or sorafenib are typically used when atezolizumab plus bevacizumab is not suitable. The clinical expert explained that some clinicians may prefer lenvatinib or sorafenib because of familiarity with tyrosine kinase inhibitors. Lenvatinib is typically preferred over sorafenib because of superior clinical benefit. So sorafenib is typically only used when atezolizumab plus bevacizumab or lenvatinib are not suitable. People with HCC often feel frustrated by the limited treatment options, particularly because existing options may not be suitable for them or have unmanageable side effects. The committee concluded that atezolizumab plus bevacizumab is the most used treatment in this population. But lenvatinib and sorafenib are still used by a minority of people for whom atezolizumab plus bevacizumab is not suitable.

Unmet need

3.3

The clinical experts explained that they would use durvalumab with tremelimumab when atezolizumab plus bevacizumab is not suitable. Atezolizumab plus bevacizumab would continue to be their first choice because of familiarity with the treatment and a lack of evidence to justify preferring durvalumab with tremelimumab. They noted that, because the 2 regimens have different side effect profiles, having durvalumab with tremelimumab available as a treatment option would enable more people to benefit from immunotherapy than currently do. Patient submissions said people with advanced or unresectable HCC are fearful of the future because of the poor prognosis. They would value having extra time to spend with their families and put their affairs in order. So, people with HCC would welcome any new treatments, especially those with the potential to extend life. The committee concluded that people with untreated advanced or unresectable hepatocellular carcinoma would welcome an additional treatment option, particularly when atezolizumab plus bevacizumab is not suitable.

Clinical evidence

Data sources

3.4

The clinical trial data for durvalumab with tremelimumab comes from HIMALAYA, a randomised, open-label phase 3 trial. HIMALAYA compared a single dose of tremelimumab with durvalumab taken every 4 weeks (from here on, STRIDE) with sorafenib in adults with advanced or unresectable HCC (BCLC stage B or C), ineligible for locoregional therapy and who had had no previous systemic treatment. The EAG raised concerns about the generalisability of the HIMALAYA trial to the NHS population. It noted that no UK sites were included in the trial. It also highlighted that the median age and proportion of HCC with non-viral aetiologies were slightly lower than those seen in a UK audit of NHS patients. However, clinical experts stated that the HIMALAYA baseline characteristics were consistent with the population having systemic therapy in the NHS in the UK. They did not feel that the differences between the trial and NHS populations, such as the proportion of HCC with non-viral aetiologies, would result in clinically meaningful differences in clinical outcomes. Therefore, the committee concluded that HIMALAYA was suitable for decision making.

Clinical effectiveness

3.5

In HIMALAYA, STRIDE demonstrated a statistically significant improvement in median overall survival (OS) compared with sorafenib (16.43 months compared with 13.77 months; hazard ratio 0.76, 95% confidence interval 0.65 to 0.89). At 4 years, survival probability was 25.2% for STRIDE compared with 15.1% for sorafenib. At the primary data cut-off, there was a numerical but not statistically significant difference in investigator-assessed progression-free survival (PFS) between STRIDE and sorafenib (12.5% compared with 4.9%). The committee raised concerns about the clinical plausibility of observing a statistically significant benefit in OS for STRIDE but not in PFS. The company said that having non-significant changes in PFS but significant improvement in OS is not unusual for immunotherapy combination treatments. The clinical experts agreed that this was clinically plausible given the mechanism of action of durvalumab with tremelimumab. They also said this pattern has been seen in other tumour types. The clinical experts noted that it can be difficult to detect progression of HCC through imaging, which can limit the robustness of PFS data. Because this treatment is more palliative than curative, they said the most important outcome for this population is OS rather than PFS. The committee noted that because of the open-label nature of HIMALAYA there is a potential for bias in the assessment of the treatment effect, particularly as only investigator-assessed PFS was collected in the trial. It also questioned whether people in the STRIDE arm may have improved survival because of fewer deaths from other causes. The clinical experts said it was not possible to determine whether someone in this population dies from their cancer or a non-cancer cause (such as liver failure), but that there is no reason to think this would differ between treatment arms. The committee concluded that STRIDE is an effective treatment.
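
The link between the reported confidence interval and statistical significance can be checked with a short calculation. The sketch below is illustrative only. It assumes the interval is a symmetric Wald interval on the log hazard-ratio scale, which is not stated in this document, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def log_hr_summary(hr, ci_low, ci_high, level=0.95):
    """Back-calculate the standard error, z statistic and two-sided p value
    from a hazard ratio and its confidence interval.

    Assumes a symmetric Wald interval on the log scale (an assumption,
    not something stated in the committee papers).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    p = 2 * NormalDist().cdf(-abs(z))
    return se, z, p


# OS hazard ratio for STRIDE compared with sorafenib, as reported above
se, z, p = log_hr_summary(0.76, 0.65, 0.89)
significant = p < 0.05  # same reading as the 95% interval excluding 1
print(f"SE(log HR)={se:.3f}, z={z:.2f}, p={p:.4f}, significant={significant}")
```

Under this assumption, an interval that lies wholly below 1 corresponds to a two-sided p value below 0.05, which matches the description of the OS result as statistically significant.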

Network meta-analysis

3.6

Because of the lack of head-to-head evidence comparing STRIDE with atezolizumab plus bevacizumab, and lenvatinib, the company did a network meta-analysis (NMA). The analysis included data from 3 trials (HIMALAYA, IMbrave150 and REFLECT) to compare the efficacy of atezolizumab plus bevacizumab, lenvatinib and STRIDE with sorafenib. The results showed no significant benefit in PFS or OS for any of the 3 treatments compared with sorafenib. There were numerical differences in the hazard ratios, which favoured all 3 treatments over sorafenib, but the results were not statistically significant. The exact results are considered confidential by the company and cannot be reported here. The EAG had key concerns about the methodology used in the company NMA. It also noted that the hazard ratio from the company NMA for the comparison of atezolizumab plus bevacizumab with sorafenib for PFS was an outlier compared with the value in several other published NMAs. This added to its concerns about the robustness of the company NMA. So, the EAG preferred to use an NMA previously published by Vogel et al. (2023). The company said the Vogel NMA did not include the latest HIMALAYA data, so does not fully capture long-term efficacy for STRIDE. The EAG explained that the 'outlier' PFS result could be because the company NMA used investigator-assessed PFS rather than blinded independent central review PFS (see section 3.5). The company said that because only investigator-assessed PFS was collected in HIMALAYA, the NMA used investigator-assessed PFS across all studies to ensure alignment with that. It noted that interim data from HIMALAYA using blinded independent central review was available and was comparable to investigator-assessed PFS. The committee considered the 2 different NMA approaches. 
It noted that in general, the results from the company NMA were similar to the EAG-preferred NMA, with the exception of the PFS hazard ratio for atezolizumab plus bevacizumab compared with sorafenib. Both NMAs found no significant difference in OS or PFS for any of the treatments compared with sorafenib. The committee discussed the size of the networks and noted that both NMAs included only 1 comparison study for each treatment. While the EAG NMA had a larger network than the company NMA, the committee agreed that adding more studies does not improve the robustness of the results unless those studies include treatments being considered in this technology appraisal. The clinical experts noted that comparing data from different trials with different populations and follow-up periods is difficult. For example, HIMALAYA had a longer median follow up than the trials for the comparator treatments. The EAG noted that the NMA results are a key driver of cost effectiveness in the model. The committee concluded that blinded independent review PFS is preferred as an outcome measure over investigator-assessed PFS, because it is a more objective measure with less risk of bias. So, it concluded that the EAG NMA (from Vogel et al.) was preferred over the company NMA. However, the committee noted that the NMA was a key area of uncertainty.

Economic model

Modelling approaches for OS and PFS

3.7

The company used a 3-state partitioned survival model (progression-free, progressed disease and death). For STRIDE and sorafenib, the company modelled OS and PFS using individual patient-level data from HIMALAYA. It found evidence that the proportional hazards assumption was violated for STRIDE compared with sorafenib (PFS and OS), so used independently fitted parametric curves. It also found evidence of proportional hazards violation for lenvatinib compared with sorafenib. But, applying a constant hazard ratio yields conservative cost-effectiveness estimates when compared with STRIDE, so this approach was used as a conservative option. For atezolizumab plus bevacizumab, and lenvatinib, the company used hazard ratios from its NMA. The EAG had concerns about the company NMA (see section 3.6) and so also had concerns about using these hazard ratios in the model. It highlighted the inconsistency of approach between treatments (that is, using HIMALAYA data for STRIDE and sorafenib, and hazard ratios from the NMA for lenvatinib and atezolizumab plus bevacizumab). The EAG preferred to use independently fitted parametric curves using individual patient-level data for sorafenib (PFS and OS), and hazard ratios from the Vogel NMA for all other treatments. The company said it was not appropriate to apply a constant hazard ratio when the proportional hazards assumption had been violated (that is, for STRIDE). The committee noted the differences between the company and the EAG approaches. When considering the OS curves for atezolizumab plus bevacizumab and STRIDE, it felt it was clinically implausible for the STRIDE OS curve to cross the atezolizumab plus bevacizumab OS curve several years after treatment had started. The clinical experts discussed the challenges of extrapolating long-term survival based on limited data. 
But they felt that because of the mechanism of action of tremelimumab, it is plausible to have a durable treatment effect that may result in the crossing of OS curves. The committee concluded that there were limitations to both the company and the EAG modelling of OS and PFS. It was not satisfied that the crossing of OS curves for STRIDE and atezolizumab plus bevacizumab was plausible, and would like to see further analyses on this. This includes using equal hazard rate functions from the timepoint at which the atezolizumab plus bevacizumab and STRIDE OS curves cross. The committee also noted that for the sorafenib OS curves, the company preferred to apply a 1-knot hazard curve, whereas the EAG preferred to use generalised gamma. This difference was not considered to be a key model driver but overall the committee preferred the EAG approach for modelling OS with sorafenib.
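
The structure of a partitioned survival model, and what applying a constant hazard ratio means, can be sketched in a few lines. This is a minimal illustration, not the company or EAG model. The exponential reference curves stand in for the fitted parametric curves, and every rate, hazard ratio and function name is a hypothetical placeholder.

```python
import math


def exponential_survival(rate_per_month, months):
    """Survival probability at a time point under a constant hazard."""
    return math.exp(-rate_per_month * months)


def apply_hazard_ratio(reference_survival, hazard_ratio):
    """Proportional hazards: S_treatment(t) = S_reference(t) ** HR."""
    return reference_survival ** hazard_ratio


def state_occupancy(os_prob, pfs_prob):
    """Split a cohort across the 3 health states at one time point."""
    pfs_prob = min(pfs_prob, os_prob)  # PFS cannot lie above OS
    return {
        "progression_free": pfs_prob,
        "progressed_disease": os_prob - pfs_prob,
        "death": 1.0 - os_prob,
    }


# Hypothetical inputs for illustration only (not trial or model values)
REFERENCE_OS_RATE = 0.05   # reference arm, per month
REFERENCE_PFS_RATE = 0.18  # reference arm, per month
HR_OS = 0.80               # comparator versus reference
HR_PFS = 0.90              # comparator versus reference

for month in (0, 6, 12, 24):
    ref_os = exponential_survival(REFERENCE_OS_RATE, month)
    ref_pfs = exponential_survival(REFERENCE_PFS_RATE, month)
    comparator = state_occupancy(
        apply_hazard_ratio(ref_os, HR_OS),
        apply_hazard_ratio(ref_pfs, HR_PFS),
    )
    print(month, {state: round(share, 3) for state, share in comparator.items()})
```

The `apply_hazard_ratio` step is the proportional hazards assumption in code form: the comparator curve is tied to the reference curve at every time point. Independently fitted curves drop that tie, which is why they can cross.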

Modelling time to treatment discontinuation

3.8

Time to treatment discontinuation (TTD) data is available for STRIDE and sorafenib from HIMALAYA. But equivalent data is not available for atezolizumab plus bevacizumab, and lenvatinib. To address this, the company assumed that PFS was equivalent to TTD for these 2 treatments. The EAG raised concerns around this approach. It noted that people with HCC often continue to have treatment after progression, so TTD is not equal to PFS in clinical practice. It also had concerns about the lack of consistency in the approach between the treatments (using TTD trial data for 2 treatments, and assumption of parity with PFS for the other 2 treatments). It said this could lead to bias and weaken the robustness of the model. The clinical expert agreed that it is expected that people in this population would have treatment after progression. The committee agreed that the assumption that PFS equals TTD is flawed and does not reflect clinical practice. It said this approach risks underestimating the costs for atezolizumab plus bevacizumab, and lenvatinib. This is particularly the case when using the company NMA results, where PFS for atezolizumab plus bevacizumab is considerably longer than in the EAG base case. The committee discussed that assuming equality between PFS and TTD for atezolizumab plus bevacizumab would likely underestimate the incremental cost-effectiveness ratios (ICERs) for STRIDE, particularly when using the company NMA results. Because TTD data is not available for atezolizumab plus bevacizumab, or lenvatinib, it concluded a consistent approach was preferred. So, the committee concluded that PFS should be assumed equivalent to TTD for all treatments.
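
The effect of using PFS as a proxy for TTD on modelled treatment costs can be illustrated with a simple calculation. The sketch below assumes exponential curves, no discounting and a flat monthly drug cost. All rates, costs and function names are hypothetical and are not taken from the company or EAG model.

```python
import math


def mean_months_on_curve(rate_per_month, horizon_months):
    """Area under an exponential curve up to the horizon (closed form)."""
    return (1.0 - math.exp(-rate_per_month * horizon_months)) / rate_per_month


def drug_cost(rate_per_month, monthly_cost, horizon_months):
    """Undiscounted drug cost: monthly cost multiplied by mean months on treatment."""
    return monthly_cost * mean_months_on_curve(rate_per_month, horizon_months)


# Hypothetical inputs for illustration only
PFS_RATE = 0.15        # progression or death, per month
TTD_RATE = 0.10        # discontinuation, per month (treatment continues after progression)
MONTHLY_COST = 5000.0  # placeholder drug cost per month
HORIZON = 40 * 12      # 40-year time horizon in months

cost_using_pfs = drug_cost(PFS_RATE, MONTHLY_COST, HORIZON)
cost_using_ttd = drug_cost(TTD_RATE, MONTHLY_COST, HORIZON)
underestimate = cost_using_ttd - cost_using_pfs
print(f"PFS proxy: {cost_using_pfs:.0f}, TTD: {cost_using_ttd:.0f}, gap: {underestimate:.0f}")
```

When people stay on treatment after progression, the TTD curve lies above the PFS curve, so a PFS proxy gives a shorter mean time on treatment and a lower modelled drug cost.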

Retreatment with tremelimumab

3.9

In HIMALAYA, 31 people in the STRIDE arm who had evidence of disease progression had retreatment with 1 additional dose of tremelimumab. The company confirmed that there was no adjustment to efficacy data for this (any benefits from the additional dose are included in the efficacy data). It also confirmed that the costs of this additional dose were not included in the economic model. The committee was concerned that including the clinical benefit for retreatment with tremelimumab without including the additional costs would bias the cost-effectiveness results for STRIDE. The company said that this applied to only a small number of people (about 8% of the STRIDE arm). The committee concluded that it would like the cost-effectiveness results adjusted to include the additional costs of retreatment with tremelimumab.

Other assumptions

Time horizon

3.10

The company base case used a 40-year time horizon. The EAG preferred to use a 20-year time horizon because this was consistent with previous technology appraisals in HCC. NICE's manual on health technology evaluations states that the time horizon should be 'long enough to reflect all important differences in costs or outcomes between the technologies being compared'. The committee noted that some company scenarios show a significant proportion of people still alive at 20 years. Therefore, the committee concluded that a 40-year time horizon was preferable to ensure all incremental costs and benefits were captured in the model.

Utility values

3.11

The company used EQ-5D-5L collected from HIMALAYA to inform the utility values for people having STRIDE and sorafenib. The company assumed that people having lenvatinib have the same utility value as those having sorafenib. This is because the treatments are both tyrosine kinase inhibitors and have comparable side effect profiles. Similarly, it assumed that people having atezolizumab plus bevacizumab have the same utility value as STRIDE, because both regimens include immune checkpoint inhibitor drugs. The company did not use different utility values for different health states. The EAG preferred to use the same utility values across all 4 treatments, but for utility values to decline as people approach death. So, the company approach was treatment-dependent, whereas the EAG approach was time-dependent. The committee said there were limitations with both approaches because of limited data collection after progression or treatment discontinuation. But, on balance, it preferred the time-dependent utility values approach used by the EAG, to reflect declining utility as disease progresses.

Severity

3.12

The committee considered the severity of the condition (the future health lost by people living with the condition and having standard care in the NHS). The committee may apply a greater weight to quality-adjusted life years (QALYs; a severity modifier) if technologies are indicated for conditions with a high degree of severity. The company provided absolute and proportional QALY shortfall estimates in line with NICE's health technology evaluations manual. The company and EAG agreed that no modifier is appropriate when doing a pairwise comparison with atezolizumab plus bevacizumab. The company said that a severity modifier of 1.2 should apply for any comparison against sorafenib or lenvatinib, based on the QALY shortfall. The EAG said that, because of the availability of atezolizumab plus bevacizumab as an established treatment option, no modifier should apply for any of the comparisons (including fully incremental analyses or pairwise comparisons). The committee considered both approaches, but it did not reach a conclusion on whether a severity modifier should be applied for fully incremental analyses or pairwise comparisons between STRIDE and lenvatinib or sorafenib.
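
The QALY shortfall calculation behind a severity modifier can be sketched as follows. The cut-offs used are those understood to be in NICE's health technology evaluations manual. They are not stated in this document and should be checked against the manual. The QALY inputs and function names are hypothetical and do not reflect the company's confidential estimates.

```python
def qaly_shortfall(qalys_without_condition, qalys_with_standard_care):
    """Return (absolute, proportional) QALY shortfall."""
    absolute = qalys_without_condition - qalys_with_standard_care
    proportional = absolute / qalys_without_condition
    return absolute, proportional


def severity_weight(absolute, proportional):
    """QALY weight by shortfall band.

    Cut-offs are an assumption based on NICE's manual, not on this document.
    """
    if absolute >= 18 or proportional >= 0.95:
        return 1.7
    if absolute >= 12 or proportional >= 0.85:
        return 1.2
    return 1.0


# Hypothetical inputs for illustration only
absolute, proportional = qaly_shortfall(
    qalys_without_condition=10.0,
    qalys_with_standard_care=1.2,
)
weight = severity_weight(absolute, proportional)
print(f"absolute={absolute:.1f}, proportional={proportional:.2f}, weight={weight}")
```

Because the shortfall is measured against the QALYs expected with standard care, the choice of comparator (sorafenib, lenvatinib, or atezolizumab plus bevacizumab) can change which band applies.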

Cost-effectiveness estimates

Acceptable ICER

3.13

NICE's manual on health technology evaluations notes that, above a most plausible ICER of £20,000 per QALY gained, judgements about the acceptability of a technology as an effective use of NHS resources will take into account the degree of certainty around the ICER. The committee will be more cautious about recommending a technology if it is less certain about the ICERs presented. But it will also take into account other aspects including uncaptured health benefits. The committee noted the high level of uncertainty, specifically:

OS and PFS were not significantly longer for STRIDE, atezolizumab plus bevacizumab or lenvatinib compared with sorafenib in either the company or EAG NMA
the long-term modelling of OS results in which the STRIDE survival curve crossed the atezolizumab plus bevacizumab survival curve was not sufficiently justified
the use of PFS data as a proxy for TTD data, which is used to model treatment costs
the benefits of retreatment with tremelimumab were included in the model without including the additional costs.

For interventions that are less costly and less effective than a comparator, an intervention is considered cost effective if the ICER is above the level considered acceptable, rather than below it. NICE's health technology evaluations manual states that the committee does not use a precise maximum acceptable ICER above which a technology would automatically be defined as not cost effective or below which it would. The committee noted that because of the high amount of uncertainty in the modelling approaches taken and the comparative clinical effectiveness, an acceptable ICER would need to be above £30,000 per QALY gained in the south-west quadrant of the cost-effectiveness plane.
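
The reversal of the decision rule in the south-west quadrant can be made explicit with a small function. This is a sketch of the general rule only. The function name, labels and example values are hypothetical, and it ignores the uncertainty that the committee takes into account.

```python
def icer_decision(delta_cost, delta_qaly, threshold):
    """Classify an intervention against a comparator on the cost-effectiveness plane.

    delta_cost and delta_qaly are intervention minus comparator.
    """
    if delta_qaly >= 0 and delta_cost <= 0:
        return "dominant (no more costly, no less effective)"
    if delta_qaly <= 0 and delta_cost >= 0:
        return "dominated (no less costly, no more effective)"

    icer = delta_cost / delta_qaly  # both differences now share the same sign
    if delta_qaly > 0:
        # North-east quadrant: more costly, more effective
        return "cost effective" if icer <= threshold else "not cost effective"
    # South-west quadrant: less costly, less effective, so the rule reverses
    return "cost effective" if icer >= threshold else "not cost effective"


# Hypothetical south-west quadrant examples at a £30,000 per QALY threshold
print(icer_decision(-40_000, -1.0, 30_000))  # saves £40,000 per QALY lost
print(icer_decision(-20_000, -1.0, 30_000))  # saves £20,000 per QALY lost
```

In the south-west quadrant the ICER is the saving per QALY lost, so a higher value means more money released for each QALY forgone.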

Company and EAG cost-effectiveness estimates

3.14

The company and EAG provided fully incremental analyses and pairwise analyses to compare STRIDE with the comparators. For decision-making purposes, confidential discounts were applied for the intervention and comparator treatments to best reflect the price relevant to the NHS. The price for one of the comparators differed between NHS regions because it is negotiated by the Medicines Procurement and Supply Chain (MPSC). So, the committee considered analyses based on the lowest, midpoint and the highest available prices for that treatment in its decision making. In the company base case, STRIDE was less costly and more effective than atezolizumab plus bevacizumab. In the EAG base case, STRIDE was less effective than atezolizumab plus bevacizumab. Whether STRIDE was more or less costly than atezolizumab plus bevacizumab depended on whether the low, mid or high MPSC price was used. The committee acknowledged the different impact of MPSC prices and noted the assumptions that had the biggest impact on the ICERs were the source of the NMA data (Vogel or company NMA) and the approach for modelling long-term OS and PFS.

Committee's preferred assumptions

3.15

The committee concluded that its preferred assumptions for the cost-effectiveness modelling were:

using hazard ratios derived from the Vogel NMA (see section 3.6)
assuming TTD is equivalent to PFS for all treatments (see section 3.8)
using generalised gamma for the sorafenib OS extrapolation (see section 3.7)
using a time-dependent approach for utility values (see section 3.11)
using a 40-year time horizon (see section 3.10).

The committee agreed that, according to clinical expert opinion and data from NHS England, atezolizumab plus bevacizumab is used most commonly in this population. The committee acknowledged that lenvatinib and sorafenib are taken by some people. So, the committee considered that a pairwise comparison with atezolizumab plus bevacizumab was an appropriate approach. But it also considered the fully incremental results.

Additional analyses

3.16

The committee requested further analyses from the company to address its outstanding concerns:

updated OS modelling that has equal hazard rate functions from the time point at which the atezolizumab plus bevacizumab and STRIDE curves cross
updated cost-effectiveness estimates that include the additional costs for retreatment with tremelimumab.

Assessment of cost effectiveness

3.17

Because some of the ICERs considered by the committee were in the south-west quadrant of the cost-effectiveness plane (less benefit at lower cost), the committee considered net health benefit (NHB). It agreed that the most plausible incremental NHBs for STRIDE based on the currently available analyses were negative (at threshold values of £30,000 per QALY gained). But the committee stated it was unclear how the additional analyses would affect the cost-effectiveness estimates. It concluded that it could not recommend durvalumab plus tremelimumab for routine use in the NHS because it was not presented with sufficient evidence to conclude that it was a cost-effective use of NHS resources.
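
Incremental net health benefit expresses the result in QALYs at a given threshold. The sketch below uses the standard definition (incremental QALYs minus incremental costs divided by the threshold). The input values and function name are hypothetical and are not the confidential results considered by the committee.

```python
def incremental_nhb(delta_cost, delta_qaly, threshold):
    """Incremental net health benefit in QALYs.

    delta_cost and delta_qaly are intervention minus comparator.
    A positive value suggests cost effectiveness at the threshold.
    """
    return delta_qaly - delta_cost / threshold


THRESHOLD = 30_000  # £ per QALY gained

# Hypothetical case: 1 QALY lost for a £20,000 saving
nhb_small_saving = incremental_nhb(-20_000, -1.0, THRESHOLD)

# Hypothetical case: 1 QALY lost for a £40,000 saving
nhb_large_saving = incremental_nhb(-40_000, -1.0, THRESHOLD)

print(f"NHB with £20,000 saving: {nhb_small_saving:.3f} QALYs")
print(f"NHB with £40,000 saving: {nhb_large_saving:.3f} QALYs")
```

Unlike an ICER, the sign of the incremental NHB has the same interpretation whichever quadrant the result falls in.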

Other factors

Equality

3.18

The committee acknowledged that HCC disproportionately affects people from poor socioeconomic backgrounds, but agreed this was not something that could be addressed in its recommendations.
