The search for consistency (and its desirability)
One of the emerging themes of this blog is the extent to which we can standardise aspects of producing HTA evidence, and a paper I have just read by Lung et al is an illustration:
A meta-analysis of health state valuations for people with diabetes: explaining the variation across methods and implications for economic evaluation.
Lung TW, Hayes AJ, Hayen A, Farmer A, Clarke PM.
Qual Life Res. 2011 Apr 7. [Epub ahead of print]
This team carried out a literature search (so thorough it has me worrying for their psychological stability) to identify studies in people with diabetes that used one of the QALY-compatible preference measures such as EQ-5D or an SF measure, or which used time trade-off or standard gamble.
They report huge ranges in the values obtained, from a humble 14 points for diabetes with no complications (i.e. the lowest was 0.74, the highest 0.88) through to 48 points (stroke and end-stage renal disease). It’s obvious that the value for stroke, for example, would depend on severity, and that ESRD might depend on whether the person required dialysis and, if they did, whether it was hospital- or home-based. However, if an HTA organisation accepts published values from the literature, say because a study used its preferred utility elicitation technique, it has handed a substantial element of choice to the people writing the HTA submission. For ESRD in diabetes, will we use a utility value of 0.33 citing source study A, or 0.81 citing source study B?
Using stats techniques I am too dull to understand, they then carried out two analyses that I will pick out: a random effects meta-analysis (MA) and a random effects meta-regression (MR).
The MA gives a point estimate of the mean utility value across the studies but, just as importantly, it provides a 95% confidence interval (and sample size) ideal for use in sensitivity analysis. For example, the diabetes-with-no-complications state had values from individual studies ranging from 0.74 to 0.88, but in the MA the mean value was 0.81 (95% confidence interval 0.78 to 0.84). Fantastic: as a reviewer of HTA submissions, that is so helpful.
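For readers who, like me, find the stats opaque: the mechanics of random effects pooling can be sketched in a few lines. This is a minimal illustration using the DerSimonian-Laird estimator, one common random effects method; I do not know which estimator Lung et al actually used, and the study values and standard errors below are made up for illustration only.

```python
import math

def dl_random_effects(means, ses):
    """DerSimonian-Laird random effects pooling.

    means: per-study utility estimates
    ses:   per-study standard errors
    Returns (pooled mean, 95% CI lower, 95% CI upper, tau^2).
    """
    k = len(means)
    v = [se ** 2 for se in ses]          # within-study variances
    w = [1.0 / vi for vi in v]           # fixed-effect weights
    sw = sum(w)
    fe_mean = sum(wi * m for wi, m in zip(w, means)) / sw
    # Cochran's Q measures heterogeneity beyond sampling error
    q = sum(wi * (m - fe_mean) ** 2 for wi, m in zip(w, means))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)   # between-study variance
    # Random effects weights down-weight by tau^2 as well
    w_star = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * m for wi, m in zip(w_star, means)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, tau2

# Made-up utilities for a "diabetes, no complications" state
means = [0.74, 0.80, 0.83, 0.88]
ses = [0.03, 0.02, 0.04, 0.03]
mean, lo, hi, tau2 = dl_random_effects(means, ses)
print(round(mean, 2), round(lo, 2), round(hi, 2))
```

The point for HTA review is the last line: one defensible mean plus a confidence interval for sensitivity analysis, instead of a free choice among scattered study values.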
In the MR they analyse how much of the variation between the estimates in different studies can be explained by measured features such as the age of the patients, their sex, and the elicitation technique. Older age and being female led to lower utilities (<<insert joke of your choice>>), and among the elicitation techniques TTO and SG (combined) gave higher values than EQ-5D, which, in turn, gave higher values than HUI-3 and SF-6D (combined).
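Meta-regression is, at heart, just a weighted regression of the study-level estimates on study-level covariates. A minimal single-covariate sketch, with entirely made-up study data (and again, not a claim about how Lung et al implemented theirs):

```python
def meta_regression(y, x, ses, tau2=0.0):
    """Weighted least squares meta-regression: y_i = b0 + b1 * x_i.

    y:    per-study utility estimates
    x:    per-study covariate (e.g. mean age of the sample)
    ses:  per-study standard errors
    tau2: between-study variance (0 gives fixed-effect weights)
    Returns (intercept b0, slope b1).
    """
    w = [1.0 / (se ** 2 + tau2) for se in ses]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    # Closed-form WLS solution for a single covariate
    b1 = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    b0 = (swy - b1 * swx) / sw
    return b0, b1

# Made-up studies in which utility falls as mean age rises
utilities = [0.86, 0.82, 0.79, 0.74]
mean_ages = [45.0, 55.0, 65.0, 75.0]
ses = [0.02, 0.03, 0.02, 0.03]
b0, b1 = meta_regression(utilities, mean_ages, ses)
print(b1 < 0)   # prints True: older samples predict lower utility
```

A negative slope on age is exactly the kind of finding reported above; in practice you would fit several covariates at once (age, sex, elicitation technique) rather than one at a time.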
Tom Lung and team, take a bow, my grateful thanks. The only other study like this I am aware of is in stroke:
Tengs TO, Lin TH.
Are other people aware of similar studies? And how should we use them?
Clearly, the ideal is still that companies measure quality of life in their trials in a way that is compatible with QALYs. However, in this ‘second best’ situation, I think there is a strong case for making values from these meta-analyses the default settings for an HTA submission. Of course I would be interested in listening to a company’s arguments for why this should not apply to their particular submission – for example, suppose it could be shown that for a particular treatment the ESRD experienced secondary to diabetes were always of a milder type than a utility value of 0.48 (from Lung et al’s meta-analysis) would imply.
But what we all need to get away from is a situation where an HTA submission can select between two hugely different utility values and cite a supporting reference for either with equal authority. This should work for companies too, as it gives them greater certainty when they are estimating the likely cost per QALY at an early stage of a product’s life, and it may save them some money on commissioning their own utility surveys.