Another Study Warns That Evidence From Observational Studies Provides Unreliable Results For Therapies

We have previously mentioned the enormous contributions made by John Ioannidis MD in the area of understanding the reliability of medical evidence. [Ioannidis, Delfini Blog, Giannakakis] We want to draw your attention to a recent publication dealing with the risks of relying on observational data for cause and effect conclusions. [Hemkens] In this recent study, Hemkens, Ioannidis and other colleagues assessed differences in mortality effect size reported in observational (routinely collected data [RCD]) studies as compared with results reported in RCTs.

Eligible RCD studies used propensity scores in an effort to address confounding bias. The analysis included only RCD studies conducted before any RCT was published on the same topic. The authors assessed the risk of bias for RCD studies and RCTs using The Cochrane Collaboration risk of bias tools. The direction of treatment effects, confidence intervals and effect sizes (odds ratios) were compared between RCD studies and RCTs, and relative odds ratios were calculated across all pairs of RCD studies and trials.

The authors found that RCD studies systematically and substantially overestimated mortality benefits of medical treatments compared with subsequent trials investigating the same question. Overall, mortality estimates from RCD studies were significantly more favorable, by a relative 31%, than those from subsequent trials (summary relative odds ratio 1.31; 95% confidence interval 1.03 to 1.65; I² = 0%).
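To make the idea of a relative odds ratio concrete, here is a minimal sketch for a single pair of studies. The odds ratios and confidence intervals are hypothetical, not taken from Hemkens et al., and the direction of the ratio (trial odds ratio divided by RCD odds ratio, so that values above 1 mean the RCD study looked more favorable) is an assumption made for this illustration. The pooling of ratios across all pairs into a summary estimate is not shown.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Return ln(OR) and its standard error, recovered from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(odds_ratio), se


def relative_odds_ratio(rcd, rct):
    """ROR = OR_RCT / OR_RCD with a 95% CI; each input is (OR, low, high)."""
    log_rcd, se_rcd = log_or_and_se(*rcd)
    log_rct, se_rct = log_or_and_se(*rct)
    log_ror = log_rct - log_rcd
    # Variances add on the log scale when the two studies are independent.
    se = math.sqrt(se_rcd ** 2 + se_rct ** 2)
    bounds = (log_ror, log_ror - Z_95 * se, log_ror + Z_95 * se)
    return tuple(math.exp(value) for value in bounds)


# Hypothetical pair: an RCD study and a later randomized trial on mortality.
rcd_study = (0.60, 0.45, 0.80)
later_trial = (0.85, 0.70, 1.03)

ror, low, high = relative_odds_ratio(rcd_study, later_trial)
print(f"ROR {ror:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Working on the log scale keeps the interval symmetric around ln(ROR), which is why the standard errors are recovered from the logged confidence limits.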

These authors remind us yet again that if no randomized trials exist, clinicians and other decision-makers should not trust results from observational data from sources such as local or national databases, registries, cohort or case-control studies.

References
Delfini Blog: http://delfini.org/blog/?p=292

Giannakakis IA, Haidich AB, Contopoulos-Ioannidis DG, Papanikolaou GN, Baltogianni MS, Ioannidis JP. Citation of randomized evidence in support of guidelines of therapeutic and preventive interventions. J Clin Epidemiol. 2002 Jun;55(6):545-55. PubMed PMID: 12063096.

Hemkens LG, Contopoulos-Ioannidis DG, Ioannidis JP. Agreement of treatment effects for mortality from routinely collected data and subsequent randomized trials: meta-epidemiological survey. BMJ. 2016 Feb 8;352:i493. doi: 10.1136/bmj.i493. PubMed PMID: 26858277.

Ioannidis JPA. Why Most Published Research Findings are False. PLoS Med 2005; 2(8):696-701 PMID: 16060722

 

Comparative Effectiveness Research (CER), “Big Data” & Causality

For a number of years now, we’ve been concerned that the CER movement and the growing love affair with “big data” will lead to many erroneous conclusions about cause and effect. We were pleased to see the following blog posts from Austin Frakt, an editor-in-chief of The Incidental Economist (“Contemplating health care with a focus on research, an eye on reform”):

Ten impressions of big data: Claims, aspirations, hardly any causal inference

http://theincidentaleconomist.com/wordpress/ten-impressions-of-big-data-claims-aspirations-hardly-any-causal-inference/

+

Five more big data quotes: The ambitions and challenges

http://theincidentaleconomist.com/wordpress/five-more-big-data-quotes/

Cochrane Risk Of Bias Tool For Non-Randomized Studies

Like many others, our position is that, with very few exceptions, cause and effect conclusions regarding therapeutic interventions can only be drawn when valid RCT data exists. However, there are uses for observational studies which may be used to answer additional questions, and non-randomized studies (NRS) are often included in systematic reviews.

In September 2014, Cochrane published a tool for assessing bias in NRS for systematic review authors [1]. It may be of interest to our colleagues. The tool is called ACROBAT-NRSI (“A Cochrane Risk Of Bias Assessment Tool for Non-Randomized Studies”) and is designed to assist with evaluating the risk of bias (RoB) in the results of NRS that compare the health effects of two or more interventions.

The tool focuses on internal validity. It covers seven domains through which bias might be introduced into an NRS. The domains provide a framework for considering any type of NRS and are summarized in the table below. Many of the biases are described in the full document, along with explanations of how they may arise. Our rough summary is available here: http://www.delfini.org/delfiniClick_Observations.htm#robtable

Response options for each bias include: low risk of bias; moderate risk of bias; serious risk of bias; critical risk of bias; and no information on which to base a judgment.

Details are available in the full document, which can be downloaded at https://sites.google.com/site/riskofbiastool/

Delfini Comment
We again point out that non-randomized studies often report seriously misleading results, even when treated and control groups appear similar in prognostic variables. We agree with Deeks that, for therapeutic interventions, “non-randomised studies should only be undertaken when RCTs are infeasible or unethical” [2], and even then, buyer beware. Studies do not get “validity grace” because of scientific or practical challenges.

Furthermore, we are uncertain that this tool is of great value when assessing NRS. Deeks [2] identified 194 tools that could be or had been used to assess NRS. Do we really need another one? While it’s a good document for background reading, we are more comfortable approaching the problem of observational data by pointing out that, when it comes to efficacy, high quality RCTs have a positive predictive value of about 85% whereas well-done observational studies have a positive predictive value of about 20% [3].
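As a rough illustration of where figures like these can come from, the sketch below evaluates the positive predictive value formula with a bias term from the Ioannidis paper [3]. The specific inputs are assumed here for illustration: 80% power with pre-study odds of 1:1 and bias 0.10 for a well-done trial, and 80% power with pre-study odds of 1:10 and bias 0.30 for a well-done observational study. The function and parameter names are hypothetical.

```python
def positive_predictive_value(power, pre_study_odds, bias, alpha=0.05):
    """PPV of a claimed finding, allowing for bias.

    power          -- probability of detecting a true relationship (1 - beta)
    pre_study_odds -- R, ratio of true to no relationships among those tested
    bias           -- u, share of would-be negative analyses reported as positive
    alpha          -- type I error rate
    """
    beta = 1.0 - power
    r, u = pre_study_odds, bias
    numerator = (1 - beta) * r + u * beta * r
    denominator = r + alpha - beta * r + u - u * alpha + u * beta * r
    return numerator / denominator


# Assumed scenario: adequately powered RCT with little bias, 1:1 pre-study odds.
ppv_rct = positive_predictive_value(power=0.80, pre_study_odds=1.0, bias=0.10)

# Assumed scenario: adequately powered observational study, 1:10 pre-study odds.
ppv_obs = positive_predictive_value(power=0.80, pre_study_odds=0.1, bias=0.30)

print(f"RCT: {ppv_rct:.2f}")            # RCT: 0.85
print(f"Observational: {ppv_obs:.2f}")  # Observational: 0.20
```

Under these assumptions the gap between the two designs comes from two inputs: the lower pre-study odds and the larger bias assigned to the observational study.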

References

1. Sterne JAC, Higgins JPT, Reeves BC on behalf of the development group for ACROBAT-NRSI. A Cochrane Risk Of Bias Assessment Tool: for Non-Randomized Studies of Interventions (ACROBAT-NRSI), Version 1.0.0, 24 September 2014. Available from http://www.riskofbias.info [accessed 10/11/14].

2. Deeks JJ, Dinnes J, D’Amico R, Sowden AJ, Sakarovitch C, Song F, Petticrew M, Altman DG; International Stroke Trial Collaborative Group; European Carotid Surgery Trial Collaborative Group. Evaluating non-randomised intervention studies. Health Technol Assess. 2003;7(27):iii-x, 1-173. Review. PubMed PMID: 14499048.

3. Ioannidis JPA. Why Most Published Research Findings are False. PLoS Med 2005; 2(8):696-701 PMID: 16060722.

Webinar: “Using Real-World Data & Published Evidence in Pharmacy Quality Improvement Activities”

On Monday, May 20, 2013, we presented a webinar on “Using Real-World Data & Published Evidence in Pharmacy Quality Improvement Activities” for the member organizations of the Alliance of Community Health Plans (ACHP).

The 80-minute discussion addressed four topic areas, all of which have unique critical appraisal challenges. The webinar's goal was to discuss issues that arise when conducting quality improvement efforts using real-world data (such as data from claims, surveys and observational studies) and other published healthcare evidence.

Key pitfalls were cherry-picked for these four mini-seminars:

  • Pitfalls to avoid when using real-world data, dealing with heterogeneity, confounding-by-indication and causality.
  • Key issues in evaluating oncology studies — outcome issues and focus on how to address large attrition rates.
  • Important issues when conducting comparative safety reviews — assessing patterns through use of RCTs, systematic reviews, observational studies and registries.
  • Key issues in evaluating studies employing Kaplan-Meier estimates — time-to-event basics with attention to the important problem of censoring.

A recording of the webinar is available at:

https://achp.webex.com/achp/lsr.php?AT=pb&SP=TC&rID=45261732&rKey=1475c8c3abed8061&act=pb

Interesting Comparative Effectiveness Research (CER) Case Study: “Real World Data” Hypothetical Migraine Case and Lack of PCORI Endorsement

In the October 2012 issue of Health Affairs, the journal’s editorial team created a fictional set of clinical trials and observational studies to see what various stakeholders would say about comparative effectiveness evidence of two migraine drugs.[1]

The hypothetical set-up is this:

The newest drug, Hemikrane, is an FDA-approved drug that has recently come on the market. It was reported in clinical trials to reduce both the frequency and the severity of migraine headaches. Hemikrane is taken once a week. The FDA approved Hemikrane based on two randomized, double-blind, controlled clinical trials, each of which had three arms.

  • In one arm, patients who experienced multiple migraine episodes each month took Hemikrane weekly.
  • In another arm, a comparable group of patients received a different migraine drug, Cephalal, a drug which was reported to be effective in earlier, valid studies. It is taken daily.
  • In a third arm, another equivalent group of patients received placebos.

The studies were powered to detect a difference between Hemikrane and placebo, if one existed, and to show whether Hemikrane was at least as effective as Cephalal. Each of the two randomized studies enrolled approximately 2,000 patients and lasted six months. They excluded patients with uncontrolled high blood pressure, diabetes, heart disease, or kidney dysfunction. The patients received their care in a number of academic centers and clinical trial sites. All patients submitted daily diaries, recording their migraine symptoms and any side effects.

Hypothetical Case Study Findings: The trials reported that the patients who took Hemikrane had a clinically significant reduction in the frequency, severity, and duration of headaches compared to placebo, but not to Cephalal.

The trials were not designed to evaluate the comparative safety of the drugs, but there were no safety signals from the Hemikrane patients, although a small number of patients on the drug experienced nausea.

Although the above studies reported efficacy of Hemikrane in a controlled environment with highly selected patients, they did not assess patient experience in a real-world setting. Does once weekly dosing improve adherence in the real world? The monthly cost of Hemikrane to insurers is $200, whereas Cephalal costs insurers $150 per month. (In this hypothetical example, the authors assume that copayments paid by patients are the same for all of these drugs.)

A major philanthropic organization with an interest in advancing treatments for migraine sufferers funded a collaboration among researchers at Harvard; a regional health insurance company, Trident Health; and Hemikrane’s manufacturer, Aesculapion. The insurance company, Trident Health, provided access to a database of five million people, which included information on medication use, doctor visits, emergency department evaluations and hospitalizations. Using these records, the study identified a cohort of patients with migraine who made frequent visits to doctors or hospital emergency departments. The study compared information about patients receiving Hemikrane with two comparison groups: a group of patients who received the daily prophylactic regimen with Cephalal, and a group of patients receiving no prophylactic therapy.

The investigators attempted to confirm the original randomized trial results by assessing the frequency with which all patients in the study had migraine headaches. Because the database did not contain a diary of daily symptoms, which had been collected in the trials, the researchers substituted as a proxy the amount of medications such as codeine and sumatriptan (Imitrex) that patients had used each month for treatment of acute migraines. The group receiving Hemikrane had lower use of these symptom-oriented medications than those on Cephalal or on no prophylaxis and had fewer emergency department visits than those taking Cephalal or on no prophylaxis.

Although the medication costs were higher for patients taking Hemikrane because of its higher monthly drug cost, the overall episode-of-care costs were lower than for the comparison group taking Cephalal. As hypothesized, the medication adherence was higher in the once-weekly Hemikrane patients than in the daily Cephalal patients (80 percent and 50 percent, respectively, using the metric of medication possession ratio, which is the number of days of medication dispensed as a percentage of 365 days).
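The medication possession ratio described above can be sketched in a few lines. The fill histories below are hypothetical and were chosen only so that the results land near the 80 percent and 50 percent figures in the case; the function name is illustrative.

```python
def medication_possession_ratio(days_supplied_per_fill, period_days=365):
    """Days of medication dispensed as a percentage of the observation period."""
    return 100.0 * sum(days_supplied_per_fill) / period_days


# Hypothetical dispensing records: days of supply for each fill over one year.
hemikrane_fills = [28] * 10 + [12]   # 292 days dispensed
cephalal_fills = [30] * 6            # 180 days dispensed

print(f"Hemikrane MPR: {medication_possession_ratio(hemikrane_fills):.0f}%")
print(f"Cephalal MPR:  {medication_possession_ratio(cephalal_fills):.0f}%")
```

Note that this metric counts medication dispensed, not medication taken, so it is only a proxy for adherence.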

The investigators were concerned that the above findings might be due to the unique characteristics of Trident Health’s population of covered patients, regional practice patterns, copayment designs for medications, and/or the study’s analytic approach. They also worried that the results could be confounded by differences in the patients receiving Hemikrane, Cephalal, or no prophylaxis. One possibility, for example, was that patients who experienced the worst migraines might be more inclined to take or be encouraged by their doctors to take the new drug, Hemikrane, since they had failed all previously available therapies. In that case, the results for a truly matched group of patients might have shown even more pronounced benefit for Hemikrane.

To see if the findings could be replicated, the investigators contacted the pharmacy benefit management company, BestScripts, that worked with Trident Health, and asked for access to additional data. A research protocol was developed before any data were examined. Statistical adjustments were also made to balance, as well as possible, the three groups of patients to be studied (those taking Hemikrane, those taking Cephalal, and those not on prophylaxis) using a propensity score method, which included age, sex, number of previous migraine emergency department visits, type and extent of prior medication use and selected comorbidities to estimate the probability of a person’s being in one of the three groups.
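For readers unfamiliar with propensity scores, the following is a deliberately simplified sketch of one such method, inverse probability weighting. It assumes only two treatment groups and a single measured covariate (migraine severity), whereas the hypothetical study used three groups and several covariates. All data are invented, and the outcome is constructed so that the drug has no true effect. Weighting of this kind can balance only the covariates that were measured and modeled.

```python
from collections import defaultdict

# Hypothetical records: (severe_migraine, took_new_drug, ed_visits_per_year).
# The outcome depends only on severity, so the true drug effect is zero.
records = (
    [(1, 1, 4)] * 80 + [(0, 1, 1)] * 20    # new-drug patients: mostly severe
    + [(1, 0, 4)] * 20 + [(0, 0, 1)] * 80  # comparison patients: mostly mild
)


def mean_outcome(rows, treated, weights=None):
    """Weighted mean outcome for one treatment group (weights default to 1)."""
    numerator = denominator = 0.0
    for i, (_, t, outcome) in enumerate(rows):
        if t == treated:
            w = 1.0 if weights is None else weights[i]
            numerator += w * outcome
            denominator += w
    return numerator / denominator


# Step 1: propensity score = P(treated | covariate), estimated per stratum.
counts = defaultdict(lambda: [0, 0])  # severity -> [n_total, n_treated]
for severe, t, _ in records:
    counts[severe][0] += 1
    counts[severe][1] += t
propensity = {s: n_treated / n for s, (n, n_treated) in counts.items()}

# Step 2: weights are 1/e for treated patients and 1/(1 - e) for untreated.
weights = [
    1.0 / propensity[s] if t == 1 else 1.0 / (1.0 - propensity[s])
    for s, t, _ in records
]

naive_diff = mean_outcome(records, 1) - mean_outcome(records, 0)
weighted_diff = (
    mean_outcome(records, 1, weights) - mean_outcome(records, 0, weights)
)
print(f"naive difference in ED visits:    {naive_diff:.2f}")
print(f"weighted difference in ED visits: {abs(weighted_diff):.2f}")
```

In this constructed example the unadjusted comparison makes the new drug look harmful purely because sicker patients received it, and weighting removes that artifact. Had severity not been recorded in the database, the same weighting would have left the distortion in place.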

The pharmacy benefit manager, BestScripts, had access to data covering more than fifty million lives. The findings in this second, much larger, database corroborated the earlier assessment. The once-weekly prophylactic therapy with Hemikrane clearly reduced the use of medications such as codeine to relieve symptoms, as well as emergency department visits compared to the daily prophylaxis and no prophylaxis groups. Similarly, the Hemikrane group had significantly better medication adherence than the Cephalal group. In addition, BestScripts had data from a subset of employers that collected work loss information about their employees. These data showed that patients on Hemikrane were out of work for fewer days each month than patients taking Cephalal.

In a commentary, Joe Selby, executive director of the Patient-Centered Outcomes Research Institute (PCORI), and colleagues provided a list of problems with these real world studies including threats to validity. They conclude that these hypothetical studies would be unlikely to have been funded or communicated by PCORI.[2]

Below are several of the problems identified by Selby et al.

  • Selection Bias
    • Patients and clinicians may have tried the more familiar, less costly Cephalal first and switched to Hemikrane only if Cephalal failed to relieve symptoms, making the Hemikrane patients a group who, on average, would be more difficult to treat.
    • Those patients who continued using Cephalal may be a selected group who tolerate the treatment well and perceive a benefit.
    • Even if the investigators had conducted the study with only new users, it is plausible that patients prescribed Hemikrane could differ from those prescribed Cephalal. They may be of higher socioeconomic status, have better insurance coverage with lower copayments, have different physicians, or differ in other ways that could affect outcomes.
  • Performance Biases or Other Differences Between Groups are possible.
  • Details of any between-group differences found in these exploratory analyses should have been presented.

Delfini Comment

These two articles are worth reading if you are interested in the difficult area of evaluating observational studies and including them in comparative effectiveness research (CER). We would add that to know if drugs really work, valid RCTs are almost always needed. In this case we don’t know if the studies were valid, because we don’t have enough information about the risk of selection, performance, attrition and assessment bias and other potential methodological problems in the studies. Database studies and other observational studies are likely to have differences in populations, interventions, comparisons, time treated and clinical settings (e.g., prognostic variables of subjects, dosing, co-interventions, other patient choices, bias from lack of blinding) and adjusting for all of these variables and more requires many assumptions. Propensity scores do not reliably adjust for differences. Thus, the risk of bias in the evidence base is unclear.

This case illustrates the difficulty of making coverage decisions for new drugs with some potential advantages for some patients when several studies report benefit compared to placebo, but we already have established treatment agents with safety records. In addition, new drugs are frequently found to cause adverse events over time.

Observational data is frequently very valuable. It can be useful in identifying populations for further study, evaluating the implementation of interventions, generating hypotheses, and identifying current condition scenarios (e.g., who, what, where in QI project work; variation, etc.). It is also useful in providing safety signals and for creating economic projections (e.g., balance sheets, models). In this hypothetical set of studies, however, we have only gray zone evidence about efficacy from both RCTs and observational studies and almost no information about safety.

Much of the October issue of Health Affairs is taken up with other readers’ comments. Those of you interested in the problems with real world data in CER activities will enjoy reading how others reacted to these hypothetical drug studies.

References

1. Dentzer S; the Editorial Team of Health Affairs. Communicating About Comparative Effectiveness Research: A Health Affairs Symposium On The Issues. Health Aff (Millwood). 2012 Oct;31(10):2183-2187. PubMed PMID: 23048094.

2. Selby JV, Fleurence R, Lauer M, Schneeweiss S. Reviewing Hypothetical Migraine Studies Using Funding Criteria From The Patient-Centered Outcomes Research Institute. Health Aff (Millwood). 2012 Oct;31(10):2193-2199. PubMed PMID: 23048096.
