Another Study Warns That Evidence From Observational Studies Provides Unreliable Results For Therapies


We have previously mentioned the enormous contributions made by John Ioannidis MD in the area of understanding the reliability of medical evidence. [Ioannidis, Delfini Blog, Giannakakis] We want to draw your attention to a recent publication dealing with the risks of relying on observational data for cause and effect conclusions. [Hemkens] In this recent study, Hemkens, Ioannidis and other colleagues assessed differences in mortality effect size reported in observational (routinely collected data [RCD]) studies as compared with results reported in RCTs.

Eligible RCD studies used propensity scores in an effort to address confounding bias. The analysis included only RCD studies conducted before any RCT was published on the same topic. The authors assessed the risk of bias for RCD studies and randomized controlled trials (RCTs) using The Cochrane Collaboration risk of bias tools. They compared the direction of treatment effects, confidence intervals and effect sizes (odds ratios) between RCD studies and RCTs, and calculated relative odds ratios across all pairs of RCD studies and trials.

The authors found that RCD studies systematically and substantially overestimated mortality benefits of medical treatments compared with subsequent trials investigating the same question. Overall, RCD studies reported significantly more favorable mortality estimates by a relative 31% than subsequent trials (summary relative odds ratio 1.31; 95% confidence interval 1.03 to 1.65; I² = 0%).
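To make the idea of a relative odds ratio concrete, here is a minimal sketch for a single RCD/trial pair. It is not the authors' code. It assumes the two estimates are independent, that mortality odds ratios below 1 favor treatment, and that the ratio is taken as trial OR divided by RCD OR, so a value above 1 means the RCD study looked more favorable. The function names and the numbers are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of log(OR) from a 95% CI given on the OR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def relative_odds_ratio(or_trial, ci_trial, or_rcd, ci_rcd, z=1.96):
    """Ratio of two independent odds ratios with a Wald interval on the log scale."""
    log_ror = math.log(or_trial) - math.log(or_rcd)
    # Variances of independent log odds ratios add.
    se = math.sqrt(se_from_ci(*ci_trial) ** 2 + se_from_ci(*ci_rcd) ** 2)
    return (math.exp(log_ror),
            math.exp(log_ror - z * se),
            math.exp(log_ror + z * se))

# Hypothetical pair of estimates (not taken from the paper).
ror, ror_low, ror_high = relative_odds_ratio(
    or_trial=0.90, ci_trial=(0.75, 1.08),
    or_rcd=0.70, ci_rcd=(0.55, 0.89),
)
print(f"ROR = {ror:.2f} (95% CI {ror_low:.2f} to {ror_high:.2f})")
```

A summary estimate across many pairs, such as the one the authors report, would additionally require a meta-analytic pooling step that this sketch does not attempt.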

These authors remind us yet again that if no randomized trials exist, clinicians and other decision-makers should not trust results from observational data from sources such as local or national databases, registries, cohort or case-control studies.

References
Delfini Blog: http://delfini.org/blog/?p=292

Giannakakis IA, Haidich AB, Contopoulos-Ioannidis DG, Papanikolaou GN, Baltogianni MS, Ioannidis JP. Citation of randomized evidence in support of guidelines of therapeutic and preventive interventions. J Clin Epidemiol. 2002 Jun;55(6):545-55. PubMed PMID: 12063096.

Hemkens LG, Contopoulos-Ioannidis DG, Ioannidis JP. Agreement of treatment effects for mortality from routinely collected data and subsequent randomized trials: meta-epidemiological survey. BMJ. 2016 Feb 8;352:i493. doi: 10.1136/bmj.i493. PubMed PMID: 26858277.

Ioannidis JPA. Why Most Published Research Findings are False. PLoS Med 2005; 2(8):696-701 PMID: 16060722

 


“Reading” a Clinical Trial Won’t Get You There

“Reading” a Clinical Trial Won’t Get You There—or Let’s Review (And Apply) Some Basics About Assessing The Validity of Medical Research Studies Claiming Superiority for Efficacy of Therapies

An obvious question raised by the title is, “Get you where?” Well, the answer is, “To where you know it is reasonable to think you can trust the results of the study you have just finished reading.” In this blog, our focus is on how to critically appraise medical research studies which claim superiority for efficacy of a therapy.

Because of Lack of Understanding Medical Science Basics, People May Be Injured or Die

Understanding basic requirements for valid medical science is very important. Numbers below are estimates, but are likely to be close or understated—

  1. Over 63,000 people with heart disease died after taking encainide or flecainide because many doctors thought taking these drugs “made biological sense,” but did not understand the simple need for reliable clinical trial information to confirm what seemed to “make sense” [Echt 91].
  2. An estimated 60,000 people in the United States died and another 140,000 experienced a heart attack resulting from the use of a nonsteroidal anti-inflammatory drug despite important benefit and safety information reported in the abstract of the pivotal trial used for FDA approval [Graham].
  3. In another example, roughly 42,000 women with advanced breast cancer suffered excruciating side effects without any proof of benefit, many of them dying as a result, and at a cost of $3.4 billion [Mello].
  4. At least 64 deaths out of 751 cases in nearly half the United States were linked to fungal meningitis thought to be caused by a contaminated treatment that is used for back and radicular pain—but there is no reliable scientific evidence of benefit from that treatment [CDC].

In the above instances, these were preventable deaths and harms—from common treatments—which patients might have avoided if their physicians had better understood the importance and methods of evaluating medical science.

Failures to Understand Medical Science Basics

Many health care professionals don’t know how to quickly assess a trial for reliability and clinical usefulness, and yet mastering the basics is not difficult. Over the years, we have given a pre-test of 3 simple questions to more than a thousand physicians, pharmacists and others who have attended our training programs. Approximately 70% fail, with “failure” defined as missing 2 or 3 of the questions.

One pre-test question is designed to see if people recognize the lack of a comparison group in a report of the “effectiveness” of a new treatment. Without a comparison group of people with similar prognostic characteristics who are treated exactly the same except for the intervention under study, you cannot discern cause and effect of an intervention because a difference between groups may explain or affect the results.

A second pre-test question deals with presenting results as relative risk reduction (RRR) without absolute risk reduction (ARR) or event rates in the study groups. A “relative” measure raises the question, “Relative to what?” Is the reported RRR in our test question 60 percent of 100 percent? Or 60 percent of 1 percent?
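A small worked example shows why the relative figure alone is uninformative. The sketch below applies the same 60 percent relative risk reduction to two hypothetical control-group event rates (100 percent and 1 percent) and reports the absolute risk reduction and number needed to treat. The function name and inputs are ours, chosen only for illustration.

```python
def risk_summary(control_rate, relative_risk_reduction):
    """Return treated event rate, absolute risk reduction (ARR) and
    number needed to treat (NNT) for a given control rate and RRR."""
    treated_rate = control_rate * (1 - relative_risk_reduction)
    arr = control_rate - treated_rate
    nnt = 1 / arr
    return treated_rate, arr, nnt

# The same 60% RRR applied to two very different baseline risks.
for control_rate in (1.00, 0.01):
    treated_rate, arr, nnt = risk_summary(control_rate, 0.60)
    print(f"control {control_rate:.1%} -> treated {treated_rate:.1%}: "
          f"ARR = {arr:.1%}, NNT = {nnt:.1f}")
```

With a 100 percent baseline the absolute reduction is 60 percentage points; with a 1 percent baseline it is 0.6 percentage points, even though the RRR is identical.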

The last of our pre-test questions assesses attendees’ basic understanding of only one of the two requirements to qualify as an Intention-to-Treat (ITT) analysis. The two requirements are that people should be analyzed as randomized and that all people should be included in the analysis whether they have discontinued, are missing or have crossed over to other treatment arms. The failure rate at knowing this last requirement is very high. (We will add that this last requirement means that a value has to be assigned if one is missing, and so one of the most important aspects of critically appraising an ITT analysis is the evaluation of the methods for “imputing” missing data.)

By the end of our training programs, success rates have always markedly improved. Others have reported similar findings.

There is a Lot of Science + Much of It May Not Be Reliable
Each week more than 13,000 references are added to the world’s largest library—the National Library of Medicine (NLM). Unfortunately, many of these studies are seriously flawed. One large review of 60,352 studies reported that only 7 percent passed criteria of high quality methods and clinical relevancy [McKibbon]. We and others have estimated that up to (and maybe more than) 90% of the published medical information that health care professionals rely on is flawed [Freedman, Glasziou].

Bias Distorts Results
We cannot know if an intervention is likely to be effective and safe without critically appraising the evidence for validity and clinical usefulness. We need to evaluate the reliability of medical science prior to seriously considering the reported therapeutic results because biases such as lack of or inadequate randomization, lack of successful blinding or other threats to validity (which we will describe below) can distort reported results by up to 50 percent or more [see Risk of Bias References].

Patients Deserve Better
Patients cannot make informed choices regarding various interventions without being provided with quantified projections of benefits and harms from valid science.

Some Simple Steps To Critical Appraisal
Below is a short summary of our simplified approach to critically appraising a randomized superiority clinical trial. Our focus is on “internal validity” which means “closeness to truth” in the context of the study. “External validity” is about the likelihood of reaching truth outside of the study context and requires judgment about issues such as fit with individuals or populations in circumstances other than those in the trial.

You can review and download a wealth of freely available information at our website at www.delfini.org including checklists and tools at http://www.delfini.org/delfiniTools.htm which can provide you with much greater information. Most relevant to this blog is our short critical appraisal checklist which you can download here—http://www.delfini.org/Delfini_Tool_StudyValidity_Short.pdf

The Big Questions
In brief, your overarching questions are these:

  1. Is reading this study worth my time? If the results are true, would they change my practice? Do they apply to my situation? What is the likely impact to my patients?
  2. Can anything explain the results other than cause and effect? Evaluate the potential for results being distorted by bias (anything other than chance leading away from the truth) or random chance effects.
  3. Is there any difference between groups other than what is being studied? This is automatically a bias.
  4. If the study appears to be valid, but attrition is high, sometimes it is worth asking, what conditions would need to be present for attrition to distort the results? Attrition does not always distort results, but may obscure a true difference due to the reduction in sample size.

Evaluating Bias

There are four stages of a clinical trial, and you should ask several key questions when evaluating bias in each of the 4 stages.

  1. Subject Selection & Treatment Assignment—Evaluation of Selection Bias

Important considerations include how were subjects selected for study, were there enough subjects, how were they assigned to their study groups, and were the groups balanced in terms of prognostic variables?

Your critical appraisal to-do list includes—

a) Checking to see if the randomization sequence was generated in an acceptable manner. (Minimization may be an acceptable alternative.)

b) Determining whether the investigators adequately concealed the allocation of subjects to each study group. That is, was the method for assigning treatment hidden so that an investigator could not manipulate the assignment of a subject to a selected study group?

c) Examining the table of baseline characteristics to determine whether randomization was likely to have been successful, i.e., that the groups are balanced in terms of important prognostic variables (e.g., clinical and demographic variables).

  2. The Intervention & Context—Evaluation of Performance Bias

What is being studied, and what is it being compared to? Was the intervention likely to have been executed successfully? Was blinding likely to have been successful? Was duration reasonable for treatment as well as for follow-up? Was adherence reasonable? What else happened to study subjects in the course of the study such as use of co-interventions? Were there any differences in how subjects in the groups were treated?

Your to-do list includes evaluating:

a) Adequacy of blinding of subjects and all working with subjects and their data—including likely success of blinding;

b) Subjects’ adherence to treatment;

c) Inter-group differences in treatment or care except for the intervention(s) being studied.

  3. Data Collection & Loss of Data—Evaluation of Attrition Bias

What information was collected, and how was it collected? What data are missing and is it likely that missing data could meaningfully distort the study results?

Your to-do list includes evaluating—

a) Measurement methods (e.g., mechanisms, tools, instruments, means of administration, personnel issues, etc.)

b) Classification and quantification of missing data in each group (e.g., discontinuations due to ADEs, unrelated deaths, protocol violations, loss to follow-up, etc.)

c) Whether missing data are likely to distort the reported results. This is the area in which the evidence on the distorting effects of bias provides the least help. And so, again, it is often worthwhile asking, “What conditions would need to be present for attrition to distort the results?”
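One way to explore that last question is a simple sensitivity analysis: recompute the absolute risk reduction after imputing missing outcomes under extreme but plausible-sounding assumptions. The sketch below is only an illustration with hypothetical counts and scenario names of our own choosing. It treats the outcome as an undesirable event and uses the number randomized as the denominator.

```python
def arr_after_imputation(events_t, missing_t, n_t,
                         events_c, missing_c, n_c, scenario):
    """ARR (control event rate minus treated event rate) after imputing
    missing outcomes. n_t and n_c are the numbers randomized."""
    if scenario == "favors_treatment":
        # Missing controls had the event; missing treated subjects did not.
        events_c += missing_c
    elif scenario == "favors_control":
        # Missing treated subjects had the event; missing controls did not.
        events_t += missing_t
    elif scenario == "all_missing_had_event":
        events_t += missing_t
        events_c += missing_c
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    return events_c / n_c - events_t / n_t

SCENARIOS = ("favors_treatment", "favors_control", "all_missing_had_event")

# Hypothetical trial: 200 randomized per arm; 20 events and 30 missing in the
# treated arm, 40 events and 10 missing in the control arm.
results = {s: arr_after_imputation(20, 30, 200, 40, 10, 200, s) for s in SCENARIOS}
for name, arr in results.items():
    print(f"{name:>22}: ARR = {arr:+.3f}")
```

In this made-up example the apparent benefit ranges from a 15 point reduction to a 5 point increase in events depending on the assumption, which is the kind of instability that should lower confidence in the reported result.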

  4. Results & Assessing The Differences In The Outcomes Of The Study Groups—Evaluating Assessment Bias

Were outcome measures reasonable, pre-specified and analyzed appropriately? Was reporting selective? How was safety assessed? Remember that models are not truth.

Your to-do list includes evaluating—

a) Whether assessors were blinded.

b) How the effect size was calculated (e.g., absolute risk reduction, relative risk, etc.). You especially want to know benefit or risk with and without treatment.

c) Were confidence intervals included? (You can calculate these yourself online, if you wish. See our web links at our website for suggestions.)

d) For dichotomous variables, was a proper intention-to-treat (ITT) analysis conducted with a reasonable choice for imputing values for missing data?

e) For time-to-event trials, were censoring rules unbiased? Was the number of censored subjects reported?
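For readers who want to check a confidence interval themselves rather than use an online calculator, the following is a minimal sketch of a 95% interval for an absolute risk reduction. It assumes the simple Wald (normal approximation) method, which is only one of several available methods and can behave poorly with small samples or rare events. The counts are hypothetical.

```python
import math

def arr_with_ci(events_c, n_c, events_t, n_t, z=1.96):
    """Absolute risk reduction (control minus treated) with a Wald interval."""
    p_c = events_c / n_c
    p_t = events_t / n_t
    arr = p_c - p_t
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    return arr, arr - z * se, arr + z * se

# Hypothetical trial: 40/200 events in the control group, 20/200 with treatment.
arr, ci_low, ci_high = arr_with_ci(40, 200, 20, 200)
print(f"ARR = {arr:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```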

After you have evaluated a study for bias and chance and have determined that the study is valid, the study results should be evaluated for clinical meaningfulness (e.g., the amount of clinical benefit and the potential for harm). Clinical outcomes include morbidity; mortality; symptom relief; physical, mental and emotional functioning; and quality of life, as well as any surrogate outcomes that have been demonstrated in valid studies to affect a clinical outcome.

Final Comment

It is not difficult to learn how to critically appraise a clinical trial. Health care providers owe it to their patients to gain these skills. Health care professionals cannot rely on abstracts and authors’ conclusions; they must assess studies first for validity and second for clinical usefulness. Authors are often biased, even with the best of intentions. Remember that authors’ conclusions are opinions, not evidence. Authors frequently use misleading terms or draw misleading conclusions. Physicians and others who lack critical appraisal skills are often misled by authors’ conclusions and summary statements. Critical appraisal knowledge is required to evaluate the validity of a study, which must be done prior to seriously considering reported results.

For those who wish to go more deeply, we have books available and do training seminars. See our website at www.delfini.org.

Risk of Bias References

  1. Juni P, Altman DG, Egger M. Systematic reviews in health care: assessing the quality of controlled clinical trials. BMJ. 2001;323:42-6. PubMed PMID: 11440947.
  2. Juni P, Witschi A, Bloch R, Egger M. The hazards of scoring the quality of clinical trials for meta-analysis. JAMA. 1999 Sep 15;282(11):1054-60. PubMed PMID: 10493204.
  3. Kjaergard LL, Villumsen J, Gluud C. Reported methodological quality and discrepancies between large and small randomized trials in metaanalyses. Ann Intern Med. 2001;135:982-89. PubMed PMID: 11730399.
  4. Moher D, Pham B, Jones A, Cook DJ, Jadad AR, Moher M, Tugwell P, Klassen TP. Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet. 1998 Aug 22;352(9128):609-13. PubMed PMID: 9746022.
  5. Poolman RW, Struijs PA, Krips R, Sierevelt IN, et al. Reporting of outcomes in orthopaedic randomized trials: Does blinding of outcome assessors matter? J Bone Joint Surg Am. 2007;89:550-558. PubMed PMID: 17332104.
  6. Savovic J, Jones HE, Altman DG, et al. Influence of Reported Study Design Characteristics on Intervention Effect Estimates From Randomized, Controlled Trials. Ann Intern Med. 2012 Sep 4. doi: 10.7326/0003-4819-157-6-201209180-00537. [Epub ahead of print] PubMed PMID: 22945832.
  7. van Tulder MW, Suttorp M, Morton S, et al. Empirical evidence of an association between internal validity and effect size in randomized controlled trials of low-back pain. Spine (Phila Pa 1976). 2009 Jul 15;34(16):1685-92. PubMed PMID: 19770609.

Other References

  1. CDC: http://www.cdc.gov/HAI/outbreaks/meningitis.html
  2. Echt DS, Liebson PR, Mitchell LB, Peters RW, Obias-Manno D, Barker AH, Arensberg D, Baker A, Friedman L, Greene HL, et al. Mortality and morbidity in patients receiving encainide, flecainide, or placebo. The Cardiac Arrhythmia Suppression Trial. N Engl J Med. 1991 Mar 21;324(12):781-8. PubMed PMID: 1900101.
  3. Freedman, David H. Lies, Damn Lies and Bad Medical Science. The Atlantic. November, 2010. www.theatlantic.com/magazine/archive/2010/11/lies-damned-lies-and-medical-science/8269/, accessed 11/07/2010.
  4. Glasziou P. The EBM journal selection process: how to find the 1 in 400 valid and highly relevant new research articles. Evid Based Med. 2006 Aug;11(4):101. PubMed PMID: 17213115.
  5. Graham Natural News: http://www.naturalnews.com/011401_Dr_David_Graham_the_FDA.html
  6. McKibbon KA, Wilczynski NL, Haynes RB. What do evidence-based secondary journals tell us about the publication of clinically important articles in primary health care journals? BMC Med. 2004 Sep 6;2:33. PubMed PMID: 15350200.
  7. Mello MM, Brennan TA. The controversy over high-dose chemotherapy with autologous bone marrow transplant for breast cancer. Health Aff (Millwood). 2001 Sep-Oct;20(5):101-17. PubMed PMID: 11558695.

Progression Free Survival (PFS) in Oncology Trials


Progression Free Survival (PFS) continues to be a frequently used endpoint in oncology trials. It is the time from randomization to the first of either objectively measured tumor progression or death from any cause. It is a surrogate outcome because it does not directly assess mortality, morbidity, quality of life, symptom relief or functioning. Even if a valid trial reports a statistically significant improvement in PFS and the reported effect size is large, PFS only provides information about biologic activity of the cancer and tumor burden or tumor response. Even though correlational analysis has shown associations between PFS and overall survival (OS) in some cancers, we believe that extreme caution should be exercised when drawing conclusions about efficacy of a new drug. In other words, PFS evidence alone is insufficient to establish a clinically meaningful benefit for patients or even a reasonable likelihood of net benefit. Many tumors do present a significant clinical burden for patients; however, clinicians frequently mistakenly believe that simply having a reduction in tumor burden equates with clinical benefit and that delaying the growth of a cancer is a clear benefit to patients.

PFS has a number of limitations that increase the risk of biased results and make it difficult for readers to interpret. Unlike OS, PFS does not identify the exact time of progression: because assessment occurs at scheduled visits, progression is recorded at the visit where it is detected, which is likely to overestimate time to progression. Also, it is common to stop or add anti-cancer therapies in PFS studies (also a common problem in trials of OS) prior to documentation of tumor progression, which may confound outcomes. Further, measurement errors may occur because of complex issues in tumor assessment. Adequate blinding is required to reduce the risk of performance and assessment bias. Other methodological issues include complex calculations to adjust for missed assessments and the need for complete data on adverse events.
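The effect of scheduled assessments can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that progression is only ever detected at the first scheduled visit on or after the true progression time and that no visits are missed. The function name and the week values are hypothetical.

```python
import math

def detected_progression_time(true_time_weeks, visit_interval_weeks):
    """Time at which progression would be recorded if it can only be
    detected at scheduled visits (first visit on or after the true time)."""
    visits_needed = math.ceil(true_time_weeks / visit_interval_weeks)
    return visits_needed * visit_interval_weeks

# The same true progression at week 13 under two assessment schedules.
true_time = 13
for interval in (6, 12):
    recorded = detected_progression_time(true_time, interval)
    print(f"visits every {interval:>2} weeks: recorded at week {recorded} "
          f"(overestimate of {recorded - true_time} weeks)")
```

Under these assumptions, less frequent assessment records a longer time to progression for the same patient, so differences in visit schedules or visit adherence between groups could by themselves produce an apparent difference in PFS.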

Attrition and assessment bias are made even more difficult to assess in oncology trials using time-to-event methodologies. The intention-to-treat principle requires that all randomly assigned patients be observed until they experience the end point or the study ends. Optimal follow-up in PFS trials is to follow each subject to both progression and death.

Delfini Comment

FDA approval based on PFS may result in acceptance of new therapies with greater harms than benefits. The limitations listed above, along with a concern that investigators may be less willing to conduct trials with OS as an endpoint once a drug has been approved, suggest that we should use great caution when considering evidence from studies using PFS as the primary endpoint. We believe that PFS should be thought of as any other surrogate marker—i.e., it represents extremely weak evidence (even in studies judged to be at low risk of bias) unless it is supported by acceptable evidence of improvements in quality of life and overall survival.

When assessing the quality of a trial using PFS, we suggest the following:

  1. Remember that although in some cases PFS appears to be predictive of OS, in many cases it is not.
  2. In many cases, improved PFS is accompanied by unacceptable toxicity and unacceptable changes in quality of life.
  3. Improved PFS results of several months may be due to methodological flaws in the study.
  4. As with any clinical trial, assess the trial reporting PFS for bias such as selection, performance, attrition and assessment bias.
  5. Compare characteristics of losses (e.g., due to withdrawing consent, adverse events, loss to follow-up, protocol violations) between groups and, if possible, between completers and those initially randomized.
  6. Pay special attention to censoring due to loss-to-follow-up. Administrative censoring (censoring of subjects who enter a study late and do not experience an event) may not result in significant bias, but non-administrative censoring (censoring because of loss-to-follow-up or discontinuing) is more likely to pose a threat to validity.

References

Carroll KJ. Analysis of progression-free survival in oncology trials: some common statistical issues. Pharm Stat. 2007 Apr-Jun;6(2):99-113. Review. PubMed PMID: 17243095.

D’Agostino RB Sr. Changing end points in breast-cancer drug approval—the Avastin story. N Engl J Med. 2011 Jul 14;365(2):e2. doi: 10.1056/NEJMp1106984. Epub 2011 Jun 27. PubMed PMID: 21707384.

Fleming TR, Rothmann MD, Lu HL. Issues in using progression-free survival when evaluating oncology products. J Clin Oncol. 2009 Jun 10;27(17):2874-80. doi: 10.1200/JCO.2008.20.4107. Epub 2009 May 4. PubMed PMID: 19414672

Lachin JM. (John M. Lachin, Sc.D., Professor of Biostatistics and Epidemiology, and of Statistics, The George Washington University personal communication)

Lachin JM. Statistical considerations in the intent-to-treat principle. Control Clin Trials. 2000 Jun;21(3):167-89. Erratum in: Control Clin Trials 2000 Oct;21(5):526. PubMed PMID: 10822117.


Network Meta-analyses—More Complex Than Traditional Meta-analyses


Meta-analyses are important tools for synthesizing evidence from relevant studies. One limitation of traditional meta-analyses is that they can compare only 2 treatments at a time in what is often termed pairwise or direct comparisons. An extension of traditional meta-analysis is the “network meta-analysis” which has been increasingly used—especially with the rise of the comparative effectiveness movement—as a method of assessing the comparative effects of more than two alternative interventions for the same condition that have not been studied in head-to-head trials.

A network meta-analysis synthesizes direct and indirect evidence over the entire network of interventions that have not been directly compared in clinical trials, but have one treatment in common.

Example
A clinical trial reports that for a given condition intervention A results in better outcomes than intervention B. Another trial reports that intervention B is better than intervention C. A network meta-analysis is likely to report that intervention A results in better outcomes than intervention C based on indirect evidence.
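The simplest form of this inference can be sketched in a few lines. The example below uses what is often called an adjusted indirect comparison through a common comparator (B), working on the log odds ratio scale. It is not a full network meta-analysis, and it assumes the two trials are similar enough in populations and methods for the comparison to be meaningful. All numbers and names are hypothetical.

```python
import math

def indirect_comparison(or_ab, se_log_ab, or_bc, se_log_bc, z=1.96):
    """Indirect odds ratio for A versus C from A-vs-B and B-vs-C estimates.
    Standard errors are on the log odds ratio scale."""
    log_or_ac = math.log(or_ab) + math.log(or_bc)
    # Variances add, so the indirect estimate is less precise than either input.
    se_log_ac = math.sqrt(se_log_ab ** 2 + se_log_bc ** 2)
    ci = (math.exp(log_or_ac - z * se_log_ac),
          math.exp(log_or_ac + z * se_log_ac))
    return math.exp(log_or_ac), ci, se_log_ac

# Hypothetical inputs: OR 0.80 for A vs B and OR 0.85 for B vs C.
or_ac, ci_ac, se_ac = indirect_comparison(0.80, 0.10, 0.85, 0.12)
print(f"Indirect OR (A vs C) = {or_ac:.2f} "
      f"(95% CI {ci_ac[0]:.2f} to {ci_ac[1]:.2f})")
```

Note that the standard error of the indirect estimate is larger than that of either contributing trial, which is one reason indirect evidence deserves more caution than a head-to-head comparison of the same size.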

Network meta-analyses, also known as “multiple-treatments meta-analyses” or “mixed-treatment comparisons meta-analyses” include both direct and indirect evidence. When both direct and indirect comparisons are used to estimate treatment effects, the comparison is referred to as a “mixed comparison.” The indirect evidence in network meta-analyses is derived from statistical inference which requires many assumptions and modeling. Therefore, critical appraisal of network meta-analyses is more complex than appraisal of traditional meta-analyses.

In all meta-analyses, clinical and methodological differences in studies are likely to be present. Investigators should only include valid trials. Plus they should provide sufficient detail so that readers can assess the quality of meta-analyses. These details include important variables such as PICOTS (population, intervention, comparator, outcomes, timing and study setting) and heterogeneity in any important study performance items or other contextual issues such as important biases, unique care experiences, adherence rates, etc. In addition, the effect sizes in direct comparisons should be compared to the effect sizes in indirect comparisons since indirect comparisons require statistical adjustments. Inconsistency between the direct and indirect comparisons may be due to chance, bias or heterogeneity. Remember, in direct comparisons the data come from the same trial. Indirect comparisons utilize data from separate randomized controlled trials which may vary in both clinical and methodological details.

Estimates of effect in a direct comparison trial may be lower than estimates of effect derived from indirect comparisons. Therefore, evidence from direct comparisons should be weighted more heavily than evidence from indirect comparisons in network meta-analyses. The combination of direct and indirect evidence in mixed treatment comparisons may be more likely to result in distorted estimates of effect size if there is inconsistency between effect sizes of direct and indirect comparisons.

Usually network meta-analyses rank different treatments according to the probability of being the best treatment. Readers should be aware that these rankings may be misleading because differences may be quite small or inaccurate if the quality of the meta-analysis is not high.

Delfini Comment
Network meta-analyses do provide more information about the relative effectiveness of interventions. At this time, we remain a bit cautious about the quality of many network meta-analyses because of the need for statistical adjustments. It should be emphasized that, as of this writing, methodological research has not established a preferred method for conducting network meta-analyses, assessing them for validity or assigning them an evidence grade.

References
Li T, Puhan MA, Vedula SS, Singh S, Dickersin K; Ad Hoc Network Meta-analysis Methods Meeting Working Group. Network meta-analysis-highly attractive but more methodological research is needed. BMC Med. 2011 Jun 27;9:79. doi: 10.1186/1741-7015-9-79. PubMed PMID: 21707969.

Salanti G, Del Giovane C, Chaimani A, Caldwell DM, Higgins JP. Evaluating the quality of evidence from a network meta-analysis. PLoS One. 2014 Jul 3;9(7):e99682. doi: 10.1371/journal.pone.0099682. eCollection 2014. PubMed PMID: 24992266.


Comparative Effectiveness Research (CER), “Big Data” & Causality


For a number of years now, we’ve been concerned that the CER movement and the growing love affair with “big data” will lead to many erroneous conclusions about cause and effect. We were pleased to see the following blog posts from Austin Frakt, an editor-in-chief of The Incidental Economist: Contemplating health care with a focus on research, an eye on reform

Ten impressions of big data: Claims, aspirations, hardly any causal inference

http://theincidentaleconomist.com/wordpress/ten-impressions-of-big-data-claims-aspirations-hardly-any-causal-inference/

+

Five more big data quotes: The ambitions and challenges

http://theincidentaleconomist.com/wordpress/five-more-big-data-quotes/


Cochrane Risk Of Bias Tool For Non-Randomized Studies


Like many others, our position is that, with very few exceptions, cause and effect conclusions regarding therapeutic interventions can only be drawn when valid RCT data exist. However, observational studies can be used to answer additional questions, and non-randomized studies (NRS) are often included in systematic reviews.

In September 2014, Cochrane published a tool for assessing bias in NRS for systematic review authors [1]. It may be of interest to our colleagues. The tool is called ACROBAT-NRSI (“A Cochrane Risk Of Bias Assessment Tool for Non-Randomized Studies”) and is designed to assist with evaluating the risk of bias (RoB) in the results of NRS that compare the health effects of two or more interventions.

The tool focuses on internal validity. It covers seven domains through which bias might be introduced into a NRS. The domains provide a framework for considering any type of NRS and are summarized in the table below. Many of the biases listed here are described in the full document, along with explanations of how they may cause bias. You can see our rough summary here: http://www.delfini.org/delfiniClick_Observations.htm#robtable

Response options for each bias include: low risk of bias; moderate risk of bias; serious risk of bias; critical risk of bias; and no information on which to base a judgment.

Details are available in the full document which can be downloaded at—https://sites.google.com/site/riskofbiastool/

Delfini Comment
We again point out that non-randomized studies often report seriously misleading results even when treated and control groups appear similar in prognostic variables, and we agree with Deeks that, for therapeutic interventions, “non-randomised studies should only be undertaken when RCTs are infeasible or unethical” [2]; and even then, buyer beware. Studies do not get “validity grace” because of scientific or practical challenges.

Furthermore, we are uncertain that this tool is of great value when assessing NRS. Deeks [2] identified 194 tools that could be or had been used to assess NRS. Do we really need another one? While it’s a good document for background reading, we are more comfortable approaching the problem of observational data by pointing out that, when it comes to efficacy, high quality RCTs have a positive predictive value of about 85% whereas well-done observational trials have a positive predictive value of about 20% [3].
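For readers curious where figures like these come from, the sketch below implements the positive predictive value formula with a bias term as we understand it from the cited Ioannidis paper [3]. The input values (power, pre-study odds and bias) are our assumptions about the scenarios behind the two figures and should be treated as illustrative rather than authoritative.

```python
def positive_predictive_value(pre_study_odds, power, alpha=0.05, bias=0.0):
    """Probability that a claimed (statistically significant) finding is true.

    pre_study_odds: ratio of true to no relationships among those tested (R)
    power:          1 - beta
    alpha:          type I error rate
    bias:           proportion of analyses reported as positive only because
                    of bias (u)
    """
    r, u = pre_study_odds, bias
    beta = 1 - power
    numerator = (1 - beta) * r + u * beta * r
    denominator = r + alpha - beta * r + u - u * alpha + u * beta * r
    return numerator / denominator

# Assumed scenarios: a well-powered RCT with little bias and 1:1 pre-study
# odds, and a well-powered observational study with more bias and 1:10 odds.
ppv_rct = positive_predictive_value(pre_study_odds=1.0, power=0.80, bias=0.10)
ppv_obs = positive_predictive_value(pre_study_odds=0.1, power=0.80, bias=0.30)
print(f"RCT scenario:           PPV = {ppv_rct:.2f}")
print(f"Observational scenario: PPV = {ppv_obs:.2f}")
```

The point of the exercise is that lower pre-study odds and more bias, both typical of observational research on treatment effects, sharply reduce the chance that a positive finding is true.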

References

  1. Sterne JAC, Higgins JPT, Reeves BC on behalf of the development group for ACROBAT-NRSI. A Cochrane Risk Of Bias Assessment Tool: for Non-Randomized Studies of Interventions (ACROBAT-NRSI), Version 1.0.0, 24 September 2014. Available from http://www.riskofbias.info [accessed 10/11/14].
  2. Deeks JJ, Dinnes J, D’Amico R, Sowden AJ, Sakarovitch C, Song F, Petticrew M, Altman DG; International Stroke Trial Collaborative Group; European Carotid Surgery Trial Collaborative Group. Evaluating non-randomised intervention studies. Health Technol Assess. 2003;7(27):iii-x, 1-173. Review. PubMed PMID: 14499048.
  3. Ioannidis JPA. Why Most Published Research Findings are False. PLoS Med 2005; 2(8):696-701 PMID: 16060722.


Comparative Study Designs: Claiming Superiority, Equivalence and Non-inferiority—A Few Considerations & Practical Approaches

This is a complex area, and we recommend downloading our freely available 1-page summary to help assess issues with equivalence and non-inferiority trials. Here is a short sampling of some of the problems in these designs: lack of sufficient evidence confirming efficacy of the referent treatment (“referent” refers to the comparator treatment); study not sufficiently similar to the referent study; inappropriate Deltas (Delta meaning the margin established for equivalence or non-inferiority); or significant biases or analysis methods that would tend to diminish an effect size and “favor” no difference between groups (e.g., conservative application of ITT analysis, insufficient power, etc.), thus pushing toward non-inferiority or equivalence.

However, we do want to say a few more things about non-inferiority trials based on some recent questions and readings.

Is it acceptable to claim superiority in a non-inferiority trial? Yes. The Food and Drug Administration (FDA) and the European Medicines Agency (EMA), among others, including ourselves, all agree that declaring superiority in a non-inferiority trial is acceptable. What’s more, there is agreement that multiplicity adjusting does not need to be done when first testing for non-inferiority and then superiority.

See Delfini Recommended Reading: Included here is a nice article by Steve Snapinn. Snapinn even recommends that “…most, if not all, active-controlled clinical trial protocols should define a noninferiority margin and include a noninferiority hypothesis.” We agree. Clinical trials are expensive to do, take time, have opportunity costs, and, most importantly, affect the lives of the human subjects who engage in them. This is a smart procedure that costs nothing, especially as multiplicity adjusting is not needed.

What does matter is having an appropriate population for doing a superiority analysis. For superiority, in studies with dichotomous variables, the population should be Intention-to-Treat (ITT) with an appropriate imputation method that does not favor the intervention under study. In studies with time-to-event outcomes, the population should be based on the ITT principle (meaning all randomized patients should be used in the analysis by the group to which they were randomized) with unbiased censoring rules.

Confidence intervals (CIs) should be evaluated to determine superiority. Some evaluators seem to suggest that superiority can be declared only if the CIs are wholly above the Delta. Schumi et al. express their opinion that you can declare superiority if the confidence interval for the new treatment is above the line of no difference (i.e., is statistically significant). They state, “The calculated CI does not know whether its purpose is to judge superiority or non-inferiority. If it sits wholly above zero [or 1, depending upon the measure of outcome], then it has shown superiority.” EMA would seem to agree. We agree as well. If one wishes to take a more conservative approach, one method we recommend is to judge whether the Delta seems clinically reasonable (you should always do this) and, if not, to establish your own through clinical judgment. Then determine if the entire CI meets or exceeds what you deem to be clinically meaningful. To us, this method satisfies both approaches and makes practical and clinical sense.
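The CI reading described above can be sketched in a few lines. This is only an illustration under stated assumptions: the outcome is a difference (new treatment minus referent) in which higher values favor the new treatment, the line of no difference is zero, and Delta is given as a positive number so that the non-inferiority bound sits at minus Delta. The function name and the example numbers are hypothetical.

```python
def classify_confidence_interval(lower, upper, delta):
    """Classify a CI for (new - referent), where higher values favor the new treatment.

    lower, upper -- bounds of the confidence interval
    delta        -- non-inferiority margin as a positive number (bound is -delta)
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if delta <= 0:
        raise ValueError("delta must be positive")
    if lower > 0:
        # The whole interval sits above the line of no difference.
        return "superior"
    if lower > -delta:
        # The whole interval sits above the non-inferiority bound.
        return "non-inferior"
    return "non-inferiority not shown"


# Hypothetical intervals, all judged against a Delta of 2.0.
print(classify_confidence_interval(0.5, 3.0, delta=2.0))   # superior
print(classify_confidence_interval(-1.0, 2.0, delta=2.0))  # non-inferior
print(classify_confidence_interval(-3.0, 1.0, delta=2.0))  # non-inferiority not shown
```

For the more conservative approach, the same comparison can be repeated with your own clinically meaningful threshold in place of zero. Ratio measures, where the line of no difference is 1, would need the margin expressed on the ratio scale, which this sketch does not handle.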

Is it acceptable to claim non-inferiority in a superiority trial? It depends. This area is controversial, with some saying no and some saying it depends. However, there is agreement amongst those on the “it depends” side that it generally should not be done due to validity issues as described above.

References
US Department of Health and Human Services, Food and Drug Administration: Guidance for Industry Non-Inferiority Clinical Trials (DRAFT). 2010.
http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM202140.pdf

European Agency for the Evaluation of Medicinal Products Committee for Proprietary Medicinal Products (CPMP): Points to Consider on Switching Between Superiority and Non-Inferiority. 2000. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2014556/

http://www.delfini.org/delfiniReading.htm#equivalence


Estimating Relative Risk Reduction from Odds Ratios

Odds are hard to work with because they compare the likelihood of an event occurring with the likelihood of it not occurring. For example, odds of two to one mean that an event is twice as likely to occur as not to occur. Contrast this with probability, which is simply the likelihood of an event occurring.
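To make the distinction concrete, here is a small sketch that converts between odds and probability. The function names are ours, chosen for illustration.

```python
def odds_to_probability(odds):
    """Convert odds (event : no event, given as a single number) to a probability."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1.0 + odds)


def probability_to_odds(probability):
    """Convert a probability to odds; undefined when the probability is 1."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be at least 0 and less than 1")
    return probability / (1.0 - probability)


# Odds of two to one correspond to a probability of two in three.
print(f"{odds_to_probability(2.0):.3f}")  # 0.667
# A probability of 0.5 corresponds to even odds.
print(probability_to_odds(0.5))           # 1.0
```

Note that odds and probability are close only when the event is uncommon: a probability of 0.01 gives odds of about 0.0101, while a probability of 0.5 gives odds of 1.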

An odds ratio (OR) is a point estimate used for case-control studies which attempts to quantify a mathematical relationship between an exposure and a health outcome. Odds must be used in case-control studies because the investigator arbitrarily controls the population; therefore, probability cannot be determined because the disease rates in the study population cannot be known. The odds that a case is exposed to a certain variable are divided by the odds that a control is exposed to that same variable.

Odds are often used in other types of studies as well, such as meta-analysis, because of various properties of odds which make them easy to use mathematically. However, increasingly authors are discouraged from computing odds ratios in secondary studies because of the difficulty translating what this actually means in terms of size of benefits or harms to patients.

Readers frequently attempt to deal with this by converting the odds ratio into relative risk reduction by thinking of the odds ratio as similar to relative risk. Relative risk reduction (RRR) is computed from relative risk (RR) by simply subtracting the relative risk from one and expressing that outcome as a percentage (1-RR).

Some experts advise readers that this is safe to do if the prevalence of the event is low. While it is true that odds and probabilities of outcomes are usually similar if the event rate is low, when possible, we recommend calculating both the odds ratio reduction and the relative risk reduction in order to compare and determine if the difference is clinically meaningful. And determining if something is clinically meaningful is a judgment, and therefore whether a conversion of OR to RRR is distorted depends in part upon that judgment.

a = group 1 outcome occurred
b = group 1 outcome did not occur
c = group 2 outcome occurred
d = group 2 outcome did not occur

OR = (a/b)/(c/d)
Estimated RRR from OR (odds ratio reduction) = 1-OR

RR = (a/group 1 n)/(c/group 2 n), where group 1 n = a + b and group 2 n = c + d
RRR = 1-RR
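The formulas above can be compared side by side with a short sketch. The function names and the two sets of counts are made up for illustration: one table has a low event rate and one has a high event rate, and both have the same relative risk.

```python
def odds_ratio(a, b, c, d):
    """OR = (a/b)/(c/d), using the a, b, c, d cell definitions above."""
    return (a / b) / (c / d)


def relative_risk(a, b, c, d):
    """RR = (a/(a+b))/(c/(c+d)), where a+b and c+d are the group sizes."""
    return (a / (a + b)) / (c / (c + d))


def compare_reductions(a, b, c, d):
    """Return OR and RR with the two reductions, 1-OR and 1-RR."""
    or_value = odds_ratio(a, b, c, d)
    rr_value = relative_risk(a, b, c, d)
    return {
        "odds_ratio": or_value,
        "relative_risk": rr_value,
        "odds_ratio_reduction": 1.0 - or_value,
        "relative_risk_reduction": 1.0 - rr_value,
    }


# Hypothetical counts: event rates of 1% vs 2%, then 30% vs 60%.
low = compare_reductions(10, 990, 20, 980)
high = compare_reductions(300, 700, 600, 400)
for label, result in (("low event rate", low), ("high event rate", high)):
    print(
        f"{label}: 1-OR = {result['odds_ratio_reduction']:.1%}, "
        f"RRR = {result['relative_risk_reduction']:.1%}"
    )
# low event rate: 1-OR = 50.5%, RRR = 50.0%
# high event rate: 1-OR = 71.4%, RRR = 50.0%
```

In the low event rate example the two reductions are nearly identical, while in the high event rate example 1-OR overstates the relative risk reduction considerably. Whether a given gap matters remains a clinical judgment.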

 

 


More on Attrition Bias: Update on Missing Data Points: Difference or No Difference — Does it Matter?

Attrition Bias Update 01/14/2014: Missing Data Points: Difference or No Difference — Does it Matter?

A colleague recently wrote us to ask us more about attrition bias. We shared with him that the short answer is that there is less conclusive research on attrition bias than on other key biases. Attrition does not necessarily mean that attrition bias is present and distorting statistically significant results. Attrition may simply result in a smaller sample size which, depending upon how small the remaining population is, may be more prone to chance due to outliers or false non-significant findings due to lack of power.

If randomization successfully results in balanced groups, if blinding is successful including concealed allocation of patients to their study groups, if adherence is high, if protocol deviations are balanced and low, if co-interventions are balanced, if censoring rules are used which are unbiased, and if there are no differences between the groups except for the interventions studied, then it may be reasonable to conclude that attrition bias is not present even if attrition rates are large. Balanced baseline comparisons between completers provide further support for such a conclusion, as does comparability in reasons for discontinuation, especially if many categories are reported.

On the other hand, other biases may result in attrition bias. For example, imagine a comparison of an active agent to a placebo in a situation in which blinding is not successful. A physician might encourage his or her patient to drop out of a study if they know the patient is on placebo, resulting in biased attrition that, in sufficient numbers, would potentially distort the results from what they would otherwise have been.


Demo: Critical Appraisal of a Randomized Controlled Trial

We recently had a great opportunity to listen to a live demonstration of a critical appraisal of a randomized controlled trial conducted by Dr. Brian Alper, Founder of DynaMed; Vice President of EBM Research and Development, Quality & Standards at EBSCO Information Services.

Dr. Alper is extremely knowledgeable about critical appraisal and does an outstanding job clearly describing key issues concerning his selected study for review. We are fortunate to have permission to share the recorded webinar with you.

“Learn How to Critically Appraise a Randomized Trial with Brian S. Alper, MD, MSPH, FAAFP”

Below are details of how to access the study that was used in the demo and how to access the webinar itself.

The Study
The study used for the demonstration is Primary Prevention of Cardiovascular Disease with a Mediterranean Diet. The full citation is:

Estruch R, Ros E, Salas-Salvadó J, Covas MI, Corella D, Arós F, Gómez-Gracia E, Ruiz-Gutiérrez V, Fiol M, Lapetra J, Lamuela-Raventos RM, Serra-Majem L, Pintó X, Basora J, Muñoz MA, Sorlí JV, Martínez JA, Martínez-González MA; PREDIMED Study Investigators. Primary prevention of cardiovascular disease with a Mediterranean diet. N Engl J Med. 2013 Apr 4;368(14):1279-90. doi: 10.1056/NEJMoa1200303. Epub 2013 Feb 25. PubMed PMID: 23432189.

Access to the study for the critical appraisal demo is available here:

http://www.ncbi.nlm.nih.gov/pubmed/?term=N+Engl+J+Med+2013%3B+368%3A1279-1290

The Webinar: 1 Hour

For those of you who have the ability to play WebEx files or can download the software to do so, the webinar can be accessed here—

https://ebsco.webex.com/ebsco/lsr.php?AT=pb&SP=TC&rID=22616757&rKey=f7e98d3414abc8ca&act=pb

Important: It takes about 60 seconds before the webinar starts. (Be sure your sound is on.)

More Chances to Learn about Critical Appraisal

There is a wealth of freely available information to help you both learn and accomplish critical appraisal tasks as well as other evidence-based quality improvement activities. Our website is www.delfini.org. We also have a little book available for purchase, for which we are getting rave reviews. It is now being used to train medical and pharmacy residents and is also used in medical, pharmacy and nursing schools.

Delfini Evidence-based Practice Series Guide Book

Basics for Evaluating Medical Research Studies: A Simplified Approach (And Why Your Patients Need You to Know This)

Find our book at http://www.delfinigrouppublishing.com/ or on our website at www.delfini.org (see Books).

Delfini Recommends DynaMed™

We highly recommend DynaMed.  Although we urge readers to be aware that there is variation in all medical information sources, as members of the DynaMed editorial board (unpaid), we have opportunity to participate in establishing review criteria as well as getting a closer look into methods, staff skills, review outcomes, etc., and we think that DynaMed is a great resource. Depending upon our clinical question and project, DynaMed is often our starting point.

About DynaMed™ from the DynaMed Website

DynaMed™ is a clinical reference tool created by physicians for physicians and other health care professionals for use at the point-of-care. With clinically-organized summaries for more than 3,200 topics, DynaMed provides the latest content and resources with validity, relevance and convenience, making DynaMed an indispensable resource for answering most clinical questions during practice.

Updated daily, DynaMed editors monitor the content of over 500 medical journals on a daily basis. Each article is evaluated for clinical relevance and scientific validity. The new evidence is then integrated with existing content, and overall conclusions are changed as appropriate, representing a synthesis of the best available evidence. Through this process of Systematic Literature Surveillance, the best available evidence determines the content of DynaMed.

Who Uses DynaMed

DynaMed is used in hospitals, medical schools, residency programs, group practices and by individual clinicians supporting physicians, physician assistants, nurses, nurse practitioners, pharmacists, physical therapists, medical researchers, students, teachers and numerous other health care professionals at the point-of-care.

https://dynamed.ebscohost.com/

 
