“Reading” a Clinical Trial Won’t Get You There—or Let’s Review (And Apply) Some Basics About Assessing The Validity of Medical Research Studies Claiming Superiority for Efficacy of Therapies

An obvious question raised by the title is, “Get you where?” Well, the answer is, “To where you know it is reasonable to think you can trust the results of the study you have just finished reading.” In this blog, our focus is on how to critically appraise medical research studies that claim superiority for efficacy of a therapy.

Because of a Lack of Understanding of Medical Science Basics, People May Be Injured or Die

Understanding the basic requirements for valid medical science is very important. The numbers below are estimates, but they are likely to be close or understated—

  1. Over 63,000 people with heart disease died after taking encainide or flecainide because many doctors thought taking these drugs “made biological sense,” but did not understand the simple need for reliable clinical trial information to confirm what seemed to “make sense” [Echt 91].
  2. An estimated 60,000 people in the United States died and another 140,000 experienced a heart attack resulting from the use of a nonsteroidal anti-inflammatory drug despite important benefit and safety information reported in the abstract of the pivotal trial used for FDA approval [Graham].
  3. In another example, roughly 42,000 women with advanced breast cancer suffered excruciating side effects without any proof of benefit, many of them dying as a result, and at a cost of $3.4 billion [Mello].
  4. At least 64 deaths among 751 cases across nearly half of the United States were linked to fungal meningitis thought to be caused by a contaminated treatment used for back and radicular pain—but there is no reliable scientific evidence of benefit from that treatment [CDC].

These were preventable deaths and harms—from common treatments—which patients might have avoided if their physicians had better understood the importance and methods of evaluating medical science.

Failures to Understand Medical Science Basics

Many health care professionals don’t know how to quickly assess a trial for reliability and clinical usefulness—and yet mastering the basics is not difficult. Over the years, we have given a pre-test of 3 simple questions to more than a thousand physicians, pharmacists and others who have attended our training programs. Approximately 70% fail—“failure” being defined as missing 2 or 3 of the questions.

One pre-test question is designed to see if people recognize the lack of a comparison group in a report of the “effectiveness” of a new treatment. Without a comparison group of people with similar prognostic characteristics who are treated exactly the same except for the intervention under study, you cannot discern cause and effect, because a difference between the groups other than the intervention may explain or affect the results.

A second pre-test question deals with presenting results as relative risk reduction (RRR) without absolute risk reduction (ARR) or event rates in the study groups. A “relative” measure raises the question, “Relative to what?” Is the reported RRR in our test question 60 percent of 100 percent? Or 60 percent of 1 percent?
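
To make the distinction concrete, here is a minimal sketch in Python (the event rates are hypothetical, not the ones in our pre-test question) showing how the same RRR can describe very different absolute benefits:

```python
def risk_measures(control_rate, treated_rate):
    """Absolute risk reduction, relative risk reduction, and NNT."""
    arr = control_rate - treated_rate      # absolute risk reduction
    rrr = arr / control_rate               # relative risk reduction
    nnt = 1 / arr                          # number needed to treat
    return arr, rrr, nnt

# The same 60% RRR with two very different baseline risks:
for control_rate, treated_rate in [(1.00, 0.40), (0.01, 0.004)]:
    arr, rrr, nnt = risk_measures(control_rate, treated_rate)
    print(f"control {control_rate:.1%}, treated {treated_rate:.1%} "
          f"-> RRR {rrr:.0%}, ARR {arr:.1%}, NNT {nnt:.0f}")
```

Both rows print a 60% RRR, but the absolute risk reduction is 60% in one case and 0.6% in the other, so the number needed to treat goes from about 2 to about 167.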

The last of our pre-test questions assesses attendees’ basic understanding of only one of the two requirements to qualify as an Intention-to-Treat (ITT) analysis. The two requirements are that people should be analyzed as randomized and that all people should be included in the analysis whether they have discontinued, are missing or have crossed over to other treatment arms. The failure rate on this last requirement is very high. (We will add that this last requirement means that a value has to be assigned if one is missing—and so one of the most important aspects of critically appraising an ITT analysis is the evaluation of the methods for “imputing” missing data.)
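
For illustration, here is a toy sketch (with entirely hypothetical outcome data) of what the second requirement means in practice. Every randomized subject is analyzed, which forces an explicit choice for imputing missing outcomes:

```python
# 1 = treatment success, 0 = failure, None = outcome missing (dropped out)
treatment_arm = [1, 1, 1, 0, 1, None, None, 1, 0, 1]
control_arm   = [0, 1, 0, 0, 1, 0, None, 0, 1, 0]

def itt_success_rate(outcomes, impute):
    """Success rate over ALL randomized subjects, with missing outcomes
    imputed as `impute` (0 = assume failure, 1 = assume success)."""
    filled = [o if o is not None else impute for o in outcomes]
    return sum(filled) / len(filled)

# A conservative choice for a superiority claim: count missing as failures.
print("ITT (missing = failure):",
      itt_success_rate(treatment_arm, 0), "vs", itt_success_rate(control_arm, 0))

# Completers-only analysis for contrast; this is NOT an ITT analysis.
completers = [o for o in treatment_arm if o is not None]
print("Treatment completers only:", sum(completers) / len(completers))
```

In this made-up example the completers-only success rate (75%) is noticeably rosier than the ITT rate (60%), which is exactly why the choice of imputation method deserves scrutiny.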

By the end of our training programs, success rates have always markedly improved. Others have reported similar findings.

There Is a Lot of Science, and Much of It May Not Be Reliable
Each week more than 13,000 references are added to the world’s largest medical library—the National Library of Medicine (NLM). Unfortunately, many of these studies are seriously flawed. One large review of 60,352 studies reported that only 7 percent passed criteria for high-quality methods and clinical relevance [McKibbon]. We and others have estimated that up to (and maybe more than) 90% of the published medical information that health care professionals rely on is flawed [Freedman, Glasziou].

Bias Distorts Results
We cannot know if an intervention is likely to be effective and safe without critically appraising the evidence for validity and clinical usefulness. We need to evaluate the reliability of medical science before seriously considering the reported therapeutic results, because biases such as lack of or inadequate randomization, lack of successful blinding or other threats to validity—which we describe below—can distort reported results by up to 50 percent or more [see Risk of Bias References].

Patients Deserve Better
Patients cannot make informed choices regarding various interventions without being provided with quantified projections of benefits and harms from valid science.

Some Simple Steps To Critical Appraisal
Below is a short summary of our simplified approach to critically appraising a randomized superiority clinical trial. Our focus is on “internal validity” which means “closeness to truth” in the context of the study. “External validity” is about the likelihood of reaching truth outside of the study context and requires judgment about issues such as fit with individuals or populations in circumstances other than those in the trial.

You can review and download a wealth of freely available information at our website, www.delfini.org, including the checklists and tools at http://www.delfini.org/delfiniTools.htm, which can provide you with much greater detail. Most relevant to this blog is our short critical appraisal checklist, which you can download here—http://www.delfini.org/Delfini_Tool_StudyValidity_Short.pdf

The Big Questions
In brief, your overarching questions are these:

  1. Is reading this study worth my time? If the results are true, would they change my practice? Do they apply to my situation? What is the likely impact on my patients?
  2. Can anything explain the results other than cause and effect? Evaluate the potential for results being distorted by bias (anything other than chance leading away from the truth) or random chance effects.
  3. Is there any difference between groups other than what is being studied? This is automatically a bias.
  4. If the study appears to be valid, but attrition is high, sometimes it is worth asking, “What conditions would need to be present for attrition to distort the results?” Attrition does not always distort results, but it may obscure a true difference due to the reduction in sample size.

Evaluating Bias

There are four stages of a clinical trial, and you should ask several key questions when evaluating bias in each of the four stages.

  1. Subject Selection & Treatment Assignment—Evaluation of Selection Bias

Important considerations include how subjects were selected for the study, whether there were enough subjects, how they were assigned to their study groups, and whether the groups were balanced in terms of prognostic variables.

Your critical appraisal to-do list includes—

a) Checking to see if the randomization sequence was generated in an acceptable manner. (Minimization may be an acceptable alternative.)

b) Determining whether the investigators adequately concealed the allocation of subjects to each study group. Meaning, was the method for assigning treatment hidden so that an investigator could not manipulate the assignment of a subject to a selected study group?

c) Examining the table of baseline characteristics to determine whether randomization was likely to have been successful, i.e., that the groups are balanced in terms of important prognostic variables (e.g., clinical and demographic variables).
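
As an illustration of what an “acceptable manner” of sequence generation can look like, here is a minimal sketch of computer-generated permuted-block randomization; the block size, arm labels and function name are ours, for the example only:

```python
import random

def permuted_block_sequence(n_subjects, block_size=4, arms=("A", "B")):
    """Computer-generated permuted-block randomization sequence.

    In a real trial the sequence would be produced by someone independent
    of recruitment and kept concealed (e.g., central web or telephone
    allocation) so investigators cannot foresee the next assignment.
    """
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_subjects:
        block = list(arms) * per_arm   # each block is balanced across arms
        random.shuffle(block)
        sequence.extend(block)
    return sequence[:n_subjects]

print(permuted_block_sequence(10))
```

The key points to look for in a report are a genuinely unpredictable sequence and concealment of the upcoming assignment, not the particular mechanics used to generate it.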

  2. The Intervention & Context—Evaluation of Performance Bias

What is being studied, and what is it being compared to? Was the intervention likely to have been executed successfully? Was blinding likely to have been successful? Was duration reasonable for treatment as well as for follow-up? Was adherence reasonable? What else happened to study subjects in the course of the study such as use of co-interventions? Were there any differences in how subjects in the groups were treated?

Your to-do list includes evaluating:

a) Adequacy of blinding of subjects and all working with subjects and their data—including likely success of blinding;

b) Subjects’ adherence to treatment;

c) Inter-group differences in treatment or care except for the intervention(s) being studied.

  3. Data Collection & Loss of Data—Evaluation of Attrition Bias

What information was collected, and how was it collected? What data are missing and is it likely that missing data could meaningfully distort the study results?

Your to-do list includes evaluating—

a) Measurement methods (e.g., mechanisms, tools, instruments, means of administration, personnel issues, etc.)

b) Classification and quantification of missing data in each group (e.g., discontinuations due to ADEs, unrelated deaths, protocol violations, loss to follow-up, etc.)

c) Whether missing data are likely to distort the reported results. This is the area in which the evidence on the distorting risk of bias provides the least help. And so, again, it is often worthwhile asking, “What conditions would need to be present for attrition to distort the results?”
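
One simple way to approach that question is to bound the results under extreme assumptions about the missing subjects. A minimal sketch, using hypothetical per-arm counts:

```python
def event_rate_bounds(events, completers, randomized):
    """Best- and worst-case event rates for one arm, assuming none or all
    of the missing subjects had the event. Counts are hypothetical."""
    missing = randomized - completers
    return events / randomized, (events + missing) / randomized

# e.g., 20 events among 80 completers of 100 randomized in one arm:
print(event_rate_bounds(events=20, completers=80, randomized=100))
# -> (0.2, 0.4): with 20% attrition the true rate could lie anywhere here.
```

If the reported effect would survive even the unfavorable end of such bounds, attrition is unlikely to explain it.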

  4. Results & Assessing The Differences In The Outcomes Of The Study Groups—Evaluating Assessment Bias

Were outcome measures reasonable, pre-specified and analyzed appropriately? Was reporting selective? How was safety assessed? Remember that models are not truth.

Your to-do list includes evaluating—

a) Whether assessors were blinded.

b) How the effect size was calculated (e.g., absolute risk reduction, relative risk, etc.). You especially want to know benefit or risk with and without treatment.

c) Were confidence intervals included? (You can calculate these yourself online, if you wish; see the links at our website for suggestions, or the sketch following this list.)

d) For dichotomous variables, was a proper intention-to-treat (ITT) analysis conducted with a reasonable choice for imputing values for missing data?

e) For time-to-event trials, were censoring rules unbiased? Were the number of censored subjects reported?
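
If a trial reports event counts but no confidence interval for the absolute risk reduction, a simple normal (Wald) approximation is easy to compute yourself. A minimal sketch with hypothetical counts; for a real appraisal, an online calculator or statistical package is preferable:

```python
from math import sqrt

def arr_with_ci(events_tx, n_tx, events_ctl, n_ctl, z=1.96):
    """Absolute risk reduction and a Wald 95% confidence interval."""
    p_tx, p_ctl = events_tx / n_tx, events_ctl / n_ctl
    arr = p_ctl - p_tx
    se = sqrt(p_tx * (1 - p_tx) / n_tx + p_ctl * (1 - p_ctl) / n_ctl)
    return arr, (arr - z * se, arr + z * se)

# Hypothetical counts: 30/300 events on treatment vs 45/300 on control.
arr, (lo, hi) = arr_with_ci(30, 300, 45, 300)
print(f"ARR = {arr:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

In this made-up example the 95% confidence interval crosses zero, so despite a seemingly attractive ARR of 5%, the difference would not be statistically significant.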

After you have evaluated a study for bias and chance and have determined that the study is valid, the study results should be evaluated for clinical meaningfulness (e.g., the amount of clinical benefit and the potential for harm). Clinical outcomes include morbidity; mortality; symptom relief; physical, mental and emotional functioning; and quality of life—or any surrogate outcomes that have been demonstrated in valid studies to affect a clinical outcome.

Final Comment

It is not difficult to learn how to critically appraise a clinical trial. Health care providers owe it to their patients to gain these skills. Health care professionals cannot rely on abstracts and authors’ conclusions—they must assess studies first for validity and second for clinical usefulness. Authors are often biased, even with the best of intentions. Remember that authors’ conclusions are opinions, not evidence. Authors frequently use misleading terms or draw misleading conclusions. Physicians and others who lack critical appraisal skills are often misled by authors’ conclusions and summary statements. Critical appraisal knowledge is required to evaluate the validity of a study, which must be done prior to seriously considering reported results.

For those who wish to go more deeply, we have books available and do training seminars. See our website at www.delfini.org.

Risk of Bias References

  1. Juni P, Altman DG, Egger M. Systematic reviews in health care: assessing the quality of controlled clinical trials. BMJ 2001;323:42-6. PubMed PMID: 11440947.
  2. Juni P, Witschi A, Bloch R, Egger M. The hazards of scoring the quality of clinical trials for meta-analysis. JAMA. 1999 Sep 15;282(11):1054-60. PubMed PMID: 10493204.
  3. Kjaergard LL, Villumsen J, Gluud C. Reported methodological quality and discrepancies between large and small randomized trials in meta-analyses. Ann Intern Med 2001;135:982-89. PubMed PMID: 11730399.
  4. Moher D, Pham B, Jones A, Cook DJ, Jadad AR, Moher M, Tugwell P, Klassen TP. Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet. 1998 Aug 22;352(9128):609-13. PubMed PMID: 9746022.
  5. Poolman RW, Struijs PA, Krips R, Sierevelt IN, et al. Reporting of outcomes in orthopaedic randomized trials: does blinding of outcome assessors matter? J Bone Joint Surg Am. 2007;89:550-558. PubMed PMID: 17332104.
  6. Savovic J, Jones HE, Altman DG, et al. Influence of reported study design characteristics on intervention effect estimates from randomized, controlled trials. Ann Intern Med. 2012 Sep 4. doi: 10.7326/0003-4819-157-6-201209180-00537. [Epub ahead of print] PubMed PMID: 22945832.
  7. van Tulder MW, Suttorp M, Morton S, et al. Empirical evidence of an association between internal validity and effect size in randomized controlled trials of low-back pain. Spine (Phila Pa 1976). 2009 Jul 15;34(16):1685-92. PubMed PMID: 19770609.

Other References

  1. CDC: http://www.cdc.gov/HAI/outbreaks/meningitis.html
  2. Echt DS, Liebson PR, Mitchell LB, Peters RW, Obias-Manno D, Barker AH, Arensberg D, Baker A, Friedman L, Greene HL, et al. Mortality and morbidity in patients receiving encainide, flecainide, or placebo. The Cardiac Arrhythmia Suppression Trial. N Engl J Med. 1991 Mar 21;324(12):781-8. PubMed PMID: 1900101.
  3. Freedman, David H. Lies, Damned Lies, and Medical Science. The Atlantic. November 2010. www.theatlantic.com/magazine/archive/2010/11/lies-damned-lies-and-medical-science/8269/, accessed 11/07/2010.
  4. Glasziou P. The EBM journal selection process: how to find the 1 in 400 valid and highly relevant new research articles. Evid Based Med. 2006 Aug;11(4):101. PubMed PMID: 17213115.
  5. Graham Natural News: http://www.naturalnews.com/011401_Dr_David_Graham_the_FDA.html
  6. McKibbon KA, Wilczynski NL, Haynes RB. What do evidence-based secondary journals tell us about the publication of clinically important articles in primary health care journals? BMC Med. 2004 Sep 6;2:33. PubMed PMID: 15350200.
  7. Mello MM, Brennan TA. The controversy over high-dose chemotherapy with autologous bone marrow transplant for breast cancer. Health Aff (Millwood). 2001 Sep-Oct;20(5):101-17. PubMed PMID: 11558695.

Clinical Guidelines and Elderly Patients—Proceed with Caution

We were alerted to this important study of clinical practice guidelines (CPGs) in elderly patients with co-morbidities [1] by Demetra Antimisiaris, an associate professor who directs the polypharmacy initiative at the University of Louisville School of Medicine.

The authors report that most CPGs did not modify recommendations for older patients with multiple comorbidities. Most also did not comment on the quality of the underlying scientific evidence or provide guidance for incorporating patient preferences into treatment plans. If the relevant CPGs were followed, one hypothetical patient described in the report would be prescribed 12 medications (costing her $406 per month). Use of guidelines for this patient would result in a complicated medication schedule, and adverse events would be likely. The authors state that, “Although CPGs provide detailed guidance for managing single diseases, they fail to address the needs of older patients with complex comorbid illness.” CPGs rarely address treatment of patients with 3 or more chronic diseases—a group that includes half of the population older than 65 years. Adhering to current CPGs in caring for an older person with several comorbidities may have multiple undesirable effects and could result in net harms.

  1. Boyd CM, Darer J, Boult C, Fried LP, Boult L, Wu AW. Clinical practice guidelines and quality of care for older patients with multiple comorbid diseases: implications for pay for performance. JAMA. 2005 Aug 10;294(6):716-24. PubMed PMID: 16091574.

More on Attrition Bias: Update on Missing Data Points: Difference or No Difference — Does it Matter?

Update 01/14/2014

A colleague recently wrote to ask us more about attrition bias. We shared with him that the short answer is that there is less conclusive research on attrition bias than on other key biases. Attrition does not necessarily mean that attrition bias is present and distorting statistically significant results. Attrition may simply result in a smaller sample size, which, depending upon how small the remaining population is, may make results more prone to chance effects from outliers or to false non-significant findings due to lack of power.

If randomization successfully results in balanced groups, if blinding is successful (including concealed allocation of patients to their study groups), if adherence is high, if protocol deviations are balanced and low, if co-interventions are balanced, if unbiased censoring rules are used, and if there are no differences between the groups except for the interventions studied, then it may be reasonable to conclude that attrition bias is not present even if attrition rates are large. Balanced baseline comparisons between completers provide further support for such a conclusion, as does comparability in reasons for discontinuation, especially if many categories are reported.

On the other hand, other biases may result in attrition bias. For example, imagine a comparison of an active agent to a placebo in a situation in which blinding is not successful. A physician who knows a patient is on placebo might encourage that patient to drop out of the study, resulting in biased attrition that, in sufficient numbers, could distort the results from what they would otherwise have been.


California Pharmacist Journal: Student Evidence Review of NAVIGATOR Study

Klevens A, Stuart ME, Strite SA. NAVIGATOR (Effect of nateglinide on the incidence of diabetes and cardiovascular events; PMID 20228402) Study Evidence Review. California Pharmacist 2012;Vol. LIX, No. 4, Fall 2012. Available at our California Pharmacist journal page.


Safety Review of Five Biologic Antirheumatic Drugs

An abstract of ours was selected for publication by the European League Against Rheumatism (EULAR) for their Annual European Congress of Rheumatology 2012. We believe that our review provides important safety information for providers and patients. While the evidence has to be considered borderline at best due to study design and methodology issues (much is observational, for example), we believe the patterns are highly compelling and consistent, and that they are not likely to be explained by a systematic bias. Therefore, we feel quite confident in the direction of the outcomes. (The link below is sometimes slow to load or needs to be loaded a subsequent time to view—so if it “fails to load,” try again.)

Stuart ME, Strite SA, Gandra SA. Systematic Safety Review of Five Biologic Antirheumatic Drugs. Abstract number AB0478; EULAR 2012 Annual European Congress of Rheumatology.


Loss to Follow-up Update
Heads up about an important systematic review of the effects of attrition on outcomes of randomized controlled trials (RCTs) that was recently published in the BMJ.[1]

Background

  • Key Question: Would the outcomes of the trial change significantly if all persons had completed the study, and we had complete information on them?
  • Loss to follow-up in RCTs is important because it can bias study results if the balance between study groups that was established through randomization is disrupted in key prognostic variables, which would then result in different outcomes. If there is no imbalance between and within various study subgroups (i.e., as-randomized groups compared to completers), then loss to follow-up may not present a threat to validity, except in instances in which statistical significance is not reached because of decreased power.

BMJ Study
The aim of this review was to assess the reporting, extent and handling of loss to follow-up and its potential impact on the estimates of the effect of treatment in RCTs. The investigators evaluated 235 RCTs published from 2005 through 2007 in the five general medical journals with the highest impact factors: Annals of Internal Medicine, BMJ, JAMA, Lancet, and New England Journal of Medicine. All eligible studies reported a significant (P<0.05) primary patient-important outcome.

Methods
The investigators did several sensitivity analyses to evaluate the effect of varying assumptions about the outcomes of participants lost to follow-up on the estimate of effect for the primary outcome. Their analysis strategies were—

  • None of the participants lost to follow-up had the event
  • All the participants lost to follow-up had the event
  • None of those lost to follow-up in the treatment group had the event and all those lost to follow-up in the control group did (best case scenario)
  • All participants lost to follow-up in the treatment group had the event and none of those in the control group did (worst case scenario)
  • More plausible assumptions using various event rates, which the authors call “the event incidence”: The investigators performed sensitivity analyses using what they considered to be plausible ratios of event rates in the dropouts compared with the completers, with ratios of 1, 1.5, 2 and 3.5 in the intervention group relative to the control group (see Appendix 2 at the link at the end of this post below the reference). They chose an upper limit of 5 times the completers’ event rate, as it represents the highest ratio reported in the literature. (A minimal sketch of this kind of analysis follows this list.)
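
To make the approach concrete, here is a minimal sketch of this style of sensitivity analysis. The trial counts are hypothetical and the simple normal-approximation test is ours; the LOST-IT authors’ actual methods are described in their appendices:

```python
from math import erf, sqrt

def two_proportion_p(e1, n1, e2, n2):
    """Two-sided z-test p-value for a difference in proportions
    (normal approximation; fine for illustration, not for publication)."""
    p1, p2 = e1 / n1, e2 / n2
    pooled = (e1 + e2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))

# Hypothetical trial, per arm: events observed among completers,
# with 30 of 300 randomized subjects lost to follow-up in each arm.
tx_events, tx_completers, tx_n = 30, 270, 300
ctl_events, ctl_completers, ctl_n = 55, 270, 300

for ratio in (1, 1.5, 2, 3.5, 5):
    # Impute events among dropouts at `ratio` times the completer event
    # rate in the intervention arm and 1x in the control arm (expected
    # counts, so fractional events are fine for this illustration).
    tx_total = tx_events + (tx_events / tx_completers) * ratio * (tx_n - tx_completers)
    ctl_total = ctl_events + (ctl_events / ctl_completers) * (ctl_n - ctl_completers)
    print(f"event-rate ratio {ratio}: "
          f"p = {two_proportion_p(tx_total, tx_n, ctl_total, ctl_n):.3f}")
```

In this made-up example the result remains significant when dropouts are assumed to have the completers’ event rate (ratio 1) but loses significance as the assumed ratio in the intervention arm approaches 5, which is exactly the kind of fragility the review describes.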

Key Findings

  • Of the 235 eligible studies, 31 (13%) did not report whether or not loss to follow-up occurred.
  • In studies reporting the relevant information, the median percentage of participants lost to follow-up was 6% (interquartile range 2-14%).
  • The method by which loss to follow-up was handled was unclear in 37 studies (19%); the most commonly used method was survival analysis (66 studies, 35%).
  • When the investigators varied assumptions about loss to follow-up, results of 19% of trials were no longer significant if they assumed no participants lost to follow-up had the event of interest, 17% if they assumed that all participants lost to follow-up had the event, and 58% if they assumed a worst case scenario (all participants lost to follow-up in the treatment group and none of those in the control group had the event).
  • Under more plausible assumptions, in which the incidence of events in those lost to follow-up relative to those followed up was higher in the intervention group than in the control group, 0% to 33% of trials, depending upon which plausible assumptions were used (see Appendix 2 at the link at the end of this post below the reference), lost statistically significant differences in important endpoints.

Summary
When plausible assumptions are made about the outcomes of participants lost to follow-up in RCTs, this study reports that up to a third of positive findings in RCTs lose statistical significance. The authors recommend that investigators conducting individual RCTs and systematic reviews test their results against various reasonable assumptions (sensitivity analyses). Only when the results are robust under all reasonable assumptions should readers draw inferences from those study results.

For more information see the Delfini white paper on “missingness” at http://www.delfini.org/Delfini_WhitePaper_MissingData.pdf

Reference

1. Akl EA, Briel M, You JJ, et al. Potential impact on estimated treatment effects of information lost to follow-up in randomised controlled trials (LOST-IT): systematic review. BMJ 2012;344:e2809. doi: 10.1136/bmj.e2809 (published 18 May 2012). PMID: 19519891.

Article is freely available at—

http://www.bmj.com/content/344/bmj.e2809

Supplementary information is available at—

http://www.bmj.com/content/suppl/2012/05/18/bmj.e2809.DC1

For sensitivity analysis results tables, see Appendix 2 at—

http://www.bmj.com/highwire/filestream/585392/field_highwire_adjunct_files/1


Happy Valentine’s! A New Delfini Day Dawns & A Treat For You!

Our Blog
This is the official announcement of our blog!  We are in the midst of moving to a social media-mode.  Go to our website to get just-in-time updates to our blog and website as we dip our toes in these waters that are new for us…on a trial basis, FYI…we need to hear that you are using this information and sharing it to warrant our efforts. It’s a lotta work for us to keep all of this up!  Follow us on Twitter (easy instructions at our website to get just-in-time updates…next entry will be next Thursday, 2/16/2012).

Our Website
And we are retooling our website.  We are in the midst of this gargantuan effort.  (We have a very big site!  Whew!)  But we hope access will be even easier because of our changes.  Key pages have been converted, but many have not yet been, so page-to-page will look a little chaotic and not function the same until we are done: www.delfini.org.

Your Treat
And now, for a fantastic treat!  The generous, creative and brilliant Paul Vallett, PhC, had a frustrating day one day in his lab, and while we would never wish him that, we are grateful for his transmogrifying that evil event into something brilliant and wonderful that sooooo exquisitely captures what life often looks like on the outside, but what insiders actually experience and know.  We’ve all been here, right!!?!?? (Oh, yeah, like my web site redesign! It’s no cut-and-paste job.  I am now as bald as Mike!)  Paul’s is about one of our most beloved topics, “Doing Science,” but it can stand in as metaphor to so much more…to.so.much.more…  And it is hilarious!!!  The URL title gives you a hint.  Thank you, Paul!  Happy Day all, Sheri

http://electroncafe.wordpress.com/2011/05/04/scientific-process-rage/
