Another Study Warns That Evidence From Observational Studies Provides Unreliable Results For Therapies

Status

Another Study Warns That Evidence From Observational Studies Provides Unreliable Results For Therapies

We have previously mentioned the enormous contributions made by John Ioannidis MD in the area of understanding the reliability of medical evidence. [Ioannidis, Delfini Blog, Giannakakis] We want to draw your attention to a recent publication dealing with the risks of relying on observational data for cause and effect conclusions. [Hemkens] In this recent study, Hemkens, Ioannidis and other colleagues assessed differences in mortality effect size reported in observational (routinely collected data [RCD]) studies as compared with results reported in RCTs.

Eligible RCD studies used propensity scores in an effort to address confounding bias in the observational studies. The authors  compared the results of RCD and RCTs. The analysis included only RCD studies conducted before any RCT was published on the same topic. They assessed the risk of bias for RCD studies and randomized controlled trials (RCTs) using The Cochrane Collaboration risk of bias tools.  The direction of treatment effects, confidence intervals and effect sizes (odds ratios) were compared between RCD studies and RCTs. The relative odds ratios were calculated across all pairs of RCD studies and trials.

The authors found that RCD studies systematically and substantially overestimated mortality benefits of medical treatments compared with subsequent trials investigating the same question. Overall, RCD studies reported significantly more favorable mortality estimates by a relative 31% than subsequent trials (summary relative odds ratio 1.31 (95% confidence interval 1.03 to 1.65; I2 (I square)=0%)).

These authors remind us yet again that If no randomized trials exist, clinicians and other decision-makers should not trust results from observational data from sources such as local or national databases, registries, cohort or case-control studies. 

References
Delfini Blog: http://delfini.org/blog/?p=292

Giannakakis IA, Haidich AB, Contopoulos-Ioannidis DG, Papanikolaou GN, Baltogianni MS, Ioannidis JP. Citation of randomized evidence in support of guidelines of therapeutic and preventive interventions. J Clin Epidemiol. 2002 Jun;55(6):545-55. PubMed PMID: 12063096.

Hemkens LG, Contopoulos-Ioannidis DG, Ioannidis JP. Agreement of treatment effects for mortality from routinely collected data and subsequent randomized trials: meta-epidemiological survey. BMJ. 2016 Feb 8;352:i493. doi: 10.1136/bmj.i493. PubMed PMID: 26858277.

Ioannidis JPA. Why Most Published Research Findings are False. PLoS Med 2005; 2(8):696-701 PMID: 16060722

 

Facebook Twitter Linkedin Digg Delicious Reddit Stumbleupon Tumblr Email

“Reading” a Clinical Trial Won’t Get You There

Status

“Reading” a Clinical Trial Won’t Get You There—or Let’s Review (And Apply) Some Basics About Assessing The Validity of Medical Research Studies Claiming Superiority for Efficacy of Therapies

An obvious question raised by the title is, “Get you where?” Well, the answer is, “To where you know it is reasonable to think you can trust the results of the study you have just finished reading.” In this blog, our focus is on how to critically appraise medical research studies which claim superiority for efficacy of a therapy.

Because of Lack of Understanding Medical Science Basics, People May Be Injured or Die

Understanding basic requirements for valid medical science is very important. Numbers below are estimates, but are likely to be close or understated—

  1. Over 63,000 people with heart disease died after taking encainide or flecainide because many doctors thought taking these drugs “made biological sense,” but did not understand the simple need for reliable clinical trial information to confirm what seemed to “make sense” [Echt 91].
  2. An estimated 60,000 people in the United States died and another 140,000 experienced a heart attack resulting from the use of a nonsteroidal anti-inflammatory drug despite important benefit and safety information reported in the abstract of the pivotal trial used for FDA approval [Graham].
  3. In another example, roughly 42,000 women with advanced breast cancer suffered excruciating side effects without any proof of benefit, many of them dying as a result, and at a cost of $3.4 billion dollars [Mello].
  4. At least 64 deaths out of 751 cases in nearly half the United States were linked to fungal meningitis thought to be caused by a contaminated treatment that is used for back and radicular pain—but there is no reliable scientific evidence of benefit from that treatment [CDC].

In the above instances, these were preventable deaths and harms—from common treatments—which patients might have avoided if their physicians had better understood the importance and methods of evaluating medical science.

Failures to Understand Medical Science Basics

Many health care professionals don’t know how to quickly assess a trial for reliability and clinical usefulness—and yet mastering the basics is not difficult. Over the years, we have given a pre-test of 3 simple questions to more than a thousand physicians, pharmacists and others who have attended our training programs. Approximately 70% fail—”failure” being defined as missing 2 or 3 of the questions.

One pre-test question is designed to see if people recognize the lack of a comparison group in a report of the “effectiveness” of a new treatment. Without a comparison group of people with similar prognostic characteristics who are treated exactly the same except for the intervention under study, you cannot discern cause and effect of an intervention because a difference between groups may explain or affect the results.

A second pre-test question deals with presenting results as relative risk reduction (RRR) without absolute risk reduction (ARR) or event rates in the study groups. A “relative” measure raises the question, “Relative to what?” Is the reported RRR in our test question 60 percent of 100 percent? Or 60 percent of 1 percent?

The last of our pre-test questions assesses attendees’ basic understanding of only one of the two requirements to qualify as an Intention-to-Treat (ITT) analysis. The two requirements are that people should be randomized as analyzed and that all people should be included in the analysis whether they have discontinued, are missing or have crossed over to other treatment arms. The failure rate at knowing this last requirement is very high. (We will add that this last requirement means that a value has to be assigned if one is missing—and so, one of the most important aspects of critically appraising an ITT analysis is the evaluation of the methods for “imputing” missing data.)

By the end of our training programs, success rates have always markedly improved. Others have reported similar findings.

There is a Lot of  Science + Much of It May Not Be Reliable
Each week more than 13,000 references are added to the world’s largest library—the National Library of Medicine (NLM). Unfortunately, many of these studies are seriously flawed. One large review of 60,352 studies reported that only 7 percent passed criteria of high quality methods and clinical relevancy [McKibbon]. We and others have estimated that up to (and maybe more than) 90% of the published medical information that health care professionals rely on is flawed [Freedman, Glasziou].

Bias Distorts Results
We cannot know if an intervention is likely to be effective and safe without critically appraising the evidence for validity and clinical usefulness. We need to evaluate the reliability of medical science prior to seriously considering the reported therapeutic results because biases such as lack of or inadequate randomization, lack of successful blinding or other threats to validity—which we will describe below—can distort reported result by up to 50 percent or more [see Risk of Bias References].

Patients Deserve Better
Patients cannot make informed choices regarding various interventions without being provided with quantified projections of benefits and harms from valid science.

Some Simple Steps To Critical Appraisal
Below is a short summary of our simplified approach to critically appraising a randomized superiority clinical trial. Our focus is on “internal validity” which means “closeness to truth” in the context of the study. “External validity” is about the likelihood of reaching truth outside of the study context and requires judgment about issues such as fit with individuals or populations in circumstances other than those in the trial.

You can review and download a wealth of freely available information at our website at www.delfini.org including checklists and tools at http://www.delfini.org/delfiniTools.htm which can provide you with much greater information. Most relevant to this blog is our short critical appraisal checklist which you can download here—http://www.delfini.org/Delfini_Tool_StudyValidity_Short.pdf

The Big Questions
In brief, your overarching questions are these:

  1. Is reading this study worth my time? If the results are true, would they change my practice? Do they apply to my situation? What is the likely impact to my patients
  2. Can anything explain the results other than cause and effect? Evaluate the potential for results being distorted by bias (anything other than chance leading away from the truth) or random chance effects.
  3. Is there any difference between groups other than what is being studied? This is automatically a bias.
  4. If the study appears to be valid, but attrition is high, sometimes it is worth asking, what conditions would need to be present for attrition to distort the results? Attrition does not always distort results, but may obscure a true difference due to the reduction in sample size.

Evaluating Bias

There are four stages of a clinical trial, and you should ask several key questions when evaluating bias in each of the 4 stages.

  1. Subject Selection & Treatment Assignment—Evaluation of Selection Bias

Important considerations include how were subjects selected for study, were there enough subjects, how were they assigned to their study groups, and were the groups balanced in terms of prognostic variables?

Your critical appraisal to-do list includes—

a) Checking to see if the randomization sequence was generated in an acceptable manner. (Minimization may be an acceptable alternative.)

b) Determining if the investigators adequately concealed the allocation of subjects to each study group? Meaning, is the method for assigning treatment hidden so that an investigator cannot manipulate the assignment of a subject to a selected study group?

c) Examining the table of baseline characteristics to determine whether randomization was likely to have been successful, i.e., that the groups are balanced in terms of important prognostic variables (e.g., clinical and demographic variables).

  1. The Intervention & Context—Evaluation of Performance Bias

What is being studied, and what is it being compared to? Was the intervention likely to have been executed successfully? Was blinding likely to have been successful? Was duration reasonable for treatment as well as for follow-up? Was adherence reasonable? What else happened to study subjects in the course of the study such as use of co-interventions? Were there any differences in how subjects in the groups were treated?

Your to-do list includes evaluating:

a) Adequacy of blinding of subjects and all working with subjects and their data—including likely success of blinding;

b) Subjects’ adherence to treatment;

c) Inter-group differences in treatment or care except for the intervention(s) being studied.

  1. Data Collection & Loss of Data—Evaluation of Attrition Bias

What information was collected, and how was it collected? What data are missing and is it likely that missing data could meaningfully distort the study results?

Your to-do list includes evaluating—

a) Measurement methods (e.g., mechanisms, tools, instruments, means of administration, personnel issues, etc.)

b) Classification and quantification of missing data in each group (e.g., discontinuations due to ADEs, unrelated deaths, protocol violations, loss to follow-up, etc.)

c) Whether missing data are likely to distort the reported results? This is the area that the evidence on the distorting risk of bias provides the least help. And so, again, often it is worthwhile asking, “What conditions would need to be present for attrition to distort the results?”

  1. Results & Assessing The Differences In The Outcomes Of The Study Groups—Evaluating Assessment Bias

Were outcome measures reasonable, pre-specified and analyzed appropriately? Was reporting selective? How was safety assessed? Remember that models are not truth.

Your to-do list includes evaluating—

a) Whether assessors were blinded.

b) How the effect size was calculated (e.g., absolute risk reduction, relative risk, etc.). You especially want to know benefit or risk with and without treatment.

c) Were confidence intervals included? (You can calculate these yourself online, if you wish. See our web links at our website for suggestions.)

d) For dichotomous variables, was a proper intention-to-treat (ITT) analysis conducted with a reasonable choice for imputing values for missing data?

e) For time-to-event trials, were censoring rules unbiased? Were the number of censored subjects reported?

After you have evaluated a study for bias and chance and have determined that the study is valid, the study results should be evaluated for clinical meaningfulness, (e.g., the amount of clinical benefit and the potential for harm).  Clinical outcomes include morbidity; mortality; symptom relief; physical, mental and emotional functioning; and, quality of life—or any surrogate outcomes that have been demonstrated in valid studies to affect a clinical outcome.

Final Comment

It is not difficult to learn how to critically appraise a clinical trial. Health care providers owe it to their patients to gain these skills. Health care professionals cannot rely on abstracts and authors’ conclusions—they must assess studies first for validity and second for clinical usefulness.  Authors are often biased, even with the best of intentions. Remember that authors’ conclusions are opinions, not evidence. Authors frequently use misleading terms or draw misleading conclusions. Physicians and others who lack critical appraisal skills are often mislead by authors’ conclusions and summary statements. Critical appraisal knowledge is required to evaluate the validity of a study which must be done prior to seriously considering reported results.

For those who wish to go more deeply, we have books available and do training seminars. See our website at www.delfini.org.

Risk of Bias References

  1. Juni P, Altman DG, Egger M (2001) Systematic reviews in health care: assessing the quality of controlled clinical trials. BMJ 2001;323: 42-6. PubMed PMID: 11440947.
  2. Juni P, Witschi A, Bloch R, Egger M. The hazards of scoring the quality of clinical trials for meta-analysis. JAMA. 1999 Sep 15;282( 11): 1054-60. PubMed PMID: 10493204.
  3. Kjaergard LL, Villumsen J, Gluud C. Reported methodological quality and discrepancies between large and small randomized trials in metaanalyses. Ann Intern Med 2001;135: 982– 89. PMID 11730399.
  4. Moher D, Pham B, Jones A, Cook DJ, Jadad AR, Moher M, Tugwell P, Klassen TP. Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet. 1998 Aug 22;352( 9128): 609-13. PubMed PMID: 9746022.
  5. Poolman RW, Struijs PA, Krips R, Inger N. Sierevelt IN, et al. (2007) Reporting of outcomes in orthopaedic randomized trials: Does blinding of outcome assessors matter? J Bone Joint Surg Am. 89: 550– 558. PMID 17332104.
  6. Savovic J, Jones HE, Altman DG, et al. Influence of Reported Study Design Characteristics on Intervention Effect Estimates From Randomized, Controlled Trials. Ann Intern Med. 2012 Sep 4. doi: 10.7326/ 0003-4819-157-6-201209180-00537. [Epub ahead of print] PubMed PMID: 22945832.
  7. van Tulder MW, Suttorp M, Morton S, et al. Empirical evidence of an association between internal validity and effect size in randomized controlled trials of low-back pain. Spine (Phila Pa 1976). 2009 Jul 15;34( 16): 1685-92. PubMed PMID: 19770609.

Other References

  1. CDC: http://www.cdc.gov/HAI/outbreaks/meningitis.html
  2. Echt DS, Liebson PR, Mitchell LB, Peters RW, Obias-Manno D, Barker AH, Arensberg D, Baker A, Friedman L, Greene HL, et al. Mortality and morbidity in patients receiving encainide, flecainide, or placebo. The Cardiac Arrhythmia Suppression Trial. N Engl J Med. 1991 Mar 21;324(12):781-8. PubMed PMID: 1900101.
  3. Freedman, David H. Lies, Damn Lies and Bad Medical Science. The Atlantic. November, 2010. www.theatlantic.com/ magazine/ archive/ 2010/ 11/ lies-damned-lies-and-medical-science/ 8269/, accessed 11/ 07/ 2010.
  4. Glasziou P. The EBM journal selection process: how to find the 1 in 400 valid and highly relevant new research articles. Evid Based Med. 2006 Aug; 11( 4): 101. PubMed PMID: 17213115.
  5. Graham Natural News: http://www.naturalnews.com/011401_Dr_David_Graham_the_FDA.html
  6. McKibbon KA, Wilczynski NL, Haynes RB. What do evidence-based secondary journals tell us about the publication of clinically important articles in primary health care journals? BMC Med. 2004 Sep 6;2: 33. PubMed PMID: 15350200.
  7. Mello MM, Brennan TA. The controversy over high-dose chemotherapy with autologous bone marrow transplant for breast cancer. Health Aff (Millwood). 2001 Sep-Oct;20(5):101-17. PubMed PMID: 11558695.
Facebook Twitter Linkedin Digg Delicious Reddit Stumbleupon Tumblr Email

Sounding the Alarm (Again) in Oncology

Status

Sounding the Alarm (Again) in Oncology

Five years ago Fojo and Grady sounded the alarm about value in many of the new oncology drugs [1]. They raised the following issues and challenged oncologists and others to get involved in addressing these issues:

  • There is a great deal of uncertainty and confusion about what constitutes a benefit in cancer therapy; and,
  • How much should cost factor into these deliberations?

The authors review a number of oncology drug studies reporting increased overall survival (OS) ranging from a median of a few days to a few months with total new drug costs ranging from $15,000 to $90,000 plus. In some cases, there is no increase in OS, but only progression free survival (PFS) which is a weaker outcome measure due to its being prone to tumor assessment biases and is frequently assessed in studies of short duration. Adverse events associated with the new drugs are many and include higher rates of febrile neutropenia, infusion-related reactions, diarrhea, skin toxicity, infections, hypertension and other adverse events.

Fojo and Grady point out that—

“Many Americans would likely not regard a 1.2-month survival advantage as ‘significant’ progress, the much revered P value notwithstanding. But would an individual patient agree? Although we lack the answer to this question, we would suggest that the death of a mother of four at age 37 years would be no less painful were it to occur at age 37 years and 1 month, nor would the passing of a 67-year-old who planned to travel after retiring be any less difficult for the spouse were it to have occurred 1 month later.”

In a recent article [2] (thanks to Dr. Richard Lehman for drawing our attention to this article in his wonderful BMJ blog) Fojo and colleagues again point out that—

  • Cancer is the number one cause of mortality worldwide, and cancer cases are projected to rise by 75% over the next 2 decades.
  • Of the 71 therapies for solid tumors receiving FDA approval from 2002 to 2014, only 30 of the 71 approvals (42%) met the American Society of Clinical Oncology Cancer Research Committee’s “low hurdle” criteria for clinically meaningful improvement. Further, the authors tallied results from all the studies and reported very modest collective median gains of 2.5 months for PFS and 2.1 months for OS. Numerous surveys have indicated that patients expect much more.
  • Expensive therapies are stifling progress by (1) encouraging enormous expenditures of time, money, and resources on marginal therapeutic indications; and, (2) promoting a me-too mentality that is stifling innovation and creativity.

The last bullet needs a little explaining. The authors provide a number of examples of “safe bets” and argue that revenue from such safe and profitable therapies rather than true need has been a driving force for new oncology drugs. The problem is compounded by regulations—e.g., rules which require Medicare to reimburse patients for any drug used in an “anti-cancer chemotherapeutic regimen”—regardless of its incremental benefit over other drugs—as long as the use is “for a medically accepted indication” (commonly interpreted as “approved by the FDA”). This provides guaranteed revenues for me-too drugs irrespective of their marginal benefits. The authors also point out that when prices for drugs of proven efficacy fall below a certain threshold, suppliers often stop producing the drug, causing severe shortages.

What can be done? The authors acknowledge several times in their commentary that the spiraling cost of cancer therapies has no single villain; academia, professional societies, scientific journals, practicing oncologists, regulators, patient advocacy groups and the biopharmaceutical industry—all bear some responsibility. [We would add to this list physicians, P&T committees and any others who are engaged in treatment decisions for patients. Patients are not on this list (yet) because they are unlikely to really know the evidence.] This is like many other situations when many are responsible—often the end result is that “no one” takes responsibility. Fojo et al. close by making several suggestions, among which are—

  1. Academicians must avoid participating in the development of marginal therapies;
  2. Professional societies and scientific journals must raise their standards and not spotlight marginal outcomes;
  3. All of us must also insist on transparency and the sharing of all published data in a timely and enforceable manner;
  4. Actual gains of benefit must be emphasized—not hazard ratios or other measures that force readers to work hard to determine actual outcomes and benefits and risks;
  5. We need cooperative groups with adequate resources to provide leadership to ensure that trials are designed to deliver meaningful outcomes;
  6. We must find a way to avoid paying premium prices for marginal benefits; and,
  7. We must find a way [federal support?] to secure altruistic investment capital.

Delfini Comment
While the authors do not make a suggestion for specific responsibilities or actions on the part of the FDA, they do make a recommendation that an independent entity might create uniform measures of benefits for each FDA-approved drug—e.g., quality-adjusted life-years. We think the FDA could go a long way in improving this situation.

And so, as pointed out by Fojo et al., only small gains have been made in OS over the past 12 years, and costs of oncology drugs have skyrocketed. However, to make matters even worse than portrayed by Fojo et al., many of the oncology drug studies we see have major threats to validity (e.g., selection bias, lack of blinding and other performance biases, attrition and assessment bias, etc.) raising the question, “Does the approximate 2 month gain in median OS represent an overestimate?” Since bias tends to favor the new intervention in clinical trials, the PFS and OS reported in many of the recent oncology trials may be exaggerated or even absent or harms may outweigh benefits. On the other hand, if a study is valid, since a median is a midpoint in a range of results and a patient may achieve better results than indicated by the median, some patients may choose to accept a new therapy. The important thing is that patients are given information on benefits and harms in a way that allows them to have a reasonable understanding of all the issues and make the choices that are right for them.

Resources & References

Resource

  1. The URL for Dr. Lehman’s Blog is—
    http://blogs.bmj.com/bmj/category/richard-lehmans-weekly-review-of-medical-journals/
  2. The URL for his original blog entry about this article is—
    http://blogs.bmj.com/bmj/2014/11/24/richard-lehmans-journal-review-24-november-2014/

References

  1. Fojo T, Grady C. How much is life worth: cetuximab, non-small cell lung cancer, and the $440 billion question. J Natl Cancer Inst. 2009 Aug 5;101(15):1044-8. Epub 2009 Jun 29. PMID: 19564563
  2. Fojo T, Mailankody S, Lo A. Unintended Consequences of Expensive Cancer Therapeutics-The Pursuit of Marginal Indications and a Me-Too Mentality That Stifles Innovation and Creativity: The John Conley Lecture. JAMA Otolaryngol Head Neck Surg. 2014 Jul 28. doi: 10.1001/jamaoto.2014.1570. [Epub ahead of print] PubMed PMID: 25068501.
Facebook Twitter Linkedin Digg Delicious Reddit Stumbleupon Tumblr Email

Involving Patients in Their Care Decisions and JAMA Editorial: The New Cholesterol and Blood Pressure Guidelines: Perspective on the Path Forward

Status

Involving Patients in Their Care Decisions and JAMA Editorial: The New Cholesterol and Blood Pressure Guidelines: Perspective on the Path Forward

Krumholz HM. The New Cholesterol and Blood Pressure Guidelines: Perspective on the Path Forward. JAMA. 2014 Mar 29. doi: 10.1001/jama.2014.2634. [Epub ahead of print] PubMed PMID: 24682222.

http://jama.jamanetwork.com/article.aspx?articleid=1853201

Here is an excellent editorial that highlights the importance of patient decision-making.  We thank the wonderful Dr. Richard Lehman, MA, BM, BCh, Oxford, & Blogger, BMJ Journal Watch, for bringing this to our attention. [Note: Richard’s wonderful weekly review of medical journals—informative, inspiring and oh so droll—is here.]

We have often observed that evidence can be a neutralizing force. This editorial highlights for us that this means involving the patient in a meaningful way and finding ways to support decisions based on patients’ personal requirements. These personal “patient requirements” include health care needs and wants and a recognition of individual circumstances, values and preferences.

To achieve this, we believe that patients should receive the same information as clinicians including what alternatives are available, a quantified assessment of potential benefits and harms of each including the strength of evidence for each and potential consequences of making various choices including things like vitality and cost.

Decisions may differ between patients, and physicians may make incorrect assumption about what most matters to patients of which there are many examples in the literature such as in the citations below.

O’Connor A. Using patient decision aids to promote evidence-based decision making. ACP J Club. 2001 Jul-Aug;135(1):A11-2. PubMed PMID: 11471526.

O’Connor AM, Wennberg JE, Legare F, Llewellyn-Thomas HA,Moulton BW, Sepucha KR, et al. Toward the ‘tipping point’: decision aids and informed patient choice. Health Affairs 2007;26(3):716-25.

Rothwell PM. External validity of randomised controlled trials: “to whom do the results of this trial apply?”. Lancet. 2005 Jan 1-7;365(9453):82-93. PubMed PMID: 15639683.

Stacey D, Bennett CL, Barry MJ, Col NF, Eden KB, Holmes-Rovner M, Llewellyn-Thomas H, Lyddiatt A, Légaré F, Thomson R. Decision aids for people facing health treatment or screening decisions. Cochrane Database Syst Rev. 2011 Oct 5;(10):CD001431. Review. PubMed PMID: 21975733.

Wennberg JE, O’Connor AM, Collins ED, Weinstein JN. Extending the P4P agenda, part 1: how Medicare can improve patient decision making and reduce unnecessary care. Health Aff (Millwood). 2007 Nov-Dec;26(6):1564-74. PubMed PMID: 17978377.

Facebook Twitter Linkedin Digg Delicious Reddit Stumbleupon Tumblr Email

Why Statements About Confidence Intervals Often Result in Confusion Rather Than Confidence

Status

Why Statements About Confidence Intervals Often Result in Confusion Rather Than Confidence

A recent paper by McCormack reminds us that authors may mislead readers by making unwarranted “all-or-none” statements and that readers should be mindful of this and carefully examine confidence intervals.

When examining results of a valid study, confidence intervals (CIs) provide much more information than p-values. The results are statistically significant if a confidence interval does not touch the line of no difference (zero in the case of measures of outcomes expressed as percentages such as absolute risk reduction and relative risk reduction and 1 in the case of ratios such as relative risk and odds ratios). However, in addition to providing information about statistical significance, confidence intervals also provide a plausible range for possibly true results within a margin of chance (5 percent in the case of a 95% CI). While the actual calculated outcome (i.e., the point estimate) is “the most likely to be true” result within the confidence interval, having this range enables readers to judge, in their opinion, if statistically significant results are clinically meaningful.

However, as McCormack points out, authors frequently do not provide useful interpretation of the confidence intervals, and authors at times report different conclusions from similar data. McCormack presents several cases that illustrate this problem, and this paper is worth reading.

As an illustration, assume two hypothetical studies report very similar results. In the first study of drug A versus drug B, the relative risk for mortality was 0.9, 95% CI (0.80 to 1.05). The authors might state that there was no difference in mortality between the two drugs because the difference is not statistically significant. However, the upper confidence interval is close to the line of no difference and so the confidence interval tells us that it is possible that a difference would have been found if more people were studied, so that statement is misleading. A better statement for the first study would include the confidence intervals and a neutral interpretation of what the results for mortality might mean. Example—

“The relative risk for overall mortality with drug A compared to placebo was 0.9, 95% CI (0.80 to 1.05). The confidence intervals tell us that Drug A may reduce mortality by up to a relative 20% (i.e., the relative risk reduction), but may increase mortality, compared to Drug B, by approximately 5%.”

In a second study with similar populations and interventions, the relative risk for mortality might be 0.93, 95% CI (0.83 to 0.99). In this case, some authors might state, “Drug A reduces mortality.” A better statement for this second hypothetical study would ensure that the reader knows that the upper confidence interval is close to the line of no difference and, therefore, is close to non-significance. Example—

“Although the mortality difference is statistically significant, the confidence interval indicates that the relative risk reduction may be as great as 17% but may be as small as 1%.”

The Bottom Line

  1. Remember that p-values refer only to statistical significance and confidence intervals are needed to evaluate clinical significance.
  2. Watch out for statements containing the words “no difference” in the reporting of study results. A finding of no statistically significant difference may be a product of too few people studied (or insufficient time).
  3. Watch out for statements implying meaningful differences between groups when one of the confidence intervals approaches the line of no difference.
  4. None of this means anything unless the study is valid. Remember that bias tends to favor the intervention under study.

If authors do not provide you with confidence intervals, you may be able to compute them yourself, if they have supplied you with sufficient data, using an online confidence interval calculator. For our favorites, search “confidence intervals” at our web links page: http://www.delfini.org/delfiniWebSources.htm

Reference

McCormack J, Vandermeer B, Allan GM. How confidence intervals become confusion intervals. BMC Med Res Methodol. 2013 Oct 31;13(1):134. [Epub ahead of print] PubMed PMID: 24172248.

Facebook Twitter Linkedin Digg Delicious Reddit Stumbleupon Tumblr Email

Can Clinical Guidelines be Trusted?

Status

Can Clinical Guidelines be Trusted?

In a recent BMJ article, “Why we can’t trust clinical guidelines,” Jeanne Lenzer raises a number of concerns regarding clinical guidelines[1]. She begins by summarizing the conflict between 1990 guidelines recommending steroids for acute spinal injury versus 2013 cllinical recommendations against using steroids in acute spinal injury. She then asks, “Why do processes intended to prevent or reduce bias fail?

Her proposed answers to this question include the following—

  • Many doctors follow guidelines, even if not convinced about the recommendations, because they fear professional censure and possible harm to their careers.
    • Supporting this, she cites a poll of over 1000 neurosurgeons which showed that—
      • Only 11% believed the treatment was safe and effective.
      • Only 6% thought it should be a standard of care.
      • Yet when asked if they would continue prescribing the treatment, 60% said that they would. Many cited a fear of malpractice if they failed to follow “a standard of care.” (Note: the standard of care changed in March 2013 when the Congress of Neurological Surgeons stated there was no high quality evidence to support the recommendation.)
  • Clinical guideline chairs and participants frequently have financial conflicts.
    • The Cochrane reviewer for the 1990 guideline she references had strong ties to industry.

Delfini Comment

  • Fear-based Decision-making by Physicians

We believe this is a reality. In our work with administrative law judges, we have been told that if you “run with the pack,” you better be right, and if you “run outside the pack,” you really better be right. And what happens in court is not necessarily true or just. The solution is better recommendations constructed from individualized, thoughtful decisions based on valid critically appraised evidence found to be clinically useful, patient preferences and other factors. The important starting place is effective critical appraisal of the evidence.

  • Financial Conflicts of Interest & Industry Influence

It is certainly true that money can sway decisions, be it coming from industry support or potential for income. However, we think that most doctors want to do their best for patients and try to make decisions or provide recommendations with the patient’s best interest in mind. Therefore, we think this latter issue may be more complex and strongly affected in both instances by the large number of physicians and others involved in health care decision-making who 1) do not understand that many research studies are not valid or reported sufficiently to tell; and, 2) lack the skills to be able to differentiate reliable studies from those which may not be reliable.

When it comes to industry support, one of the variables traveling with money includes greater exposure to information through data or contacts with experts supporting that manufacturer’s products. We suspect that industry influence may be less due to financial incentives than this exposure coupled with lack of critical appraisal understanding. As such, we wrote a Letter to the Editor describing our theory that the major problem of low quality guidelines might stem from physicians’ and others’ lack of competency in evaluating the quality of the evidence. Our response is reproduced here.

Delfini BMJ Rapid Response [2]:

We (Delfini) believe that we have some unique insight into how ties to industry may result in advocacy for a particular intervention due to our extensive experience training health care professionals and students in critical appraisal of the medical literature. We think it is very possible that the outcomes Lenzer describes are less due to financial influence than are due to lack of knowledge. The vast majority of physicians and other health care professionals do not have even rudimentary skills in identifying science that is at high to medium risk of bias or understand when results may have a high likelihood of being due to chance. Having ties to industry would likely result in greater exposure to science supporting a particular intervention.

Without the ability to evaluate the quality of the science, we think it is likely that individuals would be swayed and/or convinced by that science. The remedy for this and for other problems with the quality of clinical guidelines is ensuring that all guideline development members have basic critical appraisal skills and there is enough transparency in guidelines so that appraisal of a guideline and the studies utilized can easily be accomplished.

References

1. Lenzer J. Why we can’t trust clinical guidelines. BMJ 2013; 346:f3830

2. Strite SA, Stuart M. BMJ Rapid Response: Why we can’t trust clinical guidelines. BMJ 2013;346:f3830; http://www.bmj.com/content/346/bmj.f3830/rr/651876

Facebook Twitter Linkedin Digg Delicious Reddit Stumbleupon Tumblr Email

Webinar: “Using Real-World Data & Published Evidence in Pharmacy Quality Improvement Activities”

Status

“Using Real-World Data & Published Evidence in Pharmacy Quality Improvement Activities”

On Monday, May 20, 2013, we presented a webinar on “Using Real-World Data & Published Evidence in Pharmacy Quality Improvement Activities” for the member organizations of the Alliance of Community Health Plans (ACHP).

The 80-minute discussion addressed four topic areas, all of which have unique critical appraisal challenges. Webinar goals were to discuss issues that arise when conducting quality improvement efforts using real world data, such as data from claims, surveys and observational studies and other published healthcare evidence.

Key pitfalls were cherry picked for these four mini-seminars—

  • Pitfalls to avoid when using real-world data, dealing with heterogeneity, confounding-by-indication and causality.
  • Key issues in evaluating oncology studies — outcome issues and focus on how to address large attrition rates.
  • Important issues when conducting comparative safety reviews — assessing patterns through use of RCTs, systematic reviews, observational studies and registries.
  • Key issues in evaluating studies employing Kaplan-Meier estimates — time-to-event basics with attention to the important problem of censoring.

A recording of the webinar is available at—

https://achp.webex.com/achp/lsr.php?AT=pb&SP=TC&rID=45261732&rKey=1475c8c3abed8061&act=pb

Facebook Twitter Linkedin Digg Delicious Reddit Stumbleupon Tumblr Email

Review of Endocrinology Guidelines

Status

Review of Endocrinology Guidelines

Decision-makers frequently rely on the body of pertinent research in making decisions regarding clinical management decisions. The goal is to critically appraise and synthesize the evidence before making recommendations, developing protocols and making other decisions. Serious attention is paid to the validity of the primary studies to determine reliability before accepting them into the review.  Brito and colleagues have described the rigor of systematic reviews (SRs) cited from 2006 until January 2012 in support of the clinical practice guidelines put forth by the Endocrine Society using the Assessment of Multiple Systematic Reviews (AMSTAR) tool [1].

The authors included 69 of 2817 studies. These 69 SRs had a mean AMSTAR score of 6.4 (standard deviation, 2.5) of a maximum score of 11, with scores improving over time. Thirty five percent of the included SRs were of low-quality (methodological AMSTAR score 1 or 2 of 5, and were cited in 24 different recommendations). These low quality SRs were the main evidentiary support for five recommendations, of which only one acknowledged the quality of SRs.

The authors conclude that few recommendations in field of endocrinology are supported by reliable SRs and that the quality of the endocrinology SRs is suboptimal and is currently not being addressed by guideline developers. SRs should reliably represent the body of relevant evidence.  The authors urge authors and journal editors to pay attention to bias and adequate reporting.

Delfini note: Once again we see a review of guideline work which suggests using caution in accepting clinical recommendations without critical appraisal of the evidence and knowing the strength of the evidence supporting clinical recommendations.

1. Brito JP, Tsapas A, Griebeler ML, Wang Z, Prutsky GJ, Domecq JP, Murad MH, Montori VM. Systematic reviews supporting practice guideline recommendations lack protection against bias. J Clin Epidemiol. 2013 Jun;66(6):633-8. doi: 10.1016/j.jclinepi.2013.01.008. Epub 2013 Mar 16. PubMed PMID: 23510557.

Facebook Twitter Linkedin Digg Delicious Reddit Stumbleupon Tumblr Email

Review of Bias In Diabetes Randomized Controlled Trials

Status

Review of Bias In Diabetes Randomized Controlled Trials

Healthcare professionals must evaluate the internal validity of randomized controlled trials (RCTs) as a first step in the process of considering the application of clinical findings (results) for particular patients. Bias has been repeatedly shown to increase the likelihood of distorted study results, frequently favoring the intervention.

Readers may be interested in a new systematic review of diabetes RCTs. Risk of bias (low, unclear or high) was assessed in 142 trials using the Cochrane Risk of Bias Tool.  Overall, 69 trials (49%) had at least one out of seven domains with high risk of bias. Inadequate reporting frequently hampered the risk of bias assessment: the method of producing the allocation sequence was unclear in 82 trials (58%) and allocation concealment was unclear in 78 trials (55%). There were no significant reductions in the proportion of studies at high risk of bias over time nor in the adequacy of reporting of risk of bias domains. The authors conclude that these trials have serious limitations that put the findings in question and therefore inhibit evidence-based quality improvement (QI). There is a need to limit the potential for bias when conducting QI trials and improve the quality of reporting of QI trials so that stakeholders have adequate evidence for implementation. The entire freely-available study is available at—

http://bmjopen.bmj.com/content/3/4/e002727.long

Ivers NM, Tricco AC, Taljaard M, Halperin I, Turner L, Moher D, Grimshaw JM. Quality improvement needed in quality improvement randomised trials: systematic review of interventions to improve care in diabetes. BMJ Open. 2013 Apr 9;3(4). doi:pii: e002727. 10.1136/bmjopen-2013-002727. Print 2013. PubMed PMID: 23576000.

 

Facebook Twitter Linkedin Digg Delicious Reddit Stumbleupon Tumblr Email

Reliable Clinical Guidelines

Status

Reliable Clinical Guidelines—Great Idea, Not-Such-A-Great Reality

Although clinical guideline recommendations about managing a given condition may differ, guidelines are, in general, considered to be important sources for individual clinical decision-making, protocol development, order sets, performance measures and insurance coverage. The Institute of Medicine [IOM] has created important recommendations that guideline developers should pay attention to—

  1. Transparency;
  2.  Management of conflict of interest;
  3.  Guideline development group composition;
  4. How the evidence review is used to inform clinical recommendations;
  5.  Establishing evidence foundations for making strength of recommendation ratings;
  6. Clear articulation of recommendations;
  7. External review; and,
  8. Updating.

Investigators recently evaluated 114 randomly chosen guidelines against a selection from the IOM standards and found poor adherence [Kung 12]. The group found that the overall median number of IOM standards satisfied was only 8 out of 18 (44.4%) of those standards. They also found that subspecialty societies tended to satisfy fewer IOM methodological standards. This study shows that there has been no change in guideline quality over the past decade and a half when an earlier study found similar results [Shaneyfeld 99].  This finding, of course, is likely to have the effect of leaving end-users uncertain as to how to best incorporate clinical guidelines into clinical practice and care improvements.  Further, Kung’s study found that few guidelines groups included information scientists (individuals skilled in critical appraisal of the evidence to determine the reliability of the results) and even fewer included patients or patient representatives.

An editorialist suggests that currently there are 5 things we need [Ransohoff]. We need:

1. An agreed-upon transparent, trustworthy process for developing ways to evaluate clinical guidelines and their recommendations.

2. A reliable method to express the degree of adherence to each IOM or other agreed-upon standard and a method for creating a composite measure of adherence.

From these two steps, we must create a “total trustworthiness score” which reflects adherence to all standards.

3. To accept that our current processes of developing trustworthy measures is a work in progress. Therefore, stakeholders must actively participate in accomplishing these 5 tasks.

4. To identify an institutional home that can sustain the process of developing measures of trustworthiness.

5. To develop a marketplace for trustworthy guidelines. Ratings should be displayed alongside each recommendation.

At this time, we have to agree with Shaneyfeld who wrote an accompanying commentary to Kung’s study [Shaneyfeld 12]:

What will the next decade of guideline development be like? I am not optimistic that much will improve. No one seems interested in curtailing the out-of-control guideline industry. Guideline developers seem set in their ways. I agree with the IOM that the Agency for Healthcare Research and Quality (AHRQ) should require guidelines to indicate their adherence to development standards. I think a necessary next step is for the AHRQ to certify guidelines that meet these standards and allow only certified guidelines to be published in the National Guidelines Clearinghouse. Currently, readers cannot rely on the fact that a guideline is published in the National Guidelines Clearinghouse as evidence of its trustworthiness, as demonstrated by Kung et al. I hope efforts by the Guidelines International Network are successful, but until then, in guidelines we cannot trust.

References

1. IOM: Graham R, Mancher M, Wolman DM,  et al; Committee on Standards for Developing Trustworthy Clinical Practice Guidelines; Board on Health Care Services.  Clinical Practice Guidelines We Can Trust. Washington, DC: National Academies Press; 2011 http://www.nap.edu/catalog.php?record_id=13058

2. Kung J, Miller RR, Mackowiak PA. Failure of Clinical Practice Guidelines to Meet Institute of Medicine Standards: Two More Decades of Little, If Any, Progress. Arch Intern Med. 2012 Oct 22:1-6. doi: 10.1001/2013.jamainternmed.56. [Epub ahead of print] PubMed PMID: 23089902.

3.  Ransohoff DF, Pignone M, Sox HC. How to decide whether a clinical practice guideline is trustworthy. JAMA. 2013 Jan 9;309(2):139-40. doi: 10.1001/jama.2012.156703. PubMed PMID: 23299601.

4. Shaneyfelt TM, Mayo-Smith MF, Rothwangl J. Are guidelines following guidelines? The methodological quality of clinical practice guidelines in the peer-reviewed medical literature. JAMA. 1999 May 26;281(20):1900-5. PubMed PMID: 10349893.

5. Shaneyfelt T. In Guidelines We Cannot Trust: Comment on “Failure of Clinical Practice Guidelines to Meet Institute of Medicine Standards”. Arch Intern Med. 2012 Oct 22:1-2. doi: 10.1001/2013.jamainternmed.335. [Epub ahead of print] PubMed PMID: 23089851.

Facebook Twitter Linkedin Digg Delicious Reddit Stumbleupon Tumblr Email