Medical Literature Searching Update

We’ve updated our searching tips.  You can download our Searching the Medical Literature Tool, along with others freely available, at our library of Tools & Educational Materials by Delfini:

http://www.delfini.org/delfiniTools.htm

1. Quick Way To Find Drug Information On The FDA Site

If you are looking for information about a specific drug (e.g., a drug recently approved by the FDA), it may be faster to use Google to find the information you want: type “FDA [drug name]”.

2. Also see Searching With Symbols in the tool.

Why Statements About Confidence Intervals Often Result in Confusion Rather Than Confidence

A recent paper by McCormack and colleagues reminds us that authors may mislead readers by making unwarranted “all-or-none” statements and that readers should be mindful of this and carefully examine confidence intervals.

When examining the results of a valid study, confidence intervals (CIs) provide much more information than p-values. The results are statistically significant if the confidence interval does not touch the line of no difference (zero in the case of outcome measures expressed as percentages, such as absolute risk reduction and relative risk reduction, and 1 in the case of ratios, such as relative risk and odds ratios). In addition to providing information about statistical significance, however, confidence intervals also provide a plausible range for the possibly true results, within a margin of chance (5 percent in the case of a 95% CI). While the actual calculated outcome (i.e., the point estimate) is the “most likely to be true” result within the confidence interval, having this range enables readers to judge whether statistically significant results are clinically meaningful.
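
To make the arithmetic concrete, here is a minimal sketch (in Python, using made-up 2×2 counts that do not come from any study discussed in this post) of the common log-scale method for putting a 95% CI around a relative risk. An online calculator or statistical package will give essentially the same answer.

    import math

    # Hypothetical counts for illustration only (not from any study in this post)
    deaths_a, n_a = 90, 1000      # deaths and patients in the drug A group
    deaths_b, n_b = 100, 1000     # deaths and patients in the drug B group

    rr = (deaths_a / n_a) / (deaths_b / n_b)

    # Standard error of log(RR), then back-transform the 95% limits
    se_log_rr = math.sqrt(1/deaths_a - 1/n_a + 1/deaths_b - 1/n_b)
    lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
    upper = math.exp(math.log(rr) + 1.96 * se_log_rr)

    print(f"RR = {rr:.2f}, 95% CI ({lower:.2f} to {upper:.2f})")
    # If the interval includes 1 (the line of no difference for a ratio),
    # the result is not statistically significant at the 5% level.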

However, as McCormack points out, authors frequently do not provide useful interpretation of the confidence intervals, and authors at times report different conclusions from similar data. McCormack presents several cases that illustrate this problem, and this paper is worth reading.

As an illustration, assume two hypothetical studies report very similar results. In the first study of drug A versus drug B, the relative risk for mortality was 0.9, 95% CI (0.80 to 1.05). The authors might state that there was no difference in mortality between the two drugs because the difference is not statistically significant. However, the upper confidence limit only just crosses the line of no difference, and the interval tells us that a difference might well have been found had more people been studied, so that statement is misleading. A better statement for the first study would include the confidence interval and a neutral interpretation of what the results for mortality might mean. Example—

“The relative risk for overall mortality with drug A compared to drug B was 0.9, 95% CI (0.80 to 1.05). The confidence interval tells us that drug A may reduce mortality by up to a relative 20% (i.e., the relative risk reduction), but may also increase mortality by approximately a relative 5%.”

In a second study with similar populations and interventions, the relative risk for mortality might be 0.93, 95% CI (0.83 to 0.99). In this case, some authors might state, “Drug A reduces mortality.” A better statement for this second hypothetical study would ensure that the reader knows that the upper confidence limit is close to the line of no difference and, therefore, close to non-significance. Example—

“Although the mortality difference is statistically significant, the confidence interval indicates that the relative risk reduction may be as great as 17% but may be as small as 1%.”

The Bottom Line

  1. Remember that p-values refer only to statistical significance and confidence intervals are needed to evaluate clinical significance.
  2. Watch out for statements containing the words “no difference” in the reporting of study results. A finding of no statistically significant difference may be a product of too few people studied (or insufficient time).
  3. Watch out for statements implying meaningful differences between groups when one of the confidence intervals approaches the line of no difference.
  4. None of this means anything unless the study is valid. Remember that bias tends to favor the intervention under study.

If authors do not provide you with confidence intervals, you may be able to compute them yourself, if they have supplied you with sufficient data, using an online confidence interval calculator. For our favorites, search “confidence intervals” at our web links page: http://www.delfini.org/delfiniWebSources.htm
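
If you do end up computing an interval yourself, the normal-approximation formula for an absolute risk reduction is straightforward. Below is a minimal sketch using hypothetical event counts (for illustration only); a dedicated calculator or statistical package remains the safer choice for anything that matters.

    import math

    # Hypothetical event counts (for illustration only)
    events_control, n_control = 120, 600      # 20% event rate
    events_treated, n_treated = 90, 600       # 15% event rate

    p_control = events_control / n_control
    p_treated = events_treated / n_treated

    arr = p_control - p_treated               # absolute risk reduction
    se = math.sqrt(p_control * (1 - p_control) / n_control
                   + p_treated * (1 - p_treated) / n_treated)
    lower, upper = arr - 1.96 * se, arr + 1.96 * se

    print(f"ARR = {arr:.1%}, 95% CI ({lower:.1%} to {upper:.1%})")
    # An interval that crosses 0% (the line of no difference for a
    # difference measure) is not statistically significant.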

Reference

McCormack J, Vandermeer B, Allan GM. How confidence intervals become confusion intervals. BMC Med Res Methodol. 2013 Oct 31;13(1):134. [Epub ahead of print] PubMed PMID: 24172248.


Biostatistical Help for Critical Appraisers

Book Recommendation: Biostatistics for Dummies by John C. Pezzullo, PhD

We highly recommend this book.  In short—

  • An excellent resource
  • Useful to critical appraisers because it can help us understand why certain common statistical tests are used in studies
  • Provides a needed resource for answering questions about various tests
  • Helpful explanations
  • Written in a clear style with the goal of making difficult information accessible and understandable
  • Friendly style due to author’s wit and charm, and the reassurance he provides along the way

Read our full review here. Go to the Amazon page and full customer reviews here.


Time-related Biases

Time-related Biases Including Immortality Bias

We were recently asked about the term “immortality bias.” The easiest way to explain it is to start with an example. Imagine a study of hospitalized COPD patients undertaken to assess the impact of drug A, an inhaled corticosteroid preparation, on survival. In our first example, people are randomized either to receive a prescription for drug A post-discharge or not to receive a prescription. If someone in the drug A group dies prior to filling their prescription, they should be analyzed as randomized and, therefore, counted as a death in the drug A group even though they were never actually exposed to drug A.

Let’s imagine that drug A confers no survival advantage and that mortality for this population is 10 percent.  In a study population of 1,000 patients in each group, we would expect 100 deaths in each group. Let us say that 10 people in the drug A group died before they could receive their medication. If we did not analyze the unexposed people who died in group A as randomized, that would be 90 drug A deaths as compared to 100 comparison group deaths—making it falsely appear that drug A resulted in a survival advantage.
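
The arithmetic in this example is simple enough to sketch in a few lines of Python (using the hypothetical numbers above):

    # Hypothetical numbers from the example above: no true survival advantage,
    # 10% mortality, 1,000 patients per group, and 10 drug A deaths occurring
    # before the prescription was ever filled.
    n_per_group = 1000
    deaths_drug_a = 100          # includes the 10 patients who died before filling
    deaths_comparison = 100
    deaths_before_fill = 10

    # Correct, as-randomized (intention-to-treat) comparison
    risk_a = deaths_drug_a / n_per_group                       # 10.0%
    risk_comparison = deaths_comparison / n_per_group          # 10.0%

    # Biased comparison that drops the unexposed deaths from the drug A group
    biased_risk_a = (deaths_drug_a - deaths_before_fill) / n_per_group   # 9.0%

    print(f"As randomized:   {risk_a:.1%} vs {risk_comparison:.1%} (no difference)")
    print(f"Biased counting: {biased_risk_a:.1%} vs {risk_comparison:.1%} (false advantage for drug A)")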

If drug A actually works, the time that patients are not exposed to the drug works a little against the intervention (oh, yes, and do people actually take their drug?), but as bias tends to favor the intervention, this probably levels the playing field a bit—there is a reason why we talk about “closeness to truth” and “estimates of effect.”

“Immortality bias” is a risk when a study includes a time period (the “immortal” time or, when the outcome is something other than survival, the “immune” time) during which patients in one group cannot experience the event. To illustrate this, setting aside the myriad other biases that can plague observational studies (such as the potential for confounding through choice of treatment), let us compare the randomized controlled trial (RCT) we just described with a retrospective cohort study of the same question. In the observational study, we have to pick a time to start observing patients, and, because grouping is no longer decided by randomization, we also have to choose how patients are grouped for analysis.

For our example, let us say we start the clock on recording outcomes (death) at the date of discharge. Patients are then grouped for analysis by whether or not they filled a prescription for drug A within 90 days of discharge. Because “being alive” is a requirement for picking up a prescription, but not for being in the comparison group, the drug A group potentially receives a “survival advantage” if this bias is not taken into account in some way in the analysis.

In other words, by design, no deaths can occur in the drug A group prior to picking up a prescription. In the comparison group, however, death never gets an opportunity to “take a holiday,” as it were. If you die before getting a prescription, you are automatically counted in the comparison group; if you live and pick up your prescription, you are automatically counted in the drug A group. The outcome of “being alive” is therefore a prerequisite for being in the drug A group, and all deaths among people who have not filled a prescription within that 90-day window get counted in the comparison group. This is yet another example of how groups that differ, or are treated differently, in ways other than the factor being studied can bias outcomes.
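
A small simulation makes the mechanism easy to see. The sketch below (Python, with made-up rates and a drug that truly does nothing) classifies patients the way the flawed cohort design does and shows the manufactured survival advantage:

    import random

    random.seed(0)

    n_patients = 100_000
    follow_up_days = 365
    true_death_risk = 0.10        # assume drug A truly has no effect on mortality

    deaths_a = n_a = 0            # classified as "filled drug A within 90 days"
    deaths_comp = n_comp = 0      # classified as the comparison group

    for _ in range(n_patients):
        dies = random.random() < true_death_risk
        day_of_death = random.randint(1, follow_up_days) if dies else None
        plans_to_fill = random.random() < 0.5
        fill_day = random.randint(1, 90) if plans_to_fill else None

        # Immortal time: a patient is only classified as "drug A" if they
        # survive long enough to fill the prescription.
        filled = fill_day is not None and (day_of_death is None or day_of_death > fill_day)
        if filled:
            n_a += 1
            deaths_a += dies
        else:
            n_comp += 1
            deaths_comp += dies

    print(f"Drug A group mortality:     {deaths_a / n_a:.1%}")
    print(f"Comparison group mortality: {deaths_comp / n_comp:.1%}")
    # Both groups face an identical 10% true risk, yet the drug A group looks
    # safer because deaths occurring before the fill can only land in the
    # comparison group.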

Many readers will recognize the similarity between immortality bias and lead time bias. Lead time bias occurs when earlier detection of a disease, because of screening, makes it appear that the screening has conferred a survival advantage—when, in fact, the “greater length of time survived” is really an artifact resulting from the additional time counted between disease identification and when it would have been found if no screening had taken place.

Another instance in which a time-dependent bias can occur is in oncology studies when intermediate markers (e.g., tumor recurrence) are assessed only at the end of follow-up segments using Kaplan-Meier methodology. Recurrence may actually have occurred in some subjects at the beginning of a time segment rather than at the end.

It is always good to ask whether, in the course of the study, the passing of time could have had an impact on any of the outcomes.

Other Examples —

  • Might the population under study have significantly changed during the course of the trial?
  • Might the time period of the study affect study results (e.g., studying an allergy medication, but not during allergy season)?
  • Could awareness of adverse events affect future reporting of adverse events?
  • Could test timing or a gap in testing result in misleading outcomes (e.g., in studies comparing one test to another, might discrepancies have arisen in test results if patients’ status changed in between applying the two tests)?

All of these time-dependent biases can distort study results.


Can Clinical Guidelines be Trusted?

In a recent BMJ article, “Why we can’t trust clinical guidelines,” Jeanne Lenzer raises a number of concerns regarding clinical guidelines [1]. She begins by summarizing the conflict between 1990 guidelines recommending steroids for acute spinal injury and 2013 clinical recommendations against using steroids in acute spinal injury. She then asks, “Why do processes intended to prevent or reduce bias fail?”

Her proposed answers to this question include the following—

  • Many doctors follow guidelines, even if not convinced about the recommendations, because they fear professional censure and possible harm to their careers.
    • Supporting this, she cites a poll of over 1000 neurosurgeons which showed that—
      • Only 11% believed the treatment was safe and effective.
      • Only 6% thought it should be a standard of care.
      • Yet when asked if they would continue prescribing the treatment, 60% said that they would. Many cited a fear of malpractice if they failed to follow “a standard of care.” (Note: the standard of care changed in March 2013 when the Congress of Neurological Surgeons stated there was no high quality evidence to support the recommendation.)
  • Clinical guideline chairs and participants frequently have financial conflicts.
    • The Cochrane reviewer for the 1990 guideline she references had strong ties to industry.

Delfini Comment

  • Fear-based Decision-making by Physicians

We believe this is a reality. In our work with administrative law judges, we have been told that if you “run with the pack,” you had better be right, and if you “run outside the pack,” you had really better be right. And what happens in court is not necessarily true or just. The solution is better recommendations constructed from individualized, thoughtful decisions based on valid, critically appraised evidence found to be clinically useful, along with patient preferences and other factors. The important starting place is effective critical appraisal of the evidence.

  • Financial Conflicts of Interest & Industry Influence

It is certainly true that money can sway decisions, whether it comes from industry support or the potential for income. However, we think that most doctors want to do their best for patients and try to make decisions or provide recommendations with the patient’s best interest in mind. Therefore, we think this latter issue may be more complex and strongly affected, in both instances, by the large number of physicians and others involved in health care decision-making who 1) do not understand that many research studies are not valid, or are not reported sufficiently to tell; and 2) lack the skills to differentiate reliable studies from those which may not be reliable.

When it comes to industry support, one of the variables traveling with money is greater exposure to information, through data or through contacts with experts supporting that manufacturer’s products. We suspect that industry influence may be due less to financial incentives than to this exposure coupled with a lack of critical appraisal understanding. As such, we wrote a Letter to the Editor describing our theory that the major problem of low-quality guidelines might stem from physicians’ and others’ lack of competency in evaluating the quality of the evidence. Our response is reproduced here.

Delfini BMJ Rapid Response [2]:

We (Delfini) believe that we have some unique insight into how ties to industry may result in advocacy for a particular intervention, due to our extensive experience training health care professionals and students in critical appraisal of the medical literature. We think it is very possible that the outcomes Lenzer describes are due less to financial influence than to lack of knowledge. The vast majority of physicians and other health care professionals do not have even rudimentary skills in identifying science that is at high to medium risk of bias, or in recognizing when results may have a high likelihood of being due to chance. Having ties to industry would likely result in greater exposure to science supporting a particular intervention.

Without the ability to evaluate the quality of the science, we think it is likely that individuals would be swayed and/or convinced by that science. The remedy for this, and for other problems with the quality of clinical guidelines, is ensuring that all guideline development members have basic critical appraisal skills and that there is enough transparency in guidelines so that appraisal of a guideline, and of the studies it utilizes, can easily be accomplished.

References

1. Lenzer J. Why we can’t trust clinical guidelines. BMJ 2013;346:f3830.

2. Strite SA, Stuart M. BMJ Rapid Response: Why we can’t trust clinical guidelines. BMJ 2013;346:f3830; http://www.bmj.com/content/346/bmj.f3830/rr/651876


Webinar: “Using Real-World Data & Published Evidence in Pharmacy Quality Improvement Activities”

On Monday, May 20, 2013, we presented a webinar on “Using Real-World Data & Published Evidence in Pharmacy Quality Improvement Activities” for the member organizations of the Alliance of Community Health Plans (ACHP).

The 80-minute discussion addressed four topic areas, each of which presents unique critical appraisal challenges. The webinar’s goal was to discuss issues that arise when conducting quality improvement efforts using real-world data (such as data from claims, surveys and observational studies) and other published healthcare evidence.

Key pitfalls were cherry-picked for these four mini-seminars—

  • Pitfalls to avoid when using real-world data — dealing with heterogeneity, confounding by indication and causality.
  • Key issues in evaluating oncology studies — outcome issues and focus on how to address large attrition rates.
  • Important issues when conducting comparative safety reviews — assessing patterns through use of RCTs, systematic reviews, observational studies and registries.
  • Key issues in evaluating studies employing Kaplan-Meier estimates — time-to-event basics with attention to the important problem of censoring.

A recording of the webinar is available at—

https://achp.webex.com/achp/lsr.php?AT=pb&SP=TC&rID=45261732&rKey=1475c8c3abed8061&act=pb


Review of Endocrinology Guidelines

Decision-makers frequently rely on the body of pertinent research when making clinical management decisions. The goal is to critically appraise and synthesize the evidence before making recommendations, developing protocols or making other decisions. Serious attention is paid to the validity of the primary studies, to determine their reliability, before accepting them into the review. Brito and colleagues used the Assessment of Multiple Systematic Reviews (AMSTAR) tool to describe the rigor of the systematic reviews (SRs) cited from 2006 until January 2012 in support of the clinical practice guidelines put forth by the Endocrine Society [1].

The authors included 69 of 2817 studies. These 69 SRs had a mean AMSTAR score of 6.4 (standard deviation, 2.5) out of a maximum score of 11, with scores improving over time. Thirty-five percent of the included SRs were of low quality (methodological AMSTAR score of 1 or 2 out of 5) and were cited in 24 different recommendations. These low-quality SRs were the main evidentiary support for five recommendations, only one of which acknowledged the quality of the SRs.

The authors conclude that few recommendations in the field of endocrinology are supported by reliable SRs, that the quality of the endocrinology SRs is suboptimal, and that this is not currently being addressed by guideline developers. SRs should reliably represent the body of relevant evidence. The authors urge authors and journal editors to pay attention to bias and adequate reporting.

Delfini note: Once again we see a review of guideline work that suggests caution in accepting clinical recommendations without critically appraising the evidence and knowing the strength of the evidence supporting those recommendations.

1. Brito JP, Tsapas A, Griebeler ML, Wang Z, Prutsky GJ, Domecq JP, Murad MH, Montori VM. Systematic reviews supporting practice guideline recommendations lack protection against bias. J Clin Epidemiol. 2013 Jun;66(6):633-8. doi: 10.1016/j.jclinepi.2013.01.008. Epub 2013 Mar 16. PubMed PMID: 23510557.


California Pharmacist Journal: Student Evidence Review of NAVIGATOR Study

Klevens A, Stuart ME, Strite SA. NAVIGATOR (Effect of nateglinide on the incidence of diabetes and cardiovascular events; PMID 20228402) Study Evidence Review. California Pharmacist 2012. Vol. LIX, No. 4. Fall 2012. Available at our California Pharmacist journal page.


California Pharmacist Journal: Student Evidence Review of SATURN Study

Salman G, Stuart ME, Strite SA. The Study of Coronary Atheroma by Intravascular Ultrasound: Effect of Rosuvastatin versus Atorvastatin (SATURN) Study Evidence Review. California Pharmacist 2012. Vol. LIX, No. 3. Summer 2012. Available at our California Pharmacist journal page.


Quickly Finding Reliable Evidence

Good clinical recommendations for diagnostic and therapeutic interventions incorporate reliable published research evidence. Several online evidence-based textbooks are available to assist clinicians in making healthcare decisions. Long time lags in updating are a common problem for medical textbooks; online textbooks offer a solution to these delays.

For readers who plan to create decision support, we strongly recommend DynaMed [full disclosure: we are on the editorial board in an unpaid capacity, though a few years ago we did receive a small gift]. DynaMed is a point-of-care evidence-based medical information database created by Brian S. Alper MD, MSPH, FAAFP. It continues to grow from its current 30,000+ clinical topics, which are updated frequently. DynaMed monitors the content of more than 500 medical journals and systematic evidence review databases, and each item is thoroughly reviewed for clinical relevance and scientific reliability. DynaMed has been compared with several products, including in a new review by McMaster University. The DynaMed website is https://dynamed.ebscohost.com/.

McMaster University maintains the Premium Literature Service (PLUS) database, a continuously updated, searchable database of primary studies and systematic reviews. Each article from over 120 high-quality clinical journals and evidence summary services is appraised by research staff for methodological quality, and articles that pass basic criteria are assessed by practicing clinicians in the corresponding discipline. Clinical ratings are based on 7-point scales, where clinical relevance ranges from 1 (“not relevant”) to 7 (“directly and highly relevant”), and newsworthiness ranges from 1 (“not of direct clinical interest”) to 7 (“useful information, most practitioners in my discipline definitely don’t know this”).

Investigators from McMaster evaluated four evidence-based textbooks—UpToDate, PIER, DynaMed and Best Practice [Jeffery 12]. For each, they determined the proportion of 200 topics that had subsequent articles in PLUS with findings different from those reported in the topics. They also evaluated the number of topics available in each evidence-based textbook compared with the topic coverage in the PLUS database, and the recency of updates for these publications. A topic was considered in need of an update if there was at least one newer article in PLUS providing information that differed from the topic’s recommendations in the textbook.

Results

The proportion of topics with potential for updates was significantly lower for DynaMed than for the other three textbooks, which had statistically similar values. For DynaMed topics, updates occurred on average 170 days prior to the study, while the other textbooks averaged from 427 to 488 days. Of all the evidence-based textbooks, DynaMed missed the fewest articles reporting benefit or no effect when the direction of findings (beneficial, harmful, no effect) was investigated. The proportion of topics for which there were one or more recently published articles found in PLUS with evidence that differed from the textbooks’ treatment recommendations was 23% (95% CI 17 to 29%) for DynaMed, 52% (95% CI 45 to 59%) for UpToDate, 55% (95% CI 48 to 61%) for PIER, and 60% (95% CI 53 to 66%) for Best Practice (χ²(3) = 65.3, P < .001). The time since the last update for each textbook averaged from 170 days (range 131 to 209) for DynaMed to 488 days (range 423 to 554) for PIER (P < .001 across all textbooks).
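
As a rough check on figures like these, the 95% CI for a single proportion can be approximated with the usual normal (Wald) interval. The sketch below (Python; 46 topics is simply 23% of the 200 sampled, and the paper’s exact method may differ) reproduces DynaMed’s reported interval to within rounding:

    import math

    # 23% of the 200 sampled topics needed an update (46/200, per the DynaMed figure)
    topics_needing_update, topics_sampled = 46, 200
    p = topics_needing_update / topics_sampled

    se = math.sqrt(p * (1 - p) / topics_sampled)
    lower, upper = p - 1.96 * se, p + 1.96 * se

    print(f"{p:.0%}, 95% CI ({lower:.0%} to {upper:.0%})")   # ~ 23%, 95% CI (17% to 29%)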

Summary

Healthcare topic coverage varied substantially among the leading evidence-informed electronic textbooks, and, in general, a high proportion of the 200 common topics had potentially out-of-date conclusions and information missing from one or more recently published studies. PIER had the least topic coverage, while UpToDate, DynaMed, and Best Practice covered similar numbers of topics. DynaMed’s updating was the quickest, and it had by far the fewest articles in need of an update, indicating that quality was not sacrificed for speed.

Note: All textbooks have access to the PLUS database to facilitate updates, and also use other sources for updates such as clinical practice guidelines.

Conclusion

The proportion of topics with potentially outdated treatment recommendations in on-line evidence-based textbooks varies substantially.

Reference

Jeffery R, Navarro T, Lokker C, Haynes RB, Wilczynski NL, Farjou G. How current are leading evidence-based medical textbooks? An analytic survey of four online textbooks. J Med Internet Res. 2012 Dec 10;14(6):e175. doi: 10.2196/jmir.2105. PubMed PMID: 23220465.