Progression Free Survival (PFS) in Oncology Trials



Progression Free Survival (PFS) continues to be a frequently used endpoint in oncology trials. It is the time from randomization to the first of either objectively measured tumor progression or death from any cause. It is a surrogate outcome because it does not directly assess mortality, morbidity, quality of life, symptom relief or functioning. Even if a valid trial reports a statistically significant improvement in PFS and the reported effect size is large, PFS only provides information about biologic activity of the cancer and tumor burden or tumor response. Even though correlational analysis has shown associations between PFS and overall survival (OS) in some cancers, we believe that extreme caution should be exercised when drawing conclusions about efficacy of a new drug. In other words, PFS evidence alone is insufficient to establish a clinically meaningful benefit for patients or even a reasonable likelihood of net benefit. Many tumors do present a significant clinical burden for patients; however, clinicians frequently mistakenly believe that simply having a reduction in tumor burden equates with clinical benefit and that delaying the growth of a cancer is a clear benefit to patients.

PFS has a number of limitations that increase the risk of biased results and make findings difficult for readers to interpret. Unlike death in OS analyses, the exact time of progression is not observed: progression is detected only at scheduled assessment visits, so the recorded time is likely to overestimate the true time to progression. Also, it is common to stop or add anti-cancer therapies in PFS studies (also a common problem in trials of OS) prior to documentation of tumor progression, which may confound outcomes. Further, measurement errors may occur because of complex issues in tumor assessment. Adequate blinding is required to reduce the risk of performance and assessment bias. Other methodological issues include complex calculations to adjust for missed assessments and the need for complete data on adverse events.
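
The point about scheduled visits can be illustrated with a minimal sketch. It assumes, hypothetically, that tumor assessments occur every 8 weeks and that progression is recorded at the first visit on or after the time it truly occurs. The function name and the patient values are made up for illustration and do not come from any trial.

```python
import math

def detected_progression_week(true_week: float, visit_interval_weeks: int) -> int:
    """Week of the first scheduled visit at or after the true progression time."""
    return math.ceil(true_week / visit_interval_weeks) * visit_interval_weeks

# Hypothetical patients; true progression times are never observed in a real trial.
true_weeks = [3, 9, 15, 16, 22]
for t in true_weeks:
    seen = detected_progression_week(t, visit_interval_weeks=8)
    print(f"true week {t:>2} -> recorded week {seen:>2} (overestimate {seen - t} weeks)")
```

Under these assumptions the recorded time is never earlier than the true time, and the size of the overestimate depends on how far apart the visits are.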

Attrition and assessment bias are made even more difficult to assess in oncology trials using time-to-event methodologies. The intention-to-treat principle requires that all randomly assigned patients be observed until they experience the end point or the study ends. Optimal follow-up in PFS trials is to follow each subject to both progression and death.

Delfini Comment

FDA approval based on PFS may result in acceptance of new therapies with greater harms than benefits. The limitations listed above, along with a concern that investigators may be less willing to conduct trials with OS as an endpoint once a drug has been approved, suggest that we should use great caution when considering evidence from studies using PFS as the primary endpoint. We believe that PFS should be thought of as any other surrogate marker—i.e., it represents extremely weak evidence (even in studies judged to be at low risk of bias) unless it is supported by acceptable evidence of improvements in quality of life and overall survival.

When assessing the quality of a trial using PFS, we suggest the following:

  1. Remember that although in some cases PFS appears to be predictive of OS, in many cases it is not.
  2. In many cases, improved PFS is accompanied by unacceptable toxicity and unacceptable changes in quality of life.
  3. Improved PFS results of several months may be due to methodological flaws in the study.
  4. As with any clinical trial, assess the trial reporting PFS for bias such as selection, performance, attrition and assessment bias.
  5. Compare characteristics of losses (e.g., due to withdrawing consent, adverse events, loss to follow-up, protocol violations) between groups and, if possible, between completers and those initially randomized.
  6. Pay special attention to censoring due to loss-to-follow-up. Administrative censoring (censoring of subjects who enter a study late and do not experience an event) may not result in significant bias, but non-administrative censoring (censoring because of loss-to-follow-up or discontinuing) is more likely to pose a threat to validity.


Carroll KJ. Analysis of progression-free survival in oncology trials: some common statistical issues. Pharm Stat. 2007 Apr-Jun;6(2):99-113. Review. PubMed PMID: 17243095.

D’Agostino RB Sr. Changing end points in breast-cancer drug approval—the Avastin story. N Engl J Med. 2011 Jul 14;365(2):e2. doi: 10.1056/NEJMp1106984. Epub 2011 Jun 27. PubMed PMID: 21707384.

Fleming TR, Rothmann MD, Lu HL. Issues in using progression-free survival when evaluating oncology products. J Clin Oncol. 2009 Jun 10;27(17):2874-80. doi: 10.1200/JCO.2008.20.4107. Epub 2009 May 4. PubMed PMID: 19414672

Lachin JM. (John M. Lachin, Sc.D., Professor of Biostatistics and Epidemiology, and of Statistics, The George Washington University personal communication)

Lachin JM. Statistical considerations in the intent-to-treat principle. Control Clin Trials. 2000 Jun;21(3):167-89. Erratum in: Control Clin Trials 2000 Oct;21(5):526. PubMed PMID: 10822117.


Network Meta-analyses—More Complex Than Traditional Meta-analyses



Meta-analyses are important tools for synthesizing evidence from relevant studies. One limitation of traditional meta-analyses is that they can compare only 2 treatments at a time in what is often termed pairwise or direct comparisons. An extension of traditional meta-analysis is the “network meta-analysis” which has been increasingly used—especially with the rise of the comparative effectiveness movement—as a method of assessing the comparative effects of more than two alternative interventions for the same condition that have not been studied in head-to-head trials.

A network meta-analysis synthesizes direct and indirect evidence over the entire network of interventions that have not been directly compared in clinical trials, but have one treatment in common.

For example, suppose a clinical trial reports that for a given condition intervention A results in better outcomes than intervention B, and another trial reports that intervention B is better than intervention C. A network meta-analysis is likely to report that intervention A results in better outcomes than intervention C based on indirect evidence.
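
One simple way to see how indirect evidence is derived is the adjusted indirect comparison often attributed to Bucher and colleagues. This is an illustrative assumption on our part, not a method prescribed by the sources cited here, and full network meta-analyses use more elaborate models. The sketch below assumes the two trials report log odds ratios with standard errors; all input values are hypothetical.

```python
import math

def indirect_comparison(log_or_ab, se_ab, log_or_bc, se_bc, z=1.96):
    """Adjusted indirect comparison of A vs C through the common comparator B.

    Inputs are log odds ratios (A vs B, B vs C) and their standard errors.
    Returns the indirect odds ratio for A vs C with an approximate 95% CI.
    """
    log_or_ac = log_or_ab + log_or_bc          # effects add on the log scale
    se_ac = math.sqrt(se_ab**2 + se_bc**2)     # variances add, so precision is lost
    lower = math.exp(log_or_ac - z * se_ac)
    upper = math.exp(log_or_ac + z * se_ac)
    return math.exp(log_or_ac), lower, upper

# Hypothetical inputs: OR(A vs B) = 0.80, OR(B vs C) = 0.75, each with SE 0.10 on the log scale.
or_ac, ci_low, ci_high = indirect_comparison(math.log(0.80), 0.10, math.log(0.75), 0.10)
print(f"Indirect OR, A vs C: {or_ac:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Note that the indirect estimate is always less precise than either of the trials it is built from, and it is only as trustworthy as the assumption that the two trials are similar enough to be combined.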

Network meta-analyses, also known as “multiple-treatments meta-analyses” or “mixed-treatment comparisons meta-analyses” include both direct and indirect evidence. When both direct and indirect comparisons are used to estimate treatment effects, the comparison is referred to as a “mixed comparison.” The indirect evidence in network meta-analyses is derived from statistical inference which requires many assumptions and modeling. Therefore, critical appraisal of network meta-analyses is more complex than appraisal of traditional meta-analyses.

In all meta-analyses, clinical and methodological differences in studies are likely to be present. Investigators should only include valid trials. They should also provide sufficient detail so that readers can assess the quality of meta-analyses. These details include important variables such as PICOTS (population, intervention, comparator, outcomes, timing and study setting) and heterogeneity in any important study performance items or other contextual issues such as important biases, unique care experiences, adherence rates, etc. In addition, the effect sizes in direct comparisons should be compared to the effect sizes in indirect comparisons since indirect comparisons require statistical adjustments. Inconsistency between the direct and indirect comparisons may be due to chance, bias or heterogeneity. Remember, in direct comparisons the data come from the same trial. Indirect comparisons utilize data from separate randomized controlled trials which may vary in both clinical and methodological details.

Estimates of effect in a direct comparison trial may be lower than estimates of effect derived from indirect comparisons. Therefore, evidence from direct comparisons should be weighted more heavily than evidence from indirect comparisons in network meta-analyses. The combination of direct and indirect evidence in mixed treatment comparisons may be more likely to result in distorted estimates of effect size if there is inconsistency between effect sizes of direct and indirect comparisons.

Usually network meta-analyses rank different treatments according to the probability of being the best treatment. Readers should be aware that these rankings may be misleading because differences may be quite small or inaccurate if the quality of the meta-analysis is not high.

Delfini Comment
Network meta-analyses do provide more information about the relative effectiveness of interventions. At this time, we remain a bit cautious about the quality of many network meta-analyses because of the need for statistical adjustments. It should be emphasized that, as of this writing, methodological research has not established a preferred method for conducting network meta-analyses, assessing them for validity or assigning them an evidence grade.

Li T, Puhan MA, Vedula SS, Singh S, Dickersin K; Ad Hoc Network Meta-analysis Methods Meeting Working Group. Network meta-analysis-highly attractive but more methodological research is needed. BMC Med. 2011 Jun 27;9:79. doi: 10.1186/1741-7015-9-79. PubMed PMID: 21707969.

Salanti G, Del Giovane C, Chaimani A, Caldwell DM, Higgins JP. Evaluating the quality of evidence from a network meta-analysis. PLoS One. 2014 Jul 3;9(7):e99682. doi: 10.1371/journal.pone.0099682. eCollection 2014. PubMed PMID: 24992266.


Sounding the Alarm (Again) in Oncology



Five years ago Fojo and Grady sounded the alarm about value in many of the new oncology drugs [1]. They raised the following issues and challenged oncologists and others to get involved in addressing these issues:

  • There is a great deal of uncertainty and confusion about what constitutes a benefit in cancer therapy; and,
  • How much should cost factor into these deliberations?

The authors review a number of oncology drug studies reporting increased overall survival (OS) ranging from a median of a few days to a few months with total new drug costs ranging from $15,000 to $90,000 plus. In some cases, there is no increase in OS, but only progression free survival (PFS) which is a weaker outcome measure due to its being prone to tumor assessment biases and is frequently assessed in studies of short duration. Adverse events associated with the new drugs are many and include higher rates of febrile neutropenia, infusion-related reactions, diarrhea, skin toxicity, infections, hypertension and other adverse events.

Fojo and Grady point out that—

“Many Americans would likely not regard a 1.2-month survival advantage as ‘significant’ progress, the much revered P value notwithstanding. But would an individual patient agree? Although we lack the answer to this question, we would suggest that the death of a mother of four at age 37 years would be no less painful were it to occur at age 37 years and 1 month, nor would the passing of a 67-year-old who planned to travel after retiring be any less difficult for the spouse were it to have occurred 1 month later.”

In a recent article [2] (thanks to Dr. Richard Lehman for drawing our attention to this article in his wonderful BMJ blog) Fojo and colleagues again point out that—

  • Cancer is the number one cause of mortality worldwide, and cancer cases are projected to rise by 75% over the next 2 decades.
  • Of the 71 therapies for solid tumors receiving FDA approval from 2002 to 2014, only 30 of the 71 approvals (42%) met the American Society of Clinical Oncology Cancer Research Committee’s “low hurdle” criteria for clinically meaningful improvement. Further, the authors tallied results from all the studies and reported very modest collective median gains of 2.5 months for PFS and 2.1 months for OS. Numerous surveys have indicated that patients expect much more.
  • Expensive therapies are stifling progress by (1) encouraging enormous expenditures of time, money, and resources on marginal therapeutic indications; and, (2) promoting a me-too mentality that is stifling innovation and creativity.

The last bullet needs a little explaining. The authors provide a number of examples of “safe bets” and argue that revenue from such safe and profitable therapies rather than true need has been a driving force for new oncology drugs. The problem is compounded by regulations—e.g., rules which require Medicare to reimburse patients for any drug used in an “anti-cancer chemotherapeutic regimen”—regardless of its incremental benefit over other drugs—as long as the use is “for a medically accepted indication” (commonly interpreted as “approved by the FDA”). This provides guaranteed revenues for me-too drugs irrespective of their marginal benefits. The authors also point out that when prices for drugs of proven efficacy fall below a certain threshold, suppliers often stop producing the drug, causing severe shortages.

What can be done? The authors acknowledge several times in their commentary that the spiraling cost of cancer therapies has no single villain; academia, professional societies, scientific journals, practicing oncologists, regulators, patient advocacy groups and the biopharmaceutical industry—all bear some responsibility. [We would add to this list physicians, P&T committees and any others who are engaged in treatment decisions for patients. Patients are not on this list (yet) because they are unlikely to really know the evidence.] This is like many other situations when many are responsible—often the end result is that “no one” takes responsibility. Fojo et al. close by making several suggestions, among which are—

  1. Academicians must avoid participating in the development of marginal therapies;
  2. Professional societies and scientific journals must raise their standards and not spotlight marginal outcomes;
  3. All of us must also insist on transparency and the sharing of all published data in a timely and enforceable manner;
  4. Actual gains of benefit must be emphasized—not hazard ratios or other measures that force readers to work hard to determine actual outcomes and benefits and risks;
  5. We need cooperative groups with adequate resources to provide leadership to ensure that trials are designed to deliver meaningful outcomes;
  6. We must find a way to avoid paying premium prices for marginal benefits; and,
  7. We must find a way [federal support?] to secure altruistic investment capital.

Delfini Comment
While the authors do not make a suggestion for specific responsibilities or actions on the part of the FDA, they do make a recommendation that an independent entity might create uniform measures of benefits for each FDA-approved drug—e.g., quality-adjusted life-years. We think the FDA could go a long way in improving this situation.

And so, as pointed out by Fojo et al., only small gains have been made in OS over the past 12 years, and costs of oncology drugs have skyrocketed. However, to make matters even worse than portrayed by Fojo et al., many of the oncology drug studies we see have major threats to validity (e.g., selection bias, lack of blinding and other performance biases, attrition and assessment bias, etc.), raising the question, “Does the approximately 2-month gain in median OS represent an overestimate?” Since bias tends to favor the new intervention in clinical trials, the PFS and OS gains reported in many of the recent oncology trials may be exaggerated or even absent, or harms may outweigh benefits. On the other hand, if a study is valid, since a median is a midpoint in a range of results and a patient may achieve better results than indicated by the median, some patients may choose to accept a new therapy. The important thing is that patients are given information on benefits and harms in a way that allows them to have a reasonable understanding of all the issues and make the choices that are right for them.

Resources & References


  1. The URL for Dr. Lehman’s Blog is—
  2. The URL for his original blog entry about this article is—


  1. Fojo T, Grady C. How much is life worth: cetuximab, non-small cell lung cancer, and the $440 billion question. J Natl Cancer Inst. 2009 Aug 5;101(15):1044-8. Epub 2009 Jun 29. PMID: 19564563
  2. Fojo T, Mailankody S, Lo A. Unintended Consequences of Expensive Cancer Therapeutics-The Pursuit of Marginal Indications and a Me-Too Mentality That Stifles Innovation and Creativity: The John Conley Lecture. JAMA Otolaryngol Head Neck Surg. 2014 Jul 28. doi: 10.1001/jamaoto.2014.1570. [Epub ahead of print] PubMed PMID: 25068501.

Comparative Effectiveness Research (CER), “Big Data” & Causality



For a number of years now, we’ve been concerned that the CER movement and the growing love affair with “big data” will lead to many erroneous conclusions about cause and effect. We were pleased to see the following blog posts from Austin Frakt, an editor-in-chief of The Incidental Economist: Contemplating health care with a focus on research, an eye on reform

Ten impressions of big data: Claims, aspirations, hardly any causal inference


Five more big data quotes: The ambitions and challenges


Comparative Study Designs: Claiming Superiority, Equivalence and Non-inferiority—A Few Considerations & Practical Approaches



This is a complex area, and we recommend downloading our freely available 1-page summary to help assess issues with equivalence and non-inferiority trials. Here is a short sampling of some of the problems in these designs: lack of sufficient evidence confirming efficacy of the referent treatment (“referent” refers to the comparator treatment); study not sufficiently similar to the referent study; inappropriate Deltas (meaning the margins established for equivalence or non-inferiority); or significant biases or analysis methods that would tend to diminish an effect size and “favor” no difference between groups (e.g., conservative application of ITT analysis, insufficient power, etc.), thus pushing toward non-inferiority or equivalence.

However, we do want to say a few more things about non-inferiority trials based on some recent questions and readings.

Is it acceptable to claim superiority in a non-inferiority trial? Yes. The Food and Drug Administration (FDA) and the European Medicines Agency (EMA), among others, including ourselves, all agree that declaring superiority in a non-inferiority trial is acceptable. What’s more, there is agreement that multiplicity adjusting does not need to be done when first testing for non-inferiority and then superiority.

See Delfini Recommended Reading: Included here is a nice article by Steve Snapinn. Snapinn even recommends that “…most, if not all, active-controlled clinical trial protocols should define a noninferiority margin and include a noninferiority hypothesis.” We agree. Clinical trials are expensive to do, take time, have opportunity costs, and, most importantly, affect the lives of the human subjects who engage in them. This is a smart procedure that costs nothing, especially as multiplicity adjusting is not needed.

What does matter is having an appropriate population for doing a superiority analysis. For superiority, in studies with dichotomous variables, the population should be Intention-to-Treat (ITT) with an appropriate imputation method that does not favor the intervention under study. In studies with time-to-event outcomes, the population should be based on the ITT principle (meaning all randomized patients should be used in the analysis by the group to which they were randomized) with unbiased censoring rules.

Confidence intervals (CIs) should be evaluated to determine superiority. Some evaluators seem to suggest that superiority can be declared only if the CIs are wholly above the Delta. Schumi et al. express their opinion that you can declare superiority if the confidence interval for the new treatment is above the line of no difference (i.e., is statistically significant). They state, “The calculated CI does not know whether its purpose is to judge superiority or non-inferiority. If it sits wholly above zero [or 1, depending upon the measure of outcome], then it has shown superiority.” EMA would seem to agree. We agree as well. If one wishes to take a more conservative approach, one method we recommend is to judge whether the Delta seems clinically reasonable (you should always do this) and, if not, to establish your own through clinical judgment. Then determine if the entire CI meets or exceeds what you deem to be clinically meaningful. To us, this method satisfies both approaches and makes practical and clinical sense.
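
The reasoning above can be sketched in a few lines. This is only an illustration under stated assumptions: the outcome is a difference (new treatment minus referent) on a scale where higher values favor the new treatment, so the line of no difference is zero, and the margin is given as a positive number. The function names and example values are hypothetical.

```python
def classify_ci(ci_lower: float, ci_upper: float, delta: float) -> str:
    """Classify a CI for (new minus referent), where higher values favor the new treatment.

    delta is the non-inferiority margin, given as a positive number.
    """
    if ci_lower > 0:
        return "superior"        # CI sits wholly above the line of no difference
    if ci_lower > -delta:
        return "non-inferior"    # CI excludes a deficit as large as the margin
    if ci_upper < -delta:
        return "inferior"        # whole CI is worse than the margin
    return "inconclusive"

def entire_ci_clinically_meaningful(ci_lower: float, threshold: float) -> bool:
    """More conservative check: does the whole CI meet or exceed your own threshold?"""
    return ci_lower >= threshold

# Hypothetical results in percentage points, with a margin of 5.
for lower, upper in [(1.0, 6.0), (-2.0, 4.0), (-8.0, 1.0), (-12.0, -6.0)]:
    print((lower, upper), classify_ci(lower, upper, delta=5.0))
```

For ratio measures such as relative risk, the same logic applies with 1 as the line of no difference. A result can be statistically superior and still fail the more conservative check if its lower limit falls short of what you judge to be clinically meaningful.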

Is it acceptable to claim non-inferiority in a superiority trial? It depends. This area is controversial, with some saying no and some saying it depends. However, there is agreement amongst those on the “it depends” side that it generally should not be done due to validity issues as described above.

US Department of Health and Human Services, Food and Drug Administration: Guidance for Industry Non-Inferiority Clinical Trials (DRAFT). 2010. Guidances/UCM202140.pdf

European Agency for the Evaluation of Medicinal Products Committee for Proprietary Medicinal Products (CPMP): Points to Consider on Switching Between Superiority and Non-Inferiority. 2000.


Estimating Relative Risk Reduction from Odds Ratios



Odds are hard to work with because they are the likelihood of an event occurring compared to not occurring—e.g., odds of two to one mean that likelihood of an event occurring is twice that of not occurring. Contrast this with probability which is simply the likelihood of an event occurring.

An odds ratio (OR) is a point estimate used for case-control studies which attempts to quantify a mathematical relationship between an exposure and a health outcome. Odds must be used in case-control studies because the investigator arbitrarily controls the population; therefore, probability cannot be determined because the disease rates in the study population cannot be known. The odds that a case is exposed to a certain variable are divided by the odds that a control is exposed to that same variable.

Odds are often used in other types of studies as well, such as meta-analysis, because of various properties of odds which make them easy to use mathematically. However, increasingly authors are discouraged from computing odds ratios in secondary studies because of the difficulty translating what this actually means in terms of size of benefits or harms to patients.

Readers frequently attempt to deal with this by converting the odds ratio into relative risk reduction by thinking of the odds ratio as similar to relative risk. Relative risk reduction (RRR) is computed from relative risk (RR) by simply subtracting the relative risk from one and expressing that outcome as a percentage (1-RR).

Some experts advise readers that this is safe to do if the prevalence of the event is low. While it is true that odds and probabilities of outcomes are usually similar if the event rate is low, when possible, we recommend calculating both the odds ratio reduction and the relative risk reduction in order to compare and determine if the difference is clinically meaningful. And determining if something is clinically meaningful is a judgment, and therefore whether a conversion of OR to RRR is distorted depends in part upon that judgment.

a = group 1 outcome occurred
b = group 1 outcome did not occur
c = group 2 outcome occurred
d = group 2 outcome did not occur

OR = (a/b)/(c/d)
Estimated RRR from OR (odds ratio reduction) = 1-OR

RR = (a/ group 1 n)/(c/ group 2 n)
RRR = 1-RR
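
To see how far the two estimates can drift apart, here is a small sketch that applies the formulas above to two made-up 2 by 2 tables, one with low event rates and one with high event rates. The function name and all counts are hypothetical.

```python
def or_and_rr(a: int, b: int, c: int, d: int) -> dict:
    """2x2 table: a/b = group 1 with/without the outcome, c/d = group 2 with/without."""
    odds_ratio = (a / b) / (c / d)
    relative_risk = (a / (a + b)) / (c / (c + d))
    return {
        "OR": odds_ratio,
        "RR": relative_risk,
        "1-OR": 1 - odds_ratio,   # the "odds ratio reduction"
        "RRR": 1 - relative_risk,
    }

# Hypothetical counts, 1,000 people per group.
low = or_and_rr(10, 990, 20, 980)      # event rates of 1% and 2%
high = or_and_rr(300, 700, 600, 400)   # event rates of 30% and 60%
for label, r in (("low event rate", low), ("high event rate", high)):
    print(f"{label}: 1-OR = {r['1-OR']:.1%}, RRR = {r['RRR']:.1%}")
```

In both tables the relative risk reduction is 50%. The estimate from the odds ratio is about 50.5% when events are rare but about 71.4% when events are common, which is the kind of gap a reader would want to judge for clinical importance.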




Why Statements About Confidence Intervals Often Result in Confusion Rather Than Confidence



A recent paper by McCormack reminds us that authors may mislead readers by making unwarranted “all-or-none” statements and that readers should be mindful of this and carefully examine confidence intervals.

When examining results of a valid study, confidence intervals (CIs) provide much more information than p-values. The results are statistically significant if a confidence interval does not touch the line of no difference (zero in the case of measures of outcomes expressed as percentages such as absolute risk reduction and relative risk reduction and 1 in the case of ratios such as relative risk and odds ratios). However, in addition to providing information about statistical significance, confidence intervals also provide a plausible range for possibly true results within a margin of chance (5 percent in the case of a 95% CI). While the actual calculated outcome (i.e., the point estimate) is “the most likely to be true” result within the confidence interval, having this range enables readers to judge, in their opinion, if statistically significant results are clinically meaningful.

However, as McCormack points out, authors frequently do not provide useful interpretation of the confidence intervals, and authors at times report different conclusions from similar data. McCormack presents several cases that illustrate this problem, and this paper is worth reading.

As an illustration, assume two hypothetical studies report very similar results. In the first study of drug A versus drug B, the relative risk for mortality was 0.9, 95% CI (0.80 to 1.05). The authors might state that there was no difference in mortality between the two drugs because the difference is not statistically significant. However, the upper confidence limit is close to the line of no difference, so the confidence interval tells us that a difference might have been found if more people had been studied; the statement is therefore misleading. A better statement for the first study would include the confidence interval and a neutral interpretation of what the results for mortality might mean. Example:

“The relative risk for overall mortality with drug A compared to drug B was 0.9, 95% CI (0.80 to 1.05). The confidence interval tells us that drug A may reduce mortality by up to a relative 20% (i.e., the relative risk reduction) but may also increase mortality, compared to drug B, by up to approximately 5%.”

In a second study with similar populations and interventions, the relative risk for mortality might be 0.93, 95% CI (0.83 to 0.99). In this case, some authors might state, “Drug A reduces mortality.” A better statement for this second hypothetical study would ensure that the reader knows that the upper confidence limit is close to the line of no difference and, therefore, is close to non-significance. Example:

“Although the mortality difference is statistically significant, the confidence interval indicates that the relative risk reduction may be as great as 17% but may be as small as 1%.”
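
The arithmetic behind the two example statements can be sketched as follows. The function name is our own invention; it simply converts the limits of a relative risk confidence interval into the smallest and largest relative risk reduction, with a negative value meaning a relative risk increase.

```python
def rrr_range(ci_lower: float, ci_upper: float):
    """Return (significant, smallest RRR %, largest RRR %) from a relative risk CI.

    A negative RRR means a relative risk increase.
    """
    significant = ci_upper < 1 or ci_lower > 1   # CI does not touch the line of no difference
    smallest = (1 - ci_upper) * 100
    largest = (1 - ci_lower) * 100
    return significant, smallest, largest

# The two hypothetical studies discussed above.
for lower, upper in [(0.80, 1.05), (0.83, 0.99)]:
    sig, smallest, largest = rrr_range(lower, upper)
    print(f"CI {lower} to {upper}: significant={sig}, RRR from {smallest:.0f}% to {largest:.0f}%")
```

Reading both ends of the range, rather than only the significance flag, is what keeps a reader from treating the two studies as if they had reached opposite conclusions.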

The Bottom Line

  1. Remember that p-values refer only to statistical significance and confidence intervals are needed to evaluate clinical significance.
  2. Watch out for statements containing the words “no difference” in the reporting of study results. A finding of no statistically significant difference may be a product of too few people studied (or insufficient time).
  3. Watch out for statements implying meaningful differences between groups when one of the confidence limits approaches the line of no difference.
  4. None of this means anything unless the study is valid. Remember that bias tends to favor the intervention under study.

If authors do not provide you with confidence intervals, you may be able to compute them yourself, if they have supplied you with sufficient data, using an online confidence interval calculator. For our favorites, search “confidence intervals” at our web links page:


McCormack J, Vandermeer B, Allan GM. How confidence intervals become confusion intervals. BMC Med Res Methodol. 2013 Oct 31;13(1):134. [Epub ahead of print] PubMed PMID: 24172248.


Patient Years


What are Patient-Years?

A participant at one of our recent conferences asked a good question—“What are patient-years?”

“Person-years” (or “patient-years”) is a statistic used in expressing incidence rates. It is the total time that subjects are observed, summed across all subjects, and it serves as the denominator when events are divided by time at risk. In many studies, the length of exposure to the treatment is different for different subjects, and the patient-year statistic is one way of dealing with this issue.

The calculation of events per patient-year(s) is the number of incident cases divided by the amount of person-time at risk. When every patient is followed for the same length of time, the patient-years (denominator) can be calculated by multiplying the number of patients in the group by the number of years that patients are in the study; when follow-up differs, each patient’s time in the study is summed instead. Then divide the number of events (numerator) by the denominator.

  • Example: 100 patients are followed for 2 years. In this case, there are 200 patient-years of follow-up.
  • If there were 8 myocardial infarctions in the group, the rate would be 8 MIs per 200 patient-years or 4 MIs per 100 patient-years.

The rate can be expressed in various ways, e.g., per 100, 1,000, 100,000, or 1 million patient-years. In some cases, authors report the average follow-up period as the mean and others use the median, which may result in some variation in results between studies.

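Another example: Assume a study follows four patients for 1, 2, 3 and 4 years, respectively, with one event in the patient followed for 1 year, one event in the patient followed for 4 years, and no events in the patients followed for 2 and 3 years. This information can be expressed as 2 events per 10 patient-years (1+2+3+4=10), or an event rate of 0.2 per person-year.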
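
Both examples can be reproduced with a short sketch. The function name is hypothetical; the only assumption is that each patient's follow-up time in years is available so that it can be summed.

```python
def rate_per(events: int, follow_up_years: list, per: int = 100) -> float:
    """Events per `per` patient-years, summing each patient's own follow-up time."""
    patient_years = sum(follow_up_years)
    return events / patient_years * per

# 100 patients each followed for 2 years, 8 myocardial infarctions.
mi_rate = rate_per(8, [2.0] * 100)
print(f"{mi_rate:.1f} MIs per 100 patient-years")

# Follow-up of 1, 2, 3 and 4 years (10 patient-years in total), 2 events.
event_rate = rate_per(2, [1, 2, 3, 4], per=1)
print(f"{event_rate:.1f} events per patient-year")
```

Summing individual follow-up times, rather than multiplying a head count by a single duration, is what lets the statistic cope with patients who were observed for different lengths of time.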

An important issue is that frequently the timeframe for observation in studies reporting patient-years does not match the timeframe stated in the study. Brian Alper of Dynamed explains it this way: “If I observed a million people for 5 minutes each and nobody died, any conclusion about mortality over 1 year would be meaningless. This problem occurs whether or not we translate our outcome into a patient-years measure. The key in critical appraisal is to catch the discrepancy between timeframe of observation and timeframe of conclusion and not let the use of ‘patient-years’ mistranslate between the two or represent an inappropriate extrapolation.”[1]


1. Personal communication 9/3/13 with Brian S. Alper, MD, MSPH, FAAFP, Editor-in-Chief, DynaMed, Medical Director, EBSCO Information Services.


When Is a Measure of Outcomes Like a Coupon for a Diamond Necklace?



For those of you who struggle with the fundamental difference between absolute risk reduction (ARR) versus relative risk reduction (RRR) and their counterparts, absolute and relative risk increase (ARI/RRI), we have always explained that only knowing the RRR or the RRI without other quantitative information about the frequency of events is akin to knowing that a store is having a half-off sale—but when you walk in, you find that they aren’t posting the actual price!  And so your question is 50 percent off of what???

You should have the same question greet you whenever you are provided with a relative measure (and if you aren’t told whether the measure is relative or absolute, you may be safer assuming that it is relative). Below is a link to a great short cartoon that turns the lens a little differently and which might help.

However, we will add that, in our opinion, ARR alone isn’t fully informative either, nor is its kin, the number-needed-to-treat or NNT, and for ARI, the number-needed-to-harm or NNH. A difference of 5 percentage points may be perceived very differently when “10 people out of a hundred benefit with one intervention compared to 5 with placebo” as compared to a different scenario in which “95 people out of a hundred benefit with one intervention as compared to 90 with placebo.” As a patient, I might be less likely to want to expose myself to side effects if it is highly likely I am going to improve without treatment, for example. Providing this full information (for critically appraised studies that are deemed to be valid, of course) may best help patients make choices based on their own needs and requirements, including their values and preferences.

We think that anyone involved in health care decision-making, including the patient, is best helped by knowing the event rates for each of the groups studied, i.e., the numerators and denominators for the outcome of interest by group. These are the 4 numbers that make up the 2 by 2 table which is used to calculate many statistics.
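To make this concrete, here is a minimal Python sketch that takes the four numbers for each of the two scenarios quoted above (10 versus 5 out of 100, and 95 versus 90 out of 100) and computes the absolute difference, the relative difference and the NNT. The function name `compare_groups` and the dictionary keys are our own illustrative choices, not a standard API.

```python
from math import inf


def compare_groups(benefit_treated, n_treated, benefit_control, n_control):
    """Summarize a 2 by 2 table where the outcome counted is 'benefited'."""
    rate_treated = benefit_treated / n_treated
    rate_control = benefit_control / n_control
    absolute_diff = rate_treated - rate_control
    # The relative difference is undefined when nobody in the control group benefits.
    relative_diff = absolute_diff / rate_control if rate_control else inf
    # NNT is the reciprocal of the absolute difference.
    nnt = 1 / abs(absolute_diff) if absolute_diff else inf
    return {
        "rate_treated": rate_treated,
        "rate_control": rate_control,
        "absolute_diff": absolute_diff,
        "relative_diff": relative_diff,
        "nnt": nnt,
    }


# The two scenarios quoted in the text above.
scenario_a = compare_groups(10, 100, 5, 100)
scenario_b = compare_groups(95, 100, 90, 100)

for name, s in (("10 vs 5 per 100", scenario_a), ("95 vs 90 per 100", scenario_b)):
    print(
        f"{name}: absolute {s['absolute_diff']:.1%}, "
        f"relative {s['relative_diff']:.1%}, NNT {s['nnt']:.0f}"
    )
```

Both scenarios show the same absolute difference (5 percentage points) and the same NNT (about 20), yet the relative difference is 100 percent in the first and under 6 percent in the second. Only the underlying event rates reveal how different the two situations are for a patient.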

Isn’t it great when learning can be fun too!  Enjoy!


Our Current Thinking About Attrition Bias


Delfini Thoughts on Attrition Bias

Significant attrition, whether it be due to loss of patients or discontinuation or some other reason, is a reality of many clinical trials. And, of course, the key question in any study is whether attrition significantly distorted the study results. We’ve spent a lot of time researching the evidence-on-the-evidence and have found that many researchers, biostatisticians and others struggle with this area—there appears to be no clear agreement in the clinical research community about how to best address these issues. There also is inconsistent evidence on the effects of attrition on study results.

We, therefore, believe that studies should be evaluated on a case-by-case basis and doing so often requires sleuthing and sifting through clues along with critically thinking through the unique circumstances of the study.

The key question is, “Given that attrition has occurred, are the study results likely to be true?” It is important to look at the contextual elements of the study. These may include information about the population characteristics, potential effects of the intervention and comparator, the outcomes studied and whether patterns emerge, timing and setting. It is also important to look at the reasons for discontinuation and loss to follow-up, and at what data is missing and why, to assess the likely impact on results.

Attrition may or may not impact study outcomes depending, in part, upon the reasons for withdrawals, the censoring rules and the resulting effects of applying those rules, for example. However, differential attrition issues should be looked at especially closely. Unintended differences between groups are more likely to happen when patients have not been allocated to their groups in a blinded fashion, when groups are not balanced at the onset of the study, when the study is not effectively blinded, or when an effect of the treatment has caused the attrition.

One piece of the puzzle, at times, may be whether prognostic characteristics remained balanced. Authors could help us all out tremendously by assessing the comparability of baseline characteristics both at randomization and among those analyzed. However, an imbalance may be an important clue too because it might be informative about efficacy or side effects of the agent under study.

In general, we think it is important to attempt to answer the following questions:

Examining the contextual elements of a given study—

  • What could explain the results if it is not the case that the reported findings are true?
  • What conditions would have to be present for an opposing set of results (equivalence or inferiority) to be true instead of the study findings?
  • Were those conditions met?
  • If these conditions were not met, is there any reason to believe that the estimate of effect (size of the difference) between groups is not likely to be true?
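
One simple way to explore the second question is an extreme-case check: ask what the results would look like if every missing patient had the least (or most) favorable outcome for the treatment. The Python sketch below is our own illustration of that idea, not a method prescribed above, and all of the numbers in it are hypothetical. The function name `extreme_case_risk_differences` is likewise an assumption.

```python
def extreme_case_risk_differences(events_t, completers_t, missing_t,
                                  events_c, completers_c, missing_c):
    """Risk difference (treatment minus control) for an unfavorable event.

    Returns the difference among completers, then the difference under the
    least and most favorable assumptions about the missing patients.
    """
    n_t = completers_t + missing_t
    n_c = completers_c + missing_c
    observed = events_t / completers_t - events_c / completers_c
    # Least favorable to treatment: every missing treated patient had the
    # event and no missing control patient did.
    worst = (events_t + missing_t) / n_t - events_c / n_c
    # Most favorable to treatment: the reverse assumption.
    best = events_t / n_t - (events_c + missing_c) / n_c
    return observed, worst, best


# Hypothetical trial: 100 randomized per group.
observed, worst, best = extreme_case_risk_differences(
    events_t=20, completers_t=88, missing_t=12,
    events_c=30, completers_c=85, missing_c=15,
)
print(f"observed {observed:+.3f}, worst case {worst:+.3f}, best case {best:+.3f}")
```

In this made-up example the observed difference favors treatment, but the least favorable assumption reverses the direction of the result. Whether such an extreme assumption is plausible is exactly the kind of judgment that requires looking at the reasons for attrition and the other contextual elements of the study.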