Comparative Study Designs: Claiming Superiority, Equivalence and Non-inferiority—A Few Considerations & Practical Approaches


This is a complex area, and we recommend downloading our freely available 1-page summary to help assess issues with equivalence and non-inferiority trials. Here is a short sampling of some of the problems in these designs: lack of sufficient evidence confirming efficacy of the referent treatment (“referent” refers to the comparator treatment); a study not sufficiently similar to the referent study; inappropriate Deltas (meaning the margins established for equivalence or non-inferiority); or significant biases or analysis methods that would tend to diminish an effect size and “favor” no difference between groups (e.g., conservative application of ITT analysis, insufficient power, etc.), thus pushing toward non-inferiority or equivalence.

However, we do want to say a few more things about non-inferiority trials based on some recent questions and readings.

Is it acceptable to claim superiority in a non-inferiority trial? Yes. The Food and Drug Administration (FDA) and the European Medicines Agency (EMA), among others, including ourselves, all agree that declaring superiority in a non-inferiority trial is acceptable. What’s more, there is agreement that multiplicity adjusting does not need to be done when first testing for non-inferiority and then superiority.

See Delfini Recommended Reading: Included here is a nice article by Steve Snapinn. Snapinn even recommends that “…most, if not all, active-controlled clinical trial protocols should define a noninferiority margin and include a noninferiority hypothesis.” We agree. Clinical trials are expensive to do, take time, have opportunity costs, and, most importantly, affect the lives of the human subjects who engage in them. This is a smart procedure that costs nothing, especially as multiplicity adjusting is not needed.

What does matter is having an appropriate population for doing a superiority analysis. For superiority, in studies with dichotomous variables, the population should be Intention-to-Treat (ITT) with an appropriate imputation method that does not favor the intervention under study. In studies with time-to-event outcomes, the population should be based on the ITT principle (meaning all randomized patients should be used in the analysis by the group to which they were randomized) with unbiased censoring rules.

Confidence intervals (CIs) should be evaluated to determine superiority. Some evaluators seem to suggest that superiority can be declared only if the CIs are wholly above the Delta. Schumi et al. express their opinion that you can declare superiority if the confidence interval for the new treatment is above the line of no difference (i.e., is statistically significant). They state, “The calculated CI does not know whether its purpose is to judge superiority or non-inferiority. If it sits wholly above zero [or 1, depending upon the measure of outcome], then it has shown superiority.” EMA would seem to agree. We agree as well. If one wishes to take a more conservative approach, one method we recommend is to judge whether the Delta seems clinically reasonable (you should always do this) and, if not, to establish your own through clinical judgment. Then determine if the entire CI meets or exceeds what you deem to be clinically meaningful. To us, this method satisfies both approaches and makes practical and clinical sense.
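
The confidence-interval reasoning above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from the FDA, EMA or Schumi et al. The function names `classify_ci` and `meets_clinical_threshold` are hypothetical. The outcome is assumed to be a difference (new treatment minus referent) in which higher values favor the new treatment, so the line of no difference is zero, and Delta is supplied as a positive number.

```python
def classify_ci(lower: float, upper: float, delta: float) -> str:
    """Classify a two-sided CI for the difference (new minus referent).

    Assumes higher values favor the new treatment, so the line of no
    difference is 0 and the non-inferiority margin sits at -delta.
    """
    if delta <= 0:
        raise ValueError("delta must be a positive margin")
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower > 0:
        # Whole CI sits above the line of no difference.
        return "superior"
    if lower > -delta:
        # Lower bound is at or below zero but still above -delta.
        return "non-inferior"
    return "non-inferiority not shown"


def meets_clinical_threshold(lower: float, threshold: float) -> bool:
    """More conservative check: the entire CI meets or exceeds a
    clinically meaningful difference chosen by the reader."""
    return lower >= threshold


# Hypothetical CIs with an assumed margin of 2 units (illustrative only).
print(classify_ci(0.5, 3.0, delta=2.0))
print(classify_ci(-1.0, 2.5, delta=2.0))
print(classify_ci(-3.0, 1.0, delta=2.0))
print(meets_clinical_threshold(0.5, threshold=1.0))
```

For ratio measures such as relative risk, the line of no difference would be 1 rather than 0 and the margin would be expressed on the ratio scale; this sketch does not cover that case.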

Is it acceptable to claim non-inferiority in a superiority trial? It depends. This area is controversial, with some saying no and some saying it depends. However, there is agreement amongst those on the “it depends” side that it generally should not be done due to validity issues as described above.

US Department of Health and Human Services, Food and Drug Administration: Guidance for Industry Non-Inferiority Clinical Trials (DRAFT). 2010. Guidances/UCM202140.pdf

European Agency for the Evaluation of Medicinal Products Committee for Proprietary Medicinal Products (CPMP): Points to Consider on Switching Between Superiority and Non-Inferiority. 2000.

Involving Patients in Their Care Decisions and JAMA Editorial: The New Cholesterol and Blood Pressure Guidelines: Perspective on the Path Forward


Krumholz HM. The New Cholesterol and Blood Pressure Guidelines: Perspective on the Path Forward. JAMA. 2014 Mar 29. doi: 10.1001/jama.2014.2634. [Epub ahead of print] PubMed PMID: 24682222.

Here is an excellent editorial that highlights the importance of patient decision-making.  We thank the wonderful Dr. Richard Lehman, MA, BM, BCh, Oxford, & Blogger, BMJ Journal Watch, for bringing this to our attention. [Note: Richard’s wonderful weekly review of medical journals—informative, inspiring and oh so droll—is here.]

We have often observed that evidence can be a neutralizing force. This editorial highlights for us that this means involving the patient in a meaningful way and finding ways to support decisions based on patients’ personal requirements. These personal “patient requirements” include health care needs and wants and a recognition of individual circumstances, values and preferences.

To achieve this, we believe that patients should receive the same information as clinicians, including what alternatives are available, a quantified assessment of the potential benefits and harms of each (including the strength of evidence for each), and the potential consequences of making various choices, including things like vitality and cost.

Decisions may differ between patients, and physicians may make incorrect assumptions about what matters most to patients. There are many examples of this in the literature, such as in the citations below.

O’Connor A. Using patient decision aids to promote evidence-based decision making. ACP J Club. 2001 Jul-Aug;135(1):A11-2. PubMed PMID: 11471526.

O’Connor AM, Wennberg JE, Legare F, Llewellyn-Thomas HA, Moulton BW, Sepucha KR, et al. Toward the ‘tipping point’: decision aids and informed patient choice. Health Affairs 2007;26(3):716-25.

Rothwell PM. External validity of randomised controlled trials: “to whom do the results of this trial apply?”. Lancet. 2005 Jan 1-7;365(9453):82-93. PubMed PMID: 15639683.

Stacey D, Bennett CL, Barry MJ, Col NF, Eden KB, Holmes-Rovner M, Llewellyn-Thomas H, Lyddiatt A, Légaré F, Thomson R. Decision aids for people facing health treatment or screening decisions. Cochrane Database Syst Rev. 2011 Oct 5;(10):CD001431. Review. PubMed PMID: 21975733.

Wennberg JE, O’Connor AM, Collins ED, Weinstein JN. Extending the P4P agenda, part 1: how Medicare can improve patient decision making and reduce unnecessary care. Health Aff (Millwood). 2007 Nov-Dec;26(6):1564-74. PubMed PMID: 17978377.

Estimating Relative Risk Reduction from Odds Ratios


Odds are hard to work with because they are the likelihood of an event occurring compared to not occurring—e.g., odds of two to one mean that likelihood of an event occurring is twice that of not occurring. Contrast this with probability which is simply the likelihood of an event occurring.
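
As a small illustration of the distinction, the conversion between odds and probability can be written out as below. This is a minimal sketch; the function names are hypothetical and are not part of any library.

```python
def probability_to_odds(p: float) -> float:
    """Odds = likelihood of occurring divided by likelihood of not occurring."""
    if not 0 <= p < 1:
        raise ValueError("p must be in the range [0, 1)")
    return p / (1 - p)


def odds_to_probability(odds: float) -> float:
    """Inverse conversion: probability = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


# Odds of two to one correspond to a probability of two in three.
print(odds_to_probability(2.0))
# A low probability and its odds are close in value.
print(probability_to_odds(0.01))
```

The second call shows why odds and probabilities are nearly interchangeable only when events are uncommon.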

An odds ratio (OR) is a point estimate used for case-control studies which attempts to quantify a mathematical relationship between an exposure and a health outcome. Odds must be used in case-control studies because the investigator arbitrarily controls the population; therefore, probability cannot be determined because the disease rates in the study population cannot be known. The odds that a case is exposed to a certain variable are divided by the odds that a control is exposed to that same variable.

Odds are often used in other types of studies as well, such as meta-analysis, because of various properties of odds which make them easy to use mathematically. However, increasingly authors are discouraged from computing odds ratios in secondary studies because of the difficulty translating what this actually means in terms of size of benefits or harms to patients.

Readers frequently attempt to deal with this by converting the odds ratio into relative risk reduction by thinking of the odds ratio as similar to relative risk. Relative risk reduction (RRR) is computed from relative risk (RR) by simply subtracting the relative risk from one and expressing that outcome as a percentage (1-RR).

Some experts advise readers that this is safe to do if the prevalence of the event is low. While it is true that odds and probabilities of outcomes are usually similar if the event rate is low, when possible, we recommend calculating both the odds ratio reduction and the relative risk reduction in order to compare and determine if the difference is clinically meaningful. And determining if something is clinically meaningful is a judgment, and therefore whether a conversion of OR to RRR is distorted depends in part upon that judgment.

a = group 1 outcome occurred
b = group 1 outcome did not occur
c = group 2 outcome occurred
d = group 2 outcome did not occur

OR = (a/b)/(c/d)
Estimated RRR from OR (odds ratio reduction) = 1-OR

RR = (a/ group 1 n)/(c/ group 2 n)
RRR = 1-RR
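
The formulas above can be checked side by side with a short Python sketch. The function names and the two example tables are hypothetical and chosen only to contrast a low event rate with a high one; group n is taken to be a+b for group 1 and c+d for group 2, which follows from the cell definitions.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = (a/b)/(c/d), using the cell labels defined in the text."""
    return (a / b) / (c / d)


def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """RR = (a / group 1 n)/(c / group 2 n), with n taken as a+b and c+d."""
    return (a / (a + b)) / (c / (c + d))


def compare(a: int, b: int, c: int, d: int) -> dict:
    """Return the odds ratio reduction (1-OR) next to the RRR (1-RR)."""
    return {
        "odds_ratio_reduction": 1 - odds_ratio(a, b, c, d),
        "relative_risk_reduction": 1 - relative_risk(a, b, c, d),
    }


# Hypothetical low event rate: 1% in group 1 versus 2% in group 2.
low = compare(a=1, b=99, c=2, d=98)
# Hypothetical high event rate: 40% in group 1 versus 80% in group 2.
high = compare(a=40, b=60, c=80, d=20)

for label, result in (("low event rate", low), ("high event rate", high)):
    print(label, {k: round(v, 3) for k, v in result.items()})
```

In both tables the relative risk is 0.5, but the odds ratio reduction stays close to the RRR only in the low event rate table. Whether the gap is large enough to matter remains a clinical judgment.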



More on Attrition Bias: Update on Missing Data Points: Difference or No Difference — Does it Matter?

Attrition Bias Update 01/14/2014: Missing Data Points: Difference or No Difference — Does it Matter?

A colleague recently wrote us to ask us more about attrition bias. We shared with him that the short answer is that there is less conclusive research on attrition bias than on other key biases. Attrition does not necessarily mean that attrition bias is present and distorting statistically significant results. Attrition may simply result in a smaller sample size which, depending upon how small the remaining population is, may be more prone to chance due to outliers or false non-significant findings due to lack of power.

If randomization successfully results in balanced groups, if blinding is successful including concealed allocation of patients to their study groups, if adherence is high, if protocol deviations are balanced and low, if co-interventions are balanced, if censoring rules are used which are unbiased, and if there are no differences between the groups except for the interventions studied, then it may be reasonable to conclude that attrition bias is not present even if attrition rates are large. Balanced baseline comparisons between completers provide further support for such a conclusion, as does comparability in reasons for discontinuation, especially if many categories are reported.

On the other hand, other biases may result in attrition bias. For example, imagine a comparison of an active agent to a placebo in a situation in which blinding is not successful. A physician might encourage his or her patient to drop out of a study if they know the patient is on placebo, resulting in biased attrition that, in sufficient numbers, would potentially distort the results from what they would otherwise have been.

Demo: Critical Appraisal of a Randomized Controlled Trial


We recently had a great opportunity to listen to a live demonstration of a critical appraisal of a randomized controlled trial conducted by Dr. Brian Alper, Founder of DynaMed; Vice President of EBM Research and Development, Quality & Standards at EBSCO Information Services.

Dr. Alper is extremely knowledgeable about critical appraisal and does an outstanding job clearly describing key issues concerning his selected study for review. We are fortunate to have permission to share the recorded webinar with you.

“Learn How to Critically Appraise a Randomized Trial with Brian S. Alper, MD, MSPH, FAAFP”

Below are details of how to access the study that was used in the demo and how to access the webinar itself.

The Study
The study used for the demonstration is Primary Prevention Of Cardiovascular Disease with a Mediterranean Diet.  Full citation is here—

Estruch R, Ros E, Salas-Salvadó J, Covas MI, Corella D, Arós F, Gómez-Gracia E, Ruiz-Gutiérrez V, Fiol M, Lapetra J, Lamuela-Raventos RM, Serra-Majem L, Pintó X, Basora J, Muñoz MA, Sorlí JV, Martínez JA, Martínez-González MA; PREDIMED Study Investigators. Primary prevention of cardiovascular disease with a Mediterranean diet. N Engl J Med. 2013 Apr 4;368(14):1279-90. doi: 10.1056/NEJMoa1200303. Epub 2013 Feb 25. PubMed PMID: 23432189.

Access to the study for the critical appraisal demo is available here:

The Webinar: 1 Hour

For those of you who have the ability to play WebEx files or can download the software to do so, the webinar can be accessed here—

Important: It takes about 60 seconds before the webinar starts. (Be sure your sound is on.)

More Chances to Learn about Critical Appraisal

There is a wealth of freely available information to help you both learn and accomplish critical appraisal tasks as well as other evidence-based quality improvement activities. Our website is We also have a little book available for purchase for which we are getting rave reviews and which is now being used to train medical and pharmacy residents and is being used in medical, pharmacy and nursing schools.

Delfini Evidence-based Practice Series Guide Book

Basics for Evaluating Medical Research Studies: A Simplified Approach (And Why Your Patients Need You to Know This)

Find our book at— or on our website at (see Books).

Delfini Recommends DynaMed™

We highly recommend DynaMed.  Although we urge readers to be aware that there is variation in all medical information sources, as members of the DynaMed editorial board (unpaid), we have opportunity to participate in establishing review criteria as well as getting a closer look into methods, staff skills, review outcomes, etc., and we think that DynaMed is a great resource. Depending upon our clinical question and project, DynaMed is often our starting point.

About DynaMed™ from the DynaMed Website

DynaMed™ is a clinical reference tool created by physicians for physicians and other health care professionals for use at the point-of-care. With clinically-organized summaries for more than 3,200 topics, DynaMed provides the latest content and resources with validity, relevance and convenience, making DynaMed an indispensable resource for answering most clinical questions during practice.

Updated daily, DynaMed editors monitor the content of over 500 medical journals on a daily basis. Each article is evaluated for clinical relevance and scientific validity. The new evidence is then integrated with existing content, and overall conclusions are changed as appropriate, representing a synthesis of the best available evidence. Through this process of Systematic Literature Surveillance, the best available evidence determines the content of DynaMed.

Who Uses DynaMed

DynaMed is used in hospitals, medical schools, residency programs, group practices and by individual clinicians supporting physicians, physician assistants, nurses, nurse practitioners, pharmacists, physical therapists, medical researchers, students, teachers and numerous other health care professionals at the point-of-care.


G-I-N Webinar: Guideline Development & Evidence-based Quality Improvement


Guidelines International Network Webinar: How to Develop Guidelines Within the Context of a Clinical Quality Improvement Program

Thanks to the Guidelines International Network, a webinar we did for them is available online.  To access the recording and slide show presentation, go to—

For information about the case study we showcased for our presentation, go to—

Critical Appraisal Tool for Clinical Guidelines & Other Secondary Sources


Everything citing medical science should be appraised for validity and clinical usefulness. That includes clinical guidelines and other secondary sources. Our tool for evaluating these resources— the Delfini QI Project Appraisal Tool—has been updated and is available in the Delfini Tools & Educational Library at  For quick access to the PDF version, go to—


When Is a Measure of Outcomes Like a Coupon for a Diamond Necklace?


For those of you who struggle with the fundamental difference between absolute risk reduction (ARR) versus relative risk reduction (RRR) and their counterparts, absolute and relative risk increase (ARI/RRI), we have always explained that only knowing the RRR or the RRI without other quantitative information about the frequency of events is akin to knowing that a store is having a half-off sale—but when you walk in, you find that they aren’t posting the actual price!  And so your question is 50 percent off of what???

You should have the same question greet you whenever you are provided with a relative measure (and if you aren’t told whether the measure is relative or absolute, you may be safer assuming that it is relative). Below is a link to a great short cartoon that turns the lens a little differently and which might help.

However, we will add that, in our opinion, ARR alone isn’t fully informative either, nor is its kin, the number-needed-to-treat or NNT, and for ARI, the number-needed-to-harm or NNH. A 5 percent reduction in risk may be perceived very differently when “10 people out of a hundred benefit with one intervention compared to 5 with placebo” as compared to a different scenario in which “95 people out of a hundred benefit with one intervention as compared to 90 with placebo.” As a patient, I might be less likely to want to expose myself to side effects if it is highly likely I am going to improve without treatment, for example. Providing this full information, for critically appraised studies that are deemed to be valid, may best provide patients with information that helps them make choices based on their own needs and requirements, including their values and preferences.

We think that anyone involved in health care decision-making—including the patient—is best helped by knowing the event rates for each of the groups studied—i.e., the numerators and denominators for the outcome of interest by group which comprise the 4 numbers that make up the 2 by 2 table which is used to calculate many statistics.
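
To make the point concrete, the two scenarios quoted above can be run through a short Python sketch. The function name `two_by_two_summary` and the dictionary keys are hypothetical. The outcome counted here is benefit rather than harm, so the relative measure is a ratio of benefit rates, and the number-needed-to-treat is taken as one divided by the absolute difference.

```python
def two_by_two_summary(events_1: int, n_1: int, events_2: int, n_2: int) -> dict:
    """Summarize a 2 by 2 table from its numerators and denominators."""
    if n_1 <= 0 or n_2 <= 0:
        raise ValueError("group sizes must be positive")
    if events_2 == 0:
        raise ValueError("relative measure undefined when group 2 has no events")
    rate_1 = events_1 / n_1
    rate_2 = events_2 / n_2
    difference = rate_1 - rate_2
    return {
        "rate_1": rate_1,
        "rate_2": rate_2,
        "absolute_difference": difference,
        "relative_ratio": rate_1 / rate_2,
        # One divided by the absolute difference; infinite if no difference.
        "number_needed_to_treat": (
            1 / abs(difference) if difference != 0 else float("inf")
        ),
    }


# Scenario A: 10 of 100 benefit with the intervention, 5 of 100 with placebo.
scenario_a = two_by_two_summary(10, 100, 5, 100)
# Scenario B: 95 of 100 benefit with the intervention, 90 of 100 with placebo.
scenario_b = two_by_two_summary(95, 100, 90, 100)

for name, s in (("A", scenario_a), ("B", scenario_b)):
    print(name, {k: round(v, 3) for k, v in s.items()})
```

Both scenarios share the same absolute difference and number-needed-to-treat, yet the underlying event rates and the relative measure differ sharply, which is why the four numbers of the table are the most informative starting point.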

Isn’t it great when learning can be fun too!  Enjoy!

Our Current Thinking About Attrition Bias


Delfini Thoughts on Attrition Bias

Significant attrition, whether it be due to loss of patients or discontinuation or some other reason, is a reality of many clinical trials. And, of course, the key question in any study is whether attrition significantly distorted the study results. We’ve spent a lot of time researching the evidence-on-the-evidence and have found that many researchers, biostatisticians and others struggle with this area—there appears to be no clear agreement in the clinical research community about how to best address these issues. There also is inconsistent evidence on the effects of attrition on study results.

We, therefore, believe that studies should be evaluated on a case-by-case basis and doing so often requires sleuthing and sifting through clues along with critically thinking through the unique circumstances of the study.

The key question is, “Given that attrition has occurred, are the study results likely to be true?” It is important to look at the contextual elements of the study. These contextual elements may include information about the population characteristics, potential effects of the intervention and comparator, the outcomes studied and whether patterns emerge, timing and setting. It is also important to look at the reasons for discontinuation and loss-to-follow up and to look at what data is missing and why to assess likely impact on results.

Attrition may or may not impact study outcomes depending, in part, upon the reasons for withdrawals, censoring rules and the resulting effects of applying those rules, for example. However, differential attrition issues should be looked at especially closely. Unintended differences between groups are more likely to happen when patients have not been allocated to their groups in a blinded fashion, groups are not balanced at the onset of the study and/or the study is not effectively blinded or an effect of the treatment has caused the attrition.

One piece of the puzzle, at times, may be whether prognostic characteristics remained balanced. Authors could help us all out tremendously by assessing comparability between baseline characteristics at randomization and for those analyzed. However, an imbalance may be an important clue too because it might be informative about efficacy or side effects of the agent under study.

In general, we think it is important to attempt to answer the following questions:

Examining the contextual elements of a given study—

  • What could explain the results if it is not the case that the reported findings are true?
  • What conditions would have to be present for an opposing set of results (equivalence or inferiority) to be true instead of the study findings?
  • Were those conditions met?
  • If these conditions were not met, is there any reason to believe that the estimate of effect (size of the difference) between groups is not likely to be true?