Evidence Based Health Care: Glossary

About PMID Numbers: We frequently utilize a PMID number in place of a citation. Where PMID numbers are available, enter that number into the PubMed search box to retrieve that citation and listing.

Absolute risk reduction (ARR) – A measure of outcomes which reflects the actual percent difference in study outcomes between groups.
Formula: % in comparison group - % in study group
Example: Mortality in control group 8%; mortality in intervention group 5%; ARR=8%-5%=3%
Similar concept: Absolute risk increase (ARI).
Related: Measures of outcomes
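
The formula and example above can be checked with a few lines of Python. This is a minimal sketch; the function name absolute_risk_reduction is an illustrative assumption and is not part of this glossary.

```python
def absolute_risk_reduction(comparison_rate, study_rate):
    """Return ARR: event rate in the comparison group minus the study group.

    Rates are proportions (0.08 means 8%). A negative result corresponds
    to an absolute risk increase (ARI).
    """
    return comparison_rate - study_rate


# Glossary example: mortality of 8% in the control group, 5% with the intervention.
arr = absolute_risk_reduction(0.08, 0.05)
print(f"ARR = {arr:.0%}")  # ARR = 3%
```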

Accuracy – The ability to correctly identify that which one intends to identify.

Active agent — Active drug (i.e., not the placebo).

Adjustment - Statistical means for minimizing differences in study and comparison groups (e.g., standardization and regression analysis).

Adverse event (ADE) – see Safety

Aim, study — see Objectives

Allocation concealment — see Concealment of allocation

All-or-none results - Very large results between groups or before-and-after results—for example, observed outcomes in which, before the intervention, nearly everyone dies, but following use of the intervention, nearly everyone survives.

There is general agreement that causality MAY be present when there is dramatic change following application of an intervention or technology that is unlikely to be due to confounding (i.e., close to all-or-none results: example, before treatment all died and following treatment, high survival rate). Examples of all-or-none results are antibiotics for meningitis and electrical cardioversion for ventricular fibrillation.

Alpha error — see Error

Alpha spending - see Multiplicity adjustment or testing

A priori – In advance. For example, issues such as study questions, outcome measures, subgroups for analysis and p-values should be determined in advance of the actual study (i.e., determined a priori). Reduces risk of chance findings for prespecified outcomes.

Ascertainment bias - See bias

Assessment bias - See bias

Association – Statistical relationship between two or more events, characteristics or other variables. Example: weight is related to height.

Attrition bias - See bias

Baseline characteristics — Key characteristics of interest at the start of a study which are potential prognostic factors. These usually include demographic information such as age, gender, race and other characteristics related to the area under study such as disease severity, baseline measurements for clinical items of interest, etc. Collecting and reporting on baseline characteristics helps to establish how similar the study groups are, describes the actual population studied and helps to measure improvements during the course of the study.

Benchmark – A standard or point of reference used in measuring and/or judging something such as quality or value.

Beta error - see Error

Bias – Study processes or events which may result in, or lead to, conclusions differing from truth in a systematic way (meaning not due to chance). Bias may occur at various study stages such as assigning subjects to study or comparison groups, intervention or exposure, performance, provision of services or conduct of processes affecting subjects, data collection, subject follow-up, measurement, analysis, interpretation and/or publication of data. Bias frequently occurs as a result of some inequality between the study and the comparison group. The inequality may be important because it may affect the study outcome.

Key Biases Which May Occur at Various Study Stages

Selection bias – A bias occurring in subject identification, selection and/or assignment. This includes dissimilarity between groups.

Observation bias – Bias that may have been introduced through study procedures during implementation, intervention or exposure, follow-up or assessment. Specific biases in this category primarily consist of –

Performance bias – Threats to validity, post-randomization, arising from study activities involving subjects such as interventions, processes or procedures. Frequently this is a bias that can result when there are differences between groups other than the intervention under study.
Synonym: Intervention bias

Attrition bias – A bias occurring as a result of subjects lost to follow-up through withdrawals or other study attrition.
Synonym: Follow-up bias

Assessment bias – A bias occurring in the way outcomes are assessed.
Synonyms: Ascertainment bias, Detection bias, Measurement bias

Bivariate analysis - Bivariate analysis involves the analysis of two variables to show relationships between those two variables. Data from two variables are plotted on a joint distribution graph, and then correlation analysis and simple linear regression analysis are performed to assess how closely the variables are related.
Related: Pearson correlation co-efficient

Blinding – An important study procedure in which certain information is kept secret, such as which treatment is the active drug and which is the placebo. Bias can result when study subjects and those involved in study procedures know the treatment assignment of individual subjects. Blinding is a method to help avoid the introduction of this kind of bias. Double-blinding means that neither the patient nor the persons performing the intervention or exposure, or working with the subjects' data, know whether the patient is in the study group or the comparison group.
Synonym: Masking
Related: Concealment of allocation, Double-blinding, Double-dummy, Encapsulation

Case — Research subject who is a member of the group possessing the characteristic under study such as an exposure, risk factor, outcome or receiving the intervention in an experimental study.

Case-control study – An observational study in which subjects with outcomes of interest (cases) are compared to those without the outcome (controls). Histories are examined to attempt to determine exposures.

Case series – A group of patients receives an intervention and outcomes are assessed. There is no comparison group. Case series is almost never “evidence.” All-or-none results that are likely to be true are required, and all-or-none results are rare.

Case study – An observation of one person. Also referred to as a case report. A case study is not evidence.

Censoring - Censoring is the practice of removing the patient from the curve at a specific point in time. Examples of censoring: 1) Patients who don’t experience the event (administrative, or right, censoring, which is appropriate), 2) Other reasons determined by the investigators and called “censoring rules” (non-administrative censoring such as lost to follow-up or dying before a non-mortality outcome of interest is reached), and which must be critically appraised.

Chance — Research results happening by "accident," meaning that they did not happen because of truth or because of bias. Likelihood is assessed by using appropriate statistical tests. To critically appraise, examine p-values and confidence intervals with associated point-estimates (none of which can address bias, which is a systematic error - meaning an error not due to chance).
Synonyms: Random error, Sampling error
Related: Bias

Class effect - Class effect refers to a concept that certain drugs share sufficient similarities that they can be thought of as a class. It is a determination that a set of agents with similar chemical structures, mechanisms of action and pharmacological effects has similar therapeutic and adverse effects. There are no universally accepted criteria for defining class effect. (For all intents and purposes, class effect treats the agents as if they are the same clinically, but some other factors, such as cost, may vary.)

Clinical practice guidelines – Clinical practice guidelines are systematically developed statements to assist practitioners and patients in choosing appropriate healthcare for specific conditions. They should be based on valid and useful evidence, and they should be critically appraised before adoption.

Clinical significance — Research should benefit patients in ways that are clinically significant (i.e., that matter to them). Clinically significant areas are morbidity, mortality, symptom relief, functioning and health-related quality of life. Results from valid research in these areas should be large enough and useful enough to provide meaningful clinical benefit.

Clinical trial — A research methodology which evaluates a drug or other intervention to assess safety and efficacy by evaluating its effect on a group of subjects. The subjects are divided into two or more groups with the study group receiving the drug or intervention and the comparison group or groups receiving a comparison drug, placebo or intervention.

Cohort study – An observational study in which exposure occurs in the study group and not in the comparison group. Exposure is not introduced by the investigator, but happens through other means such as natural occurrence or individual choice. Groups are followed over time and outcomes are compared. Cohort studies can be done prospectively where a group of interest and a comparison group are followed forward in time. Cohort studies can be done retrospectively when the group of interest and comparison group are identified from the past, based on an exposure or characteristic, and then information about them is reviewed from that time forward to assess development of outcomes.

Comparison groups — Subjects in groups being compared to the group of interest (e.g., intervention group would be “group of interest” and placebo group would be the “comparison group”).

Composite endpoints — Refers to an endpoint in which single endpoints are grouped together to form one endpoint such as “cardiovascular events.”

Concealment of allocation – The process used in a randomized controlled trial to hide the assignment to a study group. The purpose is to ensure that no one can influence or control which study subject gets assigned to which group (e.g., assignment made through a call center).
Related: Blinding

Confidence interval (CI) – CIs represent a range of statistically plausible results consistent with an outcome from a single study. Confidence intervals have some practical limitations similar to P-values (see P-values). Although the CIs can project a range of results consistent with the study results, they cannot tell you the truth of the outcomes. CIs cannot replace the need to critically appraise the study.

Our approach:

Despite their limitations, we believe confidence intervals to be more informative than P-values. We approach them as providing a possible range of plausible results for the larger population IF the study results in the studied population are true.

For valid studies, we make a judgment for what we consider to be a reasonable range for clinical significance—this need not be hard and fast. For statistically significant findings, is the confidence interval wholly within bounds for clinical significance? For non-significant findings, is the confidence interval wholly beneath your limit for clinical significance? A yes to these two questions means likely conclusive findings for valid studies. No, means findings are inconclusive.

Reporting styles vary. Our favorite is below.
Example: ARR 5%, 95% CI (3% to 7%)
(Avoid using "-" when separating the interval numbers, which can be confused with a minus sign.)
Related: P-value
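
As an illustration of the reporting style above, the following sketch computes an ARR with a 95% confidence interval. It assumes a normal-approximation (Wald) interval for the difference between two proportions, which is only one of several possible methods. The event counts, the function name and the clinical significance limit are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def arr_confidence_interval(control_events, control_n,
                            treated_events, treated_n, level=0.95):
    """Return (ARR, lower, upper) using a normal-approximation (Wald) interval."""
    p_control = control_events / control_n
    p_treated = treated_events / treated_n
    arr = p_control - p_treated
    # Standard error of a difference between two independent proportions.
    se = sqrt(p_control * (1 - p_control) / control_n
              + p_treated * (1 - p_treated) / treated_n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return arr, arr - z * se, arr + z * se


# Hypothetical counts: 150 of 1000 events in the control group, 100 of 1000 treated.
arr, low, high = arr_confidence_interval(150, 1000, 100, 1000)
print(f"ARR {arr:.1%}, 95% CI ({low:.1%} to {high:.1%})")
# ARR 5.0%, 95% CI (2.1% to 7.9%)

# Hypothetical smallest ARR judged to be clinically significant.
minimum_important_arr = 0.01
print(low >= minimum_important_arr)  # True: the whole interval clears the limit
```

The final check mirrors the question asked above for statistically significant findings: is the confidence interval wholly within bounds for clinical significance? It says nothing about the validity of the study, which still has to be critically appraised.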

Confounding – A special type of bias in which another factor associated with the study variable of interest may have "traveled" with that variable and either is the true reason for the study conclusion, instead of the variable under study, or also affects the study results. Example: HRT users were found to have lower second MI rates; however, the results were confounded by HRT users living healthier lives. There are known confounders and unknown confounders. Randomization is a method which attempts to minimize confounding by randomly allocating subjects to their groups in hopes that any potential confounders are equally distributed between the groups.

Contamination – Any member of a comparison group receiving the intervention under study or being exposed to the variable of interest.

Control - Research subject who is a member of the comparison group.

Control Event Rate (CER) - The proportion of patients in a comparison (control) group in whom an event is observed.

Controlled clinical trial – An experiment in which investigators include a control group which is used as a comparison group. The comparison group may receive another treatment, a placebo or “usual care”. Subjects are assigned to the study or to the control group. Intervention or exposure is then performed. Patients are followed to determine outcomes such as improvements or harms.
Related: Clinical trial, Randomized controlled trial (RCT)

Correlation - A relationship between variables such that, when the two variables are shown in a graphic display, a change in one variable is seen to accompany a change in the other. The changes cannot be assumed to be causal. Example: in humans, weight is positively correlated with height.

Covariate - A variable that might affect, or possibly predict, an outcome.

Critical appraisal – A scientific evaluation of evidence (e.g., research data) to appraise validity (closeness to truth) and usefulness (e.g., generalizability to one's own patients or circumstances, meaningful benefit, etc).

Cross-over design - A study design in which patients start out having one intervention and then are “crossed over” to a different intervention.

Crossover trial – A comparison of two or more interventions in which the participants receive one treatment and then are “crossed over” to another intervention. Care must be taken with this study design because effects of the first treatment could contaminate the effects of the subsequent intervention.
Related: Contamination

Cross-sectional study – An observational study in which a sample from a larger population is examined, at one single point in time – like a snapshot - for prevalence of an exposure or characteristic along with an outcome of interest. Cross-sectional studies are also frequently used to evaluate diagnostic testing.

Database research — Statistical relationships between two or more variables are assessed from databases. Frequently database research is used to report outcome measures. Database research, however, is a type of observation and, like all observational research, is highly prone to selection bias, observation bias and confounding — and the likelihood of finding "statistically significant" relationships merely due to chance is high to certain, depending upon the way the analysis is conducted. Definite cause and effect conclusions should not be drawn from database research.

Data gathering validity – Data gathering validity refers to the methods used to obtain numerators and denominators during the data gathering phase of performance measurement.

Delta — See Equivalence trials and Non-inferiority trials

Denominator – The lower portion of a fraction used to calculate a rate or ratio. In a rate, the denominator is usually the population (or population experience, as in person-years, etc.) at risk.
Related: Numerator

Denominator – For performance measurement, the population “at risk” for experiencing the event or occurrence described in the numerator — the "pool."

Dependent variable – In a statistical analysis, the outcome variables under study.
Related: Independent variable, variables

Detection bias - See bias

Diagnostic testing — see Measures of test function

Disease spectrum — A range of symptoms, signs, lab results and results of other diagnostic tests, and rate of disease progression, response to therapy and disease severity.

Double-blind – see Blinding

Double-dummy treatment design — A treatment design involving two placebos. Control subjects receive one or the other of two placebos, but not both. Study subjects receive the intervention drug. For example, if you were studying an antibiotic being given at two different doses, you might need two placebos because of the weight and size difference of the doses.

Drug utilization review (DUR): A system for evaluating appropriateness of drug therapy. Prescribing patterns are evaluated to determine if drugs are being misprescribed, possibly resulting in problems with safety or effectiveness.

Effect size - The size of the difference in outcomes between groups.

Effectiveness – The extent to which a given intervention is likely to produce beneficial results for which it is intended in ordinary circumstances.
Related: efficacy

Efficacy – The extent to which a given intervention is likely to produce beneficial effects in the context of the research study.
Related: Effectiveness

Empirical – Empirical results are those which are based on experience or observation.

Encapsulation — A drug blinding method whereby drugs (active or inert) are crushed and put into gelatin capsules to disguise the active agent as compared to the placebo.
Related: Blinding

Endpoint — see Outcome measure

Epidemiology – A study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to the control of health problems.

Equipoise - In healthcare, a state in which it is unknown which interventions will result in best outcomes.

Equipotency - See Class effect

Equivalence trial - Equivalence trials are usually used to demonstrate that the effects of two treatments do not vary more than a prespecified clinically acceptable amount and can therefore be considered clinically equivalent. Delta is the name given to the range of results within which results are judged to be equivalent.

Error —

Type 1 - or alpha error - A difference is reported, but there is no difference. This can be due to bias, confounding or chance.
Related: Statistical significance
Type 2 - or beta error - No difference is reported, but there is a difference. This can be due to an insufficient number of people studied.
Related: Power

Estimate of effect – see Measures of outcomes

Event rate – see Control Event Rate

Evidence-based medicine (EBM) – Delfini definition: "The use of the scientific method and application of valid and useful science to inform health care provision, practice, evaluation and decisions."

Experimental study – A study in which the investigator controls an exposure or intervention, then follows subjects to compare outcomes between research subjects who are exposed or who receive the intervention and those who are not exposed or who do not receive the intervention. Tip: If an intervention is "assigned" through the research, it is an experiment. If it is chosen, then the study type is observational.
Related: Observational study

Exposed group – A group whose members have been exposed to a supposed cause of disease or health state of interest, or possess a characteristic that is a determinant of the health outcome of interest.

External validity – Whether a study's results are generalizable either to a patient population (see Population bias) or to "real world" circumstances (see Intensity bias).
Synonym: Generalizability
Related: Validity, Internal validity, Population bias, Intensity bias

First-line therapy: The drugs to be utilized first in treating a patient.

Follow-up bias - See bias

Foraging tools - Tools that assist with keeping up-to-date with new information.
Related: Hunting tools

Forest Plot - A graphic representation of the results of individual trials in a meta-analysis along with a summary diamond. Horizontal lines represent trials; a vertical line represents the line of no difference. The graphic display is useful in evaluating and summing numerous trials at one time.

Formulary: List of therapeutic agents available in a particular practice for caring for patients. The term “preferred drug list” is also used.

An open formulary may have few or no restrictions.
A closed formulary has restrictions.

Formulary system: A system that provides for the processes for establishing and managing the formulary.

Generalizability – see External validity

Generic substitution: Replacement of one agent with a different agent having the same chemical structure. This may be done when the patent on a brand-name drug expires. Bioequivalence is frequently assumed (i.e., it is assumed that the generic agent is equivalent to the brand-name drug). In some cases the effects of other components of the generic preparation (e.g., the vehicle in a dermatological preparation) may vary and result in outcomes that differ from those reported for the brand-name agent.

Gold standard – The intervention generally believed to be the best available against which new interventions should be compared.

Halo effect - How an observer's perception, knowledge or recollection, may bias results. Also used as a synonym for Placebo effect when medical attention and services can bias results.

Hawthorne effect - How the knowledge of being in a study can influence a subject's behavior (usually favorably) which can bias results.

Hazard - An incidence rate in a survival curve.
See Survival Analysis

Hazard rate - A measure of how rapidly subjects are experiencing the endpoint in a survival analysis

Hazard ratio - Calculated using the Cox proportional hazards model, the hazard ratio approximates the relative risk in the intervention group compared to the control group in a Kaplan-Meier model. It is assumed to remain constant over time, which is often not a valid assumption.

Heterogeneity - In systematic reviews, incompatibility between trials included in the review.

Hunting tools - Tools which aid the process for searching to answer a clinical question.
Related: Foraging tools

Hypothesis — Tentative explanation that forms the basis of a research study.
Related: Null hypothesis

Imputation – As in "imputation of missing variables." Principles of Intention-to-Treat (ITT) analysis require analyzing all patients in the groups to which they were assigned. Investigators are to "impute" or assign outcomes for missing data points. Example: a worst case scenario in which study subjects with missing outcomes are assigned as "treatment failures," and comparison subjects with missing outcomes are assigned as "successes." Frequently "last-observation-carried-forward" (LOCF) is used; however, this has been shown to be a method prone to bias. Imputing outcomes for missing data points is not a method for "determining the truth of what may have happened if we had no missing data," but rather a method to test the strength of the outcomes (i.e., statistically similar results to those reported) considering all the data points in such a way that does not favor the intervention. LOCF would be especially biased in patients with a progressive illness, for example, because patient outcomes would appear better than would be expected in a progressive illness. However, there are instances in which LOCF can help support claims of efficacy. See: http://www.delfini.org/delfiniClick_PrimaryStudies.htm#LOCFhelp
Related: Intention-to-treat
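
The worst case scenario described above can be sketched as follows. This is an illustration only: the counts and the function name worst_case_rate are hypothetical, and a real analysis would go on to test whether the difference between groups remains statistically significant after imputation.

```python
def worst_case_rate(successes, missing, randomized, missing_counts_as_success):
    """Return the success rate among all randomized subjects after imputation.

    Subjects with missing outcomes are imputed either all as successes or
    all as failures, so the denominator stays the number randomized (ITT).
    """
    if successes + missing > randomized:
        raise ValueError("successes plus missing cannot exceed the number randomized")
    imputed_successes = missing if missing_counts_as_success else 0
    return (successes + imputed_successes) / randomized


# Hypothetical trial: 100 subjects randomized to each group.
# Study group: 60 observed successes, 10 missing (imputed as failures).
# Comparison group: 45 observed successes, 8 missing (imputed as successes).
study_rate = worst_case_rate(60, 10, 100, missing_counts_as_success=False)
comparison_rate = worst_case_rate(45, 8, 100, missing_counts_as_success=True)
print(f"Worst case: {study_rate:.0%} vs {comparison_rate:.0%}")  # Worst case: 60% vs 53%
```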

Incidence – The proportion of new cases of the target disorder in the population at risk during a specified time interval.
Related: Incidence rate, Prevalence

Incidence rate (IR) – A measure of the frequency with which an event, such as a new case of illness, occurs in a population over a period of time. The denominator is the population at risk; the numerator is the number of new cases occurring during a given time period. Incidence rate is calculated by taking the number of events divided by the person-time at risk.

Incidence rate ratio (IRR) - The ratio of two incidence rates. The incidence rate is defined as the number of events divided by the person-time at risk. To calculate the IRR, divide the incidence rate in the exposed portion of the population by the incidence rate in the unexposed portion of the population. The result is a relative measure of the effect of a given exposure and approximates the relative risk or the odds ratio if the occurrences are rare.
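
A minimal sketch of the two calculations above, using hypothetical event counts and person-years. The function names are illustrative assumptions.

```python
def incidence_rate(events, person_time):
    """Return events divided by person-time at risk."""
    if person_time <= 0:
        raise ValueError("person_time must be positive")
    return events / person_time


def incidence_rate_ratio(exposed_events, exposed_person_time,
                         unexposed_events, unexposed_person_time):
    """Return the incidence rate in the exposed divided by that in the unexposed."""
    return (incidence_rate(exposed_events, exposed_person_time)
            / incidence_rate(unexposed_events, unexposed_person_time))


# Hypothetical data: 30 events in 10,000 person-years among the exposed,
# 15 events in 12,000 person-years among the unexposed.
ir_exposed = incidence_rate(30, 10_000)
ir_unexposed = incidence_rate(15, 12_000)
irr = incidence_rate_ratio(30, 10_000, 15, 12_000)

print(f"Exposed: {ir_exposed * 1000:.2f} per 1,000 person-years")      # 3.00
print(f"Unexposed: {ir_unexposed * 1000:.2f} per 1,000 person-years")  # 1.25
print(f"IRR = {irr:.2f}")                                              # IRR = 2.40
```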

Inconsistency statistic - The I² statistic is a test of heterogeneity. The range of I² values is between 0% and 100%. I² provides an estimate of the percentage of variability in results across studies that is likely due to true differences in treatment effect as opposed to chance. When the I² is 0%, chance provides a satisfactory explanation for the variability in the individual study point estimates, and clinicians can be comfortable with a single pooled estimate of treatment effect in a valid study. As the I² increases, bias becomes more likely. A rule of thumb characterizes an I² of less than 25% as small heterogeneity, 25% to 50% as moderate and more than 50% as large heterogeneity.
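
The glossary does not give a formula for the I-squared statistic. The sketch below assumes the commonly used calculation from Cochran's Q and its degrees of freedom (number of studies minus one), with negative values set to 0%, and then applies the rule of thumb quoted above. The Q values and function names are hypothetical.

```python
def i_squared(q, degrees_of_freedom):
    """Return I-squared as a percentage, assuming I2 = 100 * (Q - df) / Q.

    Values below zero are reported as 0%.
    """
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - degrees_of_freedom) / q)


def describe_heterogeneity(i2_percent):
    """Apply the rule of thumb: <25% small, 25% to 50% moderate, >50% large."""
    if i2_percent < 25:
        return "small"
    if i2_percent <= 50:
        return "moderate"
    return "large"


# Hypothetical meta-analysis of 10 trials (9 degrees of freedom) with Q = 20.
i2 = i_squared(20.0, 9)
print(i2, describe_heterogeneity(i2))  # 55.0 large

# When Q is smaller than its degrees of freedom, I-squared is reported as 0%.
print(i_squared(8.0, 9))  # 0.0
```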

Independent variable – An exposure, risk factor, or other characteristic being observed or measured that is hypothesized to influence an event or outcome (i.e., the dependent variable).
Related: Dependent variable, variables

Inference, statistical – In statistics, the development of generalizations from sample data, usually with calculated degrees of uncertainty.

Intensity bias - An external validity bias in which the circumstances of the study differ meaningfully from the circumstances to which study methods might otherwise be applied (e.g., "real world" settings as contrasted with the highly controlled environment often found in controlled trials).
Related: External validity

Intention-to-treat (ITT) – Analyzing results for all patients in the groups to which they were assigned whether or not they received or completed the intervention or experienced the exposure. The number randomized to each group should equal the number analyzed in each group — and they should be the same people.
Related: Imputation

Intermediate outcome markers – Outcome measures, such as a biologic factor (biomarker) or lab/imaging test, that are “assumed” to represent clinical outcomes (e.g., blood pressure used as a surrogate end point in studies of stroke).
Synonyms: Proxy markers, Surrogate markers, Surrogate end points

Internal validity - Closeness to truth within the context of the study (i.e., truth of the study not taking into account external validity). Assessing internal validity entails "ruling out" bias, confounding and chance as possible explanations for an observed association between an element of interest in a study and resulting outcomes.
Related: Validity, External validity, Bias, Confounding, Chance

Interventions — Includes screening, prevention and treatments. Only valid randomized controlled trials with useful results should be used to inform decisions about interventions in these areas.

Intervention bias - See bias

Kaplan-Meier methodology – The most commonly used survival analysis in healthcare.
Synonyms: Kaplan-Meier estimate, Kaplan-Meier model

Lead time bias - A bias resulting from a disease found through screening as compared to when it might otherwise have been detected. This kind of bias can result in a treatment seeming to be very effective if the lead time is long (e.g., “increased” survival time).
Related: Bias

Length time bias - A bias that can occur when certain characteristics or conditions under study differ in the speed of progression. This kind of bias can result in findings favoring screening. An example is tumors. Faster-growing tumors causing symptoms will be more likely to be found outside of screening. Slower-growing, asymptomatic tumors will have a longer duration and be less likely to be found outside of screening. Thus, screening will identify more slower-growing asymptomatic tumors which could then result in a conclusion that screening helps prevent mortality. This kind of bias is most likely to occur in screening studies and case-control studies where prevalence cases are included, rather than incidence cases because incidence cases are assumed to be new starts.
Related: Bias

Likelihood ratios (LR) – Likelihood ratios can be helpful for comparing one test to another, and results can help rule in or rule out a condition.
Related: Measures of test function, Positive likelihood ratio and Negative likelihood ratio.

Line of no difference – The point at which there is no greater benefit or risk one way or another (meaning the meeting point for "favors intervention" versus "favors placebo.")
Related Terms: Line of no effect, Infinity (NNT etc), Unity (ratios)

Point estimates expressed as percentages — the meeting line is at 0.
Relative risk reduction = 0 equals no difference
Absolute risk reduction = 0 equals no difference
Point estimates expressed as ratios — the meeting line is at 1.
Odds Ratio = 1 equals no difference.
Relative risk (also known as Risk ratio) = 1 equals no difference.
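
One practical use of the line of no difference is checking whether a confidence interval includes it. The sketch below is illustrative: the function name is an assumption, and the relative risk interval is hypothetical.

```python
def includes_no_difference(ci_low, ci_high, measure):
    """Return True if the interval contains the line of no difference.

    measure is "difference" for estimates such as ARR or RRR (line at 0)
    or "ratio" for estimates such as odds ratio or relative risk (line at 1).
    """
    lines = {"difference": 0.0, "ratio": 1.0}
    if measure not in lines:
        raise ValueError("measure must be 'difference' or 'ratio'")
    return ci_low <= lines[measure] <= ci_high


# ARR 5%, 95% CI (3% to 7%): the interval excludes 0.
print(includes_no_difference(0.03, 0.07, "difference"))  # False

# Hypothetical relative risk with 95% CI (0.80 to 1.10): the interval includes 1.
print(includes_no_difference(0.80, 1.10, "ratio"))  # True
```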

Masking - see Blinding

Matching – A mechanism to reduce selection bias by attempting to make a study group and a comparison group as comparable as possible by matching similar variables such as age and sex, etc.

Mean, arithmetic – The average.

Measure of association – see Measures of outcomes

Measurement bias - See bias

Measurement of risk – see Measures of outcomes

Measures of outcomes — Statistics that show the size of differences between the results from the study groups.
Synonyms and related terms: Measurement of association, Measurement of risk, Estimates of effect, Point estimates, Effect size, Treatment effect

Absolute risk reduction
Number needed to treat
Odds ratio
Relative risk
Relative risk reduction

Measures of test function — Measures to help determine the accuracy and usefulness of diagnostic tests.
Synonyms: Indices of accuracy
See individual definitions for the following:

Sensitivity
Specificity
Positive predictive value
Negative predictive value
Positive likelihood ratio
Negative likelihood ratio
Post-test odds
Post-test probabilities
Number-needed-to-diagnose

Median – The data midpoint.

MeSH - Medical Subject Headings: a list of synonyms or thesaurus of terms used by search databases to index and classify medical information.

Meta-analysis – A quantitative technique for summarizing results of more than one study using predetermined criteria. The goal is to provide a summary estimate of effect based on the scientific weight of the studies. Meta-analysis can be achieved through use of the results of individual studies or actually pooling data from those studies.
Related: Systematic Review.

Mode – The most frequently occurring value in a dataset.

Monograph - A written review and analysis, often of a single agent, containing a list of usage recommendations.

Multiplicity adjustment or multiplicity testing - Methods that attempt to reduce the higher risk of chance effects resulting from multiple analyses and multiple outcomes being tested by apportioning the p-value threshold for statistical significance. For example, if the threshold is 0.05 and there are 10 assessments, authors may require a p-value of 0.005 for statistical significance for each assessment, making it harder to achieve statistical significance. In other words, part of the threshold is "borrowed" for each assessment. This amounts to requiring lower p-values for the multiple assessments.

Rothman 99 makes some good points about this:

If the risk for rejecting a truly null hypothesis is, for example, 5% for every hypothesis examined then examining multiple hypotheses will generate a large number of errors simply because of the increasing number of hypotheses examined.
Adjusting for multiple comparisons is thought by many to result in a smaller probability of erroneously rejecting the null hypothesis, but is that so? Instead, adjusting for multiple comparisons might amount to paying a penalty (decreased power) for simply appropriately doing more comparisons and Rothman argues that we have no empirical justification for adjusting—we do not know whether a given association is an “unpredictable manifestation of random processes” or that we have evidence suggesting that we should “pay for peeking” at more data by adjusting P-values.
Rothman argues that the burden is on those who advocate for multiple comparison adjustments to show we have a problem requiring a statistical adjustment fix.
Rothman’s conclusion: It is reasonable to consider each association on its own for the information it conveys. [Rothman KJ PMID: 2081237]

Related term: Alpha-spending
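
The apportioning described in this entry can be sketched in a few lines. This is a minimal illustration that assumes an even split of the threshold across assessments (often called a Bonferroni-style correction; the entry does not name a specific method). The function name and the list of p-values are hypothetical.

```python
def per_assessment_threshold(overall_threshold: float, n_assessments: int) -> float:
    """Split an overall significance threshold evenly across assessments."""
    if n_assessments < 1:
        raise ValueError("n_assessments must be at least 1")
    return overall_threshold / n_assessments


# Example from the entry: threshold of 0.05 shared by 10 assessments.
threshold = per_assessment_threshold(0.05, 10)

# Hypothetical p-values, for illustration only.
observed_p_values = [0.001, 0.004, 0.02, 0.049]
significant = [p < threshold for p in observed_p_values]
```

Under the unadjusted threshold of 0.05 all four hypothetical p-values would count as significant; under the apportioned threshold only the first two do.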

Multivariable analysis - A statistical method for determining the specific contributions of various factors to a single outcome. This allows investigation of a single variable while controlling for the effect of other variables. Multivariable analysis determines the independent contribution of each independent variable to the dependent variable (e.g., development of coronary heart disease).

Methods include multiple linear regression, multiple logistic regression, and proportional hazards (Cox) regression

Linear regression is used with continuous outcomes
Logistic regression is used with dichotomous outcomes
Proportional hazards regression is used when the outcome is the length of time to reach a discrete event (such as time from baseline visit to death)

Multivariate analysis - A statistical method for determining the specific contributions of various factors to more than one outcome.

Narrative review: An article in the medical literature summarizing other studies for a given topic, characterized by a lack of a transparent, scientific, and systematic approach; thus, the summary is highly likely to be misleading. Instead, systematic reviews should be sought (and appraised).

Negative likelihood ratio – The number of times the percent of false negatives occurs over percent of true negatives.
Related: Measures of test function
Formula = (1-sensitivity) / specificity
This represents the change from pre-test odds to post-test odds. The size of a decrease is considered as follows:

Small = 0.2 to 0.5
Modest = 0.1 to 0.2
Large = < 0.1. Large is considered to rule out a condition.

Negative predictive value (NPV) – Of all testing negative, percent who do not have a condition, based on that population's prevalence.
Formula = d / (c + d) from two-by-two table
Related: Measures of test function, Two-by-two table

Network meta-analysis - A method of assessing comparative effectiveness of interventions that have not been directly compared in clinical trials. Network meta-analyses compare the results from two or more trials that have one treatment in common. For example, if treatment A has been compared to treatment B in one trial and in another trial of similar subjects, B has been compared to treatment C, then a network meta-analysis can be used to indirectly compare treatment A to treatment C.

Non-inferiority trial - Non-inferiority trials aim to show that an intervention is not inferior to a comparison intervention by more than a prespecified clinically acceptable amount (Delta). Judgment is required to establish what is meant by a clinically acceptable amount.

Non-parametric - Non-parametric refers to instances in which the distribution is unknown. Parametric refers to instances in which parameters of the distribution of the variable of interest are known (e.g., knowledge that a distribution of a variable is expected to follow a bell curve).

Null hypothesis – The first step in testing for statistical significance in which it is assumed that the exposure is not related to a disease. This is done for mathematical purposes to level the playing field to begin with the assumption that study outcomes are the same as what would have occurred by chance.
Related: Hypothesis

Number-needed to diagnose (NND) — From Bandolier: "For any chosen clinical endpoint the NNT is the reciprocal of the fractional improvement in a treated group minus the fractional improvement in an untreated group NNT = 1/(fraction improved with active - fraction improved with control)" or NND = 1/[Sensitivity - (1 - Specificity)]. Primarily useful for comparing the NND values between different tests. Best outcome is as close to 1.0 as possible.
Related: Number-needed-to-treat, Measures of outcomes
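
A minimal sketch of the NND formula given above. The sensitivity and specificity values are hypothetical and chosen only for illustration.

```python
def number_needed_to_diagnose(sensitivity: float, specificity: float) -> float:
    """NND = 1 / [sensitivity - (1 - specificity)], as in the formula above."""
    difference = sensitivity - (1 - specificity)
    if difference <= 0:
        raise ValueError("the test performs no better than chance")
    return 1 / difference


# Hypothetical test: sensitivity 90%, specificity 80%.
nnd = number_needed_to_diagnose(0.90, 0.80)  # about 1.43
```

A test with perfect sensitivity and specificity gives an NND of exactly 1, which is why values close to 1.0 are considered best.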

Number-needed-to-treat (NNT) – The number of patients who need to be treated in order for one patient to benefit over that patient taking the comparator agent within the study time period. NNT is the reciprocal of the ARR. For an ARR of 5%, for every 20 people treated, 1 more person will have improved outcomes with Agent A than if they were treated with Agent B within the study time.
Formula = 1/ARR (or how many times does ARR # go into 100). Example: For an ARR of 5 percent the NNT is 20; meaning, twenty people would have to be treated for one person to benefit.
Similar concept: Number needed to harm / screen / prevent, etc – NNH, NNS, NNP
Related: Measures of outcomes
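
The NNT arithmetic can be sketched as follows, using the 5 percent ARR from the example above. Function names are illustrative, and rates are written as proportions (0.05 rather than 5%).

```python
def absolute_risk_reduction(comparison_rate: float, intervention_rate: float) -> float:
    """ARR = event rate in comparison group minus event rate in study group."""
    return comparison_rate - intervention_rate


def number_needed_to_treat(arr: float) -> float:
    """NNT is the reciprocal of the ARR (expressed as a proportion)."""
    if arr <= 0:
        raise ValueError("NNT is only defined here for a positive ARR")
    return 1 / arr


nnt_from_example = number_needed_to_treat(0.05)  # 20

# The same result starting from group event rates of 15% and 10%.
arr = absolute_risk_reduction(0.15, 0.10)
nnt_from_rates = number_needed_to_treat(arr)
```

In practice the result is usually reported as a whole number of patients.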

Numerator – The upper portion of a fraction.
Related: Denominator

Numerator – For performance measurement, the event or occurrence being tracked (a subset of the denominator) — the "count."

Objective — Goal or aim of the study.

Observation bias - See bias

Observational study – Epidemiological study in which observations are made, but investigators do not control the exposure or intervention and other factors. Changes or differences in one characteristic are studied in relation to changes or differences in others, without the intervention of the investigator. Observational studies are highly prone to selection bias, observation bias and confounding. Tip: If an intervention is "assigned" through the research, it is an experiment. If it is chosen, then the study type is observational.
Related: Experimental study

Odds - The likelihood of an event occurring compared to not occurring--e.g., odds of two to one mean that likelihood of an event occurring is twice that of not occurring.

Odds ratio (OR) – A point estimate used for case-control studies which attempts to quantify a mathematical relationship between an exposure and a health outcome. Odds are used in case-control studies because the investigator arbitrarily controls the population; therefore, probability cannot be determined because the disease rates in the study population cannot be known. The odds that a case is exposed to a certain variable are divided by the odds that a control is exposed to that same variable. Odds are often used in other types of studies as well, such as meta-analysis, because of various properties of odds which make them easy to use mathematically.

Odds and probabilities of outcomes are usually similar if the event rate is low, but we recommend calculating both the odds ratio reduction and the relative risk reduction in order to compare and determine if the difference is clinically meaningful.

Related: Measures of outcomes
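
The point that odds and probabilities are similar at low event rates can be checked numerically. The sketch below uses hypothetical counts; a and b are exposed subjects with and without the outcome, c and d are unexposed subjects with and without the outcome. The cross-product (a x d) / (b x c) is the same whether odds of exposure are compared between cases and controls or odds of outcome are compared between exposed and unexposed.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio from a two-by-two table: (a / b) / (c / d)."""
    return (a / b) / (c / d)


def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Relative risk: risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical low event rate: 10 of 1000 exposed vs 5 of 1000 unexposed.
rare_or = odds_ratio(10, 990, 5, 995)      # about 2.01
rare_rr = relative_risk(10, 990, 5, 995)   # 2.0

# Hypothetical high event rate: 50 of 100 exposed vs 25 of 100 unexposed.
common_or = odds_ratio(50, 50, 25, 75)     # 3.0
common_rr = relative_risk(50, 50, 25, 75)  # 2.0
```

With the rare outcome the two measures nearly coincide; with the common outcome the odds ratio is noticeably further from 1 than the relative risk.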

Open label study - Not blinded as to the intervention.
Related: Blinding

Outcome — see Outcome measure

Outcome measure – What we are interested in studying (e.g., mortality, use of rescue medications in asthma patients, incidence of no flares in atopic dermatitis).
Synonyms: Endpoints, Outcomes, Outcome variables
Related: Primary outcomes, secondary outcomes, composite endpoints.

Overdiagnosis bias - A finding of a disease at an asymptomatic stage in a patient who would not have become symptomatic or been harmed by the disease.

Override - Process of setting aside a prescriber’s choice of a medication and usually substituting another medication.

Parametric - Parametric refers to instances in which parameters of the distribution of the variable of interest are known (e.g., knowledge that a distribution of a variable is expected to follow a bell curve). Non-parametric refers to instances in which the distribution is unknown.

Pathophysiology - Disruption of normal biochemical or physical function by a disease or disorder.

Patient years - A way of communicating the rate of events in a study. Patient-years is a statistic encountered in clinical studies. It is used because it conveys information about event rates and time. The calculation is accomplished by adding the years that patients are in a study and dividing the number of events by that total. Example: 100 patients are followed for 2 years. In this case, there are 200 patient-years of follow-up. If there were 8 myocardial infarctions (MI), the rate would be 8 MIs per 200 patient-years or 4 MIs per 100 patient-years. Patient-years is used to give information about means.
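
The patient-years example can be reproduced with a short sketch. The function name is illustrative; the numbers are the ones from the example (100 patients, 2 years each, 8 MIs).

```python
def events_per_100_patient_years(follow_up_years: list[float], n_events: int) -> float:
    """Event rate per 100 patient-years: events divided by total follow-up time."""
    patient_years = sum(follow_up_years)
    if patient_years <= 0:
        raise ValueError("total follow-up must be positive")
    return 100 * n_events / patient_years


follow_up = [2.0] * 100          # 100 patients followed for 2 years each
total_years = sum(follow_up)     # 200 patient-years
mi_rate = events_per_100_patient_years(follow_up, 8)  # 4 MIs per 100 patient-years
```

Because follow-up is summed per patient, the same function handles studies in which patients are followed for different lengths of time.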

Pearson correlation coefficient - An analysis method to measure the extent of the linear relationship of two variables to determine how independent and dependent variables change together (e.g., salt intake and blood pressure).
Synonym: r value
Related: Bivariate analysis
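
A minimal sketch of how an r value is computed from paired observations, using the standard formula (sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The data points are made up for illustration and are not real salt intake or blood pressure measurements.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical paired values (independent variable, dependent variable).
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]
r = pearson_r(x_values, y_values)  # about 0.77
```

Values of r range from -1 to 1, with 0 indicating no linear relationship.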

Performance bias - See bias

Performance measure – A quantitative assessment of a process of care, outcome or service usually used in quality improvement work. It consists of a denominator (e.g., population of interest), a numerator (i.e., a count of events of interest occurring within the denominator) and a frequency (i.e., the specified interval for measurement). May target various levels or units such as a system, specialty group or individual. Usually expressed as a rate, ratio or percentage. Sometimes used interchangeably with “quality indicator.”

Key terms = accuracy, benchmark, data gathering validity, denominator, dependability, numerator, outcome measure, performance measure, precision, quality indicator, rate, ratio, risk adjustment, risk stratification

Pharmacy & Therapeutics Committee: Committee charged with making formulary management decisions, along with performing other formulary management functions.

Pharmacy benefit manager (PBM): A company that manages pharmacy benefits and formulary management for health care systems and/or insurance companies.

Phase (clinical) studies –

Phase I is the first stage in testing a new drug in humans for safety and chemical action of the drug. Phase I studies are usually performed on healthy volunteers (often 20 to 100 persons) without a comparison group.
Phase II studies are performed mainly to test a drug’s efficacy. They are performed on patients who have the condition in question (as many as several hundred people) and may be up to two years duration or more. They are sometimes conducted as randomized controlled trials.
Phase III studies are full-scale evaluations of treatment, in hundreds to thousands of patients, for comparison to the current standard treatments for the same condition. Phase III studies are frequently randomized controlled trials.
Phase IV studies are postmarketing surveillance studies.

Placebo – An inactive substance or procedure administered to a patient for comparison to the intervention. Used in clinical trials to blind people to their treatment allocation.
Related: Blinding

PICOTS – Acronym standing for population, intervention, comparison, outcomes, timing, setting. PICOTS is useful in formulating a clinical question, searching*, comparing and reporting on studies. (*We find searching can be too limited using PICO or PICOTS. We frequently use condition/intervention.)

Pivotal trial - A controlled trial to evaluate the safety and efficacy of a drug in patients and usually the basis for the New Drug Application (NDA) filing with the FDA.

Point estimate – see Measures of outcomes

Population – For research studies, the term "population" may refer to a group of people living in a specific geographic area or it may mean a unique set of individuals who share some characteristic such as exposure to a disease or a group of patients of a single clinician, as examples. In statistics, this can mean a population of data.

Population bias - An external validity bias in which the population under study differs meaningfully from the population to which the results might be applied.
Related: External validity

Positive likelihood ratio (LR+) – The number of times the percent of true positives occurs over percent of false positives.
Related: Measures of test function
Formula = sensitivity / (1-specificity)
This represents the change from pre-test odds to post-test odds. The size of an increase is considered as follows:

Small = 2 to 5
Modest = 5 to 10
Large = > 10. Large is considered to rule in a condition.

Positive predictive value (PPV) – Of all testing positive, percent who have the disease, based on that population's prevalence.
Formula = a / (a + b) from two-by-two table.
Related: Measures of test function, Two-by-two table

Post-test odds for positive test — The odds of a person having a condition if the test is positive.
Formula = pre-test odds x positive likelihood ratio
Related: Measures of test function

Post-test odds for negative test — The odds of a person having a condition if the test is negative.
Formula = pre-test odds x negative likelihood ratio
Related: Measures of test function

Post-test probability for a positive test — After learning test result, the probability that a person testing positive has the condition.
Formula = Odds post (T+) / (1 + Odds post (T+))
Related: Measures of test function

Post-test probability for a negative test — After learning test result, the probability that a person testing negative has the condition.
Formula = Odds post (T-) / (1 + Odds post (T-))
Related: Measures of test function
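
The chain from pre-test odds to post-test probability in the four entries above can be sketched as one short calculation. The prevalence, sensitivity and specificity below are hypothetical values chosen for illustration; function names are illustrative too.

```python
def probability_to_odds(probability: float) -> float:
    """Pre-test odds = prevalence / (1 - prevalence)."""
    return probability / (1 - probability)


def odds_to_probability(odds: float) -> float:
    """Post-test probability = post-test odds / (1 + post-test odds)."""
    return odds / (1 + odds)


def post_test_probability(prevalence: float, likelihood_ratio: float) -> float:
    """Probability of having the condition after applying a likelihood ratio."""
    post_test_odds = probability_to_odds(prevalence) * likelihood_ratio
    return odds_to_probability(post_test_odds)


# Hypothetical test: prevalence 10%, sensitivity 90%, specificity 80%.
prevalence, sensitivity, specificity = 0.10, 0.90, 0.80
lr_positive = sensitivity / (1 - specificity)   # 4.5
lr_negative = (1 - sensitivity) / specificity   # 0.125

prob_if_positive = post_test_probability(prevalence, lr_positive)  # about 0.33
prob_if_negative = post_test_probability(prevalence, lr_negative)  # about 0.014
```

In this hypothetical case a positive result raises the probability of the condition from 10% to about 33%, and a negative result lowers it to about 1.4%.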

Power - See Statistical power.

Precision – The ability to provide sufficient detail, such as small incremental units, to be useful.

Predictive value – A diagnostic measure of function which is used to address the probability of having or not having a condition. Predictive value is affected by prevalence.
Related: Positive predictive value, Negative predictive value.

Pre-test likelihood — Pre-test probability of having a condition based on the prevalence of the condition in the population studied.
Related: Prevalence

Pre-test odds — Pre-test odds of having a condition based on the prevalence of the condition in the population studied.
Formula = prevalence / (1-prevalence)
Related: Prevalence

Pre-test Probability -The probability that a patient has the condition prior to administering a test.

Prevalence – The number or proportion of cases or events or conditions in a given population.
Related: Prevalence rate, Incidence

Prevalence rate – The proportion of persons in a population who have a particular disease or attribute at a specified point in time or over a specified period of time.

Prior authorization: Requirement that a clinician obtain approval before a drug can be dispensed and/or covered.

Probability - The likelihood of an event occurring expressed as a number between 0 and 1. It is measured by the ratio of the event to the total number of possible events. Example: the probability of flipping a coin and coming up with heads is .5.

Proportion – A type of ratio in which the numerator is included in the denominator. The ratio of a part to the whole, expressed as a “decimal fraction” (e.g., 0.2), as a fraction (1/5), or, loosely, as a percentage (20%).
Related: Denominator, Numerator

Proxy outcome markers – see Intermediate outcome markers.

Publication bias – The bias toward the publishing of studies showing statistically significant “positive” results.

P-value - Assuming there truly is no difference between the groups studied, the P-value is a calculated probability of observing a difference as big as or bigger than the one you observed in a study based on compatibility with an assumed standard distribution.

Problems with P-values:

The P-value cannot tell you the chance the results are true or even how likely they are to be due to chance.

“No test based on a theory of probability can by itself provide any valuable evidence of the truth or falsehood of a hypothesis.” [Neyman J, Pearson E. On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society, Series A 1933; 231:289–337.]

You do not know if the null hypothesis is true or not.

You do not know if the sample is truly random and/or representative of the population.

You do not know if the distribution in the population is standard.

To reject a null hypothesis (e.g., no difference between groups) is not the same as accepting an opposite hypothesis (there is a true difference), as there could be other explanations for an observed difference (such as bias). Likewise, failing to reject a null hypothesis does not establish that there is no difference, as there may have been an insufficient number of people studied to show a true difference that actually exists (i.e., lack of power).

The description of differences merely as “statistically significant” is not acceptable—precise P-values should be presented. [Numerous authors]

Interpretation of confidence intervals in valid studies should focus on the range of values in the interval and the clinical importance of the results based on that range.

A P-value of < 0.05, with a 95 per cent confidence interval which indicates that the true treatment effect may be close to the null value may be misleading. Approximately half of comparisons with P<0.05 are from null hypotheses which are true. [Sterne JA PMID: 11921006]

Therefore, the P-value has much more limited value than is frequently believed and isn’t very meaningful. One author likens it to a data compatibility issue: “The P-value is an indicator of the relative compatibility between the data and the null hypothesis, but it does not indicate whether the null hypothesis is a correct explanation for the data.” [Rothman KJ PMID: 2081237]

Consider the following scenario:

Drug A compared to placebo

100 patients in each group

Mortality in Drug A group is 5%

Mortality in placebo group is 15%

P-value = .03

In this scenario, the P-value can be interpreted as follows:

“If it is true that there is no difference in mortality outcomes between Drug A and placebo, then there is a 3% statistical probability (chance) of observing a difference equal to or greater than the difference represented by 5 deaths out of 100 in the Drug A group as compared to 15 deaths out of 100 in the placebo group.”
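
As a rough check on this scenario, the sketch below computes a two-sided P-value for 5 deaths out of 100 versus 15 deaths out of 100. It assumes Fisher's exact test, which is our choice for illustration; the scenario does not state which statistical test produced the .03. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(events_a: int, n_a: int, events_b: int, n_b: int) -> float:
    """Two-sided Fisher's exact P-value for a two-by-two table of events."""
    total_events = events_a + events_b
    total_n = n_a + n_b

    def ways(k: int) -> int:
        # Number of ways for k of the events to fall in group A,
        # if events were distributed between the groups purely by chance.
        return comb(total_events, k) * comb(total_n - total_events, n_a - k)

    observed = ways(events_a)
    k_min = max(0, n_a - (total_n - total_events))
    k_max = min(total_events, n_a)
    # Add up every split that is as unlikely as, or less likely than, the observed one.
    as_extreme = sum(ways(k) for k in range(k_min, k_max + 1) if ways(k) <= observed)
    return as_extreme / comb(total_n, n_a)


p_value = fisher_exact_two_sided(5, 100, 15, 100)  # roughly 0.03
```

Under this assumed test the result rounds to the .03 used in the scenario. Other tests (for example, a chi-square test) would give a somewhat different value for the same data.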

Our approach:

In a study that is evaluated to be at low risk of bias, we may put greater reliance upon visual inspection of the differences between groups and examination of confidence interval, recognizing, however, that confidence intervals cannot be relied upon as meaning that a true answer lies within their range as there are related issues to P-values. Confirmatory studies and patterns provide us greater comfort with results.

Related: Statistical significance, Error (e.g., sampling error), Confidence intervals

Qualitative research - Any research that utilizes non-quantitative methodology, e.g., case studies or narrative description or interviews.

Quality indicator – Oftentimes used interchangeably with “performance measure.” Quality indicators are specific and measurable elements of health care that can be used to assess the quality of care.

Random error – see Chance

Randomization – Mechanism to assign, by chance, “intervention / no intervention” assignments to study participants in randomized controlled trials.

Mechanisms should include random numbers which may be through a table or done via computer.
Simple randomization is like a coin flip where each person has equal chance of being assigned to either group.
Blocked randomization is used to create equally sized groups.
Stratified randomization randomizes subjects within selected criteria – such as age or sex or anticipated confounders.
Sequential methods are more prone to bias especially when there has not been effective concealment of allocation.
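
The difference between simple and blocked randomization can be sketched as follows. This is an illustration only, assuming two arms, a hypothetical block size of 4 and a fixed seed so the example is repeatable. It does not address concealment of allocation, which depends on how the sequence is stored and revealed.

```python
import random


def simple_randomization(n_subjects: int, rng: random.Random) -> list[str]:
    """Like a coin flip: each subject has an equal chance of either group."""
    return [rng.choice(["intervention", "control"]) for _ in range(n_subjects)]


def blocked_randomization(n_subjects: int, block_size: int, rng: random.Random) -> list[str]:
    """Shuffle within balanced blocks so that group sizes stay equal."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for two equal arms")
    assignments: list[str] = []
    while len(assignments) < n_subjects:
        half = block_size // 2
        block = ["intervention"] * half + ["control"] * half
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_subjects]


rng = random.Random(2024)  # fixed seed, for a repeatable illustration only
simple = simple_randomization(20, rng)
blocked = blocked_randomization(20, 4, rng)
```

Simple randomization can leave the two groups unequal in size by chance, whereas the blocked sequence has exactly 10 subjects per group. Stratified randomization could be sketched by running a separate blocked sequence within each stratum.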

Randomized controlled trial (RCT) – An experiment in which investigators control an exposure or intervention, and subjects are randomly assigned to the study group or to the control group or groups. Intervention or exposure then is performed. Patients are followed to determine outcomes such as improvements or harms.
Related: Experimental study

Random sample – A sample derived by selecting individuals such that each individual has the same probability of selection.

Rate – Derived by dividing a numerator by a denominator. The numerator is a subset of the denominator. Example: The percentage of diabetic eye exams for Type I diabetics was 80% for our clinic.
Related: Denominator, numerator

Ratio – A numerator and denominator (the numerator is not required to be a subset of the denominator). Example –The ratio of women to men in these studies was 1 to 5 (which can also be expressed as 1:5). Or, the rate and ratio for diabetic eye exams in our clinic would be 80/100.
Related: Denominator, numerator

Rebate: A cost offset that pharmacy systems receive when purchasing large quantities of drugs. Selection of drugs for large bulk purchase is based on the drugs’ volume of use.

Recall bias – Inaccuracies resulting from data collected from study participants who are asked to retrospectively self-report on study items of interest. This kind of bias is most likely to occur when certain behaviors are hard to track (e.g., diet) or are often hidden for some reason such as being socially sensitive (e.g., sexual behaviors, smoking behavior, etc.).
Related: Bias

Regression model – A method for estimating the relationship of a dependent variable to a variety of independent variables.

Regression-to-the-mean – Returning to the “average” state. Meaning that extreme test values are statistically likely to move to an average over time. When patients present with extreme values and then seem to have improvement, it may be falsely attributed to an intervention when it is truly due to regression-to-the-mean. A comparison group with no intervention can help expose this effect.

Relative risk (RR) - A point estimate in which the risk of some health-related event, such as disease or death, is compared between a study group and a comparison group. Relative risk is expressed as the number of times one group may be at risk over another. A relative risk of less than 1 is a lower risk.
Formula: Risk in study group divided by risk in the comparison group
Synonym: Risk ratio
Related: Measures of outcomes

Relative risk reduction (RRR) – A point estimate in which the percent reduction in events in the intervention or exposed group is compared to the comparison group — relative risk reduction is the proportional difference in size between outcomes. For example, if we had 15% mortality in a comparison group as compared to 10% mortality in the intervention group, since 10 is one-third smaller than 15, the RRR would be 33%. Relative risk reduction begs the question "relative to what?" Because RRR overestimates the actual difference, it should always be used in conjunction with another point estimate such as absolute risk reduction or number-needed-to-treat when communicating study results.
Formula: RRR = [((Comparison group outcomes - Intervention group outcomes) / Comparison group outcomes) x 100] or 1-Relative Risk (RR)
Related: Measures of outcomes
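
The example above (15% mortality in the comparison group, 10% in the intervention group) can be worked through in a short sketch that reports the relative and absolute measures side by side. Function names are illustrative.

```python
def relative_risk(intervention_risk: float, comparison_risk: float) -> float:
    """RR = risk in study group divided by risk in comparison group."""
    return intervention_risk / comparison_risk


def relative_risk_reduction(intervention_risk: float, comparison_risk: float) -> float:
    """RRR = (comparison - intervention) / comparison, which equals 1 - RR."""
    return (comparison_risk - intervention_risk) / comparison_risk


comparison_mortality = 0.15
intervention_mortality = 0.10

rr = relative_risk(intervention_mortality, comparison_mortality)             # about 0.67
rrr = relative_risk_reduction(intervention_mortality, comparison_mortality)  # about 0.33
arr = comparison_mortality - intervention_mortality                          # about 0.05
```

The same data give an RRR of about 33% but an ARR of only 5 percentage points, which is why the two are best reported together.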

Representative sample – A sample whose characteristics correspond to those of the original population or population of interest.

Retrospective study – A study method to look at cause and effect after outcomes have already occurred. Generally this refers to a case-control study; however, cohort studies can use a retrospective method.

Risk – The probability that an event will occur.

Risk adjustment – The process of adjusting performance rates or other outcomes of care to level the playing field due to differences in health status between populations.

Risk factor – An aspect of personal behavior or lifestyle, an environmental exposure, or an inborn or inherited characteristic that is associated with an increased occurrence of disease or other health-related event or condition.

Risk ratio – see Relative risk

Risk stratification – The process of, or result of, separating a sample into subsamples based on health status or risk factors such as age, comorbidities, etc.

Safety — Safety in research studies pertains to harms of various interventions.
Related terms: Adverse events, ADEs, Harms, Risks, Side effect

Sample – A selected subset of a population. A sample may be random or non-random and it may be representative or non-representative.
Related: Random sample, Representative sample

Sampling error - See chance

Second-line therapy: Therapy for patients failing on drugs established as “first-line” therapy.

Selection bias – See bias

Sensitivity (SN) – Correct identification by a screening test or case definition as having disease – of all those with a disease, the percent testing positive (true positives). Sensitivity is derived from calculations based on people who are known to have the condition. Sensitivity is especially useful when it is important not to miss a disease. A negative result on a highly sensitive test is considered to "rule out" a condition.
Formula = a / (a + c) from two-by-two table
Related: Measures of test function, Two-by-two table

Sensitivity analyses - Analyses undertaken to test the robustness of data (e.g., "what if" scenarios).

SIR - See Standardized incidence ratio

Specificity (SP) – Correct identification by a screening test or case definition as not having disease – of all those without a disease, the percent testing negative (true negatives). Specificity is derived from calculations based on people who are known not to have the condition. Specificity is especially useful when it is important to avoid false positives. A positive result on a highly specific test is considered to "rule in" a condition.
Formula = d / (b + d) from two-by-two table
Related: Measures of test function, Two-by-two table

Standard deviation (SD) – A widely used measure of dispersion of a frequency distribution, equal to the positive square root of the variance. Standard deviation can tell you how widely or how tightly values are distributed around the average. In a normal (bell-shaped) distribution, values within one standard deviation of the mean account for approximately 68 percent of the distribution, and values within two standard deviations account for approximately 95 percent.

Standardized incidence ratio (SIR) - The ratio of observed occurrences to expected occurrences, often multiplied by 100. Expected occurrences for SIR calculations are based on selected data sources, e.g., the National Cancer Institute (NCI) Surveillance Epidemiology and End Results (SEER) database. For example, a SIR of 150 is interpreted as 50% more cases than the expected number; a SIR of 90 indicates 10% fewer cases than expected.
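
A minimal sketch of the SIR arithmetic, assuming the ratio is multiplied by 100 as in the interpretation example above. The observed and expected counts are hypothetical.

```python
def standardized_ratio(observed: float, expected: float) -> float:
    """Observed occurrences divided by expected occurrences, times 100."""
    if expected <= 0:
        raise ValueError("expected occurrences must be positive")
    return 100 * observed / expected


sir_excess = standardized_ratio(30, 20)   # 150: 50% more cases than expected
sir_deficit = standardized_ratio(18, 20)  # 90: 10% fewer cases than expected
```

The same calculation applies to a standardized mortality ratio, with observed and expected deaths in place of cases.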

Standardized mortality ratio (SMR) - The ratio of observed deaths to expected deaths calculated using the expected rates (e.g., based on country-specific age and sex matched general population data from the World Health Organization).

Statistical power - The ability to reliably determine a statistically significant relationship between interventions if one exists. In other words, power, by definition, means having sufficient people in a study to find a statistically significant difference between groups in an outcome of interest if such a difference truly exists. Therefore, any statistically significant outcome is, by definition, sufficiently powered for that outcome. Power comes into play in planning and interpreting studies. Investigators want to know the risk of drawing an erroneous conclusion about efficacy, and so they do power calculations to estimate the number of subjects needed in a study to show a statistically significant result if one exists. It is conventional to accept a 5% risk of concluding that an outcome is truly different in the study groups when it is not (Type I or Alpha error). It is also conventional to accept a 20% risk of concluding that an outcome is not statistically significant when in truth there may be a true difference (Type II or Beta error). Power can be determined by considering Alpha, Beta, the event rate in the comparison group, and what is judged to be a clinically significant difference between the groups. As a practical matter, many users focus on the confidence intervals. For valid studies, consider what you judge to be a reasonable range for clinical significance; this need not be hard and fast. For statistically significant findings, is the confidence interval wholly within bounds for clinical significance? For non-significant findings, is the confidence interval wholly beneath your limit for clinical significance? A yes to the applicable question means likely conclusive findings for valid studies. A no means findings are inconclusive.
Synonyms: Power, Power of a study
Related: Error
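
The ingredients named above (Alpha, Beta, the event rate in the comparison group and the clinically significant difference) can be combined in a sample size sketch. This assumes a common normal-approximation formula for comparing two proportions with a two-sided test; it is one of several formulas in use and is not taken from this glossary. The event rates are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(comparison_rate: float, intervention_rate: float,
                          alpha: float = 0.05, beta: float = 0.20) -> int:
    """Approximate subjects needed per group to detect the given difference."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided Type I error
    z_beta = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    variance_sum = (comparison_rate * (1 - comparison_rate)
                    + intervention_rate * (1 - intervention_rate))
    difference = comparison_rate - intervention_rate
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / difference ** 2)


# Hypothetical planning case: 15% events expected with the comparator,
# and a reduction to 10% judged clinically significant.
n_per_group = sample_size_per_group(0.15, 0.10)
```

Under these assumptions roughly 680 subjects per group would be needed. Halving the difference to be detected roughly quadruples the required number.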

Statistical significance – The extent to which study results are unlikely to be due to chance. Expressed as a p-value and it is typically set at <.05 due to convention. Statistical significance may be set lower, but must be determined in advance of the study.
Related: Error, p-value

Subgroup analysis — Analysis of research results on subsets of patients in a study, such as a subgroup of Caucasian women in a study of asthmatics of all races, ages and genders. Subgroups for study must be determined in advance of a study — otherwise the results are highly likely to be due to chance.

Superiority trial - Clinical trials which are conducted to determine if one intervention is superior to another.

Surrogate end points – See Intermediate outcome markers.

Survival analysis - A type of analysis that represents the length of time to an outcome of interest (e.g., time-to-pregnancy, time-to-cancer progression).
Synonyms: Life table analysis

Systematic review – Studies that examine more than one study on a given topic in a systematic way using predetermined criteria. The goal is to provide a summary estimate of effect based on the scientific weight of the studies.
Related: Meta-analysis.

Therapeutic substitution: Substitution of a drug with an agent having a different chemical structure but similar clinical benefits.

Trend - A term used to describe observed results in several studies which consistently go in the same direction, but are not statistically significant.

Triangulation issues: A catchall term for all the various considerations that need to be made in making a decision.

Type 1 error - see Error

Type 2 error - see Error

Two-by-Two Table — Table to report results and from which statistics are calculated. See examples below:

Intervention Example

Two-by-two Table (2x2 Table)	Not Improved	Improved	Totals
Intervention Group	a	b	a+b
Control Group	c	d	c+d
Totals	a+c = Total Not Improved	b+d = Total Improved	a+b+c+d

Diagnostic Testing Example

Two-by-two Table (2x2 Table)	Condition Present	Condition Absent	Totals
Test Positive	a = true positives	b = false positives	a+b
Test Negative	c = false negatives	d = true negatives	c+d
Totals	a+c = total with condition; prevalence in this population = (a+c) / (a+b+c+d)	b+d = total without condition	a+b+c+d
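
The test-function formulas used throughout this glossary can be computed directly from the diagnostic two-by-two table. The sketch below uses hypothetical cell counts (a = true positives, b = false positives, c = false negatives, d = true negatives); the dictionary keys are illustrative names.

```python
def diagnostic_measures(a: int, b: int, c: int, d: int) -> dict[str, float]:
    """Measures of test function from the cells of a diagnostic two-by-two table."""
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    return {
        "prevalence": (a + c) / (a + b + c + d),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        "positive_likelihood_ratio": sensitivity / (1 - specificity),
        "negative_likelihood_ratio": (1 - sensitivity) / specificity,
    }


# Hypothetical population of 1000 people, 100 of whom have the condition.
measures = diagnostic_measures(a=90, b=180, c=10, d=720)
```

In this hypothetical table, sensitivity is 90% and specificity is 80%, yet the positive predictive value is only about 33% because the condition is present in just 10% of the population.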

Validity – Closeness to truth. For example, in performance measurement, the degree to which a measurement actually measures or detects what it is supposed to measure.

Variable – Any characteristic or attribute that can be measured.

Statistical tests are chosen based on type of variables. The four main types of variables are -

Nominal (named categories without any measurable scale such as ethnic groups)
Dichotomous or binary (two mutually exclusive categories resulting in “either this or that” such as “death” or “survival”)
Ordinal or ranked (three or more categories that can be “ordered” or ranked such as good/better/best or satisfied/neutral/unsatisfied)
Continuous (can be anywhere along a continuum, e.g., blood glucose readings)

Related: Dependent variable, independent variable

Variance – A measure of the spread of a variable about its mean value. In a data set, a single point lies above or below the mean for the entire dataset. A deviation score is the measure of how much each point lies above or below the mean for the entire dataset. Variance is the mean of all the squared deviation scores (the deviation scores themselves always average to zero). It is a way of mathematically stating how far deviations are from expected results.
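
A small worked sketch of deviation scores, variance and standard deviation. The data values are made up, and the calculation treats them as a complete population (divides by n). Sample variance, which divides by n - 1, is a common alternative not shown here.

```python
from statistics import pvariance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

average = sum(values) / len(values)               # 5.0
deviations = [v - average for v in values]        # how far each point lies from the mean
mean_deviation = sum(deviations) / len(values)    # 0.0: raw deviations cancel out

# Squaring the deviation scores before averaging stops them from cancelling.
variance = sum(d ** 2 for d in deviations) / len(values)  # 4.0
standard_deviation = variance ** 0.5                      # 2.0

# The standard library gives the same population variance.
library_variance = pvariance(values)
```

The standard deviation is the square root of the variance, which puts the measure of spread back into the same units as the original data.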

Volunteer Bias - A bias that occurs when volunteers are enrolled in screening studies. Volunteers have been shown to have better outcomes than those who don’t volunteer (i.e., those who are recruited through active requests to participate), possibly due to the healthy user effect. In other words, patients who volunteer for screening may be “different” (healthier) from the larger population and may have a better prognosis.

Washout period – Used in a cross-over trial to discontinue patients on a treatment before the second treatment is given to reduce the likelihood that the first treatment will affect the outcome even after it is stopped.

Withdrawal design - A study design in which an intervention to be tested is applied and then withdrawn to determine if any change reverts to pre-intervention levels.
