## Absolute risk

Absolute risk is the probability that a person will experience a specified outcome during a specified period.^{5}

## Absolute risk difference

Absolute risk difference is the difference in absolute risk of an outcome between the control group and the intervention group. It can be an absolute risk increase (ARI) or an absolute risk reduction (ARR).^{6} An absolute risk difference of zero signifies no difference between the groups.^{5}
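As a rough illustration, the calculation can be sketched in Python. The function names, the sign convention (control risk minus intervention risk, so that a positive value is a reduction) and the trial counts are assumptions made for this example, not values from a cited study.

```python
def absolute_risk(events: int, total: int) -> float:
    """Proportion of people in a group who experienced the outcome."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total


def absolute_risk_difference(control_events: int, control_total: int,
                             treated_events: int, treated_total: int) -> float:
    """Control-group risk minus intervention-group risk (assumed convention).

    Positive result: absolute risk reduction (ARR).
    Negative result: absolute risk increase (ARI).
    Zero: no difference between the groups.
    """
    control_risk = absolute_risk(control_events, control_total)
    treated_risk = absolute_risk(treated_events, treated_total)
    return control_risk - treated_risk


# Hypothetical trial: 100 of 1000 events on control, 50 of 1000 on the intervention.
difference = absolute_risk_difference(100, 1000, 50, 1000)
label = "ARR" if difference > 0 else "ARI" if difference < 0 else "no difference"
print(f"Absolute risk difference: {abs(difference):.3f} ({label})")
```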

## Bias

Bias refers to the systematic deviation of study results from the true results because of the way in which the study is conducted.^{3} It can arise from differences in the characteristics of the groups that are compared (selection bias), differences in the care that is provided or exposure to factors other than the intervention of interest (performance bias), withdrawals or exclusions of people involved in the study (attrition bias), or the way outcomes are assessed (detection bias).^{7}

## Biomarkers

A biomarker is a laboratory measurement that is used to reflect the activity of a disease process. Biomarkers can be used as end points in clinical trials^{8} and are of increasing interest in the development of personalised medicines. For example, HER2 expression is a biomarker used to identify patients who are likely to respond to treatment with trastuzumab. For more information, see Individualised medicine.

## Blinding

Many RCTs use blinding to minimise bias. Blinding involves not disclosing which participants in the study are receiving the intervention being evaluated. If both the investigator and the study participants are unaware of the allocation, the study is double blind. If, in addition, the statistical analysis is done in the absence of knowledge of the group to which participants belong, the study may be described as triple blind.^{3}

## Clinical outcomes

Clinical outcomes are the patient-oriented outcomes that are the most important for individual patients and the healthcare system. Examples of clinical outcomes are rates of stroke or hip fracture. Such outcomes are often influenced by a variety of factors (not just those under investigation in a trial) and might not become apparent in the typical time frame of a trial. As a consequence, trials often report surrogate outcomes or markers (such as reductions in blood pressure or increases in bone mineral density) that are more easily measured and become apparent in a shorter time frame.^{3}

Evidence that shows beneficial effects on clinical outcomes should influence practice more than evidence that shows beneficial effects only on surrogate markers.

## Confidence interval

The confidence interval (CI) is the range of values within which the true value for a population (represented by subjects in studies) is likely to lie. Most often, a 95% CI is calculated. For example, if the relative risk (RR) of breast cancer after 5 years of hormone replacement therapy use is measured in a study as RR = 3.0 (95% CI: 2.5–3.8), this is interpreted as meaning that there is a 95% chance that the true relative risk lies somewhere in the range of 2.5 to 3.8.^{4}^{,5}
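One common way to obtain such an interval for a relative risk is the log-transformation (Wald-type) method, sketched below. This method is an assumption chosen for illustration; the glossary does not say how the quoted interval was derived. The function name and the counts are hypothetical.

```python
import math


def relative_risk_ci(events_exposed: int, n_exposed: int,
                     events_control: int, n_control: int,
                     z: float = 1.96) -> tuple[float, float, float]:
    """Return (RR, lower, upper) using the log-transformation method.

    z = 1.96 corresponds to an approximate 95% confidence interval.
    Assumes every count is greater than zero.
    """
    rr = (events_exposed / n_exposed) / (events_control / n_control)
    # Standard error of ln(RR) for two independent proportions.
    se_log_rr = math.sqrt(1 / events_exposed - 1 / n_exposed
                          + 1 / events_control - 1 / n_control)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 of 100 events in the exposed group, 10 of 100 in controls.
rr, lower, upper = relative_risk_ci(30, 100, 10, 100)
print(f"RR = {rr:.2f} (95% CI: {lower:.2f} to {upper:.2f})")
```

With small groups like these the interval is wide; larger samples shrink the standard error and narrow the interval.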

## Confounding

Confounding refers to a bias that can distort the exposure–disease or exposure–outcome relationship. A confounding factor is something extraneous to the main study question that affects the outcome and distorts the true relationship between study variables.^{3}

## Cost–benefit analysis

Cost–benefit analysis assesses whether the cost of an intervention is worth the benefit by measuring both in monetary units.^{4}

## Cost-effectiveness analysis

Cost-effectiveness analysis measures both the costs and the benefits of interventions that have a common health outcome (e.g. stroke prevention), to find the strategy with the best ratio of benefits to costs. The results are reported as cost per unit effect (e.g. cost per episode of stroke prevented).^{4}^{,5}
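The cost per unit effect can be sketched as a simple ratio. The strategy names, costs and numbers of strokes prevented below are invented for illustration only.

```python
def cost_per_unit_effect(total_cost: float, units_of_effect: float) -> float:
    """Cost divided by the number of outcome units achieved (e.g. strokes prevented)."""
    if units_of_effect <= 0:
        raise ValueError("units_of_effect must be positive")
    return total_cost / units_of_effect


# Hypothetical strategies that share one health outcome: stroke prevention.
strategies = {
    "strategy_a": {"cost": 500_000.0, "strokes_prevented": 25},
    "strategy_b": {"cost": 300_000.0, "strokes_prevented": 10},
}

ratios = {
    name: cost_per_unit_effect(s["cost"], s["strokes_prevented"])
    for name, s in strategies.items()
}
# The lowest cost per stroke prevented is the best ratio of benefits to costs.
best = min(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name}: {ratio:,.0f} per stroke prevented")
print(f"Best ratio: {best}")
```

Note that the cheaper strategy overall (strategy_b in this made-up data) is not necessarily the one with the better cost per unit effect.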

## Cost-minimisation analysis

Cost-minimisation analysis calculates the cost of two or more alternatives that produce the same outcome, to identify the option with lowest cost. With medicines, this type of evaluation usually involves comparing efficacy and safety.^{5}^{,9}

## Cost–utility analysis

Cost–utility analysis provides a common unit of measurement when the options being compared produce different outcomes. Outcome measures involving length and quality of life (quality‑adjusted life-years, or QALYs) are used. The results are often expressed in terms of cost per QALY gained.^{5}^{,9}
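A minimal sketch of a cost per QALY gained calculation follows. It assumes the comparison is made incrementally (new option versus a comparator), which is a common convention but is not stated in the definition above; all figures are hypothetical.

```python
def cost_per_qaly_gained(cost_new: float, qalys_new: float,
                         cost_comparator: float, qalys_comparator: float) -> float:
    """Extra cost of the new option divided by the extra QALYs it yields."""
    qalys_gained = qalys_new - qalys_comparator
    if qalys_gained <= 0:
        raise ValueError("the new option must gain QALYs for this ratio to be meaningful")
    return (cost_new - cost_comparator) / qalys_gained


# Hypothetical per-patient figures.
ratio = cost_per_qaly_gained(cost_new=12_000.0, qalys_new=6.5,
                             cost_comparator=8_000.0, qalys_comparator=6.0)
print(f"Cost per QALY gained: {ratio:,.0f}")
```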

## Direct and indirect costs

Direct costs can be associated directly with resource use for a health service or commodity. Indirect costs often refer to productivity losses.^{10}

## Economic evaluation

Economic evaluation involves methods that identify, measure and analyse the costs and consequences of health interventions. It encompasses cost–benefit, cost-minimisation, cost–utility and cost‑effectiveness analyses.^{5}

## Effectiveness

Effectiveness is the extent to which an intervention provides beneficial effects relative to harmful effects in real-world use, in clinical practice. It can be expressed in a variety of ways, using either clinical outcomes or surrogate markers.^{3}

## Efficacy

Efficacy is the extent to which an intervention improves the outcome for people under ideal circumstances (e.g. in an RCT). Testing efficacy means finding out whether something is capable of causing an effect at all.^{3}

## Hazard ratio

The hazard ratio (HR) measures relative hazard between two groups. Broadly equivalent to relative risk, the HR is useful when the risk is not constant over time. It can be expressed as the rate of hazard in the treatment group at a specific point in time divided by the rate of hazard in the control group at the same point in time.^{7}^{,11}

## Health economic modelling

Health economic modelling applies randomised trial evidence to real-life settings to judge a medicine's clinical and economic performance. It can be used to examine the impact of differences between study subjects and patients likely to receive the drug in clinical practice, to extrapolate surrogate markers to clinical outcomes, and to extend the findings of studies to the likely duration of use.^{5}

## Heterogeneity

Heterogeneity means variation in effects between individual studies that is not likely to be the result of chance alone.^{4} In the context of meta-analysis, it means that a combined estimate will not produce a meaningful description of the set of studies, and could render the pooling of data unreliable or inappropriate. Heterogeneity can be caused by the use of different statistical methods (statistical heterogeneity), or by evaluation of people with different characteristics, treatments or outcomes (clinical heterogeneity).^{7}

## Intention to treat

Intention to treat (ITT) analysis is a method used for trials in which all patients who are randomised to a treatment arm are analysed together, regardless of whether or not they completed or received the treatment. Such analysis generates findings that are more applicable to practice, reflecting the reality that some patients will abandon a treatment because of adverse effects or lack of efficacy.^{4}

## Meta-analysis

Meta-analysis is a statistical technique that uses quantitative methods to synthesise and summarise the results of several studies in a single weighted estimate in which more weight is given to results from higher-quality studies. It is often used in a systematic review. Meta-analysis of trials that individually demonstrate a high level of evidence will provide results that are considered to constitute a higher level of evidence.^{4}^{,9}

## Number needed to harm

The number needed to harm (NNH) is an epidemiological measure that shows how many patients need to be exposed to a risk factor for a certain period to cause harm in one patient who would not otherwise have been harmed. It is defined as the inverse of the absolute risk increase (ARI) (i.e. NNH = 1/ARI). The lower the NNH, the greater the risk of harm.^{3}^{,4}

## Number needed to treat

The number needed to treat (NNT) is an estimate of how many people need to receive a treatment for a certain period before one person would experience the outcome measured. For example, if a stroke prevention drug needs to be given to 20 people for 5 years to prevent one stroke, the NNT would be 20 over 5 years. NNT can be calculated in a number of ways; the simplest is to calculate the reciprocal of absolute risk reduction (ARR), with ARR expressed as a decimal (i.e. NNT = 1/ARR).^{4}^{,5}
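The reciprocal relationship can be sketched as follows. The same form applies to the number needed to harm when the input is an absolute risk increase. The function name is hypothetical, and the ARR of 0.05 is chosen to match the stroke example above (one stroke prevented per 20 people treated for 5 years).

```python
def number_needed(absolute_risk_change: float) -> float:
    """Reciprocal of an absolute risk change expressed as a decimal.

    Pass an ARR to obtain the NNT, or an ARI to obtain the NNH.
    """
    if absolute_risk_change <= 0:
        raise ValueError("absolute_risk_change must be a positive decimal")
    return 1.0 / absolute_risk_change


# ARR of 0.05 over 5 years (e.g. risk falls from 10% to 5%).
nnt = number_needed(0.05)
print(f"NNT = {nnt:.0f} over 5 years")
```

In practice a fractional result is usually rounded up to the next whole person, and the time period should always be quoted alongside the number.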

## Odds

Odds is the ratio of the probability that an event will occur to the probability that it will not occur.^{3}^{,4} For example, a probability of 0.2 corresponds to odds of 0.2/0.8 = 0.25.

## Odds ratio

The odds ratio (OR) is a measure of treatment effectiveness. It shows the odds of an event happening in the experimental group as a proportion of the odds of the event happening in the control group.^{3}^{,4} The closer the OR is to 1, the smaller the difference in effect between the experimental intervention and the control intervention. If the OR is greater (or less) than 1, the effects of the treatment are more (or less) than those of the control treatment. The effects being measured can be adverse or desirable.

For most clinical trials in which the event rate is low (i.e. less than 10% of all participants have an event), the OR and relative risk (RR) can be considered interchangeable. They will also be closer together when the treatment effect is small (i.e. OR and RR are close to 1) than when the treatment effect is large. The OR will progressively move away from the RR as the event rate increases above 15%, or as the treatment effect becomes large.^{11}
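The divergence between OR and RR can be seen numerically. The sketch below holds the RR fixed at 2.0 (an arbitrary choice for illustration) and raises the control-group event rate; the rates and function names are assumptions, not trial data.

```python
def odds(probability: float) -> float:
    """Probability of the event divided by the probability of no event."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be in [0, 1)")
    return probability / (1 - probability)


def odds_ratio(risk_experimental: float, risk_control: float) -> float:
    """Odds in the experimental group divided by odds in the control group."""
    return odds(risk_experimental) / odds(risk_control)


FIXED_RR = 2.0  # assumed treatment effect, held constant across scenarios

for control_rate in (0.01, 0.10, 0.30, 0.40):
    experimental_rate = control_rate * FIXED_RR
    or_value = odds_ratio(experimental_rate, control_rate)
    print(f"control event rate {control_rate:.0%}: RR = {FIXED_RR:.2f}, OR = {or_value:.2f}")
```

At a 1% control event rate the OR stays close to the RR, and it moves progressively further away as the event rate rises.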

## Post-marketing surveillance

Post-marketing surveillance, or pharmacovigilance, refers to the identification and collection of information on the actions of a medicine (including adverse reactions) from consumers, researchers, health professionals and pharmaceutical companies, after the medicine has been released. Increased use of the medicine after it is released and use in special patient populations (such as the elderly or patients with comorbidities) can lead to the identification of rare side effects that are undetected in clinical trial populations.^{12} The association between selective serotonin re-uptake inhibitors and abnormal bleeding was identified as a result of post-marketing surveillance.

## Power of a study

The power of a study is the probability of detecting a pre-specified difference between treatments if that difference truly exists.^{3}^{,5} The power of a study is influenced by variability in the end point of interest, the level of statistical significance used in the analysis, the size of the difference being investigated and the sample size.
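How sample size and the size of the difference feed into power can be sketched with a normal approximation for comparing two proportions. This particular approximation (two-sided test, equal group sizes, the negligible opposite tail ignored) is an assumption made for illustration and is not taken from the cited sources; the event rates are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(p_control: float, p_treated: float,
                      n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing two proportions."""
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two independent proportions.
    se = sqrt(p_control * (1 - p_control) / n_per_group
              + p_treated * (1 - p_treated) / n_per_group)
    return std_normal.cdf(abs(p_control - p_treated) / se - z_critical)


# Hypothetical event rates: 10% on control, 5% on treatment.
for n in (200, 500, 1000):
    print(f"n = {n} per group: power is about {approximate_power(0.10, 0.05, n):.2f}")
```

Under these assumptions, power rises with sample size and with a larger true difference, and falls when a stricter significance level is used.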

## Progression-free survival

Progression-free survival is the post-treatment period during which there is no reappearance of the symptoms or disease.^{13} It is a surrogate marker used in oncology and in treatment of HIV infection.

## P-value

The p-value is the probability that an observed difference occurred by chance, if it is assumed that there is no real difference between effects in different study groups. The lower the p-value, the more likely it is that the difference between groups was caused by the intervention. If this probability is less than 1 in 20 (i.e. the p-value is less than 0.05), the result is conventionally regarded as being statistically significant.^{3}

## Quality-adjusted life-years

Quality-adjusted life-years (QALYs) is a common measure of health status that includes both the duration and the quality of life.^{5}^{,9}
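A common way to combine duration and quality is to weight each period of life by a utility value, as in the sketch below. The 0 to 1 utility scale (1 representing full health) and the health-state figures are assumptions for illustration.

```python
def total_qalys(periods: list[tuple[float, float]]) -> float:
    """Sum of (years lived x utility weight) over successive health states."""
    for years, utility in periods:
        if years < 0 or not 0 <= utility <= 1:
            raise ValueError("years must be >= 0 and utility between 0 and 1")
    return sum(years * utility for years, utility in periods)


# Hypothetical course: 4 years at utility 0.75, then 2 years at utility 0.5.
qalys = total_qalys([(4, 0.75), (2, 0.5)])
print(f"QALYs: {qalys:.1f}")
```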

## Relative risk

Also called the risk ratio, the relative risk (RR) is the number of times more likely (RR >1) or less likely (RR <1) an event is to occur in one group compared with another. For example, if RR = 3.0, the event is about three times more likely to occur; if RR = 0.5, the event is half as likely to occur. An RR of 1.0 means there is no apparent effect on risk.^{3}^{,5}

The RR is derived by dividing the absolute risk in the intervention group by the absolute risk in the control group. It should be expressed with confidence intervals—for example, RR 3.0 (95% CI: 2.5–3.8).

## Relative risk reduction

The relative risk reduction (RRR) is the amount by which the relative risk has been reduced as a result of treatment: RRR = 1 – RR. It is often expressed as a percentage. The RRR can sometimes lead to overestimation of the treatment effect. Using absolute risk reduction to represent an identical outcome will result in a lower numerical value than using RRR.^{3}^{,5}
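The gap between relative and absolute measures is easiest to see with two scenarios that share the same RRR. The risks below are hypothetical, and the function name is an assumption for this sketch.

```python
def risk_measures(risk_control: float, risk_treated: float) -> dict[str, float]:
    """Return RR, RRR and ARR for a pair of absolute risks given as decimals."""
    rr = risk_treated / risk_control
    return {
        "RR": rr,
        "RRR": 1 - rr,
        "ARR": risk_control - risk_treated,
    }


# Same relative effect, very different baseline risks.
high_baseline = risk_measures(risk_control=0.20, risk_treated=0.10)
low_baseline = risk_measures(risk_control=0.02, risk_treated=0.01)

for name, m in (("high baseline", high_baseline), ("low baseline", low_baseline)):
    print(f"{name}: RRR = {m['RRR']:.0%}, ARR = {m['ARR']:.1%}")
```

Both scenarios show an RRR of 50%, but the ARR is ten times smaller when the baseline risk is low, which is why quoting the RRR alone can overstate the benefit.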

## Standard deviation

Standard deviation is a statistical measure of the distance that a value is likely to lie from its average value (i.e. a measure of dispersion).

Plotting the frequency of variables such as height or weight in a large, random sample of people will result in a bell-shaped curve, known as a pattern of normal distribution. About 70% of people will fall within one standard deviation above and below the mean (see Normal distribution (bell-shaped curve)). The more widely the scores are spread, the greater the standard deviation and the greater the range of values (the difference between the highest and lowest values) in the sample.

Figure: Normal distribution (bell-shaped curve)

When two sets of results have the same mean, a larger standard deviation means more scatter and less precision in the results; a smaller standard deviation means the reverse.
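Both points can be checked with Python's standard library. The two small data sets below are invented so that they share a mean; the second part evaluates the share of a normal distribution lying within one standard deviation of the mean.

```python
from statistics import NormalDist, mean, stdev

# Two hypothetical sets of results with the same mean but different scatter.
tight = [48, 49, 50, 51, 52]
wide = [30, 40, 50, 60, 70]

for name, values in (("tight", tight), ("wide", wide)):
    print(f"{name}: mean = {mean(values)}, sample SD = {stdev(values):.2f}, "
          f"range = {max(values) - min(values)}")

# Proportion of a normal distribution within one SD either side of the mean.
std_normal = NormalDist()  # mean 0, standard deviation 1
within_one_sd = std_normal.cdf(1) - std_normal.cdf(-1)
print(f"Within one SD of the mean: {within_one_sd:.1%}")
```

The exact proportion is a little over 68%, consistent with the rounded figure of about 70% quoted above.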

## Statistical significance

The term statistically significant means that the findings of a study are unlikely to be a result of chance.^{3} Significance at the commonly cited 5% level (p <0.05) means that the observed result would occur by chance in only 1 in 20 repeated studies. Statistically significant does not necessarily mean clinically important. It is the size of the effect that determines the clinical importance, not the presence of statistical significance. Conversely, non‑significance does not mean ‘no effect’. Small studies often report non-significance even when there are important, real effects.

## Surrogate markers

Surrogate markers are proxy measures of effect that are used to predict a target clinical outcome on the basis of epidemiologic, therapeutic, pathophysiologic or other scientific evidence. In some cases, they are used because they will be easier, cheaper or quicker to measure than specific clinical outcomes (e.g. studies reporting lipid levels instead of cardiovascular events such as heart attack).^{5}^{,13}

For some medicines, the only outcomes available might be from trials using surrogate markers, and evidence of clinical outcomes can sometimes take several years to emerge.

Caution should be used when interpreting the findings of studies that use surrogates that have not been widely validated. For example, a new osteoporosis treatment might be shown to improve bone mineral density (a widely accepted surrogate marker) but also cause dizziness or drowsiness. These side effects could contribute to falls, potentially offsetting the beneficial effect on fracture rates (a clinical outcome) associated with the improved bone density.