Redefining Biostatistics – Managing Medical Uncertainties
Statistical science has come a long way from being the epitome of number crunching. It is time that it
is recognized as a management science. This is especially true for biostatistics. I seek to define biostatistics
as the science of managing medical uncertainties [1]. This can provide a completely new orientation to the subject
and integrate it fully into medical disciplines. How is this new definition justifiable?
Health differs greatly from person to person and in the same person from time to time. The variations are so
prominent that no two individuals have ever been exactly alike. Differences in facial features and morphologic
appearance help us to identify people uniquely. But more important for medicine are the profound variations in
physiologic functions. We all know that measurements such as hemoglobin level, cholesterol level, and heart rate
differ from person to person even in perfect health. Variations occur not only between individuals but also
within individuals from time to time. Diurnal variations in body temperature, blood pressure, and blood glucose
levels are normal. In addition, states such as shock, anger, and excitement affect most of us temporarily but
have the potential to produce long-term sequelae. In the presence of such large variation, it is not surprising
that a response to a stimulus such as a drug can seldom be exactly reproduced even in the same person.
Uncertainties resulting from these variations are an essential feature of the practice of medicine and deserve
to be recognized. Medicine is not just ingesting a drug. It involves intimate interaction with the patient. More
often than not, a large number of steps are taken before arriving at a treatment regimen. The patient’s history
is taken; measurements such as weight, blood pressure, and heart rate are recorded; physical examination is
carried out; and investigations such as the electrocardiogram (ECG), x-ray studies, blood glucose measurements,
and stool examination are done. In passing through these steps, the patient sometimes encounters many observers
and many instruments. Variations among them contribute their bit to the uncertainties in clinical practice. The
assessment of diagnosis, treatment, and prognosis can all go wrong. In order to highlight the large magnitude of
these uncertainties, a list of specific contributing factors is provided below. This shows how profound
uncertainties are and how important it is to delineate them and to contain their effect.
- Biologic variability
- Genetic variability
- Variation in behavioural and other host factors
- Environmental variability
- Chance variability
- Sampling fluctuation
- Observer variability
- Variability in treatment strategies
- Instrument and laboratory variability
- Imperfect tools
- Incomplete information on the patient
- Poor compliance with the regimen
- Inadequate knowledge (epistemic, diagnostic, therapeutic and prognostic uncertainties, predictive and
Medical uncertainty at the individual level is the potential fallibility of decisions regarding diagnosis, treatment
and prognosis of health conditions. At the group or community level, this comprises lack of assurance regarding
the role of primordial and proximal risk factors of various conditions of ill-health, and regarding the effect
of various preventive and moderating interventions. In both setups, a prominent component is the uncertainty
regarding the present state and future course of events.
Medical uncertainty is easier to handle when divided into aleatory and epistemic types. These terms may sound new
to medicine but are commonly used in seismic science and economics. Aleatory uncertainty in medicine arises from
endogenous factors such as inherent biological variation, environmental factors, socio-cultural and psychological
factors, random variation due to observers, instruments and laboratories, etc. Epistemic uncertainty arises from
lack of knowledge, conceptual errors, non-availability of tools, and biases of various types. These sources are
exogenous in nature.
Management of Uncertainties
Since the uncertainties are glaring, one wonders how medicine has been successful, sometimes very successful,
in giving succor to mankind. The silver lining is that a trend can still be detected among these variations, and
following this trend yields results within clinical tolerance in most cases. The term clinical tolerance
signifies that the medical intervention may not necessarily restore the system to its homeostatic level but
tends to bring it closer to that level so that the patient feels better, almost cured. Also note the emphasis
on ‘most cases’. Positive results are not obtained in all cases, nor is this expected. But a large percentage
responds to medical intervention. Thus, the statement is doubly probabilistic and truly statistical.
Essentially, management is a value addition process that tries to optimize the output by properly organizing
the inputs. It involves elements such as goal setting; identifying quality and quantity of inputs such as men,
machine, methods, material and money in a production line, and their adequate and timely provision; minimizing
risk opportunities and maximizing conducive environment for optimal functioning of the inputs; gauging
performance; and taking rectifying and promoting steps, thus starting the cycle all over again. Management is
a flexible process rather than one of rigid adherence to consistency and conformity. It is the art of
accomplishing an assignment by translating complexity, specialization and talents into performance [2].
Value addition in the case of management of medical uncertainties is in terms of their control so that the
impact of such uncertainties on decisions is minimal. Their description and assessment are an integral part of
the process. Performance is the key here, as it is for management anywhere else. The inputs are the
aleatory variations and epistemic bottlenecks. Study design is a tool that seeks to organize these inputs.
A perfect design, when immaculately executed, would minimize the risk of reaching an invalid or unreliable
conclusion, and maximize the power of the study, for fixed inputs. Considerations such as definition of study
units and variables, sample size, method of selection, confounders, potential sources of bias including
reliability and validity of medical assessment, and the method of analysis of data, are the elements that
provide definite help in reducing the risk of reaching an invalid conclusion. With tools such as probability
and its derivatives that include frequency distribution, sensitivity, specificity, relative risk and odds
ratio; estimation methods in terms of confidence interval and meta-analysis; test of hypothesis for absence
of medically important difference; and trend analysis that sieves clear signals from noise; biostatistics
fits the bill quite admirably. By considering various options, it affords flexibility rather than the rigid
consistency and conformity that could mar a management process. Decision analysis that allows infusion of value judgements
regarding utility of various possible outcomes to the evidence-based risk assessments at the stage of diagnosis
and treatment, is also an important function of biostatistical methods for managing medical uncertainty at
The sources of intrinsic variation listed earlier are mostly beyond control but their impact can still be
managed. Other sources of uncertainty also contribute but investigations can be designed such that their
influence on decisions is minimized. Elementary concepts are given in the following paragraphs as illustration.
The quality of decisions in the long run can also be enhanced by devising and using improved medical methods.
Both require substantial statistical inputs.
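As a small illustration of the probability-based tools named earlier (sensitivity, specificity, relative risk, and odds ratio), the sketch below computes them from two-by-two tables. The function names and all counts are hypothetical and chosen only for illustration; they do not come from any study cited here.

```python
def diagnostic_accuracy(true_pos, false_pos, false_neg, true_neg):
    """Sensitivity and specificity from a 2x2 table of test result versus true status."""
    sensitivity = true_pos / (true_pos + false_neg)   # diseased subjects correctly detected
    specificity = true_neg / (true_neg + false_pos)   # healthy subjects correctly cleared
    return sensitivity, specificity


def risk_measures(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Relative risk and odds ratio from a 2x2 table of exposure versus outcome."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    relative_risk = risk_exposed / risk_unexposed
    odds_ratio = (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)
    return relative_risk, odds_ratio


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    sens, spec = diagnostic_accuracy(true_pos=90, false_pos=40, false_neg=10, true_neg=160)
    rr, odds = risk_measures(30, 70, 10, 90)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
    print(f"relative risk={rr:.2f}, odds ratio={odds:.2f}")
```

Each of these measures is a proportion or a ratio of proportions, which is why they are natural instruments for describing how often a decision based on a test or a risk factor can be expected to go wrong.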
The real challenge in research comes from epistemic uncertainties that arise from inadequate medical knowledge.
Statements are sometimes made without realizing that they are assumptions. Gastric ulcer was thought to be
caused by acidity till it was established that the culprit is Helicobacter pylori in many cases. Thus even
fully established ‘facts’ should be continuously evaluated and replaced by new ones where needed.
A clinical trial aims to evaluate the efficacy of one or more treatment procedures (generally different drugs
or different dosages of the same drug) relative to one another. The comparator could be ‘no treatment’ or
the ‘existing treatment’ and is termed the control. Among the precautions sometimes taken is matching of the subjects in
various groups so that the known sources of variability have less influence on the outcome. Another very effective
strategy is randomization, which equalizes the chance of the presence of different sources of uncertainty in
various groups including the unknown sources. The techniques of observation and measurement are standardized
and uniformly implemented to minimize the diverse influence of these techniques on the outcome. If identifiable
sources of uncertainty still remain uncontrolled, they are taken care of at the time of analysis by suitable
adjustments. Appropriate statistical methods help to come to a conclusion that has only a small likelihood of
These preliminaries are stated in the context of clinical trials, but other medical investigations, be it in
a community, in a clinic, or in a laboratory, have the same basic structure and require similar statistical
inputs. Descriptive research, such as to estimate the magnitude of the problem or to delineate normal levels
of, say, a hematological parameter in a specific population, also requires similar care in selection of
subjects and similar quality control of the instruments. The investigations into cause and effect or
association also need similar inputs to minimize the influence of uncertainty and thus to increase the
reliability of the conclusions.
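The randomization strategy described above for clinical trials can be sketched in a few lines. This is a minimal sketch of one simple scheme (a random permutation of the subjects dealt out in turn to equal-sized groups), not a description of how any particular trial was run; the function name, group labels, and subject identifiers are hypothetical.

```python
import random


def randomize(subject_ids, groups=("treatment", "control"), seed=None):
    """Allocate subjects to groups by random permutation (hypothetical helper).

    Every subject has the same chance of landing in each group, so known and
    unknown sources of variability tend to balance out across groups.
    """
    rng = random.Random(seed)          # a seed makes the allocation list reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)              # random order of the subjects
    allocation = {group: [] for group in groups}
    for position, subject in enumerate(shuffled):
        # Deal the shuffled subjects out to the groups in turn.
        allocation[groups[position % len(groups)]].append(subject)
    return allocation


if __name__ == "__main__":
    allocation = randomize(range(1, 21), seed=2024)
    for group, members in allocation.items():
        print(group, sorted(members))
```

Because the allocation depends only on the random shuffle and not on any characteristic of the subjects, neither the investigator nor the patient can steer who receives which regimen.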
Improved Medical Methods
Although the health of each individual is important, and clinical practice must use the best available methods,
research endeavors generally require especially improved methods that are more accurate and more exact.
This makes medical research an expensive proposition but compromise on improved methods can substantially
affect the quality of research. If such improved methods are not available, research may have to be redirected
to devise such methods.
Incomplete knowledge about a patient's condition can be substantially overcome by research into new methods that
are quicker, safer, and more accurate and that can be performed more easily. To fill gaps in medical knowledge,
research into more exact delineation of factors responsible for specific conditions of ill health and their
mechanism is required. All this will help in devising strategies to minimize the uncertain space.
Some epistemic uncertainties can be minimized by using an appropriate scoring system. Newer treatment regimens or
other modes of patient management need to be discovered so that better relief can be provided. Inadequacies in
medical tools such as diagnostic tests can be removed only by research on newer, more valid, and more reliable
tools. Compliance with prescribed regimens can be improved by devising regimens that are simple to implement,
less toxic, and more effective. Instrument and observer variability can be controlled mostly by adhering to
strict standards and by training. Thus, improving the methods can minimize the uncertainties arising from these
deficiencies. Research into these requires scientific investigations so that the conclusions arrived at are
valid as well as reliable. Proper design of the investigation helps to achieve this aim.
Analysis and Synthesis
Because of the uncertainties involved at every stage of a medical investigation, the conclusion can seldom be
drawn in a straightforward manner. In almost all cases, the data obtained are carefully examined to find the
answers to the questions initially proposed. For this, it is generally necessary that the data be collated in
the form of tables, charts, or diagrams. Some summary measures are also chosen and computed to draw inferences.
Because of the inherent variations in the data and because only a sample of the subjects is investigated rather
than the entire target population, some special methods are required to draw valid conclusions— collectively
called techniques of statistical inference. These techniques depend on the type of questions asked, on the
design of the study, on the kind of measurements used, on the number of groups investigated, on the number of
subjects studied in each group, etc. These techniques are the primary focus of biostatistics. The role of statistical
analysis is to help draw valid and reliable conclusions.
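One elementary technique of statistical inference is the confidence interval, which expresses how far a sample estimate may lie from the population value because of sampling fluctuation. The sketch below uses the large-sample normal approximation for a proportion; this choice of method is an assumption made for simplicity (other interval methods are often preferred for small samples or extreme proportions), and the function name and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(events, sample_size, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion.

    Assumes a simple random sample that is large enough for the
    approximation to be reasonable.
    """
    estimate = events / sample_size
    # Standard normal quantile for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sqrt(estimate * (1 - estimate) / sample_size)
    return estimate - z * standard_error, estimate + z * standard_error


if __name__ == "__main__":
    # Hypothetical survey: 120 of 1000 sampled adults have the condition.
    low, high = proportion_confidence_interval(events=120, sample_size=1000)
    print(f"estimated prevalence 12.0%, 95% CI {100 * low:.1f}% to {100 * high:.1f}%")
```

The width of the interval shrinks as the sample size grows, which is one way in which the number of subjects studied enters the reliability of the conclusion.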
Although statistical analysis is acknowledged as an essential step in empirical research, the importance of
synthesis is sometimes overlooked. Synthesis is the process of combining and reconciling varied and sometimes
conflicting evidence. The findings of one investigation often do not match those of another.
Diabetes, smoking habits and blood pressure levels were found to be significant factors in mortality in Italy
in one study but not in other studies in the same country [3]. Prevalence of hypertension in India was found to
range widely from 0.36 to 30.92 percent in a general population of adults [4]. These differences occur
for a variety of reasons such as genuine population differences; sampling fluctuation; differences in
definitions, methodology, and instruments; and differences in the statistical methods used. A major scientific
activity is to synthesize these varying results and arrive at a consensus. The discussion part of most articles
published in medical journals tries to do such a synthesis. The objective of most review articles is basically
to present a holistic view after reconciling the varying results in different studies. In addition, techniques
such as meta-analysis seek to combine evidence from different studies. This text does not discuss the synthesis
methods, although these methods too are primarily statistical in nature and are important for medical research.
Aleatory uncertainties are the basic ingredient of statistical methods. These can be very adequately managed
by these methods. The same cannot be stated about epistemic uncertainties. Sensitivity analysis [5] can be
effectively used to delineate the impact of some epistemic uncertainties. However, these uncertainties can rarely be minimized.
There are epistemic bottlenecks for which apparently there is no solution except further research. If the
underlying process of emergence and progression of a health condition is unclear, modeling will have to be
based on conjectures. They may or may not stand the test of time. No science is available that can
adequately deal with the unknown except, to some extent, statistics that pools all these together under
‘error term’, and provides methods to examine them.
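The idea of sensitivity analysis mentioned above can be illustrated with a simple one-way version: an input that is not known with confidence is varied over a plausible range, and the effect on the conclusion is observed. In this sketch the uncertain input is the prevalence of disease among those tested, and the output is the positive predictive value of a diagnostic test obtained by Bayes' rule. The function names and all numerical values are hypothetical, and a formal sensitivity analysis would typically vary several inputs together.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of disease given a positive test, by Bayes' rule."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


def one_way_sensitivity(sensitivity, specificity, prevalences):
    """Recompute the predictive value for each assumed prevalence (hypothetical helper)."""
    return {
        prevalence: positive_predictive_value(sensitivity, specificity, prevalence)
        for prevalence in prevalences
    }


if __name__ == "__main__":
    # Hypothetical test characteristics; the prevalence is the poorly known input.
    results = one_way_sensitivity(0.90, 0.80, [0.01, 0.05, 0.10, 0.25, 0.50])
    for prevalence, ppv in results.items():
        print(f"assumed prevalence {prevalence:.2f} -> positive predictive value {ppv:.3f}")
```

If the conclusion changes little across the plausible range, the gap in knowledge matters little for the decision; if it changes a lot, the analysis shows where further research is most needed.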
Considering all this, it seems very appropriate to define biostatistics as the science of managing medical
uncertainties. No definition is perfect or convincing to everyone, but this definition seems
to describe the subject in a very appropriate manner. Incidentally, this definition emphasizes that “bio” is an
integral part of biostatistics and exemplifies the fusion of medicine with statistics that Feinstein [6]
emphasized so much. Conventionally, biostatistics has come to be identified with medicine rather than other
biological disciplines such as agriculture or fisheries—thus restricting it to the medical uncertainties looks
1. Indrayan A. Medical Biostatistics, Second Edition. New York: CRC Press, 2008:p2.
2. Magretta J. What Management Is. New York: The Free Press, 2002:pp1-4.
3. Menotti A, Seccareccia F. Cardiovascular risk factors predicting all causes of death in an occupational
population sample. Int J Epidemiol 1988; 17:773-778.
4. Gupta R. Meta-analysis of prevalence of hypertension in India. Indian Heart J 1997; 49:43-48.
5. Saltelli A, Chan K, Scott EM, editors. Sensitivity Analysis. New York: John Wiley, 2000.
6. Feinstein AR. Clinical Biostatistics. Saint Louis: The CV Mosby Company, 1977:p4.