External Validity | Educational Research Basics by Del Siegle | Neag School of Education

Note to EPSY 5601 Students: An understanding of the difference between population and ecological validity is sufficient. Mastery of the subcategories of each is not necessary for this course.

External Validity
(Generalizability)
–to whom can the results of the study be applied–

There are two types of study validity: internal (more applicable with experimental research) and external. This section covers external validity.

External validity is the extent to which the results of a study can be generalized (applied) beyond the sample. In other words, can you apply what you found in your study to other people (population validity) or settings (ecological validity)? For example, a study of fifth graders in a rural school that found one method of teaching spelling superior to another may not apply to third graders (population) in an urban school (ecological).

Threats to External Validity

Population Validity: the extent to which the results of a study can be generalized from the specific sample that was studied to a larger group of subjects

1. the extent to which one can generalize from the study sample to a defined population–
If the sample is drawn from an accessible population, rather than the target population, generalizing the research results from the accessible population to the target population is risky.
2. the extent to which personological variables interact with treatment effects–
If the study is an experiment, different results might be found with students at different grade levels (grade level is a personological variable).
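The second point above can be made concrete with a small sketch. It assumes a hypothetical spelling study in which two teaching methods are compared in two grades; the scores, the grade labels, and the helper name `treatment_effect` are all invented for illustration and do not come from any real study. If the effect of method A over method B differs by grade, grade is interacting with the treatment, and a result from one grade should not be generalized to the other.

```python
from statistics import mean

# Hypothetical spelling scores (invented for illustration, not real data),
# keyed by (grade, teaching method).
scores = {
    ("grade 3", "method A"): [70, 72, 68, 71],
    ("grade 3", "method B"): [71, 69, 70, 72],
    ("grade 5", "method A"): [80, 84, 82, 86],
    ("grade 5", "method B"): [72, 74, 70, 76],
}


def treatment_effect(grade):
    """Mean score under method A minus mean score under method B for one grade."""
    return mean(scores[(grade, "method A")]) - mean(scores[(grade, "method B")])


# Effect of method A over method B, computed separately for each grade.
effects = {grade: treatment_effect(grade) for grade in ("grade 3", "grade 5")}

# A nonzero difference between the two effects suggests that grade
# (a personological variable) interacts with the treatment.
interaction = effects["grade 5"] - effects["grade 3"]

for grade, effect in effects.items():
    print(f"{grade}: effect of method A over method B = {effect:.2f}")
print(f"difference between grades = {interaction:.2f}")
```

In this invented data set, method A helps fifth graders but makes almost no difference for third graders, so a finding from the fifth-grade sample would not carry over. A real analysis would also need a significance test for the interaction, which this sketch omits.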

Ecological Validity: the extent to which the results of an experiment can be generalized from the set of environmental conditions created by the researcher to other environmental conditions (settings and conditions).

Explicit description of the experimental treatment (not sufficiently described for others to replicate)
If the researcher fails to adequately describe how he or she conducted a study, it is difficult to determine whether the results are applicable to other settings.
Multiple-treatment interference (catalyst effect)
If a researcher applies several treatments, it is difficult to determine how well each treatment would work individually. It might be that only the combination of the treatments is effective.
Hawthorne effect (attention causes differences)
Subjects perform differently because they know they are being studied. “…External validity of the experiment is jeopardized because the findings might not generalize to a situation in which researchers or others who were involved in the research are not present” (Gall, Borg, & Gall, 1996, p. 475).
Novelty and disruption effect (anything different makes a difference)
A treatment may work because it is novel and the subjects respond to its uniqueness rather than to the treatment itself. The opposite may also occur: a treatment may fail because it is unfamiliar, even though it might have worked had the subjects been given time to adjust to it.
Experimenter effect (it only works with this experimenter)
The treatment might have worked because of the person implementing it. Given a different person, the treatment might not work at all.
Pretest sensitization (pretest sets the stage)
A treatment might only work if a pretest is given. Because they have taken a pretest, the subjects may be more sensitive to the treatment. Had they not taken a pretest, the treatment would not have worked.
Posttest sensitization (posttest helps treatment “fall into place”)
The posttest can become a learning experience. “For example, the posttest might cause certain ideas presented during the treatment to ‘fall into place’ ” (p. 477). If the subjects had not taken a posttest, the treatment would not have worked.
Interaction of history and treatment effect (…to everything there is a time…)
Not only should researchers be cautious about generalizing to other populations, they should also be cautious about generalizing to a different time period. As time passes, the conditions under which treatments work change.
Measurement of the dependent variable (maybe only works with M/C tests)
A treatment effect may be evident only with certain types of measurement. A teaching method may produce superior results when its effectiveness is tested with an essay test, but show no difference when its effectiveness is measured with a multiple-choice test.
Interaction of time of measurement and treatment effect (it takes a while for the treatment to kick in)
It may be that the treatment effect does not occur until several weeks after the end of the treatment. In this situation, a posttest at the end of the treatment would show no impact, but a posttest a month later might show an impact.
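As one illustration of the threats listed above, pretest sensitization can be checked by comparing the treatment effect among subjects who took a pretest with the effect among subjects who did not. A layout with all four combinations of pretest and treatment is sometimes called a Solomon four-group design; that name is not used in the source and is added here only for orientation. The sketch below uses invented posttest scores and a hypothetical helper named `effect`; it is a minimal illustration, not an analysis of real data.

```python
from statistics import mean

# Hypothetical posttest scores (invented for illustration, not real data),
# keyed by (took_pretest, received_treatment).
posttest = {
    (True, True): [85, 88, 84, 87],
    (True, False): [70, 72, 71, 71],
    (False, True): [72, 70, 73, 73],
    (False, False): [71, 70, 72, 71],
}


def effect(took_pretest):
    """Treated mean minus untreated mean, within one pretest condition."""
    treated = mean(posttest[(took_pretest, True)])
    untreated = mean(posttest[(took_pretest, False)])
    return treated - untreated


effect_with_pretest = effect(True)
effect_without_pretest = effect(False)

# A large gap between the two effects suggests the treatment works mainly
# when a pretest has been given, which is what pretest sensitization means.
sensitization = effect_with_pretest - effect_without_pretest

print(f"effect with pretest:    {effect_with_pretest:.2f}")
print(f"effect without pretest: {effect_without_pretest:.2f}")
print(f"difference:             {sensitization:.2f}")
```

In this invented example the treatment appears effective only for the pretested groups, so the result would not generalize to settings where no pretest is given. The same comparison of effects across conditions applies to several other threats in the list, such as type of measurement or time of measurement.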

Bracht, G. H., & Glass, G. V. (1968). The external validity of experiments. American Educational Research Journal, 5, 437-474.
Gall, M. D., Borg, W. R., & Gall, J. P. (1996). Educational research: An introduction. White Plains, NY: Longman.

Del Siegle, Ph.D.
Neag School of Education – University of Connecticut
del.siegle@uconn.edu