Experimental Research

The major feature that distinguishes experimental research from other types of research is that the researcher manipulates the independent variable. There are a number of group designs associated with experimental research. Some of these qualify as experimental research; others do not.

  • In true experimental research, the researcher not only manipulates the independent variable, he or she also randomly assigns individuals to the various treatment categories (i.e., control and treatment).
  • In quasi-experimental research, the researcher does not randomly assign subjects to treatment and control groups. In other words, the treatment is not distributed among participants randomly. In some cases, a researcher may randomly assign one whole group to treatment and one whole group to control. In this case, quasi-experimental research involves using intact groups in an experiment, rather than assigning individuals at random to research conditions. (Some researchers define this latter situation differently. For our course, we will allow this definition.)
  • In causal comparative (ex post facto) research, the groups are already formed. It does not meet the standards of an experiment because the independent variable is not manipulated.
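The difference between assigning individuals at random and assigning intact groups at random can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, student names, and both function names are hypothetical and do not come from the course text.

```python
import random


def assign_individuals(participants, seed=None):
    """True experiment: each individual is randomly assigned to a condition."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


def assign_intact_groups(groups, seed=None):
    """Quasi-experiment (as defined in this course): whole groups are assigned."""
    rng = random.Random(seed)
    names = list(groups)
    rng.shuffle(names)
    half = len(names) // 2
    return {
        "treatment": {name: groups[name] for name in names[:half]},
        "control": {name: groups[name] for name in names[half:]},
    }


# Hypothetical rosters for two existing classes.
classes = {
    "class_a": ["Ana", "Ben", "Cam", "Dee"],
    "class_b": ["Eli", "Fay", "Gus", "Hal"],
}
students = [s for roster in classes.values() for s in roster]

individual_design = assign_individuals(students, seed=1)
group_design = assign_intact_groups(classes, seed=1)

print(individual_design)  # students from both classes are mixed across conditions
print(group_design)       # each class stays together in a single condition
```

In the first design, any pre-existing difference between the two classes is spread across both conditions by chance. In the second, whatever made the classes different before the study travels with them into the treatment and control conditions.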

The statistics by themselves have no meaning. They only take on meaning within the design of your study. If we just examine stats, bread can be deadly.

The term validity is used three ways in research…

  1. In the sampling unit, we learn about external validity (generalizability).
  2. In the survey unit, we learn about instrument validity.
  3. In this unit, we learn about internal validity and external validity. Internal validity means that the differences found between groups on the dependent variable in an experiment were directly related to what the researcher did to the independent variable, and not due to some other unintended variable (a confounding variable). Simply stated, the question addressed by internal validity is “Was the study done well?” Once the researcher is satisfied that the study was done well and that the independent variable caused the change in the dependent variable (internal validity), the researcher examines external validity: under what conditions (ecological) and with whom (population) can these results be replicated? In other words, will I get the same results with a different group of people or under different circumstances? If a study is not internally valid, then considering external validity is a moot point; if the independent variable did not cause the change in the dependent variable, there is no point in generalizing the results to other situations. Interestingly, as one tightens a study to control for threats to internal validity, one decreases the generalizability of the study (to whom and under what conditions one can generalize the results).

There are several common threats to internal validity in experimental research. They are described in our text. I have reviewed each below (this material is also included in the PowerPoint Presentation on Experimental Research for this unit):

  • Subject Characteristics (Selection Bias/Differential Selection) — The groups may have been different from the start. If you were testing instructional strategies to improve reading and one group enjoyed reading more than the other group, that group may improve more in reading because its members enjoy it, rather than because of the instructional strategy you used.
  • Loss of Subjects (Mortality) — All of the high- or low-scoring subjects may have dropped out of, or been missing from, one of the groups. If we collected posttest data on a day when the honor society was on a field trip at the treatment school, the mean for the treatment group would probably be much lower than it really should have been.
  • Location — Perhaps one group was at a disadvantage because of its location. The city may have been demolishing a building next to one of the schools in our study, creating constant distractions that interfere with our treatment.
  • Instrumentation (Instrument Decay) — The testing instruments may not be scored consistently. Perhaps the person grading the posttest is fatigued and pays less attention to the last set of papers reviewed. It may be that those papers are from one of our groups and will receive different scores than the earlier group’s papers.
  • Data Collector Characteristics — The subjects of one group may react differently to the data collector than the other group. A male interviewing males and females about their attitudes toward a type of math instruction may not receive the same responses from females as a female interviewing females would.
  • Data Collector Bias — The person collecting data may favor one group, or some characteristic that some subjects possess, over another. A principal who favors strict classroom management may rate students’ attention under different teaching conditions with a bias toward one of the teaching conditions.
  • Testing — The act of taking a pretest or posttest may influence the results of the experiment. Suppose we were conducting a unit to increase student sensitivity to prejudice. As a pretest we have the control and treatment groups watch Schindler’s List and write a reaction essay. The pretest may have actually increased both groups’ sensitivity, and we find that our treatment group didn’t score any higher on a posttest given later than the control group did. If we hadn’t given the pretest, we might have seen differences in the groups at the end of the study.
  • History — Something may happen at one site during our study that influences the results. Perhaps a classmate dies in a car accident at the control site for a study teaching children bike safety. The control group may actually demonstrate more concern about bike safety than the treatment group.
  • Maturation — There may be natural changes in the subjects that can account for the changes found in a study. A critical thinking unit may appear more effective if it is taught during a time when children are developing abstract reasoning.
  • Hawthorne Effect — The subjects may respond differently just because they are being studied. The name comes from a classic study in which researchers were studying the effect of lighting on worker productivity. As the intensity of the factory lights increased, so did the work productivity. One researcher suggested that they reverse the treatment and lower the lights. The productivity of the workers continued to increase. It appears that being observed by the researchers was increasing productivity, not the intensity of the lights.
  • John Henry Effect — One group may view itself as being in competition with the other group and may work harder than it would under normal circumstances. This generally is applied to the control group “taking on” the treatment group. The term refers to the classic story of John Henry laying railroad track.
  • Resentful Demoralization of the Control Group — The control group may become discouraged because it is not receiving the special attention that is given to the treatment group. They may perform lower than usual because of this.
  • Regression (Statistical Regression) — A class that scores particularly low on one test can be expected to score slightly higher the next time just by chance. Likewise, a class that scores particularly high will have a tendency to score slightly lower by chance. The change in these scores may have nothing to do with the treatment.
  • Implementation — The treatment may not be implemented as intended. A study where teachers are asked to use student modeling techniques may not show positive results, not because modeling techniques don’t work, but because the teachers didn’t implement them or didn’t implement them as they were designed.
  • Compensatory Equalization of Treatment — Someone may feel sorry for the control group because they are not receiving much attention and give them special treatment. For example, a researcher could be studying the effect of laptop computers on students’ attitudes toward math. The teacher feels sorry for the class that doesn’t have computers and sponsors a popcorn party during math class. The control group begins to develop a more positive attitude about mathematics.
  • Experimental Treatment Diffusion — Sometimes the control group actually implements the treatment. If two different techniques are being tested in two different third grades in the same building, the teachers may share what they are doing. Unconsciously, the control teacher may use some of the techniques she or he learned from the treatment teacher.
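The regression (statistical regression) threat in the list above can be made concrete with a small simulation. The sketch below assumes, purely for illustration, that each test score is a stable ability plus independent random noise. The function name, the score scale, and all the numbers are hypothetical. No treatment is applied at any point.

```python
import random
from statistics import mean


def simulate_regression(n_students=1000, n_lowest=100, seed=0):
    """Return the mean test 1 and test 2 scores of the lowest test 1 scorers."""
    rng = random.Random(seed)
    # Assumption: a stable "true" ability, plus independent noise on each test.
    ability = [rng.gauss(70, 10) for _ in range(n_students)]
    test1 = [a + rng.gauss(0, 10) for a in ability]
    test2 = [a + rng.gauss(0, 10) for a in ability]
    # Pick the students who scored lowest on the first test only.
    lowest = sorted(range(n_students), key=lambda i: test1[i])[:n_lowest]
    first_mean = mean([test1[i] for i in lowest])
    second_mean = mean([test2[i] for i in lowest])
    return first_mean, second_mean


first, second = simulate_regression()
print(f"lowest scorers, test 1 mean: {first:.1f}")
print(f"lowest scorers, test 2 mean: {second:.1f}")
```

Under these assumptions, the lowest scorers improve on the second test even though nothing was done to them, because part of their low first score was bad luck that does not repeat. A study that selects an extreme group and then credits the treatment for its gain should rule this explanation out first.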

When planning a study, it is important to consider the threats to internal validity as we finalize the study design. After we complete our study, we should reconsider each of the threats to internal validity as we review our data and draw conclusions.

Del Siegle, Ph.D.
Neag School of Education – University of Connecticut