Annotated Bibliography of Research Methods

Locate and read peer-reviewed articles on a variety of research designs and methods. Be sure to include at least one article regarding the method used in the study you selected in Week One. Create an annotated bibliography in which you select, summarize, and evaluate at least six of these peer-reviewed sources. The sources chosen must have been published within the last 10 years. Note that the articles required for this assignment are not research studies; instead, they are articles about how to do various aspects of research.

At least three of the sources should be about qualitative research methods, and at least three should focus on quantitative research methods. Do not select any articles about both qualitative and quantitative methods or about mixed methods. For each source, identify the approach (qualitative or quantitative) and the research design category (non-experimental or experimental). Summarize each source in your own words, describing its purpose and how the information would be useful to a researcher designing a new study. Do not use quotations from references for this assignment.

In searching for articles on topics in psychology and other social sciences, it is strongly recommended that you utilize the EBSCOhost, ProQuest, and SAGE Journals Online databases. You may only select articles that are available in full-text format. To do this, limit your search criteria by selecting the Full-Text box in the search function of the database. Your paper must be a minimum of four pages (excluding title page) and formatted according to APA style.

Chapter 3
Descriptive Designs—Observing Behavior
Chapter Contents
• Qualitative Methods
• Case Studies
• Archival Research
• Observational Research
• Describing Your Data
new66480_03_c03_p089-132.indd 89
10/31/11 9:39 AM
In the fall of 2009, Phoebe Prince and her family relocated from Ireland to South Hadley,
Massachusetts. Phoebe was immediately singled out by bullies at her new high school
and subjected to physical threats, insults about her Irish heritage, and harassing posts
on her Facebook page. This relentless bullying continued until January of 2010, ending
only because Phoebe elected to take her own life in order to escape her tormentors (UPI,
2011). Tragic stories like this one are all too common, and it should come as no surprise
that the Centers for Disease Control have identified bullying as a serious problem facing
our nation’s children and adolescents (CDC, 2002).
Scientific research on bullying began in Norway
in the late 1970s in response to a wave of teen suicides. Work begun by psychologist Dan Olweus—
and since continued by many others—has documented both the frequency and the consequences
of bullying in the school system. Thus, we know
that approximately one third of children are victims of bullying at some point during development, with between 5% and 10% bullied on a regular basis (Griffin & Gross, 2004; Nansel et al., 2001).
Victimization by bullies has been linked with a wide range of emotional and behavioral problems, including depression, anxiety, self-reported health problems, and an increased risk of both violent behavior and suicide (for a detailed review, see Griffin & Gross, 2004). Recent research even suggests that bullying during adolescence may have a lasting impact on the body's physiological stress response (Hamilton et al., 2008).
But most of this research has a common limitation: It has studied the phenomenon of bullying using self-report survey measures. That is, researchers typically ask students and
teachers to describe the extent of bullying in the schools, and/or have students fill out a
collection of survey measures, describing in their own words both bullying experiences
and psychological functioning. These studies are conducted rigorously, and the measures they use certainly meet the criteria of reliability and validity that we discussed in
Chapter 2. However, as Wendy Craig, Professor of Psychology at Queen’s University, and
Debra Pepler, a Distinguished Professor at York University, suggested in a 1997 article,
this questionnaire approach is unable to capture the full context of bullying behaviors.
And, as we have already discussed, self-report measures are fully dependent on people’s
ability to answer honestly and accurately. In order to address this limitation, Craig and
Pepler (1997) decided to observe bullying behaviors as they occurred naturally on the
playground. Among other things, the researchers found that acts of bullying occurred
approximately every 7 minutes, lasted only about 38 seconds, and tended to occur within
120 feet of the school building. They also found that peers intervened to try to stop the
bullying more than twice as often as adults did (11% vs. 4%, respectively). These findings
add significantly to scientific understanding of when and how bullying occurs. And for
our purposes, the most notable thing about them is that none of the findings could have
been documented without directly observing and recording bullying behaviors on the
playground. By using this technique, the researchers were able to gain a more thorough
understanding of the phenomenon of bullying and thus able to provide real-world advice
to teachers and parents.
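Findings like these come from systematically coded observation records. As a minimal illustrative sketch (the log entries, durations, and observation time below are invented, not Craig and Pepler's data), here is how a coded observation log can be turned into summary statistics such as episode frequency and intervention rates:

```python
# Hypothetical coded observation log: each entry is one bullying episode,
# recording how long it lasted and who, if anyone, tried to intervene.
# All entries and the observation time are invented for illustration.
episodes = [
    {"duration_s": 45, "intervener": "peer"},
    {"duration_s": 30, "intervener": None},
    {"duration_s": 38, "intervener": "adult"},
    {"duration_s": 40, "intervener": "peer"},
    {"duration_s": 35, "intervener": None},
]

observation_minutes = 35  # total time spent observing

# Frequency: on average, one episode per how many minutes of observation?
minutes_per_episode = observation_minutes / len(episodes)

# Mean episode duration, in seconds.
mean_duration = sum(e["duration_s"] for e in episodes) / len(episodes)

# Intervention rates: the fraction of episodes in which peers vs. adults stepped in.
peer_rate = sum(e["intervener"] == "peer" for e in episodes) / len(episodes)
adult_rate = sum(e["intervener"] == "adult" for e in episodes) / len(episodes)

print(minutes_per_episode, mean_duration, peer_rate, adult_rate)
```

In a real study, the coding scheme and the reliability of the observers would have to be established before any such tallies were computed.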
One recurring theme in this book is that it is absolutely critical to pick the right research
design to address your hypothesis. Over the next three chapters, we will be discussing
three specific categories of research designs, proceeding in order of increasing control over
elements of the design. This chapter focuses on descriptive research designs, in which the
primary goal is to describe attitudes and behavior. We will begin by contrasting qualitative and quantitative approaches to description. We will then discuss three examples of
descriptive designs—case studies, archival research, and observational research—covering the basic concept and the pros and cons of each. Finally, this chapter concludes with
a discussion of guidelines for presenting descriptive data in graphical, numerical, and
narrative form.
Figure 3.1: Descriptive Designs on the Continuum of Control
• Case Study
• Archival Research
• Observation
• Survey Research
• Quasi-experiments
• “True” Experiments
Increasing Control . . .
3.1 Qualitative Methods
We learned in Chapter 1 that researchers generally take one of two broad
approaches to answering their research questions. Quantitative research is a
systematic and empirical approach that attempts to generalize results to other
contexts, whereas qualitative research is a more descriptive approach that attempts to
gain a deep understanding of particular cases and contexts. Before we discuss specific
examples of descriptive designs, it is important to understand that these can represent
either quantitative or qualitative perspectives. In this section, we examine the qualitative
approach in more detail.
In Chapter 1 we used the analogy of studying traffic patterns to contrast qualitative and
quantitative methods—a qualitative researcher would likely study a single busy intersection in detail. This illustrates a key point about this approach: Qualitative researchers
are focused on interpreting and making sense out of what they observe rather than trying to simplify and quantify these observations. In general, qualitative research involves
a collection of interviews and observations made in a natural setting. Regardless of the
overall approach (qualitative or quantitative), collecting data in the real world results in
less control and structure than does collecting data in a laboratory setting. But whereas
quantitative researchers might view reduced control as a threat to reliability and validity,
qualitative researchers view it as a strength of the study. By conducting observations in a
natural setting, it is possible to capture people’s natural and unfiltered responses.
As an example, consider two studies on the ways people respond to traumatic events.
In a 1993 paper, psychologists James Pennebaker and Kent Harber took a quantitative
approach to examining the community-wide impact of the 1989 Loma Prieta earthquake
(near San Francisco). These researchers conducted phone surveys of 789 area residents,
asking people to indicate, using a 10-point scale, how often they “thought about” and
“talked about” the earthquake over the 3-month period after its occurrence. In analyzing these data, Pennebaker and Harber discovered that people tend to stop talking about
traumatic events about 2 weeks after they occur but keep thinking about the event for
approximately 4 more weeks. That is, the event is still on people’s minds, but they decide
to stop discussing it with other people. In a follow-up study of the 1991 Gulf War, these researchers found that the conflict led to an increased risk of illness (Pennebaker & Harber, 1991). Thus, the goal of each study was to gather data in a controlled manner and
test a set of hypotheses about community responses to trauma.
Contrast this approach with the one taken by developmental psychologist Paul Miller and colleagues (in press), who used a qualitative approach to study the ways that parents model coping behavior for their children. These researchers
conducted semistructured interviews of 24 parents whose families had been evacuated
following the 2007 wildfires in San Diego County and an additional 32 parents whose
families had been evacuated following a 2008 series of deadly tornadoes in Tennessee.
Due to a lack of prior research on how parents teach their children to cope with trauma,
Miller and colleagues approached their interviews with the goal of “documenting and
describing” (p. 8) these processes. That is, rather than attempt to impose structure and test
a strict hypothesis, the researchers focused on learning from these interviews and letting
the interviewees’ perspectives drive the acquisition of knowledge.
In the following three sections, we examine three specific examples of descriptive designs—
case studies, archival research, and observational research. Because each of these methods
has the goal of describing attitudes, feelings, and behaviors, each one can be used from
either a quantitative or a qualitative perspective. In other words, qualitative and quantitative researchers use many of the same general methods but do so with different goals. To
illustrate this flexibility, we will end each section with a paragraph that contrasts qualitative and quantitative uses of the particular method.
3.2 Case Studies
At the 1996 meeting of the American Psychological Association, James Pennebaker—chair of the Psychology department at The University of Texas—delivered an invited address, describing his research on the benefits of therapeutic
writing. Rather than follow the expected route of showing graphs and statistical tests to
support his arguments, Pennebaker told a story. In the mid-1980s, when Pennebaker’s
lab was starting to study the effects of structured writing on physical and psychological
health, one study participant was an American soldier who had served in the Vietnam
War. Like many others, this soldier had had difficulty adjusting to what had happened
during the war and consequent trouble reintegrating into “normal” life. In Pennebaker’s
study, he was asked to simply spend 15 minutes per day, over the course of a week, writing about a traumatic experience—in this case, his tour of duty in Vietnam. At the end of
this week, as you might expect, this veteran felt awful; these were unpleasant memories
that he had not relived in over a decade. But over the next few weeks, amazing things
started to happen: He slept better; he made fewer
visits to his doctor; he even reconnected with his
wife after a long separation!
Pennebaker’s presentation was a case study,
which provides a detailed, in-depth analysis
of one person over a period of time. Although
this case study was collected as part of a larger
quantitative experiment, case studies are usually
conducted in a therapeutic setting and involve a
series of interviews. An interviewer will typically
study the subject in detail, recording everything
from direct quotes and observations to his or her
own interpretations. We encountered this technique briefly in Chapter 2, in discussing Oliver
Sacks’s case studies of individuals learning to live
with neurological impairments.
Pros and Cons of Case Studies
Case studies in psychology are a form of qualitative research and represent the lowest point
on our continuum of control. Because they involve one person at a time, without a control
group, case studies are often unsystematic. That is, the participants are chosen because they
tell a compelling story or because they represent an unusual set of circumstances, rather
than being selected randomly. Studying these individuals allows for a great deal of exploration, which can often inspire future research. However, it is nearly impossible to generalize
from one case study to the larger population. In addition, because the case study includes
both direct observation and the researcher’s interpretation, there is a risk that a researcher’s
biases might influence the interpretations. For example, Pennebaker’s investment in demonstrating that writing has health benefits could have led to more positive interpretations of
the Vietnam vet’s outcomes. However, in this particular case study, the benefits of writing
mirror those seen in hundreds of controlled experimental studies that involved thousands
of people, so we can feel confident in the conclusions from the single case.
Case studies have two distinct advantages over other forms of research. First is the simple
fact that anecdotes are persuasive. Despite Pennebaker’s nontraditional approach to a
scientific talk, the audience came away utterly convinced of the benefits of therapeutic
writing. And despite the fact that Oliver Sacks studies one neurological patient at a time,
the stories in his books shed very convincing light on the ability of humans to adapt to
their circumstances. Second, case studies provide a useful way to study rare populations
and individuals with rare conditions. For example, from a scientific point of view, the
ideal might be to gather a random sample of individuals living with severe memory
impairment due to alcohol abuse and conduct some sort of controlled study in a laboratory environment. This approach could allow us to make causal statements about the
results, as we will discuss in Chapter 5. But from a practical point of view, this study
would be nearly impossible to conduct, making case studies such as Sacks’s interviews
with William Thompson the best strategy for understanding this condition in depth.
Examples of Case Studies
Throughout the history of psychology, case studies have been used to address a number of important questions and to provide a starting point for controlled quantitative
studies. For example, in developing his theories of cognitive development, the Swiss
psychologist Jean Piaget studied the way that his own children developed and changed
their thinking styles. Piaget proposed that children would progress through a series of
four stages in the way that they approached the world—sensorimotor, preoperational,
concrete operational, and formal operational—with each stage involving more sophisticated cognitive skills than the previous stage. By observing his own children, Piaget
noticed preliminary support for this theory and later was able to conduct more controlled research with larger populations.
Perhaps one of the most famous case studies in psychology is the story of Phineas Gage (1823–1860), a 19th-century railroad worker who suffered severe brain damage. In September of 1848, Gage was working with a team to blast large sections of rock to make way for new rail lines. After a large hole was drilled into a section of rock, Gage's job was to pack the hole with gunpowder, sand, and a fuse and then tamp it down with a long cylindrical iron rod (known as a "tamping rod"). On this particular occasion, it seems Gage forgot to pack in the sand. So, when the iron rod struck gunpowder, the powder exploded, sending the 3-foot-long iron rod through his face, behind his left eye, and out the top of his head. Against all odds, Gage survived this incident with relatively few physical side effects. However, everyone around him noticed that his personality had changed—Gage became more impulsive, violent, and argumentative. Gage's physician, John Harlow, reported the details of this case in an 1868 article. The following passage is a great example of the rich detail that is often characteristic of case studies:
He is fitful, irreverent, indulging at times in the grossest profanity (which was
not previously his custom), manifesting but little deference for his fellows,
impatient of restraint or advice when it conflicts with his desires. A child in
his intellectual capacity and manifestations, he has the animal passions of a
strong man. Previous to his injury, although untrained in the schools, he possessed a well-balanced mind, and was looked upon by those who knew him as
a shrewd, smart businessman, very energetic and persistent in executing all his
plans of operation. In this regard his mind was radically changed, so decidedly
that his friends and acquaintances said he was “no longer Gage.” (Harlow,
1868, pp. 339–342)
Gage’s transformation ultimately inspired a large body of work in psychology and neuroscience that attempts to understand the connections between brain areas and personality. The area of his brain destroyed by the tamping rod is known as the frontal lobe,
now understood to play a critical role in impulse control, planning, and other high-level
thought processes. Gage’s story is a perfect illustration of the pros and cons of case studies: On the one hand, it is difficult to determine exactly how much the brain injury affected
his behavior because he is only one person. On the other hand, Gage’s tragedy inspired
researchers to think about the connections among mind, brain, and personality. As a result,
we now have a vast—and still growing—understanding of the brain. This illustrates a key
point about case studies: Although individual cases provide limited knowledge about
people in general, these cases often lead researchers to conduct additional work that does
lead to generalizable knowledge.
Qualitative versus Quantitative Approaches
Case studies tend to be qualitative more often than not: The goal of this method is to
study a particular case in depth, as a way to learn more about a rare phenomenon. In
both Pennebaker’s study of the Vietnam veteran and Harlow’s study of Phineas Gage,
the researcher approached the interview process as a way to gather information and learn
from the bottom up about the interviewee’s experience. However, it is certainly possible
for a case study to represent quantitative research. This is often the case when researchers conduct a series of case studies, learning from the first one or the first few and then
developing hypotheses to test on future cases. For example, a researcher could use the
case of Phineas Gage as a starting point for hypotheses about frontal lobe injury, perhaps
predicting that other cases would show similar changes in personality. Another way in
which case studies can add a quantitative element is for researchers to conduct analyses
within a single subject. For example, a researcher could study a patient with brain damage for several years following an injury, tracking the association between deterioration of brain regions and changes in personality and emotional responses. At the end of the
day, though, these examples would still suffer from the primary downside of case studies:
Because they study a single individual, it is difficult to generalize findings.
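To make the within-subject idea concrete, here is a hedged sketch of what such a single-patient analysis might look like, using invented yearly measurements and a hand-rolled Pearson correlation (the variable names and values are hypothetical, not from any actual case):

```python
import math

# Hypothetical within-subject time series for a single patient: one
# measurement per year in the years following an injury. All values
# are invented for illustration.
frontal_volume = [100.0, 96.0, 91.0, 85.0, 80.0]  # brain-region volume index
impulsivity = [10.0, 14.0, 15.0, 19.0, 22.0]      # personality-scale score

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

r = pearson_r(frontal_volume, impulsivity)
print(round(r, 3))  # strongly negative: as volume declines, impulsivity rises
```

Even a strong correlation computed this way remains an n-of-1 result, which is exactly the generalization problem described above.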
Research: Thinking Critically
By the Peninsula College of Medicine and Dentistry
Attending frequently with medically unexplained symptoms is distressing for both patient and doctor and effective treatment or management options are limited: one in five patients has symptoms
that remain unexplained by conventional medicine. Studies have shown that the cost to the NHS
[National Health Service] of managing the treatment of a patient with medically unexplained symptoms can be twice that of a patient with a diagnosis.
A research team from the Institute of Health Services Research, Peninsula Medical School, University
of Exeter, has carried out a randomised control trial and a linked interview study regarding 80 such
patients from GP [General Practitioner] practices across London, to investigate their experiences of
having five-element acupuncture added to their usual care. This is the first trial of traditional acupuncture for people with unexplained symptoms.
The results of the research are published in the British Journal of General Practice. They reveal that
acupuncture had a significant and sustained benefit for these patients and consequently acupuncture could be safely added to the therapies used by practitioners when treating frequently attending
patients with medically unexplained symptoms.
The patient group was made up of 80 adults, 80% female with an average age of 50 years and from a
variety of ethnic backgrounds who had consulted their GP at least eight times in the past year. Nearly
60% reported musculoskeletal health problems, of which almost two thirds had been present for a year.
In the 3 months before taking part in the study, the 80 patients had accounted for the following NHS
experiences: 21 inpatient days; 106 outpatient clinic visits; 52 hospital clinic visits (for treatments
such as physiotherapy, chiropody, and counselling); 44 hospital visits for investigations (including
10 magnetic resonance imaging—MRI—scans); and 75 visits to non–NHS practitioners such as opticians, dentists, and complementary therapists.
The patients were randomly divided into an acupuncture group and a control group. Eight acupuncturists administered individual five-element acupuncture to the acupuncture group immediately, up
to 12 sessions over 26 weeks. The same numbers of treatments were made available to the control
group after 26 weeks.
At 26 weeks the patients were asked to complete a number of questionnaires including the individualised health status questionnaire “Measure Yourself Medical Outcome Profile.”
The acupuncture group registered a significantly improved overall score when compared with the
control group. They also recorded improved well-being but did not show any change in GP and other
clinical visits and the number of medications they were taking. Between 26 and 52 weeks the acupuncture group maintained their improvement and the control group, now receiving their acupuncture treatments, showed a ‘catch up’ improvement.
The associated qualitative study, which focused on the patients’ experiences, supported the quantitative work.
This element identified that the participating patients had a variety of long-standing symptoms and
disability including chronic pain, fatigue, and emotional problems which affected their ability to
work, socialize, and carry out everyday tasks. A lack of a convincing diagnosis to explain their symptoms led to frustration, worry, and low mood.
Participating patients reported that their acupuncture consultations became increasingly valuable.
They appreciated the amount of time they had with each acupuncturist and the interactive and holistic nature of the sessions—there was a sense that the practitioners were listening to their concerns
and, via therapy, doing something positive about them.
As a result, many patients were encouraged to take an active role in their treatment, resulting in cognitive and behavioural lifestyle changes, such as a new self-awareness about what caused stress in
their lives, and a subsequent ability to deal with stress more effectively; and taking their own initiatives based on advice from the acupuncturists about diet, exercise, relaxation, and social activities.
Comments from participating patients included: “the energy is the main thing I have noticed. You
know, yeah, it’s marvellous! Where I was going out and cutting my grass, now I’m going out and cutting my neighbour’s after because he’s elderly”; “I had to reduce my medication. That’s the big help
actually, because medication was giving me more trouble . . . side effects”; and “It kind of boosts
you, somehow or another.”
Dr. Charlotte Paterson, who managed the randomised control trial and the longitudinal study of
patients’ experiences, commented: “Our research indicates that the addition of up to 12 five-element acupuncture consultations to the usual care experienced by the patients in the trial was feasible and acceptable and resulted in improved overall well-being that was sustained for up to a year.
This is the first trial to investigate the effectiveness of acupuncture treatment to those with unexplained symptoms, and the next development will be to carry out a cost-effectiveness study with a
longer follow-up period. While further studies are required, this particular study suggests that GPs
may recommend a series of five-element acupuncture consultations to patients with unexplained
symptoms as a safe and potentially effective intervention."
She added: “Such intervention could not only result in potential resource savings for the NHS, but
would also improve the quality of life for a group of patients for whom traditional biomedicine has
little in the way of effective diagnosis and treatment.”
Think about it:
1. In this study, researchers interviewed acupuncture patients using open-ended questions and
recorded their verbal responses, which is a common qualitative research technique. What
advantages does this approach have over administering a quantitative questionnaire with
multiple-choice items?
2. What are some advantages of adding a qualitative element to a controlled medical trial like this?
3. What would be some disadvantages of relying exclusively on this approach?
3.3 Archival Research
Moving slightly further along the continuum of control, we come to archival
research, which involves drawing conclusions by analyzing existing sources
of data, including both public and private records. Sociologist David Phillips
(1997) hypothesized that media coverage of suicides would lead to “copycat” suicides. He
tested this hypothesis by gathering archival data from two sources: front-page newspaper
articles devoted to high-profile suicides and the number of fatalities in the 11-day period
following coverage of the suicide. By examining these patterns of data, Phillips found
support for his hypothesis. Specifically, fatalities appeared to peak 3 days after coverage
of a suicide, and increased publicity was associated with a greater peak in fatalities.
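The logic of Phillips's analysis can be sketched as aligning a daily fatality count to each coverage date and averaging across stories at each day offset; the offset with the highest average is the "peak." The dates, counts, and baseline below are invented for illustration, not Phillips's actual data:

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical archival records: dates of front-page suicide stories, plus
# a daily fatality count with a flat baseline and invented spikes 3 days
# after each story.
coverage_dates = [date(1970, 3, 1), date(1970, 6, 10)]

daily_fatalities = defaultdict(lambda: 5)  # baseline: 5 fatalities per day
daily_fatalities[date(1970, 3, 4)] = 12
daily_fatalities[date(1970, 6, 13)] = 11

# For each story, collect the fatality count on days 0-10 after coverage,
# then average across stories at each day offset.
offset_totals = defaultdict(int)
for story_day in coverage_dates:
    for offset in range(11):
        offset_totals[offset] += daily_fatalities[story_day + timedelta(days=offset)]

mean_by_offset = {k: v / len(coverage_dates) for k, v in offset_totals.items()}
peak_offset = max(mean_by_offset, key=mean_by_offset.get)
print(peak_offset)  # the day offset with the highest average count
```

With these invented numbers the peak lands at day 3, mirroring the pattern Phillips reported; the point of the sketch is the alignment step, not the data.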
Pros and Cons of Archival Research
It is difficult to imagine a better way to test Phillips’s hypothesis about copycat suicides.
You could never randomly assign people to learn about suicides and then wait to see
whether they killed themselves. Nor could you interview people right before they commit
suicide to determine whether they were inspired by media coverage. Archival research
provides a test of the hypothesis by examining existing data and, thereby, avoids most of
the ethical and practical problems of other research designs. Related to this point, archival
research also neatly sidesteps issues of participant reactivity, or the tendency of people to behave differently when they are aware of being observed. Any time you conduct
research in a laboratory, participants are aware that they are in a research study and may
not behave in a completely natural manner. In contrast, archival data involves making
use of records of people’s natural behaviors. The
subjects of Phillips’s study of copycat suicides
were individuals who decided to kill themselves,
who had no awareness that they would be part of
a research study.
Archival research is also an excellent strategy
for examining trends and changes over time. For
example, much of the evidence for global warming
comes from observing upward trends in recorded
temperatures around the globe. To gather this evidence, researchers dig into existing archives of
weather patterns and conduct statistical tests on
the changes over time. Psychologists and other
social scientists also make use of this approach
to examine population-level changes in everything from suicide rates to voting patterns over
time. These comparisons can sometimes involve a
blend of archival and current data. For example, a
great deal of social psychology research has been
dedicated to understanding people's stereotypes about other groups. In a classic series of studies known as the "Princeton Trilogy," researchers documented the stereotypes held by Princeton students over a 25-year period (1933 to 1969). Social psychologist Stephanie Madon and her colleagues (2001) collected a new round of data but also conducted a new analysis of this archival data. These new analyses suggested that, over time, people have become more willing to use stereotypes about other groups, even as the stereotypes themselves have become less negative.

Copycat suicides often peak 3 days after media coverage of a high-profile suicide, such as when Nirvana's Kurt Cobain killed himself in 1994.
One final advantage of archival research is that once you manage to gain access to the
relevant archives, it requires relatively few resources. The typical laboratory experiment
involves one participant at a time, sometimes requiring the dedicated attention of more
than one research assistant over a period of an hour or more. But once you have assembled your data from the archives, it is a relatively simple matter to conduct statistical
analyses. In a 2001 article, the psychologists Shannon Stirman and James Pennebaker
used a text analysis computer program to compare the language of poets who committed suicide (e.g., Sylvia Plath) with the language of similar poets who had not committed suicide (e.g., Denise Levertov). In total, these researchers examined 300 poems from
20 poets, half of whom had committed suicide. Consistent with Durkheim’s theory of
suicide as a form of “social disengagement,” Stirman and Pennebaker (2001) found that
suicidal poets used more self-references and fewer references to other people in their
poems. But here’s the impressive part: Once they had assembled their archive of poems,
it took only seconds for their computer program to analyze the language and generate a
statistical profile of each poet.
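The underlying idea of this kind of word-count text analysis is simple enough to sketch in a few lines. The word lists and sample sentence below are illustrative only; this is a toy stand-in for, not a reproduction of, the program Stirman and Pennebaker used:

```python
# A toy version of word-count text analysis: tally first-person-singular
# words versus words that refer to other people. The word lists and the
# sample line of text are illustrative only.
SELF_WORDS = {"i", "me", "my", "mine", "myself"}
OTHER_WORDS = {"you", "your", "he", "she", "him", "her", "we", "us", "they", "them"}

def reference_profile(text):
    """Return the percentage of words that are self- vs. other-references."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    total = len(words)
    return {
        "self_pct": 100 * sum(w in SELF_WORDS for w in words) / total,
        "other_pct": 100 * sum(w in OTHER_WORDS for w in words) / total,
    }

profile = reference_profile("I wander alone; my shadow follows me through the dark.")
print(profile)  # {'self_pct': 30.0, 'other_pct': 0.0}
```

Once a profile like this can be computed for one text, running it over an entire archive of poems is, as the authors found, a matter of seconds.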
Overall, however, archival research is still relatively low on our continuum of control. As
a researcher, you have to accept the archival data in whatever form they exist, with no
control over the way they were collected. For instance, in Stephanie Madon’s (2001) reanalysis of the “Princeton Trilogy” data, she had to trust that the original researchers had
collected the data in a reasonable and unbiased way. In addition, because archival data
often represent natural behavior, it can be difficult to categorize and organize responses
in a meaningful and quantitative way. The upshot is that archival research often requires
some creativity on the researcher’s part—such as analyzing poetry using a text analysis
program. In many cases, as we discuss next, the process of analyzing archives involves
developing a coding strategy for extracting the most relevant information.
Content Analysis—Analyzing Archives
In most of our examples so far, the data come in a straightforward, ready-to-analyze form.
That is, it is relatively simple to count the number of suicides, track the average temperature, or compare responses to questionnaires about stereotyping over time. In other cases,
the data can come in a sloppy, disorganized mass of information. What do you do if you
want to analyze literature, media images, or changes in race relations on television? These
types of data can yield incredibly useful information, provided you can develop a strategy
for extracting it.
Mark Frank and Tom Gilovich—both psychologists at Cornell University—were interested in whether cultural associations with the color black would have an effect on behavior. In virtually all cultures, black is associated with evil—the bad guys wear black hats;
we have a “black day” when things turn sour; and we are excluded from social groups
by being blacklisted or blackballed. Frank and Gilovich (1988) wondered whether “a cue
as subtle as the color of a person’s clothing” (p. 74) would influence aggressive behavior.
To test this hypothesis, they examined aggressive behaviors in professional football and
hockey games, comparing teams whose uniforms were black to teams who wore other
colors. Imagine for a moment that this was your research study. Professional sporting
events contain a wealth of behaviors and events. How would you extract information on
the relationship between uniform color and aggressive behavior?
Frank and Gilovich (1988) solved this problem by examining public records of penalty
yards (football) and penalty minutes (hockey) because these represent instances of punishment for excessively aggressive behavior, as recognized by the referees. And in both
sports, the size of the penalty increases according to the degree of aggression. These penalty records were obtained from the central offices of both leagues, covering the period
from 1970 to 1986. Consistent with their hypothesis, teams with black uniforms were
“uncommonly aggressive” (p. 76). Most strikingly, two NHL hockey teams changed their
uniforms to black during the period under study and showed a marked increase in penalty minutes with the new uniforms!
But even this analysis is relatively straightforward in that it involved data that were
already in quantitative form (penalty yards and minutes). In many cases, the starting point
is a messy collection of human behavior. In a pair of journal articles, psychologist Russell
Weigel and colleagues (1980; 1995) examined the portrayal of race relations on prime-time
television. In order to do this, they had to make several critical decisions about what to
analyze and how to quantify it. The process of systematically extracting and analyzing the
contents of a collection of information is known as content analysis. In essence, content
analysis involves developing a plan to code and record specific behaviors and events in a
consistent way. We can break this down into a three-step process:
Step 1—Identify Relevant Archives
Before we develop our coding scheme, we have to start by finding the most appropriate
source of data. Sometimes the choice is fairly obvious: If you want to compare temperature
trends, the most relevant archives will be weather records. If you want to track changes in
stereotyping over time, the most relevant archive is questionnaire data assessing people’s
attitudes. In other cases, this decision involves careful consideration of both your research
question and practical concerns. Frank and Gilovich decided to study penalties in professional sports because these data were both readily available (from the central league
offices) and highly relevant to their hypothesis about aggression and uniform color.
Because these penalty records were publicly
available, the researchers were able to access them
easily. But if your research question involved sensitive or personal information—such as hospital
records or personal correspondence—you would
need to obtain permission from a responsible
party. Let’s say you wanted to analyze the love
letters written by soldiers serving overseas and
then try to predict relationship stability. Because
these letters would be personal, perhaps rather
intimate, you would need permission from each
person involved before proceeding with the
study. Or, say you wanted to analyze the correlation between the length of a person’s hospital stay
and the number of visitors he or she receives. This
would most likely require permission from
hospital administrators, doctors, and the patients
themselves. However you manage to obtain
access to private records, it is absolutely essential to protect the privacy and anonymity of the
people involved. This would mean, for example,
using pseudonyms and/or removing names and
other identifiers from published excerpts of personal letters.
Step 2—Sample from the Archives
In Weigel’s research on race relations, the most obvious choice of archives was to take
snippets of both television programming and commercials. But this decision was only
the first step of the process. Should they examine every second of every program ever
aired on television? Naturally not; instead, their approach was to take a smaller sample
of television programming. We will discuss sampling in more detail in Chapter 4, but the
basic process involves taking a smaller, representative collection of the broader population in order to conserve resources. Weigel and colleagues (1980) decided to sample one
week’s worth of prime-time programming from 1978, assembling videotapes of everything broadcast by the three major networks at the time (CBS, NBC, and ABC). They narrowed their sample by eliminating news, sports, and documentary programming because
their hypotheses were centered on portrayals of fictional characters of different races.
Step 3—Code and Analyze the Archives
The third and most involved step is to develop a system for coding and analyzing the
archival data. Even a sample of one week’s worth of prime-time programming contains
a near-infinite amount of information! In the race-relations studies, Weigel et al. elected
to code four key variables: (1) the total human appearance time, or time during which
people were on-screen; (2) the black appearance time, in which black characters appeared
on-screen; (3) the cross-racial appearance time, in which characters of two races were onscreen at the same time; and (4) the cross-racial interaction time, in which cross-racial
characters interacted. In the original (1980) paper, these authors reported that black characters were shown only 9% of the time, and cross-racial interactions only 2% of the time.
Fortunately, by the time of their 1995 follow-up study, the rate of black appearances had
doubled, and the rate of cross-racial interactions had more than tripled. However, there
was depressingly little change in some of the qualitative dimensions that they measured,
including the degree of emotional connection between characters of different races.
This study also highlights the variety of options for coding complex behaviors. The four
key ratings of “appearance time” consist of simply recording the amount of time that each
person or group is on-screen. In addition, the researchers assessed several abstract qualities of interaction using judges’ ratings. The degree of emotional connection, for instance,
was measured by having judges rate the “extent to which cross-racial interactions were
characterized by conditions promoting mutual respect and understanding” (Weigel et al.,
1980, p. 888). As you’ll remember from Chapter 2, any time you use judges’ ratings, it is
important to collect ratings from more than one rater and to make sure they agree in their evaluations.
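One common index of such agreement is Cohen's kappa, which compares the raters' observed agreement with the agreement expected by chance. The sketch below implements the standard two-rater formula; the ratings themselves are invented for illustration, not data from the Weigel studies.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Proportion of items on which the raters actually agreed.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Invented ratings of six TV interactions by two hypothetical judges.
a = ["pos", "pos", "neg", "pos", "neg", "neg"]
b = ["pos", "pos", "neg", "neg", "neg", "pos"]
print(round(cohens_kappa(a, b), 3))  # -> 0.333
```

Values near 1 indicate strong agreement; values near 0 indicate agreement no better than chance, a signal that the coding scheme needs clearer operational definitions.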
Your goal as a researcher is to find a systematic way to record the variables most relevant
to your hypothesis. As with any research design, the key is to start with clear operational
definitions that capture the variables of interest. This involves both deciding the most
appropriate variables and the best way to measure these variables. For example, if you
analyze written communication, you might decide to compare words, sentences, characters, or themes across the sample. A study of newspaper coverage might code the amount
of space or number of stories dedicated to a topic. Or a study of television news might
code the amount of airtime given to different positions. The best strategy in each case will
be the one that best represents the variables of interest.
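For instance, a coding sheet for a newspaper study like the one just described might be tallied as follows. The topic categories and column-inch figures here are invented for illustration; a real study would define them up front as part of the coding scheme.

```python
# Hypothetical coding sheet: each record is one coded unit from the archive.
stories = [
    {"topic": "crime",    "column_inches": 12},
    {"topic": "politics", "column_inches": 30},
    {"topic": "crime",    "column_inches": 8},
    {"topic": "weather",  "column_inches": 5},
]

def space_by_topic(records):
    """Total column inches and percentage of space devoted to each topic."""
    totals = {}
    for r in records:
        totals[r["topic"]] = totals.get(r["topic"], 0) + r["column_inches"]
    grand = sum(totals.values())
    return {topic: (inches, round(100 * inches / grand, 1))
            for topic, inches in totals.items()}

print(space_by_topic(stories))
```

The same tallying logic works for airtime, word counts, or appearance time; only the coded unit changes.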
Qualitative versus Quantitative Approaches
Archival research can represent either qualitative or quantitative research, depending on
the researcher’s approach to the archives. Most of our examples in this section represent the quantitative approach: Frank and Gilovich (1988) counted penalties to test their
hypothesis about aggression; and Stirman and Pennebaker (2001) counted words to test
their hypothesis about suicide. But the race-relations work by Weigel and colleagues (1980;
1995) represents a nice mix of qualitative and quantitative research. In their initial 1980
study, the primary goal was to document the portrayal of race relations on prime-time
television (i.e., qualitative). But in the 1995 follow-up study, the primary goal was to determine whether these portrayals had changed over a 15-year period. That is, they tested the
hypothesis that race relations were portrayed in a more positive light (i.e., quantitative).
Another way in which archival research can be qualitative is to study open-ended narratives, without attempting to impose structure upon them. This approach is commonly
used to study free-flowing text, such as personal correspondence or letters to the editor in a newspaper. A researcher approaching these from a qualitative perspective would
attempt to learn from these narratives, without attempting to impose structure via the use
of content analyses.
3.4 Observational Research
Moving further along the continuum of control, we come to the descriptive design
with the greatest amount of researcher control. Observational research involves
studies that directly observe behavior and record these observations in an objective and systematic way. In your previous psychology courses, you may have encountered
the concept of attachment theory, which argues that an infant’s bond with his or her primary
caregiver has implications for later social and emotional development. Mary Ainsworth, a
Canadian developmental psychologist, and John Bowlby, a British psychologist and psychiatrist, articulated this theory in the early 1960s,
arguing that children can form either “secure” or a
variety of “insecure” attachments with their caregivers (Ainsworth & Bell, 1970; Bowlby, 1963).
In order to assess these classifications, Ainsworth
and Bell (1970) developed an observational technique called the “strange situation.” Mothers
would arrive at their laboratory with their children for a series of structured interactions, including having the mother play with the infant, leave
him alone with a stranger, and then return to the
room after a brief absence. The researchers were
most interested in coding the ways in which the
infant responded to the various episodes (8, in
total). One group of infants, for example, showed
curiosity when the mother left but then returned
to playing with their toys, trusting that she
would return. Another group showed immediate
distress when the mother left and clung to her nervously upon her return. Based on these
and other behavioral observations, Ainsworth and colleagues classified these groups of
infants as “securely” and “insecurely” attached to their mothers, respectively.
Research: Making an Impact
Harry Harlow
In the 1950s, U.S. psychologist Harry Harlow conducted a landmark series of studies with rhesus
monkeys on the mother–infant bond. While his research would be considered unethical by contemporary standards, the results of his work revealed the importance of affection, attachment, and love
on healthy childhood development.
Prior to Harlow’s findings, it was believed that infants attached to their mothers as part of a drive
to fulfill exclusively biological needs, in this case obtaining food and water and avoiding pain (Herman, 2007; van der Horst & van der Veer, 2008). In an effort to clarify the reasons that infants so
clearly need maternal care, Harlow removed rhesus monkeys from their natural mothers several
hours after birth, giving the young monkeys a choice between two surrogate “mothers.” Both mothers were made of wire, but one was bare and one was covered in terry cloth. Although the wire
mother provided food via an attached bottle, the monkeys preferred the softer, terry-cloth mother,
even though the latter provided no food (Harlow & Zimmerman, 1958; Herman, 2007).
Further research with the terry-cloth mothers contributed to the understanding of healthy attachment and childhood development (van der Horst & van der Veer, 2008). When the young monkeys
were given the option to explore a room with their terry-cloth mothers and had the cloth mothers in
the room with them, they used the mothers as a safe base. Similarly, when exposed to novel stimuli
such as a loud noise, the monkeys would seek comfort from the cloth-covered surrogate (Harlow &
Zimmerman, 1958). However, when the monkeys were left in the room without their cloth mothers,
they reacted poorly—freezing up, crouching, crying, and screaming.
A control group of monkeys who were never exposed to either their real mothers or one of the surrogates revealed stunted forms of attachment and affection. They were left incapable of forming
lasting emotional attachments with other monkeys (Herman, 2007). Based on this research, Harlow
discovered the importance of proper emotional attachment, stressing the importance of physical
and emotional bonding between infants and mothers (Harlow & Zimmerman, 1958; Herman, 2007).
Harlow’s influential research led to improved understanding of maternal bonding and child development (Herman, 2007). His research paved the way for improvements in infant and child care and in
helping children cope with separation from their mothers (Bretherton, 1992; Du Plessis, 2009). In
addition, Harlow’s work contributed to the improved treatment of children in orphanages, hospitals,
day care centers, and schools (Herman, 2007; van der Horst & van der Veer, 2008).
Pros and Cons of Observational Research
Observational designs are well suited to a wide range of research questions, provided
the questions can be addressed through directly observable behaviors and events; that is,
you can observe parent–child interactions, or nonverbal cues to emotion, or even crowd
behavior. However, if you are interested in studying thought processes—such as how
mothers interpret their interactions—then observation will not suffice. This harkens back
to our discussion of behavioral measures in Chapter 2: In exchange for giving up access to
internal processes, you gain access to unfiltered behavioral responses.
To capture these unfiltered behaviors, it is vital for the researcher to be as unobtrusive as
possible. As we have already discussed, people have a tendency to change their behavior
when they are being observed. In the bullying study by Craig and Pepler (1997) discussed
at the beginning of this chapter, the researchers used video cameras to record children’s
behavior unobtrusively; otherwise, the occurrence of bullying might have been artificially low. If you conduct an observational study in a laboratory setting, there is no way
to hide the fact that people are being observed, but the use of one-way mirrors and video
recordings can help people to become comfortable with the setting (versus having an
experimenter staring at them across the table). If you conduct an observational study
out in the real world, there are even more possibilities for blending into the background,
including using observers who are literally hidden. For example, let’s say you hypothesize that people are more likely to pick up garbage when the weather is nicer. Rather
than station an observer with a clipboard by the trash can, you could place someone out
of sight standing behind a tree or perhaps sitting on a park bench pretending to read a
magazine. In both cases, people would be less conscious of being observed and therefore
more likely to behave naturally.
One extremely clever strategy for blending in comes from a study by the social psychologist Muzafer Sherif, involving observations of cooperative and competitive behaviors
among boys at a summer camp (1954). You can imagine that it was particularly important to make observations in this context without the boys realizing they were part of a
research study. Sherif took on the role of camp janitor, allowing him to be a presence in
nearly all of the camp activities. The boys never paid enough attention to the “janitor” to
realize his omnipresence—or his discreet note taking. The brilliance of this idea is that it
takes advantage of the fact that people tend to blend into the background once we become
used to their presence.
Types of Observational Research
There are several variations on observational research, according to the amount of control
that a researcher has over the data collection process.
Structured observation involves creating a standard situation in a controlled setting and
then observing participants’ responses to a predetermined set of events. The “strange situation” studies of attachment (discussed above) are a good example of structured observation—mothers and infants are subjected to a series of eight structured episodes, and
researchers systematically observe and record the infants’ reactions. Even though these
types of studies are conducted in a laboratory, they differ from experimental studies in an
important way: Rather than systematically manipulate a variable to make comparisons,
researchers present the same set of conditions to all participants.
Another example of structured observation comes from the research of John Gottman, a
psychologist at the University of Washington. For nearly three decades, Gottman and his
colleagues have conducted research on the interaction styles of married couples. Couples
who take part in this research are invited for a 3-hour session in a laboratory that closely
resembles a living room. Gottman’s goal is to make couples feel reasonably comfortable
and natural in the setting, in order to get them
talking as they might do at home. After allowing
them to settle in, Gottman adds the structured element by asking the couple to discuss an “ongoing
issue or problem” in their marriage. The researchers then sit back to watch the sparks fly, recording everything from verbal and nonverbal communication to measures of heart rate and blood
pressure. Gottman has observed and tracked so
many couples over the decades that he is able to
predict, with remarkable accuracy, which couples
will divorce in the 18 months following the lab
visit (Gottman & Levenson, 1992).
Naturalistic observation involves observing and
systematically recording behavior out in the real
world. This can be done in two broad ways—with or without intervention on the part of
the researcher. Naturalistic studies that involve researcher intervention consist of manipulating some aspect of the environment and then observing responses. For example, you
might leave a shopping cart just a few feet away from the cart return area and measure
whether people move the cart. (Given the number of carts that are abandoned just inches
away from their proper destination, someone must be doing this research all the time. . . ).
In another example you may remember from Chapter 1 (in our discussion of ethical dilemmas), Harari et al. (1995) used this approach to study whether people would help in emergency situations. In brief, these researchers staged what appeared to be an attempted rape
in a public park and then observed whether groups or individual males were more likely
to rush to the victim’s aid.
The ABC network has developed a hit reality show that illustrates this type of research. The
show “What Would You Do?” sets up provocative settings in public and videotapes people’s
reactions; full episodes are available online.
If you were an unwitting participant in one of these episodes, you might see a customer
stealing tips from a restaurant table or a son berating his father for being gay or a man
proposing to his girlfriend who minutes earlier had been kissing another man at the bar.
Of course, these observation “studies” are more interested in shock value than data collection (or IRB approval; see Chapter 1), but the overall approach can be a useful strategy to
assess people’s reactions to various situations. In fact, some of the scenarios on the show
are based on classic studies in social psychology, such as the well-documented phenomenon that people are reluctant to take responsibility for helping in emergencies.
Alternatively, naturalistic studies can involve simply recording ongoing behavior without
any attempt by the researchers to intervene or influence the situation. In these cases, the
goal is to observe and record behavior in a completely natural setting. For example, you
might station yourself at a liquor store and observe the numbers of men and women who
buy beer versus wine. Or, you might observe the numbers of people who give money to
the Salvation Army bell ringers during the holiday season. You can use this approach to
make comparisons of different conditions, provided the differences occur naturally. That
is, you could observe whether people donate more money to the Salvation Army on sunny
or snowy days or compare donation rates when the bell ringers are different genders or
races. Do people give more money when the bell ringer is an attractive female? Or do they
give more to someone who looks more needy? These are all research questions that could
be addressed using a well-designed naturalistic observation study.
Participant observation involves having the researcher(s) conduct observations while
engaging in the same activities as the participants. The goal is to interact with these participants in order to gain better access and insight into their behaviors. In one famous
example, the psychologist David Rosenhan (1973) was interested in the experience of people hospitalized for mental illness. To study these experiences, he had eight perfectly sane
people gain admission to different mental hospitals. These fake patients were instructed
to give accurate life histories to a doctor except for lying about one diagnostic symptom;
they all supposedly heard voices occasionally, a symptom of schizophrenia.
Once admitted, these “patients” behaved in a
normal and cooperative manner, with instructions to convince hospital staff that they were
healthy enough to be released. In the meantime,
they observed life in the hospital and took notes
on their experiences—a behavior that many doctors interpreted as “paranoid note taking.” The
main finding of this study was that hospital staff
tended to see all patient behaviors through the
lens of their initial diagnoses. Despite immediately acting “normally,” these fake patients were
hospitalized an average of 19 days (with a range
from 7 to 52!) before being released. And all but
one was given a diagnosis of “schizophrenia in
remission” upon release. The other striking finding was that treatment was generally depersonalized, with staff spending little time with individual patients.
In another great example of participant observation, Festinger, Riecken, and Schachter (1956)
decided to join a doomsday cult to test their new
theory of cognitive dissonance. Briefly, this theory argues that people are motivated to maintain
a sense of consistency among their various thoughts and behaviors. So, for example,
if you find yourself smoking a cigarette despite being aware of the health risks, you
might rationalize your smoking by convincing yourself that lung cancer risk is really just
genetic. In this case, Festinger and colleagues stumbled upon the case of a woman named
Mrs. Keech, who was predicting the end of the world, via alien invasion, at 11 p.m. on
a specific date 6 months in the future. What would happen, they wondered, when this
prophecy failed to come true?
To answer this question, the researchers pretended to be new converts and joined the
cult, living among the members and observing them as they made their preparations for
doomsday. Sure enough, the day came, and 11 p.m. came and went without the world
ending. Mrs. Keech first declared that she had forgotten to account for the time zone difference, but as sunrise started to approach, the group members became restless. Finally,
after a short absence to communicate with the aliens, Mrs. Keech returned with some good
news: The aliens were so impressed with the devotion of the group that they decided to
postpone their invasion! The group members rejoiced, rallying around this brilliant piece
of rationalizing, and quickly began a new campaign to recruit new members.
As you can see from these examples, participant observation can provide access to amazing and one-of-a-kind data, including insights into group members’ thoughts and feelings. It also provides access to groups that might be reluctant to allow outside observers. However, this approach has two clear disadvantages compared with other types of observation.
The first problem is ethical; data are collected from individuals who do not have the opportunity to give informed consent. Indeed, the whole point of the technique is to observe
people without their knowledge. In order for an IRB to approve this kind of study, there
has to be an extremely compelling reason to ignore informed consent, as well as extremely
rigorous measures to protect identities. The second problem is methodological; there is
ample opportunity for the objectivity of observations to be compromised by the close
contact between researcher and participant. Because the researcher is a part of the group,
he or she can change the dynamics in subtle ways, possibly leading the group to confirm
his or her hypothesis. In addition, the group can shape the researcher’s interpretations in
subtle ways, leading him or her to miss important details.
Steps in Observational Research
One of the major strengths of observational research is that it has a high degree of ecological validity; that is, the research can be conducted in situations that closely resemble the
real world. Think of our examples so far—married couples observed in a living roomlike laboratory; doomsday cults observed from within; bullying behaviors on the school
playground. In every case, people’s behaviors are observed in the natural environment or
something very close to it. But this ecological validity comes at a price; the real world is a
jumble of information, some relevant, some not so much. The challenge for the researcher,
then, is to decide on a system for sorting out the signal from the noise that provides the
best test of her hypothesis. In this section, we discuss a three-step process for conducting observational research. The key thing you should note right away is that most of this
process involves making decisions ahead of time so that the process of data collection is
smooth, simple, and systematic.
Step 1—Develop a Hypothesis
For research to be systematic, it is important to impose structure by having a clear research
question and hypothesis. We have covered hypotheses in detail in other chapters, but the
main points bear repeating: Your hypothesis must be testable and falsifiable, meaning that
it must be framed in such a way that it can be addressed through empirical data and might
be disconfirmed by these data. In our example involving Salvation Army donations, we
predicted that people might donate more money to an attractive bell ringer. This could
easily be tested empirically and could just as easily be disconfirmed by the right set of
data—say, if attractive bell ringers brought in the fewest donations.
This particular example also highlights an additional important feature of observational
hypotheses; namely, they have to be observable. Because observational studies are based
on observations of behaviors, our hypotheses have to be centered on behavioral measures.
That is, we can safely make predictions about the amount of money people will donate
because this can be directly observed. But we are unable to make predictions in this context about the reasons for donations. There would be no way to observe, say, that people
donate more to attractive bell ringers because they were trying to impress them. In sum,
one limitation of observing behavior in the real world is that we are unable to delve into
the cognitive and motivational reasons behind the behaviors.
Step 2—Decide What and How to Sample
Once you have developed a hypothesis that is testable, falsifiable, and observable, the
next step is to decide what kind of information to gather from the environment to test this
hypothesis. The simple fact is that the world is too complex to sample everything. Imagine
that you wanted to observe the dinner rush at a restaurant. There is a nearly infinite list
of possibilities to observe: What time does the restaurant get crowded? How many times
do people send their food back to the kitchen? What are the most popular dishes? How
often do people get in arguments with the wait staff? To simplify the process of observing
behavior, you will need to take samples, or small snippets of the environment that are
relevant to your hypothesis. That is, rather than observing “dinner at the restaurant,” the
goal is to narrow your focus to something like “the number of people waiting in line for a
table at 6 p.m. versus 9 p.m.”
The choice of what and how to sample will ultimately depend on the best fit for your
hypothesis. In the context of observational research, there are three strategies for sampling
behaviors and events. The first strategy, time sampling, involves comparing behaviors
during different time intervals. For example, to test the hypothesis that football teams
make more mistakes when they start to get tired, you could count the number of penalties
in the first 5 and the last 5 minutes of the game. These data would allow us to compare mistakes at one time interval with mistakes at another time interval. In the case of Festinger's
study of a doomsday cult, time sampling was used to compare how the group members
behaved before and after their prophecy failed to come true.
Steve Mason/Photodisc/Thinkstock
The dinner scene at a busy restaurant offers a wide variety of behaviors to sample.
The second strategy, individual sampling,
involves collecting data by observing one person
at a time in order to test hypotheses about individual behaviors. Many of the examples we have
already discussed involve individual sampling:
Ainsworth and colleagues tested their hypotheses
about attachment behaviors by observing individual infants, while Gottman tests his hypotheses about romantic relationships by observing
one married couple at a time. These types of data
allow us to examine behavior at the individual
level and test hypotheses about the kinds of
things people do—from the way they argue with
their spouses to whether they wear team colors to
a football game.
Section 3.4 Observational Research
The third strategy, event sampling, involves observing and recording behaviors that occur
throughout an event. For example, you could track the number of fights that break out during
an event such as a football game or the number of times people leave the restaurant without
paying the check. This strategy allows for testing hypotheses about the types of behaviors
that occur in a particular environment or setting. For example, you might compare the number of fights that break out in a professional football versus a professional hockey game. Or,
the next time you host a party, you could count the number of wine bottles versus beer bottles
that end up in your recycling bin. The distinguishing feature of this strategy is that you focus
on occurrence of behaviors more than on the individuals performing these behaviors.
Step 3—Record and Code Behavior
Now that you have formulated a hypothesis and decided on the best sampling strategy,
there is one final and critical step before you begin data collection. Namely, you have
to develop good operational definitions of your variables by translating the underlying
concepts into measurable variables. Gottman’s research turns the concept of marital interactions into a range of measurable variables like the number of dismissive comments
and passive-aggressive sighing—all things that can be observed and counted objectively.
Rosenhan’s study involving fake schizophrenic patients turned the concept of how staff
treat patients into measurable variables such as the amount of time staff members spent
with each patient—again, something very straightforward to observe.
It is vital to decide up front what kinds and categories of behavior you will be observing
and recording. In the last section, we narrowed down our observation of dinner at the
restaurant to the number of people in line at 6 p.m. versus the number of people in line at
9 p.m. But how can we be sure we get an accurate count? What if two people are waiting
by the door while the other two members of the group are sitting at the bar? Are those
at the bar waiting for a table or simply having drinks? One possibility might be to count
the number of individuals who walk through the door in different time periods, although
our count could be inflated by those who give up on waiting or who only enter to ask for
directions to another place.
In short, observing behavior in the real world can be messy. The best way to deal with this
mess is to develop a clear and consistent categorization scheme, and stick with it. That
is, in testing your hypothesis about the most crowded time at the restaurant, you would
choose one method of counting people and use it for the duration of the study. In part,
this choice is a judgment call, but your judgment should be informed by three criteria.
First, you should consider practical issues, such as whether your categories can be directly
observed. You can observe the number of people who leave the restaurant, but you cannot observe whether they got impatient. Second, you should consider theoretical issues,
such as how well your categories represent the underlying theory. Why did you decide to
study the most crowded time at the restaurant? Perhaps this particular restaurant is in a
new, up-and-coming neighborhood and you expect the restaurant to get crowded over the
course of the evening. It would also lead you to include people sitting both at tables and
at the bar—because this crowd may come to the restaurant with the sole intention of staying at the bar. Finally, you should consider previous research in choosing your categories.
Have other researchers studied dining patterns in restaurants? What kinds of behaviors
did they observe? If these categories make sense for your project, you should feel free to
re-use them—no need to reinvent the wheel!
Last, but not least, you should take a step back and evaluate both the validity and the reliability of your coding system. (See Chapter 2 for a review of these terms.) Validity in this
case means making sure the categories we observe do a good job of capturing the underlying variables in our hypothesis (i.e., construct validity; see Chapter 2). For example, in
Gottman’s studies of marital interactions, some of the most important variables are the
emotions expressed by both partners. One way to observe emotions would be to count
the number of times a person smiles. However, we would have to think carefully about
the validity of this measure because smiling could indicate either genuine happiness or
condescension. As a general rule, the better our operational definitions, the more valid
our measures will be (Chapter 2).
Reliability in the context of observation means making sure our data are collected in a
consistent way. If your research involves more than one observer using the same system,
their data should look roughly the same (i.e., interrater reliability). This is accomplished
in part by making the task simple and straightforward—for example, you can have
trained assistants use a checklist to record behaviors rather than depend on open-ended
notes. The other key to improving reliability is through careful training of the observers,
giving them detailed instructions and ample opportunities to practice the rating system.
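If your observers record their checklist codes electronically, interrater reliability can be checked with a few lines of code. The sketch below is purely illustrative and uses made-up observer data (the chapter itself does not prescribe software); it computes simple percent agreement, the most basic index of interrater reliability. Published research often uses more refined indices, such as Cohen's kappa, that correct for chance agreement.

```python
# Hypothetical checklist codes from two trained observers rating the
# same 10 behaviors (1 = behavior present, 0 = behavior absent).
observer_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
observer_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

def percent_agreement(a, b):
    """Proportion of observations on which both coders gave the same code."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

print(percent_agreement(observer_a, observer_b))  # 8 of 10 codes match -> 0.8
```

An agreement rate this low would suggest the coders need more training or a simpler category scheme before real data collection begins.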
Observation Examples
To give you a sense of how all of this comes together, let’s walk through a pair of examples, from research question to data collection.
Example 1—Theater Restroom Usage
First, imagine, for the sake of this example, that you are interested in whether people are
more likely to use the restroom before or after watching a movie. This research question
could provide valuable information for theater owners in planning employee schedules
(i.e., when are bathrooms most likely to need cleaning). Thus, by studying patterns of
human behavior, we could gain valuable applied knowledge.
The first step is to develop a specific, testable, and observable hypothesis. In this case, we
might predict that people are more likely to use the restroom after the movie, as a result
of consuming those 64-ounce sodas during the movie. And, just for fun, let’s also compare
the restroom usage of men and women. Perhaps men are more likely to wait until after
the movie, whereas women are as likely to go before as after? This pattern of data might
look something like the percentages in Table 3.1. That is, men make 80% of their restroom
visits after the movie and 20% before the movie, while women make about 50% of their
restroom visits at each time.
Table 3.1: Hypothesized Data from Observation Exercise

                 Men     Women
Before movie     20%     50%
After movie      80%     50%
The next step is to decide on the best sampling strategy to test this hypothesis. Of the three
sampling strategies we discussed—individual, event, and time—which one seems most
relevant here? The best option would probably be time sampling because our hypothesis
involves comparing the number of restroom visitors in two time periods (before versus
after the movie). So, in this case, we would need to define a time interval for collecting data.
One option would be to limit our observations to the 10 minutes before the previews begin
and the 10 minutes after the credits end. The potential problem here, of course, is that some
people might use either the previews or the end credits as a chance to use the restroom.
Another complication arises in trying to determine which movie people are watching; in
a giant multiplex theater, movies start just as others are finishing. One possible solution,
then, would be to narrow our sample to movie theaters that show only one movie at a time
and to define the sampling times based on the actual movie start and end times.
Once we decide on a sampling strategy, the next step is to decide on the types of behaviors
we want to record. This particular hypothesis poses a challenge because it deals with
a rather private behavior. In order to faithfully record people “using the restroom,” we
would need to station researchers in both men’s and women’s restrooms to verify that
people actually, well, “use” the restroom while they are in there. However, this strategy
comes with the potential downside that your presence (standing in the corner of the restroom) will affect people’s behavior. Another, less intrusive option would be to stand outside the restroom and simply count “the number of people who enter.” The downside
here, of course, is that we don’t technically know why people are going into the restroom.
But sometimes research involves making these sorts of compromises—in this case, we
chose to sacrifice a bit of precision in favor of a less intrusive measurement.
So, in sum, we started with the hypothesis that men are more likely to use the restroom
after a movie, while women use the restroom equally before and after. We then decided
that the best sampling strategy would be to identify a movie theater showing only one
movie and to sample from the 10-minute periods before and after the actual movie’s running time. Finally, we decided that the best strategy for recording behavior would be to
station observers outside the restrooms and count the number of people who enter. Now,
let’s say we conduct these observations every evening for one week and collect the data
in Table 3.2.
Table 3.2: Findings from Observation Exercise

                 Men            Women
Before movie     75 (25%)       300 (60%)
After movie      225 (75%)      200 (40%)
Total            300 (100%)     500 (100%)
You can see that more women (N = 500) than men (N = 300) attended the movie theater
during our week of sampling. But the real test of our hypothesis comes from examining
the percentages within gender groups. That is, of the 300 men who went into the restroom,
what percentage of them did so before the movie and what percentage of them did so
after the movie? In this dataset, women used the restroom with relatively equal frequency
before (60%) and after (40%) the movie. Men, in contrast, were three times as likely to
use the restroom after (75%) than before (25%) the movie. In other words, our hypothesis
appears to be confirmed by examining these percentages.
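The within-gender percentages in Table 3.2 fall out of simple arithmetic: each cell count is divided by its gender's total. If you were tallying with software, a short script (Python here, purely as an illustration; the dictionary layout is our own) might look like this:

```python
# Restroom entries observed during the week, organized by gender and
# time of visit (counts from Table 3.2).
counts = {
    "men":   {"before": 75,  "after": 225},
    "women": {"before": 300, "after": 200},
}

def within_group_percentages(group_counts):
    """Convert raw counts to percentages within one group (column)."""
    total = sum(group_counts.values())
    return {time: 100 * n / total for time, n in group_counts.items()}

print(within_group_percentages(counts["men"]))    # {'before': 25.0, 'after': 75.0}
print(within_group_percentages(counts["women"]))  # {'before': 60.0, 'after': 40.0}
```

Dividing within each gender, rather than across the whole sample, is what lets us compare men and women fairly even though more women attended the theater.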
Example 2—Cell Phone Usage While Driving
Imagine for this example that you are interested in patterns of cell phone usage among
drivers. Several recent studies have reported that drivers using cell phones are as impaired
as drunk drivers, making this an important public safety issue. Thus, if we could understand the contexts in which people are most likely to use cell phones, this would provide
valuable information for developing guidelines for safe and legal use of these devices. So,
in this study, we might count the number of drivers using cell phones in two settings: in
rush-hour traffic and moving on the freeway.
The first step is to develop a specific, testable, and observable hypothesis. In this case, we
might predict that people are more likely to use cell phones when they are bored in the car.
So, we hypothesize that we will see more drivers using cell phones while stuck in rush-hour traffic than while moving on the freeway.
The next step is to decide on the best sampling strategy to test this hypothesis. Of the
three sampling strategies we discussed—individual, event, and time—which one seems
most relevant here? The best option would probably be individual sampling because we
are interested in the cell phone usage of individual drivers. That is, for each individual car
we see during the observation period, we want to know whether the driver is using a cell
phone. One strategy for collecting these observations would be to station observers along
a fast-moving stretch of freeway, as well as along a stretch of road that is clogged during
rush hour. These observers would keep a record of each passing car, noting whether the
driver was on the phone.
Once we decide on a sampling strategy, our next step is to decide on the types of behaviors we want to record. One challenge in this study is in deciding how broadly to define
the category of cell phone usage. Would we include both talking and text messaging?
Given our interest in distraction and public safety, we probably would want to include
text messaging. Several states have recently banned text messaging while driving, in
response to tragic accidents. Because we will be observing moving vehicles, the most reliable approach might be to simply note whether each driver had a cell phone in his or her
hand. As with our restroom study, we are sacrificing a little bit of precision (i.e., we don’t
know what the cell phone is being used for) to capture behaviors that are easier to record.
So, in sum, we started with the hypothesis that drivers would be more likely to use cell
phones when stuck in traffic. We then decided that the best sampling strategy would be
to station observers along two stretches of road, and they should note whether drivers
were using cell phones. Finally, we decided that the best compromise for observing cell
phone usage would be to note whether each driver was holding a cell phone. Now, let’s
say we conduct these observations over a 24-hour period and collect the data shown in
Table 3.3.
Table 3.3: Findings from Observation Exercise #2

                  Rush Hour      Freeway
Cell Phone        30 (30%)       200 (67%)
No Cell Phone     70 (70%)       100 (33%)
Total             100 (100%)     300 (100%)
You can see that more cars passed by during the non-rush-hour stretch (N = 300) than during the rush-hour stretch (N = 100). But the real test of our hypothesis comes from
examining the percentages within each stretch. That is, of the 100 people observed during
rush hour and the 300 observed not during rush hour, what percentage were using cell
phones? In this data set, 30% of those in rush hour were using cell phones, compared with
67% of those not during rush hour using cell phones. In other words, our hypothesis was
not confirmed by the data. Drivers in rush hour were less than half as likely to be using
cell phones. The next step in our research program would be to speculate on the reasons
why the data contradicted our hypothesis.
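Because this study uses individual sampling, each passing car produces one record. The sketch below (an illustration of ours, not part of the original study; the record format is hypothetical) shows how the Table 3.3 rates fall out of a simple tally of per-car observations:

```python
from collections import Counter

# Each record is (setting, on_phone). Rather than list all 400 individual
# observations, we rebuild the totals reported in Table 3.3.
records = ([("rush_hour", True)] * 30 + [("rush_hour", False)] * 70 +
           [("freeway", True)] * 200 + [("freeway", False)] * 100)

tally = Counter(records)

def usage_rate(setting):
    """Percentage of observed drivers in a setting holding a cell phone."""
    on = tally[(setting, True)]
    off = tally[(setting, False)]
    return 100 * on / (on + off)

print(round(usage_rate("rush_hour")))  # 30
print(round(usage_rate("freeway")))    # 67
```

Comparing rates within each setting, rather than raw counts, is what reveals that the hypothesis was contradicted despite more total phone users appearing on the freeway.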
Qualitative versus Quantitative Approaches
The general method of observation lends itself equally well to qualitative and quantitative approaches, although some types of observation fit one approach better than the
other. For example, structured observation tends to be focused on hypothesis testing and
quantification of responses. In Mary Ainsworth’s “strange situation” research (described
above), the primary goal was to expose children to a predetermined script of events
and to test hypotheses about how children with secure and insecure attachments would
respond to these events. In contrast, naturalistic observation—and, to a greater extent,
participant observation—tends to be focused on learning from events as they occur naturally. In Leon Festinger’s “doomsday cult” study, the researchers joined the group in
order to observe the ways members reacted when their prophecy failed to come true.
Research: Thinking Critically
The Irritable Heart
By K. Kris Hirst
Using open source data from a federal project digitizing medical records of veterans of the American
Civil War (1861–1865) called the Early Indicators of Later Work Levels, Disease, and Death Project,
researchers have identified an increased risk of post-war illness among Civil War veterans, including
cardiac, gastrointestinal, and mental diseases throughout their lives. In a project partly funded by the
National Institutes of Aging, military service files from a total of 15,027 servicemen from 303 companies of the Union Army stored at the United States National Archives were matched to pension files
and surgeon’s reports of multiple health examinations. A total of 43% of the men had mental health
problems throughout their lives, some of which are today recognized as related to post-traumatic
stress disorder (PTSD). Most particularly affected were men who enlisted at ages under 17. Roxane
Cohen Silver and colleagues at the University of California, Irvine, published their results in the February 2006 issue of Archives of General Psychiatry.
Studies of PTSD to date have connected war experiences to the recurrence of mental health problems and physical health problems such as cardiovascular disease, hypertension, and gastrointestinal disorders. These studies have not had access to long-term health impacts, since they have
been focused on veterans of recent conflicts. Researchers studying the impact of modern conflict
participation report that the factors increasing risk of later health issues include age at enlistment,
intimate exposure to violence, prisoner of war status and having been wounded.
The Trauma of the American Civil War
The Civil War was a particularly traumatic conflict for American soldiers. Army soldiers commonly
enlisted at quite young ages; between 15% and 20% of the Union army soldiers enlisted between
ages of 9 and 17. Each of the Union companies was made up of 100 men assembled from regional
neighborhoods, and thus often included family members and friends. Large company losses—75%
of companies in this sample lost between 5% and 30% of their personnel—nearly always meant the
loss of family or friends. The men readily identified with the enemy, who in some cases represented
family members or acquaintances. Finally, close-quarter conflict, including hand-to-hand combat
without trenches or other barriers, was a common field tactic during the Civil War.
To quantify trauma experienced by Civil War soldiers, researchers used a variable derived from percentage of company lost to represent relative exposure to trauma. Researchers found that in military
companies with a larger percentage of soldiers killed, the veterans were 51% more likely to have
cardiac, gastrointestinal, and nervous disease.
The Youngest Soldiers Were Hardest Hit
The study found that the youngest soldiers (ages 9 to 17 years at enlistment) were 93% more likely
than the oldest (ages 31 and older) to experience both mental and physical disease. The younger
soldiers were also more likely to show signs of cardiovascular disease alone and in conjunction with
gastrointestinal conditions, and they were more likely to die early. Former POWs had an increased
risk of combined mental and physical problems as well as early death.
One problem the researchers grappled with was comparing diseases as they were recorded during
the latter half of the 19th century to today's recognized diseases. Post-traumatic stress disorder was not recognized by doctors of the era, although they did recognize that veterans exhibited an extreme level of "nervous disease" that they labeled "irritable heart" syndrome.
Children and Adolescents in Combat
Harvard psychologist Roger Pitman, writing in an accompanying editorial, notes that the impact
on younger soldiers should be of immediate concern, since “their immature nervous systems and
diminished capacity to regulate emotion give even greater reason to shudder at the thought of
children and adolescents serving in combat.” Although disease identification is not one-to-one, said
senior researcher Roxane Cohen Silver, “I’ve been studying how people cope with traumatic life
experiences of all kinds for 20 years and these findings are quite consistent with an increasing body
of literature on the physical and mental health consequences of traumatic experiences.”
Boston University psychologist Terence M. Keane, Director of the National Center for PTSD, commented that this “remarkably creative study is timely and extremely valuable to our understanding
of the long-term effects of combat experiences.” Joseph Boscarino, Senior Investigator at Geisinger
Health System, added “There are a few detractors that say that PTSD does not exist or has been
exaggerated. Studies such as these are making it difficult to ignore the long-term effects of war-related psychological trauma."
Think about it
1. What hypotheses are the researchers testing in this study?
2. How did the researchers quantify trauma experienced by Civil War soldiers? Do you think this
is a valid way to operationalize trauma? Explain why or why not.
3. Would this research be best described as a case study, archival research, or naturalistic observation? Are there elements of more than one type? Explain.
3.5 Describing Your Data
Before we move on from descriptive research designs, this last section covers the
process of presenting descriptive data in both graphical and numeric form. No matter how you present your data, a good description is one that is accurate, concise,
and easy to understand. In other words, you have to represent the data accurately and in
the most efficient way possible, so that your audience can understand it. Another, more
eloquent way to think of these principles is to take the advice of Edward Tufte, a statistician and expert in the display of visual information. Tufte suggests that when people
view your visual displays, they should spend time on “content-reasoning” rather than
“design-decoding” (Tufte, 2001). The sole purpose of designing visual presentations is to
communicate your information. So, the audience should spend time thinking about what
you have to say, not trying to puzzle through the display itself. In the following sections,
we cover guidelines for accomplishing this goal in both numeric and visual form.
Table 3.4 presents hypothetical data from a sample of 20 participants. In this example, we
have asked people to report their gender and ethnicity, as well as answer questions about
their overall life satisfaction and daily stress. Each row in this table represents one participant
in the study, and each column represents one of the variables for which data were collected.
In the following sections, we will explore different options for summarizing these sample
data, first in numeric form and then using a series of graphs. Our focus in this chapter is on
ways to describe the sample characteristics. In later chapters, we will return to these principles in discussing graphs that display the relationship between two or more variables.
Table 3.4: Raw Data from a Sample of Twenty Individuals

[One row per participant, with columns for Subject ID, Gender, Ethnicity, Life Satisfaction, and Daily Stress; the individual values are not reproduced here.]
Numeric Descriptions
Frequency Tables
Often, a good first step in approaching your data set is to get a sense of the frequencies for
your demographic variables—gender and ethnicity in this example. The frequency tables
shown in Table 3.5 are designed for presenting the number and percentage of the sample
that falls into each of a set of categories. As you can see in this pair of tables, our sample
consisted of an equal number of men and women (i.e., 50% for each gender). The largest group of participants was White (45%), with the remainder divided almost equally among African American (20%), Asian (15%), and Hispanic (20%) ethnicities.
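A frequency table like Table 3.5 is easy to build in software. In the sketch below (our own illustration in Python; the individual ethnicity values are reconstructed from the percentages in the text, since 45% of N = 20 is 9 people, and so on), a counter produces the frequency and percentage for each category:

```python
from collections import Counter

# Ethnicity values for the 20 participants, with counts implied by the
# reported percentages (45%, 20%, 15%, and 20% of N = 20).
ethnicity = (["White"] * 9 + ["African American"] * 4 +
             ["Asian"] * 3 + ["Hispanic"] * 4)

def frequency_table(values):
    """Return (category, count, percentage) rows, most frequent first."""
    n = len(values)
    return [(category, count, 100 * count / n)
            for category, count in Counter(values).most_common()]

for category, count, pct in frequency_table(ethnicity):
    print(f"{category:17s} {count:2d} {pct:5.1f}%")
```

The same function would work unchanged for the gender variable or any other nominal category.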
Table 3.5: Frequency Table Summarizing Ethnicity and Sex Distribution

Gender     Frequency     Valid percentage     Cumulative percentage
Male       10            50%                  50%
Female     10            50%                  100%

Ethnicity            Frequency     Valid percentage     Cumulative percentage
White                9             45%                  45%
African American     4             20%                  65%
Asian                3             15%                  80%
Hispanic             4             20%                  100%
We can gain a lot of information from numerical summaries of data. In fact, numeric
descriptors form the starting point for doing inferential statistics and testing our hypotheses. We will cover these statistics in later chapters, but for now it is important to understand that two numeric descriptors can provide a wealth of information about our data
set: measures of central tendency and measures of dispersion.
Measures of Central Tendency
The first number we need to describe our data is a measure of central tendency, which
represents the most typical case in our data set. There are three indices for representing
central tendency:
The mean is the mathematical average of our data set, calculated using the following formula:

M = ΣX / N

The capital letter M is used to indicate the mean; the X refers to individual scores, and the capital letter N refers to the total number of data points in the sample. Finally, the Greek letter sigma, or Σ, is a common symbol used to indicate the sum of a set of values.

So, in calculating the mean, we add up all the scores in our data set (ΣX), and then divide this total by the number of scores in the data set (N). Because we are adding and dividing our scores, the mean can only be calculated using interval or ratio data (see Chapter 2 for a review of the four scales of measurement). In our sample data set, we could calculate the mean for both life satisfaction and daily stress. To calculate the mean value for life satisfaction scores, we would first add the 20 individual scores (i.e., 40 + 47 + 29 + 32 + . . . + 38), and then divide this total by the number of people in the sample (i.e., 20).
M = ΣX / N = 742 / 20 = 37.1

In other words, the mean, or most typical, satisfaction rating in this sample is 37.1.
The median is another measure of central tendency, representing the number in the middle of our dataset, with 50% of scores both above and below it. The location of the median is found by placing the list of values in ascending numeric order, then using the following formula: Mdn = (N + 1)/2. For example, if you have 9 scores, the median will be the fifth one: Mdn = (N + 1)/2 = (9 + 1)/2 = 10/2 = 5. If you have an even number of scores, say, 8, the median will fall between two scores: Mdn = (8 + 1)/2 = 9/2 = 4.5, or the average of the fourth and fifth one. This measure of central tendency can be used for ordinal, interval, or ratio data because it does not require mathematical manipulation to obtain. So, in our sample data set, we could calculate the median for either life satisfaction or daily stress scores. To find the median score for life satisfaction, we would sort the data in order of increasing satisfaction scores (which has already been done in this case). Next, we find the position of the median using the formula Mdn = (N + 1)/2. Because we have an N of 20 scores:

Mdn = (N + 1)/2 = (20 + 1)/2 = 10.5
In other words, the median will be the average of the 10th and 11th scores. The 10th participant scored a 37, and the 11th participant scored a 38, for a median of 37.5. The median is
another way to represent the most typical score on life satisfaction, so it is no accident that
it is so similar to the mean (i.e., 37.1).
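The position formula Mdn = (N + 1)/2 translates directly into code. The function below is a small sketch of ours (not from the text): it sorts the scores, computes the 1-based position of the median, and averages the two middle scores whenever N is even.

```python
def median(scores):
    """Median via the position formula Mdn = (N + 1) / 2."""
    data = sorted(scores)            # the formula assumes ascending order
    n = len(data)
    pos = (n + 1) / 2                # 1-based position of the median
    if pos.is_integer():             # odd N: a single middle score
        return data[int(pos) - 1]
    lower = data[int(pos) - 1]       # even N: average the two middle scores
    upper = data[int(pos)]
    return (lower + upper) / 2

print(median([1, 3, 5, 7, 9]))  # position (5 + 1)/2 = 3, so the median is 5
```

With 20 scores the position is 10.5, so the function averages the 10th and 11th scores, just as in the life-satisfaction example above.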
The final measure of central tendency, the mode, represents the most frequent score in our
data set, obtained either by visual inspection of the values or by consulting a frequency
table like the one in Table 3.5 (discussed earlier). Because the mode represents a simple
frequency count, it can be used with any of the four scales of measurement. In addition, it
is the only measure of central tendency that is valid for use with nominal data, since the
numbers assigned to these data are arbitrary.
So, in our sample data, we could calculate the mode for any of the variables in the table.
To find the mode of life satisfaction scores, we would simply scan the table for the most
common score, which turns out to be 40. Thus, we have one more way to represent the
most typical score on life satisfaction. Note that the mode is slightly higher than our mean
(37.1) or our median (37.5). We will return to this issue shortly and discuss the process of
choosing the most representative measure. Since we’ve been ignoring the nominal variables so far, let’s also find the mode for ethnicity. This is accomplished by tallying up the
number of people in each category—or, better yet, by letting a computer program do the
tallying for you. As we saw earlier, the largest group of participants was White (45%), with the remainder divided almost equally among African American (20%), Asian (15%), and Hispanic (20%) ethnicities. So, the modal, or most typical, value of ethnicity in this sample was White.
One important take-home point is that your scale of measurement largely dictates the
choice between measures of central tendency—nominal scales can only use the mode, and
the mean can only be used for interval or ratio scales. The other piece of the puzzle is to
consider which measure best represents the data. Remember that the central tendency is
a way to represent the “typical” case with a single number, so the goal is to settle on the
most representative number. This process is illustrated by the examples in Table 3.6.
Table 3.6: Comparing the Mean, Median, and Mode

[The example data sets shown in the original table are not reproduced here.]

First example data set:
• Both the mean and the median seem to represent the data fairly well.
• The mean is a slightly better choice because it hints at the higher scores.
• The mode is not representative—two people seem to have higher scores than everyone else.

Second example data set:
• The mean is inflated by the atypical score of 100 and therefore does not represent the data well.
• The mode is also not representative because it ignores the higher values.
• In this case, the median is the most representative value to describe this dataset.
Let’s look at one more example, using the “Daily Stress” variable from our sample data in
Table 3.4. The Daily Stress values of our 20 participants were as follows: 1, 1, 3, 3, 4, 4, 7, 7,
7, 8, 8, 8, 8, 8, 9, 9, 9, 10, 10, and 10.

To calculate the mean of these values, we add up all of the values and divide by
our sample size of 20:

5 6.70
To calculate the median of these values, we use the formula Mdn 5 (N 1 1)/2
to find the middle score: Mdn = (N 1 1)/2 5 (21)/2 5 10.5. This tells us that our
median is the average of our 10th and 11th scores, or 8.
To obtain the mode of these values, we can inspect the data and determine that 8
is the most common number because it occurs five times.
In analyzing these three measures of central tendency, we see that they all appear to represent the data accurately. The mean is a slightly better choice than the other two because
it represents the lower values as well as the higher ones.
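As a check on the hand calculations, Python's standard statistics module computes all three measures directly from the Daily Stress values (this is simply a verification aid, not something the chapter requires):

```python
import statistics

# Daily Stress values for the 20 participants.
stress = [1, 1, 3, 3, 4, 4, 7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 10, 10, 10]

mean = statistics.mean(stress)      # sum of 134 divided by N = 20
median = statistics.median(stress)  # average of the 10th and 11th scores
mode = statistics.mode(stress)      # most frequent value (8 occurs five times)

print(mean, median, mode)  # 6.7 8.0 8
```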
Measures of Dispersion
The second measure used to describe our data set is a measure of dispersion, or the spread
of scores around the central tendency. Measures of dispersion tell us just how typical the
typical score is. If the dispersion is low, then scores are clustered tightly around the central tendency; if dispersion is higher, then the scores stretch out farther from the central
tendency. Figure 3.2 presents a conceptual illustration of dispersion. The graph on the
left has a low amount of dispersion because the scores (i.e., the blue curve) cluster tightly
around the average value (i.e., the red dotted line). The graph on the right shows a high
amount of dispersion because the scores (blue curve) spread out widely from the average
value (red dotted line).
Figure 3.2: Two Distributions with a Low vs. High Amount of Dispersion
[Left panel: low dispersion around the mean (red dotted line); right panel: high dispersion around the mean (red dotted line)]
One of the most straightforward measures of dispersion is the range, which is the difference between the highest and lowest scores. In the case of our Daily Stress data, the
range would be found by simply subtracting the lowest value (1) from the highest value
(10) to get a range of 9. The range is useful for getting a general idea of the spread of
scores, although it does not tell us much about how tightly these scores cluster around
the mean.
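In code, the range reduces to a single subtraction. A minimal sketch using the same Daily Stress scores:

```python
# Daily Stress scores for the 20 participants (from Table 3.4)
scores = [1, 1, 3, 3, 4, 4, 7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 10, 10, 10]

# Range = highest score minus lowest score
score_range = max(scores) - min(scores)  # 10 - 1 = 9
```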
The most common measures of dispersion are the variance and standard deviation, both
of which represent the average difference between the mean and each individual score.
The variance (abbreviated S²) is calculated by subtracting the mean from each score to get a deviation score, squaring and summing these individual deviation scores, and then dividing by the sample size. The more the scores are spread out around the mean, the higher the sum of our squared deviation scores will be, and therefore the higher our variance will be. The deviation scores are squared because otherwise their sum would always equal zero; that is, Σ(X − M) = 0. Finally, the standard deviation, abbreviated SD, is calculated by taking
the square root of our variance. This four-step process is illustrated in Table 3.7, using a
hypothetical data set of 10 participants.
Once you know the central tendency and the dispersion of your variables, you have a good sense of what the sample looks like. These numbers also serve as valuable inputs for calculating the inferential statistics that we ultimately use to test our hypotheses.
Table 3.7: Steps to Calculate the Variance and Standard Deviation

Step 1: Subtract the mean from each value.
Step 2: Square and sum the deviation scores.

Score        Deviation score          Squared deviation
1            (1 − 5.4) = −4.4         (−4.4)² = 19.36
2            (2 − 5.4) = −3.4         (−3.4)² = 11.56
2            (2 − 5.4) = −3.4         (−3.4)² = 11.56
4            (4 − 5.4) = −1.4         (−1.4)² = 1.96
5            (5 − 5.4) = −0.4         (−0.4)² = 0.16
7            (7 − 5.4) = 1.6          (1.6)² = 2.56
7            (7 − 5.4) = 1.6          (1.6)² = 2.56
8            (8 − 5.4) = 2.6          (2.6)² = 6.76
9            (9 − 5.4) = 3.6          (3.6)² = 12.96
9            (9 − 5.4) = 3.6          (3.6)² = 12.96
Mean = 5.40  Σ = 0.00                 Σ = 82.40

Step 3: Calculate the variance: S² = Σ(X − M)²/N = 82.40/10 = 8.24.
Step 4: Calculate the standard deviation: SD = √S² = √8.24 = 2.87.
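The four-step process in Table 3.7 translates directly into Python. This sketch uses the chapter's population formula (dividing by N rather than N − 1):

```python
import math

# Hypothetical data set of 10 participants from Table 3.7
scores = [1, 2, 2, 4, 5, 7, 7, 8, 9, 9]
n = len(scores)
mean = sum(scores) / n                    # 5.4

# Step 1: subtract the mean from each value to get deviation scores
deviations = [x - mean for x in scores]   # these sum to (essentially) zero

# Step 2: square and sum the deviation scores
sum_of_squares = sum(d ** 2 for d in deviations)  # 82.40

# Step 3: divide by the sample size to get the variance
variance = sum_of_squares / n             # 8.24

# Step 4: take the square root to get the standard deviation
sd = math.sqrt(variance)                  # about 2.87
```

The standard library's statistics.pvariance and statistics.pstdev compute the same population quantities and can serve as a cross-check.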