Introduction to Research Methods in Political Science:
III. LEVELS OF MEASUREMENT
When data are prepared for
analysis by computer, values of variables are usually entered as numbers.
Sometimes such coding is natural — for example, the population of a
country or the number of votes received by a candidate. Sometimes,
artificial numerical codes are created for convenience of processing. In
a file containing data on members of the U.S. Congress, for example, Democrats
might be coded numerically as 1, Republicans as 2, and independents as 3.
Numerical, however, is not the same thing as quantitative. In fact,
whether data are coded numerically or not, there are different levels of
measurement.
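The point that numerical is not the same as quantitative can be illustrated with a short sketch. The six legislators and both coding schemes below are invented for illustration. The frequency counts are the same however the parties are coded, but the "average party" changes with the arbitrary codes, which suggests that it carries no real meaning.

```python
from collections import Counter
from statistics import mean

# Hypothetical data: party of six legislators (invented for illustration).
parties = ["Democrat", "Republican", "Republican",
           "Independent", "Democrat", "Democrat"]

# Two equally arbitrary numeric coding schemes for the same variable.
coding_a = {"Democrat": 1, "Republican": 2, "Independent": 3}
coding_b = {"Democrat": 3, "Republican": 1, "Independent": 2}

# Counting how often each value occurs does not depend on the codes.
counts = Counter(parties)

# The mean of the codes does depend on them, so it is not meaningful.
mean_a = mean(coding_a[p] for p in parties)
mean_b = mean(coding_b[p] for p in parties)

print(counts)
print(mean_a, mean_b)
```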
The values of a nominal variable
do not indicate the amount of the thing being measured, nor are they in any
particular order. If coded numerically, the numbers chosen are arbitrary.
For example, if we list the regions of the
Sometimes the values of a variable are listed in order.
(Alternatively, we say that the values are "ranked" or
"rank ordered.") For example, the army orders
(ranks) military personnel from general to private. At a college
or university, class standing of undergraduates (freshman to senior) is another
example of an ordinal variable. In both of these
examples, the values of the variable in question (military rank or class
standing) are ranked from highest to lowest or vice versa. There are
other kinds of ordering. For example, respondents in a survey may be
asked to identify their political philosophy as "very liberal,"
"liberal," "moderate," "conservative," or
"very conservative," creating a scale rank ordered from most liberal
to most conservative.
Sometimes, in addition to being ordered, the differences (or intervals)
between any two adjacent values on a measurement scale are the same. For
example, the difference in temperature between 80 degrees Fahrenheit and 81
degrees is the same as that between 90 degrees and 91 degrees. When each
interval represents the same increment of the thing being measured, the measure
is called an interval variable.
Finally, in addition to having equal intervals, some measures also have an
absolute zero point. That is, zero represents the absence of the thing
being measured. Height and weight are obvious examples. Physicists
sometimes use the Kelvin temperature scale, in which zero means the complete
absence of thermal energy. The same is not true of the Fahrenheit or Celsius
(Centigrade) scales. Zero degrees Celsius, for example, represents the freezing
point of water at sea level, but this does not mean that there is no
temperature at this point. The choice to put zero degrees at this point
on the scale is arbitrary. There is no particular reason why scientists
could not have chosen instead the freezing point of beer in Golden, Colorado
(other than that water is a more common substance, at least for most successful
scientists). With an absolute zero point, you can calculate ratios (hence
the name). For example, $20 is twice as much as $10, but 60 degrees
Fahrenheit is not really twice as hot as 30 degrees. Ratio data are fully quantitative: they tell us the
amount of the variable being measured. The percentage of votes received
by a candidate, Gross Domestic Product per Capita, and felonies per 100,000 population are all
ratio variables.
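The difference between interval and ratio scales can be checked with a little arithmetic. The following is a minimal sketch (the function names are our own): on an interval scale such as temperature, equal differences stay equal when the unit of measurement is changed, but ratios do not. On a ratio scale such as money, ratios survive a change of units.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


def dollars_to_cents(d):
    """Convert dollars to cents."""
    return d * 100


# Interval property: the step from 80 to 81 equals the step from 90 to 91,
# whether measured in Fahrenheit or in Celsius.
step_low_c = fahrenheit_to_celsius(81) - fahrenheit_to_celsius(80)
step_high_c = fahrenheit_to_celsius(91) - fahrenheit_to_celsius(90)

# Ratios on an interval scale depend on the arbitrary zero point:
# 60/30 is 2 in Fahrenheit, but about -14 for the same temperatures in Celsius.
ratio_f = 60 / 30
ratio_c = fahrenheit_to_celsius(60) / fahrenheit_to_celsius(30)

# Ratios on a ratio scale do not depend on the unit chosen.
ratio_dollars = 20 / 10
ratio_cents = dollars_to_cents(20) / dollars_to_cents(10)

print(step_low_c, step_high_c)
print(ratio_f, ratio_c)
print(ratio_dollars, ratio_cents)
```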
Dichotomous variables (those with only two values)
are a special case, and may sometimes be treated as nominal, ordinal, or
interval. Take, for example, political party affiliation in a two-party
legislature. Party is, on its face, a pure example of a nominal variable,
with the values of the variable being simply the names of the parties (or
arbitrary numbers used, for convenience, in place of the names). On the
other hand, we could treat party (and other dichotomous variables) as ordinal,
since there are only two possible ways for the values to be ordered, and it
makes no difference which way is chosen. There is, therefore, no way that
they can be listed out of order.
For certain purposes, we can even
treat dichotomous variables as interval, since there is only one interval (the
difference between Party A and Party B), which is obviously equal to itself.[1]
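One practical consequence can be sketched as follows. If the two values are coded 0 and 1 (an assumption made here for illustration, as are the five invented legislators), the mean of the variable is simply the proportion of cases coded 1, which is an interpretable number.

```python
from statistics import mean

# Hypothetical two-party legislature (invented for illustration).
party = ["A", "B", "B", "A", "B"]

# Code Party B as 1 and Party A as 0.
is_party_b = [1 if p == "B" else 0 for p in party]

# The mean of a 0/1 variable is the share of cases coded 1.
share_b = mean(is_party_b)
proportion_b = party.count("B") / len(party)

print(share_b, proportion_b)
```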
Level of measurement is important
because the higher the level of measurement of a variable (note that
"level of measurement" is itself an ordinal measure) the more
powerful are the statistical techniques that can be used to analyze it.
With nominal data, you can count the frequency with which each value of a
variable occurs. A voter's choice in the 2008 presidential race, for
example, is a nominal variable (with the values of the variable being
"McCain," "Obama," "Nader," "Barr," "McKinney," etc.), and so you can count
the number of votes received by each candidate. You can also calculate
the percentage of votes each candidate received. You can calculate joint
frequencies and percentages (how many and what percent of votes were cast by
Midwestern Obama supporters, for example). You can also use certain
measures that tell you how strong the relationship is between region and vote,
and the likelihood that the relationship occurred by chance.
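The counting operations described above might be sketched as follows. The eight ballots are invented for illustration and are not drawn from any real dataset. The sketch covers frequencies, percentages, and joint frequencies only, not the measures of strength of relationship or statistical significance.

```python
from collections import Counter

# Hypothetical (region, vote) pairs, invented for illustration.
ballots = [
    ("Midwest", "Obama"), ("Midwest", "McCain"), ("Midwest", "Obama"),
    ("South", "McCain"), ("South", "McCain"), ("South", "Obama"),
    ("West", "Obama"), ("West", "Nader"),
]
total = len(ballots)

# Frequency of each value of the nominal variable "vote".
vote_counts = Counter(vote for _, vote in ballots)

# Percentage of all votes received by each candidate.
vote_percents = {cand: 100 * n / total for cand, n in vote_counts.items()}

# Joint frequencies: how many ballots fall in each region/vote combination.
joint_counts = Counter(ballots)
midwest_obama = joint_counts[("Midwest", "Obama")]
midwest_obama_pct = 100 * midwest_obama / total

print(vote_counts)
print(vote_percents)
print(midwest_obama, midwest_obama_pct)
```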
On the other hand, there are
other operations you cannot legitimately perform with nominal data. Even
if you use numbers to label candidates (e.g., 1 = Obama, 2 = McCain, 3 = Nader, etc.),
you cannot very well say that Obama plus McCain equals Nader, or that Nader
divided by McCain is halfway between Obama and McCain. Unfortunately, there
are many statistical techniques that require higher levels of measurement.
With ordinal data, you can employ
techniques that take into account the fact that the values of a variable are
listed in a meaningful order. With interval data, you can go even further
and use powerful techniques that assume a measurement scale of equal intervals.
As it happens, there are very few techniques in the social sciences that
require ratio data, and so some textbooks ignore the distinction between
interval and ratio scales.
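As one simple illustration of a technique that uses order but not equal intervals, the sketch below finds the median category of an ordinal variable. The seven survey responses are invented, and the median is only one of many order-based techniques. It relies solely on the ranking of the categories, not on the distances between them.

```python
from statistics import median_low

# The ordinal scale, listed from most liberal to most conservative.
scale = ["very liberal", "liberal", "moderate",
         "conservative", "very conservative"]

# Hypothetical survey responses (invented for illustration).
responses = ["liberal", "moderate", "very conservative", "moderate",
             "conservative", "liberal", "moderate"]

# Replace each response with its rank (position) on the scale.
ranks = [scale.index(r) for r in responses]

# median_low always returns a value that actually occurs in the data,
# so the result maps back to a real category.
median_category = scale[median_low(ranks)]

print(median_category)
```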
If you use a technique that
assumes a higher level of measurement than is appropriate for your data, you
risk getting a meaningless answer. On the other hand, if you use a
technique that fails to take advantage of a higher level of measurement, you
may overlook important things about your data. (Note: in addition to
level of measurement, many statistical techniques also require other
assumptions about your data. For example, even if a variable is interval,
some otherwise appropriate techniques may yield misleading results if the
variable includes some values that are extremely high or low relative to the
rest of the distribution.)
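The effect of extreme values can be seen in a small sketch. The six values below are invented for illustration. One very high value pulls the mean far above the typical case, while the median is barely affected.

```python
from statistics import mean, median

# Hypothetical interval-level values with one extreme case (invented).
values = [30, 32, 35, 38, 40, 1000]

# The mean is pulled toward the extreme value; the median is not.
mean_all = mean(values)
median_all = median(values)

# For comparison, the mean without the extreme value.
mean_without_extreme = mean(values[:-1])

print(mean_all, median_all, mean_without_extreme)
```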
The distinctions between levels of measurement are not always hard and fast.
Sometimes it depends on the underlying concept being measured. This applies, for example, to the question of whether to treat a dichotomous variable as nominal or ordinal. Do our values indicate two distinct categories (e.g., male and female), or do we think of them as two points along a spectrum (e.g., for or against capital punishment, since some people may favor or oppose capital punishment more strongly than others)?
In survey research, independents are often thought of as being somewhere in between Democrats and Republicans, and so measures of party identification are usually treated as ordinal. On the other hand, if you were studying the U.S. Senate, you would find that the only independents currently serving (as of the 112th Congress, 2011-2012) are Bernie Sanders of Vermont and Joseph Lieberman of Connecticut. While Lieberman might in some senses be considered "between" the Republican and Democratic parties, the same could hardly be said of Sanders, one of the most liberal members of the chamber. (Before coming to Congress, he had won election as mayor of Burlington, Vermont, running as an independent and self-described socialist.)
Sometimes the question
of level of measurement hinges on the precise nature of the measure itself.
For example, the American National Election Study has for many years been
using "feeling thermometers." Respondents are asked to locate a
person (e.g., a presidential candidate) or a category of people (Democrats,
Republicans, feminists, evangelical Christians, Latinos, etc.) on a scale ranging from 0
to 100, with higher numbers representing warmer feelings toward the person or
category of people in question. Most researchers using these variables
have treated them as interval. Some, however, have raised doubts about
this practice. For example, does the difference between a rating of, say,
60 and 70 really mean the same thing as the difference between 90 and 100?
In designing research, there can
be tradeoffs between having data that are at a higher level of measurement
and other considerations. Aggregate data (data about groups of people) are generally interval or ratio,
but usually provide only indirect measures of how people think and act. Individual data get at these
things more directly, but are usually only nominal or ordinal.
Official election returns, for example, can provide us with ratio level data
about the distribution of votes in each precinct. These data, however,
tell us little about why individual people vote the way they do. Survey
research (public opinion polling), which provides data that are for the most
part only nominal or ordinal, allows us to explore such questions much more
extensively and directly.
Sometimes you will find other
terms used to describe the level of measurement of variables. SPSS,
for example, distinguishes among nominal, ordinal, and scale
(that is, interval or ratio) variables. Some texts distinguish
between nonparametric
(nominal or ordinal) and parametric
(interval or ratio) variables. In describing different statistical
procedures, we will sometimes distinguish between categorical and continuous variables. Categorical
variables generally consist of a small number of values, or categories, and are
usually nominal or ordinal. The values of continuous variables represent
a large or even infinite number of possible points along a scale, and are
interval or ratio.
categorical variable
continuous variable
dichotomous variable
interval variable
levels of measurement
nominal variable
nonparametric variable
ordinal variable
parametric variable
ratio variable
scale variable
1.
Start SPSS, and
open anes08s.sav.
In Variable View, notice that SPSS uses three categories of measurement:
nominal, ordinal, and scale (equivalent to interval and
ratio). Notice also that almost all of the variables are either
nominal or ordinal. This is typical with data on individuals, such as survey data. Now open countries.sav and do
the same. Notice that almost all of the variables are scale, as is
usually the case with aggregate data.
With each of the datasets
included in this project, care has been taken to correctly categorize the level
of measurement of variables. Remember, however, that the level of
measurement of some variables may depend on how the variable is used.
Also, when using datasets other than those provided with POWERMUTT, do not
assume without checking that the author has bothered to verify each variable's
level of measurement.
2.
Open the codebooks for the other datasets included in this project.
Classify each variable as nominal, ordinal, interval, or ratio.
Check your answers by opening the dataset in SPSS and examining the
Variable View (noting again that SPSS uses the term “scale” for
both interval and ratio data). (In some cases, there may be more than one
correct answer, depending on what assumptions are made.)
Lane, David, et al. “Levels of Measurement,” Online Statistics: A Multimedia Course
of Study. http://onlinestatbook.com/chapter1/levels_of_measurement.html.
University of Cambridge. "Levels of Measurement," Universities' Collaboration in eLearning. http://www.ucel.ac.uk/showroom/levels_of_measurement/Default.html.
[1] See the section on "dummy" variables under the regression analysis Topic.
Except where indicated, © 2003-2012 John L. Korey
Last Updated: December 18, 2012