I'll Give You a Definite Maybe
An Introductory Handbook for Probability, Statistics, and Excel
[This handbook has been prepared by Ian Johnston of Malaspina University-College,
Nanaimo, BC (now Vancouver Island University), for students in Liberal Studies. The text is in the public domain,
released May 2000, and may be used, in whole or in part, by anyone without
charge and without permission.]

[For comments, questions, corrections, and what not, please contact Ian Johnston.]
Section Six: Samples and Populations
Introduction: Samples and Populations
In most of the examples we have been dealing with so far, our statistical analysis
has usually involved a complete set of information about all the items we wished
to study (e.g., all the students in a class). In other words, we have been
dealing with populations (i.e., we had data for all the items in which we were
interested). When our analysis is based upon an entire population (i.e., all the members of the
group under study, each of whom is taken into account in the analysis), we are
interested in data on each member of the group, and we do not extend our
conclusions beyond that particular group.
In most statistical studies, however, the population we are interested in is far
too large for us to measure each and every one of the members of it (e.g., all
students at Malaspina University-College, all Canadian voters, all cars made in
Detroit, all children in Nanaimo, and so on). In such cases, we confine our
analysis to a relatively small selection taken from the total population. Such a
selection is called a sample.
The purpose of dealing with a sample is straightforward: it enables us to study a
large population and to learn things about it, so that we can draw important
inferences, without having to go to the trouble of collecting data from every
member of the entire population.
A very important part of statistics is the study of the sorts of conclusions we
can make about an entire population on the basis of a relatively small sample.
For instance, if we have measured data on, say, voting patterns for 1000 people,
are we entitled to make any conclusions based on that information about the
voting patterns of the population in general? And if so, what are the limits to
the sorts of generalizations we can make? What are we not entitled to conclude
about the wider population? How does my ability to make conclusions about the
wider population change as the size of my sample increases? How do I test claims
made about entire populations on the basis of an analysis of a single sample?
And so on.
In other words, to use statistical information properly we need to understand
something about the relationship between the information we have collected from
a representative group of the entire population (the sample) and the total
population itself, from which the sample is taken and for which we can never
conduct complete measurements, since obtaining the information would be too time
consuming, if not impossible.
One important point in working with samples is the selection of a truly
representative sample—a collection of individual items for observation which
accurately represents the larger population. It is beyond the scope of this
module to explore the various methods statisticians use to make sure their
sampling techniques do not introduce major errors into the calculations (a
complex subject); however, it is appropriate to say a few things about the main methods.

There are a number of common procedures for selecting a sample, some simple and some
more complicated. Haphazard (or Opportunity) Sampling, for example, relies upon
the convenience of the sampler or the self-selection of the sample (e.g.,
volunteers who respond to a mailed-out questionnaire or who are picked at random
from a crowd) (1). Quota Sampling sets
quotas for various categories in the sample (so many men, so many women, so many
over age 45, so many under age 45, and so on), so as to achieve a representation
of the major divisions in the larger population. Random Sampling picks members
of the sample according to a random process, thus giving each member of the
large population an equal opportunity of being selected.
In general, of the methods mentioned above, Random Sampling is the preferred
method, with the least built-in bias. However, in order for random sampling
to be possible, there must be available a list of everyone in the population to
be sampled (for reasons explained below). Where that requirement cannot be
conveniently met (e.g., in a survey of all Canadians or all residents of BC),
then the simple method of random sampling outlined below is not appropriate.
In a simple random sample, with a list of the entire population under investigation,
the sampler then assigns a number to each item in the list and selects the
sample by consulting a random number generator or a table of random numbers. The
process works as outlined below.
Suppose we wish to investigate all the workers
in a particular factory, but we do not have the time or the resources to deal
with them all. So we decide to work with a sample of 30 workers out of a total
factory population of 450. We begin by assigning each member of the total
population a number. Since the largest number we require (450) has three digits,
we give everyone a three-digit number, starting with 001, 002, 003, 004, and so
on, up to 450.
We then consult a list of random numbers. The list looks like this (a portion of a page):

[table of random numbers]
To begin the selection, we blindly point to some number in the table (say, for
example, 2956, the figure in bold above). Then, reading across the table we take
three-digit numbers. If they fit someone in the general population, that person
is selected; if they do not, then we move on to the next three-digit number.
Starting with 2956, the first three-digit number is 295. Since we have someone
in our list of 450 with that number, we select that person. The next three-digit
number (continuing to read to the right) is 674. That does not fit (since we
have only 450 in the total population we are studying), so we move on. The next
three-digit number is 053. This number fits, so the person with this number is
selected for the sample.
We continue this process, moving through the table of random numbers, until we have
the number we need for the sample. To complete the selection of the sample of
30, we would obviously need a bigger list of random numbers than the partial
list given above.
There is less bias in this selection because everyone in the total population has an
equal chance of being included in the sample. We have made no attempt to
organize the population into different sections or proportions. If we were
working on sampling merchandise or samples for experiment, we would proceed in
the same way, first assigning a number to each item in the larger population and
then consulting a list of random numbers to select the items for our sample.
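Although this handbook uses Excel for its calculations, the selection procedure just described can be sketched in a few lines of Python (a modern random number generator stands in for the printed table; the worker numbering and sample size are those of the factory example, and the seed is chosen arbitrarily so the run is repeatable):

```python
import random

population_size = 450   # workers in the factory, numbered 001..450
sample_size = 30        # workers we can afford to measure

# Give each worker a number from 1 to 450, then let the random number
# generator pick 30 distinct numbers, so every worker has an equal
# chance of being selected.
rng = random.Random(2956)  # arbitrary seed, used only for repeatability
worker_numbers = range(1, population_size + 1)
sample = rng.sample(worker_numbers, sample_size)

print(sorted(sample))
```

Because `sample` draws without replacement, no worker can appear twice, just as no number in the table is used for more than one person.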
In some opinion polls, a variation of this method of random sampling can be useful:
random digit dialing for a telephone survey (although such a method is biased in
favour of those with telephones or more than one telephone number or who spend a
lot of time at home).
An important factor in any sample is its size. The most appropriate size will
depend upon the accuracy we wish to achieve and upon the size of the general population we
are sampling. We shall be dealing with this question later in this section.
The Sample Mean
Let us assume we have properly identified our sample from the large population we
are interested in. On the basis of the measurements I have made of the sample I
have collected, I have a group of numbers. Thus, I can calculate the mean of
this sample (remember that the mean is the arithmetical average) in the usual
way (adding up all the values and dividing by the total number in the sample or
by entering the measurements on an Excel worksheet and getting Excel to make the
calculation for me). This figure is called the Sample Mean.
Now, suppose I conduct another similar sample of the same general population (not
including in the second sample anyone who was part of the first sample). I will
obtain a second set of measurements from my new sample, and I can calculate the
mean of that collection of numbers. Now I have a second Sample Mean. If I have
done my sampling without major bias, the second Sample Mean should be close to
the first Sample Mean (since I am sampling the same general population). But the
value for the second Sample Mean will almost certainly be somewhat different
from the first (even if the difference is quite small).
For example, suppose I am investigating the body length of an adult male lizard. I
collect my first sample of, say, thirty lizards, measure the body length, enter
the data on a worksheet, and obtain a mean value for that sample. Suppose this
value is 6.56 inches. I then collect a second sample for the same animal,
measure the body lengths, enter the data on a worksheet, and obtain a mean value
for that sample of 6.43 inches. These two figures are both sample means for the
same general population (all the adult male lizards): Sample Mean 1 and Sample Mean 2.

If I continue in this fashion, making a number of different samples and calculating
the mean of each. Gradually I will collect a list of Sample Means, one for each
of the samples I have collected. I will create a list of numbers, each
representing a separate Sample Mean. These will probably be quite close to each
other in value, but there will be differences. In other words, the value of the
Sample Means will be distributed; we can think of the values we obtain for the
different Sample Means as having a frequency distribution, just like any other
list of numbers.
Make sure you understand this point. The collection of means from different samples
will provide a list of numbers which, like any such list (of the sort we have
been examining), will have a frequency distribution (with a mean value, a median,
a variance, and a standard deviation).
An Example of a Collection of Sample Means (S-Means)
In order to reinforce this last point, let us continue to work through our example
with the adult male lizards. I continue my sampling, measuring, and calculating,
and produce the following results (let us assume for the sake of argument that
each sample contains 30 male lizards):
Sample 1: S-Mean 1: 6.56 in
Sample 2: S-Mean 2: 6.43 in
Sample 3: S-Mean 3: 6.48 in
Sample 4: S-Mean 4: 6.51 in
Sample 5: S-Mean 5: 6.40 in
Sample 6: S-Mean 6: 6.52 in
Sample 7: S-Mean 7: 6.54 in
Sample 8: S-Mean 8: 6.47 in
Sample 9: S-Mean 9: 6.49 in
Sample 10: S-Mean 10: 6.53 in
Note that each of these S-Means is the average for a sample of 30 adult male lizards.
This list of numbers also has a mean value (6.493 in) and a Standard Deviation
(0.0499 in). These, you will recall, we can have Excel calculate for us (just as
for any list of numbers).
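You can verify these two figures yourself. Here is a short Python sketch (Python's `statistics.stdev`, like Excel, treats the list as a sample):

```python
import statistics

# The ten sample means (in inches) from the lizard example above.
s_means = [6.56, 6.43, 6.48, 6.51, 6.40, 6.52, 6.54, 6.47, 6.49, 6.53]

mean_of_means = statistics.mean(s_means)  # mean of the S-Means
sd_of_means = statistics.stdev(s_means)   # their standard deviation

print(round(mean_of_means, 3))  # 6.493
print(round(sd_of_means, 4))    # 0.0499
```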
You will remember from the previous chapter that the standard deviation is a measure
of the distribution of the frequencies in the probable results. A small standard
deviation (as in the above example) means that most of the values will lie close
to the overall mean of the numbers in the list.
The Mean of the S-Means
For reasons which lie outside the scope of this handbook, the values of the S-Means
will have a frequency distribution represented by the normal curve (that is, the
probabilities that particular S-means will have certain values will follow the
pattern of a normal distribution, which we discussed in the previous section).
Thus, the various probabilistic characteristics of the normal curve, which we
have studied in an earlier module, will apply to the collection of samples we
have made (2). Please make sure you
understand this very important point; everything we do in the rest of this
chapter depends upon it.
We also know from mathematical studies that in such a normal distribution of all
the S-Means for a particular population, the mean value (the midpoint, the
highest part of the normal curve of S-means) will be the same as the average for
the entire population. We cannot measure all the population and then calculate
the mean, but we can theoretically establish that if we did so, the mean for the
entire population would be the same as the average of all the means of all the
samples of that population we could collect (since if our sampling was complete
we would have measured each member of the population).
This point is obvious enough if you think about it. If I kept collecting samples like
the 10 listed above, eventually I would have sampled the entire population
(assuming no lizard appeared in more than one sample). The average of all my
samples would then be the average of the entire population, because all my
samples would be the same as the entire population.
Any particular sample we take of 30 adult male lizards might be truly representative
of the total population (in which case the mean of the sample would coincide
with the mean for the entire population), or it might misrepresent somewhat the
population under study (that is, the sample mean may be displaced from the
population mean). We have no way of directly knowing that unless we can measure
every member of the population.
The more samples we collect and the larger those samples, the closer the average
obtained by averaging the means of all the samples will be to the average body
length for the entire population. If I kept sampling until I had sampled every
adult male lizard, then the average of all the sample means would be the same as
the average for the total population.
The Value of a Single Sample
But in practice we usually do not have time (or money) to carry out enough
measurements of separate samples to calculate the mean of all the Sample Means
(we do not want to carry out a very large number of samples, find the average of
each sample, and then treat those averages as a distribution, calculating the
mean of the S-Means and the standard deviation, as we theorized above). Besides,
in many cases (as in the male lizard example) we may never know whether we have
sampled every single member of the population.
In most cases, we are interested in making some judgment about the entire
population on the basis of a single sample (of, say, 50). So what is of
immediate interest is this question: If I use the S-Mean from a single sample of
observations to make an estimate about the mean for the entire population, how
likely am I to make a serious mistake?
Note the importance of this question. It poses a vital statistical enquiry: On the
basis of a single sample, what am I entitled to conclude about the entire
population? For example, if I have randomly selected adult male lizards for a
measurement of their body length, what legitimate conclusions can I draw from
this small sample about all the adult male lizards? How certain can I be of any
conclusions I reach?

It turns out that the error in basing a conclusion about the entire population on a
small sample is likely to be quite small. This vital conclusion follows from the
important fact that the distribution of all possible Sample Means is a normal
curve and that the normal curve has important characteristics (as we have seen
in the previous section).
We know that in any normal curve, the further any value falls from the mean, the
less likely it is to occur. You will recall that there is approximately a .68
probability that any value will fall within 1 Standard Deviation on both sides
of the mean, and approximately a .95 probability that any value will fall within
2 Standard Deviations on both sides of the mean. Thus, from the properties of
all normal distributions, we know that there is only a .05 probability that any
value will lie more than 2 Standard Deviations from the mean. Hence, the more a
sample is a poor representative of the entire population, the less likely it is
to occur.

Since the Sample Means are normally distributed around the value of the mean of the
entire population, the further the mean of any one sample is from this mean of
the entire population, the less likely it is to occur. As one moves from the
mid-point of the distribution in either direction, the number of samples which
produce a Sample Mean much smaller or larger than the mean of the population
gets smaller and smaller (since the means of those samples would have to fit
into the extremes of the normal curve).
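These probabilities can be checked against the normal curve directly. A small Python sketch, using the standard relation between the normal distribution and the error function (the .68 and .95 figures quoted in the text are rounded):

```python
from math import erf, sqrt

def prob_within(z):
    """Probability that a normally distributed value falls
    within z standard deviations of the mean."""
    return erf(z / sqrt(2))

print(round(prob_within(1), 4))  # 0.6827 (about .68)
print(round(prob_within(2), 4))  # 0.9545 (about .95)
print(round(prob_within(3), 4))  # 0.9973 (about .99)
```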
What this implies is that if we could ascertain the Standard Deviation for the
distribution of sample means, we would know the probabilities that any
particular sample mean would be close to or far away from the mean for the
entire population.

Remember that we are conceptualizing a normal distribution curve which represents all the
frequencies of all the mean values for all the samples we might make of a large
population. We have ascertained that the mean value of such a curve will be the
same as the mean value for the entire population we are studying. If we could
find out the Standard Deviation of this normal curve, then we would know how the
various values of the sample means are distributed in relation to the mean of
the normal curve.
The Standard Deviation of this normal distribution of Sample Means is called the
Standard Error or the Standard Error of the Means. If we had a way of
ascertaining its value, then we could describe the probabilities of the entire
curve, just as we can for any normally distributed value.
Standard Deviation and Standard Error
Make sure you understand the difference between the terms Standard Error and Standard
Deviation. The standard error is the name of a very particular standard
deviation, the standard deviation of the means of all the samples we could take
of a particular population (e.g., the population of adult male lizards in the
example we have been considering).
To clarify this issue, if it still needs clarification, let me list here once more
some summary points:
When we collect a sample or deal with the entire population in our measurements,
we can list all the numerical results and then calculate the mean and the
standard deviation of that list by the methods we have already discussed
(usually getting Excel's Descriptive Statistics function to do the work for us).
When we are dealing with a very large population, we will take a small sample
picked so as to avoid bias. The larger total population has a mean and a
standard deviation, but we do not have the time or the resources to measure all
the cases (even if we could locate them), and therefore we do not know what
these figures are directly. The only direct observations we have are from the
sample we have taken.
However, the Standard Error, which we are able to calculate from our sample (see
below), will give us the Standard Deviation of all the different averages from
all the samples we could make of the general population (or a figure close
enough to that Standard Deviation to use as a substitute).
We use the term Standard Deviation to remind ourselves that the figure we are
dealing with refers to a sample or to an entire population. We use the term
Standard Error to remind ourselves that we are dealing with the distribution of
the averages from all possible samples (even though we have undertaken to
measure only a single sample).
Calculating the Standard Error
In our discussion above, we outlined one method for calculating the Standard Error.
That was to collect all the possible samples of a population, calculate the
mean of each, and then calculate the Standard Deviation of the frequency distribution of
Sample Means. Theoretically, that is fine, but in practice, we simply cannot
carry out sampling until we have included the entire population of our study.
But there is another way of calculating the Standard Error. Mathematicians have
demonstrated that the Standard Error (which tells us the Standard Deviation
in the normal curve of all the possible Sample Means) can be derived from a
single sample (or a value so close to the Standard Deviation of that curve
that for practical purposes we can treat it as the Standard Error). The value is
equal to the Standard Deviation of the sample divided by the square root of the
number of items in the sample, as the following formula indicates:

Standard Error = (Standard Deviation of the sample) / √(number of items in the sample)
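Expressed in code, the calculation is a single line. A minimal Python sketch (the thirty sample values are hypothetical, invented purely to exercise the formula):

```python
from math import sqrt
import statistics

def standard_error(sample):
    """Standard Error of the Means: the standard deviation of the
    sample divided by the square root of the number of items."""
    return statistics.stdev(sample) / sqrt(len(sample))

# Hypothetical sample of 30 lizard body lengths (inches),
# invented only to demonstrate the calculation.
sample = [6.4, 6.5, 6.6, 6.5, 6.4, 6.6] * 5
print(round(standard_error(sample), 4))
```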
This, as we shall see, turns out to be a very powerful piece of
information. From a single sample, we can calculate the standard deviation of
the normal curve depicting the means of all possible samples. Make sure you
understand this point; much of what we do from here on depends upon grasping
this idea that from one relatively small sample of a large population we can
draw conclusions about the distribution of the averages from all possible
samples of that same population.
Minimum Sample Size
For the mathematics we have been discussing to work effectively, the sample we
select must not be too small. The minimum permissible size is 30 observations.
And remember that when we are dealing with samples (as opposed to total
populations), to derive the standard deviation of the sample, we divide the sum
of the squared differences between the mean and each observation by one less than
the number in the sample (and then take the square root). If this is a puzzle to you, do not worry about it,
since Excel does the calculations anyway. But this practice of dividing by one
less than the number in the sample is the reason why Excel's calculation of the
standard deviation of a list of numbers is always slightly higher than the
result produced by a manual working out of the result which uses all the numbers
in the sample. Excel treats every list of numbers as a sample, not as the total population.

In calculating the standard error, however, we do not follow the same principle of
using one less than the number in the sample. As the formula above indicates, we
divide the standard deviation by the square root of the total number of items in
the sample.

As you may have already observed, Excel calculates the standard error for any list
of data and includes the figure in the Descriptive Statistics box.
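The distinction between the two divisors can be seen directly in code. A Python sketch using the statistics module, whose stdev and pstdev functions mirror Excel's STDEV and STDEVP (the six data values are hypothetical):

```python
import statistics
from math import sqrt

data = [4, 8, 6, 5, 3, 7]  # hypothetical list of six observations

sample_sd = statistics.stdev(data)       # divides by n - 1 (like Excel's STDEV)
population_sd = statistics.pstdev(data)  # divides by n (like Excel's STDEVP)

# Treating the list as a sample always gives the slightly larger figure.
print(sample_sd > population_sd)  # True

# The standard error, however, divides the sample standard deviation
# by the square root of n itself, not n - 1.
standard_error = sample_sd / sqrt(len(data))
print(round(standard_error, 3))
```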
A Simple Application of the Sample Mean and Standard Error
The fact that we can calculate the standard error of the means from a single sample
of a population turns out to be extraordinarily useful. For on the basis of a
single sample (provided it contains more than 30 observations and is free from bias), we can derive
the standard deviation of the normal curve representing the means of all
possible samples. And this, in turn, enables us to calculate the probability
that our sample mean is close to or far away from the mean of all the sample
means (which is equivalent to the mean of the total population).
For instance, suppose, as a consumer advocate, I am interested in examining the
quality of a particular brand of light bulbs, to see if they are up to the
manufacturer's guarantee. Well, first I collect a random sample of, say, 100
bulbs. I then test that sample, measuring the number of hours each bulb functions
before burning out. This test yields a list of one hundred results (one for each
member of the sample). From these one hundred numbers, I calculate (or Excel
calculates for me) the mean life of the bulbs in the sample and the standard
deviation of the results listed from the test of the sample.
Mean life of the light bulbs in the sample: 300 hr
Standard deviation of the sample: 20 hr
From these two figures I can calculate the standard error: the standard deviation of
the sample divided by the square root of the number of items in the sample or,
in this case, 20 divided by the square root of 100, that is, by 10, for a result
of 2 hr.
We know that the average of all the means of all the samples is the same as the
average for the entire population, and we know that the standard deviation in
the normal curve representing the values for all the different sample means is
equal to the standard error (2 hr).
So, on the basis of my single sample, I can conclude that there is a .68 probability
that the average for the entire population of all the light bulbs lies within 1
standard error of the mean of my sample, that is, between (300 - 2) and (300 +
2), or between 298 hr and 302 hr. There is a .95 probability that the mean of
the total population of light bulbs (that is, the average life of all the light
bulbs made by this manufacturer) lies within 2 standard errors of the sample
mean, or between (300 - 4) and (300 + 4), that is, between 296 and 304 hr.
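The whole light-bulb calculation can be worked in a few lines of Python, using the sample figures given above:

```python
from math import sqrt

sample_mean = 300   # mean life of the 100 bulbs tested (hours)
sample_sd = 20      # standard deviation of the sample (hours)
n = 100             # bulbs in the sample

standard_error = sample_sd / sqrt(n)
print(standard_error)  # 2.0 hours

# .68 probability: within 1 standard error of the sample mean
print(sample_mean - standard_error, sample_mean + standard_error)          # 298.0 302.0

# .95 probability: within 2 standard errors of the sample mean
print(sample_mean - 2 * standard_error, sample_mean + 2 * standard_error)  # 296.0 304.0
```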
Note the nature of this conclusion. On the basis of a relatively small sample of a
very large population, we can establish a conclusion about that larger
population. The conclusion is in the form of a series of probability statements,
each of which defines a range of possible values. This form of conclusion and
its uses will become clearer in some of the examples and exercises which follow.
The Evaluative Use of Standard Error
What does all this add up to? Well, here's a hypothetical practical illustration.
Suppose I wish to learn about the mathematical capabilities of all the Grade XII
students in Nanaimo. I have neither the money nor the time to arrange to have
them all tested. Thus, I organize a random sample of, say, 100 students and give
them a special test on their mathematical skills. I find that the average score
in the sample is 65, with a standard deviation of 16.74. What can I conclude on
the basis of this information about the average capabilities in mathematics for
all Grade XII students in Nanaimo?
I begin by calculating the standard error (or reading it off from the
Descriptive Statistics table generated by Excel, once I have entered the
observational data onto a worksheet). In this case the standard error is 1.67 marks.

Now, the average (mean) score in my sample was 65. And I know that if I analyzed many
similar samples, the averages of the samples would be normally distributed in a
curve where the standard deviation is equal to the standard error calculated
above (1.67 marks).
So if the average in my sample was 65, I can state that there is a .68 probability
that it falls within 1 standard error of the mean of the total population of all
the Nanaimo Grade XII students (either higher or lower). Thus I am 68 percent
certain that the mean score for all the students in Nanaimo on this mathematics
test is between (65 - 1.67) and (65 + 1.67), that is, between 63.33 and 66.67.
If I want to be more certain than this, I can state that there is a probability of
.95 (or that I am 95 percent certain) that the average for the entire Nanaimo
Grade XII population on this mathematics test will fall within 2 standard errors
of the sample mean, that is, between [65 - (2 x 1.67)] and [65 + (2 x 1.67)],
or between 61.66 and 68.34.
If I want to be even more confident, I can state with .99 probability (or 99 percent
certainty) that the average for the entire Nanaimo Grade XII population will be
within 3 standard errors of the sample mean.
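The Grade XII example can likewise be sketched in Python, using the rounded standard error of 1.67 marks from the text:

```python
from math import sqrt

sample_mean = 65     # average score in the sample of 100 students
sample_sd = 16.74    # standard deviation of the sample
n = 100              # students in the sample

# Standard error, rounded to 1.67 marks as in the text.
se = round(sample_sd / sqrt(n), 2)

# Ranges at 1, 2, and 3 standard errors from the sample mean.
for z, p in [(1, ".68"), (2, ".95"), (3, ".99")]:
    low, high = sample_mean - z * se, sample_mean + z * se
    print(f"{p} probability: between {low:.2f} and {high:.2f}")
```

Run as written, this reproduces the three ranges given above: 63.33 to 66.67, 61.66 to 68.34, and 59.99 to 70.01.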
Self-Test on Estimating the Population Average from a Sample
You are interested in finding out about the hours elementary school children in
School District 68 spend in organized recreational exercise outside of school.
You select a random sample of 50 elementary school students, obtain data about
organized recreational exercise for each of them, enter the data on an Excel
worksheet, and obtain the following result.
Mean time spent in organized recreational exercise (per week): 2.46 hr
Standard deviation in the sample: 2.01 hr
Use the method we have already gone through with the light bulbs and the Grade XII
students to produce a conclusion about the average hours of organized
recreational exercise for all elementary students in School District 68. State
the conclusion with .68 probability, with .95 probability, and with .99
probability (or with 68 percent certainty, with 95 percent certainty, and with
99 percent certainty).
For an answer to this self-test, see the end of this section of the module.
We have already briefly discussed the nature of the conclusion we have been drawing
from these statements about a total population based on what we measure in a
relatively small sample. These inferences consist of a range of values and a
mathematical figure of probability (e.g., .68 probability, a .95 probability).
Conclusions like this illustrate what is called a confidence level, a conclusion which
offers a range of values and a statement of probability: we conclude that there
is a p probability that the mean of the total population falls between figures x
and y. This might also be stated negatively: there is a certain probability that
the average score for the total population does not fall between x and y.
The figure for the probability (p) is determined by the distance the limits
of the range are from the mean of the sample (measured in standard errors or, to
use language we introduced in an earlier section, measured in the z-score).
As we saw in the last chapter, we can have 68 percent confidence (or p =
.68) that any value in a normal distribution will fall within one standard
deviation of the mean (i.e., have a z-score of between -1 and +1). We can
have a 95 percent confidence (p = .95) that any value in a normal
distribution will fall within 2 standard deviations of the mean, that is,
between a z-score of -2 and a z-score of +2. And we can have a 99
percent confidence (p = .99) that any value in a normal distribution will
fall between a z-score of -3 and a z-score of +3.
Note that, as we would expect, I can increase the confidence of my conclusions by
widening the range within which the value will fall. The more certain I wish to
be, the wider the range of values. If I want to narrow the range of values in my
conclusion, then I lower the confidence level.
This point about confidence levels is important in understanding the way in which the
media publish poll results. For example, when a newscaster says that a recent
poll has just revealed that 42 percent of the electorate would vote Liberal if
the election were held tomorrow, that remark will usually be accompanied by a
qualification like the following: "These results are considered accurate
within 2.5 percentage points nineteen times out of twenty." What this
qualification means is that the pollsters are 95 percent confident that (i.e.,
sure that in 19 cases out of 20) if the election were held tomorrow, the
Liberals would get 42 plus or minus 2.5 percent of the vote (i.e., between 39.5
and 44.5 percent of the vote). On the basis of their relatively small sample,
they are establishing a confidence level and a range within two standard errors.
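The newscaster's qualification can be unpacked in the same way. A small Python sketch (the 42 percent figure and the 2.5-point margin come from the example; treating the margin as roughly 2 standard errors is the assumption described above):

```python
poll_result = 42.0   # percent of the sample who would vote Liberal
margin = 2.5         # margin of error at the 95 percent level

# "Nineteen times out of twenty" is the 95 percent confidence level.
low, high = poll_result - margin, poll_result + margin
print(f"95 percent confident the true figure is between {low} and {high} percent")

# The implied standard error is about half the stated margin.
implied_se = margin / 2
print(implied_se)  # 1.25
```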
More Curious Observations
On the basis of what we have learned so far about making conclusions about a large
population on the basis of a single sample of more than 30, we can notice some
interesting further details about this very useful procedure made possible by
the calculation of the standard error.
First, the size of the confidence interval depends upon the size of the standard error
(which is a measure of the standard deviation in the distribution of sample
means). Thus, if we can lessen the standard error, we can diminish the range of
values in each confidence level (and thus provide more precise conclusions).
You may recall that we calculate the standard error from the sample, taking the
standard deviation of that sample and dividing the figure by the square root of
the number of observations in the sample. Since we calculate the standard error
by dividing by the square root of the number in the sample, increasing the
number in the sample may have only a small effect on decreasing the size of the
standard error.

A question I might like to consider is the following: in order to lessen the size
of the standard error, how much would I have to increase the size of my sample?
Or, alternatively, will increasing the size of my sample enable me to narrow the
range of the conclusion?
The answer, it turns out for reasons explained below, is that increasing the sample
size can indeed narrow the range of results, but that the increase in the sample
size has to be very large—so large, in fact, that it may prove to be too
costly and time consuming to implement.
For example, if we were dealing with a sample of 100 students in a study of their
skills on a test and if the standard deviation of the list of results in our
sample was, say, 16 marks, then we would calculate the standard error by
dividing the standard deviation by the square root of the number in the sample,
that is, 16 divided by the square root of 100, or 16 divided by 10, or 1.6.
Thus, in estimating the confidence intervals for the entire population of
students, we would be using the figure of 1.6 marks as the basis of our
intervals to calculate the ranges for .68, .95, and .99 probability.
Now, if we wanted a narrower range, in order to have a more precise result, we would
like to reduce the standard error (thus having a smaller interval). One way we
might like to do this is to increase the size of the sample. If we increase its
size, then we increase the size of its square root and therefore diminish the
standard error (which is produced by dividing the standard deviation by the
square root of the number in the sample).
But since we are dealing with the square root of the number in the sample, we will
have to increase the sample size considerably. For instance, in the example
above we dealt with a sample of 100 students and achieved a standard error of
1.6 by dividing the standard deviation of the sample, 16, by the square root of
100, or 10. If we wanted to reduce the standard error by half, we would have to
divide 16 by 20. And to be able to do this we would have to sample 400 students
(the square root of 400 is 20).
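This arithmetic is easy to check by machine. The sketch below uses Python rather than the handbook's Excel, purely as an illustration of the standard error calculation for the original sample of 100 and the quadrupled sample of 400:

```python
import math

def standard_error(sd, n):
    # standard deviation of the sample divided by
    # the square root of the number in the sample
    return sd / math.sqrt(n)

print(standard_error(16, 100))  # 1.6
print(standard_error(16, 400))  # 0.8 (quadrupling the sample halves the error)
```

Note that to halve the standard error again (to 0.4) we would need a sample of 1600, which illustrates how quickly the required sample size grows.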
What this means, in effect, is that in many cases it is not worth the effort to
increase the sample size in order to achieve more precise results. Since
selecting the sample information is the really time consuming part of the
analysis, it is generally more efficient to keep the sample relatively small
(provided it is over 30) and to concentrate on making it the best sample we can
achieve (i.e., least liable to bias).
This is not to say, of course, that the size of the sample is irrelevant. Obviously,
that is not the case. Increasing the size of the sample does reduce the standard
error and thus makes the conclusions more precise. In fact, mathematicians have
drawn up guidelines as to the most appropriate sizes for samples relative to the
size of the larger population they are intended to represent and to the level of
accuracy required of the sampling.
In this module, as mentioned before, we are not dealing with the complex rules for
proper sampling strategies (other than the few remarks previously in this
section). So we are not concerning ourselves with the problems of sampling
error. In the various examples we work through, we shall assume that the sample
is a good one and will not take into account the sampling error (as we should if we
were being statistically diligent).
But for interest only, you might like to see a list of the recommended sample sizes
for different populations. The table below, from a book on surveys, indicates
some recommended sample sizes:
Recommended Sample Sizes for Different Populations and Permissible Sampling Errors
[table not reproduced; one column header, Sampling Error Allowed, survives]
Working Through An Example
Let us review one more time the steps in making confidence generalizations about an
entire population from a single sample.
First we select a sampling strategy (normally using random sampling when the
total population is suitable for this process), select our sample (making sure
we have at least 30 separate observations in it), and collect the information.
Then, we enter the data on a spreadsheet (like Excel) and apply the Descriptive
Statistics tool in order to ascertain the mean and the standard error of the sample.
Finally, we make our conclusions at different confidence levels: 68 percent for
a range within 1 standard error of the mean of our sample (above and below), 95
percent for a range within 2 standard errors of the mean of the sample, and 99
percent for a range within 3 standard errors of the mean of the sample.
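The three steps can be sketched in a few lines of code. The example below uses Python (in place of Excel's Descriptive Statistics tool) and simulated data standing in for a real sample; the particular numbers are illustrative only:

```python
import math
import random
import statistics

# Step 1 (assumed, not shown): a random sample of at least 30 observations.
# Here we simulate 40 observations for illustration.
random.seed(1)
sample = [random.gauss(100, 15) for _ in range(40)]

# Step 2: find the mean and the standard error of the sample.
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))

# Step 3: state conclusions at the three confidence levels.
for k, level in [(1, 68), (2, 95), (3, 99)]:
    print(f"{level}%: between {mean - k * se:.1f} and {mean + k * se:.1f}")
```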
Suppose I wish to know (for purposes of comparison) the average score for all first-year
university students in British Columbia on a standard intelligence quotient (IQ)
test. Going through the steps outlined above, I complete steps 1 and 2 for a
sample of 100 students. The mean score of the sample is 112; the standard
deviation is 12 points.
From these two figures I can compute the standard error (the standard deviation
divided by the square root of the number in the sample): that comes out to 12
divided by 10 or 1.2 points.
Now I can make my conclusions at different confidence levels, as follows:
I am 68 percent certain that the average IQ score on this test for all first-year
university students in BC is within a range 1 standard error on either side of
the mean of my sample, that is, between 110.8 and 113.2.
I am 95 percent certain that the average IQ score on this test for all first-year
students in BC is within a range 2 standard errors on either side of the mean of
my sample, that is, between 109.6 and 114.4.
I am 99 percent certain that the average IQ score on this test for all first-year
students in BC is within a range 3 standard errors on either side of the mean of
my sample, that is, between 108.4 and 115.6.
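These three ranges are easy to verify by machine. A minimal Python sketch (standing in for the Excel calculation):

```python
import math

mean, sd, n = 112, 12, 100      # sample statistics from the IQ example
se = sd / math.sqrt(n)          # 12 / 10 = 1.2 points

for k, level in [(1, 68), (2, 95), (3, 99)]:
    print(f"{level}%: between {mean - k * se:.1f} and {mean + k * se:.1f}")
```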
Self-Test on Confidence Levels
Using the method outlined immediately above, try the two following problems.
We want to know the average pulse rate in a population of 1000 track athletes.
We sample the pulse rates of 50 athletes taken at random and calculate the mean
pulse rate of the sample to be 79.1 beats per minute, with a standard deviation
of 7.6 beats per minute. What can we conclude about the mean value (in beats per
minute) for the entire population of athletes? State your conclusion at three
different confidence intervals (at .68, .95, and .99 probability).
A sample study of the family incomes in Canada revealed the following: sample
size, 1600, mean family income of the sample--$51,300; standard deviation of the
sample--$8000. What can you infer about the mean family income for the entire
population at a confidence level of 95 percent?
Using a Table to Read for any Level of Confidence
Up to this point we have only dealt with three confidence levels: 68 percent (or
.68 probability), 95 percent (or .95 probability), and 99 percent (or .99
probability). We used these because they correspond to the ranges defined by 1,
2, and 3 standard deviations away from the mean (something we learned in the
previous section). In practice, however, we are not limited to just these three figures. We can
establish any level of confidence we want. But we will need to know the link
between the confidence level we want and the precise figure for the z-score
at that level (what this means will become clearer very soon).
If this puzzles you, let us go through the point step by step, as follows:
The normal curve (the shape of a normal distribution) indicates the relative
frequencies of all the values in the population we are studying. Thus, we can
imagine the area under the top line of the curve as representing the entire
population. If we think of the population under the curve as an area, then we can see
clearly that in a normal distribution the total population is divided in half by
the mean. There is thus a .5 probability in any normally distributed population
that a particular value will fall in the area to the right of the mean (i.e., in
the upper values), and a .5 probability that any particular member of the
population will fall to the left of the mean (i.e., in the lower half of the
distribution). In the previous chapter, we discussed how in the normally distributed curve, the
area under the curve is always divided in the same way by units of standard
deviation: 68 percent of the total population falls within 1 standard deviation
of the mean (34 percent on either side); 95 percent of the population falls
within 2 standard deviations of the mean (47.5 percent on either side); and 99 percent
of the total population falls within 3 standard deviations of the mean (49.5 percent on
either side). But clearly we are not confined to just 1, 2, or 3 standard deviations. There
are all sorts of possibilities in between them (e.g., 1.2 standard deviations,
0.7 standard deviations, and so on). And each of these will define a different
area under the normal curve. And each area, so defined, will include its own
percentage of the total population (and thus establish its own confidence
level). Now, the mathematics of calculating areas under the normal curve for all
distances away from the mean is complex and laborious. Fortunately, however,
mathematicians have drawn up tables from which we can simply read off
particular distances from the mean and their corresponding areas. Thus, we can
easily determine what level of confidence we want and find the distance
appropriate to it.
To carry out this procedure, we need a table which indicates precisely the areas of the
normal curve at various z-scores. Here is such a table. If it
looks a bit intimidating, don't worry. Read carefully the description of
how the table works in the paragraphs after it.
Table of the Area Under the Normal Curve at Different z-scores
(Note that this table is only for one half of the normal curve.)
[table not reproduced]
The table indicates in the extreme left hand column (in bold) the distance away from
the mean in standard deviation units (which, as mentioned before, is the same as
the z-score) to one decimal place. The columns from the second column
towards the right indicate the values for the second decimal place of that z-score
(e.g., 1.10, 1.11, 1.12, 1.13, 1.14, and so on).
The decimal figure in each cell indicates the area under the curve at that
particular z-score (for one half the normal curve). Thus, in the first line, the area under the normal curve at a z-score of 0.00
is .0000. This means that when we are exactly on the mean, the area under the
curve is 0 (since there is no distance between the mean and itself). If we move to the next column (to the right), the z-score is
0.01, and the corresponding area under the curve between the mean and this
distance away from it is 0.0040. Since we are dealing with only one half the
curve, the total area under the curve defined by a z-score of 0.01 on both sides
of the curve is twice the given value, 0.008 (or 0.8 percent of the total area
under the curve is within a z-score of 0.01 on either side of the mean).
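These tabled areas can be reproduced with the error function in Python's math module, using the standard relation that the area between the mean and a z-score is 0.5 × erf(z/√2). This is a sketch for checking the table, not part of the handbook's own method:

```python
import math

def area_from_mean(z):
    # area under the standard normal curve between the mean and z
    # (the quantity the z-table lists for one half of the curve)
    return 0.5 * math.erf(z / math.sqrt(2))

print(round(area_from_mean(0.01), 4))      # 0.004  (the .0040 in the table)
print(round(area_from_mean(1.00), 4))      # 0.3413
print(round(2 * area_from_mean(1.00), 4))  # 0.6827 (both sides of the mean)
```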
If you now check the area figure for a z-score of 1.00, you will notice that it
reads .3413. This means that of all the scores under the curve 34.13 percent of
them will fall between the mean and a z-score of 1 on one side of the curve. If
we want to include all the scores within 1 standard deviation of the mean on
both sides, then we would double this figure (i.e., to 68.26 percent). We have
been using the figure 68 percent as a convenient approximation of that value.
Notice that in the top lines of the table, where the scores are close to the mean, the
values for the areas increase as one moves across a row much more than they do
at the bottom of the table (for the z-scores further from the
mean). Obviously, that is the case because the normal curve is highest
close to the mean (with considerable area beneath it, so that increasing the z-score
takes in a significant additional area); as the z-score approaches 3, the normal
curve is very close to the axis, with almost no area beneath it. Hence,
increasing the z-score there increases the tabled values only very slowly.
Using the Table for Different Confidence Levels
The decimal numbers in the cells of the table also indicate the probability that any
value in a normal distribution will fall between the mean and the particular z-score
corresponding to the value.
For example, suppose we wanted to know the z-score which would give us a
confidence level of .75 (or 75 percent). Half of 75 percent is 37.5
(remember we are dealing with half the curve), which we express as a fraction as
.375. If we consult the table, we can locate the number closest to that
value in the cell corresponding to a z-score of 1.15 (the value in the
table is .3749). This information tells us that in a normal distribution,
I can be 75 percent certain that any value will fall within 1.15 standard
deviations above or below the mean.
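The same lookup can be done by machine: scan the two-decimal z-scores and keep the one whose half-curve area is closest to .375. A Python sketch, using the error function in place of the printed table:

```python
import math

def area_from_mean(z):
    # half-curve area between the mean and z, as the z-table lists it
    return 0.5 * math.erf(z / math.sqrt(2))

target = 0.375  # half of the desired 75 percent confidence level
best_z = min((i / 100 for i in range(301)),
             key=lambda z: abs(area_from_mean(z) - target))
print(best_z)  # 1.15
```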
Answers to Self-Test Sections
Answers to the Self-Test on Estimating the Population Average from a Sample
To find the Standard Error, we divide the Standard Deviation of the Sample (2.01 hr)
by the square root of the number in the sample. The square root of 50 is 7.07.
Therefore the Standard Error is 2.01 hr divided by 7.07 or .28 hr.
Thus, about the average time elementary school children in School District 68 spend on
organized recreational exercise out of school, I can make the following
conclusions: I am 68 percent certain that the average time falls between (2.46 + .28) and (2.46 -
.28) or between 2.74 hr and 2.18 hr. I am 95 percent certain that the average
time falls between 3.02 hr and 1.9 hr. And I am 99 percent certain that the
average time falls between 3.3 hr and 1.62 hr.
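The answer above can be checked with a few lines of Python, using the rounded standard error of .28 hr just as the text does:

```python
import math

mean, sd, n = 2.46, 2.01, 50
se = round(sd / math.sqrt(n), 2)    # 2.01 / 7.07, rounded to 0.28 hr

for k in (1, 2, 3):
    low, high = round(mean - k * se, 2), round(mean + k * se, 2)
    print(f"{low} hr to {high} hr")
```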
Notice here that the more confident I wish to be, the wider the range of values I have
to accept.
Answers to the Self-Test on Confidence Levels (Section Q)
The Standard Error of the sample is the Standard Deviation divided by the square
root of the number in the sample, that is, 7.6 divided by the square root of 50,
or 7.6 divided by 7.07, or 1.07 beats per minute. Thus, I can conclude the
following about the population of 1000 athletes: I am 68 percent certain (or the
probability is .68) that the average pulse rate is between (79.1 + 1.07) and
(79.1 - 1.07) or between 80.17 and 78.03 beats per minute. I am 95 percent
certain (or the probability is .95) that the average pulse rate is between 81.24
and 76.96 beats per minute. And I am 99 percent certain (or the probability is
.99) that the average pulse rate is between 82.31 and 75.89 beats per minute.
The Standard Error of the sample is the Standard Deviation (8000) divided by the
square root of the number in the sample (1600) or 8000 divided by 40, or 200
dollars. Thus, I can be 95 percent certain that the average family income is
between $51,700 and $50,900.
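Since the square root of 1600 is exactly 40, this one is quick to confirm with a minimal Python check:

```python
import math

mean, sd, n = 51300, 8000, 1600
se = sd / math.sqrt(n)               # 8000 / 40 = 200 dollars

print(mean - 2 * se, mean + 2 * se)  # 50900.0 51700.0
```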
Notes to Section Six
A famous example of the sort of bias which can occur in non-random
sampling is Shere Hite's book Women and Love. The author mailed out 100,000
questionnaires to women's organizations (a Haphazard or Opportunity sample).
Only 4.5 percent were filled out and returned, so that the results were biased
in favour of women who belong to such organizations and who were sufficiently
motivated to respond.
This very important property holds whether or not the population from which
the samples are taken is normally distributed. The frequency distribution
of the Sample Means from any population will always follow a normal
distribution.