Skewness, Kurtosis, and the Normal Curve
Skewness
In
everyday language, the terms “skewed” and “askew”
are used to refer to
something that is
out of line or distorted on one side. When
referring to the shape of
frequency or
probability distributions,
“skewness”
refers to asymmetry of the distribution.
A distribution with an asymmetric tail
extending out to the right is referred to as
“positively skewed” or “skewed to the
right,” while a distribution with an asymmetric
tail
extending out to the left is
referred to as “negatively skewed” or
“skewed to the left.”
Skewness can
range from minus infinity to positive infinity.
Karl Pearson (1895) first suggested measuring skewness by standardizing the
difference between the mean and the mode, that is,
$sk = \frac{\mu - \text{mode}}{\sigma}$. Population modes are not well
estimated from sample modes, but one can estimate the difference between the
mean and the mode as being three times the difference between the mean and the
median (Stuart & Ord, 1994), leading to the following estimate of skewness:
$sk_{est} = \frac{3(M - \text{median})}{s}$. Many statisticians use this
measure but with the ‘3’ eliminated, that is,
$sk = \frac{M - \text{median}}{s}$. This statistic ranges from -1 to +1.
Absolute values above 0.2 indicate great skewness (Hildebrand, 1986).
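The median-based estimate is easy to compute by hand or in a few lines of code. The following is only a minimal sketch in Python: the function name, the `keep_three` switch, and the scores are my own illustrative choices, and I assume that s is the sample standard deviation (n - 1 in the denominator).

```python
import statistics

def pearson_median_skewness(scores, keep_three=True):
    """Median-based skewness estimate: 3(M - median)/s, or (M - median)/s."""
    m = statistics.mean(scores)
    median = statistics.median(scores)
    s = statistics.stdev(scores)  # sample SD, n - 1 in the denominator (assumption)
    sk = (m - median) / s
    return 3 * sk if keep_three else sk

# Hypothetical scores with a long right tail (not taken from the text).
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
print(pearson_median_skewness(scores))                    # version with the 3
print(pearson_median_skewness(scores, keep_three=False))  # version without the 3
```

Both values come out positive for these scores, since the single large score pulls the mean above the median.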
Skewness has also been defined with respect to the third moment about the
mean:

$$\gamma_1 = \frac{\sum (X - \mu)^3}{n\sigma^3},$$

which is simply the expected value of the distribution of cubed z scores.
Skewness measured in this way is sometimes referred to as “Fisher’s
skewness.” When the deviations from the mean are greater in one direction
than in the other direction, this statistic will deviate from zero in the
direction of the larger deviations.
From sample data, Fisher’s skewness is most often estimated by:

$$g_1 = \frac{n \sum z^3}{(n - 1)(n - 2)}.$$

For large sample sizes (n > 150), $g_1$ may be distributed approximately
normally, with a standard error of approximately $\sqrt{6/n}$. While one could
use this sampling distribution to construct confidence intervals for or tests
of hypotheses about $\gamma_1$, there is rarely any value in doing so.
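For readers who want to see the arithmetic, here is a hedged sketch of the g1 estimate and its approximate standard error. The function names and the scores are hypothetical, and I assume the z scores are formed with the sample standard deviation (n - 1 in the denominator), which is my understanding of how statistical packages apply this formula.

```python
import math
import statistics

def fisher_skewness_g1(scores):
    """Sample estimate g1 = n * sum(z^3) / ((n - 1)(n - 2))."""
    n = len(scores)
    m = statistics.mean(scores)
    s = statistics.stdev(scores)  # sample SD (assumption, see the lead-in)
    sum_z3 = sum(((x - m) / s) ** 3 for x in scores)
    return n * sum_z3 / ((n - 1) * (n - 2))

def approx_se_g1(n):
    """Large-sample approximation sqrt(6/n) to the standard error of g1."""
    return math.sqrt(6 / n)

# Hypothetical scores with a long right tail (not taken from the text).
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
print(fisher_skewness_g1(scores))
print(approx_se_g1(150))  # about 0.2 at the n = 150 threshold mentioned above
```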
The most commonly used measures of
skewness (those discussed here) may
produce surprising results, such as a
negative value when the shape of the distribution
appears skewed to the right. There may
be superior alternative measures not
commonly used (Groeneveld & Meeden,
1984).
It is important for
behavioral researchers to notice skewness when it
appears in
their data. Great skewness
may motivate the researcher to investigate
outliers. When
making decisions about
which measure of location to report (means being
drawn in the
direction of the skew) and
which inferential statistic to employ (one which
assumes
normality or one which does
not), one should take into consideration the
estimated
skewness of the population.
Normal distributions have zero skewness. Of
course, a
distribution can be perfectly
symmetric but far from normal. Transformations
commonly
employed to reduce (positive)
skewness include square root, log, and reciprocal
transformations.
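A small sketch may help show what these transformations do to skewness. The scores below are hypothetical (a doubling series, chosen because it is easy to reason about), the function name is mine, and skewness is computed as the mean of cubed z scores with the population standard deviation.

```python
import math

def population_skewness(scores):
    """Mean of cubed z scores, using the population standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    return sum(((x - mean) / sigma) ** 3 for x in scores) / n

raw = [1, 2, 4, 8, 16, 32, 64]  # hypothetical, positively skewed, all scores > 0
transformed = {
    "raw": raw,
    "square root": [math.sqrt(x) for x in raw],
    "log": [math.log(x) for x in raw],
    "reciprocal": [1 / x for x in raw],
}
for name, values in transformed.items():
    print(f"{name:12s} skewness = {population_skewness(values):6.3f}")
```

With these particular scores the square root reduces the skewness and the log removes it, because the logs are equally spaced. The reciprocal does not help here, since the reciprocals of a doubling series are just a rescaled doubling series. Which transformation works best depends on the data at hand.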
Also see
Skewness and the Relative Positions of
Mean, Median, and Mode
Kurtosis
Karl Pearson (1905) defined a distribution’s degree of kurtosis as
$\eta = \beta_2 - 3$, where

$$\beta_2 = \frac{\sum (X - \mu)^4}{n\sigma^4},$$

the expected value of the distribution of Z scores which have been raised to
the 4th power.
$\beta_2$ is often referred to as “Pearson’s kurtosis,” and $\beta_2 - 3$
(often symbolized with $\gamma_2$) as “kurtosis excess” or “Fisher’s
kurtosis,” even though it was Pearson who defined kurtosis as $\beta_2 - 3$.
An unbiased estimator for $\gamma_2$ is

$$g_2 = \frac{n(n + 1)\sum Z^4}{(n - 1)(n - 2)(n - 3)} - \frac{3(n - 1)^2}{(n - 2)(n - 3)}.$$

For large sample sizes (n > 1000), $g_2$ may be distributed approximately
normally, with a standard error of approximately $\sqrt{24/n}$ (Snedecor &
Cochran, 1967). While one could use this sampling distribution to construct
confidence intervals for or tests of hypotheses about $\gamma_2$, there is
rarely any value in doing so.
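The g2 formula has enough pieces that a short sketch may be useful. As before, this is only an illustration: the function names and scores are hypothetical, and I assume the Z scores are computed with the sample standard deviation (n - 1 in the denominator).

```python
import math
import statistics

def kurtosis_g2(scores):
    """Sample estimate g2 of excess kurtosis (requires n > 3)."""
    n = len(scores)
    m = statistics.mean(scores)
    s = statistics.stdev(scores)  # sample SD (assumption, see the lead-in)
    sum_z4 = sum(((x - m) / s) ** 4 for x in scores)
    first = n * (n + 1) * sum_z4 / ((n - 1) * (n - 2) * (n - 3))
    second = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first - second

def approx_se_g2(n):
    """Large-sample approximation sqrt(24/n) to the standard error of g2."""
    return math.sqrt(24 / n)

# Hypothetical scores with one extreme value (not taken from the text).
scores = [2, 4, 4, 5, 5, 5, 6, 6, 8, 20]
print(kurtosis_g2(scores))
print(approx_se_g2(1000))
```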
Pearson (1905) introduced kurtosis as a measure of how flat the top of a
symmetric distribution is when compared to a normal distribution of the same
variance. He referred to more flat-topped distributions ($\gamma_2 < 0$) as
“platykurtic,” less flat-topped distributions ($\gamma_2 > 0$) as
“leptokurtic,” and equally flat-topped distributions as “mesokurtic”
($\gamma_2 \approx 0$). Kurtosis is actually more influenced by scores in the
tails of the distribution than scores in the center of a distribution
(DeCarlo, 1997). Accordingly, it is often appropriate to describe a
leptokurtic distribution as “fat in the tails” and a platykurtic distribution
as “thin in the tails.”
Student (1927,
Biometrika
,
19
, 160) published a cute
description of kurtosis,
which I quote
here: “Platykurtic curves have shorter ‘tails’
than the normal curve of
error and
leptokurtic longer ‘tails.’ I
myself
bear in mind the meaning of the words by
the above
memoria
technica
, where the first figure
represents platypus and the second
kangaroos, noted for lepping.” Please
point your browser to
./jeff570/
, scroll down to
“kurtosis,” and look at Student’s
drawings.
Moors (1986) demonstrated that $\beta_2 = Var(Z^2) + 1$. Accordingly, it may
be best to treat kurtosis as the extent to which scores are dispersed away
from the shoulders of a distribution, where the shoulders are the points where
$Z^2 = 1$, that is, $Z = \pm 1$. Balanda and MacGillivray (1988) wrote “it is
best to define kurtosis vaguely as the location- and scale-free movement of
probability mass from the shoulders of a distribution into its centre and
tails.” If one starts with a normal distribution and moves scores from the
shoulders into the center and the tails, keeping variance constant, kurtosis
is increased. The distribution will likely appear more peaked in the center
and fatter in the tails, like a Laplace distribution ($\gamma_2 = 3$) or
Student’s t with few degrees of freedom ($\gamma_2 = \frac{6}{df - 4}$).
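Moors’ identity follows from the fact that the mean of Z squared is 1, so the variance of Z squared is the mean of Z to the 4th power minus 1. The sketch below checks this numerically on hypothetical scores; the function names are mine, and population moments (n in the denominator) are assumed throughout.

```python
def z_scores(scores):
    """Z scores using the population standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    sigma = (sum((x - mean) ** 2 for x in scores) / n) ** 0.5
    return [(x - mean) / sigma for x in scores]

def beta2(scores):
    """Pearson's kurtosis: the mean of Z raised to the 4th power."""
    z = z_scores(scores)
    return sum(v ** 4 for v in z) / len(z)

def var_z_squared_plus_one(scores):
    """Variance of Z squared, plus 1 (Moors' form of Pearson's kurtosis)."""
    z2 = [v ** 2 for v in z_scores(scores)]
    mean_z2 = sum(z2) / len(z2)  # equals 1 apart from rounding error
    return sum((v - mean_z2) ** 2 for v in z2) / len(z2) + 1

# Hypothetical scores (not taken from the text).
scores = [2, 4, 4, 5, 5, 5, 6, 6, 8, 20]
print(beta2(scores))
print(var_z_squared_plus_one(scores))  # agrees with beta2 up to rounding
```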
Starting again with a normal distribution, moving scores from the tails and
the center to the shoulders will decrease kurtosis. A uniform distribution
certainly has a flat top, with $\gamma_2 = -1.2$, but $\gamma_2$ can reach a
minimum value of $-2$ when two score values are equally probable and all other
score values have probability zero (a rectangular U distribution, that is, a
binomial distribution with n = 1, p = .5). One might object that the
rectangular U distribution has all of its scores in the tails, but closer
inspection will reveal that it has no tails, and that all of its scores are in
its shoulders, exactly one standard deviation from its mean. Values of $g_2$
less than that expected for a uniform distribution ($-1.2$) may suggest that
the distribution is bimodal (Darlington, 1970), but bimodal distributions can
have high kurtosis if the modes are distant from the shoulders.
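These two benchmark values can be checked directly. In the sketch below the function name is hypothetical, population moments are used, and the uniform distribution is approximated by 1001 equally likely integer scores (my stand-in for a continuous uniform).

```python
def excess_kurtosis(scores):
    """Population kurtosis excess: fourth moment / squared variance, minus 3."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    return m4 / m2 ** 2 - 3

two_point = [0, 1]                # binomial with n = 1, p = .5
near_uniform = list(range(1001))  # discrete stand-in for a uniform distribution

print(excess_kurtosis(two_point))               # -2.0
print(round(excess_kurtosis(near_uniform), 3))  # -1.2
```

In the two-point distribution each score is exactly one standard deviation from the mean (the mean is .5 and the standard deviation is .5), which is why it has shoulders but no tails.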
One leptokurtic
distribution we shall deal with is Student’s
t
distribution. The
kurtosis of
t
is
infinite when
df
< 5, 6 when
df
= 5, 3 when
df
= 6. Kurtosis decreases further (towards zero) as df increases and t
approaches the normal distribution.
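The values just given follow from the kurtosis excess of t being 6/(df - 4) when df exceeds 4. A minimal sketch, with a hypothetical function name, and with kurtosis simply reported as infinite for df of 4 or less, following the description above:

```python
import math

def t_excess_kurtosis(df):
    """Kurtosis excess of Student's t: 6/(df - 4) for df > 4.

    For df <= 4 this sketch returns infinity, following the text's
    description of kurtosis as infinite when df < 5.
    """
    if df <= 4:
        return math.inf
    return 6 / (df - 4)

for df in (4, 5, 6, 10, 30, 100):
    print(df, t_excess_kurtosis(df))
```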
Kurtosis is
usually of interest only when dealing with
approximately symmetric
distributions.
Skewed distributions are always leptokurtic
(Hopkins & Weeks, 1990).
Among the several alternative measures of kurtosis that have been proposed
(none of which has often been employed) is one which adjusts the measurement
of kurtosis to remove the effect of skewness (Blest, 2003).
There is much confusion about how
kurtosis is related to the shape of
distributions. Many authors of
textbooks have asserted that kurtosis is a measure
of
the peakedness of distributions,
which is not strictly true.
It is
easy to confuse low kurtosis with high variance,
but distributions with
identical
kurtosis can differ in variance, and distributions
with identical variances can
differ in
kurtosis. Here are some simple distributions that
may help you appreciate that
kurtosis
is, in part, a measure of tail heaviness relative
to the total variance in the
distribution (remember the “$\sigma^4$” in the denominator).
Table 1. Kurtosis for 7 Simple Distributions Also Differing in Variance

              Frequency at X =
Distribution     5     10     15    Kurtosis    Variance
freq A          20      0     20      -2.0        25
freq B          20     10     20      -1.75       20
freq C          20     20     20      -1.5        16.6
freq D          10     20     10      -1.0        12.5
freq E           5     20      5       0.0         8.3
freq F           3     20      3       1.33        5.77
freq G           1     20      1       8.0         2.27
[Figure: panels labeled “Platykurtic” and “Leptokurtic”]
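The kurtosis and variance columns of Table 1 can be reproduced from the frequencies. The sketch below is one way to do so, assuming population moments (n in the denominator); the function name and the dictionary layout are my own.

```python
def variance_and_kurtosis(freqs, values=(5, 10, 15)):
    """Population variance and kurtosis excess from a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, values)) / n
    m2 = sum(f * (x - mean) ** 2 for f, x in zip(freqs, values)) / n
    m4 = sum(f * (x - mean) ** 4 for f, x in zip(freqs, values)) / n
    return m2, m4 / m2 ** 2 - 3

# Frequencies at X = 5, 10, 15 for distributions A through G of Table 1.
table_1 = {
    "A": (20, 0, 20),
    "B": (20, 10, 20),
    "C": (20, 20, 20),
    "D": (10, 20, 10),
    "E": (5, 20, 5),
    "F": (3, 20, 3),
    "G": (1, 20, 1),
}
for label, freqs in table_1.items():
    variance, kurtosis = variance_and_kurtosis(freqs)
    print(f"{label}: kurtosis = {kurtosis:6.2f}, variance = {variance:6.2f}")
```

In these three-point distributions Pearson’s kurtosis works out to the reciprocal of the proportion of scores at 5 and 15, so thinning out those scores raises kurtosis even as it lowers variance.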
When I presented these
distributions to my colleagues and graduate
students
and asked them to identify
which had the least kurtosis and which the most,
all said A
has the most kurtosis, G the
least (excepting those who refused to answer).
But in fact A has the least kurtosis ($-2$ is the smallest possible value of
kurtosis) and G the most.
The trick is to do a mental frequency plot where the abscissa is in standard
deviation units. In the maximally platykurtic distribution A, which initially
appears to have all its scores in its tails, no score is more than one
$\sigma$ away from the mean - that is, it has no tails! In the leptokurtic
distribution G, which seems only to have a few scores in its tails, one must
remember that those scores (5 & 15) are much farther away from the mean
(3.3 $\sigma$) than are the 5’s & 15’s in distribution A. In fact, in G nine
percent of the scores are more than three $\sigma$ from the mean, much more
than you would expect in a mesokurtic distribution (like a normal
distribution), thus G does indeed have fat tails.
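The “mental frequency plot” can also be done numerically. This sketch, with hypothetical function names and population standard deviations, expresses the extreme scores of A and G in standard deviation units and compares G’s tail proportion with what a normal distribution would give.

```python
import math

def sigma_of(freqs, values=(5, 10, 15)):
    """Mean and population standard deviation of a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, values)) / n
    var = sum(f * (x - mean) ** 2 for f, x in zip(freqs, values)) / n
    return mean, math.sqrt(var)

def share_beyond(freqs, k, values=(5, 10, 15)):
    """Proportion of scores strictly more than k standard deviations from the mean."""
    mean, sigma = sigma_of(freqs, values)
    far = sum(f for f, x in zip(freqs, values) if abs(x - mean) / sigma > k)
    return far / sum(freqs)

dist_a = (20, 0, 20)  # frequencies at X = 5, 10, 15 for distribution A
dist_g = (1, 20, 1)   # frequencies for distribution G

print(5 / sigma_of(dist_a)[1])      # distance of 5 and 15 from the mean in A, in sigmas
print(5 / sigma_of(dist_g)[1])      # the same distance in G, in sigmas
print(share_beyond(dist_a, 1))      # proportion of A beyond one sigma
print(share_beyond(dist_g, 3))      # proportion of G beyond three sigmas
print(math.erfc(3 / math.sqrt(2)))  # normal proportion beyond three sigmas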
If you were to ask SAS to compute kurtosis on the A scores in Table 1, you
would get a value less than $-2.0$, less than the lowest possible population
kurtosis. Why? SAS assumes your data are a sample and computes the $g_2$
estimate of population kurtosis, which can fall below $-2.0$.
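One can check this without SAS by applying the g2 estimator to the 40 scores of distribution A (twenty 5’s and twenty 15’s). The sketch below is not SAS code; it is a Python illustration with a hypothetical function name, assuming Z scores based on the sample standard deviation.

```python
import statistics

def kurtosis_g2(scores):
    """Sample estimate g2 of kurtosis excess (requires n > 3)."""
    n = len(scores)
    m = statistics.mean(scores)
    s = statistics.stdev(scores)  # sample SD, n - 1 in the denominator (assumption)
    sum_z4 = sum(((x - m) / s) ** 4 for x in scores)
    first = n * (n + 1) * sum_z4 / ((n - 1) * (n - 2) * (n - 3))
    second = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first - second

scores_a = [5] * 20 + [15] * 20  # distribution A from Table 1
print(round(kurtosis_g2(scores_a), 3))  # about -2.108, below the population minimum of -2
```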
Sune Karlsson, of the Stockholm School
of Economics, has provided me with the
following modified example which holds
the variance approximately constant, making it
quite clear that a higher kurtosis
implies that there are more extreme observations
(or
that the extreme observations are
more extreme). It is also evident that a higher
kurtosis also implies that the
distribution is more ‘single-peaked’ (this would be even
more evident if the sum of the
frequencies were constant). I have highlighted the
rows
representing the shoulders of the
distribution so that you can see that the increase
in
kurtosis is associated with a
movement of scores away from the shoulders.