Heredity (2010) 105, 257–267
© 2010 Macmillan Publishers Limited. All rights reserved 0018-067X/10 $32.00
ORIGINAL ARTICLE
/hdy
Statistical properties of QTL linkage mapping in biparental genetic populations
H Li¹,², S Hearne³, M Bänziger⁴, Z Li² and J Wang¹
¹Institute of Crop Science, The National Key Facility for Crop Gene Resources and Genetic Improvement and CIMMYT China, Chinese Academy of Agricultural Sciences, Beijing, PR China; ²School of Mathematical Sciences, Beijing Normal University, Beijing, PR China; ³International Institute of Tropical Agriculture (IITA), Ibadan, Oyo State, Nigeria and ⁴International Maize and Wheat Improvement Center (CIMMYT), Mexico, DF, Mexico
Quantitative trait gene or locus (QTL) mapping is routinely used in genetic analysis of complex traits. Especially in practical breeding programs, questions remain such as how large a population and what level of marker density are needed to detect QTLs that are useful to breeders, and how likely it is that the target QTL will be detected with the data set in hand. Some answers can be found in studies on conventional interval mapping (IM). However, it is not clear whether the conclusions obtained from IM are the same as those obtained using other methods. Inclusive composite interval mapping (ICIM) is a useful step forward that highlights the importance of model selection and interval testing in QTL linkage mapping. In this study, we investigate the statistical properties of ICIM compared with IM through simulation. Results indicate that IM is less responsive to marker density and population size (PS). The increase in marker density helps ICIM identify independent QTLs explaining >5% of phenotypic variance. When PS is >200, ICIM achieves unbiased estimations of QTL position and effect. For smaller PS, there is a tendency for the QTL to be located toward the center of the chromosome, with its effect overestimated. The use of dense markers allows linked QTLs to be separated by empty marker intervals and thus improves mapping efficiency. However, only large-sized populations can take advantage of densely distributed markers. These findings are different from those previously found in IM, indicating great improvements with ICIM.
Heredity (2010) 105, 257–267; doi:10.1038/hdy.2010.56; published online 12 May 2010
Keywords: confidence interval; false discovery rate; inclusive composite interval mapping; population size; statistical power
Introduction
Quantitative trait gene or locus (QTL) mapping has become a routine approach for genetic studies of complex traits in plants, animals and humans because of the availability of high-throughput molecular markers. In comparison with association mapping, QTL linkage mapping in animals and humans is normally based on pedigree data, but in plants it is more often based on biparental genetic populations. Statistical methods for QTL linkage mapping have been extensively studied (Lander and Botstein, 1989; Darvasi et al., 1993; Zeng, 1994; Whittaker et al., 1996; Piepho, 2000; Sen and Churchill, 2001; Xu, 2003; Bogdan and Doerge, 2005; Li et al., 2007; Wang, 2009), and composite interval mapping (CIM) proposed by Zeng (1994) represents one of the most commonly used methods. Recently, Li et al. (2007) found that CIM resulted in biased mapping results because of the simultaneous estimation of QTL and background effects in the implementation algorithm. Inclusive composite interval mapping (ICIM) was then proposed (Li et al., 2007; Wang, 2009) to deal with this problem while retaining other advantages related to CIM. Major advantages of ICIM were summarized as follows: (1) ICIM controls the sampling variance better; (2) it makes the background marker selection process easier and simpler; (3) it gives clearly high logarithm of the odds (LOD) scores at chromosomal regions with QTL but rather low LOD scores (that is, close to 0) in regions where no QTLs are located, thereby increasing mapping power and decreasing the false discovery rate (FDR); (4) it is robust for mapping parameters; (5) it can be extended to map digenic epistatic QTLs regardless of whether the two interacting QTLs have significant additive effects or not; and (6) the expectation and maximization (EM) algorithm used in ICIM has a high convergence speed and is therefore less computing intensive (Li et al., 2007, 2008; Zhang et al., 2008).
Correspondence: Dr J Wang, Institute of Crop Science, Chinese Academy of Agricultural Sciences, No. 12 Zhongguancun South Street, Beijing 100081, PR China.
E-mail: wangjk@
Received 4 December 2009; revised 19 March 2010; accepted 30 March 2010; published online 12 May 2010
Available mapping methods have their own statistical properties and power for detecting QTL. Factors influencing the statistical power of each method include mapping population size (PS), marker density, significance level in declaring the existence of QTL, contribution of the segregating QTL to the observed phenotypic variance and genetic distances of QTL to markers. There are several simulation studies on how these factors affect the detection power of interval mapping (IM). Darvasi et al. (1993) investigated the effect of marker density in a backcross population, and concluded that reducing marker spacing below 10 or 20 cM does not provide additional gains, regardless of PS and gene effect. At 20 cM marker density and assuming QTLs have equal effects with all positive alleles from one parent, Beavis (1994) showed that the estimated effects with correctly identified QTLs were greatly overestimated if only 100 progeny were evaluated, slightly overestimated if 500 progeny were evaluated and fairly close to the actual magnitude when 1000 progeny were evaluated; this was statistically explained by Xu (2003). Using an analytical method, Piepho (2000) showed that the power of QTL detection and the standard errors of effect estimates are little affected by an increase in marker density beyond 10 cM. The bias of estimators of QTL effects and locations from IM was discussed by Bogdan and Doerge (2005). On the basis of multiple interval mapping, Mayer et al. (2004) studied the accuracy of position and effect estimates of linked QTLs in F2 populations by simulation. Some theoretical and simulation studies have also been conducted on the confidence interval of IM (Visscher et al., 1996; Dupuis and Siegmund, 1999). Recently, Bogdan et al. (2008) showed the influence of marker density on the detection power of small- or medium-sized QTLs by a modified version of the Bayesian information criterion.
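The overestimation of effects in small populations described above can be illustrated with a small Monte Carlo sketch. It is neither IM nor ICIM, and it does not reproduce the design of any cited study. It assumes a single fully informative marker located exactly at one QTL, two genotype classes coded −1 and +1, a total phenotypic variance of one, and a hypothetical detection rule on the t statistic (|t| > 3.0). The function name `simulate_effect_ratio` and all parameter values are illustrative assumptions.

```python
import math
import random


def simulate_effect_ratio(n, pve, reps, t_threshold, rng):
    """Return (power, mean |estimated effect| / true effect over detected runs)."""
    a = math.sqrt(pve)            # true additive effect when total variance is 1
    sigma = math.sqrt(1.0 - pve)  # residual standard deviation
    detected = []
    for _ in range(reps):
        groups = {-1: [], 1: []}
        for _ in range(n):
            g = rng.choice((-1, 1))  # genotype at the QTL (assumed 1:1 segregation)
            groups[g].append(a * g + rng.gauss(0.0, sigma))
        minus, plus = groups[-1], groups[1]
        if len(minus) < 2 or len(plus) < 2:
            continue  # degenerate sample, skip this replicate
        m_minus = sum(minus) / len(minus)
        m_plus = sum(plus) / len(plus)
        a_hat = (m_plus - m_minus) / 2.0
        # Pooled residual variance and the standard error of a_hat.
        ss = sum((y - m_minus) ** 2 for y in minus)
        ss += sum((y - m_plus) ** 2 for y in plus)
        s2 = ss / (n - 2)
        se = 0.5 * math.sqrt(s2 * (1.0 / len(minus) + 1.0 / len(plus)))
        if abs(a_hat / se) > t_threshold:  # hypothetical detection rule
            detected.append(abs(a_hat))
    power = len(detected) / reps
    ratio = sum(detected) / len(detected) / a if detected else float("nan")
    return power, ratio


if __name__ == "__main__":
    rng = random.Random(2010)  # fixed seed so the sketch is repeatable
    for n in (100, 500, 1000):
        power, ratio = simulate_effect_ratio(n, 0.03, 300, 3.0, rng)
        print(f"PS={n:4d}  power={power:.2f}  estimated/true effect={ratio:.2f}")
```

Under these assumptions, the replicates that pass the threshold at small PS are mostly those in which sampling error inflated the estimate, so the mean estimate among detected QTLs exceeds the true effect. At larger PS nearly every replicate passes and the ratio approaches one.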
ICIM has superior genetic and statistical properties, which may represent an important improvement in QTL linkage mapping. It may be misleading to assume that the influence on ICIM of experimental parameters such as PS, QTL effect and marker density is the same as has been found in IM. Our objectives in this study were (1) to investigate the effect of genetic effect, PS and marker density on statistical power, position and effect estimations of ICIM and (2) to provide practical and statistical tables of probabilities and confidence intervals so that a QTL can be identified in mapping populations of various sizes.
Materials and methods
Genetic models used in simulation
In this paper, we considered a hypothetical genome consisting of 10 chromosomes. Each chromosome was 160 cM in length, similar to the maize genome. Four marker densities (MD) were used (that is, MD = 40, 20, 10 and 5 cM) from sparse to dense, which corresponded to 5, 9, 17 and 33 evenly distributed markers on each chromosome. Two genetic models (Tables 1 and 2) were simulated.
In the first genetic model (Table 1), there were eight independent QTLs, that is, IQ1–IQ8, with different levels of additive effects on a quantitative trait of interest. IQ1 had the smallest genetic effect, explaining only 1% of phenotypic variation, that is, phenotypic variance explained (PVE) = 1%, whereas IQ8 had the largest effect, explaining 30% of phenotypic variation, that is, PVE = 30%. The eight QTLs were distributed on different chromosomes, and no interactions between QTLs were considered. The error variance was set at 0.25, for a total phenotypic variance of one. Thus, the additive effect of a QTL was equal to the square root of the corresponding PVE (Table 1). Broad-sense heritability of this quantitative trait was therefore 0.75, which is the sum of PVE because none of the QTLs were linked.

Table 1 One genetic model consisting of eight independent QTLs

QTL   Chromosome   Position (cM)   Additive effect   PVE (%)
IQ1   1            25              0.1000            1
IQ2   2            32              0.1414            2
IQ3   3            39              0.1732            3
IQ4   4            46              0.2000            4
IQ5   5            53              0.2236            5
IQ6   6            60              0.3162            10
IQ7   7            67              0.4472            20
IQ8   8            74              0.5477            30

Abbreviations: PVE, phenotypic variance explained; QTL, quantitative trait locus.
The genome consists of 10 chromosomes, each 160 cM in length. The eight QTLs were represented by IQ1–IQ8. Each QTL was represented by chromosome number, position (cM), additive genetic effect and proportion of PVE by the QTL. The phenotypic variance was fixed at 1.0, and the additive effect of a QTL was equal to the square root of the corresponding PVE. Broad-sense heritability was set at 0.75, which is the sum of PVE, as none of the QTLs are linked. Therefore, the error variance was 0.25.

Table 2 One genetic model of two linked QTLs

Linkage     Position (cM)    Additive effect      Genetic            Error            Heritability   PVE (%) of
phase       LQ1     LQ2      LQ1       LQ2        variance (V_G)^a   variance (V_e)   (H^2)^b        each QTL^c
Coupling    22      32       0.3162    0.3162     0.3637             0.8              0.3125         8.59
Coupling    22      42       0.3162    0.3162     0.3340             0.8              0.2945         8.82
Coupling    22      52       0.3162    0.3162     0.3097             0.8              0.2791         9.01
Repulsive   22      32       0.3162    −0.3162    0.0362             0.8              0.0433         11.96
Repulsive   22      42       0.3162    −0.3162    0.0659             0.8              0.0761         11.55
Repulsive   22      52       0.3162    −0.3162    0.0902             0.8              0.1013         11.23
Abbreviations: PVE, phenotypic variance explained; QTL, quantitative trait locus.
The genome consists of 10 chromosomes, each 160 cM in length. The two QTLs were represented by LQ1 and LQ2. They were represented by their positions (cM) on chromosome 1, additive genetic effects, total genetic variance, error variance, broad-sense heritability and proportions of the PVE.
a $V_G = a_1^2 + a_2^2 + 2(1 - 2r)a_1 a_2$, where $a_1$ and $a_2$ are the additive effects of LQ1 and LQ2, respectively, and $r$ is the recombination frequency between LQ1 and LQ2. Haldane mapping function is used to convert genetic distance to recombination frequency.
b Broad-sense heritability was calculated as $H^2 = V_G/(V_G + V_e)$.
c PVE of LQ1 was calculated as $a_1^2/(V_G + V_e)$, and PVE of LQ2 was calculated as $a_2^2/(V_G + V_e)$. As $a_1^2 = a_2^2$ in the linkage model, LQ1 and LQ2 have the same PVE. Error variance was fixed at 0.8. If LQ1 and LQ2 were not linked, each would explain 10% of the phenotypic variance, and the heritability would be 0.2.
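The derived quantities in Tables 1 and 2 can be recomputed from the stated definitions. The following is a minimal sketch, not the authors' simulation code. The function names are hypothetical, and it assumes that the repulsion phase is expressed by giving LQ2 an additive effect of opposite sign to LQ1, which is what makes the genetic variance formula return the smaller values reported for the repulsive rows.

```python
import math


def marker_count(chrom_length_cm, spacing_cm):
    """Evenly spaced markers on one chromosome, counting both ends."""
    return int(round(chrom_length_cm / spacing_cm)) + 1


def additive_effect(pve, phenotypic_variance=1.0):
    """Additive effect whose square equals the variance explained by the QTL."""
    return math.sqrt(pve * phenotypic_variance)


def haldane_recombination(distance_cm):
    """Haldane mapping function: r = (1 - exp(-2d)) / 2, with d in Morgans."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def linked_qtl_summary(a1, a2, distance_cm, error_variance):
    """Genetic variance, heritability and PVE (%) for two linked QTLs."""
    r = haldane_recombination(distance_cm)
    v_g = a1 ** 2 + a2 ** 2 + 2.0 * (1.0 - 2.0 * r) * a1 * a2
    v_p = v_g + error_variance  # total phenotypic variance
    return {
        "V_G": v_g,
        "H2": v_g / v_p,
        "PVE1": 100.0 * a1 ** 2 / v_p,
        "PVE2": 100.0 * a2 ** 2 / v_p,
    }


if __name__ == "__main__":
    # Marker counts for a 160 cM chromosome at the four marker densities.
    print([marker_count(160, md) for md in (40, 20, 10, 5)])
    # Additive effects for the PVE levels of the independent-QTL model.
    print([round(additive_effect(p / 100.0), 4) for p in (1, 2, 3, 4, 5, 10, 20, 30)])
    # Linked-QTL model: LQ1 at 22 cM, LQ2 at 32, 42 or 52 cM, V_e = 0.8.
    a = math.sqrt(0.1)
    for phase, sign in (("Coupling", 1.0), ("Repulsive", -1.0)):  # sign is an assumption
        for lq2_position in (32, 42, 52):
            s = linked_qtl_summary(a, sign * a, lq2_position - 22, 0.8)
            print(phase, lq2_position, {k: round(v, 4) for k, v in s.items()})
```

Because both QTLs have the same squared effect, `PVE1` and `PVE2` are equal in every row. Linkage changes PVE only through the total variance in the denominator.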