\documentstyle{report}
\textwidth 2.7in % or 3?
\textheight 10.3in
\topmargin -70pt
\renewcommand{\baselinestretch}{.82} % was .83
%\documentstyle{report}
%\textwidth 237pt
%%\textwidth 244pt was too wide
%\topmargin -85pt
%\textheight 750pt
%\renewcommand{\baselinestretch}{.86}
%\begin{document}
\begin{document}
\sloppy
%Coalescent time in the presence of inbreeding
\noindent{\bf Introduction.}
%The Coalescent and Selection
The term coalescent refers to the entire ancestral pedigree of a population,
but the coalescent time, either the time since a common ancestor of two
individuals or the time since a common ancestor of the entire population, is
also of interest. The original description of the coalescent process by
Kingman (1982) has attracted interest in part because, in addition to a
description of the ancestral pedigree, it provided a concrete value for the
time since a common ancestor of the entire population, $4N$ generations. The
analysis of Kingman assumes selective neutrality and random mating, with a
binomial (essentially Poisson) progeny distribution. The analysis is easily
modified to accommodate other progeny distributions, but selective neutrality
and random mating (or a haploid model) are necessary for the equivalence of
all alleles, which is the cornerstone of the analysis.
Coalescent times, especially the time since a common ancestor of two genes,
have been calculated for many mating structures. Coalescent times have also
been studied in subdivided populations (Takahata 1988, Slatkin 1991).
Selective neutrality is more fundamental to the model: the study of
coalescence in the presence of selection by Kaplan, Darden, and Hudson (1988)
uses selection to construct a virtually subdivided population for which the
coalescent is tractable. New insights are necessary to study coalescence in
the presence of selection.
One approach to finding the coalescent time in the presence of selection is to
observe that the coalescence process is the time inverse of fixation of an
allele. Strictly, many successive populations may share the same most recent
common ancestor, and a sequence of alleles may first become fixed in the
same population (i.e., at the same time), so there is no natural one-to-one
identification of coalescent processes with fixation processes. But it
is easily demonstrated that the expected time since coalescence and the
expected time until fixation (in a time-homogeneous population) are the same.
Hence the diffusion results for time until fixation (Crow \& Kimura 1970)
provide the coalescent times for populations in the presence of selection.
Unfortunately, the integrals which give the time are more readily written than
evaluated, and therefore have limited utility for general insight. Another
problem is that the diffusion model assumes a two allele model with relative
fitnesses constant over time. A model incorporating selection in the
population rather than fitnesses of alleles may allow more general selection
structures.
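As a consistency check in the neutral case (a sketch using the standard
diffusion result; $p$, the initial allele frequency, and $N$, the diploid
population size, are notational assumptions here), the expected time until
fixation, conditional on fixation, is
\[ \bar{t}(p) = -\frac{4N(1-p)\log(1-p)}{p}, \]
which tends to $4N$ as $p \rightarrow 0$, in agreement with the coalescent
value quoted above.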
\newpage
\noindent{\bf A population measure of selection.}
Fisher's fundamental theorem of natural selection states that the rate of
increase in the fitness of a population is equal to the (additive) genetic
variance of fitness in the population. With the Poisson progeny distribution
(with parameter 1), the average number of progeny and variance of the progeny
distribution are 1, but the average number of progeny of the parents of
progeny (weighted by the number of progeny) is equal to 2 (a haploid
population is assumed). This is an increase in fitness from 1 to 2, and the
increase is equal to the variance of the progeny number if that variance is
genetic, i.e., if individuals have the same number of progeny
as their parents did. But if there is no genetic basis for the distribution
of progeny (the numbers of progeny in successive generations are independent),
there is no increase in fitness.
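The value 2 can be checked directly (a sketch; $X$, the progeny number of a
randomly chosen individual, is notation introduced here). Weighting parents
by their number of progeny gives the size-biased mean
\begin{eqnarray*}
\frac{{\rm E}[X^{2}]}{{\rm E}[X]} & = &
\frac{{\rm Var}(X)+({\rm E}[X])^{2}}{{\rm E}[X]} \\
 & = & \frac{1+1}{1} = 2 ,
\end{eqnarray*}
so the excess over the population mean, $2-1 = 1$, equals ${\rm Var}(X)$.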
If every individual has the same number of progeny as its parent (assuming
some variation in the number of progeny), a population will grow
exponentially. But with the Poisson progeny distribution, if every individual
has one fewer progeny than its parent, the population size will remain
constant. In this circumstance the correlation between the number of progeny
of an individual and its parent is 1, hence all the variation in progeny
number is transmitted to subsequent generations (is genetic). Although the
fitness of individuals, as measured by the number of progeny, decreases from
that of the parents, this is merely the result of normalization for constant
population size; prior to renormalization the fitness of the population
increases by one. If the parental and offspring progeny distributions are
independent, the correlation between generations is zero, and there is no
change in fitness, hence no normalization is necessary. It is possible that
the
number of progeny of individuals will be negatively correlated with the number
of progeny of their parents, but with the Poisson progeny distribution the
correlation can be no less than $-.735759$ ($= -2e^{-1}$).
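This bound can be recovered as follows (a sketch, assuming the extreme case
is the countermonotone pairing of two Poisson variables $X$ and $Y$ with
parameter 1; the symbols are introduced here). Under that pairing, whenever
one variable is 2 or more the other is 0, so $XY$ is nonzero only when
$X = Y = 1$, which has probability $1 - 2e^{-1}$. Hence
\begin{eqnarray*}
{\rm E}[XY] & = & 1 - 2e^{-1}, \\
{\rm Cov}(X,Y) & = & {\rm E}[XY] - 1 = -2e^{-1},
\end{eqnarray*}
and, since both variances are 1, the correlation is
$-2e^{-1} \approx -.735759$.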
The (genetic) fitness of an individual should govern ultimate survival, hence
the number of descendants many generations later. With the Poisson
progeny distribution, if the correlation between the number of progeny an
individual has and the number its parent had is one for all generations, then
the correlation between the number of progeny of an individual and the number
of progeny of its grandparent (or great- \ldots\ -great grandparent) will
also be one. Complete selection can be maintained over successive
generations. If the autocorrelation between successive generations is less
than one, a standard model (e.g., fitness determined by multiple loci with
recombination) implies that the autocorrelation between different generations,
hence persistence of selection,
will decrease geometrically with the number of intervening generations. The
autocorrelation between the number of progeny in successive generations
provides an index of selection at the population level, which is not affected
by renormalizing fitness.
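Under the simplest such model (a sketch, assuming the dependence between
generations is first order; $\rho$ and $k$ are notation introduced here), if
$\rho$ is the autocorrelation between successive generations, the
autocorrelation between generations $k$ apart is
\[ \rho_{k} = \rho^{k}, \]
which equals 1 for all $k$ when $\rho = 1$ and decays geometrically to zero
when $|\rho| < 1$.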
\newpage
\noindent{\bf Constraints of the real world.}
No population can manifest a Poisson progeny distribution, because the
probability of having a specified number of progeny, when multiplied by the
population size, will not always be an integer. In particular, the
probabilities of large numbers of progeny will be positive, but correspond to
less than one individual. As a result of this, the realized correlations
between the number of progeny of individuals and their parents will be less
than one. For a population of 10,000,000 individuals with an approximately
Poisson progeny distribution, the correlation can be at most .9999912; for a
population of 1,000 individuals it can be at most .9971. The correlation between
individuals and their grandparents can be at most .7345 for a population of 1,000
individuals. This correlation is much less than would be obtained by squaring
the single generation autocorrelation, but such a decrease is expected. In a
finite population undergoing intense selection, only a few generations are
required until there is a common ancestor, and the covariation over that span
of generations will be zero since the number of progeny of the single
ancestor has no
variation. In a finite population, the decrease in autocorrelation reflects
both the geometric decay of association (loss of selective advantage) and the
fixation of ancestral genes.
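For comparison (simple arithmetic on the values quoted above), squaring the
single generation value for 1,000 individuals gives
\[ (.9971)^{2} \approx .9942, \]
well above the realized two generation value of .7345.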
If autocorrelation is close to 1, then fixation will occur in a few
generations, with reduction in autocorrelation over those generations
reflecting fixation of alleles rather than diminution of the selective
advantage. The coalescence time can be calculated for maximum correlation
between generations. For 10,000,000 individuals it is 11 generations, for
100,000 it is 9 generations, and for 1,000 it is 7 generations. These appear
to grow as $\log N$ rather than as $N$, where $N$ is the population size.
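The three values quoted above are matched by a simple logarithmic rule (an
empirical description of these three values only, not a derived result):
\[ T \approx 4 + \log_{10} N, \]
where $T$ (notation introduced here) is the coalescence time in generations;
this gives $4+7 = 11$, $4+5 = 9$, and $4+3 = 7$.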
When autocorrelations are significantly less than one, the results are quite
different from when the autocorrelations are (near) one. The geometric
decrease in autocorrelation reduces relative viabilities before fixation is
approached. For example, in a population of 10,000,000 individuals with
autocorrelation .5, a single
allele will increase to 50,000 copies in 10 generations, but its selective
advantage will be lost in those 10 generations. In general, an allele with
many descendants will initially rapidly increase in frequency, but the initial
selective advantage will be significantly reduced in a few generations so that
the initial increase will approach neutrality long before fixation is
approached. The examples of fixation in a few generations are quite dependent
on the autocorrelation being (near) one.
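The loss of advantage in the example with autocorrelation .5 can be checked
by arithmetic (a sketch, assuming the association decays by the factor .5 in
each generation):
\[ (.5)^{10} = \frac{1}{1024} \approx .001, \]
so after 10 generations about one thousandth of the initial association
remains.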
The above discussion has assumed a haploid population. If the population is
diploid with random mating, the maximum correlation will be approximately .5;
it will be less than that because of the constraint of maintaining a Poisson
progeny distribution. With multiple mating and assortative mating, the
autocorrelation could be maintained close to 1; but if selfing is precluded,
the effect of finite population size will provide a greater reduction from 1
than the haploid values above.
\newpage
\noindent{\bf The standard selection model.}
If there are two alleles with fitnesses 1 and $1+s$, then the genetic variance
of the population is $p(1-p)s^{2}$ where $p$ is the frequency of one of the
alleles. This will also be the autocorrelation between successive generations
if the progeny distribution is Poisson with parameter 1; in particular,
the autocorrelation will generally be close to zero. However, unlike the
model where fitness is based on number of siblings independent of genotype,
the selective advantage will remain with the same alleles over successive
generations. The fixation (coalescent) time will behave like $\frac{\log
2N}{s}$. This is consistent with the rule that neutrality governs when $4Ns <
1$. Note also that the autocorrelation depends on the frequency of the
allele. Two allele models with constant relative fitnesses are incompatible
with constant autocorrelation models.
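Both statements can be illustrated by a short sketch (the assumptions are
introduced here: drift is ignored, selection is haploid, and $p_{t}$ is the
frequency of the favored allele in generation $t$). Since $p(1-p) \leq 1/4$,
the variance $p(1-p)s^{2}$ is at most $s^{2}/4$, e.g., $.000025$ for
$s = .01$. The recursion
\[ p_{t+1} = \frac{(1+s)p_{t}}{1+sp_{t}} \]
multiplies the odds $p_{t}/(1-p_{t})$ by $1+s$ in each generation. Starting
from a single copy among $2N$, $p_{0} = 1/(2N)$, the odds reach 1 when
\[ t = \frac{\log(2N-1)}{\log(1+s)} \approx \frac{\log 2N}{s} \]
for small $s$ and large $N$; the passage from frequency $1/2$ to near
fixation takes a comparable time, so the total is of the same order.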
A standard selection model entailing multiple alleles considers a population
with normal (viable) and deleterious alleles, where a constant proportion of
the population consists of deleterious mutations, a proportion maintained by
mutation-selection balance. In particular, we can let the population be of
size $N = N_{e} + N_{d}$ where $N_{e}$ is the number of normal alleles and
$N_{d}$ is the number of deleterious alleles in the population. In this case,
the coalescent process for the portion of the population which is not
deleterious mutations is exactly as described by Kingman (selectively neutral)
for population size $N_{e}$; the mutations which compensate for selection can
be viewed as emigrants which do not affect the normal subpopulation. Mating
between normal and deleterious individuals does not significantly alter the
analysis. Further, the deleterious mutations will, in general, be short lived,
hence the coalescent time for the normal subpopulation should also be the
coalescent time for the entire population.
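With the value quoted in the Introduction, this gives (simple substitution,
presuming the $4N$ result applies with $N$ replaced by $N_{e}$) an expected
coalescent time of
\[ 4N_{e} = 4(N - N_{d}) \]
generations, shorter than the $4N$ of a population of the same total size
free of deleterious alleles.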
\newpage
\noindent{\bf Discussion.}
The incorporation of selection poses substantial difficulties in the study
of the coalescent process, including coalescent times. The observation that
the expected coalescent time is the same as the expected time until fixation
(at equilibrium) means that diffusion results can be used for coalescent
times in the presence of selection. However, the diffusion model is based on
two alleles, which may be inadequate to represent the selective forces
governing a population. The resulting integrals are also of limited utility.
An alternative is to ignore the individual alleles causing selection, and
measure selection by autocorrelation of progeny number. (If the
autocorrelation is calculated only for a single locus, the value will be
quite small.) This considers only
the realized number of progeny, and does not distinguish between selection
and drift as causal agents. In finite populations, autocorrelation can be
reduced both by decreasing association (reduced persistence of selection)
and by fixation of alleles; this
distinction is important in interpreting the results.
A complete analysis has not been obtained, but one result for both the
standard ($s$) and autocorrelation selection models is that the coalescent time
behaves as $\log N$ when selection is governing the selection/fixation
process, as opposed to $N$ when neutral drift governs. \vspace{2pt}\\
\begin{small}
Literature cited:\\
Crow, J. F. \& M. Kimura. 1970. {\it Intro. Pop. Gen. Th.}\\
Kaplan, N. L., T. Darden, \& R. R. Hudson. 1988. {\it Genetics
120\/}:819-829.\\
Kingman, J. F. C. 1982. {\it J. Appl. Prob. 19A\/}:27-43. \\
Slatkin, M. 1991. {\it Genet. Res. 58\/}:167-175.\\
Takahata, N. 1988. {\it Genet. Res. 52\/}:213-222.\\
\end{small}
\end{document}