# Are probabilities of mutations symmetric?


For the premise of this question, let's assume that there is an allele A and an allele B. Allele A has a probability P of mutating into allele B in a given timeframe.

Is it also true that allele B has the probability P of mutating into allele A?

The short answer is no: there is heterogeneity in the rate at which different nucleotides mutate into one another. This is generally a property of their differing chemistries (although I'm not an expert on this). Therefore, it doesn't always make sense to talk of abstract alleles $$p$$ and $$q$$ as we do in much of population genetics, because the actual nucleotide (whether it's A, C, G or T) does make a difference.

This is important in fields like phylogenetics, where people construct substitution matrices that describe the rate at which one base in a sequence changes to another nucleotide. Currently, it's possible to estimate the matrix parameters from empirical data relatively easily.

For example, in one of the earliest and simplest models, Kimura (1980) introduced a matrix with two parameters: one for the rate of transition substitutions (A↔G and C↔T, more likely) and one for the rate of transversions (purine↔pyrimidine changes, less likely). Later methods became progressively more complex, accounting for e.g. the different amino/keto properties of the nucleotides; Felsenstein's model (1981) accounted for the equilibrium frequency of the target nucleotide. Substitution rates are also allowed to vary across sites within a phylogenetic tree (e.g. Yang 1994).
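The structure of such a substitution matrix is easy to sketch. Below is a minimal, illustrative construction of the Kimura two-parameter rate matrix; the numeric values of `alpha` (transition rate) and `beta` (transversion rate) are invented for the example:

```python
import numpy as np

BASES = ["A", "C", "G", "T"]

def k80_rate_matrix(alpha, beta):
    """Kimura (1980) two-parameter rate matrix: rate alpha for
    transitions (A<->G, C<->T), rate beta for transversions."""
    transitions = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
    Q = np.zeros((4, 4))
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                Q[i, j] = alpha if (x, y) in transitions else beta
    # Diagonal entries make each row sum to zero, as a rate matrix requires.
    np.fill_diagonal(Q, -Q.sum(axis=1))
    return Q

Q = k80_rate_matrix(alpha=2.0, beta=0.5)
```

Note that K80 itself is symmetric within any pair (the A→G and G→A rates are equal); the asymmetry relevant to the question enters once equilibrium base frequencies differ, as in Felsenstein's (1981) model, where the rate into a base is proportional to that base's equilibrium frequency.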

References

Kimura, Motoo. "A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences." Journal of Molecular Evolution 16.2 (1980): 111-120.

Felsenstein, Joseph. "Evolutionary trees from DNA sequences: a maximum likelihood approach." Journal of Molecular Evolution 17.6 (1981): 368-376.

Yang, Ziheng. "Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods." Journal of Molecular Evolution 39.3 (1994): 306-314.

## Variation of mutational burden in healthy human tissues suggests non-random strand segregation and allows measuring somatic mutation rates

The immortal strand hypothesis posits that stem cells could produce differentiated progeny while conserving the original template strand, thus avoiding the accumulation of somatic mutations. However, quantifying the extent of non-random DNA strand segregation in human stem cells remains difficult in vivo. Here we show that the change of the mean and variance of the mutational burden with age in healthy human tissues allows estimating strand segregation probabilities and somatic mutation rates. We analysed deep sequencing data from healthy human colon, small intestine, liver, skin and brain. We found highly effective non-random DNA strand segregation in all adult tissues (mean strand segregation probability: 0.98, standard error bounds (0.97, 0.99)). In contrast, non-random strand segregation efficiency is reduced to 0.87 (0.78, 0.88) in neural tissue during early development, suggesting stem cell pool expansions due to symmetric self-renewal. Healthy somatic mutation rates differed across tissue types, ranging from 3.5 × 10⁻⁹/bp/division in small intestine to 1.6 × 10⁻⁷/bp/division in skin.

For more detailed reviews of interpretations of probability, see Gillies (2000a), Galavotti (2005) and Hájek (2012).

In the literature, it is often assumed that by coherence de Finetti means consistency (Howson 2008; Dickey et al. 2009; Vineberg 2011), but Berkovitz (2014) argues that this assumption is unjustified.

The conflation of the ontological status of theoretical terms with the way they are to be evaluated and their values as instruments is not particular to the interpretation of de Finetti's theory (Berkovitz 2014). In discussions of instrumentalism it is common to associate the instrumental value of theoretical postulates with their ontological status. Thus, for example, it is argued that under instrumentalism, theories are capable (at best) of accommodating known observable phenomena, and incapable of making novel predictions. Psillos (1999, p. 29) interprets Duhem as arguing along these lines. “Duhem’s point is that the fact that some theories generate novel predictions cannot be accounted for on a purely instrumentalist understanding of scientific theories. For how can one expect that an arbitrary (artificial) classification of a set of known experimental laws—i.e. a classification based only on considerations of convenience—will possibly be able to reveal unforeseen phenomena in the world?” The presupposition is that the ontological status of theoretical terms determines their capacity to generate novel predictions. But this presupposition begs the question against instrumentalism in general and de Finetti’s instrumentalism in particular (Berkovitz 2014).

For the sake of brevity, in what follows by ‘frequency’ we will mean relative frequency.

For a discussion of this tenet in the context of long-run propensity theories, see Berkovitz (2015, Sect. 3.5).

However, his theory cannot be considered as an interpretation of the probability calculus since it violates it.

For a detailed discussion of propensity theories, see Berkovitz (2015).

Lewis (1986, p. 87) formulates the principal principle as follows: “Let C be any reasonable initial credence function. Let t be any time. Let x be any real number in the unit interval. Let X be the proposition that the chance, at time t, of A's holding equals x. Let E be any proposition compatible with X that is admissible at time t. Then C(A/XE) = x.” C is a non-negative, normalized, finitely additive measure defined on all propositions (sets of worlds), and E is admissible at time t if it contains only information whose impact on the credence of A comes entirely by way of credence about the chance of A.

Unlike Strevens and Rosenthal, Abrams intends his interpretation to apply to both deterministic and indeterministic processes.

The standard conditional probability of A given B is defined as the ratio of the unconditional probability of A&B to the unconditional probability of B: $$P(A \mid B) = P(A \& B) / P(B)$$.

For a discussion of these arguments, see Berkovitz 2015, Sects. 5.2–5.3 and references therein.

For a critical discussion of this resolution, see Bub (1975) and Bub and Pitowsky (1985).

For examples of subjective interpretations of QM probabilities, see Caves et al. (2002a, b, 2007), Pitowsky (2003) and Berkovitz (2012).

FTNS states that “the rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time” (Fisher 1930, p. 35). That is, a population’s rate of change in mean fitness due to natural selection is equal to the additive genetic variance. It is important to emphasize that, since a variance is non-negative, mean fitness can never decrease under selection, so the principle singles out a trend in nature, like the Second Law in TD.

In the classical view, this process happens primarily at the level of alleles.

A concept that is explicitly analogous, ‘ecological drift’, has been forged in community ecology by Hubbell (2001).

“[N]atural selection, acting on the heritable variation provided by the mutations and recombination of a Mendelian genetic constitution, is the main agency of biological evolution.” The letter from Huxley to Mayr was intended to explain the general orientation of the book Evolution as a process, to which Mayr contributed (see Huxley et al. 1954).

As Landsman (2009, p. 60) notes, the pragmatic attitude taken by most physicists is that the probabilities in Born’s rule are to be interpreted as long-run frequencies.

Roughly, an effective wavefunction of a system exists when ‘enough’ decoherence has occurred between the system and its environment (see Dürr et al. 1992; Callender 2007).

## Results

### Distribution of G+C content

Thanks to the availability of an increasing number of complete bacterial genomes with a high G+C content, the dataset covered a representative distribution of G+C content (Table 1). The values of G+C content in third codon positions (P3, as previously defined in [11]; see also Materials and methods), in first and second codon positions averaged (P12), and in intergenic spaces (GCIGR) were consistent with previously known results [10,11], both in terms of their range of distribution and the correlation between them (Figure 4). Intragenomic P3 distributions of all species examined (all histograms are given in the tables in Additional data files under the 'GCorder' flag) are unimodal and narrow for a major class of genes comprising more than 90% of the total, although average P3 values differ widely between species. Some species show a scattered P3 distribution. In particular, a previous study of 254 genes in Pseudomonas [7] revealed a sizeable minor class of genes (28%) forming a trail toward lower G+C content (genes with P3 < 0.75). In our present study of 5,255 genes, this minor class (P3 < 0.75) is only 8% of the total, indicating that the previous sample was biased towards the minor class. Moreover, the width of the distribution (standard deviation) of the majority of genes is narrower in species at either extreme of the P3 range than in those with intermediate values. These features of the P3 distribution are expected from the theory of bidirectional mutation rates and their equilibrium [8]. In this theory, the mutation rates u (GC → AT) and v (AT → GC) lead to an equilibrium G+C content (P3) of v/(u+v). The expected standard deviation is

$$\sigma = \sqrt{\frac{P_3 (1 - P_3)}{b}}$$

where b is the average number of third codon positions per gene [8].

Distribution of G+C content in the dataset. The G+C content of the 51 bacterial chromosomes under analysis was highly variable, from 25.5% in the Ureaplasma urealyticum chromosome to 67.9% in Halobacterium sp., with a wider distribution in third codon positions (x-axis, P3, from 11.2% to 88.0%) than in intergenic spaces (y-axis, GCIGR, from 15.5% to 63.4%), and wider there than in first and second positions (y-axis, P12, from 33.4% to 62.2%), as expected [10,11]. Regression slopes (or ε values) and their standard deviations for P12 and GCIGR were 0.343 ± 0.021 and 0.586 ± 0.024, respectively.
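Under this bidirectional-mutation theory, the equilibrium G+C content v/(u+v) and the spread expected from a finite number of third codon positions per gene can be computed directly. A minimal sketch with hypothetical rate values; the square-root expression is the standard binomial standard deviation, assumed here as the form of the expected spread:

```python
import math

def equilibrium_gc(u, v):
    """Equilibrium G+C content under bidirectional mutation pressure:
    u is the GC -> AT rate, v the AT -> GC rate."""
    return v / (u + v)

def expected_sd(u, v, b):
    """Standard deviation of per-gene P3 expected from binomial
    sampling of b third codon positions at equilibrium."""
    q = equilibrium_gc(u, v)
    return math.sqrt(q * (1 - q) / b)

# Hypothetical per-site rates and an average gene length of 300 codons:
q = equilibrium_gc(u=2e-9, v=6e-9)        # equilibrium P3 = 0.75
sd = expected_sd(u=2e-9, v=6e-9, b=300)   # narrow spread around it
```

Note how the spread shrinks both for extreme equilibrium values (q near 0 or 1) and for longer genes (larger b), matching the observation above that the distribution is narrowest in species with extreme P3.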

### Difference in G+C content between leading and lagging strands

Statistically highly significant differences (p < 0.01) in P3 between coding sequences located on the leading and lagging strands were found in 23 out of 43 chromosomes. In most cases (21 out of the 23) genes on the leading strand were found to have a lower P3 content than those on the lagging one. (Two exceptions were Mycoplasma genitalium, which is known to have a peculiar behavior with respect to G+C content [5,6], and Caulobacter crescentus.) As far as we know, this is the first time that such a systematic difference has been reported, but note that the differences were always very small. The maximum was only 0.035, for the difference in P3 content of leading and lagging strands of Vibrio cholerae chromosome 2, as compared to the 0.768 average P3 difference between Halobacterium sp. and Ureaplasma urealyticum, or the 0.050 accepted range of genomic G+C content polymorphism [57]. There was no significant correlation between the average P3 and the difference in P3 between the two groups of genes. A highly significant difference in G+C content between the leading and the lagging sequences was found in only five chromosomes for large intergenic regions, and three chromosomes for small intergenic spaces.

### PR2-biases in coding sequences

Out of 43 chromosomes, the differences in PR2-biases between the leading and lagging groups were highly significant (p < 0.01) in 39 chromosomes for G3/(G3+C3), in 34 for A3/(A3+T3), and in 32 chromosomes for G3/(G3+C3) and A3/(A3+T3) simultaneously (see Additional data files). In the PR2-bias-plot analysis, we found a general pattern of x1 > x2 and y1 < y2, as in Figure 2b, meaning that coding sequences on the leading strand had higher G3/(G3+C3) values and lower A3/(A3+T3) values than those on the lagging strand: leading coding sequences were enriched in keto bases (G and T) in the third codon position compared to lagging coding sequences. There were, however, exceptions to this general trend: out of the 39 chromosomes that were highly significant for G3/(G3+C3), all had higher values in the leading strand, but out of the 34 chromosomes that were highly significant for A3/(A3+T3), three chromosomes did not follow the general trend (Lactococcus lactis and Staphylococcus aureus strains Mu50 and N315). Leading and lagging coding sequences are separated in PR2 plots as expected under replication-associated effects, and the leading group lies almost always down and to the right of the lagging group. However, the extent to which the groups are separated differs between species. The two most extreme species are B. burgdorferi, where the two groups of genes are completely resolved, and M. genitalium, where the two groups are almost completely overlapping (Figure 5). Other species with a spectacular difference in PR2 plots were Chlamydia muridarum, Chlamydia trachomatis and Treponema pallidum. Out of 43 chromosomes, the residual bias B II was in most cases oriented as in Figure 2c, with xc < 0.5 (all except Pyrococcus abyssi and T. pallidum) and yc < 0.5 (all except Thermotoga maritima). The relative contribution of replication-coupled bias (B I) to the total bias (B I plus B II) ranged from 6% (Mycoplasma pulmonis) to 86% (B. burgdorferi), with an average value of 52%.
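The PR2-plot coordinates themselves are simple ratios of third-codon-position base counts. A minimal sketch; the counts below are invented for illustration only:

```python
def pr2_coordinates(counts):
    """PR2-plot coordinates from third-codon-position base counts:
    x = G3/(G3+C3), y = A3/(A3+T3).  Under no strand bias both
    coordinates equal 0.5 (parity rule 2)."""
    g, c, a, t = counts["G"], counts["C"], counts["A"], counts["T"]
    return g / (g + c), a / (a + t)

# Hypothetical counts for a leading- and a lagging-strand gene set:
leading = {"G": 620, "C": 380, "A": 410, "T": 590}
lagging = {"G": 480, "C": 520, "A": 560, "T": 440}

x1, y1 = pr2_coordinates(leading)   # x1 > 0.5, y1 < 0.5: keto (G+T) enriched
x2, y2 = pr2_coordinates(lagging)   # roughly mirrored across (0.5, 0.5)
```

The general pattern reported above corresponds to x1 > x2 and y1 < y2, i.e. the leading group sitting down and to the right of the lagging group in the plot.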

Example of detailed results for two extreme cases, B. burgdorferi (left) and M. genitalium (right). (a) P3-P12 plot for coding sequences; (b) PR2-plot in third codon positions; (c) PR2-plot in large intergenic spaces. Leading-group sequences are represented by black circles, and lagging ones by white circles.

### PR2-biases in large intergenic regions

Out of 43 chromosomes, the differences in PR2-biases between the leading and lagging groups were highly significant (p < 0.01) in 36 chromosomes for G/(G+C), in 34 for A/(A+T), and in 32 chromosomes for G/(G+C) and A/(A+T) simultaneously (see table in Additional data files). All the 36 chromosomes highly significant for G/(G+C) had higher values in the leading strand (that is, x1 > x2, as in coding sequences). Out of the 34 chromosomes highly significant for A/(A+T), 28 had lower values in the leading strand (that is, y1 < y2, as in coding sequences). Exceptions were Bacillus subtilis, Bacillus halodurans, L. lactis, S. aureus strains Mu50 and N315, and Streptococcus pyogenes. The distribution of intergenic regions in a PR2-plot was as expected under replication-associated effects (Figure 3a), and is particularly visible in the case of B. burgdorferi (Figure 5c). Unlike the third codon position, the leading and lagging groups clustered symmetrically around the center point of the PR2-bias plot. In intergenic regions, both xc and yc were always near 0.5, so that the residual bias B II was always close to zero. The relative contribution of replication-coupled bias (B I) to the total bias (B I plus B II) was therefore very high, on average 90%, ranging from 88% (Helicobacter pylori) to 99.6% (B. subtilis) in the intergenic regions of species with significant B I values.

### PR2-biases in potentially transcribed untranslated regions

Out of 43 chromosomes, the differences in PR2-biases between the leading and lagging small intergenic spaces among co-oriented genes were highly significant (p < 0.01) in 26 chromosomes for G/(G+C), in 7 chromosomes for A/(A+T), and in 7 chromosomes for both G/(G+C) and A/(A+T) together (see table in Additional data files). All the 26 chromosomes with a highly significant difference in G/(G+C) showed the same pattern as in third codon positions of coding sequences, with x1 > x2. Of the seven chromosomes highly significant for A/(A+T), all showed the same pattern as in third codon positions, with y1 < y2, except for B. subtilis and S. pyogenes. Out of 43 chromosomes, the orientation of the residual bias B II showed no clear tendency (only 31 chromosomes had xc > 0.5 and 23 had yc > 0.5; the trend, if any, would be in the direction opposite to that in third codon positions). The relative contribution of replication-coupled bias (B I) to the total bias (B I plus B II) ranged from 10% (H. pylori strain 26695) to 93% (Thermoplasma acidophilum), with an average value of 58%.

### Comparison of third codon positions and intergenic spaces

The differences between PR2-biases in intergenic regions between the leading and lagging strands (B I) were strongly and significantly correlated with the PR2-biases in the third codon position, as is evident from the regression coefficient of approximately 0.6 and the correlation coefficient (r² = 0.77) among 43 chromosomes (Figure 6). The correlation was still significant when the extreme B. burgdorferi was removed from the analysis (r² = 0.68). The 95% confidence interval for the slope of the regression line (0.50–0.71) lies below 1, meaning that replication-associated biases were significantly smaller in intergenic regions than in third codon positions. This slope value was very close to the one obtained for the regression line of G+C content in intergenic regions versus third codon positions (Figure 4).

Comparison of the absolute contribution of replication-associated bias B I between intergenic and third codon positions.

### Correlation between P3 and replication-associated biases

If differences in G+C content between species were dictated by asymmetric replication-associated mutation pressure, one would expect a correlation between the extent of PR2-bias and G+C content; however, no significant correlation between B I in coding sequences and P3 was found (Figure 7).

Correlation between GC content and the extent of strand biases in third codon positions.

## Discussion

In contrast to previous models, our mathematical model includes all relevant phases in which somatic mutations may accumulate in a tissue and provides a way to estimate the background somatic mutation rate directly from sequencing data. Its predictions are validated by correlations between age and mutation number among patients with the same tumor type. In addition to the correlations described above, we found correlations between age and mutation number also in smaller datasets: glioblastoma (ref. 11, P = 0.035) and medulloblastoma (ref. 12, P = 0.00027). Similarly, a significant correlation was reported in neuroblastoma (10). In breast cancers, however, there was no correlation between number of mutations and age (20), P = 0.33 (estrogen receptor positive) and P = 0.14 (estrogen receptor negative), despite the fact that breast epithelial cells self-renew. It is possible that breast epithelial cell renewal is highly variable among individuals, given that it is dependent on hormonal status, number of pregnancies, breastfeeding history, etc. This would obscure any correlation between age at diagnosis and mutation number. Similarly, in ovarian high-grade serous adenocarcinoma (TCGA, 317 patients), we did not find a significant correlation (P = 0.21).

Strictly speaking, our model predicts a correlation with the number of tissue renewals rather than age per se. It is only when tissue renewal rates are relatively consistent among individuals that significant age vs. mutation correlations would be expected to exist.

In conclusion, our results suggest that in typical patients with cancers of self-renewing tissues, a large part of the somatic mutations occurred before tumor initiation. In CLL, colorectal, and ovarian cancer patients of median age, half or more (68%, 57%, and 51%, respectively) of the passenger somatic mutations appear to have occurred before the tumor-initiating event.

These results have substantial implications for the interpretation of the large number of genome-wide cancer studies now being undertaken. They reinforce the idea that most somatic mutations observed in common adult tumors play no causal role in neoplasia; they in fact occurred in completely normal cells before initiation. They also indicate that patient age should be considered in statistical analyses of sequencing data. Sequencing data from younger patients’ tumors may provide a more reliable distinction of driver mutations by reducing the “noise” caused by the accumulation of passenger mutations in normal tissues as individuals age.

## The fixation probability of beneficial mutations

The fixation probability, the probability that the frequency of a particular allele in a population will ultimately reach unity, is one of the cornerstones of population genetics. In this review, we give a brief historical overview of mathematical approaches used to estimate the fixation probability of beneficial alleles. We then focus on more recent work that has relaxed some of the key assumptions in these early papers, providing estimates that have wider applicability to both natural and laboratory settings. In the final section, we address the possibility of future work that might bridge the gap between theoretical results to date and results that might realistically be applied to the experimental evolution of microbial populations. Our aim is to highlight the concrete, testable predictions that have arisen from the theoretical literature, with the intention of further motivating the invaluable interplay between theory and experiment.

### 1. Introduction

Mathematical population genetics is a field with an extremely rich historical literature. The first questions about gene frequency distributions were posed in analytical form by Fisher; independent studies were conducted by Wright and Haldane. Fisher, Haldane and Wright together shaped the foundations of the field and are referred to as the ‘great trinity’ (Crow 1994) of population genetics. The works of these authors (Fisher 1922, 1930; Haldane 1927; Wright 1931) are now considered to be the classic papers in the field.

One of the central ideas addressed by these authors is the fixation probability: the probability that the frequency of a particular allele in a population will ultimately reach 100 per cent. Mathematically, there are several approaches to computing fixation probabilities, and interest in this problem has been sustained for almost a century: the first papers were written in the early 1920s, and there have been important advances in every decade since. Empirically, the fixation probability is necessary in order to estimate the rate at which a population might adapt to a changing environment, the rate of loss of genetic diversity or the rate of emergence of drug resistance.

The last several years have seen two key advances in this field. First, a number of important, and fascinating, theoretical advances have been made, each bringing us one step closer to theoretical predictions that might pertain in a ‘real’ laboratory population. Second, in parallel with this effort, experimental techniques in microbial evolution have advanced to the point where the fate of a novel mutant strain within a controlled population can be followed over many generations. Thus, these experiments are on the verge of being able to test our theoretical predictions of the fixation probability—predictions that have in many cases stood untested for 80 or 90 years. This is extremely exciting.

Although neutral and deleterious mutations may also reach fixation in finite populations, in the following review we will restrict our attention to beneficial mutations. The selective advantage, s, of a beneficial mutation is typically defined for haploids as follows: if each wild-type individual has on average W offspring per generation, each mutant individual has on average W(1+s) offspring. Throughout this review we will assume that this definition of s holds, unless stated otherwise. For simplicity, for diploid individuals we will use s to denote the advantage of the heterozygote, although the notation hs is also typically used.

In a deterministic model, an initially rare beneficial mutation will increase in frequency in each generation, and fixation is certain. In reality, however, the frequency of any particular lineage fluctuates over time. These fluctuations, ‘genetic drift’, are very likely to cause the extinction of a beneficial lineage when its frequency is low, and require a stochastic treatment. Once the frequency of the mutant is sufficiently large, further increases are well approximated by a deterministic model. Estimating the fixation probability for a beneficial mutation is thus usually equivalent to estimating the probability that the mutation survives genetic drift when initially rare.

The underlying distribution of s, i.e. the distribution of selective effects for all possible beneficial mutations, is a topic of current interest, both theoretically and experimentally. Although beyond the scope of this review, we refer the interested reader to several recent papers (Rozen et al. 2002; Orr 2003; Rokyta et al. 2005; Kassen & Bataillon 2006). A closely related, or even overlapping, issue is adaptation: the rate of fitness increase or overall rate at which beneficial mutations arise and become fixed. While fixation probabilities are essential building blocks in the models of adaptation, such models also require further assumptions, such as an underlying distribution of selective effects or a model for combining the effects of multiple mutations. Estimating the rate of adaptation has a rich literature in its own right, and again we refer the interested reader to a few key references (Orr 1994, 2000; Wilke 2004; Desai & Fisher 2007; Goncalves et al. 2007). We touch on this issue again in §5.3.

### 2. Historical overview

Broadly speaking, there are three approaches to computing fixation probabilities. When the state space of a population (exactly how many individuals have exactly which genotype) can be enumerated, a Markov chain approach can determine the fixation probability exactly. This approach is nicely outlined for the non-specialist reader by Gale (1990), and is typically feasible only when the population size is quite small (but see Parsons & Quince 2007a,b, discussed in §3.3). When the population size is large, methods based on discrete branching processes are often used. These methods build on the ‘Haldane–Fisher’ model (Fisher 1922, 1930; Haldane 1927, 1932), which is itself based on a Galton–Watson branching process. We note that any branching process approach provides an approximation to the true fixation probability, as it assumes that the wild-type population is sufficiently large that the fate of each mutant allele is independent of all others. This approach has been widely, and successfully, applied to a number of interesting recent questions regarding the fixation probability (Athreya 1992; Haccou & Iwasa 1996; Lange & Fan 1997; Otto & Whitlock 1997; Wahl & Gerrish 2001; Johnson & Gerrish 2002; De Oliveira & Campos 2004; Wahl & DeHaan 2004; Champagnat & Lambert 2007). Finally, when the population is large and the change in gene frequency is small in each generation (i.e. selection is weak), methods that incorporate a diffusion approximation may be used. These approaches follow from the pioneering ‘Wright–Fisher–Kimura’ model (Fisher 1922, 1930; Wright 1931, 1945; Kimura 1957, 1962), and are also in wide use today (Yamazaki 1977; Wahl & Gerrish 2001; Gavrilets & Gibson 2002; Whitlock 2003). Significant effort has also been made towards unifying or reconciling the discrete and continuous approaches (Kimura & Ohta 1970; Otto & Whitlock 1997; Wahl & Gerrish 2001; Lambert 2006). We will discuss many of these recent papers in turn in the sections to follow.

The most widely known result regarding the fixation probability is Haldane's celebrated approximation, obtained for weak selection using a discrete-time branching process. Haldane (1927) demonstrated that the probability of ultimate fixation, π, of an advantageous allele is given by π≈2s, when the allele is initially present as a single copy in a large population.

Haldane's elegant result necessarily relies on a number of simplifying assumptions. The population size is large and constant, generations are discrete and the number of offspring that each individual contributes to the next generation is Poisson distributed. This last simplification masks an assumption on which the fixation probability critically depends: individuals in such a branching process cannot die before having offspring. In effect, individuals die in such models only by having zero offspring. But since the probability of having zero offspring is completely determined by the mean of the Poisson distribution, there is no room in Haldane's approach to independently specify a survival probability. This will become important as we review some recent work that relaxes this assumption.
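Haldane's setting is easy to probe numerically with a short Galton–Watson simulation under the same assumptions (a single initial copy, Poisson(1+s) offspring), treating any lineage that reaches a size threshold as established; the threshold and trial count are arbitrary choices for the sketch. For s = 0.05 the estimate lands slightly below 2s = 0.10, as expected, since π ≈ 2s is only a first-order approximation:

```python
import math
import random

def poisson(rng, lam):
    """Poisson sample via Knuth's method (adequate for modest lam)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def survival_probability(s, trials=50_000, threshold=100):
    """Monte Carlo estimate of the probability that a single mutant
    lineage with Poisson(1+s) offspring escapes early extinction.
    A lineage reaching `threshold` copies is counted as established,
    since its extinction probability is then negligible."""
    rng = random.Random(42)
    survived = 0
    for _ in range(trials):
        n = 1
        while 0 < n < threshold:
            # Total offspring of n parents is one Poisson(n*(1+s)) draw.
            n = poisson(rng, n * (1 + s))
        survived += n >= threshold
    return survived / trials

est = survival_probability(s=0.05)   # Haldane's approximation: ~2s = 0.10
```

Setting the mean of the Poisson to 1+s is also exactly how the model "kills" individuals: death occurs only through drawing zero offspring, which illustrates the point above that no independent survival probability can be specified.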

This work by Haldane, as well as Wright (1931) and Fisher (1930), was later generalized in a number of different directions, most notably by Kimura (Kimura 1957, 1962, 1964, 1970; Kimura & Ohta 1970). Kimura's approach was to use a diffusion approximation to model small changes, over many generations, in the frequency of a particular allele. To understand Kimura's foundational result, we must briefly introduce Ne, the variance effective population size. If we imagine a diploid population in which, for example, mating is not random or the sex ratio is not 1 : 1, these effects may change the variance in the number of offspring alleles per parental allele. Ne is then the size of an ‘ideal’ population—a large population of constant size, in which mating is random and we have equal numbers of males and females—that would give the same variance as the real population in question. Kimura's most widely known result is that the probability of ultimate fixation, π, of an allele with an initial frequency p and an additive selective effect s is

$$\pi = \frac{1 - e^{-4 N_e s p}}{1 - e^{-4 N_e s}}. \qquad (2.1)$$

For large diploid populations, equation (2.1) implies that the fixation probability for a new mutation that arises as a single copy decreases with larger effective population sizes. However, the decay of this function is extremely rapid; for example, for s = 0.01, a population size of 100 is already sufficient for the denominator to be approximately 1. For all but extremely small populations or nearly neutral mutations, we then find that π≈2sNe/N for a mutation occurring as a single copy. Thus, π depends on the ratio of effective population size to census size. It is also clear that when Ne=N, we obtain Haldane's approximation π≈2s for weak selection (Haldane 1927). By contrast, the fixation probability for an allele that is present at a given frequency increases with population size. (Note, however, that a single copy of an allele corresponds to a smaller frequency in a larger population, and thus π≈2s still holds.)
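Kimura's result, standardly written π = (1 − e^(−4·Ne·s·p))/(1 − e^(−4·Ne·s)), can be evaluated directly. A short numerical check of the weak-selection limit discussed above; the population size and s are arbitrary example values:

```python
import math

def kimura_fixation(p, s, Ne):
    """Kimura's diffusion approximation for the fixation probability of
    an allele at initial frequency p with additive advantage s."""
    return (1 - math.exp(-4 * Ne * s * p)) / (1 - math.exp(-4 * Ne * s))

# A new mutation present as a single copy in a diploid population:
N = Ne = 1000
s = 0.01
pi = kimura_fixation(p=1 / (2 * N), s=s, Ne=Ne)

# Weak-selection approximation, which here reduces to Haldane's 2s:
approx = 2 * s * Ne / N
```

With Ne = N = 1000 and s = 0.01, the denominator 1 − e^(−40) is essentially 1, so π is very close to (and slightly below) the 2s approximation of 0.02.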

A final note on the approximation π≈2sNe/N is that s reflects the selective advantage of the beneficial allele, while Ne is most often inversely proportional to the variance in offspring number. This foreshadows the important work of Gillespie (1974, 1975), who predicted that the ratio of the mean to the variance in offspring number determines both the long-term effects of selection on a beneficial allele and the fixation probability. This idea, particularly as applied to long-term selective effects, has been expanded in a number of elegant recent papers (Proulx 2000; Lande 2007; Orr 2007; Shpak & Proulx 2007).

Much progress has been made since the work of Kimura and the great trinity. As we will review in the following sections, the fixation probability has now been estimated in populations of fluctuating size, for populations whose size cycles among a set of constant values and, more recently, fluctuates according to a density-dependent birth–death process. Populations experiencing exponential or logistic growth or decline have been treated, as have populations that are subject to sustained growth periods followed by a population bottleneck—a sudden reduction in population size. A large body of work treats populations subdivided into demes, most recently including heterogeneous selection among demes and asymmetrical migration. Recent work has also addressed multiple segregating alleles, specifically treating quasi-species interactions and clonal interference, as described in the sections to follow.

### 3. Populations of changing size

#### 3.1 Growing, declining or cyclic population sizes

Fisher (1930) suggested that the probability of fixation of beneficial alleles would increase in growing populations and decrease in declining populations. Analysis by Kojima & Kelleher (1962) confirmed Fisher's proposition. Fisher's claim was further justified through the theoretical studies of logistically changing populations by Kimura & Ohta (1974).

Ewens (1967) used a discrete multitype branching process to study the survival probability of new mutants in a population that assumes a cyclic sequence of population sizes, as well as a population that initially increases in size and thereafter remains constant. For the former case, Ewens found the probability of fixation of a beneficial mutation to be approximately

π ≈ 2sÑ/N1,  (3.1)

where Ñ is the harmonic mean of the population sizes over the cycle and N1 is the population size at the time the mutation first appears.

Ewens' relaxation of the assumption of constant population size was an important step towards generalizing fixation probability models; however, he still maintained the other classic assumptions and explored only two cases of changing population sizes. The approximation in equation (3.1) led Kimura (1970) to conjecture that equation (2.1) may be used for populations that assume a cyclic sequence of values, with Ne replaced by the harmonic mean of the population sizes in the cycle. Otto & Whitlock (1997) later built on the work of Ewens and Kimura by addressing the question of the fixation probability of beneficial mutations in populations modelled by exponential and logistic growth or decline. These authors proved that Kimura's conjecture holds true for populations in which the product ks is small, where k is the total number of discrete population sizes.

All the papers mentioned above assume a Poisson distribution of offspring. Although such a distribution may be a good model of reproductive success in many species, some species clearly cannot be modelled well by such a distribution (e.g. bacteria that reproduce by binary fission). Pollak (2000) studied the fixation probability of beneficial mutations in a population that changes cyclically in size, assuming a very general distribution of successful gametes, described by a mean and variance that are functions of the population size. Assuming that a beneficial mutation first appears in a single heterozygous individual, and that such an individual has 1+s times as many offspring as the wild-type, Pollak proved that the result found for Poisson-distributed offspring by Ewens (1967) and Otto & Whitlock (1997) still holds: the fixation probability is approximately proportional to the harmonic mean of the effective population sizes in the cycle and inversely proportional to the population size at the time the mutation occurs.
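This harmonic-mean result can be sketched numerically (the cycle sizes below are invented for illustration):

```python
from statistics import harmonic_mean

def pi_cyclic(sizes, s, N0):
    """Kimura's conjecture for cyclic population sizes: the
    weak-selection form pi ~ 2*s*Ne/N of equation (2.1), with Ne
    replaced by the harmonic mean of the sizes in the cycle and N by
    the size N0 at the time the mutation arises."""
    return 2 * s * harmonic_mean(sizes) / N0

# The harmonic mean is dominated by the smallest size in the cycle, so
# bottlenecked phases depress the fixation probability even when the
# population is large most of the time.
```

For a cycle of sizes 100, 1000 and 10000, the harmonic mean is only about 270, and a mutation arising at the low point of the cycle (N0 = 100) fixes with probability roughly 0.054 when s = 0.01, far below what the largest size alone would suggest.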

#### 3.2 Population bottlenecks

In an attempt to provide estimates of the fixation probability for microbial populations maintained in experimental evolution protocols, Wahl and Gerrish studied the effect of population bottlenecks on fixation. A population bottleneck is a sudden, severe reduction in population size. In experimental evolution, bottlenecks are an inherent feature of the protocol (Lenski et al. 1991; Lenski & Travisano 1994; Bull et al. 1997): the population typically grows for a fixed period of time and is then sampled randomly such that it is reduced to its initial size. The repetition of this procedure is called ‘serial passaging’.

An important point to note is that at the population bottleneck, each individual—mutant or wild-type—survives with the same probability. Thus the ‘offspring’ distribution of each individual at the bottleneck is the same, for either mutant or wild-type. By contrast, during growth the selective advantage of the mutant is realized. Thus the case of growth between population bottlenecks is not simply a special case of cyclic population sizes.

Wahl & Gerrish (2001) derived the probability that a beneficial mutation is lost due to population bottlenecks. For this derivation they used both a branching process approach (Haldane 1927; Fisher 1930) and a diffusion approximation (Wright 1945; Kimura 1957, 1962). When selection is weak, Wahl and Gerrish demonstrated that the two approaches yield the same approximation for the extinction probability X of a beneficial mutation that occurs at time t between bottlenecks: 1−X ≈ 2srτ e^{−rt}. Here s is the selective advantage of the mutant over the wild-type strain, r is the Malthusian growth rate of the wild-type population and τ is the time at which a bottleneck is applied. It was thus found that the fixation probability, π, drops rapidly as t increases, implying that mutations that occur late in the growth phase are unlikely to survive population bottlenecks. Since this model treats only extinction due to bottlenecks, this effect is not due to the large wild-type population size late in the growth phase, but rather to the fact that the beneficial mutant does not have sufficient time to found a lineage large enough to survive the bottleneck. Wahl and Gerrish also defined an effective population size, Ne = N0rτ, where N0 is the population size at the beginning of each growth phase. This approximation is independent of the time of occurrence of the mutation as well as its selective advantage.
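The weak-selection approximation can be evaluated directly; the parameter values below are illustrative.

```python
import math

def survival_prob(s, r, tau, t):
    """Wahl & Gerrish (2001) weak-selection approximation: the
    probability that a beneficial mutation arising at time t within a
    growth phase of length tau survives repeated bottlenecks,
    1 - X ~ 2*s*r*tau*exp(-r*t)."""
    return 2 * s * r * tau * math.exp(-r * t)

# A mutation arising at the start of the growth phase (t = 0) is
# exp(r*tau) times more likely to survive than one arising just
# before the bottleneck (t = tau).
```

With s = 0.01, r = 0.69 and τ = 10 (roughly a thousand-fold growth per phase), a mutation arising at t = 0 survives with probability about 0.14, while one arising at t = τ survives with probability about 0.00014, illustrating how sharply late mutations are penalized.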

In 2002, this model was extended to include resource-limited growth (Wahl et al. 2002). Resource limitation was included in order to better model serial passaging protocols for bacterial populations, in which the growth phase is typically limited by a finite resource in the growth medium. For both resource-limited and time-limited growth, mutations occurring in the early stages of a growth phase were more likely to survive. Wahl et al. predicted that although most mutations occur at the end of growth phases, mutations that are ultimately successful occur fairly uniformly throughout the growth phase.

The two papers described above included extinction during bottlenecks, but did not include the effects of genetic drift during the growth phase, i.e. the possibility of extinction of an advantageous mutant lineage between bottlenecks. Heffernan & Wahl (2002) incorporated the latter effect, assuming a Poisson distribution of offspring during the growth phase, and using a method based on the work of Ewens (1967). This model predicted a greater than 25 per cent reduction in the fixation probability for realistic experimental protocols, compared with that predicted by Wahl & Gerrish (2001).

The method presented by Heffernan & Wahl (2002) is valid for both large and small values of the selective advantage, s. This was an important extension of previous results, especially given the recent reports of large selective advantages in the experimental literature (Bull et al. 2000). When selection is weak and the mutation occurs at the beginning of a growth phase, Heffernan and Wahl derived the approximation π ≈ s(k−1), where k is the number of generations between bottlenecks. This approximation is analogous to the classic result π≈2s (Haldane 1927) but is increased by a factor of (k−1)/2.

The work discussed in this section considers only the loss of beneficial mutations due to bottlenecks and genetic drift. In reality, rare beneficial mutations in asexual populations may also be lost during the growth phase due to competition between multiple new beneficial alleles (see §5.3) or quasi-species interactions (see §5.2). Most importantly, the papers described above either assume deterministic growth between bottlenecks or discrete generation times with offspring numbers that are Poisson distributed. These are not ideal simplifications for many microbial populations. Thus, the tailored life-history models described in §6 should provide a more accurate approach to these questions, although they have not, as yet, been as fully developed as the papers described here.

#### 3.3 Dynamically changing population sizes

Three intriguing papers addressing population sizes that change dynamically, according to underlying birth and death events, appeared in 2006 and 2007.

Lambert (2006) developed an extension of the Moran (1958) model, assuming that birth events have a constant per capita rate, while death events have a per capita rate that increases with population density. Lambert addressed three model constructions: the first considered independent continuous-state branching processes; the second considered branching processes conditioned to produce a constant population size; and the third included logistic density dependence through a density-dependent death rate.

For the first and second models in the large population limit, Lambert pointed out that the factor 2 in Haldane's result of π≈2s for very small s stems from the assumption that the offspring distribution is Poisson. More generally, for near-critical branching processes, π≈2s/σ², where σ² is the variance of the offspring distribution (Haccou et al. 2005). Thus, increased reproductive variance always reduces the fixation probability in such models.
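The dependence on offspring variance is easy to check with a small branching-process simulation (the samplers and parameter values below are illustrative): Poisson offspring with mean 1+s have variance close to 1, giving survival near 2s, while a geometric offspring distribution with the same mean has variance close to 2 and roughly halves the survival probability.

```python
import numpy as np

def survival_branching(offspring, reps=10000, cap=10**4, rng=None):
    """Estimate the survival probability of a lineage founded by one
    mutant.  offspring(rng, n) returns the total offspring of n
    parents; the lineage counts as surviving once it reaches `cap`
    individuals (extinction from that size is negligible here)."""
    rng = np.random.default_rng(rng)
    surv = 0
    for _ in range(reps):
        n = 1
        while 0 < n < cap:
            n = offspring(rng, n)
        surv += n >= cap
    return surv / reps

s = 0.05
# Poisson offspring with mean 1+s: variance 1+s, so pi ~ 2s/(1+s)
poisson = lambda rng, n: rng.poisson((1 + s) * n)
# geometric offspring (support 0, 1, 2, ...) with the same mean 1+s:
# variance (1+s)(2+s), roughly 2, so pi is roughly halved
geometric = lambda rng, n: rng.geometric(1 / (2 + s), size=n).sum() - n
```

Running both samplers with the same s shows the Poisson survival estimate near 0.094 and the geometric estimate near 0.048, consistent with π≈2s/σ².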

For the third model, density dependence results in an upper asymptotic limit on the ‘invasibility coefficient’, that is, the rate at which the selective advantage of the mutant increases the fixation probability. Consequently, Lambert found that Haldane's classic approximation (π≈2s) and Kimura's diffusion approximation (equation (2.1)) tend to underestimate the fixation probability of beneficial mutations in growing populations and overestimate it in declining populations. This result is consistent with those of Parsons & Quince (2007a,b), described below, as well as the classic predictions of Fisher (1930), Kojima & Kelleher (1962) and Kimura & Ohta (1974).

Ultimately, Lambert derived a concise expression for the fixation probability, which holds for all three models. The limitation of this approach is that it holds only when the selective advantage of the beneficial mutation is small, such that higher order terms in s are negligible.

Parsons & Quince (2007a) introduced stochastic population sizes in a similar way. In contrast to the work of Lambert, Parsons and Quince considered density-dependent birth rates and density-independent death rates. Another key difference is that Parsons and Quince did not assume that selection is weak. In particular, they argued based on their results that the parameter space over which the assumptions in Lambert (2006) are valid may in fact be quite limited.

In the first case considered (the ‘non-neutral case’), the carrying capacities of the mutant and wild-type are not equal. For advantageous mutants, Parsons and Quince found that stochastic fluctuations in the wild-type population do not affect the fixation probability. On the other hand, for deleterious mutants, the fixation probability is proportional to the fluctuation size of the wild-type population, but relatively insensitive to initial density.

In a second paper, Parsons & Quince (2007b) investigated the ‘quasi-neutral’ case: the carrying capacities of mutant and wild-type are identical, but the birth and death rates are different. Since the carrying capacities are determined by a ratio of the birth and death rates, this implies a life-history trade-off between these parameters. Parsons and Quince used a diffusion approximation to determine the fixation probability when the carrying capacity is large. The authors predicted an increase in fixation probability for the type with a higher birth rate in growing populations and a reduction in a shrinking population. When the population is at carrying capacity initially, the type with a higher birth rate has larger fluctuations in population size and thus a reduced fixation probability.

A shared feature of the approaches described in this section is that beneficial mutations can affect more than one life-history parameter or ‘demographic trait’. Both models predict that the fixation probability depends on this mechanism of the selective advantage. This work is thus closely related to the more detailed life-history models described in §6 to follow.

### 4. Subdivided populations

Pollak (1966) was the first to address the question of the fixation probability (π) in a subdivided population. Pollak considered a situation in which K subpopulations occupy their respective habitats, with the possibility of migration between subpopulations. A branching process approach was used to deduce that for symmetric migration, π in a subdivided population is the same as that in a non-subdivided population. Later, for the case of symmetric migration, Maruyama (1970, 1974, 1977) used the Moran model with a diffusion approach to show that a similar result holds.

Populations structured into discrete demes were also studied by Lande (1979) and Slatkin (1981) among others. Lande (1979) demonstrated the elegant result that if a population is subdivided into demes, the net rate of evolution is the same as the rate of evolution in a single deme, where the rate of evolution is given by the probability of fixation of a single mutant multiplied by the number of mutations per generation in one deme. This result relies on the assumption that a mutation fixed in one deme can spread through the whole population only by random extinction and colonization. Slatkin (1981) then showed that for a given pressure of selection in each local population, the fixation probability of a mutant allele is bounded below by the appropriate fixation probability in an unstructured population of the same total size and above by the fixation probability obtained by assuming independent fixation in each deme. Slatkin found that the fixation probability is higher in the low-migration limit than in the high-migration limit when a heterozygote mutant has a fitness that is less than the arithmetic mean fitness of the two homozygote states (underdominance). The reverse was found to be true when the heterozygote was more fit than the average homozygote fitness (overdominance). This stands to reason: high migration increases the fixation probability in the overdominant case and decreases the fixation probability in the underdominant case.

Barton & Rouhani (1991) further investigated the fixation probability in a subdivided population, exploring the limiting case when migration is much larger than selection, so that the difference in gene frequency between adjacent demes is very small. In a model with two demes, π was greatly reduced by migration. This observation, however, did not extend to a large array of demes. Clarifying Slatkin's prediction that underdominance reduces the fixation probability, Barton and Rouhani showed that the chance of fixation is considerable despite free gene flow and moderate selection against heterozygotes, as long as the neighbourhood is small and the homozygote has a substantial advantage.

In contrast to Lande's result, Barton and Rouhani concluded that even though the fixation probability for any one mutation may be very low, the overall rate of fixation of any particular novel allele may be very high. This is because mutations can arise in any of a very large number of individuals; any mutation that is fixed in a large enough area has a high probability of spreading through the entire population.

Like previous models, Barton and Rouhani assumed that migration is symmetric. Relaxing this assumption, Tachida & Iizuka (1991) considered asymmetric migration under the condition of strong selection and found that spatial subdivision increases π. This observation was consistent with the numerical results of Pollak (1972). However, the model by Tachida and Iizuka considered only a two-patch population. Lundy & Possingham (1998) extended the two-patch models of previous authors to investigate π in three- and four-patch systems. When migration is asymmetric, Lundy and Possingham found that the influence of a patch on the overall fixation probability depends largely on two factors: the population size of the patch and the net gene flow out of the patch.

More recently, Gavrilets & Gibson (2002) have studied the fixation probabilities in a population that experiences heterogeneous selection in distinct spatial patches, and in which the total population size is constant. In this model, each allele is advantageous in one patch and deleterious in the other. The results in this contribution are in agreement with the arguments of Ohta (1972) and Eldredge (1995, 2003) that, depending on exactly how migration rates change with population size, selection can be more important in small populations than in large populations.

In a model of distinct patches, which focuses on extinctions and recolonizations, Cherry (2003) found that these two effects always reduce the fixation probability of a beneficial allele. Cherry's conclusion is consistent with Barton's (1993) observation for a favoured allele in an infinite population, but applies more generally. Cherry derived both an effective population size and an effective selection coefficient for beneficial alleles in this model, such that established results for unstructured populations can be applied to structured populations. In this exposition, Cherry (2003) assumed that an extinct patch can be recolonized by only one founding allele; a subsequent paper (Cherry 2004) explored the case of more than one founding allele after extinction, confirming that extinction and recolonization reduce the fixation probability for beneficial alleles.

Whitlock (2003) relaxed some of the assumptions in previous structured population models to study the fixation of alleles that confer either beneficial or deleterious effects, with arbitrary dominance. Whitlock constructed a model that allows for an arbitrary distribution of reproductive success among demes, although selection is still homogeneous. He found that in a ‘differentially productive environment’, the effective population size is reduced relative to the census size and thus the probability of fixation of deleterious alleles is enhanced, while that of beneficial alleles is decreased. In a further paper, Whitlock & Gomulkiewicz (2005) examined the question of fixation probability in a metapopulation when selection is heterogeneous among demes. In contrast to the metapopulations with homogeneous selection, Whitlock and Gomulkiewicz concluded that the heterogeneity in selection never reduced (and sometimes substantially enhanced) the fixation probability of a new allele. They found that the probability of fixation is bounded below and above by approximations based on high- and low-migration limits, respectively.

An alternative realization of a spatially structured model was studied by Gordo & Campos (2006) who determined the rate of fixation of beneficial mutations in a population inhabiting a two-dimensional lattice. Under the assumption that deleterious mutations are absent and that all beneficial mutations have equal quantitative effect, Gordo and Campos found that the imposition of spatial structure did not change the fixation probability of a single, segregating beneficial mutation, relative to an unstructured haploid population (in agreement with the findings of Maruyama 1970). However, interestingly, spatial structure reduced the substitution rate of beneficial mutations if either deleterious mutations or clonal interference (more than one beneficial mutation segregating simultaneously) were added to the model. In an elegant example of experimental and theoretical interactions, the conclusions of Gordo and Campos were experimentally substantiated by Perfeito et al. (2008) who studied bacterial adaptation in either unstructured (liquid) or structured (solid) environments.

From the overview above, it is clear that an extremely rich literature surrounding the fixation probability in subdivided populations has been developed. In particular, Whitlock's recent work has relaxed a large number of the limiting assumptions in earlier papers, encompassing beneficial or deleterious mutations, arbitrary dominance, heterogeneous selection and asymmetric migration. As argued by Whitlock & Gomulkiewicz (2005), some intriguing questions remain. For example, it seems likely that multiple alleles could be simultaneously segregating in different demes; this case has not yet been treated in a subdivided population, although it is related to §5 below.

### 5. Multiple segregating alleles

In §4 above we have discussed the fixation probability in populations that are spatially subdivided (i.e. spatially heterogeneous populations). In analogy, here we consider populations that are divided into a variety of genetic rather than geographical backgrounds. This genetic heterogeneity can occur when multiple alleles are segregating simultaneously at the same locus or when contributions from other linked loci are considered. In general, the literature surrounding these questions suggests numerous possibilities for new work.

#### 5.1 Effects of linked and deleterious alleles

The effects of linked loci on the fixation probability of a beneficial mutation have been extensively studied, beginning with the ideas of Fisher (1922) and Hill & Robertson (1966). Peck (1994), in particular, focused on the fixation probability of a beneficial mutation in the presence of linked deleterious mutations, finding that deleterious mutations greatly reduce the fixation probability in asexual, but not sexual, populations. A more detailed model is presented by Charlesworth (1994) who derived expected substitution rates and fixation probabilities for beneficial alleles when the deleterious alleles are at completely linked loci. A key result of this work is that deleterious linked loci reduce the effective population size, by a factor given by the frequency of mutation-free gametes.

Barton (1994, 1995) derived a more comprehensive method for computing the fixation probability of a favourable allele in different genetic backgrounds. For a single large heterogeneous population, Barton found that loosely linked loci reduce fixation probability through a reduction in the effective population size, by a factor that depends on the additive genetic variance. At tightly linked loci, however, Barton demonstrated that deleterious mutations, substitutions and fluctuating polymorphisms each reduce the fixation probability in a way that cannot be simply captured by an effective population size.

The study of linked loci was extended by Johnson & Barton (2002) who estimated the fixation probability of a beneficial mutation in an asexual population of fixed size, in which recurrent deleterious mutations occur at a constant rate at linked loci. Johnson and Barton assumed that each deleterious mutation reduces the fitness of the carrier by a factor of (1−sd) (i.e. any deleterious mutation has the same quantitative effect on fitness). Furthermore, it is assumed that the beneficial mutation increases the fitness of an individual carrier by a factor of (1+sb) regardless of the number of deleterious mutations present in the carrier. Thus, the relative fitness of an individual with a beneficial mutation and i deleterious mutations is wi = (1+sb)(1−sd)^i. Johnson and Barton estimated the fixation probability by summing fiPi, where fi is the probability that a beneficial mutation arises in an individual with i deleterious mutations and Pi, given by the solution of simultaneous equations, is the probability that a beneficial mutation arising in such an individual is not ultimately lost. Johnson and Barton were thus able to quantify the reduction in the fixation probability of a beneficial mutation due to interference from segregating deleterious mutations at linked loci. Interestingly, this result is then used to determine the expected rate of increase in population fitness and the mutation rate that maximizes this fitness increase.
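The bookkeeping π = Σi fiPi can be sketched as follows. This is a deliberately crude stand-in, not Johnson and Barton's actual solution: fi is taken from the Poisson mutation–selection balance (mean U/sd, with U the deleterious mutation rate), and Pi is replaced by the naive branching estimate 2(sb − i·sd), set to zero when negative, which ignores further deleterious mutation within the focal lineage.

```python
import math

def fixation_with_background(sb, sd, U, max_i=200):
    """Crude sketch of pi = sum_i f_i * P_i: f_i is the Poisson
    mutation-selection-balance frequency of backgrounds carrying i
    deleterious mutations; P_i is the naive branching estimate
    2*(sb - i*sd), truncated at zero (an assumption for illustration,
    not Johnson & Barton's simultaneous-equation solution)."""
    lam = U / sd              # mean number of deleterious mutations carried
    f_i = math.exp(-lam)      # Poisson weight for i = 0
    pi = 0.0
    for i in range(max_i):
        P_i = max(0.0, 2 * (sb - i * sd))
        pi += f_i * P_i
        f_i *= lam / (i + 1)  # next Poisson weight, computed iteratively
    return pi
```

For sb = 0.05, sd = 0.02 and U = 0.1 this sketch gives a fixation probability near 0.004, an order of magnitude below the unlinked baseline 2sb = 0.1, illustrating how strongly segregating deleterious backgrounds can suppress fixation.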

#### 5.2 Quasi-species fixation

Quasi-species theory describes the evolution of a very large asexually reproducing population that has a high mutation rate (Eigen & Schuster 1979; Eigen et al. 1988, 1989; Domingo et al. 2001). This theory is often cited in describing the evolution of RNA viruses (Domingo et al. 2001; Wilke 2003; Manrubia et al. 2005; Jain & Krug 2007). Several authors have questioned the relevance of quasi-species theory to viral evolution (Moya et al. 2000; Jenkins et al. 2001; Holmes & Moya 2002), arguing that the mutation rates necessary to sustain a quasi-species are unrealistically high. In contrast, however, Wilke (2005) reviewed the related literature and argued that quasi-species theory is the appropriate model for the population genetics of many haploid, asexually reproducing organisms.

In typical models of population genetics, it is assumed that mutations are rare events, such that an invading mutant strain will not mutate again before fixation or extinction occurs. In contrast, in quasi-species models, the offspring of a mutated individual are very likely to mutate before fixation. Consequently, the fitness of an invading quasi-species is not solely determined by the fitness of the initial/parent mutant, but depends on the average fitness of the ‘cloud’ of offspring mutants related to that parent, continually introduced by mutation and removed through selection (the ‘mutation–selection balance’). In quasi-species theory, therefore, the fixation of a mutant is defined to be its establishment as a common ancestor of the whole population; since the population is never genetically identical, the standard definition does not apply.

Wilke (2003) first investigated the fixation probability of an advantageous mutant in a viral quasi-species. This contribution uses multitype branching processes to derive an expression for the fixation probability in an arbitrary fitness landscape. Wilke initially assumed that mutations capable of forming a new invading quasi-species are rare. Thus, while mutations within the quasi-species are abundant, only one quasi-species will be segregating from the wild-type quasi-species at any given time. Under this assumption, the fixation probability was determined for fixation events that increase the average fitness of the population (situations where the average fitness is reduced or left unchanged were not addressed). If πi denotes the probability of fixation of sequence i, that is, the probability that the cascade of offspring spawned by sequence i does not go extinct, and Mij gives the expected number of offspring of type j from sequences of type i in one generation, Wilke demonstrated that the vector of fixation probabilities satisfies π = 1 − e^{−Mπ}, with the convention that the exponential of a vector is applied elementwise.
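A minimal numerical sketch: for a multitype branching process with Poisson-distributed offspring, the fixation probabilities satisfy πi = 1 − exp(−Σj Mij πj), which can be solved by fixed-point iteration. The matrices below are illustrative, not from Wilke's paper.

```python
import numpy as np

def quasispecies_fixation(M, tol=1e-12, max_iter=100000):
    """Solve pi = 1 - exp(-M @ pi) by fixed-point iteration.  M[i][j]
    is the expected number of type-j offspring of a type-i individual
    per generation; the exponential acts elementwise (multitype
    branching process with Poisson offspring)."""
    M = np.asarray(M, dtype=float)
    pi = np.full(M.shape[0], 0.5)  # any positive start converges here
    for _ in range(max_iter):
        new = 1.0 - np.exp(-M @ pi)
        if np.max(np.abs(new - pi)) < tol:
            return new
        pi = new
    return pi
```

For a single type with mean offspring 1.05 this recovers the classical single-type survival probability (about 0.094, near Haldane's 2s = 0.1); for two non-interacting types the fitter type has the larger fixation probability, as expected.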

As discussed more fully in §6, estimates of the fixation probability are extremely sensitive to assumptions regarding the life history of the organism. Wilke's elegant result is a generalization of Haldane's approach, retaining the assumptions of discrete, non-overlapping generations and Poisson-distributed offspring. As these assumptions are not particularly well suited for the life history of viruses, it remains unclear which conclusions of this study would hold in viral populations.

#### 5.3 Clonal interference

In a genetically homogeneous asexual population, two or more beneficial mutations may occur independently in different individuals of the population. Clonal interference refers to the competition that ensues between the lineages of these independent mutations, potentially altering the fate of the lineages. The idea that competing beneficial mutations may hinder a beneficial mutation's progress to fixation was formulated by Muller (1932, 1964) in his discussions on the evolutionary advantage of sex. Since that time, numerous studies have been conducted on the subject of clonal interference; in the last decade a rich literature, both experimental and theoretical, has developed, sparked by renewed interest in the adaptation of asexual populations in laboratory settings.

A review of this growing literature would be substantial, and is outside the scope of this contribution, relating more closely to adaptation and adaptation rates than to fixation and extinction probabilities, narrowly defined. However, we give a brief overview of the standard means of estimating fixation probabilities under clonal interference, and refer the reader to other recent contributions (Campos & de Oliveira 2004; Campos et al. 2004, 2008; Rosas et al. 2005; De Visser & Rozen 2006).

Gerrish & Lenski (1998) published the first discussion of fixation probabilities under clonal interference. Gerrish and Lenski considered the possibility that while an initial beneficial mutation is not yet fixed, a set of other mutations may emerge in the population. If at least some of these mutations survive extinction when rare (for example, due to genetic drift), a competition ensues between the focal mutation and the subsequent mutations. Assuming that the probability density for the selective advantage of beneficial mutations is given by αe^{−αs}, Gerrish and Lenski stated that the probability that the focal mutation fixes is π(s)e^{−λ(s)}. The function π(s) gives the probability that a given beneficial mutation is not lost through drift when rare, while the function λ(s) gives the mean number of mutations that (i) occur before the focal mutation fixes, (ii) have a higher s than the focal mutation, and (iii) survive drift. We note that λ(s) is also a function of the population size, the mutation rate and α. Under the assumption that mutations appear spontaneously at a constant rate, e^{−λ(s)} then gives the probability that zero superior mutations occur, and survive drift, before the focal mutation fixes. This basic structure for the fixation probability during clonal interference has been augmented in subsequent contributions (Campos & de Oliveira 2004; Campos et al. 2004). The most interesting prediction of this work is that at high mutation rates, clonal interference imposes a ‘speed limit’ on the rate of adaptation.
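The structure π(s)e^{−λ(s)} can be sketched numerically. Everything below is illustrative: the drift-survival function is taken as Haldane's 2s, and the sweep duration ln(N)/s is a stand-in for the expression used by Gerrish and Lenski, not their actual λ(s).

```python
import math

def pr_fix_clonal_interference(s, N, U, alpha):
    """Sketch of the Gerrish & Lenski structure pi(s)*exp(-lambda(s)),
    with two illustrative stand-ins: drift survival pi(s) = 2s
    (Haldane) and sweep duration T = ln(N)/s.  lambda(s) counts
    superior mutations that arise during the sweep and survive drift,
    with effects drawn from the density alpha*exp(-alpha*x)."""
    T = math.log(N) / s
    # integral over x > s of alpha*exp(-alpha*x) * 2x dx
    #   = 2 * exp(-alpha*s) * (s + 1/alpha)
    lam = N * U * T * 2 * math.exp(-alpha * s) * (s + 1 / alpha)
    return 2 * s * math.exp(-lam)
```

Even this toy version reproduces the qualitative prediction: as the beneficial mutation supply NU grows, e^{−λ(s)} collapses and the fixation probability of any focal mutation falls far below 2s.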

There is a small conceptual flaw in this derivation (P. Gerrish 2000, personal communication): it neglects the possibility that other beneficial mutations were already segregating before the initial appearance of the focal mutation. If many mutations are segregating simultaneously, the focal beneficial mutation is likely to have arisen on the background of a previously segregating beneficial mutation. Thus mutations may sweep in groups, the ‘multiple mutation’ regime. Conceptually, the multiple mutation regime lies on a continuum between clonal interference as described by Gerrish & Lenski (1998) and quasi-species dynamics.

The dynamics of adaptation in the multiple mutation regime have been recently described in some detail (Desai & Fisher 2007 Desai et al. 2007). In contrast to the work of Gerrish & Lenski (1998), these authors predicted that clonal interference may not always reduce adaptation rates. Like Gerrish and Lenski, this approach depends on the underlying probability that a beneficial mutation escapes extinction through drift when rare, and assumes that this probability is proportional to s.

### 6. Life-history models

In almost every contribution discussed so far, beneficial mutations are assumed to increase the average number of offspring: so-called ‘fecundity mutants’. For many organisms, however, a mutant may have the same average number of offspring as the wild-type, but may produce these offspring in a shorter generation time: ‘generation time mutants’. An example here is bacterial fission in the presence of antibiotics: many antibiotics reduce cell growth and thus mutations conferring resistance have a reduced generation time.

This issue was first addressed by Wahl & DeHaan (2004), who approximated the fixation probability of beneficial generation time mutants (πG) in a population of constant size or a population that grows between periodic bottlenecks. The approach is closely related to that of Pollak (2000). In a model with a Poisson offspring distribution with mean 2 and weak selection, it was found that πG ≈ s/ln(2) for a constant population size, while πG ≈ τs/(2 ln(2)) when τ, the number of generations between population bottlenecks, is moderately large. For a mutation that increases fecundity, the analogous approximation is π≈2s in a constant population size (Haldane 1927), while an estimate of π ≈ τs was obtained for a population with a moderately large τ (Heffernan & Wahl 2002). Thus, assuming that all mutations confer a fecundity advantage leads to an overestimate by a factor of 2 ln(2) ≈ 1.4 for generation time mutations.

These results emphasize the sensitivity of fixation probabilities to the underlying life history of the organism being modelled, and to the specific effect of the beneficial mutation on this life history. Based on these results, Hubbarde and co-authors studied the fixation probability of beneficial mutations in a ‘burst–death model’ (Hubbarde et al. 2007; Hubbarde & Wahl 2008). This model is based on the well-known continuous-time branching process called the birth–death process, in which each individual faces a constant probability of death, and a constant probability of undergoing a birth event, in any short interval of time. Thus, the generation time or lifetime of each individual is exponentially distributed.

In contrast to a birth–death model, however, a burst event can add more than one offspring to the population simultaneously (a burst of two might model bacterial fission; a burst of 100 might model a lytic virus). The burst–death model explored by Hubbarde et al. treats populations in which the expected size is constant (i.e. the death rate balances the burst rate), and populations that grow between periodic bottlenecks. Hubbarde et al. computed the fixation probability for mutations that confer an advantage by increasing either the burst size or the burst rate. This work was extended by Alexander & Wahl (2008), who compared the fixation probability of mutations with equivalent effects on the long-term growth rate, i.e. equally ‘fit’ mutations. The latter paper demonstrates that mutations that decrease the death rate (increasing survival) are most likely to fix, followed by mutations that increase the burst rate. Mutations that increase the burst size are least likely to fix in the burst–death model.
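The branching-process logic behind these results can be illustrated with a toy Monte Carlo sketch (a deliberate simplification for illustration only, not the actual burst–death model of Hubbarde et al.; all parameter values here are hypothetical): each individual either dies or bursts into k offspring, and a rare lineage is scored as escaping drift loss once it reaches a threshold size.

```python
import random

def escape_probability(b, d, k, trials=20000, threshold=100):
    """Estimate the probability that a lineage founded by a single
    individual escapes early extinction. Each individual bursts at
    rate b (it is replaced by k offspring) or dies at rate d; only
    the embedded jump chain matters for this question, so each event
    is a burst with probability b/(b+d) and a death otherwise."""
    p_burst = b / (b + d)
    escapes = 0
    for _ in range(trials):
        n = 1
        while 0 < n < threshold:
            if random.random() < p_burst:
                n += k - 1   # parent replaced by k offspring
            else:
                n -= 1       # death of one individual
        escapes += n >= threshold
    return escapes / trials

random.seed(1)
# Wild-type rates satisfy b*(k-1) = d (constant expected size); a
# mutant with a 5% higher burst rate escapes loss only rarely:
print(escape_probability(b=1.05, d=1.0, k=2))
```

Even this toy version reproduces the qualitative point of the text: the escape probability of a rare beneficial lineage is small and depends on exactly how the advantage enters the life history.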

The important departure in the burst–death model from previous work is that a beneficial mutation may affect a number of life-history traits independently. Thus, the mean number of offspring can change independently of p0, the probability of having zero offspring. While the mean largely determines the long-term growth rate, or Malthusian fitness, of the mutant, the fixation probability is sensitive to short-term processes, particularly p0.

By contrast, when generation times are fixed and offspring numbers are Poisson distributed, the only way for a mutation to be beneficial is for it to increase the mean number of offspring, by a factor typically denoted (1+s). The probability of leaving zero offspring is completely constrained by this mean, and this ultimately implies that fixation probabilities, while perhaps not equal to 2s, are at least proportional to s under these classic assumptions.
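Under these classic assumptions, the fixation (strictly, survival) probability π of a branching process with Poisson offspring of mean 1+s is the positive root of π = 1 − exp(−(1+s)π). A short numerical sketch (bisection; an illustration of the standard result, not code from any of the papers reviewed) shows that π indeed tracks, and sits slightly below, 2s:

```python
import math

def fixation_probability(s, tol=1e-12):
    """Positive root of pi = 1 - exp(-(1+s)*pi): the survival
    probability of a branching process with Poisson offspring of
    mean 1+s, found by bisection on [tol, 1]."""
    f = lambda p: 1.0 - math.exp(-(1.0 + s) * p) - p
    lo, hi = tol, 1.0   # f(lo) > 0 and f(hi) < 0 for any s > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for s in (0.01, 0.05, 0.1):
    # pi is close to, but slightly below, the classic 2s approximation
    print(s, fixation_probability(s), 2 * s)
```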

This simple proportionality no longer holds when more complicated, and thus more realistic, life histories are considered. The overall conclusion here is that for many real populations, estimates of the fixation probability should take into account both the life-history details of the organism and the mechanism by which the mutation confers a reproductive advantage.

### 7. From theory to experiment

The experimental study of evolution has recently been accelerated through the study of rapidly evolving organisms, such as bacteria, viruses and protozoa (Lenski et al. 1991; Lenski & Travisano 1994; Papadopoulos et al. 1999). These organisms adapt to laboratory conditions on experimentally feasible time scales, making them ideal candidates for the real-time study of evolution. These experiments have generated tremendous interest in evolutionary biology, allowing for experimental tests of some of the most basic features of adaptation.

To date, however, the fixation probability of a specific beneficial mutation has never been experimentally measured. With the advent of serial passaging techniques that allow for experimental designs with very high numbers of replicates (e.g. 96-well plates), we argue that an experimental estimate of the fixation probability is finally within reach. After 80 or 90 years of theory, the possibility of experimental validation is fascinating.

On the other hand, the models developed to date are probably not sufficiently tailored to the life histories of the organisms that could be used in such experiments. Neither bacteria nor viruses are well modelled by discrete, non-overlapping generations, nor by a Poisson distribution of offspring. Recent contributions by Parsons & Quince (2007a,b) and Lambert (2006), as well as work from our own group (Hubbarde et al. 2007; Alexander & Wahl 2008), have highlighted the extreme sensitivity of fixation probabilities to such assumptions.

For experiments involving bacteria, we suggest that theoretical predictions of the fixation probability must be based specifically on bacterial fission. A beneficial mutation might reduce the generation time, for example, or increase the probability that one or both of the daughter cells survive to reproductive maturity. For experiments involving viruses, theoretical predictions must likewise be tailored to include the processes of viral attachment, the eclipse time and then the release of new viral particles through budding or lysis. Other microbial systems will present their own life histories and their own modelling challenges. In addition, population bottlenecks, washout from a chemostat or limited resources must be imposed in experimental systems to prevent unbounded microbial growth.

A final note is that very often, in estimating the fixation probability, it is assumed that selection is weak. This means, for example, that the selective advantage s is sufficiently small that terms of order s² are negligible. This assumption has been widely, and very usefully, employed in population genetics over decades, and is still considered to be relevant to most natural populations. Recent evidence from the experimental evolution of microbial populations, however, has indicated that some beneficial mutations exert extremely high selection pressures, with s of the order of 10 or more (Bull et al. 2000). Thus, a further challenge for theoreticians is to design organism- and protocol-specific models that retain accuracy and tractability, even for very strong selective effects.

The authors are grateful to four anonymous referees, whose comments strengthened this review, and to the Natural Sciences and Engineering Research Council of Canada for funding.

## Odds Ratio and gene mutations association

I have a seemingly simple (yet, to me, not trivial) problem to submit to you.

I have a dataset of a group of patients affected by a disease, for which the presence of several gene mutations was inferred. Each gene is a variable coded 0 for negative and 1 for positive. I need to assess the presence of associations between these genes, to establish whether some tend to be co-mutated while others tend to be mutually exclusive. To do this, I first analyzed all possible gene combinations in 2x2 contingency tables such as:

in this case, for example, the p-value is very significant, so I thought it could be useful to compute the OR to establish a relationship. Here, for example, the OR obtained from the formula OR = (A × D)/(B × C) is 0.53, hence it should mean that the two genes tend to go more in opposite directions (0-1 or 1-0) than in the same direction (0-0 or 1-1). However, my concern is that in this way it is not clear whether the two genes have a positive or negative correlation. Should I just compare the double positives (1-1) against the total of discordant cases (0-1 and 1-0)? In this case it would be 69/(131+428) = 69/559 = 0.12. Is this useful?

However, each gene has a different % of mutation within the population: for example, gene 1 here has a .18 probability of being mutated whilst gene 2 has a .46 probability. Should I take this into account? I played around and tried to see what these 4 combinations would look like if they were due only to each of the two genes' expected mutation frequencies, and something like that came out: the final totals are the same, but the numbers are redistributed according to the expected frequencies (i.e., the total no. of mutated gene 1 cases is 195/1059 = 0.18, which is the expected mutation frequency of the gene). I then computed another OR for these numbers (12.76) and compared it with the previous one using Tarone's test of homogeneity between the two tables (in this case, the p-value is significant). From the simple division of each category in the "real life" table by the "expected frequencies" table I obtained a ratio (i.e., 0/0 ratio = 431/542 = 0.79, there are fewer double negatives than expected). Do you think this is correct reasoning? If so, should I use the 1/1 ratio to know if the relation is positive or negative (in this case 69/172 = 0.4, there are fewer double positives than expected, so the genes are inversely correlated)?
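As a point of reference, the quantities discussed in this question can be computed directly from the four cell counts; here is a minimal pure-Python sketch (the 2×2 layout is reconstructed from the numbers in the question, so the assignment of the two discordant counts to particular genes is an assumption — though it does not affect the odds ratio):

```python
# Cell counts taken from the question: 431 double-negative (0-0),
# 69 double-positive (1-1), and 131 and 428 discordant cases.
a, b, c, d = 431, 131, 428, 69   # a = 0-0, b and c = discordant, d = 1-1
n = a + b + c + d                # 1059 patients in total

# Odds ratio: < 1 means the two mutations co-occur less often than
# the discordant patterns, i.e. a tendency towards mutual exclusivity.
odds_ratio = (a * d) / (b * c)
print(round(odds_ratio, 2))      # ~0.53, matching the value above

# Expected counts under independence use the marginal mutation
# frequencies; comparing observed 1-1 with this expectation answers
# the positive-vs-negative question directly.
p_gene1 = (b + d) / n            # ~0.19 (close to the 0.18 quoted)
p_gene2 = (c + d) / n            # ~0.47
expected_11 = n * p_gene1 * p_gene2
print(round(expected_11, 1))     # observed 69 < expected -> negative association
```

Note that the independence-based expected table always has an OR of exactly 1, so the observed double-positive count falling below its independence expectation is just another way of seeing the OR < 1.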

## Georgia Institute of Technology School of Mathematics | Atlanta, GA

Suppose that n sample genomes are collected from the same population. The expected sample frequency spectrum (SFS) is the vector of probabilities that a mutation chosen at random will appear in exactly k out of the n individuals. This vector is known to be highly dependent on the population size history (demography); for this reason, geneticists have used it for demographic inference. What does the set of all possible vectors generated by demographies look like? What if we specify that the demography has to be piecewise-constant with a fixed number of pieces? We will draw on tools from convex and algebraic geometry to answer these and related questions.
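As a concrete reference point (a standard coalescent result, not part of the abstract above): for a constant-size population, the expected SFS is proportional to 1/k, so the normalized vector can be written down directly:

```python
def expected_sfs(n):
    """Normalized expected site frequency spectrum for a sample of n
    genomes from a constant-size population: the probability that a
    random segregating mutation appears in exactly k of the n samples
    is proportional to 1/k, for k = 1 .. n-1."""
    weights = [1.0 / k for k in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]

sfs = expected_sfs(10)
print([round(p, 3) for p in sfs])  # singletons are the most common class
```

Demographic inference then amounts to asking how an observed SFS deviates from this constant-size baseline.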


## Materials and methods

### Plasmids

Gene synthesis for all plasmids generated by this study was performed, and the sequences confirmed, by GeneImmune Biotechnology (Rockville, MD). The SARS-CoV-2 spike protein ectodomain constructs comprised S protein residues 1 to 1208 (GenBank: MN908947) with the D614G mutation, the furin cleavage site (RRAR, residues 682-685) mutated to GSAS, a C-terminal T4 fibritin trimerization motif, a C-terminal HRV3C protease cleavage site, a TwinStrepTag and an 8XHisTag. All spike ectodomains were cloned into the mammalian expression vector pαH and have been deposited to Addgene (42) (https://www.addgene.org) under the codes 171743, 171744, 171745, 171746, 171747, 171748, 171749, 171750, 171751 and 171752. For the ACE2 construct, the C terminus was fused to a human Fc region (19).

### Cell culture and protein expression

GIBCO FreeStyle 293-F cells (embryonal, human kidney) were maintained at 37°C and 9% CO2 in a 75% humidified atmosphere in FreeStyle 293 Expression Medium (GIBCO). Plasmids were transiently transfected using Turbo293 (SpeedBiosystems) and incubated at 37°C, 9% CO2, 75% humidity with agitation at 120 rpm for 6 days. On the day following transfection, HyClone CDM4HEK293 medium (Cytiva, MA) was added to the cells. Antibodies were produced in Expi293F cells (embryonal, human kidney, GIBCO). Cells were maintained in Expi293 Expression Medium (GIBCO) at 37°C, 8% CO2 and 75% humidity with agitation at 120 rpm. Plasmids were transiently transfected using the ExpiFectamine 293 Transfection Kit and protocol (GIBCO) (9, 19, 55).

### Protein purification

On the 6th day post transfection, spike ectodomains were harvested from the concentrated supernatant. The spike ectodomains were purified using StrepTactin resin (IBA LifeSciences) and size exclusion chromatography (SEC) on a Superose 6 10/300 GL Increase column (Cytiva, MA) equilibrated in 2 mM Tris, pH 8.0, 200 mM NaCl, 0.02% NaN3. All steps of the purification were performed at room temperature and in a single day. Protein quality was assessed by SDS-PAGE using NuPAGE 4-12% gels (Invitrogen, CA). The purified proteins were flash frozen and stored at -80°C in single-use aliquots. Each aliquot was thawed by a 20-min incubation at 37°C before use. Antibodies were purified by Protein A affinity and digested to their Fab state using LysC. ACE2 with human Fc tag was purified by Protein A affinity chromatography and SEC (19). RBD constructs were produced and purified as described in Saunders et al. (56).

Antibody binding to SARS-CoV-2 spike and RBD constructs was assessed using SPR on a Biacore T-200 (Cytiva, MA, formerly GE Healthcare) with HBS buffer supplemented with 3 mM EDTA and 0.05% surfactant P-20 (HBS-EP+, Cytiva, MA). All binding assays were performed at 25°C. Spike variants were captured on a Series S Streptavidin (SA) chip (Cytiva, MA) by flowing over 200 nM of the spike for 60 s at a 10 μL/min flow rate. The Fabs were injected at concentrations ranging from 0.625 nM to 800 nM (2-fold serial dilution) using the single-cycle kinetics mode with 5 concentrations per cycle. For the single-injection assay, the Fabs were injected at a concentration of 200 nM. A contact time of 60 s and a dissociation time of 120 s (3600 s for DH1047 for the single-cycle kinetics) at a flow rate of 50 μL/min were used. The surface was regenerated after each dissociation phase with 3 pulses of a 50 mM NaOH + 1 M NaCl solution for 10 s at 100 μL/min. For the RBDs, the antibodies were captured on a CM5 chip (Cytiva, MA) coated with Human Anti-Fc (using the Cytiva Human Antibody Capture Kit and protocol) by flowing over a 100 nM antibody solution at a flow rate of 5 μL/min for 120 s. The RBDs were then injected at 100 nM for 120 s at a flow rate of 50 μL/min with a dissociation time of 30 s. The surface was regenerated with 3 consecutive pulses of 3 M MgCl2 for 10 s at 100 μL/min. Sensorgram data were analyzed using the BiaEvaluation software (Cytiva, MA).

### Negative-stain electron microscopy

Samples were diluted to 100 μg/ml in 20 mM HEPES pH 7.4, 150 mM NaCl, 5% glycerol, 7.5 mM glutaraldehyde (Electron Microscopy Sciences, PA) and incubated for 5 min before quenching the glutaraldehyde by the addition of 1 M Tris (to a final concentration of 75 mM) and 5 min incubation. A 5-μl drop of sample was applied to a glow-discharged carbon-coated grid (Electron Microscopy Sciences, PA, CF300-Cu) for 10-15 s, blotted, stained with 2% uranyl formate (Electron Microscopy Sciences, PA), blotted and air-dried. Images were obtained using a Philips EM420 electron microscope at 120 kV, 82,000× magnification, and a 4.02 Å pixel size. The RELION (57) software was used for particle picking, and 2D and 3D class averaging.

### ELISA assays

Spike ectodomains were tested for antibody- or ACE2-binding in ELISA assays as previously described (32). Assays were run in two formats, i.e., antibodies/ACE2 coated or spike coated. For the first format, the assay was performed on 384-well plates coated at 2 μg/mL overnight at 4°C, washed, blocked and followed by two-fold serially diluted spike protein starting at 25 μg/mL. Binding was detected with polyclonal anti-SARS-CoV-2 spike rabbit serum (developed in our lab), followed by goat anti-rabbit-HRP (Abcam, Ab97080) and TMB substrate (Sera Care Life Sciences, MA). Absorbance was read at 450 nm. In the second format, serially diluted spike protein was bound in wells of a 384-well plate previously coated with streptavidin (Thermo Fisher Scientific, MA) at 2 μg/mL and blocked. Proteins were incubated at room temperature for 1 hour, washed, then human mAbs were added at 10 μg/mL. Antibodies were incubated at room temperature for 1 hour, washed, and binding was detected with goat anti-human-HRP (Jackson ImmunoResearch Laboratories, PA) and TMB substrate.

### Cryo-EM

Purified SARS-CoV-2 spike ectodomains were diluted to a concentration of 1.5 mg/mL in 2 mM Tris pH 8.0, 200 mM NaCl and 0.02% NaN3, and 0.5% glycerol was added. A 2.3-μL drop of protein was deposited on a Quantifoil-1.2/1.3 grid (Electron Microscopy Sciences, PA) that had been glow discharged for 10 s using a PELCO easiGlow Glow Discharge Cleaning System. After a 30-s incubation in >95% humidity, excess protein was blotted away for 2.5 s before the grid was plunge frozen into liquid ethane using a Leica EM GP2 plunge freezer (Leica Microsystems). Frozen grids were imaged using a Titan Krios (Thermo Fisher) equipped with a K3 detector (Gatan). The cryoSPARC (58) software was used for data processing. Phenix (54, 59), Coot (60), Pymol (61), Chimera (62), ChimeraX (63) and Isolde (64) were used for model building and refinement.

### Vector based structure analysis

Vector analysis of intra-protomer domain positions was performed as described previously (19) using the Visual Molecular Dynamics (VMD) (65) software package Tcl interface (66). For each protomer of each structure, Cα centroids were determined for the NTD (residues 27 to 69, 80 to 130, 168 to 172, 187 to 209, 216 to 242, and 263 to 271), NTD′ (residues 44 to 53 and 272 to 293), RBD (residues 334 to 378, 389 to 443, and 503 to 521), SD1 (residues 323 to 329 and 529 to 590), SD2 (residues 294 to 322, 591 to 620, 641 to 691, and 692 to 696), CD (residues 711 to 716 and 1072 to 1121), and an S2 sheet motif (S2s; residues 717 to 727 and 1047 to 1071). Additional centroids for the NTD (NTDc; residues 116 to 129 and 169 to 172) and RBD (RBDc; residues 403 to 410) were determined for use as reference points for monitoring the orientations of the NTD and RBD relative to the NTD′ and SD1, respectively. Vectors were calculated between the following within-protomer centroids: NTD to NTD′, NTD′ to SD2, SD2 to SD1, SD2 to CD, SD1 to RBD, CD to S2s, NTDc to NTD, and RBD to RBDc. Vector magnitudes, angles, and dihedrals were determined from these vectors and centroids. Inter-protomer domain vector calculations for the SD2, SD1, and NTD′ used these centroids in addition to anchor-residue Cα positions for each domain: SD2 residue 671 (SD2a), SD1 residue 575 (SD1a), and NTD′ residue 276 (NTD′a). These were selected based upon visualization of position variation in all protomers used in this analysis via alignment of each full domain in PyMol (61). Vectors were calculated for the following: NTD′ to NTD′a, NTD′ to SD2, SD2 to SD2a, SD2 to SD1, SD1 to SD1a, and SD1 to NTD′. Angles and dihedrals were determined from these vectors and centroids. Vectors for the RBD to the adjacent RBD and the RBD to the adjacent NTD were calculated using the above RBD, NTD, and RBDc centroids. Vectors were calculated for the following: RBD2 to RBD1, RBD3 to RBD2, and RBD3 to RBD1.
Angles and dihedrals were determined from these vectors and centroids. Principal components analysis, K-means clustering, and Pearson correlation (confidence interval 0.95, p < 0.05) analysis of the vector sets were performed in R (67). Data were centered and scaled for the PCA analyses.
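The angle and dihedral calculations used throughout this analysis reduce to standard vector geometry on the centroid coordinates; a generic pure-Python sketch of those two operations (an illustration only, not the authors' Tcl implementation):

```python
import math

def sub(p, q): return tuple(a - b for a, b in zip(p, q))
def dot(u, v): return sum(a * b for a, b in zip(u, v))
def norm(u): return math.sqrt(dot(u, u))
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def angle(p1, p2, p3):
    """Angle at p2 (degrees) formed by centroids p1-p2-p3."""
    u, v = sub(p1, p2), sub(p3, p2)
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))

def dihedral(p1, p2, p3, p4):
    """Signed dihedral (degrees) about the p2-p3 axis for the
    centroid chain p1-p2-p3-p4."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    m1 = cross(n1, tuple(x / norm(b2) for x in b2))
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))
```

For example, four points in a plane give a dihedral of 0° (cis) or ±180° (trans), which is a convenient sanity check before applying the functions to real centroid coordinates.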

### Difference distance matrices (DDM)

DDMs were generated using the Bio3D package (68) implemented in R (R Core Team 2014; http://www.R-project.org/).

A CHARMM model (70, 71) of the CR3022-bound SARS-CoV-2 RBD crystal structure (69) (PDB ID 6ZLR) was used for the adaptive sampling simulations (66). The CR3022 antibody, glycan unit, water, and ions were stripped from the model, leaving only the protein portion of the RBD. The final model comprised spike residues 327 to 529. A single Man5 glycan was added at the N343 position using the CHARMM-GUI (70), with the P.1/B.1.1.28/B.1.351 RBD mutations K417N, E484K, and N501Y prepared in PyMol. Systems for simulation were built using the AmberTools20 LEaP (72) program. The unmutated (WT) and P.1/B.1.1.28/B.1.351 (Mut) RBDs were immersed in a truncated octahedral TIP3P water box with a minimum edge distance of 15 Å to the nearest protein atom, followed by system neutralization with chloride ions, resulting in system sizes of 67,508 and 66,894 atoms for the WT and Mut, respectively. The Amber ff14SB protein (73) and GLYCAM (74) force fields were used throughout. All simulations were performed using the Amber20 pmemd CUDA implementation. The systems were first minimized for 10,000 steps with protein atom restraints, followed by minimization of the full system without restraints for an additional 10,000 steps. This was followed by heating of the systems from 0 K to 298 K over a period of 20 ps in the NVT ensemble with a 2 fs timestep, using the particle mesh Ewald method for long-range electrostatics and periodic boundary conditions (75). The systems were then equilibrated for 100 ps in the NPT ensemble with the temperature controlled using Langevin dynamics with a collision frequency of 1.0 ps–1 and 1 atm pressure maintained using isotropic position scaling with a relaxation time of 2 ps (76). A non-bonded cut-off of 8 Å was used throughout, and hydrogen atoms were constrained using the SHAKE algorithm (77), with hydrogen mass repartitioning (78) used to allow a 4 fs timestep.
In order to generate an ensemble of RBD tip conformations for initiation of the adaptive sampling routine, we performed one hundred 50 ns simulations in the NVT ensemble with randomized initial velocities for each of the WT and Mut systems. The final frame from each of these simulations was used to initiate the adaptive sampling scheme. Adaptive sampling was performed using the High-Throughput Molecular Dynamics (HTMD v. 1.24.2) package (79). Each iteration consisted of 50-100 independent simulations of 100 ns. Simulations from each iteration were first projected using a dihedral metric with angles split into their sin and cos components for residues 454 to 491. This was followed by a TICA (80) projection using a lag time of 5 ns and retaining five dimensions. Markov state models were then built using a lag time of 50 ns for the selection of new states for the next iteration. A total of 29 adaptive iterations were performed yielding total simulation times of 274.8 and 256.8 μs for the WT and Mut systems, respectively. Simulations were visualized in VMD and PyMol.

### Markov state modelling

Markov state models (MSMs) were prepared in HTMD with an appropriate coordinate projection selected using PyEMMA (81) (v. 2.5.7). Multiple projections were tested on a 25 μs subset of the Mut simulations, including atomic distance and contact measures between RBD residues as well as backbone torsions of the RBD tip residues, using the variational approach to Markov processes score (82) (fig. S17 and table S4) (66). This led to the selection of a Cα pairwise distance metric between residues 471 to 480 and 484 to 488 for MSM construction. MSMs were prepared in HTMD using a TICA lag time of 5 ns, retaining five dimensions, followed by K-means clustering using 500 cluster centers. Implied timescales (ITS) plots were used to select a lag time of 30 ns for MSM building. Models were coarse-grained via Perron cluster analysis (PCCA++) using 2 states and validated using the Chapman-Kolmogorov (CK) test. A bootstrapping routine without replacement was used to calculate measurement errors, retaining 80% of the data per iteration for a total of 100 iterations. State statistics were collected for mean first passage times (MFPT), stationary distributions, and root-mean-square deviations (RMSD) for RBD tip residues 470-490. Residue 484 sidechain contacts were calculated from a representative model. A contact was defined as an atom pairing within 3.5 Å between either of the E484 γ-carboxyl O atoms (for WT) or the K484 ε-amino N atom (for Mut) and backbone or sidechain O or N atoms of residues 348 to 354, 413 to 425, or 446 to 500. The RMSD and contact metric means were model weighted. Weighted state ensembles containing 250 structures were collected for visualization in VMD.

Mutation probability (or rate) is basically a measure of the likelihood that random elements of your chromosome will be flipped into something else. For example, if your chromosome is encoded as a binary string of length 100 and you have a 1% mutation probability, it means that 1 out of your 100 bits (on average), picked at random, will be flipped.
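In code, that per-bit flipping looks something like this (a minimal Python sketch; the function and variable names are made up):

```python
import random

def mutate(chromosome, p_mutation=0.01):
    """Flip each bit independently with probability p_mutation, so a
    length-100 chromosome with p_mutation=0.01 gets ~1 flip on average."""
    return [bit ^ 1 if random.random() < p_mutation else bit
            for bit in chromosome]

random.seed(0)
child = mutate([0] * 100, p_mutation=0.01)
print(sum(child))  # number of flipped bits, ~1 on average
```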

Crossover basically simulates sexual genetic recombination (as in human reproduction), and there are a number of ways it is usually implemented in GAs. Sometimes crossover is applied in moderation in GAs (as it breaks symmetry, which is not always good, and you could also go blind), so we talk about a crossover probability to indicate the ratio of couples that will be picked for mating (they are usually picked following selection criteria - but that's another story).
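Single-point crossover is one of the common implementations, with the crossover probability deciding whether a selected couple recombines at all (again just a sketch, not the only way to do it):

```python
import random

def crossover(parent_a, parent_b, p_crossover=0.7):
    """With probability p_crossover, cut both parents at a random
    point and swap the tails; otherwise the offspring are clones."""
    if random.random() >= p_crossover:
        return parent_a[:], parent_b[:]
    point = random.randrange(1, len(parent_a))   # cut inside the string
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

random.seed(3)
a, b = crossover([0] * 8, [1] * 8, p_crossover=1.0)
print(a, b)  # complementary children sharing one crossover point
```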

This is the short story - if you want the long one you'll have to make an effort and follow the link Amber posted. Or do some googling - which last time I checked was still a good option too :)