Principal investigators: Michael Baake (Bielefeld), Ellen Baake (Bielefeld) |

Recombination is a major source of genetic variability in populations. It occurs during sexual reproduction and leads to the `reshuffling' of the genetic material of two parents into the type of an offspring individual. More precisely, the maternal and paternal sequences perform one or more crossovers and are cut and relinked accordingly, which results in a reciprocal exchange of genetic material. The main objective of the proposed project is to further analyse the Moran model with recombination (which describes the joint action of random reproduction and recombination in a finite population) and the corresponding ancestral partitioning process, which emerges if the genetic material of individuals taken from a population today is traced back into the past. We will also take into account the additional effects of selection and of superimposed mutation. Emphasis will be on the regime of high recombination rates, that is, on the law of large numbers and the moderate-recombination diffusion limit. |
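The cut-and-relink step described above can be illustrated with a minimal sketch. It assumes sequences are plain strings of equal length and that exactly one crossover occurs; the function name `single_crossover` and the parameter `point` are hypothetical and not taken from the project.

```python
def single_crossover(maternal, paternal, point):
    """Cut both parental sequences after `point` sites and relink them.

    Returns the two reciprocal recombinants. Illustrative assumptions:
    equal-length string sequences and a single crossover.
    """
    if len(maternal) != len(paternal):
        raise ValueError("parental sequences must have equal length")
    if not 0 <= point <= len(maternal):
        raise ValueError("crossover point out of range")
    # Leading segment from one parent, trailing segment from the other.
    first = maternal[:point] + paternal[point:]
    second = paternal[:point] + maternal[point:]
    return first, second


if __name__ == "__main__":
    print(single_crossover("AAAA", "BBBB", 1))
```

Multiple crossovers could be sketched by applying the same step repeatedly at increasing cut points.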

Principal investigators: Rolf Backofen (Freiburg), Peter Pfaffelhuber (Freiburg) |

The CRISPR-Cas system (Clustered Regularly Interspaced Short Palindromic Repeats) is a prokaryotic adaptive defense system against phages (viruses) that is contained in most bacteria. Upon an initial virus infection of a bacterial cell, the viral DNA can be excised by Cas proteins and built into the CRISPR array of spacer sequences. Subsequent infection with the same virus is then prevented, as foreign DNA matching this spacer is targeted by proteins which cleave the recognized DNA. If a bacterium does not possess the right spacer/Cas combination and is attacked by a virus, the virus can spread and kill the bacterium. In our project we combine bioinformatics expertise and probabilistic modeling in order to obtain quantitative and reliable models for understanding the evolution of CRISPR arrays in closely related strains. Fundamental to our work is a bioinformatics analysis including assembly and classification (by Cas proteins) of available metagenomics data from sea water and the human gut. Modeling of CRISPR evolution can be carried out neutrally, where novel spacer sequences enter the CRISPR array at random positions or at the leader end of the array. More realistic scenarios build on the co-evolution of bacterium and virus. We extend (the analysis of) existing evolutionary models and aim at bringing together metagenomics data and population genetic models via a neutrality test and inference of horizontal gene transfer within the CRISPR-Cas system. |

Principal investigators: Ellen Baake (Bielefeld), Anton Wakolbinger (Frankfurt am Main) |

Understanding the interplay of (random) reproduction, mutation, and selection is a major topic of population genetics research. In line with the historical perspective of evolutionary research, modern approaches aim at tracing back the ancestry of a sample of individuals taken from a present population. Generically, in populations (of genes, or genomes without recombination, say) that evolve in a time-stationary manner with a controlled (ideally constant) population size over a long time span, the ancestral lines will eventually coalesce backwards in time into a single line of descent. Viewed forwards in time (and projected into the future) this gives rise to an immortal line. In the first funding period, we have used the concept of the equilibrium ancestral selection graph and developed a new construction, the `pruned lookdown ancestral selection graph', to study the properties of the immortal line. The central objective of this continuation project is to develop these approaches further and make them available for a broader spectrum of problems, such as the genealogies of samples in a finite time horizon, more general type spaces, fitness functions and mutation schemes, and models including geographic and genetic structure. |

Principal investigator: Ellen Baake (Bielefeld) |

The Priority Programme ‘Probabilistic Structures in Evolution’ is devoted to the in-depth theoretical study of stochastic processes in population genetics (that is, describing the evolution of the genetic structure of populations under the action of the various evolutionary forces), stochastic models of adaptive dynamics (that is, individual-based models for the joint description of ecology and evolution), and probabilistic aspects of evolutionary game theory. The coordination project will provide funding for all central activities, like summer schools, workshops, and conferences, long-term visitors (shared between groups), administrative support for coordination, and a final book publication. |

Principal investigators: Matthias Birkner (Mainz), Jochen Blath (Berlin) |

Multiple-merger coalescent modeling and analysis has up to now been mainly focused on neutral, haploid, single-locus set-ups. The central aim of this project is to develop the stochastic models, theoretical results and inference methods required to effectively describe and analyse the observed patterns of genetic variation in sequence data in real populations with skewed offspring distributions under the influence of further evolutionary forces, especially recombination, selection and population structure; in other words, the systematic development of the basics of a `mathematical population genetics for highly variable offspring distributions'. Given recent progress in DNA sequencing technology, and insight into the limitations of inference methods based on single-locus set-ups, particular emphasis will be put on realistic diploid multi-locus models and the corresponding statistical machinery for data analysis. |

Principal investigators: Matthias Birkner (Mainz), Nina Gantert (München) |

Co-investigator: Andrej Depperschmidt (Freiburg) |

The goal is to understand the longtime behaviour of stochastic spatial population models with interaction and the spatial embeddings of their genealogies. The paper (Birkner and Depperschmidt, 2007) treated logistic branching random walks and showed that non-trivial equilibria exist if the interaction is weak enough. We plan to generalize these results to different interactions as well as to spatial models, and to several species. Then, we want to establish `true phase transitions': there should be a critical competition parameter such that the system survives when competition is below and dies out when it is above that value. Ancestral lineages can be described as random walks in dynamic random environment, see (Birkner et al., 2013, 2015). For such processes, we investigate the following topics: the almost sure central limit theorem, the rescaling of several ancestry lines in the same environment, and the invariant measure for the environment seen from the particle. Finally, we also aim at a more explicit description of the coalescence probability for coalescent walkers in the same medium, which is related to the `effective population size': a simplified model yields a pinning model which is of independent interest. |

Principal investigators: Jochen Blath (Berlin), Noemi Kurt (Berlin) |

Dormancy is a well-documented trait in many taxa, including microorganisms, and generates a seedbank that is considered to play an important role in the population genetics of a species. Very recently, a new universal coalescent structure, the seedbank coalescent, has been introduced, arising in a natural way as the backward-in-time scaling limit of the genealogy of a new Wright-Fisher model that includes dormancy. A first mathematical inspection of this model exhibits qualitative differences to standard coalescent models such as the Kingman coalescent. While some preliminary results have already been obtained, this new object opens up many new lines of research in population genetics. In the proposed project, we plan a thorough mathematical and statistical investigation of the seedbank model, the seedbank coalescent and related models, following three main lines of research: First, we will carry out a detailed mathematical analysis of the model concerning in particular its longterm behaviour, the corresponding genealogical tree properties, limit theorems, generated partition structures, and universality properties. Second, we will incorporate further evolutionary forces into the model, in particular general mutation spaces, different types of selection, fluctuating population size, or variable environments, and also extend the model to accommodate different mechanisms of initiating and terminating dormancy. This will lead to a better understanding of the role of dormancy as an evolutionary force, and its interplay with other forces in population genetics. Third, we will undertake a statistical analysis of the model and investigate classical quantities, such as the site frequency spectrum, under the seedbank coalescent. We will also aim to obtain (approximate) sampling formulae to derive explicit sample likelihoods.
We aim to estimate model parameters, in particular the seedbank size, the dormancy rate, and the mutation rates, and also to develop tests for the presence of (weak or strong) seedbanks, or the presence of mutation in dormant forms. These inference methods should then be assessed using simulations, and ultimately lead to a toolbox that can be used by biologists to investigate real DNA sequence data. |

Principal investigators: Anton Bovier (Bonn), Joachim Krug (Köln) |

The adaptive evolution of a population under the influence of mutation and selection is governed by the structure of the underlying fitness landscape. Here fitness can either be assigned directly to genotypes, or it arises dynamically from ecological interactions. In the first case the fitness landscape is modeled as a random function on the space of genotypes, and the interest is in characterizing the evolutionary accessibility of high-fitness regions, for example, in terms of pathways of ascending fitness and the properties of uphill adaptive walks. Building on results achieved in the first funding period, the project will continue the investigation of accessibility properties for classes of correlated landscapes and introduce more realistic models incorporating explicit intermediate phenotypes. The application of adaptive walks for the exploration of high-dimensional empirical data sets will also be developed. An important achievement of the investigation of adaptive dynamics in the first funding period was a rigorous control of various scaling limits, and these results can now be exploited to provide an underpinning of the simplified adaptive walk models in terms of individual-based ecological dynamics. Moreover, the analysis will be extended to the regime of evolutionary branching, and the effects of higher mutation rates and diploid recombination on genetic diversity will be addressed. A further focus of the project will be on adaptive dynamics models with phenotypic switching that arise in the context of tumor growth and immunotherapy. |

Principal investigator: Julien Dutheil (Plön) |

The sequentially Markov coalescent (SMC) is an approximation of the coalescent process with recombination enabling its application to whole genome data sets. The SMC model differs from the standard coalescent as it models the genealogy of a set of sequences spatially along the alignment rather than chronologically. In addition, the process of genealogy change along the genome is Markovian, allowing the use of hidden Markov models for inference of population genomic parameters. While the SMC models the coalescent in space, current models so far assume homogeneity of parameters along the genome. This assumption is clearly at odds with our knowledge of the biology of genomes, as mutation rate, recombination rate and effective population size are highly heterogeneous. SMC models have also been exclusively applied to higher eukaryotic species, essentially Primates. These species have very large genomes, for which the parameter heterogeneity is rather diluted. With next-generation sequencing data becoming increasingly affordable, population genomic data sets are being generated for species with smaller, more compact genomes. For these data sets, parameter heterogeneity can be much more extreme than for primate genomes. Such species include economically important fungal pathogens, which cannot be analyzed with current, over-simplistic models. In this project we propose an extension of current SMC models to account for stochastic processes along the genome. The spatial heterogeneity is modeled as a Markov process, which, when combined with the intrinsic Markov property of the SMC, results in a Markov-modulated sequentially Markov model. The project will establish the formal properties of such Markov-modulated SMC (MMSMC) analytically and using simulation procedures. Biological applications are proposed for both primate and fungal data sets. |

Principal investigator: Fabian Freund (Stuttgart) |

To assess which evolutionary forces have acted on a real population, one can compare the observed genetic diversity in a sample with its distribution under one or several theoretical models for possible evolutionary histories of the population. Such models include a model of the sample's genealogy. For a single selectively neutral genetic locus without recombination, the standard genealogy model is Kingman's n-coalescent if the sample of size n is taken from a randomly mating population with fixed size, much higher than the sample size. Kingman's n-coalescent is a random bifurcating tree with n leaves. This genealogy model can be extended, e.g., to account for population size fluctuations in the past or for population subdivision while still being a bifurcating tree. However, theoretical models for populations with properties like reproduction sweepstakes or rapid selection will lead to multifurcating random trees as genealogies of a sample, called multiple-merger n-coalescents. The main goal of this project is to assess, based on the observed genetic diversity, whether samples from real populations with properties for which theoretical models predict multiple-merger n-coalescent genealogies actually fit these models better than extended Kingman's n-coalescents. Several statistical methods have been proposed to distinguish between multiple-merger n-coalescents and the (extended) Kingman's n-coalescent (gene tree maximum likelihood, approximate Bayesian computation (ABC), approximate likelihood and an approach using a minimum-distance statistic, the latter three based on the site frequency spectrum of the sample). Further aims of this project are to refine and extend the ABC inference method and to investigate whether using statistics based on genetic information other than the site frequency spectrum can improve inference capacity.
The inference capacity of the available inference methods will be compared via simulation for different genealogy model comparisons to identify the best method for a given comparison of different n-coalescents as genealogy models. To assess the main goal, first populations that might be linked to multiple-merger genealogies are identified. For these, specific multiple-merger n-coalescents and (extended) Kingman's n-coalescents (biologically reasonable alternative models) are used as potential genealogy models. For these models, inference for the best model is performed following the inference protocol established before. The inference results are then discussed in the light of known properties of the populations. |
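For orientation, the standard Kingman n-coalescent referred to above can be simulated in a few lines. The following is a minimal sketch, not the project's inference code. It assumes the usual coalescent time scale in which each pair of lineages merges at rate 1; the function names `kingman_coalescent` and `expected_height` are hypothetical.

```python
import random


def kingman_coalescent(n, rng=None):
    """Simulate Kingman's n-coalescent.

    Returns a list of (time, merged_block) events, one per binary merger.
    Assumption: time is measured in coalescent units, so that k lineages
    wait an Exp(k*(k-1)/2) time until the next pairwise merger.
    """
    rng = rng or random.Random()
    blocks = [frozenset([i]) for i in range(n)]
    t = 0.0
    events = []
    while len(blocks) > 1:
        k = len(blocks)
        t += rng.expovariate(k * (k - 1) / 2)
        # Choose a uniformly random pair of lineages and merge them.
        i, j = rng.sample(range(k), 2)
        merged = blocks[i] | blocks[j]
        blocks = [b for idx, b in enumerate(blocks) if idx not in (i, j)]
        blocks.append(merged)
        events.append((t, merged))
    return events


def expected_height(n):
    """Expected time to the most recent common ancestor: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)


if __name__ == "__main__":
    for time, block in kingman_coalescent(5, random.Random(1)):
        print(round(time, 3), sorted(block))
```

Every event merges exactly two blocks, which is what makes the tree bifurcating; a multiple-merger coalescent would instead allow three or more blocks to merge in one event.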

Principal investigator: Andreas Greven (Erlangen) |

The project focusses on qualitative features of stochastically evolving spatial populations under the following evolutionary forces: random genetic drift, mutation, selection, recombination and migration. It addresses the longtime behaviour of (1) interacting (spatial) Fleming-Viot diffusions, (2) interacting (spatial) generalized (with block resampling) Cannings processes, (3) infinite particle systems under external spatially structured catastrophes, both the measure-valued versions to describe type and geographic population composition, as well as their forward evolving genealogical structures and lines of descent. A key role as dual processes is played by spatial coalescents and generalized branching-coalescents. The main features for genealogies treated are: (1) selective sweeps by rare mutants: stasis, punctuated equilibria, fitness landscapes, (2) the effects of relative strength of mutation, selection, recombination and migration on large-scale behaviour, (3) the question of universality in model parameters, spatial continuum limits, characterization by martingale problems, (4) effect of external noise (random environments) on genealogical structures, (5) impact of recombination on the large-scale behaviour focussing both on a general framework and concrete examples. The continuation shifts towards recombination, random environment and continuum space. |

Principal investigators: Martin Hutzenthaler (Essen), Dirk Metzler (München) |

In the first project phase we investigated under which conditions an inheritable behavioral trait of defense against parasites can spread in a structured population even if it is costly in the sense that individuals having a defense gene tend to have fewer offspring. In this proposed continuation project we study in a many-demes limit the time until the first fixation of a defense allele arising from rare mutations. We are going to show that this time to first fixation is logarithmic in the inverse mutation rate. So even for small mutation rates defense traits can appear on an evolutionarily relevant time scale. Mathematically, our central contribution is to prove and generalize the results of Dawson and Greven (2011) without using dual processes for a large class of processes. Moreover, we are going to simulate our model on a two-dimensional lattice with nearest-neighbour migration. This will show whether our results for the many-demes limit also apply to biologically more relevant population structures. |

Principal investigators: Martin Hutzenthaler (Essen), Peter Pfaffelhuber (Freiburg) |

Natural selection shapes genealogies within a population in various ways. In our proposal, we suggest both a general qualitative study of some aspects of genealogies under selection and a quantitative treatment of specific relevant models. More precisely, we study models with unbounded selection (so there are arbitrarily beneficial and/or deleterious fitness classes) using Girsanov transforms and approximate dualities, and selection in fluctuating environment (where an allele can be beneficial or deleterious, depending on the environment) using a general result on stochastic averaging in the limit of fast environmental changes. For both models, we use the previously developed technique of treating genealogical trees as metric measure spaces, leading to tree-valued stochastic Markov processes. The qualitative work deals with a comparison of genealogical distances under neutrality and under selection. We conjecture that many situations including selection lead to shorter genealogical distances. |

Principal investigators: Götz Kersting (Frankfurt am Main), Anton Wakolbinger (Frankfurt am Main) |

Coalescent theory has become a topic of central importance at the interface of probability theory and population genetics. In recent years, coalescent structures evolving in time have gained particular interest; they serve as a model for the changes which genealogies of populations undergo in time. In the first funding period we have studied evolving coalescents as tree-valued processes, and analysed the asymptotic distribution of various functionals of evolving Beta coalescents. We found a method which allows us to represent a rich class of functionals of Beta coalescents as stable Poisson integrals, simultaneously for the static and the evolving case. In the second funding period we will continue this line of research. We will elaborate on the method of Poisson integrals, with a particular focus on the internal length spectrum of Beta coalescents. Moreover, we will zoom in on new functionals of the Bolthausen-Sznitman coalescent and extend our investigations of coalescent functionals to general Lambda-coalescents. Specifically, we are interested in the external branches of extremal length and the size of the last merger. We also plan to investigate the symmetric excursion representation for genealogies under individual competition. |

Principal investigator: Achim Klenke (Mainz) |

In the first period of the Priority Programme 1590, our PhD student Fridolin Kielisch has established a genealogical construction for mutually catalytic branching processes (both on the particle level and in the continuous masses limit) as well as for the symbiotic branching process (on the particle level). This construction is in the spirit of the lookdown constructions of Donnelly and Kurtz and, more recently, of Kurtz and Rodrigues. The next goal is to obtain a similar construction also for the continuous masses symbiotic branching process. This would finish the PhD thesis. Furthermore, for these models, the limit of the lookdown processes as the branching rate tends to infinity shall be studied. While primarily the goal is to reach a deeper understanding of the genealogies of symbiotic branching processes, we also hope to get some new insight into the renormalization analysis of the infinite rate processes. This would stress the special role of symbiotic branching processes. |

Principal investigator: Wolfgang König (Berlin) |

Co-investigator: Onur Gün (Berlin) |

We propose to study various branching random walk systems with random branching rates. The state space that we focus on is the N-dimensional hypercube, serving as a model for long gene sequences with mutations occurring by flip of a gene. Throughout, we consider the limit of large N, coupled with characteristic parameters such as time and sample size. The random branching rates form a fitness landscape that governs the selection. We first focus on uncorrelated landscapes and start with a population of one randomly picked gene sequence. We want to study the concentration properties of the system, the evolution of the mean fitness of the population and the aging properties. We will attack these questions both with probabilistic and analytical, i.e., spectral theoretical, methods. In another work package, we will consider Gaussian correlated fitness landscapes, namely, Sherrington-Kirkpatrick models of spin glasses. The key to the study of these models is Gaussian comparison techniques, and we shall adapt these tools for studying branching systems. Finally, we will consider theoretical models of experiments on evolution which involve a strategy of restarting the system with a part of the population at the end of some time periods. We plan to develop a thorough understanding of appropriate choices of the scales for the time lags in order to be able to deduce useful conclusions from the experiment. |

Principal investigator: Marcel Ortgiese (Bath) |

Co-investigator: Matthias Hammer (Berlin) |

The aim of our project is to understand the large scale behaviour of interfaces formed by competing types in spatial population models. Starting with the classical continuum stepping stone model, we will consider various modifications. One of the main focal points will be the symbiotic branching model, which describes a two-type population that is only allowed to branch in the simultaneous presence of both types. We are also interested in other models that extend the original stepping stone model: models in more than one spatial dimension, or models with selection and even competition between types, pose a big challenge. An important question in all these models is how one type invades the other, thus creating an interface where both types are present. The interface for the continuum stepping stone model is well understood and we would like to investigate the changes in large scale patterns induced by the various modifications. The analysis of these models relies especially on the elegant technique of duality for Markov processes, but we also use tools from stochastic analysis and measure-valued processes. |

Principal investigators: Wolfgang Stephan (Berlin), Aurélien Tellier (München) |

The goal of this project is to gain insights into the coevolution of host-parasite systems, particularly processes that generate genetic diversity. In the first objective, we analyze an epidemiological model of coevolution with coupled reciprocal changes in allele frequencies and size changes of host and parasite populations. Using these results, we show how signatures of coevolution can be quantified in terms of the site frequency spectrum of genetic polymorphisms observed over the whole genome of both hosts and parasites. To do this, we start from a general model of coevolution that encompasses the well-known matching-allele and gene-for-gene models frequently used in the plant and animal literature. Then we extend this deterministic model by including several parasite generations per host generation. In the second objective, we study the influence of stochasticity on the maintenance of polymorphism in host and parasite populations as well as the loss and fixation of alleles at coevolving loci. Two stochastic processes of biological relevance are modeled: (i) genetic drift is allowed to occur in the parasite population assuming a large variance in offspring production, which is a common feature of many plant and insect parasites; (ii) the stochastic process of disease transmission is modeled as a Markov process affecting the epidemiological dynamics. We specifically explore the conditions under which co-evolutionary models starting from three alleles in hosts and parasites generate loss or fixation of alleles (thus reducing the system to two-allele or monomorphic populations). We focus on conditions under which we observe the two extreme cases of coevolution that are both described by our generalized models and were the subject of the work of the first funding period, the arms race (recurrent selective sweeps) and trench warfare scenarios. |

Principal investigator: Anja Sturm (Göttingen) |

We propose to analyse the genealogies of single genes and gene sequences obtained from a sample of individuals from the present day population. For the underlying stochastic population models we consider a variety of settings: models with selection and competition, with large variation in the offspring distribution, and with spatial structure. More precisely, we propose to investigate population models with constant total population size and various genetic types, in which one type is selectively favoured. We study both the case of a selective sweep, where the beneficial type fixes in the population, as well as balancing selection, where we have an equilibrium situation. We also consider logistic branching processes in which the individuals compete for limited resources and the total population size fluctuates randomly as a result. Another focus lies on population models of constant total size with large variation in their offspring distribution such that the offspring of single individuals may be of the order of the total population size. This can, for example, be due to competition or selection, in particular due to recurring selective sweeps. Spatially structured settings for some of the aforementioned models will be studied as well. The ultimate aim of the proposed research is to provide a basis for the analysis of genetic data by assessing the impact of the different underlying population scenarios on the joint ancestral relationships of various genes and hence also on quantities that can be read off from the gene sequences in the sample, such as measures of genetic variability. |

Principal investigator: Anita Winter (Essen) |

For many RNA viruses the lack of a proofreading mechanism in the virus' RNA polymerase results in frequent mutation. The high viral mutation rates, the large virus population size, and the short replication periods produce an abundance of viral variability, which is responsible for immune escape or drug resistance. Understanding in detail the forces which maintain this diversity can assist in the struggle against viral infections. Pathogen patterns - and in particular the shapes of the phylogenies - are affected by the strength of selective pressure due to various levels of cross-immunity. We focus on the temporal structure of phylogenies associated with a persistent virus. We propose a two-level (host-pathogen) branching model with mutation and competition on both levels in different scaling regimes, where hosts can be either the infected patients or the infected cells within a single patient. We thereby extend our recent work on a panmictic virus population. We will further rely on techniques developed for measure-valued (neutral) multilevel branching dynamics and two-level multi-type branching dynamics with mutation and competition. |

Principal investigator: Thomas Wiehe (Köln) |

Bifurcating trees have extensively and successfully been used as tools to model evolutionary dynamics. The genealogical tree of a set of alleles, genes, or species can be considered as a single realization of the evolutionary process. Yet, in models and their applications - for instance the coalescent model and the neutrality tests derived from it - it is typically assumed that samples are obtained under long-term average conditions. However, this assumption may not be appropriate when interpreting experimental data, and overlooking this important point may lead to severe mis-interpretations and mis-inferences. In order to gain a clearer understanding it is critically important to investigate the conditional sampling distribution of the tree properties. A comprehensive theory is however still missing. One goal of this proposal is to bridge this gap. We build on the results obtained during the first funding period, but emphasis will shift from combinatoric to probabilistic properties and from static to evolving trees. In particular, we will investigate how strongly a given population genealogy impinges on the genealogical properties of samples by studying the conditional sub-sampling distribution of tree properties, such as height, length and tree balance. Furthermore, we will study how strongly the contingent tree topology of a population leads to a bias in neutrality tests when applied to experimental data. Another key aspect is to integrate the fundamental evolutionary mechanism of recombination in this framework. We will use the ancestral recombination graph as a model of the spatial coalescent and study tree balance of samples and sub-samples as a stochastic process along the chromosome. Since recombination can be silent, i.e. not altering tree topology, it is essential to define topologically relevant recombination events and to quantify their rates.
Complementing the spatial view, we will investigate tree balance as a process in time using the classical Moran model. In evolving trees lineages split or die and, as a consequence, tree balance changes over time. Critical times are those when the root jumps to younger tree nodes: in these events history is erased and new evolutionary episodes start. We will study the effect of this process on the sampling and conditional sub-sampling distributions of tree properties. Of particular interest are the rate of change and persistence times relative to generation time. Finally, on a slightly different tack, we will extend work from the first funding period and use tree topology and combinatoric properties of ordered trees to investigate the evolutionary mechanisms behind the distribution of large gene families along chromosomes. We will apply our theoretical results to the analysis of experimental data and their interpretation in the light of neutral vs. adaptive evolution. |

Principal investigators: Ellen Baake (Bielefeld), Michael Baake (Bielefeld) |

Recombination is a major source of genetic variability in populations. It occurs during sexual reproduction and leads to the “reshuffling” of the genetic material of two parents into the type of an offspring individual. More precisely, the maternal and paternal sequences perform one or more crossovers and are cut and relinked accordingly, which results in a reciprocal exchange of genetic material. The goal of the proposed project is to develop various stochastic aspects in the context of models with strong recombination (that is, recombination rates or probabilities are large as compared to the fluctuations due to random reproduction). The main objectives concern the analysis of single-crossover recombination and multiple-crossover recombination in an infinite population, as well as the coalescent process in the strong-recombination limit. Two leitmotifs will appear throughout the project: First, the concept of an ancestral recombination tree, and, second, the attempt to overcome problems related to dependence. |

Principal investigators: Ellen Baake (Bielefeld), Anton Wakolbinger (Frankfurt am Main) |

Understanding the interplay of (random) reproduction, mutation, and selection is a major topic of population genetics research. In line with the historical perspective of evolutionary research, modern approaches aim at tracing back the ancestry of a sample of individuals taken from a present population. Generically, in populations (of genes, or genomes without recombination, say) that evolve in a time-stationary manner with a controlled (ideally constant) population size over a long time span, the ancestral lines will eventually coalesce backwards in time into a single line of descent. Viewed forwards in time (and projected into the future), this gives rise to an immortal line. The type process on this ancestral line, together with related genealogical aspects, is of particular interest. Its study, under a variety of selection regimes and mutation models, will be the central objective of our project. |

Principal investigator: Ellen Baake (Bielefeld) |

The Priority Programme “Probabilistic Structures in Evolution” is devoted to the in-depth theoretical study of stochastic processes in population genetics (that is, describing the evolution of the genetic structure of populations under the action of the various evolutionary forces), stochastic models of adaptive dynamics (that is, individual-based models for the joint description of ecology and evolution), and probabilistic aspects of evolutionary game theory. The coordinator’s project will provide funding for all central activities, like summer schools, workshops and conferences, long-term visitors (shared between groups), administrative support for coordination, and database/web page software. |

Principal investigators: Matthias Birkner (Mainz), Jochen Blath (Berlin) |

Multiple merger coalescent modeling and analysis has up to now been mainly focused on neutral, haploid, single-locus set-ups. The central aim of this project is to develop the stochastic models, theoretical results and inference methods required to effectively describe and analyse the observed patterns of genetic variation in real populations with skewed offspring distributions under the influence of further evolutionary forces, especially recombination, selection and population structure. In other words, the project aims at the systematic development of the basics of a “mathematical population genetics for highly variable offspring distributions”. |

Principal investigators: Matthias Birkner (Mainz), Nina Gantert (München) |

Co-investigator: Andrej Depperschmidt (Freiburg) |

We investigate ancestral lineages in spatial stochastic population models with local self-regulation and competition, allowing locally fluctuating population sizes. In ongoing work of the applicants with Jiri Cerny, it turned out that there are interesting connections with random media questions. More precisely, the ancestral line of one particle in a simple, prototypic model corresponds to a directed random walk on a supercritical cluster of oriented site percolation. Our aim is to consider several interacting particles, and to compare their ancestral lineages with coalescing random walks in a random environment. Already for one particle, there are interesting questions about the law of the medium, seen from the walker, and these are related to the forward/backward evolution of the environment. For locally regulated stochastically evolving populations, only very few rigorous relations between forward and backward processes are known so far. We hope to obtain in this way results for the long-time behaviour of stochastic population models with self-regulation. |

Principal investigator: Jochen Blath (Berlin) |

Co-investigators: Noemi Kurt (Berlin), Marcel Ortgiese (Bath) |

The method of duality is a mathematical formalism that allows one to establish close connections between two stochastic Markov processes with respect to a class of “duality functions”. If a formal duality is established, it is often possible to study important properties of a “complicated” spatial stochastic system, such as long-time behaviour or properties of its genealogy, by analysing the properties of a simpler, typically discrete or combinatorial, dual process. This method has been used with great success for many processes in the theory of interacting particle systems and interacting stochastic (P)DEs modeling the evolution of populations (e.g. the stepping stone or the Wright-Fisher model). In the last years, important progress has been achieved. However, there is still no systematic theory of duality (“finding dual processes is something of a black art”, A. Etheridge [Eth06], p. 519), and many systems of theoretical and practical importance await further analysis. This project has three main objectives. Firstly, we would like to transfer several concrete questions about certain SPDEs to questions about their dual processes (I). Secondly, we are interested in the long-term properties of the dual processes themselves (II). Finally, we aim towards a systematic analysis of the method of duality (III). |
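
As a hedged illustration of the formalism described in this abstract, the notion of duality can be written as follows. The symbols X, Y, H, E, F and N are our own notation, not taken from the project text, and the example is the standard textbook moment duality between the Wright-Fisher diffusion and the block-counting process of the Kingman coalescent, not a result of the project.

```latex
% Two Markov processes (X_t) on E and (Y_t) on F are dual with respect to
% a duality function H : E x F -> R if, for all x in E, y in F and t >= 0,
\[
  \mathbb{E}_x\bigl[ H(X_t, y) \bigr] \;=\; \mathbb{E}^y\bigl[ H(x, Y_t) \bigr].
\]
% Example (moment duality). Let X solve dX_t = \sqrt{X_t (1 - X_t)}\, dB_t on [0,1],
% and let N be the pure death process on {1, 2, ...} that jumps from n to n-1
% at rate n(n-1)/2. With H(x, n) = x^n one has
\[
  \mathbb{E}_x\bigl[ X_t^{\,n} \bigr] \;=\; \mathbb{E}^n\bigl[ x^{N_t} \bigr],
  \qquad x \in [0,1],\; n \in \mathbb{N},\; t \ge 0 .
\]
% Generator check: applying \tfrac12 x(1-x)\,\partial_x^2 to x^n gives
% \tfrac{n(n-1)}{2}\,(x^{n-1} - x^n), which is the generator of N acting on n \mapsto x^n.
```

In this instance the moments of the diffusion, an object on a continuous state space, are obtained from a simple integer-valued death process, which is the kind of simplification the abstract refers to.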

Principal investigators: Anton Bovier (Bonn), Joachim Krug (Köln) |

The adaptive evolution of a population under the influence of mutation and selection is governed by the structure of the underlying fitness landscape, which encodes the interactions between genetic loci in their effects on fitness. Mathematically speaking, a fitness landscape is a function on the space of genotype sequences, which is often modeled as a random field. The project aims to obtain rigorous results on the evolutionary accessibility for several classes of fitness landscape models. Here evolutionary accessibility is defined either in terms of the structure of the landscape, where it refers to the existence and properties of paths that are monotonic in fitness, or in terms of simple evolutionary dynamics given by a Markov chain with transition probabilities depending on fitness differences. Both types of problems are closely related to the low temperature dynamics of spin glasses, and this connection will be exploited. The investigation of predefined random fitness landscapes will be complemented by the approach of adaptive dynamics, where fitness arises dynamically from ecological interactions between individuals. In contrast to previous work which has been based on the characterization of the population by a low-dimensional phenotype, we will define and study adaptive dynamics directly on the high-dimensional space of genotypes. |

Principal investigator: Andreas Greven (Erlangen) |

The project is concerned with the analysis of qualitative features of stochastically evolving spatial populations under the following evolutionary forces: random genetic drift (resampling), mutation, selection and migration. This project focusses in particular on the analysis of the longtime behavior of
- interacting (spatial) Fleming-Viot diffusions,
- interacting (spatial) generalized (with block resampling) Cannings processes,
both the measure-valued versions to describe type and geographic population composition, as well as their tree-valued versions describing the genealogical structures and lines of descent of evolving populations. For the analysis a key role is played by spatial coalescents and certain generalized branching processes which come into play as dual processes. The main features of the genealogies under these models we are interested in for this project are:
- the selective sweeps arising from rare mutants: stasis, punctuated equilibria,
- the effects of relative strength of mutation, selection and migration on the longtime and large spatial scale behaviour,
- the question of universality in model parameters of large-scale properties of the population.
Finally, we begin studying the impact of recombination on the large scale population behaviour in the model classes from above. |

Principal investigator: Oskar Hallatschek (Göttingen) |

An important challenge of theoretical population genetics is the joint mathematical description of natural selection and genetic drift. While approximations neglecting either natural selection or genetic drift are readily available for many evolutionary scenarios, analytical progress has been sparse when both of these forces matter. The difficulties are due to the fact that models combining genetic drift and selection are non-linear and stochastic, and therefore plagued by an infinite hierarchy of moment equations that can only be dealt with using uncontrolled truncations. We have recently described a promising new route to combining natural selection and genetic drift, based on branching processes under constraints, which generates analytically tractable models closed at the first moment equation. We plan to apply our theoretical framework to the dynamics of asexual adaptation with the goal of quantifying the transient dynamics, the impact of deleterious mutations and adaptation in two-dimensional fitness landscapes as a first step towards epistatic models. As we argue, these research efforts will be helpful for interpreting microbial evolution experiments. On a more general level, we plan to advance our theoretical framework to capture fluctuations in addition to the first moment, which will establish constrained branching processes as a versatile tool for predicting evolutionary dynamics quantitatively in regimes where natural selection and genetic drift matter. |

Principal investigators: Martin Hutzenthaler (Essen), Dirk Metzler (München) |

We investigate under which conditions an inheritable behavioral trait of defense against parasites can spread in a structured population even if it is costly in the sense that individuals having a defense gene tend to have fewer offspring. In a structured population, the chance of a defense gene to spread depends on its benefit for the local subpopulation and on the number of beneficiaries that also carry the defense gene. Population structure thus plays a major role but depends on the interactions with the parasite and on several other factors that we model as random fluctuations. We will apply diffusion theory to analyze under which conditions a defense allele can spread in the population. We will complement our mathematical analysis with computer simulations to compare our results to the properties of more complex models that are not accessible to a rigorous mathematical analysis. Furthermore, we will explore how our models can be fitted to observations of host-parasite interactions including genetic markers for the inference of population structure. |

Principal investigators: Götz Kersting (Frankfurt am Main), Anton Wakolbinger (Frankfurt am Main) |

Coalescent theory has become a topic of central importance at the interface of probability theory and population genetics. In recent years, coalescent structures evolving in time have gained particular interest, as they serve as a model for the changes that genealogies of populations undergo in time. A number of interesting results have been obtained for the Kingman coalescent, and also very recently for the Bolthausen-Sznitman coalescent. Our project has two major objectives. The first is an in-depth understanding of the evolving Kingman coalescent (in the infinite population limit) and its path properties, when viewed as a tree-valued process. The second is a thorough investigation of evolving Beta-coalescents in the parameter regime 1 < α < 2. This family of coalescents is of interest because it can be viewed as a passage from Kingman to Bolthausen-Sznitman, during which the coalescent completely changes its appearance. |
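
For readers unfamiliar with the Kingman coalescent mentioned in this abstract, the following minimal Python sketch simulates the height (time to the most recent common ancestor) of a static Kingman tree for a sample of n lineages and compares the average with the classical value 2(1 - 1/n). It illustrates only the fixed-time coalescent, not the evolving tree-valued process studied in the project, and all function names are hypothetical.

```python
import random


def kingman_tree_height(n, rng):
    """Height of a Kingman coalescent tree for n lineages (coalescent time units).

    While k lineages remain, each of the k*(k-1)/2 pairs merges at rate 1,
    so the waiting time to the next merger is exponential with that total rate.
    """
    height = 0.0
    k = n
    while k > 1:
        rate = k * (k - 1) / 2
        height += rng.expovariate(rate)
        k -= 1
    return height


def mean_tree_height(n, replicates, seed=0):
    """Monte Carlo estimate of the expected tree height (seed fixed for reproducibility)."""
    rng = random.Random(seed)
    total = sum(kingman_tree_height(n, rng) for _ in range(replicates))
    return total / replicates


def expected_tree_height(n):
    """Classical formula: sum over k = 2..n of 2 / (k * (k - 1)) = 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)


if __name__ == "__main__":
    n = 10
    print("simulated:", mean_tree_height(n, 20000))
    print("expected: ", expected_tree_height(n))
```

The sketch only tracks the number of lineages, which suffices for the height; recording which pair merges at each step would be needed to study tree shape or balance.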

Principal investigator: Achim Klenke (Mainz) |

Co-investigator: Matthias Birkner (Mainz) |

The lookdown construction of Donnelly and Kurtz (1996 and 1999) and of Kurtz and Rodrigues (2011) has proved to be an elegant and powerful construction of population genetic and population dynamic models, respectively, that puts emphasis on the individual-based dynamics. In contrast to constructions with classical tools from stochastic analysis, such as stochastic differential equations or martingale problems, the lookdown construction displays the full genealogy of the processes. Symbiotic branching models were introduced by Etheridge and Fleischmann (2004). They form a one-parameter class of coupled stochastic differential equations that comprise the population genetic stepping stone model as well as the mutually catalytic branching process from population dynamics (introduced by Dawson and Perkins 1998) and the parabolic Anderson model from theoretical physics. The toolbox for symbiotic branching processes comprises essentially a self-duality and a moment duality. In this project, we intend to develop the lookdown construction further to make it work for the mutually catalytic branching process as well as for the larger class of symbiotic branching processes. Furthermore, for these models, the limit of the lookdown processes as the branching rate tends to infinity shall be studied. While the primary goal is to reach a deeper understanding of the genealogies of symbiotic branching processes, we also hope to gain some new insight into the renormalization analysis of the infinite rate processes. This would stress the special role of symbiotic branching processes. |

Principal investigator: Wolfgang König (Berlin) |

Co-investigator: Onur Gün (Berlin) |

We study the long-time behaviour of branching random walk in random environment (BRWRE) on the d-dimensional lattice. We consider one of the basic models, which includes migration and branching/killing of the particles, given a random potential of spatially dependent branching/killing rates. Based on the observation that the expectation of the population size over the migration and the branching and killing is equal to the solution to the well-known and much-studied parabolic Anderson model (PAM), we will use our understanding of the long-time behaviour of the PAM to develop a detailed picture of the BRWRE. Furthermore, we will exploit methods that were successful in the treatment of the PAM to prove at least part of this picture. Particular attention is paid to the study of the concentration of the population in sites that determine the long-time behaviour of the PAM, which shows a kind of intermittency. One fundamental thesis that we want to make precise and rigorous is that the overwhelming contribution to the total population size of the BRWRE comes from small islands to which most of the particles travel and in which they have an extremely high reproduction activity. We aim at a detailed analysis for the case of the random potential being Pareto-distributed, in which case the rigorous study of the PAM has achieved a particularly clear picture. This project has the following four main goals. I. For a variety of random potentials, we derive large-time asymptotics for the n-th moments of the local and total population size, based on techniques from the study of the PAM. II. We want to understand and identify the limiting distributions of the global population size by a finer analysis for Pareto-distributed potentials. III. We want to investigate, for Pareto-distributed potentials, the long-time (de)correlation properties of the evolution of the particles such as aging, in particular, slow/fast evolution phenomena and what type of aging functions will appear. IV. We want to study, for Pareto-distributed potentials, almost surely with respect to the potential, the particle flow of the BRWRE in a geometric sense by finding trajectories along which most of the particles travel and branch, in particular the sites and the time intervals where, respectively when, most of the particles show an extremely high reproduction activity. |
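
The link between the BRWRE and the PAM stated in this abstract can be sketched as follows. The notation (u, κ, Δ, ξ, X) is ours, and the specific branching mechanism (binary branching at rate ξ⁺(x), killing at rate ξ⁻(x), with ξ = ξ⁺ − ξ⁻) is an assumption made for illustration; the project text does not fix these details.

```latex
% Let u(t,x) be the expected total number of particles at time t, given the
% potential \xi, when the system starts with a single particle at x \in \mathbb{Z}^d.
% Under the assumptions above, u solves the parabolic Anderson model
\[
  \partial_t u(t,x) \;=\; \kappa\,\Delta u(t,x) + \xi(x)\,u(t,x),
  \qquad u(0,x) = 1,
\]
% where \Delta f(x) = \sum_{y \sim x} \bigl( f(y) - f(x) \bigr) is the discrete Laplacian
% and \kappa > 0 is the migration rate. The Feynman-Kac formula gives
\[
  u(t,x) \;=\; \mathbb{E}_x\!\left[ \exp\!\left( \int_0^t \xi(X_s)\,\mathrm{d}s \right) \right],
\]
% with (X_s) a continuous-time random walk on \mathbb{Z}^d with generator \kappa\Delta.
```

The exponential weight favours walk paths that spend most of their time at sites where ξ is very large, which is the intermittency effect behind the "small islands" picture in the abstract.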

Principal investigator: Peter Pfaffelhuber (Freiburg) |

The Kingman coalescent is a prominent example of a random tree, arising in the field of population genetics as the genealogy of a large population under neutral evolution. The ancestral recombination graph (ARG) extends this random tree to become a random graph, which encodes the correlated genealogies at several loci along a recombining chromosome. However, the coalescent as well as the ARG only account for a single point in time. We extend this static picture and study the evolution of the ARG along the evolution of the population. Our project extends recent work of Greven et al. and Depperschmidt et al., who study the evolving genealogy at a single locus (i.e. without recombination) in neutral as well as selective cases, respectively. We construct the evolving genealogy under recombination using well-posed martingale problems, and study its properties under neutrality as well as under selection. Special emphasis is placed on a continuum of recombining loci and on properties of the map ℓ ↦ genealogy at locus ℓ. Moreover, at least if selection is weak, our approach allows us to compute properties of the genealogical trees along a recombining chromosome. In addition, we use the evolving genealogy under recombination to investigate a model for a bacterial population where the recombination rate depends on the similarity of the recombining genetic material. |

Principal investigator: Wolfgang Stephan (Berlin) |

The goal of this project is to gain insights into the coevolution of host-parasite systems, particularly processes that generate genetic diversity. In the first objective, we focus on analyzing the frequencies of host resistance and parasite virulence alleles that interact in a gene-for-gene (GFG) manner. This coevolutionary model contains indirect frequency-dependent selection as resistance is positively selected when virulence is rare, and virulence is selected for when resistance is common. Then we extend the basic GFG model by including direct frequency-dependent selection on both resistance and virulence traits and mutation between resistant and susceptible alleles in the host, and also between virulence and avirulence alleles in the parasite. We study the system deterministically and – by including genetic drift – also stochastically. We concentrate on the two extreme cases of coevolution that are both described by our generalized GFG model, the arms race (recurrent selective sweep) and trench warfare scenarios. In the second objective, we make use of the structured coalescent to analyze the patterns of linked neutral genetic variation around the genomic position of the selected alleles. We explore whether coevolutionary models can be distinguished by the pattern of neutral genetic variation around the target of selection. |

Principal investigator: Anja Sturm (Göttingen) |

The main questions of this proposal center around exploring the variability of single genes and gene sequences obtained from a sample of a present-day population in the case when the population has a large variation in its offspring distribution. This may be due to competition or selection and also due to recurring selective sweeps. The impact of those factors, as well as of balancing selection and a spatial structure in the population, on the joint genealogy of multiple genetic sites (loci) or the entire gene sequence will be studied. From knowledge about the genealogical relationships of various loci and the mutation mechanism, information about the variability and correlation observed in the sample of gene sequences will be deduced. The latter in turn form the basis for statistical tests and data analysis, for example aimed at detecting loci under selection. Previously, the variability of genetic types at one genetic locus as well as the correlation of variability over multiple loci and the influence of space have been studied for standard population models with small family sizes. On the other hand, there has been recent interest as well as substantial progress, from a mathematical as well as a biological point of view, in the study of genealogies and genetic variability of single-locus samples in populations with large offspring variances and with selection. Thus, the proposed project will bring together several successful lines of mathematical research geared towards applications in population genetics and will hence yield new insights for this area of application. |

Principal investigator: Anita Winter (Essen) |

Many micro-organisms, in particular RNA viruses, evolve so rapidly that their evolutionary and epidemiological dynamics occur at the same time scale. The high mutation and replication rates of RNA viruses cause viral variability, which is one reason why disease control fails. It is therefore of great interest to understand in detail the forces which maintain this diversity. Pathogen patterns - and in particular the shapes of the phylogenies - are affected by the strength of selective pressure due to various levels of cross-immunity. That is, the host's immune system develops a response against the pathogen as well as against antigenically similar but different organisms. We propose to model in a unified way the different phylogenies as they evolve in time. Our goal is to test whether it is indeed the case that strong cross-immunity promotes coexistence of strains, while rapid strain turn-over is promoted by weaker cross-immunity. We also aim to investigate effects due to spatial separation or contact networks as well as recombination. Our project extends recent work in which the mathematical framework for stochastic processes with values in tree-like metric measure spaces was provided and evolving phylogenies without and with selection were constructed. |

Principal investigator: Thomas Wiehe (Köln) |

Evolutionary trees in population genetics, phylogenetics or stem cell development can naturally be represented as bifurcating genealogies. Our focus is on combinatorial and probabilistic properties of such trees, in particular of their topologies. We have three main goals. First, we will study the properties of a mapping between trees and certain kinds of permutations with the aim to derive the probabilities of biologically interesting tree features, either explicitly or implicitly using the concept of generating functions. Second, in the framework of population genetics, we will study the topology of coalescent trees with the aim to define statistics of tree shape and tree distance which have a biological meaning. In particular, tree distance should reflect genetic distance on a recombining chromosome. Third, we will use computer simulations to derive the distributions of these statistics under various evolutionary scenarios and compare them to experimental data obtained from the HapMap and 1000-genomes projects with the aim to quantify possible deviations from expectations under the hypothesis of neutral evolution. |

Principal investigator: Arne Traulsen (Plön) |

Evolutionary game theory has traditionally been dealt with in a deterministic setting, e.g. in terms of the replicator dynamics. Finite populations introduce demographic stochasticity, which has been analysed in great detail in the past few years. This has led to numerous new results and a closer connection to models from population genetics. While population genetics has for a long time used ideas such as the coalescent, such concepts have only recently been employed in evolutionary game theory to calculate the average frequencies of strategies in equilibrium in structured and unstructured populations. Population genetics pays detailed attention to the intricacies of the genetic mechanisms such as mutation, recombination, multiple ploidy levels, genetic conflicts etc. Evolutionary game theory typically neglects such aspects. We propose to develop evolutionary game theory in this respect by including relevant properties such as diploid populations, which entails including recombination and possible genetic conflict situations. On the other hand, results from evolutionary game theory such as the one third rule, its generalization to many players and the inclusion of multiple strategies can be transferred to population genetics. The average frequencies of alleles in equilibrium under frequency dependent selection with linear or nonlinear interactions between alleles can be approached from a game theoretic perspective. Another point of interest is the maintenance of polymorphisms in alleles. In an evolutionary game theoretic sense, this corresponds to mixed equilibrium states. This exchange of results between the two areas becomes possible once a terminological link between the two fields is well established. |

Principal investigator: Peter Stadler (Leipzig) |

Natural selection acts on the phenotype while genetic variation is generated at the genotypic level. The connection between these two levels is the genotype-phenotype map (GP map), which entails both molecular interactions of gene products and the regulatory networks built upon them. Since the GP map is a major determinant of the fitness effect of genetic variants, its structural features strongly impact both the observable variation within a population and the pattern of substitutions separating the genomic sequences of distinct species. The goal of this project is to systematically study this influence of the GP map. To this end, we will devise a suitable mathematical framework, construct and analyze probabilistic models for concrete scenarios, and perform large-scale computer simulations. Since genotypic fitness is understood as the composition of the GP map and a phenotypic fitness function, the GP map generically introduces neutrality into the fitness landscape. The geometric structure of these neutral sets affects substitution patterns under stabilizing selection as well as the accessibility of advantageous innovations. As a practical outcome we aim at the development of statistical procedures to assess selective constraints in the large-scale genome-wide data sets that are becoming widely available. |
