Subscribe: Molecular Biology and Evolution - current issue
http://mbe.oxfordjournals.org/rss/current.xml
Language: English
Tags:
adaptation  data  evolution  expression  gene  genes  genetic  genome  mutations  populations  selection  species  variation 
Preview: Molecular Biology and Evolution - current issue

Molecular Biology and Evolution Current Issue





Published: Mon, 30 Oct 2017 00:00:00 GMT

Last Build Date: Wed, 08 Nov 2017 06:49:09 GMT

 



An Arabidopsis Transcriptional Regulatory Map Reveals Distinct Functional and Evolutionary Features of Novel Transcription Factors

2017-10-30

Jinpu Jin, Kun He, Xing Tang, Zhe Li, Le Lv, Yi Zhao, Jingchu Luo, and Ge Gao



Cold Comfort: Fat-Rich Diets and Adaptation Among Indigenous Siberian Populations

2017-10-30

In the Arctic, thriving indigenous populations have long made adjustments to living in one of the coldest and harshest places on Earth. Despite seasonal extremes in daylight, food availability, and severe cold, modern humans have had settlements in Siberia beginning around 45,000 years ago, not long after their initial migration out of Africa.



Characterization of Odorant Receptors from a Non-ditrysian Moth, Eriocrania semipurpurella Sheds Light on the Origin of Sex Pheromone Receptors in Lepidoptera

2017-09-27

Jothi Kumar Yuvaraj, Jacob A. Corcoran, Martin N. Andersson, Richard D. Newcomb, Olle Anderbrant, and Christer Löfstedt



A Generative Angular Model of Protein Structure Evolution

2017-09-18

Michael Golden, Eduardo García-Portugués, Michael Sørensen, Kanti V. Mardia, Thomas Hamelryck, and Jotun Hein



A Moth and Its Flame: Mate Selection Found to Evolve from Response to Flower Odors

2017-09-15

For moths, love is literally in the air through the action of pheromones to attract mates.



Exome Sequencing Provides Evidence of Polygenic Adaptation to a Fat-Rich Animal Diet in Indigenous Siberian Populations

2017-09-12

Abstract
Siberia is one of the coldest environments on Earth and has great seasonal temperature variation. Long-term settlement in northern Siberia undoubtedly required biological adaptation to severe cold stress, dramatic variation in photoperiod, and limited food resources. In addition, recent archeological studies show that humans first occupied Siberia at least 45,000 years ago; yet our understanding of the demographic history of modern indigenous Siberians remains incomplete. In this study, we use whole-exome sequencing data from the Nganasans and Yakuts to infer the evolutionary history of these two indigenous Siberian populations. Recognizing the complexity of the adaptive process, we designed a model-based test to systematically search for signatures of polygenic selection. Our approach accounts for stochasticity in the demographic process and the hitchhiking effect of classic selective sweeps, as well as potential biases resulting from recombination rate and mutation rate heterogeneity. Our demographic inference shows that the Nganasans and Yakuts diverged ∼12,000–13,000 years ago from East-Asian ancestors in a process involving continuous gene flow. Our polygenic selection scan identifies seven candidate gene sets with Siberian-specific signals. Three of these gene sets are related to diet, especially to fat metabolism, consistent with the hypothesis of adaptation to a fat-rich animal diet. Additional testing rejects the effect of hitchhiking and favors a model in which selection yields small allele frequency changes at multiple unlinked genes.



Variation in Orthologous Shell-Forming Proteins Contribute to Molluscan Shell Diversity

2017-09-01

Abstract
Despite the evolutionary success and ancient heritage of the molluscan shell, little is known about the molecular details of its formation, evolutionary origins, or the interactions between the material properties of the shell and its organic constituents. In contrast to this dearth of information, a growing collection of molluscan shell-forming proteomes and transcriptomes suggests that they comprise both deeply conserved and lineage-specific elements. Analyses of these sequence data sets have suggested that mechanisms such as exon shuffling, gene co-option, and gene family expansion facilitated the rapid evolution of shell-forming proteomes and supported the diversification of this phylum-specific structure. In order to further investigate and test these ideas, we have examined the molecular features and spatial expression patterns of two shell-forming genes (Lustrin and ML1A2) and coupled these observations with materials properties measurements of shells from a group of closely related gastropods (abalone). We find that the prominent “GS” domain of Lustrin, a domain believed to confer elastomeric properties to the shell, varies significantly in length between the species we investigated. Furthermore, the spatial expression patterns of Lustrin and ML1A2 also vary significantly between species, suggesting that both protein architecture and the regulation of spatial gene expression patterns are important drivers of molluscan shell evolution. Variation in these molecular features might relate to certain materials properties of the shells of these species. These insights reveal an important and underappreciated source of variation within shell-forming proteomes that must contribute to the diversity of molluscan shell phenotypes.



Spatiotemporal Dynamics of Genetic Variation in the Iberian Lynx along Its Path to Extinction Reconstructed with Ancient DNA

2017-08-29

Abstract
There is the tendency to assume that endangered species have been both genetically and demographically healthier in the past, so that any genetic erosion observed today was caused by their recent decline. The Iberian lynx (Lynx pardinus) suffered a dramatic and continuous decline during the 20th century, and now shows extremely low genome- and species-wide genetic diversity among other signs of genomic erosion. We analyze ancient (N = 10), historical (N = 245), and contemporary (N = 172) samples with microsatellite and mitogenome data to reconstruct the species' demography and investigate patterns of genetic variation across space and time. Iberian lynx populations transitioned from low but significantly higher genetic diversity than today and shallow geographical differentiation millennia ago, through a structured metapopulation with varying levels of diversity during the last centuries, to two extremely genetically depauperate and differentiated remnant populations by 2002. The historical subpopulations show varying extents of genetic drift in relation to their recent size and time in isolation, but these do not predict whether the populations persisted or went finally extinct. In conclusion, current genetic patterns were mainly shaped by genetic drift, supporting the current admixture of the two genetic pools and calling for a comprehensive genetic management of the ongoing conservation program. This study illustrates how a retrospective analysis of demographic and genetic patterns of endangered species can shed light onto their evolutionary history and this, in turn, can inform conservation actions.



Network-Based Identification of Adaptive Pathways in Evolved Ethanol-Tolerant Bacterial Populations

2017-08-28

Abstract
Efficient production of ethanol for use as a renewable fuel requires organisms with a high level of ethanol tolerance. However, this trait is complex and increased tolerance therefore requires mutations in multiple genes and pathways. Here, we use experimental evolution for a system-level analysis of adaptation of Escherichia coli to high ethanol stress. As adaptation to extreme stress often results in complex mutational data sets consisting of both causal and noncausal passenger mutations, identifying the true adaptive mutations in these settings is not trivial. Therefore, we developed a novel method named IAMBEE (Identification of Adaptive Mutations in Bacterial Evolution Experiments). IAMBEE exploits the temporal profile of the acquisition of mutations during evolution in combination with the functional implications of each mutation at the protein level. These data are mapped to a genome-wide interaction network to search for adaptive mutations at the level of pathways. The 16 evolved populations in our data set together harbored 2,286 mutated genes with 4,470 unique mutations. Analysis by IAMBEE significantly reduced this number and resulted in identification of 90 mutated genes and 345 unique mutations that are most likely to be adaptive. Moreover, IAMBEE not only enabled the identification of previously known pathways involved in ethanol tolerance, but also identified novel systems such as the AcrAB-TolC efflux pump and fatty acid biosynthesis, and even provided insight into the temporal profile of adaptation to ethanol stress. Furthermore, this method offers a solid framework for identifying the molecular underpinnings of other complex traits as well.



Codon-Resolution Analysis Reveals a Direct and Context-Dependent Impact of Individual Synonymous Mutations on mRNA Level

2017-08-24

Abstract
Codon usage bias (CUB) refers to the observation that synonymous codons are not used equally frequently in a genome. CUB is stronger in more highly expressed genes, a phenomenon commonly explained by stronger natural selection on translational accuracy and/or efficiency among these genes. Nevertheless, this phenomenon could also occur if CUB regulates gene expression at the mRNA level, a hypothesis that has not been tested until recently. Here, we attempt to quantify the impact of synonymous mutations on mRNA level in yeast using 3,556 synonymous variants of a heterologous gene encoding green fluorescent protein (GFP) and 523 synonymous variants of an endogenous gene TDH3. We found that mRNA level was positively correlated with CUB among these synonymous variants, demonstrating a direct role of CUB in regulating transcript concentration, likely via regulating mRNA degradation rate, as our additional experiments suggested. More importantly, we quantified the effects of individual synonymous mutations on mRNA level and found them dependent on 1) CUB and 2) mRNA secondary structure, both in proximal sequence contexts. Our study reveals the pleiotropic effects of synonymous codon usage and provides an additional explanation for the well-known correlation between CUB and gene expression level.
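
To make the notion of unequal synonymous codon usage concrete, the following minimal Python sketch computes relative synonymous codon usage (RSCU), a standard textbook measure: the observed count of a codon divided by the mean count across its synonymous family. This is only an illustration and is not claimed to be the CUB metric used in the study. The three-family codon table, the function names, and the toy sequence are all assumptions made for the example.

```python
from collections import Counter

# Synonymous codon families for three amino acids (standard genetic code).
# Only a subset is listed to keep the sketch short; this table is illustrative.
SYNONYMOUS = {
    "Lys": ["AAA", "AAG"],
    "Glu": ["GAA", "GAG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def codon_counts(cds):
    """Count codons in an in-frame coding sequence."""
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def rscu(cds):
    """Return RSCU per codon: observed count / mean count in its family.

    A value of 1.0 means the codon is used exactly as often as expected
    under equal usage; values above 1.0 indicate a preferred codon.
    """
    counts = codon_counts(cds.upper())
    result = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(family)
        for codon in family:
            result[codon] = counts[codon] / expected
    return result


# Hypothetical toy sequence: 3x AAA, 1x AAG, 2x GGT, 1x GGC, 1x GGA.
toy = "AAA" * 3 + "AAG" + "GGT" * 2 + "GGC" + "GGA"
values = rscu(toy)
```

In the toy sequence, lysine is encoded three times by AAA and once by AAG, so AAA is overrepresented relative to equal usage and AAG is underrepresented.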



Recurrent Reverse Evolution Maintains Polymorphism after Strong Bottlenecks in Commensal Gut Bacteria

2017-08-21

Abstract
The evolution of new strains within the gut ecosystem is poorly understood. We used a natural but controlled system to follow the emergence of intraspecies diversity of commensal Escherichia coli during three rounds of adaptation to the mouse gut (∼1,300 generations). We previously showed that, in the first round, a strongly beneficial phenotype (loss-of-function for galactitol consumption; gat-negative) spread to >90% frequency in all colonized mice. Here, we show that this loss-of-function is repeatedly reversed when a gat-negative clone colonizes new mice. The regain of function occurs via compensatory mutation and reversion, the latter leaving no trace of past adaptation. We further show that loss-of-function adaptive mutants reevolve after colonization with an evolved gat-positive clone. Thus, even under strong bottlenecks a regime of strong-mutation-strong-selection dominates adaptation. Coupling experiments and modeling, we establish that reverse evolution recurrently generates two coexisting phenotypes within the microbiota that can or cannot consume galactitol (gat-positive and gat-negative, respectively). Although the abundance of the dominant strain, the gat-negative, depends on the microbiota composition, gat-positive abundance is independent of the microbiota composition and can be precisely manipulated by supplementing the diet with galactitol. These results show that a specific diet is able to change the abundance of specific strains. Importantly, we find polymorphism for these phenotypes in indigenous Enterobacteria of mice and man. Our results demonstrate that natural selection can greatly overwhelm genetic drift at structuring the strain diversity of gut commensals and that competition for limiting resources may be a key mechanism for maintaining polymorphism in the gut.



Quantifying Selection with Pool-Seq Time Series Data

2017-08-21

Abstract
Allele frequency time series data constitute a powerful resource for unraveling mechanisms of adaptation, because the temporal dimension captures important information about evolutionary forces. In particular, Evolve and Resequence (E&R), the whole-genome sequencing of replicated experimentally evolving populations, is becoming increasingly popular. Based on computer simulations, several studies proposed experimental parameters to optimize the identification of the selection targets. No such recommendations are available for the underlying parameters, selection strength and dominance. Here, we introduce a highly accurate method to estimate selection parameters from replicated time series data, which is fast enough to be applied on a genome scale. Using this new method, we evaluate how experimental parameters can be optimized to obtain the most reliable estimates for selection parameters. We show that the effective population size (Ne) and the number of replicates have the largest impact. Because the number of time points and sequencing coverage had only a minor effect, we suggest that time series analysis is feasible without major increase in sequencing costs. We anticipate that time series analysis will become routine in E&R studies.
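
As a rough illustration of how a selection coefficient can be read off an allele frequency time series, the sketch below fits a straight line to logit-transformed frequencies. It assumes a simple deterministic model of genic selection in continuous time, under which the log-odds of the allele frequency increase linearly at rate s. It ignores drift, sequencing noise, replicates, and dominance, so it is not the estimator introduced in the paper; the function names and the simulated trajectory are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a frequency strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def estimate_s(generations, frequencies):
    """Least-squares slope of logit(frequency) against generation.

    Under the assumed deterministic model, this slope equals the
    selection coefficient s.
    """
    ys = [logit(p) for p in frequencies]
    n = len(generations)
    mean_t = sum(generations) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in generations)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(generations, ys))
    return sxy / sxx


def simulate(p0, s, generations):
    """Noise-free trajectory with logit(p_t) = logit(p0) + s * t."""
    return [1.0 / (1.0 + math.exp(-(logit(p0) + s * t))) for t in generations]


# Hypothetical sampling scheme: seven time points, every 10 generations.
timepoints = [0, 10, 20, 30, 40, 50, 60]
trajectory = simulate(p0=0.05, s=0.1, generations=timepoints)
s_hat = estimate_s(timepoints, trajectory)
```

On this noise-free trajectory the fit recovers the simulated coefficient up to floating-point error. Real Pool-Seq data would add sampling noise at every time point, which is where replicate number and effective population size matter.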



Intrahost Genetic Diversity of Bacterial Symbionts Exhibits Evidence of Mixed Infections and Recombinant Haplotypes

2017-08-18

Abstract
Even the simplest microbial-eukaryotic mutualisms are comprised of entire populations of symbionts at the level of the host individual. Early work suggested that these intrahost populations maintain low genetic diversity as a result of transmission bottlenecks or to avoid competition between symbiont genotypes. However, the amount of genetic diversity among symbionts within a single host remains largely unexplored. To address this, we investigated the chemosynthetic symbiosis between the bivalve Solemya velum and its intracellular bacterial symbionts, which exhibits evidence of both vertical and horizontal transmission. Intrahost symbiont populations were sequenced to high coverage (200–1,000×). Analyses of nucleotide diversity revealed that the symbiont genome sequences were largely homogeneous within individual host specimens, consistent with vertical transmission, except for particular regions that were polymorphic in ∼20% of host specimens. These variant sites were also found segregating in other host individuals from the same population, colocalized to several regions of the genome, and consistently co-occurred on the same short read pairs (derived from the same chromosome). These results strongly suggest that these variant haplotypes originated through recombination events, potentially during prior mixed infections or in the external environment, rather than as novel mutations within symbiont populations. This abundant genetic diversity could have a profound influence on symbiont evolution as it provides the opportunity for selection to limit the extent of reductive genome evolution commonly seen in obligate intracellular bacteria and to enable the evolution of adaptive genotypes.



Parallel Evolution of Chromatin Structure Underlying Metabolic Adaptation

2017-08-16

Abstract
Parallel evolution occurs when a similar trait emerges in independent evolutionary lineages. Although changes in protein coding and gene transcription have been investigated as underlying mechanisms for parallel evolution, parallel changes in chromatin structure have never been reported. Here, Saccharomyces cerevisiae and a distantly related yeast species, Dekkera bruxellensis, are investigated because both species have independently evolved the capacity for aerobic fermentation. By profiling and comparing genome sequences, transcriptomic landscapes, and chromatin structures, we revealed that parallel changes in nucleosome occupancy in the promoter regions of mitochondria-localized genes led to concerted suppression of mitochondrial functions by glucose, which can explain the metabolic convergence in these two independent yeast species. Further investigation indicated that similar mutational processes in the promoter regions of these genes in the two independent evolutionary lineages underlay the parallel changes in chromatin structure. Our results indicate that, despite several hundred million years of separation, parallel changes in chromatin structure can be an important adaptation mechanism for different organisms. Due to the important role of chromatin structure changes in regulating gene expression and organism phenotypes, the novel mechanism revealed in this study could be a general phenomenon contributing to parallel adaptation in nature.



Fitness Effects of Cis-Regulatory Variants in the Saccharomyces cerevisiae TDH3 Promoter

2017-08-16

Abstract
Variation in gene expression is widespread within and between species, but fitness consequences of this variation are generally unknown. Here, we use mutations in the Saccharomyces cerevisiae TDH3 promoter to assess how changes in TDH3 expression affect cell growth. From these data, we predict the fitness consequences of de novo mutations and natural polymorphisms in the TDH3 promoter. Nearly all mutations and polymorphisms in the TDH3 promoter were found to have no significant effect on fitness in the environment assayed, suggesting that the wild-type allele of this promoter is robust to the effects of most new cis-regulatory mutations.



Characterization of Odorant Receptors from a Non-ditrysian Moth, Eriocrania semipurpurella Sheds Light on the Origin of Sex Pheromone Receptors in Lepidoptera

2017-08-15

Abstract
Pheromone receptors (PRs) are essential in moths to detect sex pheromones for mate finding. However, it remains unknown from which ancestral proteins these specialized receptors arose. The oldest lineages of moths, so-called non-ditrysian moths, use short-chain pheromone components, secondary alcohols, or ketones, so-called Type 0 pheromones that are similar to many common plant volatiles. It is, therefore, possible that receptors for these ancestral pheromones evolved from receptors detecting plant volatiles. Hence, we identified the odorant receptors (ORs) from a non-ditrysian moth, Eriocrania semipurpurella (Eriocraniidae, Lepidoptera), and performed functional characterization of ORs using HEK293 cells. We report the first receptors that respond to Type 0 pheromone compounds; EsemOR3 displayed highest sensitivity toward (2S, 6Z)-6-nonen-2-ol, whereas EsemOR5 was most sensitive to the behavioral antagonist (Z)-6-nonen-2-one. These receptors also respond to plant volatiles of similar chemical structures, but with lower sensitivity. Phylogenetically, EsemOR3 and EsemOR5 group with a plant volatile-responding receptor from the tortricid moth Epiphyas postvittana (EposOR3), which together reside outside the previously defined lepidopteran PR clade that contains the PRs from more derived lepidopteran families. In addition, one receptor (EsemOR1), which falls at the base of the lepidopteran PR clade, responded specifically to β-caryophyllene and not to any other plant or pheromone compounds. Our results suggest that PRs for Type 0 pheromones have evolved from ORs that detect structurally related plant volatiles. They are unrelated to PRs detecting pheromones in more derived Lepidoptera, which, in turn, may also have independently evolved a novel function from ORs detecting plant volatiles.



Transposable Element Exaptation into Regulatory Regions Is Rare, Influenced by Evolutionary Age, and Subject to Pleiotropic Constraints

2017-08-14

Abstract
Transposable element (TE)-derived sequences make up approximately half of most mammalian genomes, and many TEs have been co-opted into gene regulatory elements. However, we lack a comprehensive tissue- and genome-wide understanding of how and when TEs gain regulatory activity in their hosts. We evaluated the prevalence of TE-derived DNA in enhancers and promoters across hundreds of human and mouse cell lines and primary tissues. Promoters are significantly depleted of TEs in all tissues compared with their overall prevalence in the genome (P < 0.001); enhancers are also depleted of TEs, though not as strongly as promoters. The degree of enhancer depletion also varies across contexts (1.5–3×), with reproductive and immune cells showing the highest levels of TE regulatory activity in humans. Overall, in spite of the regulatory potential of many TE sequences, they are significantly less active in gene regulation than expected from their prevalence. TE age is predictive of the likelihood of enhancer activity; TEs originating before the divergence of amniotes are 9.2 times more likely to have enhancer activity than TEs that integrated in great apes. Context-specific enhancers are more likely to be TE-derived than enhancers active in multiple tissues, and young TEs are more likely to overlap context-specific enhancers than old TEs (86% vs. 47%). Once TEs obtain enhancer activity in the host, they have similar functional dynamics to one another and non-TE-derived enhancers, likely driven by pleiotropic constraints. However, a few TE families, most notably endogenous retroviruses, have greater regulatory potential. Our observations suggest a model of regulatory co-option in which TE-derived sequences are initially repressed, after which a small fraction obtains context-specific enhancer activity, with further gains subject to pleiotropic constraints.



Adaptive Mutations in RNA Polymerase and the Transcriptional Terminator Rho Have Similar Effects on Escherichia coli Gene Expression

2017-08-09

Abstract
Modifications to transcriptional regulators play a major role in adaptation. Here, we compared the effects of multiple beneficial mutations within and between Escherichia coli rpoB, the gene encoding the RNA polymerase β subunit, and rho, which encodes a transcriptional terminator. These two genes have harbored adaptive mutations in numerous E. coli evolution experiments but particularly in our previous large-scale thermal stress experiment, where the two genes characterized alternative adaptive pathways. To compare the effects of beneficial mutations, we engineered four advantageous mutations into each of the two genes and measured their effects on fitness, growth, gene expression and transcriptional termination at 42.2 °C. Among the eight mutations, two rho mutations had no detectable effect on relative fitness, suggesting they were beneficial only in the context of epistatic interactions. The remaining six mutations had an average relative fitness benefit of ∼20%. The rpoB mutations affected the expression of ∼1,700 genes; rho mutations affected the expression of fewer genes but most (83%) were a subset of those altered by rpoB mutants. Across the eight mutants, relative fitness correlated with the degree to which a mutation restored gene expression back to the unstressed, 37.0 °C state. The beneficial mutations in the two genes did not have identical effects on fitness, growth or gene expression, but they caused parallel phenotypic effects on gene expression and genome-wide transcriptional termination.



Detection of Regional Variation in Selection Intensity within Protein-Coding Genes Using DNA Sequence Polymorphism and Divergence

2017-07-28

Abstract
Numerous approaches have been developed to infer natural selection based on the comparison of polymorphism within species and divergence between species. These methods are especially powerful for the detection of uniform selection operating across a gene. However, empirical analyses have demonstrated that regions of protein-coding genes exhibiting clusters of amino acid substitutions are subject to different levels of selection relative to other regions of the same gene. To quantify this heterogeneity of selection within coding sequences, we developed Model Averaged Site Selection via Poisson Random Field (MASS-PRF). MASS-PRF identifies an ensemble of intragenic clustering models for polymorphic and divergent sites. This ensemble of models is used within the Poisson Random Field framework to estimate selection intensity on a site-by-site basis. Using simulations, we demonstrate that MASS-PRF has high power to detect clusters of amino acid variants in small genic regions, can reliably estimate the probability of a variant occurring at each nucleotide site in sequence data, and is robust to historical demographic trends and recombination. We applied MASS-PRF to human gene polymorphism data derived from the 1000 Genomes Project and divergence data from the common chimpanzee. On the basis of this analysis, we discovered striking regional variation in selection intensity, indicative of positive or negative selection, in well-defined domains of genes that have previously been associated with neurological processing, immunity, and reproduction. We suggest that amino acid-altering substitutions within these regions likely are or have been selectively advantageous in the human lineage, playing important roles in protein function.



A Comprehensive Analysis of Transcript-Supported De Novo Genes in Saccharomyces sensu stricto Yeasts

2017-07-24

Abstract
Novel genes arising from random DNA sequences (de novo genes) have been suggested to be widespread in the genomes of different organisms. However, our knowledge about the origin and evolution of de novo genes is still limited. To systematically understand the general features of de novo genes, we established a robust pipeline to analyze >20,000 transcript-supported coding sequences (CDSs) from the budding yeast Saccharomyces cerevisiae. Our analysis pipeline combined phylogeny, synteny, and sequence alignment information to identify possible orthologs across 20 Saccharomycetaceae yeasts and discovered 4,340 S. cerevisiae-specific de novo genes and 8,871 S. sensu stricto-specific de novo genes. We further combined information on CDS positions and transcript structures to show that >65% of de novo genes arose from transcript isoforms of ancient genes, especially in the upstream and internal regions of ancient genes. Fourteen identified de novo genes with high transcript levels were chosen to verify their protein expression. Ten of them, including eight transcript isoform-associated CDSs, showed translation signals, and five proteins exhibited specific cytosolic localizations. Our results suggest that de novo genes frequently arise in the S. sensu stricto complex and have the potential to be quickly integrated into ancient cellular networks.



Multiple Modes of Positive Selection Shaping the Patterns of Incomplete Selective Sweeps over African Populations of Drosophila melanogaster

2017-07-21

Abstract
It remains a challenge in evolutionary genetics to elucidate how beneficial mutations arise and propagate in a population and how selective pressures on mutant alleles are structured over space and time. By identifying “sweeping haplotypes (SHs)” that putatively carry beneficial alleles and are increasing (or have increased) rapidly in frequency, and surveying the geographic distribution of SH frequencies, we can indirectly infer how selective sweeps unfold in time and thus which modes of positive selection underlie those sweeps. Using population genomic data from African Drosophila melanogaster, we identified SHs from 37 candidate loci under selection. At more than half of loci, we identify single SHs. However, many other loci harbor multiple independent SHs, namely soft selective sweeps, either due to parallel evolution across space or a high beneficial mutation rate. At about a quarter of the loci, intermediate SH frequencies are found across multiple populations, which cannot be explained unless a certain form of frequency-dependent positive selection, such as heterozygote advantage, is invoked given the reasonable range of migration rates between African populations. At one locus, many independent SHs are observed over multiple populations but always together with ancestral haplotypes. This complex pattern is compatible with a large number of mutational targets in a gene and frequency-dependent selection on new variants. We conclude that very diverse modes of positive selection are operating at different sets of loci in D. melanogaster populations.



Distinct Trajectories of Massive Recent Gene Gains and Losses in Populations of a Microbial Eukaryotic Pathogen

2017-07-21

Abstract
Differences in gene content are a significant source of variability within species and have an impact on phenotypic traits. However, little is known about the mechanisms responsible for the most recent gene gains and losses. We screened the genomes of 123 worldwide isolates of the major pathogen of wheat Zymoseptoria tritici for robust evidence of gene copy number variation. Based on orthology relationships in three closely related fungi, we identified 599 gene gains and 1,024 gene losses that have not yet reached fixation within the focal species. Our analyses of gene gains and losses segregating in populations showed that gene copy number variation arose preferentially in subtelomeres and in proximity to transposable elements. Recently lost genes were enriched in virulence factors and secondary metabolite gene clusters. In contrast, recently gained genes encoded mostly secreted proteins lacking a conserved domain. We analyzed the frequency spectrum at loci segregating a gene presence–absence polymorphism in four worldwide populations. Recent gene losses showed a significant excess in low-frequency variants compared with genome-wide single nucleotide polymorphisms, which is indicative of strong negative selection against gene losses. Recent gene gains were either under weak negative selection or neutral. We found evidence for strong divergent selection among populations at individual loci segregating a gene presence–absence polymorphism. Hence, gene gains and losses likely contributed to local adaptation. Our study shows that microbial eukaryotes harbor extensive copy number variation within populations and that functional differences among recently gained and lost genes led to distinct evolutionary trajectories.



Detecting Long-Term Balancing Selection Using Allele Frequency Correlation

2017-07-21

Abstract
Balancing selection occurs when multiple alleles are maintained in a population, which can result in their preservation over long evolutionary time periods. A characteristic signature of this long-term balancing selection is an excess number of intermediate frequency polymorphisms near the balanced variant. However, the expected distribution of allele frequencies at these loci has not been extensively detailed, and therefore existing summary statistic methods do not explicitly take it into account. Using simulations, we show that new mutations which arise in close proximity to a site targeted by balancing selection accumulate at frequencies nearly identical to that of the balanced allele. In order to scan the genome for balancing selection, we propose a new summary statistic, β, which detects these clusters of alleles at similar frequencies. Simulation studies show that compared with existing summary statistics, our measure has improved power to detect balancing selection, and is reasonably powered in non-equilibrium demographic models and under a range of recombination and mutation rates. We compute β on 1000 Genomes Project data to identify loci potentially subjected to long-term balancing selection in humans. We report two balanced haplotypes—localized to the genes WFS1 and CADM2—that are strongly linked to association signals for complex traits. Our approach is computationally efficient and applicable to species that lack appropriate outgroup sequences, allowing for well-powered analysis of selection in the wide variety of species for which population data are rapidly being generated.



New Approach to Antibiotic Therapy is a Dead End for Pathogens

2017-07-17

The World Health Organization (WHO) is currently warning of an antibiotics crisis. The fear is that we are moving into a post-antibiotic era, during which simple bacterial infections would no longer be treatable.



Large Variation in the Ratio of Mitochondrial to Nuclear Mutation Rate across Animals: Implications for Genetic Diversity and the Use of Mitochondrial DNA as a Molecular Marker

2017-07-16

Abstract
It is commonly assumed that mitochondrial DNA (mtDNA) evolves at a faster rate than nuclear DNA (nuDNA) in animals. This has contributed to the popularity of mtDNA as a molecular marker in evolutionary studies. Analyzing 121 multilocus data sets and four phylogenomic data sets encompassing 4,676 species of animals, we demonstrate that the ratio of mitochondrial over nuclear mutation rate is highly variable among animal taxa. In nonvertebrates, such as insects and arachnids, the ratio of mtDNA over nuDNA mutation rate varies between 2 and 6, whereas it is above 20, on average, in vertebrates such as scaled reptiles and birds. Interestingly, this variation is sufficient to explain the previous report of a similar level of mitochondrial polymorphism, on average, between vertebrates and nonvertebrates, which was originally interpreted as reflecting the effect of pervasive positive selection. Our analysis rather indicates that the among-phyla homogeneity in within-species mtDNA diversity is due to a negative correlation between mtDNA per-generation mutation rate and effective population size, irrespective of the action of natural selection. Finally, we explore the variation in the absolute per-year mutation rate of both mtDNA and nuDNA using a reduced data set for which fossil calibration is available, and discuss the potential determinants of mutation rate variation across genomes and taxa. This study has important implications regarding DNA-based identification methods in predicting that mtDNA barcoding should be less reliable in nonvertebrates than in vertebrates.
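The argument that a negative correlation between mutation rate and effective population size can equalize diversity can be checked with simple arithmetic. The sketch below assumes the standard neutral expectation that diversity scales with the product of effective population size and per-generation mutation rate; the function name, the constant factor, and all numeric values are hypothetical and are not taken from the study.

```python
def expected_diversity(effective_size, mutation_rate, ploidy_factor=2.0):
    """Neutral expectation: diversity ~ ploidy_factor * Ne * mu.
    ploidy_factor is an assumed constant; it cancels in comparisons."""
    return ploidy_factor * effective_size * mutation_rate


# Hypothetical values chosen only to illustrate the argument:
# a large population with a low mutation rate ...
low_mu_large_ne = expected_diversity(effective_size=1e6, mutation_rate=1e-8)
# ... and a tenfold smaller population with a tenfold higher mutation rate.
high_mu_small_ne = expected_diversity(effective_size=1e5, mutation_rate=1e-7)

print(low_mu_large_ne, high_mu_small_ne)
```

With these made-up inputs the two expected diversities coincide, showing how similar polymorphism levels can arise without invoking selection.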



Selective Constraints on Coding Sequences of Nervous System Genes Are a Major Determinant of Duplicate Gene Retention in Vertebrates

2017-07-16

Abstract
The evolutionary history of vertebrates is marked by three ancient whole-genome duplications: two successive rounds in the ancestor of vertebrates, and a third one specific to teleost fishes. Biased loss of most duplicates enriched the genome for specific genes, such as slowly evolving genes, but this selective retention process is not well understood. To understand what drives the long-term preservation of duplicate genes, we characterized duplicated genes in terms of their expression patterns. We used a new method of expression enrichment analysis, TopAnat, applied to in situ hybridization data from thousands of genes from zebrafish and mouse. We showed that the presence of expression in the nervous system is a good predictor of a higher rate of retention of duplicate genes after whole-genome duplication. Further analyses suggest that purifying selection against the toxic effects of misfolded or misinteracting proteins, which is particularly strong in nonrenewing neural tissues, likely constrains the evolution of coding sequences of nervous system genes, leading indirectly to the preservation of duplicate genes after whole-genome duplication. Whole-genome duplications thus greatly contributed to the expansion of the toolkit of genes available for the evolution of profound novelties of the nervous system at the base of the vertebrate radiation.



Quantifying Transmission Heterogeneity Using Both Pathogen Phylogenies and Incidence Time Series

2017-07-11

Abstract
Heterogeneity in individual-level transmissibility can be quantified by the dispersion parameter k of the offspring distribution. Quantifying heterogeneity is important as it affects other parameter estimates, it modulates the degree of unpredictability of an epidemic, and it needs to be accounted for in models of infection control. Aggregated data such as incidence time series are often not sufficiently informative to estimate k. Incorporating phylogenetic analysis can help to estimate k concurrently with other epidemiological parameters. We have developed an inference framework that uses particle Markov Chain Monte Carlo to estimate k and other epidemiological parameters using both incidence time series and the pathogen phylogeny. Using the framework to fit a modified compartmental transmission model that includes the parameter k to simulated data, we found that more accurate and less biased estimates of the reproductive number were obtained by combining epidemiological and phylogenetic analyses. However, k was most accurately estimated using pathogen phylogeny alone. Accurately estimating k was necessary for unbiased estimates of the reproductive number, but it did not affect the accuracy of reporting probability and epidemic start date estimates. We further demonstrated that inference was possible in the presence of phylogenetic uncertainty by sampling from the posterior distribution of phylogenies. Finally, we used the inference framework to estimate transmission parameters from epidemiological and genetic data collected during a poliovirus outbreak. Despite the large degree of phylogenetic uncertainty, we demonstrated that incorporating phylogenetic data in parameter inference improved the accuracy and precision of estimates.
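To make the role of the dispersion parameter k concrete, the sketch below assumes the offspring distribution is negative binomial with mean R and dispersion k, a common choice in this literature; the abstract itself does not name the distribution. The function names and the example values of R and k are illustrative assumptions, and this is not the authors' inference framework.

```python
import math


def offspring_pmf(n, R, k):
    """P(an infected individual causes n secondary cases) under a negative
    binomial with mean R and dispersion k (assumed parameterization)."""
    log_p = (math.lgamma(n + k) - math.lgamma(k) - math.lgamma(n + 1)
             + k * math.log(k / (k + R))
             + n * math.log(R / (k + R)))
    return math.exp(log_p)


def offspring_variance(R, k):
    """Variance of the same distribution: R + R^2 / k."""
    return R + R * R / k


# Smaller k means more heterogeneity: more individuals infect nobody,
# while a few cause many secondary cases.
for k in (0.1, 1.0, 10.0):
    print(k, offspring_pmf(0, 2.0, k), offspring_variance(2.0, k))
```

For a fixed mean R, lowering k raises both the probability of zero secondary cases and the variance, which is why k modulates how unpredictable an epidemic is.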



The Structured Coalescent and Its Approximations

2017-06-28

Abstract
Phylogeographic methods can help reveal the movement of genes between populations of organisms. Using the location or state of samples alongside sequence data, they have been widely applied to quantify pathogen movement between different host populations, the migration history of humans, the geographic spread of languages, and gene flow between species. Phylogenies therefore offer insights into migration processes not available from classic epidemiological or occurrence data alone. Phylogeographic methods, however, have several known shortcomings. In particular, one of the most widely used methods treats migration the same as mutation, and therefore does not incorporate information about population demography. This may lead to severe biases in estimated migration rates for data sets where sampling is biased across populations. The structured coalescent, on the other hand, allows us to coherently model the migration and coalescent process, but current implementations struggle with complex data sets due to the need to infer ancestral migration histories. Thus, approximations to the structured coalescent, which integrate over all ancestral migration histories, have been developed. However, the validity and robustness of these approximations remain unclear. We present an exact numerical solution to the structured coalescent that does not require the inference of migration histories. Although this solution is computationally infeasible for large data sets, it clarifies the assumptions of previously developed approximate methods and allows us to provide an improved approximation to the structured coalescent. We have implemented these methods in BEAST2, and we show how these methods compare under different scenarios.
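The way the structured coalescent couples migration with demography can be seen in its event rates. The sketch below assumes haploid demes of constant size, time scaled so that a pair of lineages in deme i coalesces at rate 1/N_i, and backward-in-time per-lineage migration rates. The function name and all numbers are hypothetical; this is neither the exact solution nor the BEAST2 implementation described in the abstract.

```python
def structured_coalescent_rates(lineages, pop_sizes, migration):
    """Return (coalescence_rates, migration_rates) for one configuration.

    lineages[i]     : number of ancestral lineages currently in deme i
    pop_sizes[i]    : effective size of deme i (pairs coalesce at rate 1/N_i)
    migration[i][j] : backward-in-time per-lineage rate from deme i to deme j
    """
    n = len(lineages)
    # Each of the C(k_i, 2) pairs in deme i coalesces at rate 1 / N_i.
    coal = [lineages[i] * (lineages[i] - 1) / 2 / pop_sizes[i]
            for i in range(n)]
    # Each lineage in deme i moves to deme j at rate migration[i][j].
    mig = [[lineages[i] * migration[i][j] if i != j else 0.0
            for j in range(n)]
           for i in range(n)]
    return coal, mig


# Hypothetical two-deme example.
coal, mig = structured_coalescent_rates(
    lineages=[3, 2],
    pop_sizes=[100.0, 50.0],
    migration=[[0.0, 0.01], [0.02, 0.0]],
)
total_rate = sum(coal) + sum(sum(row) for row in mig)
print(coal, mig, total_rate)
```

Because coalescence rates depend on deme sizes and on how many lineages each deme currently holds, population demography enters the model directly, unlike approaches that treat migration as if it were mutation.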