Journal of Molecular Evolution Meeting
Talk Titles and Abstracts
David Liberles, Temple University - Modeling the genotype-phenotype map by combining expectations from population genetics and biochemistry
Common approaches in statistical genetics use association between individual variants and observed phenotypic differences to model the genotype-phenotype map, with a number of unrealistic assumptions that provide the caveat that these models cannot be extrapolated to populations or species that the original analysis was not derived from. Here we provide an alternative modeling framework that combines a mutation-selection scheme with differential equation models rooted in pathway kinetics. Examples of this approach from metabolic pathway evolution are provided. A new model (published in JME) that characterizes selection on gene duplicates that are segregating in a population will also be presented, as another layer to the genotype-phenotype map.
Arturo Becerra, National Autonomous University of Mexico - Early evolution and the nature of the LCA
The early evolution of life spans an extended period, from before the origin of life to the first eukaryotic cell, 4.5 to 2.5 billion years ago, during which many of the main cellular traits appeared and the Last Common Ancestor (LCA) lived. Reconstruction and quantitative estimation of the gene complement of the last common ancestor of all extant organisms, that is, the LCA, may be hindered by ancient horizontal gene transfer (HGT) events and polyphyletic gene losses, as well as by biases in genome databases and methodological artifacts. A few influential research groups have aimed to reconstruct the LCA with richly detailed hypotheses, high-resolution gene catalogs, and metabolic traits. However, HGT poses insurmountable challenges for such detailed reconstructions; we propose, instead, a middle-ground position with the reconstruction of a slim LCA based on traits under intense negative natural selection. Nevertheless, with the development of DNA sequencing technology and the accumulation of a vast amount and diversity of sequences and complete genomes in databases, it was possible to start identifying the genes conserved in Bacteria and Archaea. Most reports agree that the last common ancestor resembled extant prokaryotes. A significant number of the highly conserved genes are sequences involved in RNA synthesis, degradation, and binding, including transcription and translation. The evidence suggests that the LCA was not a hyperthermophile, but it is currently not possible to assess its ecological niche.
Eric Haag, University of Maryland - The species-specific F-box protein FOG-2 enables self-fertility in C. elegans via a novel co-translational ubiquitination mechanism
Reproductive traits are among the most rapidly evolving features of multicellular organisms. C. elegans hermaphrodites are essentially XX females that recently evolved the ability to produce sperm in an ovary. XX spermatogenesis requires a specific protein/mRNA ternary complex composed of the GLD-1 RNA-binding protein, an F-box protein, FOG-2, and the mRNA product of the feminizing gene tra-2. GLD-1 and FOG-2 dimerize and bind the 3’ UTR of tra-2 mRNA to repress its activity during spermatogenesis. Loss of GLD-1, FOG-2, or the GLD-1-binding sites in the tra-2 3’ UTR eliminates sperm production. GLD-1 is a highly conserved regulator of meiotic progression and oogenesis, and has hundreds of RNA targets. In contrast, FOG-2 is a C. elegans-specific F-box protein necessary only for hermaphrodite spermatogenesis. It has no known role in males, although they express it. The precise mechanism through which FOG-2 regulates germline sex is unknown, but there are clues. FOG-2 interacts with SKR-1, a component of the E3 ubiquitin ligase complex [8]. This suggests it acts as a canonical F-box protein by targeting another protein for ubiquitin-mediated degradation. However, the similar feminized phenotypes caused by loss of either protein indicate that GLD-1 is not that target. Genetic experiments suggest that FOG-2 may have roles independent of its association with the tra-2 3’ UTR, and potentially independent of GLD-1 as well. As some protein-protein interactions are mediated co-translationally via RNA-binding proteins bound to 3’ UTRs, we tested the hypothesis that GLD-1 association serves to position FOG-2 to bind and mediate ubiquitination of nascent TRA-2 as it emerges from the ribosome. Consistent with this, we find that the cytoplasmic domain of TRA-2 (TRA-2c) interacts specifically with FOG-2 in yeast two-hybrid assays and in vitro. This interaction does not require the F-box of FOG-2. An N-terminal portion of TRA-2c is sufficient for interaction.
Despite this interaction, overexpression of FOG-2 cannot masculinize the soma, suggesting GLD-1-binding is essential for substantial FOG-2 activity. Our work suggests FOG-2 specificity for germline sex determination results from an interaction with TRA-2, which would not impact the translation products of other GLD-1 target mRNAs.
John Bracht, American University - Evolution of a more efficient mitochondrial cytochrome c oxidase in nematodes
Nematodes constitute one of the most diverse and adaptable phyla on earth. In this study we report the full mitochondrial genome sequence of the Devil Worm, Halicephalobus mephisto, which encodes an evolutionarily adapted mitochondrial cytochrome c oxidase complex (Complex IV). We utilized branch-site dN/dS analysis to locate specific residues of cytochrome c oxidase genes 1 and 3 (COX1 and COX3) that are under positive selection relative to outgroup nematodes. By structural homology modeling we localize the positively selected COX1 amino acids in space, finding that they cluster along the proton translocation channel. We also find, by TMRE staining, that the mitochondrial membrane potential of H. mephisto is significantly higher than that of C. elegans. These data suggest that the unusual cytochrome c oxidase proton pump of the Devil Worm can generate a proton gradient more efficiently than that of C. elegans, likely as an adaptation to hypoxia.
David Alvarez-Ponce, University of Nevada - Dissecting the causes and consequences of the expression level-evolutionary rate anticorrelation
Einat Hazkani-Covo, The Open University of Israel - Template switching is a source of nonsynonymous substitutions in wild-type yeast
DNA polymerase template switching between the arms of short nonidentical inverted repeats (IRs) is a genetic mechanism that results in perfect IRs. Template switching is a non-conservative mutation mechanism that introduces clusters of mutations at once. It is unknown how template switching affects the evolution of coding genes. Here, we tested the effect of template switching between IR arms within the coding regions of 50 Saccharomyces cerevisiae wild-type strains. To do so, we looked for perfect IRs that appear in S. cerevisiae strains and for multiple-nucleotide mutations that overlapped the IRs and were located on the branches leading to the strains with the IRs. Our results indicate that template switching is a powerful generator of genetic variation and nonsynonymous substitutions. Considering simulated data, we identified that template switching contributed to 39 protein-coding genes in S. cerevisiae, causing nonsynonymous mutations on IR arms as well as inversion of the IR spacer. These events mainly resulted in one amino acid replacement but can replace up to five neighboring amino acids. Template switching can promote parallel events, causing spacer flipping and arm changing. The template switching mechanism causes nonsynonymous and parallel substitutions, both traditionally considered evidence of positive selection. Thus, this genetic mechanism should receive more attention in evolutionary analysis.
Amanda Wilson, Temple University - Expectations of Duplicate Gene Retention Under the Gene Duplicability Hypothesis
Gene duplication is an important evolutionary process. Some gene duplicate copies are retained for long evolutionary times while many are lost. The gene duplicability hypothesis proposes that some genes are inherently more “duplicable” than others. In theory, something about a gene's function and the complexity of its network determines whether it is more “duplicable” and has more opportunity to be retained for long evolutionary periods, through processes such as dosage balance, subfunctionalization and neofunctionalization. The more copies of a gene and the longer those copies persist, the more duplicable the gene. Generally speaking, a more duplicable gene would be expected to be more likely to be retained after two duplication events than a less duplicable gene. The conditional probability ratio after consecutive duplication events is the quotient of the probability that a gene will be retained after a second whole genome duplication event (WGD2) given that it was retained after the first whole genome duplication event (WGD1) and the probability that a gene will be retained after WGD2 given that it was lost after WGD1. Duplicate gene retention is a time-heterogeneous process, so it is important to incorporate the time between the events (t1) and the time since the most recent event (t2). Here, we model the gene duplicability hypothesis using a survival analysis framework to predict expected conditional probability ratios for genomes with different t1 and t2 values. We present a formal model of the gene duplicability hypothesis that treats some gene duplicates as more susceptible to selectable functional shifts, some as more susceptible to dosage compensation, and others as only drifting. We use this model to characterize the evolutionary dynamics and explanatory power of a recently developed statistical framework.
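Written out, the conditional probability ratio described in this abstract can be expressed as below. The symbol R and the explicit dependence on t1 and t2 are notation assumed here for illustration; they are not taken from the talk.

```latex
R(t_1, t_2) =
  \frac{P(\text{retained after WGD2} \mid \text{retained after WGD1})}
       {P(\text{retained after WGD2} \mid \text{lost after WGD1})}
```

A ratio of 1 would mean that retention after WGD2 is independent of the outcome after WGD1; values above 1 would be in the direction expected if some genes are inherently more duplicable.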
Keith Crandall, George Washington University - Computational Approaches to Microbiome Characterization
Sequencing data (e.g., genomic or amino acid sequences) holds great promise for understanding biology. For example, in human health, causes, characteristics, and potential interventions for diseases can be determined by the specificity of DNA snapshots and mutations from individuals. There are many challenges in investigating and analyzing sequencing data, including non-independent observations, large noise components, nonlinearity, collinearity, and high dimensionality. Considering all these challenges, sequencing data is well suited for machine learning (ML) techniques that are able to capture nonstructural patterns in association with the biology of interest. Here, we present deepBreaks, a generic approach with unified data analysis steps that identifies the most important positions in sequence data that contribute to phenotypic traits. deepBreaks compares the performance of multiple ML algorithms and prioritizes the most important positions based on the best-fitted models.
Arndt von Haeseler, University of Vienna, Austria [TALK CANCELLED] - Measuring Phylogenetic information
We present a new statistical method to determine the phylogenetic information along the edges of a phylogenetic tree. For two sequences, this is a test for saturation. We discuss the theory and show some interesting examples.
Alejandro Gil Gomez, Stony Brook University - Wiring between close nodes in biological networks evolves more quickly than between distant nodes
Network evolution is typically studied using interactomes of model organisms, in order to compare the rewiring rates of nodes with sequence conservation. This approach is constrained by the small number of model organisms with a complete interactome. In order to assess biological network variation within species, and provide an alternative method that can be scaled across taxa, we use drug-drug interactions (DDIs), a known proxy of network topology, and compute the evolutionary rates of DDIs using previously published data from a pharmacological study in Gram-negative bacteria. We mapped drugs of known targets to different biological networks in E. coli and assessed the effects of different factors that may contribute to rewiring, such as distance between nodes, k-edge connectivity, adjacency, and node degree. We found that drug targets of synergistic DDIs are close in biological networks and that these interactions have a higher evolutionary rate. Here, we modeled drug interaction scores using phylogenetic comparative methods to comment on biological network evolution and the role of different network properties.
Cara Weisman, Princeton University - Lineage-specific genes: novelty or noise? How to tell and why it matters
“Lineage-specific” genes, defined operationally as those lacking detected homologs outside of a narrow group of related species, are widely interpreted as novel genes. They have thereby become the subject of much interest, ranging from investigation of their origins to exploration of their role in the evolution of novel traits at larger scales. Are these genes truly evolutionarily novel? Or might there be other, more mundane, reasons that they lack apparent homologs beyond their lineage? Here, I describe types of technical artifacts that cause genes to appear lineage-specific and estimate their frequency, finding that these sources may produce significant fractions of the sum total of apparent lineage-specific genes. I also develop techniques for identifying which lineage-specific genes may be due to these effects. This work suggests that interpreting lineage-specific genes as novel is not generally reliable, with corresponding implications for work on evolutionary novelty that is based on such interpretations, and provides concrete steps to remedy this problem through methods to determine which lineage-specific genes are least likely to be due to these artifactual sources.
Jody Hey, Temple University - Balancing selection is common for beneficial alleles in a human population
A population genomic analysis of the selective impact and allele age of non-synonymous single nucleotide alleles was undertaken to address basic questions about mechanisms of natural selection in humans. By contrasting the evolutionary neutrality of alleles with their age and population frequencies, we discovered many polymorphic and fixed non-neutral alleles that are older on average than non-coding, non-regulatory alleles of the same frequency, indicating footprints of natural selection. Of these, alleles fixed in the population show higher local recombination rates than those still segregating, consistent with a model in which new beneficial alleles experience an initial period of balancing selection due to linkage disequilibrium with deleterious recessive alleles. Alleles that ultimately fix following a period of balancing selection will leave a modest ‘soft’ sweep impact on the local variation, consistent with the overall paucity of species-wide ‘hard’ sweeps in human genomes.
Meru Sadhu, National Human Genome Research Institute, NIH - Long-read genomes reveal pangenomic variation underlying yeast phenotypic diversity
Understanding the genetic causes of trait variation is a primary goal of genetic research. One way that individuals can vary genetically is through the existence of variable pangenomic genes – genes that are only present in some individuals in a population. The presence or absence of entire genes could have large effects on trait variation. However, variable pangenomic genes can be missed in standard genotyping workflows, due to reliance on aligning short-read sequencing to reference genomes. A popular method for studying the genetic basis of trait variation is linkage mapping, which identifies quantitative trait loci (QTLs), regions of the genome that harbor causative genetic variants. Large-scale linkage mapping in the budding yeast Saccharomyces cerevisiae has found thousands of QTLs affecting myriad yeast phenotypes. To enable the resolution of QTLs caused by variable pangenomic genes, we used long-read sequencing to generate highly complete de novo assemblies of 16 diverse yeast isolates. With these assemblies we resolved QTLs for growth on maltose, sucrose, raffinose, and oxidative stress to specific genes that are absent from the reference genome but present in the broader yeast population at appreciable frequency. Copies of genes also duplicate onto chromosomes where they are absent in the reference genome, and we found that these copies generate additional QTLs whose resolution requires pangenome characterization. Our findings demonstrate the power of long-read sequencing to identify the genetic basis of trait variation.
Josh Rest, Stony Brook University - Genetic variation in protein expression responses to heat: independence of expression level, variance, and scaling with cell size
Cells respond to stress by varying the expression level of various proteins, and genotypic differences in expression responses may be adaptive or non-adaptive. Previous work has identified many such genes with genetic variation in dynamics of mean transcriptional response, as well as in translated (protein-level) response to stressors. Mean expression level is one important component of multidimensional and dynamic response; other important dimensions are the distribution of responses among cells (noise or variance), and the scaling between cell size and expression level. Here, we address the hypothesis that mean, variance, and scaling are genetically variable components that also may vary independently of each other. We also explore co-culture as a mediator of these responses. To do so, we measured changes in protein expression of two GFP-tagged proteins after growth at high temperature among five strains of Saccharomyces cerevisiae. We studied stress response proteins HSP42 and ARA1 in these wild strains, as well as in a reference (lab) strain. Using flow cytometry, we characterized the expression level and cell size of each cell in the population. We also calculated the fitness of each strain relative to the lab strain. We found that mean, variance, and scaling of both ARA1 and HSP42 are genetically variable components that also vary independently of each other across wild strains in response to heat stress. For the lab strain, these metrics vary as a function of the growth rate of the co-cultured wild strain, illustrating competition as a major environmental factor affecting the physiological response. We also address whether differences in fitness among these strains are correlated with strain-specific differences in mean, variance, and scaling.
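The three response dimensions named in this abstract (mean, variance, and scaling with cell size) can be illustrated with a small Python sketch. The abstract does not say how scaling was quantified; the sketch assumes, purely for illustration, that scaling is the slope of log expression against log cell size across cells. The function name `expression_summary` and its inputs are hypothetical.

```python
import math
from statistics import mean, pvariance


def expression_summary(expression, cell_size):
    """Summarize per-cell measurements for one strain (illustrative only).

    expression, cell_size: equal-length sequences of positive numbers,
    one value per cell (e.g., fluorescence and a size proxy).
    """
    log_e = [math.log(x) for x in expression]
    log_s = [math.log(x) for x in cell_size]
    mean_e, mean_s = mean(log_e), mean(log_s)
    # Ordinary least-squares slope of log(expression) on log(size);
    # this definition of "scaling" is an assumption, not the authors' method.
    cov = sum((s - mean_s) * (e - mean_e) for s, e in zip(log_s, log_e))
    var_s = sum((s - mean_s) ** 2 for s in log_s)
    return {
        "mean": mean(expression),
        "variance": pvariance(expression),
        "scaling_slope": cov / var_s,
    }


# Toy data in which expression grows as the square of cell size.
sizes = [1.0, 2.0, 4.0, 8.0]
levels = [3.0 * s ** 2 for s in sizes]
summary = expression_summary(levels, sizes)
```

Because the three quantities are computed separately, two strains can share a mean while differing in variance or slope, which is the kind of independence the abstract refers to.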
Joana Carneiro da Silva, University of Maryland School of Medicine - Genome-wide sieve analysis: using population genetics principles in applied vaccinology
Identification of protective targets of Plasmodium falciparum (Pf) sporozoite (SPZ)-based immunization will elucidate the genomic architecture of whole organism vaccine-induced protection and may lead to novel vaccine formulations with broader efficacy. PfSPZ-based malaria vaccines have shown significant, but partial, vaccine efficacy (VE) against endemic malaria in field studies. Furthermore, clinical trials in which VE was assessed against controlled human malaria infection using either homologous or heterologous strains revealed that these vaccines are susceptible to allele-specific efficacy, the process by which vaccine protection is strongest against pathogen strains immunologically similar to the vaccine strain at protective loci. Here, we take advantage of a population genetics statistic typically used to identify differences between populations, Wright’s Fixation Index, FST, and leverage allele-specific efficacy in field trials of PfSPZ-based vaccines, to identify candidate targets of vaccine-induced protection. These target loci will be those in which allele frequencies differ significantly between infections in vaccinees and controls, with the vaccine allele depleted in the vaccine arm, due to the vaccine sieve effect. One study, in Burkina Faso, used Sanaria® PfSPZ Vaccine, composed of radiation-attenuated SPZs of the PfNF54 strain, while the second study, in Mali, used Sanaria® PfSPZ CVac, consisting of fully infectious PfNF54 SPZs administered under the cover of the chemoprophylactic chloroquine. Both studies were randomized, double-blind, placebo-controlled trials in malaria-experienced adults. Sieve analyses in the Malian trial revealed 179 non-synonymous (NSYN) sites significantly differentiated between vaccinees and controls, in 145 protein-coding loci. In Burkina Faso, 358 significantly differentiated NSYN sites in 295 loci were identified.
The proteins encoded by these loci are enriched for functions such as host cell binding, including proteins intrinsic to or anchored in the cellular membrane. The intersection of both sets of loci resulted in 36 protein-coding genes, including those encoding well-characterized sporozoite antigens, an exported protein, and several membrane-associated conserved proteins and variant surface antigens. These results suggest that genome-wide sieve analysis using infections from participants in placebo-controlled field trials of whole organism-based vaccines can help elucidate previously unknown protective antigens.
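To make the sieve idea concrete, the following Python sketch computes a simple two-group FST at one biallelic site from the frequency of the vaccine allele among infections in vaccinees and in controls. It uses the textbook form FST = (H_T - H_S) / H_T with equal weighting of the two groups; the abstract does not state which FST estimator or significance test was used, so this is an assumption, and the function names are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a biallelic site."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(p_vaccinees, p_controls):
    """Two-group FST = (H_T - H_S) / H_T, groups weighted equally.

    p_vaccinees, p_controls: frequency of the vaccine allele among
    infections in each trial arm (values between 0 and 1).
    """
    p_bar = (p_vaccinees + p_controls) / 2.0
    h_total = expected_heterozygosity(p_bar)
    if h_total == 0.0:
        # Site is monomorphic across both arms: no differentiation.
        return 0.0
    h_within = (expected_heterozygosity(p_vaccinees)
                + expected_heterozygosity(p_controls)) / 2.0
    return (h_total - h_within) / h_total


def vaccine_allele_depleted(p_vaccinees, p_controls):
    """True when the vaccine allele is rarer in the vaccine arm."""
    return p_vaccinees < p_controls


# Toy frequencies: vaccine allele at 0.2 in vaccinees, 0.6 in controls.
example_fst = fst_biallelic(0.2, 0.6)
```

In a real analysis, sample sizes, multiple infections per participant, and a permutation or similar test for significance would also need to be handled; none of that is modeled here.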
TreVaughn Ellis, American University - The Devil Worm’s efficient mitochondria: Adaptation of H. mephisto’s electron transport chain in extreme conditions
Halicephalobus mephisto is a subterrestrial worm found 1.3 kilometers underground in hypoxic, high-temperature conditions. It can thrive without connection to the surface and has been shown to contain a genomic signature for stress adaptation. Because this organism can survive oxygen levels 10-30 times lower than those of surface water, we hypothesize that its mitochondrial respiration has also adapted. Oxygen interacts with Complex IV of the electron transport chain, where it serves as the terminal electron acceptor, producing water while protons are actively transported across the membrane to establish the mitochondrial proton gradient. Tetramethylrhodamine, ethyl ester (TMRE) measures the mitochondrial proton gradient directly. In this study we utilized TMRE to compare the mitochondrial proton gradients of C. elegans and H. mephisto in the presence and absence of sodium azide, a Complex IV inhibitor. The data suggest that H. mephisto has evolved a more efficient Complex IV proton pump.
Michelle Meyer, Boston College - Using network clustering to investigate the evolution of RNA gene regulators
One of the biggest paradigm shifts of the last 30 years has been in our understanding of the roles played by RNA as a gene regulatory molecule. RNA regulation is pervasive across all domains of life, yet the mechanisms utilized are extraordinarily diverse. Within a single species, RNA acts in multiple distinct ways to regulate gene expression, and between even similar species there may be completely different RNAs that accomplish the same biological function. One challenge faced by the RNA community is tracing the evolutionary trajectories of functional RNA molecules that do not encode proteins (ncRNAs). An RNA sequence inherently contains less information per position than a protein sequence. In addition, extensive covariation between nucleotides involved in base-pairing interactions that are distal in primary sequence results in very low primary sequence identity for many homologous RNA molecules. While most protein families are relatively easily traced across long phylogenetic distances, the vast majority of ncRNA families are not. In order to assess ncRNA vertical and horizontal transmission, we turned to a network clustering approach that allows us to visualize RNA similarity across very diverse molecules. This approach enables us to use a variety of different similarity measures, including sequence, secondary structure, or both, to cluster similar sequences. We applied this approach to examine outstanding questions in RNA biology. Our systems of study are riboswitches, structured ncRNA cis-regulatory sequences occurring predominantly in bacteria that bind small-molecule ligands to regulate gene expression. We have used network clustering to examine two distinct evolutionary questions. First, what is the evolutionary relationship between glycine riboswitches that include tandem homologous glycine-binding motifs and those with only a single such motif?
Second, what are the potential origins of the few riboswitch examples found in eukaryotes rather than bacteria? In these two examples we demonstrate how network clustering can shed light on these evolutionary questions, even in cases where traditional methodology may fail.
Greg Lang, Lehigh University - Rock, paper, scissors: nontransitivity in experimental evolution
A common misconception is that evolution is a linear “march of progress,” where each organism along a line of descent is more fit than all those that came before it. Rejecting this misconception implies that evolution is nontransitive: a series of adaptive events will, on occasion, produce organisms that are less fit compared to a distant ancestor. We identified a nontransitive evolutionary sequence in a 1000-generation yeast evolution experiment. We show that nontransitivity arises due to adaptation in the yeast nuclear genome combined with the stepwise deterioration of an intracellular virus. Our results provide a mechanistic case-study for the adaptive evolution of nontransitivity.
Keith Hackbarth, University of Maryland - Dosage Compensation and Meiotic Silencing on the Neo-X Chromosomes of Filarial Nematodes
Brugia malayi and Onchocerca volvulus are helminths consequential for public health in the family Onchocercidae. In each species, a different autosome fused with the X chromosome to form a neo-X chromosome. As a result, their ancestrally XX/XO systems became XX/XY, where the Y represents a degenerated version of the unattached autosomal homolog chromosome. The new association of a large region of the genome with the sex chromosomes introduces evolutionary pressures due to a change in gene dosage in males. These independent fusion events offer an opportunity to investigate and compare the consequences of X:autosome fusions for the genomes of these parasites. Here, we combine and reanalyze RNA-seq data sets from B. malayi and O. volvulus to show that dosage compensation and meiotic silencing are present along the neo-X chromosomes of these parasitic nematodes.
Ananias Escalante, Temple University - Evolution of primate malaria parasites
Ayna Mammedova, Temple University - Spatial clustering of amino acid substitution in proteins
Spatial clustering of amino acid changes in proteins is a result of varying mutational rates in different regions of proteins coupled to different types of selection. Direct selection on a functional region of a protein is one reason for clustered substitution. Compensatory processes within a folded structure are another. Amino acid substitution data conditioned on the number of substitutions occurring on a branch, the protein structure, and the branch dN/dS value are used to evaluate the propensity of different processes to give rise to clustered substitution and to evaluate the use of such clusters to detect positive selection. This requires a comprehensive analysis of spatial distances between substitutions in proteins under selection, using data taken from the TAED and PDB databases.
Jenna Archambeau, American University - Coral microbiomes across the Red Sea and their potential role in coral thermal resilience
Coral microbiomes quickly respond to high sea water temperatures and have been speculated to influence coral thermal resilience. Determining the microbial composition of naturally thermally resilient corals could provide insights into the microbial mechanisms underlying thermal adaptation. We previously identified thermally resilient and sensitive coral colonies from four species across five regions spanning the natural latitudinal temperature gradient in the Red Sea (published in Evensen et al., 2022). Here we aim to characterize the microbiomes of such colonies and identify microbial markers indicative of coral thermal resilience using 16S rRNA amplicon sequencing data. Although native coral microbiome diversity was shown to be highly species-specific, northern Red Sea coral microbiomes were significantly different from those of other regions. In addition, we found common microbial dynamics during thermal stress across all species and sites, suggesting a unique microbial response to thermal stress that is independent of the native community composition. Understanding this partnership can help to answer questions about coral dependency on their microbial symbionts for survival, development, and adaptation to future climate scenarios.
Ryan Skalsky, University of Maryland School of Medicine - Analysis of Structure and Epitope Characteristics of Novel Plasmodium falciparum Antigens
In 2020, Plasmodium falciparum (Pf) caused an estimated 241 million human infections and 627,000 deaths. The immense burden exerted by Pf highlights the need for new, effective interventions, including highly efficacious vaccines. To date, no vaccine has achieved consistently high efficacy in endemic regions, in part due to allele-specific efficacy. The design of new, highly efficacious vaccines would be facilitated by novel Pf vaccine candidates, a better understanding of functional Pf epitopes associated with protection and validation of novel antigen discovery methods. In a separate abstract, our group reports leveraging allele-specific efficacy in clinical trials of a whole-organism, sporozoite-based malaria vaccine tested against field challenge to identify Pf antigens associated with protection, using a genome-wide sieve analysis. These analyses revealed a few hundred putative protective antigens in each of two trials, including well-established pre-erythrocytic candidates. We hypothesized that, if most of these candidates are in fact protective antigens then, at the Pf genomic sites significantly differentiated between vaccinees and controls (or “target sites”), (i) the vaccine allele should be underrepresented in infections from vaccinees relative to controls, and (ii) target sites should fall preferentially in epitopes. Our analyses show that, consistent with (i), in ~70% of target sites, the vaccine allele is significantly depleted among Pf infections in vaccinees (p<0.05). We are currently analyzing the distribution of target sites relative to predicted epitopes. T-cell epitopes were predicted using the netMHCpan suite and predominant HLA types at the clinical trial sites. Target sites that fall in epitopes or structural motifs will be further analyzed to characterize the properties of the amino acid residue changes associated with differences in immunogenicity and that contribute to vaccine evasion. 
A down-selection pipeline was also developed to identify ideal antigens for in vitro validation based on sites with shared vaccine allele depletion, containing significantly differentiated sites resulting in nonsynonymous mutations, those not belonging to multi-gene families, with known function and structural characteristics, with informative GO term enrichment analysis, and containing sites within predicted T-cell epitopes. Future work intends to validate these down-selected vaccine antigens by measuring T-cell response from patient PBMCs from the respective clinical trials to reconstructed peptide arrays using T-cell ELISpot.
Posters
Amruta Tendolkar, George Washington University
Hox genes are deeply conserved developmental genes that bring about differentiation of serial homologs in bilateral animals. These genes trigger segment-specific signaling networks that establish the fate of the different regions of the embryo. Mutations in Hox genes result in homeosis: loss of one body identity and gain of another. This study focuses on the Hox gene Ultrabithorax (Ubx) and its role in insect wing diversification. Wings show structural differences among insect orders: balancing halteres in flies, protective hardened shells in beetles and intricate patterns in butterflies. In insects, Ubx is expressed in the third thoracic segment and is responsible for the ontogenesis of the hindwing. The role of Ubx in hindwing differentiation has been studied in flies, beetles and planthoppers; however, description of its role in butterflies and moths is limited. Our previous study in Lepidoptera identified Ubx as a micromanager of hindwing identity, where a loss of the protein generated changes in size and shape of scales, venation, eyespot formation, etc., results consistent with several other studies in beetles, bugs and planthoppers. However, we needed to disentangle these functions of Ubx in hindwing patterning in Lepidoptera. Using ATAC-seq and conservation analysis, we identified hindwing-specific open chromatin regions at the Ubx locus that were conserved between Lepidoptera and Trichoptera. To functionally test these putative regulatory regions, we knocked them out using CRISPR-Cas9. Our results identify an ~18 kb region potentially containing an enhancer and a silencer of Ubx. Future studies will dig deeper into this locus to identify the extent of each of these regions.
Artemiza Martinez, Lehigh University
Speciation is thought to arise by gradual evolution of genetic incompatibilities. These incompatibilities prevent mating (pre-zygotically) or cause inviable or infertile hybrid offspring (post-zygotically). However, in yeast, prezygotic barriers are weak, and interspecific hybridization may occur. In yeast hybrids, there is indirect evidence for nuclear-nuclear Dobzhansky-Muller incompatibilities; however, due to their hybrid sterility, none of these genetic interactions have been corroborated. In this study, we generated viable F1 hybrid tetrads of S. cerevisiae and S. paradoxus by suppressing SGS1 and MSH2. We are trying to increase the sample size by sorting viable spores before germination. By increasing the statistical power, we can apply QTL mapping to identify genetic incompatibilities. Furthermore, we started an experimental evolution of previously genotyped haploid and diploid F1 hybrids to identify compensatory mutations after hundreds of generations. Broadly, these studies will contribute to the understanding of reproductive isolation and speciation in yeast.
Sophie Scobell, National Institutes of Health, National Human Genome Research Institute
As global connectivity increases and human interference with natural environments intensifies, zoonotic spillovers and epidemics are becoming increasingly common. Knowledge of the host range of a virus can inform the likelihood of future spillover or spillback events and can aid in spillover prevention measures such as viral surveillance or preemptive vaccination campaigns. Poxviruses have extremely broad and variable host ranges that are largely unknown. One determinant of viral host range in poxviruses is K3: a protein encoded by most poxviruses that contributes to host species specificity. K3 functions by antagonizing a component of an animal’s innate antiviral immune response: protein kinase R (PKR). We seek to use a high-throughput approach to model the ability of a given poxvirus K3 to inhibit a given animal species’ PKR. We utilize several hundred animal species’ PKR representing a diverse range of animals and mammals. Due to the length of the EIF2AK2 gene that encodes PKR, we design chimeric PKR proteins by replacing a specific region of PKR that binds K3 with the variable binding domains in our species library. Those variable species’ domains are then placed into a human PKR scaffold. We will characterize the theoretical host range of 16 poxviruses, including variations of Monkeypox K3, based on K3 inhibition of several hundred species’ PKR in our yeast growth assay. We hope that our findings can inform host-virus surveillance that will ultimately aid in highlighting potential spillover events to prevent future poxvirus pandemics.
Agusto Luzuriaga, University of Nevada
Analyses in a number of organisms have shown that duplicated genes are less likely to be essential than singletons. This implies that genes can often compensate for the loss of their paralogs. However, it is unclear why the loss of some duplicates can be compensated by their paralogs, whereas the loss of other duplicates cannot. Surprisingly, initial analyses in mice did not detect differences in the essentiality of duplicates and singletons. Only subsequent analyses, using larger gene knockout datasets and controlling for a number of confounding factors, did detect significant differences. Previous studies have not taken into account the tissues in which duplicates are expressed. We hypothesized that in complex organisms, in order for a gene’s loss to be compensated by one or more of its paralogs, such paralogs need to be expressed in at least the same set of tissues as the lost gene. To test our hypothesis, we classified mouse duplicates into two categories based on the expression patterns of their paralogs: “compensable duplicates” (those with paralogs expressed in all the tissues in which the gene is expressed) and “non-compensable duplicates” (those whose paralogs are not expressed in all the tissues where the gene is expressed). In agreement with our hypothesis, the essentiality of non-compensable duplicates is similar to that of singletons, whereas compensable duplicates exhibit a substantially lower essentiality. Our results imply that duplicates can often compensate for the loss of their paralogs, but only if they are expressed in the same tissues. Indeed, the compensation ability is more dependent on expression patterns than on protein sequence similarity. The existence of these two kinds of duplicates with different essentialities, which has been overlooked by prior studies, may have hindered the detection of differences between singletons and duplicates.
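The compensable versus non-compensable classification described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the tissue labels, and the reading that at least one single paralog must cover every tissue of the focal gene (rather than the paralogs jointly) are all assumptions made for illustration.

```python
from typing import List, Set


def classify_duplicate(gene_tissues: Set[str],
                       paralog_tissues: List[Set[str]]) -> str:
    """Label a gene by whether a paralog is expressed in all of its tissues.

    gene_tissues: tissues in which the focal gene is expressed.
    paralog_tissues: one tissue set per paralog (empty list = singleton).
    Assumption: one paralog alone must cover every tissue of the focal gene.
    """
    if not paralog_tissues:
        return "singleton"
    # Subset test: every tissue of the gene is also a tissue of the paralog.
    covered = any(gene_tissues <= tissues for tissues in paralog_tissues)
    return "compensable" if covered else "non-compensable"


# Hypothetical expression data, for illustration only.
print(classify_duplicate({"liver", "brain"}, [{"liver", "brain", "heart"}]))
print(classify_duplicate({"liver", "brain"}, [{"liver"}, {"brain"}]))
print(classify_duplicate({"liver"}, []))
```

Under the alternative reading, in which paralogs may jointly cover the gene's tissues, the subset test would be applied to the union of the paralog tissue sets instead.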
Michael Chambers, Georgetown University
The interface between interacting host and viral proteins can be a battleground in which genetic variants are naturally pursued. One such scenario is found between the mammalian innate immunity protein PKR (protein kinase R) and its poxvirus antagonist K3. Exploring the impact of missense variants in both PKR and K3 will highlight residues of evolutionary constraint and opportunity while also elucidating the mechanism by which human PKR is able to subvert a rapidly evolving antagonist. We reason that paired human PKR and vaccinia K3L variants can be characterized using a combinatorial high-throughput cloning approach and a yeast growth assay. In this assay, PKR activity suppresses yeast growth, which is restored if K3 successfully inhibits PKR. By tracking barcodes from sample timepoints in the assay, we will be able to quantify and characterize the impact of each PKR and K3L variant combination, highlighting points of evolutionary constraint and opportunity for PKR and K3L. This strategy would allow us to scan a vast combinatorial space in a single experiment, providing details of the evolutionary fitness landscape of PKR and K3L as well as the ability of each protein to adapt to the other.
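One common way to turn barcode counts from two timepoints into a per-variant score is a log2 ratio of relative barcode frequencies. The Python sketch below illustrates that general idea only; the abstract does not state the scoring method, so the function name, the pseudocount, and the example counts are assumptions.

```python
import math
from typing import Dict


def barcode_log2_enrichment(counts_start: Dict[str, int],
                            counts_end: Dict[str, int],
                            pseudocount: float = 0.5) -> Dict[str, float]:
    """Return log2(end frequency / start frequency) for each barcode.

    Positive scores mean the barcode became more frequent over the assay,
    as would be expected when K3 inhibits PKR and yeast growth is restored.
    The pseudocount (an assumed value) avoids division by zero.
    """
    barcodes = list(counts_start)
    n = len(barcodes)
    total_start = sum(counts_start[bc] for bc in barcodes) + pseudocount * n
    total_end = sum(counts_end.get(bc, 0) for bc in barcodes) + pseudocount * n
    scores = {}
    for bc in barcodes:
        freq_start = (counts_start[bc] + pseudocount) / total_start
        freq_end = (counts_end.get(bc, 0) + pseudocount) / total_end
        scores[bc] = math.log2(freq_end / freq_start)
    return scores


# Hypothetical counts for two PKR-K3L variant pairs, for illustration only.
scores = barcode_log2_enrichment({"pairA": 100, "pairB": 100},
                                 {"pairA": 300, "pairB": 100})
print(scores)
```

With more than two timepoints, a slope fitted to log frequency over time would be a natural extension of the same idea.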
Maria Pacheco, Temple University
Malaria is caused by protists of the genus Plasmodium (Phylum Apicomplexa). These vector-borne parasites are found in many terrestrial vertebrates in almost all ecosystems. Among the primate malaria parasites, those found in lemurs have been neglected. Thus far, 11 Plasmodium lineages have been detected using mitochondrial genomes (mtDNA, ≈6 kb) in 246 samples of twelve lemur species. Also, data from the genome (≈6 kb) of the apicoplast, a plastid-like organelle, are available from some Plasmodium and other Haemosporida species. Thus far, putative lemur Plasmodium species are a diverse monophyletic group that shares a common ancestor with other Plasmodium in primates from continental Africa. Given the extended taxonomic sampling, time trees for the mtDNA were estimated under different scenarios, and the origin of the lemur clade coincides with the proposed time of their host species’ most recent common ancestor (Lemuridae-Indriidae). The phylogenetic congruence of the lemurs and their parasites was explored. A statistically significant scenario identified four cospeciations, two duplications, four transfers (host-switches), and zero loss events. Thus, the parasite species sampled in lemurs seem to radiate with their hosts. A time tree with fewer taxa was also estimated with mtDNA + apicoplast loci. The apicoplast loci evolve at a different rate than the mtDNA, with less rate variation. However, this observation needs to be confirmed by expanding apicoplast data. Time estimates suggest that the subgenus Laverania, the clade including Plasmodium falciparum, the most lethal human malaria parasite, and its related species in African apes, may have originated with Homininae (African apes). This result coincides with what the host range of the extant species of Plasmodium indicates.
Overall, expanding the taxa and molecular data of parasites in lemurs will inform us about the radiation of this clade and provide a clear picture of the diversification of Plasmodium in primates, including the origin of human malaria parasites.
Brett Morgan, Smithsonian Environmental Research Center - Discordant gene trees indicate evolutionary heterogeneity between mitochondrial loci in Dynastes
Many studies assume that because mitochondrial genes are closely linked and maternally inherited without recombination, they will share the same genealogical history. However, due to their high mutation rate, different mitochondrial loci may not always tell a consistent evolutionary story. Factors contributing to such discordance among mt gene trees have rarely been investigated. We assembled mitochondrial genomes of ten Dynastes beetle species and compared individual gene trees to a trusted nuclear gene phylogeny to test each mitochondrial gene’s utility in phylogenetic inference. We also searched for gene characteristics that were predictive of phylogenetic utility.
Hyewoo Shin, Smithsonian
Orchid mycorrhizal fungi are essential for the growth and survival of orchids, which obtain nutrients, including carbon, from the interaction with their mycorrhizal partners. To obtain the necessary resources for growth, especially at early life history stages, which are non-photosynthetic, orchids interact with mycorrhiza in several fungal families, including the Tulasnellaceae and Ceratobasidiaceae. Little is known about the mechanisms associated with orchid acquisition of resources from orchid mycorrhiza. One of the first steps in understanding the dynamics of orchid-fungal interactions is to characterize the genomics of the different groups of fungi that are known orchid mycorrhiza. The important role of mitochondria is to generate energy for the cell. Mitochondria are organelles of endosymbiotic origin and contain their own genomes. Because mitochondrial genomes (mitogenomes) have a distinct evolutionary history, they can provide advantages to phylogenetic and evolutionary studies. However, mitogenomes of fungi have been reported less often than those of animals and plants. In Cantharellales, only five mitogenomes have been completed to date. None of the mitogenomes of orchid mycorrhizal fungi have been investigated. We report two mitogenomes in this first study. The two fungal families, Tulasnellaceae and Ceratobasidiaceae, have many taxa that are known orchid mycorrhiza. The genomic DNAs of Tulasnella sp. (Tulasnellaceae) and Ceratobasidium sp. (Ceratobasidiaceae) were extracted from pure cultures of single pelotons isolated from two native orchids, Goodyera pubescens and Platanthera lacera (Orchidaceae), respectively.