Seasonality and the effectiveness of mass vaccination

Dennis L. Chao, Dobromir T. Dimitrov

Mathematical Biosciences and Engineering

November, 2015


Many infectious diseases have seasonal outbreaks, which may be driven by cyclical environmental conditions (e.g., an annual rainy season) or human behavior (e.g., school calendars or seasonal migration). If a pathogen is only transmissible for a limited period of time each year, then seasonal outbreaks could infect fewer individuals than expected given the pathogen's in-season transmissibility. Influenza, with its short serial interval and long season, probably spreads throughout a population until a substantial fraction of susceptible individuals are infected. Dengue, with a long serial interval and shorter season, may be constrained by its short transmission season rather than the depletion of susceptibles. Using mathematical modeling, we show that mass vaccination is most efficient, in terms of infections prevented per vaccine administered, at high levels of coverage for pathogens that have relatively long epidemic seasons, like influenza, and at low levels of coverage for pathogens with short epidemic seasons, like dengue. Therefore, the length of a pathogen's epidemic season may need to be considered when evaluating the costs and benefits of vaccination programs.
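
The interaction between season length and vaccine efficiency described in this abstract can be explored with a toy model. The sketch below is not the authors' model. It assumes a basic SIR epidemic in which transmission occurs only during the season, and a perfectly protective vaccine given before the season starts. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
def attack_rate(r0, infectious_days, season_days, coverage, seed=1e-4, dt=0.05):
    """Fraction of the whole population infected during one season.

    Assumptions (illustrative only): SIR dynamics, transmission limited to
    `season_days`, vaccine fully protective and given before the season.
    """
    gamma = 1.0 / infectious_days          # recovery rate per day
    beta = r0 * gamma                      # in-season transmission rate
    s = (1.0 - coverage) * (1.0 - seed)    # unvaccinated susceptibles
    i = (1.0 - coverage) * seed            # initially infectious
    infected = i
    for _ in range(round(season_days / dt)):   # forward Euler steps
        new_infections = beta * s * i * dt
        s -= new_infections
        i += new_infections - gamma * i * dt
        infected += new_infections
    return infected


def prevented_per_dose(r0, infectious_days, season_days, coverage):
    """Infections prevented per vaccine dose at the given coverage."""
    baseline = attack_rate(r0, infectious_days, season_days, 0.0)
    with_vaccine = attack_rate(r0, infectious_days, season_days, coverage)
    return (baseline - with_vaccine) / coverage


# Compare a long and a short season at several coverage levels.
for season in (365, 30):
    for coverage in (0.1, 0.5, 0.9):
        value = prevented_per_dose(2.0, 4.0, season, coverage)
        print(f"season={season:3d} d  coverage={coverage:.1f}  prevented/dose={value:.3f}")
```

Changing `season_days` relative to `infectious_days` shows how a short season can end an outbreak before susceptibles are depleted, which is the mechanism the abstract describes.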

Cholera Outbreak in Grande Comore: 1998–1999

Christopher Troeger, Jean Gaudart, Romain Truillet, Kankoe Sallah, Dennis L. Chao, Renaud Piarroux

The American Journal of Tropical Medicine and Hygiene

November 16, 2015


In 1998, a cholera epidemic in east Africa reached the Comoros Islands, an archipelago in the Mozambique Channel that had not reported a cholera case for more than 20 years. In just a little over 1 year (between January 1998 and March 1999), Grande Comore, the largest island in the Union of the Comoros, reported 7,851 cases of cholera, about 3% of the population. Using case reports and field observations during the medical response, we describe the epidemiology of the 1998–1999 cholera epidemic in Grande Comore. Outbreaks of infectious diseases on islands provide a unique opportunity to study transmission dynamics in a nearly closed population, and they may serve as stepping-stones for human pathogens to cross unpopulated expanses of ocean.


Cholera Transmission in Ouest Department of Haiti: Dynamic Modeling and the Future of the Epidemic

Alexander Kirpich, Thomas A. Weppelmann, Yang Yang, Afsar Ali, J. Glenn Morris Jr., Ira M. Longini

PLOS Neglected Tropical Diseases

October 21, 2015


In the current study, a comprehensive, data-driven mathematical model for cholera transmission in Haiti is presented. Along with the inclusion of short-cycle human-to-human transmission and long-cycle human-to-environment and environment-to-human transmission, this novel dynamic model incorporates both the reported cholera incidence and remote sensing data from the Ouest Department of Haiti from 2010 to 2014. The model has separate compartments for infectious individuals that include different levels of infectivity to reflect the distribution of symptomatic and asymptomatic cases in the population. The environmental compartment, which serves as a source of exposure to toxigenic V. cholerae, is also modeled separately based on the biology of the causative bacterium, the shedding of V. cholerae O1 by humans into the environment, as well as the effects of precipitation and water temperature on the concentration and survival of V. cholerae in aquatic reservoirs. Although the number of reported cholera cases has declined compared to the initial outbreak in 2010, the increase in the number of susceptible population members and the presence of toxigenic V. cholerae in the environment estimated by the model indicate that without further improvements to drinking water and sanitation infrastructures, intermittent cholera outbreaks are likely to continue in Haiti.

The role of influenza in the epidemiology of pneumonia

Sourya Shrestha, Betsy Foxman, Joshua Berus, Willem G. van Panhuis, Claudia Steiner, Cécile Viboud, Pejman Rohani

NATURE Scientific Reports

October 21, 2015


Interactions arising from sequential viral and bacterial infections play important roles in the epidemiological outcome of many respiratory pathogens. Influenza virus has been implicated in the pathogenesis of several respiratory bacterial pathogens commonly associated with pneumonia. Though clinical evidence supporting this interaction is unambiguous, its population-level effects—magnitude, epidemiological impact and variation during pandemic and seasonal outbreaks—remain unclear. To address these unknowns, we used longitudinal influenza and pneumonia incidence data, at different spatial resolutions and across different epidemiological periods, to infer the nature, timing and the intensity of influenza-pneumonia interaction. We used a mechanistic transmission model within a likelihood-based inference framework to carry out formal hypothesis testing. Irrespective of the source of data examined, we found that influenza infection increases the risk of pneumonia by ~100-fold. We found no support for enhanced transmission or severity impact of the interaction. For model-validation, we challenged our fitted model to make out-of-sample pneumonia predictions during pandemic and non-pandemic periods. The consistency in our inference tests carried out on several distinct datasets, and the predictive skill of our model increase confidence in our overall conclusion that influenza infection substantially enhances the risk of pneumonia, though only for a short period.



The effects of a deleterious mutation load on patterns of influenza A/H3N2’s antigenic evolution in humans

Katia Koelle, David A. Rasmussen


September 15, 2015


Recent phylogenetic analyses indicate that RNA virus populations carry a significant deleterious mutation load. This mutation load has the potential to shape patterns of adaptive evolution via genetic linkage to beneficial mutations. Here, we examine the effect of deleterious mutations on patterns of influenza A subtype H3N2’s antigenic evolution in humans. By first analyzing simple models of influenza that incorporate a mutation load, we show that deleterious mutations, as expected, act to slow the virus’s rate of antigenic evolution, while making it more punctuated in nature. These models further predict three distinct molecular pathways by which antigenic cluster transitions occur, and we find phylogenetic patterns consistent with each of these pathways in influenza virus sequences. Simulations of a more complex phylodynamic model further indicate that antigenic mutations act in concert with deleterious mutations to reproduce influenza’s spindly hemagglutinin phylogeny, co-circulation of antigenic variants, and high annual attack rates. 

Epidemic processes in complex networks

Romualdo Pastor-Satorras, Claudio Castellano, Piet Van Mieghem, Alessandro Vespignani

Reviews of Modern Physics

August 31, 2015


In recent years the research community has accumulated overwhelming evidence for the emergence of complex and heterogeneous connectivity patterns in a wide range of biological and sociotechnical systems. The complex properties of real-world networks have a profound impact on the behavior of equilibrium and nonequilibrium phenomena occurring in various systems, and the study of epidemic spreading is central to our understanding of the unfolding of dynamical processes in complex networks. The theoretical analysis of epidemic spreading in heterogeneous networks requires the development of novel analytical frameworks, and it has produced results of conceptual and practical relevance. A coherent and comprehensive review of the vast research activity concerning epidemic processes is presented, detailing the successful theoretical approaches as well as making their limits and assumptions clear. Physicists, mathematicians, epidemiologists, computer, and social scientists share a common interest in studying epidemic spreading and rely on similar models for the description of the diffusion of pathogens, knowledge, and innovation. For this reason, while focusing on the main results and the paradigmatic models in infectious disease modeling, the major results concerning generalized social contagion processes are also presented. Finally, the research activity at the forefront in the study of epidemic spreading in coevolving, coupled, and time-varying networks is reported.



Positive selection in CD8+ T-cell epitopes of influenza nucleoprotein revealed by a comparative analysis of human and swine viral lineages

Heather M. Machkovech, Trevor Bedford, Marc A. Suchard, Jesse D. Bloom

Journal of Virology

August 26, 2015


Numerous experimental studies have demonstrated that CD8+ T-cells contribute to immunity against influenza by limiting viral replication. It is therefore surprising that rigorous statistical tests have failed to find evidence of positive selection in the epitopes targeted by CD8+ T-cells. Here we use a novel computational approach to test for selection in CD8+ T-cell epitopes. We define all epitopes in the nucleoprotein (NP) and matrix protein (M1) with experimentally identified human CD8+ T-cell responses, and then compare the evolution of these epitopes in parallel lineages of human and swine influenza that have been diverging since roughly 1918. We find a significant enrichment of substitutions that alter human CD8+ T-cell epitopes in the NP of human versus swine influenza, consistent with the idea that these epitopes are under positive selection. Furthermore, we show that epitope-altering substitutions to human influenza NP are enriched on the trunk versus the branches of the phylogenetic tree, indicating that viruses that acquire these mutations have a selective advantage. However, even in human influenza NP, sites in T-cell epitopes evolve more slowly than non-epitope sites, presumably because these epitopes are under higher inherent functional constraint. Overall, our work demonstrates that there is clear selection from CD8+ T-cells in human influenza NP, and illustrates how comparative analyses of viral lineages from different hosts can identify positive selection that is otherwise obscured by strong functional constraint.

Efficacy and effectiveness of an rVSV-vectored vaccine expressing Ebola surface glycoprotein: interim results from the Guinea ring vaccination cluster-randomised trial

Ana Maria Henao-Restrepo, Ira M Longini, Matthias Egger, Natalie E Dean, W John Edmunds, Anton Camacho, Miles W Carroll, Moussa Doumbia, Bertrand Draguez, Sophie Duraffour, Godwin Enwere, Rebecca Grais, Stephan Gunther, Stefanie Hossmann, Mandy Kader Kondé, Souleymane Kone, Eeva Kuisma, Myron M Levine, Sema Mandal, Gunnstein Norheim, Ximena Riveros, Aboubacar Soumah, Sven Trelle, Andrea S Vicari, Conall H Watson, Sakoba Kéïta, Marie Paule Kieny, John-Arne Røttingen

The Lancet

July 31, 2015


Background: A recombinant, replication-competent vesicular stomatitis virus-based vaccine expressing a surface glycoprotein of Zaire Ebolavirus (rVSV-ZEBOV) is a promising Ebola vaccine candidate. We report the results of an interim analysis of a trial of rVSV-ZEBOV in Guinea, west Africa.



The ring vaccination trial: a novel cluster randomised controlled trial design to evaluate vaccine efficacy and effectiveness during outbreaks, with special reference to Ebola

Anton Camacho, Miles W Carroll, Natalie E Dean, Moussa Doumbia, W John Edmunds, Matthias Egger, Godwin Enwere, Yper Hall, Ana Maria Henao-Restrepo, Stefanie Hossmann, Sakoba Keita, Mandy Kader Kondé, Ira M Longini, Sema Mandal, Gunnstein Norheim, Ximena Riveros, John-Arne Røttingen, Sven Trelle, Andrea S Vicari, Sara V Watle, Conall H Watson


July 27, 2015


A World Health Organization expert meeting on Ebola vaccines proposed urgent safety and efficacy studies in response to the outbreak in West Africa. One approach to communicable disease control is ring vaccination of individuals at high risk of infection due to their social or geographical connection to a known case. This paper describes the protocol for a novel cluster randomised controlled trial design which uses ring vaccination.

In the Ebola ça suffit ring vaccination trial, rings are randomised 1:1 to (a) immediate vaccination of eligible adults with single dose vaccination or (b) vaccination delayed by 21 days. Vaccine efficacy against disease is assessed in participants over equivalent periods from the day of randomisation. Secondary objectives include vaccine effectiveness at the level of the ring, and incidence of serious adverse events.

Ring vaccination trials are adaptive, can be run until disease elimination, allow interim analysis, and can go dormant during inter-epidemic periods.
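
The 1:1 allocation of rings to immediate or delayed vaccination can be sketched in a few lines. The abstract states only the allocation ratio and the 21-day delay. The use of permuted blocks, the block size, the seed, and the function names below are assumptions made for illustration, not details of the trial protocol.

```python
import random

DELAY_DAYS = 21  # delay for the comparison arm, as stated in the abstract


def randomise_rings(ring_ids, seed=0, block_size=2):
    """Assign rings 1:1 to 'immediate' or 'delayed' vaccination.

    Permuted-block randomisation is an assumption of this sketch; it keeps
    the two arms balanced as rings are enrolled over time.
    """
    rng = random.Random(seed)
    allocation = {}
    for start in range(0, len(ring_ids), block_size):
        block = ring_ids[start:start + block_size]
        arms = ["immediate", "delayed"] * (block_size // 2)
        rng.shuffle(arms)  # random order within each block
        for ring, arm in zip(block, arms):
            allocation[ring] = arm
    return allocation


def vaccination_day(arm, randomisation_day):
    """Day on which a ring is vaccinated, counted from study day 0."""
    return randomisation_day + (DELAY_DAYS if arm == "delayed" else 0)


allocation = randomise_rings([f"ring-{n}" for n in range(1, 11)])
for ring, arm in allocation.items():
    print(ring, arm, vaccination_day(arm, randomisation_day=0))
```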

Masking of antigenic epitopes by antibodies shapes the humoral immune response to influenza

Veronika I. Zarnitsyna, Ali H. Ellebedy, Carl Davis, Joshy Jacob, Rafi Ahmed, Rustom Antia

Philosophical Transactions B

July 20, 2015


The immune responses to influenza, a virus that exhibits strain variation, show complex dynamics where prior immunity shapes the response to the subsequent infecting strains. Original antigenic sin (OAS) describes the observation that antibodies to the first encountered influenza strain, specifically antibodies to the epitopes on the head of influenza's main surface glycoprotein, haemagglutinin (HA), dominate following infection with new drifted strains. OAS suggests that responses to the original strain are preferentially boosted. Recent studies also show limited boosting of the antibodies to conserved epitopes on the stem of HA, which are attractive targets for a ‘universal vaccine’. We develop multi-epitope models to explore how pre-existing immunity modulates the immune response to new strains following immunization. Our models suggest that the masking of antigenic epitopes by antibodies may play an important role in describing the complex dynamics of OAS and limited boosting of antibodies to the stem of HA. Analysis of recently published data confirms model predictions for how pre-existing antibodies to an epitope on HA decrease the magnitude of boosting of the antibody response to this epitope following immunization. We explore strategies for boosting of antibodies to conserved epitopes and generating broadly protective immunity to multiple strains.

Efficient Transition Probability Computation for Continuous-Time Branching Processes via Compressed Sensing

Jason Xu, Vladimir N. Minin


July, 2015


Branching processes are a class of continuous-time Markov chains (CTMCs) with ubiquitous applications. A general difficulty in statistical inference under partially observed CTMC models arises in computing transition probabilities when the discrete state space is large or uncountable. Classical methods such as matrix exponentiation are infeasible for large or countably infinite state spaces, and sampling-based alternatives are computationally intensive, requiring integration over all possible hidden events. Recent work has successfully applied generating function techniques to computing transition probabilities for linear multi-type branching processes. While these techniques often require significantly fewer computations than matrix exponentiation, they also become prohibitive in applications with large populations. We propose a compressed sensing framework that significantly accelerates the generating function method, decreasing computational cost up to a logarithmic factor by only assuming the probability mass of transitions is sparse. We demonstrate accurate and efficient transition probability computations in branching process models for blood cell formation and evolution of self-replicating transposable elements in bacterial genomes.


Crossing the scale from within-host infection dynamics to between-host transmission fitness: a discussion of current assumptions and knowledge

Andreas Handel, Pejman Rohani

Philosophical Transactions B

July 6, 2015


The progression of an infection within a host determines the ability of a pathogen to transmit to new hosts and to maintain itself in the population. While the general connection between the infection dynamics within a host and the population-level transmission dynamics of pathogens is widely acknowledged, a comprehensive and quantitative understanding that would allow full integration of the two scales is still lacking. Here, we provide a brief discussion of both models and data that have attempted to provide quantitative mappings from within-host infection dynamics to transmission fitness. We present a conceptual framework and provide examples of studies that have taken first steps towards development of a quantitative framework that scales from within-host infections to population-level fitness of different pathogens. We hope to illustrate some general themes, summarize some of the recent advances and—maybe most importantly—discuss gaps in our ability to bridge these scales, and to stimulate future research on this important topic.

nextflu: Real-time tracking of seasonal influenza virus evolution in humans

Richard A. Neher and Trevor Bedford


June 26, 2015


Seasonal influenza viruses evolve rapidly, allowing them to evade immunity in their human hosts and reinfect previously infected individuals. Similarly, vaccines against seasonal influenza need to be updated frequently to protect against an evolving virus population. We have thus developed a processing pipeline and browser-based visualization that allows convenient exploration and analysis of the most recent influenza virus sequence data. This web-application displays a phylogenetic tree that can be decorated with additional information such as the viral genotype at specific sites, sampling location and derived statistics that have been shown to be predictive of future virus dynamics. In addition, mutation, genotype and clade frequency trajectories are calculated and displayed.

Availability and implementation

Python and Javascript source code is freely available from https://github.com/blab/nextflu, while the web-application is live at http://nextflu.org.

One versus two doses: What is the best use of vaccine in an influenza pandemic?

Laura Matrajt, Tom Britton, M. Elizabeth Halloran, Ira M. Longini Jr


June 22, 2015


Avian influenza A (H7N9) emerged in China in April 2013, sparking fears of a new, highly pathogenic influenza pandemic. In addition, avian influenza A (H5N1) continues to circulate and remains a threat. Currently, influenza H7N9 vaccines are being tested to be stockpiled along with H5N1 vaccines. These vaccines require two doses, 21 days apart, for maximal protection. We developed a mathematical model to evaluate two possible strategies for allocating limited vaccine supplies: a one-dose strategy, where a larger number of people are vaccinated with a single dose, or a two-dose strategy, where half as many people are vaccinated with two doses. We prove that there is a threshold in the level of protection obtained after the first dose, below which vaccinating with two doses results in a lower illness attack rate than with the one-dose strategy; but above the threshold, the one-dose strategy would be better. For reactive vaccination, we show that the optimal use of vaccine depends on several parameters, with the most important one being the level of protection obtained after the first dose. We describe how these vaccine dosing strategies can be integrated into effective pandemic control plans.
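
The existence of a threshold in first-dose protection can be illustrated with a much simpler model than the one in the paper. The sketch below assumes a homogeneously mixing population, vaccination before the epidemic, and an all-or-nothing vaccine, so the attack rate follows the standard final-size equation. Under these assumptions (which are this sketch's, not the authors') the threshold falls at half the two-dose efficacy. All names and numbers are hypothetical.

```python
import math


def final_size(r0, susceptible_fraction, iterations=500):
    """Solve z = s0 * (1 - exp(-r0 * z)) by fixed-point iteration."""
    z = susceptible_fraction  # start from the upper bound
    for _ in range(iterations):
        z = susceptible_fraction * (1.0 - math.exp(-r0 * z))
    return z


def attack_rate(r0, doses_per_capita, efficacy, doses_per_person):
    """Attack rate when a fixed vaccine supply is spread over the population."""
    coverage = min(1.0, doses_per_capita / doses_per_person)
    return final_size(r0, 1.0 - coverage * efficacy)


R0, SUPPLY, TWO_DOSE_EFFICACY = 1.5, 0.4, 0.8  # illustrative values
for one_dose_efficacy in (0.3, 0.4, 0.5):
    one = attack_rate(R0, SUPPLY, one_dose_efficacy, doses_per_person=1)
    two = attack_rate(R0, SUPPLY, TWO_DOSE_EFFICACY, doses_per_person=2)
    better = "one-dose" if one < two else "two-dose or tie"
    print(f"first-dose efficacy {one_dose_efficacy:.1f}: "
          f"one-dose AR={one:.3f}, two-dose AR={two:.3f} -> {better}")
```

In this simplified setting the comparison reduces to whether `coverage * efficacy` is larger for one dose or two; the paper's model and its reactive-vaccination results involve more parameters than this sketch captures.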

Unraveling the Transmission Ecology of Polio

Micaela Martinez-Bakker, Aaron A. King, Pejman Rohani

PLOS Biology

June 19, 2015


Sustained and coordinated vaccination efforts have brought polio eradication within reach. Anticipating the eradication of wild poliovirus (WPV) and the subsequent challenges in preventing its re-emergence, we look to the past to identify why polio rose to epidemic levels in the mid-20th century, and how WPV persisted over large geographic scales. We analyzed an extensive epidemiological dataset, spanning the 1930s to the 1950s and spatially replicated across each state in the United States, to glean insight into the drivers of polio’s historical expansion and the ecological mode of its persistence prior to vaccine introduction. We document a latitudinal gradient in polio’s seasonality. Additionally, we fitted and validated mechanistic transmission models to data from each US state independently. The fitted models revealed that: (1) polio persistence was the product of a dynamic mosaic of source and sink populations; (2) geographic heterogeneity of seasonal transmission conditions accounts for the latitudinal structure of polio epidemics; (3) contrary to the prevailing “disease of development” hypothesis, our analyses demonstrate that polio’s historical expansion was straightforwardly explained by demographic trends rather than improvements in sanitation and hygiene; and (4) the absence of clinical disease is not a reliable indicator of polio transmission, because widespread polio transmission was likely in the multiyear absence of clinical disease. As the world edges closer to global polio eradication and continues the strategic withdrawal of the Oral Polio Vaccine (OPV), the regular identification of, and rapid response to, these silent chains of transmission is of the utmost importance.

Ebola Virus Epidemiology, Transmission, and Evolution during Seven Months in Sierra Leone

Daniel J. Park, Gytis Dudas, Shirlee Wohl, Augustine Goba, Shannon L.M. Whitmer, Kristian G. Andersen,  Rachel S. Sealfon, Jason T. Ladner, Jeffrey R. Kugelman, Christian B. Matranga, Sarah M. Winnicki, James Qu, Stephen K. Gire, Adrianne Gladden-Young, Simbirie Jalloh, Dolo Nosamiefan, Nathan L. Yozwiak, Lina M. Moses, Pan-Pan Jiang, Aaron E. Lin, Stephen F. Schaffner, Brian Bird, Jonathan Towner, Mambu Mamoh, Michael Gbakie, Lansana Kanneh, David Kargbo, James L.B. Massally, Fatima K. Kamara, Edwin Konuwa, Josephine Sellu, Abdul A. Jalloh, Ibrahim Mustapha, Momoh Foday, Mohamed Yillah, Bobbie R. Erickson, Tara Sealy, Dianna Blau, Christopher Paddock, Aaron Brault, Brian Amman, Jane Basile, Scott Bearden, Jessica Belser, Eric Bergeron, Shelley Campbell, Ayan Chakrabarti, Kimberly Dodd, Mike Flint, Aridth Gibbons, Christin Goodman, John Klena, Laura McMullan, Laura Morgan, Brandy Russell, Johanna Salzer, Angela Sanchez, David Wang, Irwin Jungreis, Christopher Tomkins-Tinch, Andrey Kislyuk, Michael F. Lin, Sinead Chapman, Bronwyn MacInnis, Ashley Matthews, James Bochicchio, Lisa E. Hensley, Jens H. Kuhn, Chad Nusbaum, John S. Schieffelin, Bruce W. Birren, Marc Forget, Stuart T. Nichol, Gustavo F. Palacios, Daouda Ndiaye, Christian Happi, Sahr M. Gevao, Mohamed A. Vandi, Brima Kargbo, Edward C. Holmes, Trevor Bedford, Andreas Gnirke, Ute Ströher, Andrew Rambaut, Robert F. Garry, Pardis C. Sabeti


June 18, 2015


The 2013–2015 Ebola virus disease (EVD) epidemic is caused by the Makona variant of Ebola virus (EBOV). Early in the epidemic, genome sequencing provided insights into virus evolution and transmission and offered important information for outbreak response. Here, we analyze sequences from 232 patients sampled over 7 months in Sierra Leone, along with 86 previously released genomes from earlier in the epidemic. We confirm sustained human-to-human transmission within Sierra Leone and find no evidence for import or export of EBOV across national borders after its initial introduction. Using high-depth replicate sequencing, we observe both host-to-host transmission and recurrent emergence of intrahost genetic variants. We trace the increasing impact of purifying selection in suppressing the accumulation of nonsynonymous mutations over time. Finally, we note changes in the mucin-like domain of EBOV glycoprotein that merit further investigation. These findings clarify the movement of EBOV within the region and describe viral evolution during prolonged human-to-human transmission.

Synonymous and nonsynonymous distances help untangle convergent evolution and recombination

Peter B. Chi, Sujay Chattopadhyay, Philippe Lemey, Evgeni V. Sokurenko, Vladimir N. Minin

Statistical Applications in Genetics and Molecular Biology

June 10, 2015


When estimating a phylogeny from a multiple sequence alignment, researchers often assume the absence of recombination. However, if recombination is present, then tree estimation and all downstream analyses will be impacted, because different segments of the sequence alignment support different phylogenies. Similarly, convergent selective pressures at the molecular level can also lead to phylogenetic tree incongruence across the sequence alignment. Current methods for detection of phylogenetic incongruence are not equipped to distinguish between these two different mechanisms and assume that the incongruence is a result of recombination or other horizontal transfer of genetic information. We propose a new recombination detection method that can make this distinction, based on synonymous codon substitution distances. Although some power is lost by discarding the information contained in the nonsynonymous substitutions, our new method has lower false positive probabilities than the comparable recombination detection method when the phylogenetic incongruence signal is due to convergent evolution. We apply our method to three empirical examples, where we analyze: (1) sequences from a transmission network of the human immunodeficiency virus, (2) tlpB gene sequences from a geographically diverse set of 38 Helicobacter pylori strains, and (3) hepatitis C virus sequences sampled longitudinally from one patient.

Global circulation patterns of seasonal influenza viruses vary with antigenic drift

Trevor Bedford, Steven Riley, Ian G. Barr, Shobha Broor, Mandeep Chadha, Nancy J. Cox, Rodney S. Daniels, C. Palani Gunasekaran, Aeron C. Hurt, Anne Kelso, Alexander Klimov, Nicola S. Lewis, Xiyan Li, John W. McCauley, Takato Odagiri, Varsha Potdar, Andrew Rambaut, Yuelong Shu, Eugene Skepner, Derek J. Smith, Marc A. Suchard, Masato Tashiro, Dayan Wang, Xiyan Xu, Philippe Lemey, Colin A. Russell


June 8, 2015

Understanding the spatiotemporal patterns of emergence and circulation of new human seasonal influenza virus variants is a key scientific and public health challenge. The global circulation patterns of influenza A/H3N2 viruses are well characterized but the patterns of A/H1N1 and B viruses have remained largely unexplored. Here we show that the global circulation patterns of A/H1N1 (up to 2009), B/Victoria, and B/Yamagata viruses differ substantially from those of A/H3N2 viruses, on the basis of analyses of 9,604 haemagglutinin sequences of human seasonal influenza viruses from 2000 to 2012. Whereas genetic variants of A/H3N2 viruses did not persist locally between epidemics and were reseeded from East and Southeast Asia, genetic variants of A/H1N1 and B viruses persisted across several seasons and exhibited complex global dynamics with East and Southeast Asia playing a limited role in disseminating new variants. The less frequent global movement of influenza A/H1N1 and B viruses coincided with slower rates of antigenic evolution, lower ages of infection, and smaller, less frequent epidemics compared to A/H3N2 viruses. Detailed epidemic models support differences in age of infection, combined with the less frequent travel of children, as probable drivers of the differences in the patterns of global circulation, suggesting a complex interaction between virus evolution, epidemiology, and human behaviour.

Dynamics of Pertussis Transmission in the United States

F. M. G. Magpantay and P. Rohani

American Journal of Epidemiology

May 27, 2015

Past patterns of infectious disease transmission set the stage on which modern epidemiologic dynamics are played out. Here, we present a comprehensive account of pertussis (whooping cough) transmission in the United States during the early vaccine era. We analyzed recently digitized weekly incidence records from Morbidity and Mortality Weekly Reports from 1938 to 1955, when the whole-cell pertussis vaccine was rolled out, and related them to contemporary patterns of transmission and resurgence documented in monthly incidence data from the National Notifiable Diseases Surveillance System. We found that, during the early vaccine era, pertussis epidemics in US states could be categorized as 1) annual, 2) initially annual and later multiennial, or 3) multiennial. States with predominantly annual cycles tended to have higher per capita birth rates, more household crowding, more children per family, and lower rates of school attendance than the states with multiennial cycles. Additionally, states that exhibited annual epidemics during 1938–1955 have had the highest recent (2001–2010) incidence, while those states that transitioned from annual cycles to multiennial cycles have had relatively low recent incidence. Our study provides an extensive picture of pertussis epidemiology in the United States dating back to the onset of vaccination, a back-story that could aid epidemiologists in understanding contemporary transmission patterns.

Software for the analysis and visualization of deep mutational scanning data

Jesse D. Bloom

BMC Bioinformatics

May 20, 2015


Background: Deep mutational scanning is a technique to estimate the impacts of mutations on a gene by using deep sequencing to count mutations in a library of variants before and after imposing a functional selection. The impacts of mutations must be inferred from changes in their counts after selection.

Results: I describe a software package, dms_tools, to infer the impacts of mutations from deep mutational scanning data using a likelihood-based treatment of the mutation counts. I show that dms_tools yields more accurate inferences on simulated data than simply calculating ratios of counts pre- and post-selection. Using dms_tools, one can infer the preference of each site for each amino acid given a single selection pressure, or assess the extent to which these preferences change under different selection pressures. The preferences and their changes can be intuitively visualized with sequence-logo-style plots created using an extension to weblogo.

Conclusions: dms_tools implements a statistically principled approach for the analysis and subsequent visualization of deep mutational scanning data.

Keywords: Deep mutational scanning, Sequence logo, Amino-acid preferences
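
For context, the simple ratio-of-counts calculation that the dms_tools abstract uses as its point of comparison can be sketched as follows. This is not the dms_tools algorithm or its API. The function name, the pseudocount, the normalisation to the wildtype ratio, and the example counts are all assumptions made for illustration.

```python
def ratio_preferences(pre_counts, post_counts, wildtype, pseudocount=1.0):
    """Estimate per-site amino-acid preferences from count ratios.

    pre_counts / post_counts map each amino acid at one site to its count
    before and after selection. The pseudocount (an assumption of this
    sketch) avoids division by zero for rarely observed mutations.
    """
    wt_ratio = ((post_counts.get(wildtype, 0) + pseudocount)
                / (pre_counts[wildtype] + pseudocount))
    enrichment = {}
    for aa, pre in pre_counts.items():
        ratio = (post_counts.get(aa, 0) + pseudocount) / (pre + pseudocount)
        enrichment[aa] = ratio / wt_ratio  # enrichment relative to wildtype
    total = sum(enrichment.values())
    # Normalise so the preferences at this site sum to one.
    return {aa: value / total for aa, value in enrichment.items()}


# Hypothetical counts at a single site with wildtype alanine.
pre = {"A": 1000, "G": 50, "P": 50}
post = {"A": 1000, "G": 50, "P": 5}
print(ratio_preferences(pre, post, wildtype="A"))
```

Ratios like these are noisy when counts are low, which is the motivation the abstract gives for a likelihood-based treatment of the counts.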