A beginners guide to estimating the non-synonymous to synonymous rate ratio of all protein-coding genes in a genome

Daniel C. Jeffares, Bartłomiej Tomiczek, Victor Sojo, Mario dos Reis

Research output: Contribution to journal › Article › peer-review

Abstract

The ratio of non-synonymous to synonymous substitutions (dN/dS) is a useful measure of the strength and mode of natural selection acting on protein-coding genes. It is widely used to study patterns of selection on protein-coding genes on a genomic scale, from the small genomes of viruses, bacteria, and parasitic eukaryotes to the largest eukaryotic genomes. In this chapter we describe all the steps necessary to calculate dN/dS for all genes using at least two genomes. We include a brief discussion of ortholog assignment and of codon-aware alignment of orthologs. We then describe how to use the CODEML program of the PAML package for phylogenetic analysis to calculate dN/dS and how to perform some statistical tests for positive selection. We then outline some methods for interpreting the output and describe how these data may be used to make discoveries about the biology of the species under study. Finally, as a worked example, we show all the steps we used to calculate dN/dS for 3,261 orthologs from six Plasmodium species, including tests for adaptive evolution (see worked_example.pdf).

Original language: English
Pages (from-to): 65-90
Number of pages: 26
Journal: Methods in Molecular Biology
Volume: 1201
Publication status: Published - 2015

Keywords

  • Adaptive evolution
  • CODEML
  • dN/dS
  • Evolutionary rate
  • Malaria
  • PAML
  • Plasmodium
  • Synonymous/non-synonymous rate ratio
