Sequence and HMM libraries

  • Eddy Elisee (Creator)
  • Laurine Ducrot (Creator)
  • Raphaël Méheust (Creator)
  • Karine Bastard (Creator)
  • Aurelie Fossey-Jouenne (Creator)
  • Eric Pelletier (Creator)
  • Jean-Louis Petit (Creator)
  • Mark Stam (Creator)
  • Gideon James Grogan (Creator)
  • Veronique de Berardinis (Creator)
  • Anne Zaparucha (Creator)
  • David Vallenet (Creator)
  • Carine Vergne-Vaxelaire (Creator)



This repository contains the data described in the paper entitled "A refined picture of the native Amine Dehydrogenase family revealed by extensive biodiversity screening". It includes:

1. NAD-dependent_enzymes.fa.gz - The library of 20,315,745 sequences of NADPH-dependent enzymes recovered from genomic and metagenomic sequence databases.
2. ref-AmDHs17959_nr.fa.gz - The library of 17,959 ref-AmDH sequences recovered from genomic and metagenomic sequence databases. This set is considered the updated nat-AmDH family.
3. NAD_subfams_HMMs.tar.gz - The library of 104,686 Hidden Markov Models (HMMs) of NADPH-dependent protein subfamilies. As described in the paper, these HMMs were obtained by clustering the set uploaded here as NAD-dependent_enzymes.fa.gz and building one HMM per subfamily.
4. ref-AmDHs_HMMs.tar.gz - This directory includes the HMMs used to update the nat-AmDH family (all_ASMC_no_nad.hmm and all_ASMC_nad_dom.hmm) as well as those used to search for distant homologs: HMMs designed for the phylogenetic and structure-based groups built from the set uploaded here as ref-AmDHs17959_nr.fa.gz (asmc_*.hmm and phylo_*.hmm).
5. ref-AmDH_ASMC_models.tar.gz - The library of 9,886 ref-AmDH models built using the ASMC pipeline.
6. 72_ref-AmDH_seqs_representatives.txt.gz - Sequences of the 72 representative ref-AmDHs experimentally tested and found to be active.
7. 17_nat-AmDH_seqs_specific_feature.txt.gz - Sequences of the 17 nat-AmDHs with specific features that have been heterologously expressed and tested.
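As a quick sanity check after download, the sequence counts stated for the gzipped FASTA libraries can be compared against the files themselves. The following is a minimal sketch using only the Python standard library. It assumes the .fa.gz files are plain gzip-compressed FASTA with one ">" header line per sequence, which is not stated in the description. The helper names read_fasta and count_sequences are hypothetical and are not part of the deposit.

```python
import gzip
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text lines.

    Assumption: standard FASTA layout, i.e. each record starts with a
    '>' header line followed by one or more sequence lines.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line.strip())
    # emit the last record, if any
    if header is not None:
        yield header, "".join(chunks)


def count_sequences(path: str) -> int:
    """Count the records in a gzip-compressed FASTA file."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in read_fasta(handle))
```

If the assumption about the file layout holds, calling count_sequences("ref-AmDHs17959_nr.fa.gz") would be expected to agree with the 17,959 sequences stated above. The parser streams records one at a time, so the same function can be applied to the much larger NAD-dependent_enzymes.fa.gz without loading it into memory.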

External deposit with Zenodo
Date made available: 12 Sept 2023
