Publication Details
digIS: Towards detecting distant and putative novel insertion sequences in prokaryotic genomes
IS elements, mobile element, profile HMM, prokaryotic genomes, genome annotation
Insertion sequence elements (IS elements) are the smallest and most
abundant mobile elements in prokaryotic genomes. It has been shown that they play
a significant role in genome organization and evolution. To better understand
their function in the host genome, it is desirable to have an effective detection
and annotation tool. This need becomes even more pressing given the rapidly
growing volume of genomic and metagenomic data. Existing tools for IS element
detection and annotation usually rely on sequence similarity to a database of
known IS families. They therefore have limited ability to discover
distant and putative novel IS elements. In this paper, we present digIS,
a software tool based on profile hidden Markov models assembled from catalytic
domains of transposases. It performs very well at detecting known IS
elements when tested on datasets with manually curated annotations. The main
contribution of digIS is in its ability to detect distant and putative novel IS
elements while maintaining a moderate level of false positives. In this category,
it outperforms existing tools, especially when tested on large datasets of
archaeal and bacterial genomes. We provide digIS, a software tool using a novel
approach based on manually curated profile hidden Markov models, which is able to
detect distant and putative novel IS elements. Although digIS can find known IS
elements as well, we expect it to be used primarily by scientists interested in
finding novel IS elements. The tool is available at
https://github.com/janka2012/digIS.
@article{BUT168521,
author="Janka {Puterová} and Tomáš {Martínek}",
title="digIS: Towards detecting distant and putative novel insertion sequences in prokaryotic genomes",
journal="BMC Bioinformatics",
year="2021",
volume="22",
number="258",
pages="1--20",
doi="10.1186/s12859-021-04177-6",
issn="1471-2105",
url="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-021-04177-6#article-info"
}