
Data Structures to Represent a Set of k-long DNA Sequences

  • oueb70
  • Apr 1, 2021

Updated: Nov 22, 2022

ACM Computing Surveys

Chikhi R, Holub J, Medvedev P

The analysis of biological sequencing data has been one of the biggest applications of string algorithms. The approaches used in many such applications are based on the analysis of k-mers, which are short fixed-length strings present in a dataset. While these approaches are rather diverse, storing and querying a k-mer set has emerged as a shared underlying component. A set of k-mers has unique features and applications that, over the past 10 years, have resulted in many specialized approaches for its representation. In this survey, we give a unified presentation and comparison of the data structures that have been proposed to store and query a k-mer set. We hope this survey will serve as a resource for researchers in the field as well as make the area more accessible to researchers outside the field.
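
For readers outside the field, the two operations named in the abstract (storing a k-mer set and querying it) can be illustrated with a naive baseline. The sketch below is not one of the specialized structures the survey compares; it simply collects every length-k substring into a plain Python set and tests membership. The function names `kmers` and `build_kmer_set` and the toy reads are made up for illustration.

```python
def kmers(seq: str, k: int):
    """Yield every length-k substring (k-mer) of seq, left to right."""
    # A sequence shorter than k yields nothing, because the range is empty.
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_set(sequences, k: int) -> set:
    """Collect the distinct k-mers of all sequences into a plain Python set.

    Assumption: a hash set stands in for the specialized representations;
    it is simple but stores every k-mer as a full string.
    """
    result = set()
    for seq in sequences:
        result.update(kmers(seq, k))
    return result


# Toy input, invented for this example (not data from the survey).
reads = ["ACGTAC", "GTACGT"]
kmer_set = build_kmer_set(reads, k=3)

print(sorted(kmer_set))   # the distinct 3-mers across both reads
print("TAC" in kmer_set)  # membership query for a k-mer that occurs
print("AAA" in kmer_set)  # membership query for a k-mer that does not
```

A plain set like this keeps each k-mer as a separate string, which becomes costly on real sequencing datasets; this is the kind of overhead that motivates the specialized representations the abstract refers to.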




