Benchmarking
2022

SequenceLab

SequenceLab: A Comprehensive Benchmark of Computational Methods for Comparing Genomic Sequences

Maximilian-David Rumpf1,#, Mohammed Alser1,2,#, Arvid E. Gollwitzer2,*,#, Joël Lindegger2, Nour Almadhoun2, Can Firtina2,*, Serghei Mangul3, Onur Mutlu1,2,*

1Department of Computer Science, ETH Zürich, 8092 Zürich, Switzerland

2Department of Information Technology and Electrical Engineering, ETH Zürich, 8092 Zürich, Switzerland

3Department of Clinical Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA

#These authors contributed equally.

*Corresponding authors. Department of Information Technology and Electrical Engineering, ETH Zürich, Gloriastrasse 35, 8092 Zürich, Switzerland.

E-mail: arvidg@ethz.ch (A. E. G.), firtinac@ethz.ch (C. F.), omutlu@ethz.ch (O. M.)

Abstract

Computational complexity is a key limitation of genomic analyses. Thus, over the last 30 years, researchers have proposed numerous fast heuristic methods that provide computational relief. Comparing genomic sequences is one of the most fundamental computational steps in most genomic analyses. Due to its high computational complexity, optimized exact and heuristic algorithms for this step are still being developed. We find that these methods are highly sensitive to the underlying data, its quality, and various hyperparameters. Despite the wide use of these methods, no in-depth analysis of this sensitivity has been performed, so users risk falsely discarding genetic sequences from further analysis and unnecessarily inflating computational costs. We provide the first analysis and benchmark of this heterogeneity. We deliver an actionable overview of the 11 most widely used state-of-the-art methods for comparing genomic sequences. We also inform readers about the advantages and downsides of each method using a thorough experimental evaluation on different real datasets from all major manufacturers (i.e., Illumina, ONT, and PacBio). SequenceLab is publicly available at https://github.com/CMU-SAFARI/SequenceLab.

arXiv

https://arxiv.org/abs/2310.16908
