Error correction with nanopore sequencing

by Anisha Banerjee (Technical University of Munich)

2025-06-09 14:00 (60 min) in USB 2.022

Among the many benefits and societal threats accompanying the digital age, we wish to focus on a particularly nasty problem: our ever-increasing data storage requirement. While DNA data storage seems to be a promising solution to this problem, further work must be done to make the writing and readout processes more practical and commercially viable while maintaining accuracy. A specific readout, or sequencing, technology that has recently turned many heads, including ours, is nanopore sequencing by Oxford Nanopore, owing to its portability and its support for longer reads. However, its physical operation is plagued by multiple noise sources, implying that coding theorists might need to step in.

In this talk, we focus on nanopore sequencing, specifically the basics of its operation and what makes it noisy. Since the road most often taken to make a synthesis or sequencing technology usable for DNA data storage is riddled with insertion, deletion and substitution errors, we consider an alternative approach: embracing the noise introduced by the nanopore sequencing channel. We observe how, in a theoretical setting, incorporating a simplified model of the channel into the design of error-correcting codes can make such codes more efficient than those designed naively. We also demonstrate how one might do the same in a probabilistic setting at the signal level.