Paper ID | T.4.2 | ||
Paper Title | Duplication with transposition distance to the root for q-ary strings | ||
Authors | Nikita Polyanskii, Technical University of Munich, Germany; Ilya Vorobyev, Skolkovo Institute of Science and Technology, Russia | ||
Session | T.4: Sequences | ||
Presentation | Lecture | ||
Track | Topics in Information Theory | ||
Manuscript | Click here to download the manuscript | ||
Virtual Presentation | Click here to watch in the Virtual Symposium | ||
Abstract | We study the duplication with transposition distance between strings of length $n$ over a $q$-ary alphabet and their roots. In other words, we investigate the number of duplication operations of the form $x = (abcd) \to y = (abcbd)$, where $x$ and $y$ are strings and $a$, $b$, $c$ and $d$ are their substrings, needed to get a $q$-ary string of length $n$ starting from the set of strings without duplications. For exact duplication, we prove that the maximal distance between a string of length at most $n$ and its root has the asymptotic order $n/\log n$. For approximate duplication, where a $\beta$-fraction of symbols may be duplicated incorrectly, we show that the maximal distance has a sharp transition from the order $n/\log n$ to $\log n$ at $\beta=(q-1)/q$. The motivation for this problem comes from genomics, where such duplications represent a special kind of mutation and the distance between a given biological sequence and its root is the smallest number of transposition mutations required to generate the sequence. |
Plan Ahead
2021 IEEE International Symposium on Information Theory
11-16 July 2021 | Melbourne, Victoria, Australia