Genome Sequencing
The process of determining the complete DNA sequence of an organism's genome, typically using high-throughput sequencing technologies. \frac{Reads}{GenomeSize} represents sequencing accuracy.
📜
The statement of the theorem
Let be the set of short reads, where each is a sequence of length . Let be the unknown reference genome sequence. The goal is to reconstruct by solving the assembly problem, often modeled via a de Bruijn graph , where are k-mers and represents overlaps. The estimated genome size is . The sequencing accuracy is defined by the ratio of observed reads to the expected genome size: \nwhere is the total number of reads, and is the length of the reconstructed genome.
Source: Wikipedia