
Comparative Genomics

The analysis of genomic differences between two or more species to understand evolutionary processes, identify conserved regions, and predict gene function.

The statement of the theorem

Let $S_A$ and $S_B$ be two genomic sequences. An alignment $\mathcal{A}$ is a pair of sequences $(S'_A, S'_B)$ of equal length $L'$, derived from $S_A$ and $S_B$ by introducing gaps. The objective is to find the optimal alignment $\mathcal{A}^*$ that maximizes the total score $Score(\mathcal{A})$:

$$Score(\mathcal{A}) = \sum_{i=1}^{L'} \left( W(S'_A[i], S'_B[i]) - G(S'_A[i], S'_B[i]) \right)$$

where $W(\cdot, \cdot)$ is the substitution matrix score (e.g., BLOSUM) for the aligned characters, and $G(\cdot, \cdot)$ is the gap penalty function, typically defined as $G(a, b) = \max(g_{open}, g_{extend}) \cdot \delta_{gap}$. This maximization is solved using dynamic programming (e.g., the Needleman-Wunsch algorithm for global alignment or the Smith-Waterman algorithm for local alignment).
Source: Wikipedia
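The scoring function and the dynamic-programming search described in the statement can be illustrated with a minimal Python sketch of Needleman-Wunsch global alignment. Several things here are assumptions rather than part of the statement: the names `substitution`, `score_alignment`, and `needleman_wunsch` are hypothetical; a toy match/mismatch score of +1/-1 stands in for a real substitution matrix such as BLOSUM; $\delta_{gap}$ is read as 1 when a column contains a gap and 0 otherwise; $W$ is taken to be 0 on gap columns; and the default values of `g_open` and `g_extend` are arbitrary.

```python
GAP = "-"


def substitution(a: str, b: str, match: int = 1, mismatch: int = -1) -> int:
    """Toy stand-in for W: +1 for identical characters, -1 otherwise (assumption)."""
    return match if a == b else mismatch


def score_alignment(a_gapped: str, b_gapped: str, g_open: int = 2, g_extend: int = 1) -> int:
    """Score a given alignment column by column, as the sum of W minus G."""
    if len(a_gapped) != len(b_gapped):
        raise ValueError("aligned sequences must have equal length")
    gap = max(g_open, g_extend)  # G = max(g_open, g_extend) * delta_gap
    total = 0
    for a, b in zip(a_gapped, b_gapped):
        if a == GAP or b == GAP:
            total -= gap  # delta_gap = 1; W assumed to be 0 on gap columns
        else:
            total += substitution(a, b)  # delta_gap = 0
    return total


def needleman_wunsch(s_a: str, s_b: str, g_open: int = 2, g_extend: int = 1):
    """Return (best score, gapped s_a, gapped s_b) for a global alignment."""
    gap = max(g_open, g_extend)
    n, m = len(s_a), len(s_b)

    # F[i][j] holds the best score for aligning s_a[:i] with s_b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = -gap * i
    for j in range(1, m + 1):
        F[0][j] = -gap * j

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + substitution(s_a[i - 1], s_b[j - 1]),  # align two characters
                F[i - 1][j] - gap,  # gap in s_b
                F[i][j - 1] - gap,  # gap in s_a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + substitution(s_a[i - 1], s_b[j - 1]):
            out_a.append(s_a[i - 1])
            out_b.append(s_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] - gap:
            out_a.append(s_a[i - 1])
            out_b.append(GAP)
            i -= 1
        else:
            out_a.append(GAP)
            out_b.append(s_b[j - 1])
            j -= 1

    return F[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    score, a_aln, b_aln = needleman_wunsch("ACGT", "ACT")
    print(score, a_aln, b_aln)
```

Because the gap penalty as stated charges the same amount for every gap column, it behaves like a linear penalty, so a single score table is enough. A penalty that truly distinguishes opening a gap from extending it (an affine penalty) would usually need additional tables, which this sketch does not attempt. Smith-Waterman local alignment differs mainly in clamping each cell at zero and tracing back from the highest-scoring cell.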