
Phylogenetic Analysis

The study of evolutionary relationships among organisms based on their genetic similarities and differences, utilizing genomic data to construct phylogenetic trees.

The statement of the theorem

Let $S_1, S_2, \dots, S_N$ be the sequences of $N$ species, each of length $L$. Define the pairwise distance $d(S_i, S_j)$ using a substitution model $\mathcal{M}$ (e.g., Jukes-Cantor or Kimura 2-parameter) based on the observed differences $D_{ij}$ at each site $l$:

$$d(S_i, S_j) = \frac{1}{L} \sum_{l=1}^{L} \left(1 - \frac{1}{2} \sum_{c \in \{A, T, C, G\}} p_{c}(l) \right)$$

where $p_{c}(l)$ is the probability of character $c$ at site $l$ under model $\mathcal{M}$. The phylogenetic tree $\mathcal{T}$ is then optimized by maximizing the likelihood function

$$\mathcal{L}(\mathcal{T} \mid S_1, \dots, S_N) = \prod_{i=1}^{N} \mathcal{L}(S_i \mid \mathcal{T}),$$

typically with a Maximum Likelihood search. Distance-based methods such as Neighbor-Joining instead build the tree directly from the pairwise distances $d(S_i, S_j)$ rather than by maximizing a likelihood.
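The pairwise-distance step can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to a common length, and it uses the standard Jukes-Cantor correction $d = -\tfrac{3}{4}\ln\left(1 - \tfrac{4}{3}p\right)$, where $p$ is the fraction of differing sites, rather than the per-site expression in the statement. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned sites at which the two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


def jukes_cantor_distance(a: str, b: str) -> float:
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    p = p_distance(a, b)
    if p >= 0.75:
        # The correction is undefined at saturation; report infinity.
        return math.inf
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def distance_matrix(seqs: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Corrected distance for every unordered pair of named sequences."""
    return {
        (i, j): jukes_cantor_distance(seqs[i], seqs[j])
        for i, j in combinations(seqs, 2)
    }


# Hypothetical toy alignment (illustrative data, not from any real dataset).
toy = {"A": "ACGTACGTAC", "B": "ACGTACGTAT", "C": "ACGAACGTTT"}
matrix = distance_matrix(toy)
```

A matrix like this is the usual input to a distance-based method such as Neighbor-Joining. A Maximum Likelihood search works on the alignment itself and would need a separate implementation.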