Protein Structure

The three-dimensional arrangement of atoms in an amino acid-chain molecule.

Sequence of Expressions

Define the hydrogen bond potential $V_{H-bond}$ between a donor (D), a hydrogen (H), and an acceptor (A) based on the distance $r_{DA}$ and the angle $\theta_{DHA}$. A common empirical form is:

$$V_{H-bond} = \epsilon_{H-bond} \left( \frac{1}{r_{DA}} \right) \left( \cos(\theta_{DHA} - \theta_0) - 1 \right) \exp\left( -k (r_{DA} - r_0)^2 \right).$$

The total H-bond energy $E_{H-bond}$ is the sum over all potential donor-acceptor pairs:

$$E_{H-bond} = \sum_{\text{pairs}} V_{H-bond}.$$
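A minimal Python sketch of this expression may help when checking its behavior numerically. The function names and all default parameter values (`eps`, `theta0`, `r0`, `k`) are illustrative assumptions, not fitted force-field parameters. Note that, as written, the angular factor $\cos(\theta_{DHA} - \theta_0) - 1$ vanishes at $\theta_{DHA} = \theta_0$ and is negative elsewhere.

```python
import math

def hbond_potential(r_da, theta_dha, eps=1.0, theta0=math.pi, r0=2.9, k=1.0):
    """V_{H-bond} for one donor-acceptor pair, term by term as in the text.

    r_da is in angstroms and angles are in radians. The defaults are
    placeholder values chosen for illustration only (assumption).
    """
    radial = 1.0 / r_da                           # (1 / r_DA)
    angular = math.cos(theta_dha - theta0) - 1.0  # cos(theta - theta0) - 1
    envelope = math.exp(-k * (r_da - r0) ** 2)    # Gaussian window around r0
    return eps * radial * angular * envelope

def hbond_energy(pairs, **params):
    """E_{H-bond}: sum of V_{H-bond} over (r_DA, theta_DHA) pairs."""
    return sum(hbond_potential(r, theta, **params) for r, theta in pairs)
```

Calling `hbond_energy([(2.9, math.pi), (3.1, 2.8)])` sums the two pair terms; the first contributes nothing because its angle equals `theta0`.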
Let the local backbone geometry of a residue $i$ be defined by the dihedral angles $(\phi_i, \psi_i)$. The classification of secondary structure is based on the allowed regions in the Ramachandran plot. Define the structural motif $\mathcal{S}$ as the set of allowed $(\phi, \psi)$ pairs:

$$\mathcal{S} = \mathcal{S}_{\alpha} \cup \mathcal{S}_{\beta} \cup \mathcal{S}_{\text{random}}$$

where $\mathcal{S}_{\alpha}$ is the region corresponding to $\alpha$-helices (e.g., $-60^{\circ} < \phi < -30^{\circ}$, $-70^{\circ} < \psi < -30^{\circ}$), $\mathcal{S}_{\beta}$ is the region corresponding to $\beta$-strands (e.g., $-120^{\circ} < \phi < -60^{\circ}$, $10^{\circ} < \psi < 140^{\circ}$), and $\mathcal{S}_{\text{random}}$ encompasses the remaining, less constrained regions.
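The partition above can be sketched as a simple lookup. The function name and the string labels are hypothetical; the angle windows are the example ranges quoted in the text, not a validated Ramachandran map.

```python
def classify_backbone(phi_deg, psi_deg):
    """Assign a (phi, psi) pair, in degrees, to one of the three regions.

    The windows are the illustrative ranges from the text above; real
    classifiers use broader, empirically derived regions (assumption).
    """
    if -60.0 < phi_deg < -30.0 and -70.0 < psi_deg < -30.0:
        return "alpha"
    if -120.0 < phi_deg < -60.0 and 10.0 < psi_deg < 140.0:
        return "beta"
    return "random"
```

Because the two windows do not overlap, the order of the checks does not affect the result.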
Consider an amino acid residue $i$ with a stereocenter at the $\alpha$-carbon, $\mathbf{C}_{\alpha}$ (every standard residue except glycine). The chirality is determined by the spatial arrangement of the four distinct substituents (amino group, carboxyl group, side chain $R_i$, and hydrogen atom $\text{H}$). The stereochemistry is quantified by the absolute configuration, typically assigned using the Cahn-Ingold-Prelog (CIP) rules, yielding the $R$ or $S$ designation. Mathematically, this requires defining the handedness of the local coordinate system $(\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3)$ formed by three of the bonds emanating from $\mathbf{C}_{\alpha}$, taken in a fixed order:

$$\text{Chirality} = \text{sgn}(\mathbf{v}_1 \cdot (\mathbf{v}_2 \times \mathbf{v}_3))$$

For biological proteins, the overwhelming preference is for the L-amino acid configuration, corresponding to a specific, consistent sign for this scalar triple product.
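The sign of the scalar triple product can be computed directly. This is a sketch in plain Python; the helper names are hypothetical, and which bond vectors are passed as `v1`, `v2`, `v3` (and in what order) is a convention the text does not fix, so the caller must choose one and apply it consistently.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def chirality_sign(v1, v2, v3):
    """Return +1, -1, or 0 for sgn(v1 . (v2 x v3))."""
    triple = dot(v1, cross(v2, v3))
    return (triple > 0) - (triple < 0)
```

Swapping any two of the three vectors flips the sign, which is why a fixed ordering matters; a result of 0 indicates coplanar vectors and therefore no defined handedness.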
Define the pairwise interaction potential $V_{vdW}(r_{ij})$ between two non-bonded atoms $i$ and $j$ separated by distance $r_{ij}$ using the Lennard-Jones potential:

$$V_{vdW}(r_{ij}) = 4\epsilon_{ij} \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right].$$

The total non-bonded interaction energy $E_{vdW}$ is the summation over all pairs of atoms:

$$E_{vdW} = \sum_{i < j} V_{vdW}(r_{ij}).$$
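A short sketch of the pair term and the double sum follows. For brevity it assumes a single `eps` and `sigma` shared by every pair, whereas the text allows pair-specific $\epsilon_{ij}$ and $\sigma_{ij}$; the function names are hypothetical.

```python
import math

def lj_pair(r, eps, sigma):
    """Lennard-Jones energy 4*eps*[(sigma/r)^12 - (sigma/r)^6] for one pair."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

def lj_total(coords, eps, sigma):
    """Sum lj_pair over all i < j for a list of (x, y, z) coordinates.

    Uses one eps/sigma for every pair (simplifying assumption).
    """
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            total += lj_pair(math.dist(coords[i], coords[j]), eps, sigma)
    return total
```

The pair energy crosses zero at $r_{ij} = \sigma_{ij}$ and reaches its minimum of $-\epsilon_{ij}$ at $r_{ij} = 2^{1/6}\sigma_{ij}$, which makes a convenient sanity check for any implementation.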
Let $\phi$ and $\psi$ be the dihedral angles defining the backbone conformation of an amino acid residue. The allowed conformational space is restricted by steric hindrance and local energy minima. The Ramachandran plot defines the allowed region $\mathcal{R}$ in the $(\phi, \psi)$ plane:

$$\mathcal{R} = \{(\phi, \psi) \in [-\pi, \pi)^2 \mid E_{steric}(\phi, \psi) < E_{cutoff} \text{ and } E_{torsion}(\phi, \psi) < E_{cutoff} \}.$$

The function $E_{steric}$ accounts for atomic overlaps, and $E_{torsion}$ accounts for torsional strain about the backbone bonds.
Let $\mathbf{r}_i$ be the position vector of the $i$-th residue's $\text{C}_{\alpha}$ atom. The geometry of an $\alpha$-helix is defined by the constraints on the backbone dihedral angles $(\phi_i, \psi_i)$ and the hydrogen bonding pattern $\text{C=O}(i) \cdots \text{H-N}(i+4)$, in which the carbonyl oxygen of residue $i$ accepts a hydrogen bond from the amide N-H of residue $i+4$. Specifically, the ideal geometry requires:

$$\phi_i \approx -57^{\circ} \text{ and } \psi_i \approx -47^{\circ}$$

Furthermore, the hydrogen bond constraint dictates that the distance $d(\text{O}(i), \text{N}(i+4))$ and the angle $\angle(\text{N}(i+4) - \text{H}(i+4) \cdots \text{O}(i))$ must approximate the ideal values for a stable hydrogen bond, consistent with a pitch $P \approx 5.4 \text{ \AA}$ and a rise per residue $h \approx 1.5 \text{ \AA}$.
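The pitch and rise quoted above imply $P/h = 3.6$ residues per turn, i.e. a rotation of $100^{\circ}$ per residue. The sketch below places idealized $\text{C}_{\alpha}$ positions on such a helix. The function name is hypothetical, and the helix radius of 2.3 Å is an assumed typical value that does not appear in the text.

```python
import math

def helix_ca_trace(n_residues, pitch=5.4, rise=1.5, radius=2.3):
    """Idealized C-alpha coordinates (angstroms) along a right-handed helix.

    pitch and rise come from the text; radius is an assumption.
    """
    turn = 2.0 * math.pi * rise / pitch  # rotation per residue, in radians
    return [(radius * math.cos(i * turn),
             radius * math.sin(i * turn),
             i * rise)
            for i in range(n_residues)]

trace = helix_ca_trace(8)
# Distance between consecutive C-alpha atoms in the idealized trace.
ca_ca = math.dist(trace[0], trace[1])
```

With these values the consecutive $\text{C}_{\alpha}$ spacing comes out near 3.8 Å, and each residue advances exactly one rise along the helix axis.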
Define two adjacent polypeptide segments, $S_1$ and $S_2$, with backbone atoms $\mathbf{r}_{i}^{(1)}$ and $\mathbf{r}_{j}^{(2)}$, respectively. The formation of a $\beta$-sheet is stabilized by inter-strand hydrogen bonds between the backbone carbonyl oxygen $\text{O}(i)$ of one strand and the amide N-H group of the adjacent strand, whose nitrogen is denoted $\text{N}(j)$. The stability is maximized when the potential energy $E_{\text{H-bond}}$ is minimized, subject to the geometric constraints:

$$E_{\text{H-bond}} = \sum_{i, j} \left[ \frac{A}{d(\text{O}(i), \text{N}(j))^2} - \frac{B}{d(\text{O}(i), \text{N}(j))} \right] + \text{Angle Penalty}$$

where $d(\cdot, \cdot)$ is the distance between atoms, and the angle penalty enforces the near-planarity and optimal dihedral angles characteristic of the extended $\beta$-strand conformation.
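As a worked check of the distance term, one can locate its minimum for a single O-N pair at separation $d$. This holds the angle penalty fixed and assumes $A, B > 0$, which the text does not state explicitly.

```latex
\begin{aligned}
f(d) &= \frac{A}{d^{2}} - \frac{B}{d}, \qquad
f'(d) = -\frac{2A}{d^{3}} + \frac{B}{d^{2}} = 0
\;\Longrightarrow\; d^{*} = \frac{2A}{B}, \\
f(d^{*}) &= \frac{B^{2}}{4A} - \frac{B^{2}}{2A} = -\frac{B^{2}}{4A}.
\end{aligned}
```

Under these assumptions each pair has a single preferred separation $d^{*}$, and the well depth scales as $B^{2}/A$.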
Let $\mathbf{r}_i$ be the position vector of the $i$-th atom (where $i = 1, 2, \dots, N$) in the protein structure. The center of mass $\mathbf{R}_{CM}$ is approximated here by the unweighted average of atomic positions, i.e., all atoms are assigned equal mass:

$$\mathbf{R}_{CM} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{r}_i$$

The Radius of Gyration, $R_g$, is then defined as the root mean square distance of all atoms from the center of mass:

$$R_g = \sqrt{\frac{1}{N} \sum_{i=1}^{N} ||\mathbf{r}_i - \mathbf{R}_{CM}||^2}$$
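Both quantities translate directly into a few lines of Python. This sketch uses the unweighted $1/N$ averages from the formulas above, so it treats every atom as having the same mass; the function names are hypothetical.

```python
import math

def centroid(coords):
    """Unweighted mean position of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(p[k] for p in coords) / n for k in range(3))

def radius_of_gyration(coords):
    """Root mean square distance of all points from their centroid."""
    center = centroid(coords)
    mean_sq = sum(math.dist(p, center) ** 2 for p in coords) / len(coords)
    return math.sqrt(mean_sq)
```

For a mass-weighted variant, each term would be multiplied by the atomic mass $m_i$ and the $1/N$ factors replaced by $1/\sum_i m_i$.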
Let $\mathbf{S} = (a_1, a_2, \dots, a_L)$ be the primary sequence of $L$ residues, where $a_i \in \{A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y\}$. Define the conformational energy function $E(\mathbf{r})$ as a function of the atomic coordinates $\mathbf{r} = (\mathbf{r}_1, \dots, \mathbf{r}_N)$ of the $N$ atoms. The folding process seeks the native state $\mathbf{r}_{\text{nat}}$ such that the free energy $G(\mathbf{r}) = E(\mathbf{r}) - T S(\mathbf{r})$ is minimized, where $T$ is the temperature and $S(\mathbf{r})$ is the entropy (distinct from the sequence $\mathbf{S}$):

$$G(\mathbf{r}_{\text{nat}}) = \min_{\mathbf{r}} G(\mathbf{r}) \quad \text{subject to } \mathbf{r} \text{ being compatible with } \mathbf{S}.$$
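To make the minimization concrete, the toy sketch below picks the lowest free energy state from a small, discrete set of candidates. Everything here is hypothetical: the state names, the energy and entropy values, and the units are invented for illustration, and real folding searches a continuous, very high-dimensional space.

```python
def free_energy(energy, entropy, temperature):
    """G = E - T*S for one candidate state."""
    return energy - temperature * entropy

def lowest_free_energy_state(states, temperature):
    """Return the name of the state minimizing G at the given temperature.

    states maps a name to an (E, S) tuple in arbitrary, consistent units.
    """
    return min(states, key=lambda name: free_energy(*states[name], temperature))

# Made-up values: a low-energy compact state and a high-entropy extended one.
toy_states = {"compact": (-10.0, 0.01), "extended": (-2.0, 0.05)}
```

With these invented numbers the compact state wins at low temperature and the extended state wins at high temperature, illustrating how the $-TS$ term shifts the minimum.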
Consider the free energy change $\Delta G_{hydro}$ associated with the folding of a protein from an unfolded state (U) to a folded state (N) in an aqueous solvent. The hydrophobic effect is primarily driven by the increase in solvent entropy $S_{solvent}$ upon burial of nonpolar surface area $A_{nonpolar}$:

$$\Delta G_{hydro} \approx -T \Delta S_{solvent} \approx -\gamma A_{nonpolar},$$

where $\gamma > 0$ is the surface tension coefficient related to the nonpolar solute-water interaction, and $A_{nonpolar}$ is the total nonpolar surface area buried in the core of the native state relative to the unfolded state. Because $\Delta S_{solvent} > 0$ on burial, $\Delta G_{hydro}$ is negative and favors folding.
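A one-line estimate follows from the surface-area relation, taking $\gamma > 0$ and $A_{nonpolar}$ as the buried area so that the contribution is negative (favorable). The default `gamma` of 0.025 kcal/(mol·Å²) is an assumed, commonly quoted order of magnitude rather than a value from the text, and the function name is hypothetical.

```python
def hydrophobic_free_energy(buried_area, gamma=0.025):
    """Estimate Delta G_hydro in kcal/mol as -gamma * buried nonpolar area.

    buried_area is in square angstroms; gamma is in kcal/(mol*A^2) and its
    default is an illustrative assumption, not a fitted constant.
    """
    return -gamma * buried_area
```

The estimate scales linearly with buried area, so doubling the buried nonpolar surface doubles the predicted stabilization under this simple model.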