Key Equations
Stokes-Einstein: \(D = \frac{k_B T}{6\pi\eta R_h}\)
Hydrodynamic radius (spherical approx.):
\(R_h \approx 0.066 \cdot MW^{0.333}\text{ (nm)}\)
Sedimentation coefficient:
\(s = \frac{M(1-\bar{v}\rho)}{N_A f}\)
Real-time calculation of net charge, hydrodynamic radius Rh, Stokes-Einstein diffusion coefficient D, sedimentation coefficient s, and SDS-PAGE band estimate from molecular weight, pH, and temperature.
The core model treats the protein as a sphere moving through a viscous fluid (the buffer). The hydrodynamic radius ($R_h$) is its effective spherical size. The fundamental relationship between size and diffusion is given by the Stokes-Einstein equation:
$$D = \frac{k_B T}{6\pi\eta R_h}$$
where $D$ is the diffusion coefficient (m²/s), $k_B$ is Boltzmann's constant, $T$ is absolute temperature (K), $\eta$ is the solvent viscosity (Pa·s), and $R_h$ is the hydrodynamic radius (m). This equation shows that larger proteins (bigger $R_h$) diffuse more slowly.
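As a rough illustration of the Stokes-Einstein relation, the sketch below evaluates $D$ for two radii. The function name `stokes_einstein` and the default viscosity (about 0.89 mPa·s, roughly water at 25 °C) are assumptions made for this example, not values taken from the tool.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def stokes_einstein(r_h_m, temp_k=298.15, viscosity_pa_s=8.9e-4):
    """Diffusion coefficient D (m^2/s) of a sphere with radius r_h_m (m).

    The default viscosity is an assumed value for water near 25 degC.
    """
    return K_B * temp_k / (6 * math.pi * viscosity_pa_s * r_h_m)


d_small = stokes_einstein(2e-9)  # sphere with R_h = 2 nm
d_large = stokes_einstein(4e-9)  # sphere with R_h = 4 nm
print(f"D(2 nm) = {d_small:.3e} m^2/s")
print(f"D(4 nm) = {d_large:.3e} m^2/s")
```

Doubling the radius halves $D$, which is the inverse proportionality the equation expresses.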
Since we often only know the molecular weight (MW), we use an empirical scaling law to estimate $R_h$ for a typical globular, folded protein:
$$R_h \approx 0.066 \cdot MW^{0.333}\text{ (nm)}$$
Here, $MW$ is in Daltons (Da), and $R_h$ is in nanometers. The exponent 1/3 (or 0.333) arises because a compact sphere of roughly constant density has a volume proportional to its mass, so its radius scales as the cube root of the mass. This is a simplification: unfolded or elongated proteins will have a larger $R_h$ for the same MW.
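The scaling law is simple enough to check by hand. The short sketch below applies it to a hypothetical 66 kDa protein; the function name `hydrodynamic_radius_nm` is chosen here for illustration only.

```python
def hydrodynamic_radius_nm(mw_da):
    """Estimate R_h (nm) of a compact globular protein from MW (Da)."""
    return 0.066 * mw_da ** (1 / 3)


r_66k = hydrodynamic_radius_nm(66_000)       # hypothetical 66 kDa protein
r_528k = hydrodynamic_radius_nm(8 * 66_000)  # eight times the mass
print(f"R_h(66 kDa)  = {r_66k:.2f} nm")
print(f"R_h(528 kDa) = {r_528k:.2f} nm")
```

The 66 kDa case comes out at roughly 2.7 nm, and an eightfold increase in mass only doubles the radius, as expected from the cube-root dependence.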
Chromatography Method Development: The calculated isoelectric point (pI) and net charge vs. pH profile are used to select the optimal pH and resin type for ion-exchange chromatography, a critical protein purification step in biopharmaceutical manufacturing.
Analytical Ultracentrifugation (AUC): The predicted sedimentation coefficient helps in designing AUC experiments and interpreting data to determine protein homogeneity, aggregation state, and binding interactions in solution.
SDS-PAGE Experiment Planning: The estimated apparent molecular weight on an SDS-PAGE gel allows researchers to select the correct percentage of polyacrylamide gel and to identify their protein band among standards, a daily task in molecular biology labs.
Drug Formulation & Stability: Understanding how diffusion and effective size change with temperature or buffer conditions is vital for formulating stable biologic drugs (like antibodies) and predicting their behavior during storage and delivery.
Here are a few points that experimental researchers often stumble over when starting to use this tool. First, understand that "the calculated pI is not an absolute value." The tool calculates using the "standard" pKa values for each amino acid. However, in actual proteins, the pKa of side chains can deviate significantly from these values due to influences from the local electric field or hydrophobic environment. For example, glutamic acid 35 in lysozyme has a pKa elevated to about 6 due to its unique environment. Therefore, it's not uncommon for calculated and measured values to differ by 0.5 to 1.0 pH units. The calculated pI is a powerful guide, but always confirm final experimental conditions with preliminary tests.
Next, remember that "the estimated SDS-PAGE position is just a guideline." This calculation assumes a standard globular protein. But real proteins can migrate slower due to post-translational modifications (like phosphorylation or glycosylation) or show anomalous migration, like membrane proteins. For instance, a heavily glycosylated protein with a calculated molecular weight of 70 kDa might show a band around 100 kDa on SDS-PAGE. Think of the tool's output as the theoretical position "if the protein behaved ideally in a denatured state."
Finally, make good use of the fact that "the hydrodynamic radius Rh reflects structural information." The tool calculates it from the molecular weight using a simple empirical formula $R_h \approx 0.066 \cdot MW^{1/3}$. If the Rh you measure experimentally (e.g., via dynamic light scattering) is significantly larger than this calculated value, it could be a sign that the protein is aggregated or has adopted an unfolded, expanded structure. Conversely, if it's smaller than the calculated value, the structure is likely very compact. Comparing calculated and measured values gives you qualitative information about your sample's state.
The calculations performed by this tool can be considered the "entry point to multiphysics simulation" in the world of CAE. This is because understanding protein behavior requires simultaneously considering both electrical properties (pI, charge) and physical properties (size, diffusion).
The most direct application is in bioprocess engineering. In ion-exchange chromatography for purifying protein therapeutics, the difference in pI between the target protein and impurities is leveraged. Looking at the titration curve calculated by this tool allows you to formulate a broad strategy for "at which pH the target protein binds to the column and at which pH it elutes." Furthermore, calculating the sedimentation coefficient (s) aids in designing conditions for centrifugation or sedimentation velocity analysis. For example, a protein with a small s value sediments slowly, so pelleting it requires a higher centrifugal force or a longer run time.
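To make the sedimentation estimate concrete, the sketch below evaluates $s = \frac{M(1-\bar{v}\rho)}{N_A f}$ for a hypothetical 66 kDa protein. It assumes the frictional coefficient of a sphere, $f = 6\pi\eta R_h$, with $R_h$ taken from the empirical scaling law. The partial specific volume (0.73 mL/g), solvent density (0.997 g/mL), and viscosity (0.89 mPa·s) are typical textbook values assumed for this example, not parameters confirmed from the tool.

```python
import math

N_A = 6.02214076e23  # Avogadro constant, 1/mol


def sedimentation_coefficient(mw_da, viscosity_pa_s=8.9e-4,
                              v_bar_ml_per_g=0.73, density_g_per_ml=0.997):
    """Return s in seconds for a spherical protein (assumed defaults)."""
    r_h_m = 0.066e-9 * mw_da ** (1 / 3)       # R_h from the scaling law, in m
    f = 6 * math.pi * viscosity_pa_s * r_h_m  # Stokes friction, kg/s (assumed)
    m_kg_per_mol = mw_da * 1e-3               # g/mol -> kg/mol
    buoyancy = 1 - v_bar_ml_per_g * density_g_per_ml  # dimensionless
    return m_kg_per_mol * buoyancy / (N_A * f)


s_seconds = sedimentation_coefficient(66_000)
s_svedberg = s_seconds / 1e-13  # 1 S = 1e-13 s
print(f"s = {s_svedberg:.1f} S")
```

In this spherical model $s$ grows as $MW^{2/3}$, since the mass rises linearly while the friction rises only as $MW^{1/3}$. Because the scaling law describes a compact sphere, proteins with a larger real $R_h$ would be expected to sediment more slowly than this estimate.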
Taking it a step further connects to the design of microfluidics (lab-on-a-chip) and drug delivery systems. The diffusion coefficient (D) output by this tool becomes a crucial input parameter for predicting protein diffusion and adsorption within microchannels. Also, when loading proteins onto nanoparticles, efficiency suffers if the protein and particle surfaces carry charges of the same sign and repel each other; the sign of the protein's charge depends on whether the working pH lies above or below its pI. Information on charge is indispensable for preliminary consideration of such interfacial and colloid chemistry problems.
If you're interested in the results from this tool, next try delving deeper into the "model" behind it. As a first step, investigate why each ionizable amino acid side chain has a characteristic pKa value. This boils down to organic chemistry's acid-base equilibria and the chemical structures of amino acid side chains. A practical way to begin is to study the "properties of amino acids" table from a textbook, along with the pKa values it lists.
Regarding the mathematical background, the tool finds the pI by numerically solving a nonlinear equation. Finding the pH where the net charge sums to zero is the problem of finding the root (zero point) of the function $f(\mathrm{pH}) = \text{net charge}$, which can be solved with numerical algorithms such as the bisection method or the Newton-Raphson method. If you're interested in programming, writing a small program that calculates the pI using these algorithms with a simple array you prepare yourself will rapidly deepen your understanding.
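One possible version of such a program is sketched below using bisection. The pKa values are an illustrative textbook-style set assumed for this example (published tables differ slightly, and the tool's own values are not known here), and the residue counts describe a hypothetical protein. Each group's fractional charge follows the Henderson-Hasselbalch relation.

```python
# Illustrative pKa values (an assumption; published tables differ slightly).
PKA_POSITIVE = {"N_term": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"C_term": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(ph, counts):
    """Net charge at a given pH from counts of ionizable groups."""
    positive = sum(n / (1 + 10 ** (ph - PKA_POSITIVE[g]))
                   for g, n in counts.items() if g in PKA_POSITIVE)
    negative = sum(n / (1 + 10 ** (PKA_NEGATIVE[g] - ph))
                   for g, n in counts.items() if g in PKA_NEGATIVE)
    return positive - negative


def isoelectric_point(counts, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH where net_charge is zero by bisection.

    Net charge falls monotonically as pH rises, so the sign of the
    charge at the midpoint tells us which half contains the root.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(mid, counts) > 0:
            lo = mid  # still positive: the pI lies at a higher pH
        else:
            hi = mid  # negative or zero: the pI lies at a lower pH
    return (lo + hi) / 2


# Hypothetical composition, for illustration only.
counts = {"N_term": 1, "C_term": 1, "K": 10, "R": 5, "H": 2,
          "D": 8, "E": 9, "Y": 4, "C": 2}
pi_estimate = isoelectric_point(counts)
print(f"Estimated pI = {pi_estimate:.2f}")
```

Bisection is used here because it needs only the sign of the net charge and cannot diverge. Newton-Raphson would usually converge in fewer steps but requires the derivative of the charge curve and a reasonable starting guess.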
To advance your learning further, I recommend exploring the world of molecular dynamics (MD) simulation. While this tool treats a protein as a "point" or a "uniform sphere," MD simulation calculates the movement of each individual atom, allowing estimation of a more realistic "Rh including structural fluctuations" and a "diffusion coefficient" that directly considers the influence of surrounding water molecules. Nowadays, there are more environments where you can easily try this on the cloud. I'd be delighted if this tool serves as a starting point for you to appreciate the engineering fascination of modeling a complex system like a protein from multiple angles.