Friday, August 22, 2008

Research

From Saven Group


Our research interests involve theoretical chemistry, particularly as it applies to biopolymers, macromolecules, condensed phases, and disordered systems. We are developing computational methods for understanding and designing molecular systems having many physical and chemical degrees of freedom. In addition, we use molecular simulation techniques both to study chemical systems in molecular detail and to test and illustrate our theories. Most of our research involves applications of statistical mechanics.



Redesign of the DNA-protection protein Dps to obtain a hydrophobic interior nano-cavity [19]. (a) Wild-type Dps dodecamer. (b) Wild-type Dps, with two subunits deleted to expose the interior cavity. Hydrophobic residues (alanine, valine, isoleucine, leucine, phenylalanine, methionine and tryptophan) are rendered yellow; all other amino acids are blue. (c) Model of the protein with 120 computationally redesigned residues on the interior hydrophobic surface. Hydrophobic residues are rendered yellow; all other amino acids are magenta.


Protein folding spans biology, physics, and chemistry and has applications to biomedicine and biomaterials. Since proteins are the direct products of genes, folding is fundamental to the expression of genetic information in the cell. However, experimentally determining the structures of proteins, which is an important part of understanding their function, remains a time-intensive task. A quantitative and predictive understanding of protein folding will accelerate the interpretation of genomic information. Folding is also of fundamental physical interest, since it involves spontaneous ordering at the molecular scale. With few exceptions, proteins fold reversibly to unique structures. The three-dimensional folded structure of a protein is encoded in its sequence of amino acids. Thus we may be able to predict structure from sequence alone and to design desired folded structures through careful choice of sequence. Important goals include determining structure from gene sequence, re-engineering existing proteins, and crafting new ones. Using synthetic sequences, features important in protein stability and folding kinetics may be probed via selective mutations. Once particular structures can be successfully designed, novel functional proteins can be crafted. Already these ideas are being expanded beyond the naturally occurring biopolymers and applied to nonbiological "foldamers." Folding polymers, both biological and synthetic, can provide new types of structures and properties and lead to novel pharmaceuticals, catalysts, and materials.



Computational de novo design of a four-helix bundle containing the non-biological cofactor iron diphenyl porphyrin (DPP-Fe) [27]. (a) Structure of the DPP-Fe cofactor. (b) Computationally designed structure containing two DPP-Fe cofactors (yellow) and four helices (magenta). (c) Computationally designed complex rendered in space-filling format.


By synthesizing large numbers of peptide sequences, researchers can not only enhance their chance of discovering sequences that fold to a particular structure but also stand to learn more about the properties that foldable sequences share. Recently, combinatorial methods have become available that allow researchers to synthesize and keep track of very large numbers of sequences (> 10^6). Researchers have found peptides having protein-like properties in libraries of partially random sequences. From the viewpoint of designing new proteins and understanding mutational variability, it would be helpful to know beforehand the number of sequences that are likely to fold to a given structure and what those sequences might be, at least in some average sense. Given the exponentially large number of possible sequences of even a small protein having only 100 residues, about 10^130 sequences, obtaining an understanding of the library of all possible sequences seems at first glance to be impossible. However, counting large numbers of configurations is a task well-suited to statistical mechanics.
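The figure quoted above for a 100-residue protein follows from a simple count: with 20 natural amino acids available at each of 100 positions, there are 20^100 possible sequences. A minimal sketch of that arithmetic, assuming the standard 20-letter amino acid alphabet (the function name is illustrative only):

```python
import math

def log10_sequence_count(n_residues: int, alphabet_size: int = 20) -> float:
    """Return log10 of the number of distinct sequences of the given length."""
    # alphabet_size ** n_residues grows too fast to compare by eye,
    # so work with the base-10 exponent instead.
    return n_residues * math.log10(alphabet_size)

exponent = log10_sequence_count(100)
print(f"20^100 is about 10^{exponent:.1f}")
```

Working with the exponent rather than the count itself makes the comparison with an experimental library of roughly 10^6 sequences immediate: the library samples a vanishingly small fraction of the full space.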



Most probable amino acids in protein L as determined from statistical theory. More conserved positions (higher probability) are in red, less conserved in blue.


Using theory and simulation, our group is studying molecular folding in proteins and polymers. We are developing tools that estimate not only the number of sequences at different energies but also the probability that each position in the chain is a particular amino acid. The theory provides a convenient means to evaluate combinatorial design strategies and probe the "designability" of chosen target structures. This statistical formalism has a structure very similar to statistical thermodynamics and draws upon contemporary molecular modeling techniques to estimate the number and composition of sequences that are likely to fold to a given three-dimensional structure. The theory uses an entropy formalism: just as constraints reduce the entropy in thermodynamics, constraints can be introduced in our theory to focus combinatorial libraries. Such constraints can be physical, e.g., the overall energy of sequences, or synthetic, e.g., the patterning of amino acid properties. The theory takes as input a given target structure and a many-body energy (or scoring) function, and it yields the number and composition of sequences likely to be compatible with that structure. Because explicit enumeration is avoided, the properties of an exponentially large number of possible protein sequences can be addressed. We are using such statistical methods to investigate folding in simple models of proteins. We are also using the theory as a guide to understanding the variability of naturally occurring protein sequences that fold to a common structure. Via our collaborations with experimental groups, we are also involved in the design of particular protein architectures or nonbiological folding polymers.
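The entropy idea can be illustrated with a deliberately simplified toy model. The sketch below is not the group's method: it assumes independent sites with made-up one-body energies, whereas the theory described above uses a many-body energy function on a real target structure. All names and numbers are hypothetical. Maximizing the sequence entropy subject to a fixed mean energy gives Boltzmann-like probabilities at each site, and tightening the energy constraint lowers the entropy, that is, the effective number of sequences.

```python
import math
import random

def site_probabilities(energies, beta):
    """Maximum-entropy probabilities at one site for a fixed mean energy."""
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def mean_energy(site_energies, beta):
    """Average total energy over the library implied by beta."""
    total = 0.0
    for energies in site_energies:
        probs = site_probabilities(energies, beta)
        total += sum(p * e for p, e in zip(probs, energies))
    return total

def sequence_entropy(site_energies, beta):
    """Entropy S; exp(S) estimates the effective number of sequences."""
    s = 0.0
    for energies in site_energies:
        for p in site_probabilities(energies, beta):
            if p > 0.0:
                s -= p * math.log(p)
    return s

def solve_beta(site_energies, target_energy, lo=0.0, hi=50.0, iters=100):
    """Bisect for the multiplier beta; mean energy falls as beta grows."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_energy(site_energies, mid) > target_energy:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical energies: 10 sites, 20 amino acids per site (assumption).
rng = random.Random(0)
site_energies = [[rng.uniform(-1.0, 1.0) for _ in range(20)] for _ in range(10)]

unconstrained = sequence_entropy(site_energies, 0.0)  # equals 10 * ln(20)
target = mean_energy(site_energies, 0.0) - 3.0        # require lower energy
beta = solve_beta(site_energies, target)
constrained = sequence_entropy(site_energies, beta)

print(f"entropy without constraint: {unconstrained:.3f}")
print(f"entropy with energy constraint: {constrained:.3f}")
```

In this toy setting the per-site probabilities returned by `site_probabilities` play the role of the amino acid probabilities at each position, and no sequence is ever enumerated explicitly.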


