Principles of Gene Expression Evolution
Prof. Naama Barkai, Weizmann Institute of Science, Israel
It is widely believed that evolution of gene expression plays a key role in the generation of phenotypic diversity. Still, not much is known about the principles that drive inter-species differences in gene expression. I will describe recent results from our lab addressing this issue.

Structural Bioinformatics, Cancer and Drug Discovery
Prof. Sir Tom Blundell, FRS, University of Cambridge, UK

Structural bioinformatics - using the knowledge of the three-dimensional structures of protein targets now emerging from genomic data - has the potential to greatly accelerate drug discovery, but technical challenges and time constraints have traditionally limited its use primarily to target validation and lead optimization. Applications are now being extended into new approaches for lead discovery (for reviews see Blundell et al., 2002; Congreve et al., 2005). New informatics approaches are changing the way we look at protein structures and emphasising the importance of protein ensembles. Virtual screening coupled with high-throughput X-ray crystallography is focused on identifying one or more weakly binding small-molecule fragments from compound libraries consisting of hundreds of small-molecule fragments. The high-resolution definition of this binding interaction provides an information-rich starting point for medicinal chemistry. I will give examples of using this approach in protein kinase targets important in oncology.

But many important targets for cancer are multiprotein assemblies. Structural bioinformatics can contribute to predicting protein interactions, but a major challenge for drug discovery arises from the very large surfaces that are characteristic of many of the protein complexes, for example those involved in receptor recognition and signal transduction (see for example, Pellegrini et al., 2000). This is especially true of complexes that are assembled from preformed globular domains. Not only is it difficult to bind a small molecule to the large, relatively flat surfaces of such proteins involved in protein interactions, but even when a small molecule does bind, it is difficult to disrupt the interaction entirely. It remains to be seen whether the emerging lead discovery approaches discussed in this lecture will prove suitable for these systems. However, recent analyses of multiprotein systems involved in cell regulation and signalling have identified a large number in which one component involves a flexible or unstructured region of the polypeptide chain. An example involves the complex of the human recombinase, Rad51, and the product of the breast cancer associated gene, BRCA2 (Pellegrini et al., 2002), which is not only scientifically revealing but offers an encouraging and perhaps more druggable site of interaction that could be used to target agents that would be helpful during chemo- or radio-therapy. We suggest that proteins forming interactions with a ligand that comprises a continuous region of flexible peptide may be more druggable targets than those where complexes are formed from preformed globular protein structures.

1. Blundell TL, Sibanda BL, Wander Montalvao R, Brewerton S, Chelliah V, Worth CL, Harmer NJ, Davies O, Burke D (2006) Structural biology and bioinformatics in drug design: opportunities and challenges for target identification and lead discovery. Phil. Trans. R. Soc. B 361, 413-423

2. Blundell TL, Jhoti H, Abell C (2002) High-throughput crystallography for lead discovery in drug design. Nature Reviews Drug Discovery 1, 45-54

3. Furnham N, Blundell TL, DePristo MA, Terwilliger T (2006) Is one solution sufficient? Nature Structural & Molecular Biology 13, 184-185

4. Pellegrini L, Burke DF, von Delft F, Mulloy B, Blundell TL (2000) Crystal structure of fibroblast growth factor receptor ectodomain bound to ligand and heparin. Nature 407, 1029-1034

5. Pellegrini L, Yu DS, Lo T, Anand S, Lee M, Blundell TL, Venkitaraman AR (2002) Insights into DNA recombination from the structure of a RAD51-BRCA2 complex. Nature 420, 287-293

6. Congreve M, Murray CW, Blundell TL (2005) Structural biology and drug discovery. Drug Discovery Today 10, 895-907

Combinatorial Algorithms for Regulatory Network Analysis and Pathway Reconstruction
Prof. Richard Karp, University of California, Berkeley, USA

This talk will survey new and old algorithms for identifying coherently interacting sets of proteins conserved in two or more species and revealing the causal relationships and detailed logic of regulatory pathways.

Specific topics will include:
(1) Finding dense subgraphs of large graphs;
(2) Finding structures indicative of signal transduction pathways or molecular complexes occurring in the protein interaction graph of a single species or conserved across the protein interaction graphs of two or more species;
(3) Determining causal relationships among proteins using single-cell flow cytometry measurements of protein activity.

Joint work with Joseph Dale, Eran Halperin, Trey Ideker, Mani Narayanan, Ron Shamir and Roded Sharan.
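
As a rough illustration of topic (1) above, the sketch below uses greedy "peeling", a well-known heuristic for the densest-subgraph problem: repeatedly remove a minimum-degree vertex and keep the intermediate vertex set with the highest density |E|/|V|. It is not necessarily the algorithm presented in the talk; the function name and the toy protein names are hypothetical.

```python
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Greedy peeling: repeatedly delete a minimum-degree vertex and
    remember the intermediate vertex set with the highest density |E|/|V|.

    Illustrative only; this simple version scans all vertices at each
    step, so it runs in quadratic time."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    nodes = set(adj)
    n_edges = sum(len(nb) for nb in adj.values()) // 2
    best_density, best_nodes = 0.0, set()

    while nodes:
        density = n_edges / len(nodes)
        if density > best_density:
            best_density, best_nodes = density, set(nodes)
        # Pick a minimum-degree vertex (ties broken by name for determinism).
        u = min(nodes, key=lambda x: (len(adj[x]), str(x)))
        for w in adj[u]:
            adj[w].discard(u)
        n_edges -= len(adj[u])
        nodes.remove(u)
        del adj[u]

    return best_nodes, best_density


# Toy interaction graph (hypothetical proteins): a 4-clique plus a tail.
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"),
]
dense_nodes, dense_density = greedy_densest_subgraph(toy_edges)
```

On this toy graph the peeling removes the tail first, so the retained set is the 4-clique. Other density measures, or constraints such as conservation across species, would need a different objective.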

Prediction of Protein Structure, Function and Druggability on a Proteomic Scale
Prof. Jeffrey Skolnick, Georgia Institute of Technology, USA
A novel method for the prediction of protein structure, function and druggability based on the "sequence to structure to function" paradigm has been developed. We first show recent results that suggest that for compact single-domain proteins, the PDB is most likely complete and that the completeness can be explained by the packing of compact, hydrogen-bonded, secondary structural elements. We next present results from the application of our structure prediction algorithm, TASSER, to all GPCRs in the human genome. Based on confidence criteria, 90% should have approximately correct structures, and clustering shows that structurally similar GPCRs have similar function even when their sequences are diverse. We then describe our multimeric structure prediction algorithm, MULTITASSER, and its application to the prediction of protein-protein interactions. Next, a newly developed method for the accurate inference of protein biochemical function is presented, and results of the comprehensive analysis of all sequenced genomes and the automated assignment of proteins to metabolic pathways are shown. Finally, we combine these approaches into a pathway-based method for the prediction of druggable protein targets and apply the resulting methodology to the human genome.

Evolution Teaches Protein Function Prediction
Prof. Burkhard Rost, Columbia University, New York, USA

The function of most proteins in the human genome remains obscure and is the target of extensive and expansive investigations. Protein function is a rather intuitive concept, yet it is not a well-defined term, nor is there a generic descriptor for all aspects of function. This has hampered progress in the development of general methods that predict function directly from sequence. Our group has been developing many unique template-based and de novo prediction methods. Our special edge is the combination of evolutionary information and machine learning. One particular challenge that we have been addressing over the last years with our CS colleague Yechiam Yemini is the design of a semantic modeling server to integrate access to and manipulation of diverse, distributed biological resources. This modeling server is based on an extended object-relationship network model for correlating and manipulating biological resources.

Our long-term goal is to contribute to the development of a comprehensive system that models three aspects of function, particularly in multi-cellular organisms: where a molecule is located most of the time, what it interacts with, and when the interaction occurs. In my talk, I will focus on three particular topics pertaining to the first of these three objectives. Firstly, I will describe one of our methods for the de novo prediction of subcellular localization. Secondly, I will sketch our first steps toward de novo predictions of protein-protein and protein-DNA interactions. Lastly, I will briefly sketch the idea behind our most recent, preliminary improved method for database searches employed for the selection of targets in the context of large-scale structural genomics. One goal of structural genomics is to determine one experimental protein structure for each representative protein family. Our experimental colleagues have determined structures for over 200 of the proteins that we selected. In this context, I shall present some of our recent findings about the organization of sequence/structure space.

Markov Models in Protein Evolution: The Resolvent Method and Family-Specific Rates
Prof. Martin Vingron, Max Planck Institute for Molecular Genetics, Germany

Markov models for amino acid exchanges are the basis for evolutionary studies on proteins. This talk will briefly summarize the corresponding theory and introduce an estimation procedure for amino acid exchange matrices. We will further use a simple parameterization to explore family-specific rates in protein evolution. As expected, one can see that, e.g., enzymes tend to have low family-specific rates while extracellular proteins show high rates. Likewise, one can observe a certain correlation between this rate and whether a protein is essential. A possible link between the family-specific rate and the age of a protein family will also be discussed.
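
For orientation, the standard relations behind a continuous-time Markov model of amino acid exchange can be written as below. This is a sketch in assumed notation, not necessarily the formulation used in the talk: Q is the 20 x 20 rate matrix, P(t) the matrix of exchange probabilities after evolutionary time t, R_alpha the resolvent for a parameter alpha > 0, and r_f a hypothetical scalar rate for protein family f.

```latex
\begin{align}
  P(t) &= e^{tQ}, \qquad t \ge 0, \\
  R_\alpha &= \int_0^\infty e^{-\alpha t}\, P(t)\, dt = (\alpha I - Q)^{-1}, \\
  Q &= \alpha I - R_\alpha^{-1}, \\
  P_f(t) &= e^{r_f\, t\, Q}.
\end{align}
```

The third line is what makes a resolvent-based estimate possible in principle: an estimate of R_alpha yields an estimate of Q. The last line expresses a family-specific rate as a simple rescaling of time, so a low r_f corresponds to a slowly evolving family.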

T. Muller and M. Vingron, Modeling amino acid replacement. J. Comput. Biol. 7, 761-776, 2000.
H. Luz and M. Vingron, Family specific rates of evolution. Bioinformatics, Advance online publication, March 2006.