During part a) of the practical the emphasis was on protein sequence
retrieval and analysis. We will now turn towards protein
structure and focus on what can be deduced about a protein's structure
from its sequence. Specifically, we will predict the structure of
a small protein based on its sequence similarity to another protein
of known structure.
We are going to predict the structure of the alpha-dendrotoxin from
the green mamba snake. This is the toxin contained in the venom of the
green mamba that endangers the prey after a bite.
First, we will extract the toxin sequence from the UniProt
database. Open a browser, go to UniProt, select "UniProtKB" in the dropdown to the left of the search field, and search for "alpha-dendrotoxin". Click on the required sequence
(it should be the first one listed in the UniProtKB database: VKTHA_DENAN (P00980)). At
the top of the page, click Download and save the FASTA (Canonical) file locally.
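Before moving on, it can be worth checking that the saved file really contains a single, complete sequence. The following is a minimal sketch of such a check; the function name fasta_length and the file name toxin.fasta are hypothetical and should be adapted to whatever name you chose when saving.

```shell
# Print the number of residues in the first record of a FASTA file.
# fasta_length is a hypothetical helper, not part of any toolkit.
fasta_length() {
    awk '
        /^>/   { n++; next }                 # header line: count the record, skip it
        n == 1 { gsub(/[ \t\r]/, "")         # strip blanks and Windows line ends
                 len += length($0) }         # add up residues of the first record
        END    { print len + 0 }             # prints 0 for an empty file
    ' "$1"
}

# Usage, assuming the sequence was saved as toxin.fasta (hypothetical name):
#   fasta_length toxin.fasta
```

The printed number should match the sequence length shown on the UniProt entry page; a value of 0 suggests that the download did not produce a FASTA file.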
As discussed in the lectures, a protein's sequence (primary structure) can be used as a basis for predicting its secondary structure. Such methods rely on the fact that different amino acids and amino acid combinations have different preferences for different types of secondary structure. Alanine, for example, is often found in alpha helices, whereas prolines are known to destabilise helices. Automated procedures exist that have optimised prediction algorithms against a databank of proteins with known structures. One such prediction program is available as an online server: the JPred4 Secondary Structure Prediction server.
The prediction is presented in the line "jnetpred". A continuous line (-) stands for unstructured
(i.e. neither helix nor sheet), an arrow stands for extended, or sheet, and a red cylinder
stands for helix. The "JNETCONF" bar graph indicates the confidence of the system in its prediction (higher is better).
As you can see, the server predicts the protein to
start from the N-terminus with an unstructured loop, followed by two
beta strands and a short helix.
Now that we have the sequence of our protein of interest, we need a
suitable template structure of a homologous protein on the basis of
which we can build a model of the venom structure. For this, we visit
the Protein Data Bank. The
protein we're going to use as a template is the bovine (cow)
pancreatic trypsin inhibitor. In the
search field, search for "trypsin inhibitor bovine". Among the search
results select "4PTI" (or search for it directly).
Alternatively, download the structure from our site and have a look at it by typing:
wget http://www3.mpibpc.mpg.de/groups/de_groot/compbio2/p14/4PTI.pdb
pymol 4PTI.pdb
We are going to build a model using an internet server, the SWISS-MODEL server. Paste the sequence of the snake venom in the sequence window (or use the UniProt accession code: P00980) and upload 4PTI.pdb via "Add Template File". Now, submit the request by hitting the "Build Model" button. Depending on the load of the server, it may take a couple of minutes for the model to finish. Once the calculation has finished, go to "Models", click "Model 01" and download the structure in PDB format. Save the structure as swissmodel.pdb. To assess the model you will also need 1) an index file (index_swiss.ndx) and 2) the experimental reference structure (1DTX.pdb) to compare against. In case the calculation takes too long, we also provide the model coordinates. Download the files with:
wget http://www3.mpibpc.mpg.de/groups/de_groot/compbio2/p14/swissmodel.pdb
wget http://www3.mpibpc.mpg.de/groups/de_groot/compbio2/p14/index_swiss.ndx
wget http://www3.mpibpc.mpg.de/groups/de_groot/compbio2/p14/1DTX.pdb
gmx confrms -f1 1DTX.pdb -f2 swissmodel.pdb -n1 index_swiss.ndx -o fit_swiss.pdb -one
pymol 1DTX.pdb fit_swiss.pdb
Go to the AlphaFold Structure Database and search for the UniProt accession number of the toxin (P00980). Download the PDB file as "toxin-alphafold.pdb".
We will then compare the SwissModel and AlphaFold models with the experimental structure.
pymol 1DTX.pdb fit_swiss.pdb toxin-alphafold.pdb
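When judging the AlphaFold model, it may help to know that PDB files from the AlphaFold Structure Database store the per-residue confidence score (pLDDT, on a scale of 0 to 100) in the B-factor column. The sketch below averages that column over the C-alpha atoms; mean_plddt is a hypothetical helper name, and the column positions assume the standard fixed-width PDB ATOM record.

```shell
# Average the B-factor column (pLDDT in AlphaFold DB files) over C-alpha atoms.
# mean_plddt is a hypothetical helper, not part of PyMOL or GROMACS.
mean_plddt() {
    awk '
        # ATOM records only; the atom name occupies columns 13-16
        substr($0, 1, 4) == "ATOM" && substr($0, 13, 4) == " CA " {
            sum += substr($0, 61, 6)         # columns 61-66: B-factor / pLDDT
            n++
        }
        END {
            if (n) printf "%.1f\n", sum / n
            else   print "no CA atoms found"
        }
    ' "$1"
}

# Usage with the file downloaded above:
#   mean_plddt toxin-alphafold.pdb
```

Note that the same column in 1DTX.pdb and in the SwissModel output has a different meaning, so the number is only interpretable as pLDDT for the AlphaFold file.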
To briefly sketch a picture of how AlphaFold works, let's look at this diagram from the
AlphaFold Nature Paper:
Question:
Which of the predicted structures (SwissModel or AlphaFold) seems closer to the experimental structure?
Question:
Given the above description of how AlphaFold works, is this a fair comparison to SwissModel?