Instructions for Web-based Gene/Protein Analysis.
Part 1. Sequence comparison of different albumins.
Point your favorite browser to http://www.ncbi.nlm.nih.gov/protein. In the input box at the top, next to the word “protein,” type “bovine serum albumin” and search. The first entry will be the sequence of the protein. Click on the FASTA representation. Copy the sequence, including the header beginning with the > symbol.
Point a new browser window to http://www.ncbi.nlm.nih.gov/tools/cobalt/cobalt.cgi?link_loc=BlastHomeLink
In the “Enter at least 2 protein sequences” area, paste your FASTA sequence from above into the box.
Next, go back to the protein page, retrieve the sequences for goat serum albumin and ovalbumin, and paste their FASTA sequences into the box after the BSA sequence (with space between entries, so that each entry starts on its own line).
Click ALIGN and print the results.
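For reference, the text you paste into the box is simply several FASTA records back to back, each beginning with its own > header line. A minimal Python sketch of assembling such an input follows. The headers and sequences are invented placeholders (not the real albumin records), and the function names are hypothetical.

```python
def to_fasta(header, sequence, width=60):
    """Format one record: a '>' header line, then the sequence wrapped at `width`."""
    seq = "".join(sequence.split()).upper()  # remove spaces and line breaks
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + header + "\n" + "\n".join(lines)


def build_multi_fasta(records, width=60):
    """Join (header, sequence) pairs so that every record starts on a new line."""
    return "\n".join(to_fasta(h, s, width) for h, s in records) + "\n"


# Invented placeholder records; replace them with the FASTA text copied from NCBI.
records = [
    ("toy_BSA", "MKTAYIAKQRQISFVKSHFSRQ"),
    ("toy_goat_albumin", "MKTAYIAKQRQLSFVKSHFARQ"),
    ("toy_ovalbumin", "MGSLLNAQWEPTRDVHKCYFIA"),
]
multi_fasta = build_multi_fasta(records)
print(multi_fasta)
```

If the pasted text has one > line per protein, each starting on its own line, the alignment tool should treat the input as three separate sequences.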
Do the results correspond with the Western results? Can you tell if the anti-BSA antibody recognizes the N-terminal portion or the C-terminal portion of the protein? Explain.
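One way to put numbers on this comparison is to compute percent identity separately for the first and second halves of a pairwise alignment. The sketch below assumes two aligned strings of equal length with "-" for gaps, as you could copy from the alignment output. The toy sequences and function names are hypothetical. Keep in mind that sequence identity only suggests where an antibody might bind; it does not prove it.

```python
def region_identity(aligned_a, aligned_b, start, end):
    """Percent identity over alignment columns [start, end), skipping gapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = compared = 0
    for x, y in zip(aligned_a[start:end], aligned_b[start:end]):
        if x == "-" or y == "-":
            continue  # a gap in either sequence is not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def half_identities(aligned_a, aligned_b):
    """Return (N-terminal half identity, C-terminal half identity)."""
    mid = len(aligned_a) // 2
    return (
        region_identity(aligned_a, aligned_b, 0, mid),
        region_identity(aligned_a, aligned_b, mid, len(aligned_a)),
    )


# Toy aligned sequences (invented for illustration only).
seq_a = "MKWVTFISLLAAGGCCDDEE"
seq_b = "MKWVTFISLLQQ-HCRDNEK"
n_term, c_term = half_identities(seq_a, seq_b)
print(f"N-terminal half: {n_term:.1f}%  C-terminal half: {c_term:.1f}%")
```

Splitting at the midpoint is an arbitrary choice; any column range can be passed to `region_identity` to look at a narrower region.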
Add these data and observations to your Western experiment results and put them together as one lab report.
Part 2.
Your assignment: follow the instructions below for your assigned gene, and submit everything I've highlighted in bold.
Your laboratory has isolated the cDNA for a gene. You want to know things like "What protein is made from this gene?" and "What does the protein do?" Following the protocol below, you will be able to determine the hypothetical structure of the protein, related proteins, and possibly the function of the protein from knowing the primary nucleotide sequence. This is Bioinformatics.
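To make the first computational step of the protocol concrete, here is a minimal Python sketch of six-frame translation, the operation the translation tool in step 3 performs for you. It assumes the standard genetic code and uses a simple "longest stretch from M to a stop" rule as a stand-in for your own judgment about which frame makes sense. The function names, frame labels, and toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate codon by codon; '*' marks a stop, 'X' an unrecognized codon."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Translate three frames on the given strand and three on its reverse complement."""
    dna = "".join(dna.split()).upper()
    frames = {}
    for label, strand in (("5'3'", dna), ("3'5'", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{label} Frame {offset + 1}"] = translate(strand[offset:])
    return frames


def longest_orf(protein):
    """Longest stretch that starts at M and runs to the next stop (or the end)."""
    best = ""
    for segment in protein.split("*"):
        start = segment.find("M")
        if start != -1 and len(segment) - start > len(best):
            best = segment[start:]
    return best


def best_frame(dna):
    """Pick the frame whose longest M-to-stop stretch is the longest overall."""
    candidates = ((name, longest_orf(p)) for name, p in six_frames(dna).items())
    return max(candidates, key=lambda item: len(item[1]))


toy_cdna = "ATGGCCAAGTGGTAA"  # invented example, not an assigned sequence
print(best_frame(toy_cdna))
```

The longest-ORF rule is only a heuristic. A real cDNA may contain untranslated regions or be incomplete at either end, so still inspect all six frames yourself.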
1. You will be emailed a DNA sequence that is your hypothetical cDNA clone.
2. Select the entire sequence, and "copy" it (ctrl-C).
3. Paste (ctrl-V) your sequence into the box located at the following website: http://web.expasy.org/translate/. Below the box is a drop down menu. Use the “compact” feature and click “Translate Sequence.” The algorithm searches for open reading frames in all six possible reading frames (three on each strand). You must figure out which one makes the most sense. (This will be explained in more detail in class.) Indicate the most likely protein sequence you chose. Also, copy that sequence electronically; you will use it later in this assignment.
4. Now, point your browser at the
Click on the “DNA and RNA” link. On the next page, scroll down to the "BLAST" button under the “tools” section. Scroll down and click "Nucleotide BLAST." When the next page loads, paste (ctrl-V) your DNA sequence into the form box and then click "BLAST!" A new page will pop up telling you how long it should take for your request to be filled. Click on "Format!"
When your results arrive, look them over to see which known sequences resemble yours. This will give you an idea of whether your gene is related to any known genes and what its function might be. Print out the first few summary pages (NOT the sequence alignment pages!).
5. Go back to your translated protein sequence, copy it, and go to the "Expasy" homepage: http://www.expasy.org/
Click on “proteomics” under the “Categories” section. Click “similarity search…” Once there, you’ll do a protein BLAST in France. Click on the “BLAST-PBIL” button on the right side and proceed as you did with your DNA sequence (but use your protein sequence for this part). You will get results from this search in a few minutes. Again, print out the first few summary pages of similar protein sequences. Do these homologies agree with the DNA homologies? Discuss.
6. Go "back" a few pages to www.expasy.org, return to the proteomics area, and look for “sequence sites, features, and motifs.” Within the Tools area click on “Scan Prosite.” This will take you to the EMBL website. Paste your protein sequence into the field, then "Search" using Option 1. This will return a page that informs you of any potential post-translational modification sites, based on homology to other protein motifs. (This may take a few minutes for the search to complete.) Print this page.
7. Go to http://web.expasy.org/protparam/, which is the "ProtParam" page. Paste in your sequence as above, and click "Compute Parameters." Your results will come back to you quickly. It should tell you the number of residues, molecular weight, isoelectric point, amino acid composition, extinction coefficient, and predicted stability (some proteins have "tell-tale" instability sequences) of your protein. Print this out.
8. Go "back" a few pages to www.expasy.org, return to the proteomics area, and look for “protein structure.” Click on “TMPred” on the right side. Paste in your sequence as before and "Run TMPred." The results will come back with a hydropathy plot, plus an evaluation of that plot. Print this page out. Is your protein expected to be transmembrane, cytoplasmic, or secreted?
9. Go back to "Protein Structure" again and click on "Phyre2". Type your e-mail address to tell the program where to send your results, give the sequence a name, paste your protein sequence into the box and click "Quick Phyre Search." You will get an e-mail telling you where to retrieve your results. (While you're waiting, the computer will be doing a primary, secondary and tertiary structure alignment with all known proteins!)
Your results will be too cumbersome to print out, but you can look through the names of the matches to see how this search compares to the others. Discuss. Also, you can click on the proteins to "see" how they might fold. The "normal" size lines indicate well-matched structure alignments, thick lines indicate that your sequence is missing some sequence found in the structure that it's being compared to, and thin lines indicate that your sequence has extra sequence not found in the structure that it's being compared to.
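As a cross-check on the ProtParam printout from step 7, a few of its values can be approximated by hand. The sketch below computes residue count, amino acid composition, and an approximate average molecular weight. It assumes commonly tabulated average residue masses (values may differ slightly from the tool's) and covers only the 20 standard amino acids. The function name and toy peptide are hypothetical.

```python
from collections import Counter

# Approximate average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015  # one water is added back for the free N- and C-termini


def protein_summary(sequence):
    """Return length, approximate molecular weight, and percent composition."""
    seq = "".join(sequence.split()).upper().rstrip("*")  # drop whitespace and a trailing stop
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    counts = Counter(seq)
    mass = sum(RESIDUE_MASS[aa] * n for aa, n in counts.items()) + WATER
    composition = {aa: 100.0 * n / len(seq) for aa, n in sorted(counts.items())}
    return {
        "length": len(seq),
        "molecular_weight": mass,
        "composition_percent": composition,
    }


summary = protein_summary("MAKW")  # toy peptide, not an assigned sequence
print(summary["length"], round(summary["molecular_weight"], 2))
```

Isoelectric point, extinction coefficient, and the instability index need additional parameter tables and are left to the web tool.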
You should now be able to tell me what your DNA encodes and what the protein product probably does. You may have to look up data for homologous proteins in textbooks or in PubMed to answer the "what does it do" parts.