Instructions for Web-based Gene/Protein Analysis

Part 1.

Sequence comparison of different albumins.

Point your favorite browser to http://www.ncbi.nlm.nih.gov/protein. In the input box at the top next to the word “protein” type in “bovine serum albumin” and search. The first entry will be the sequence of the protein. Click on the FASTA representation. Copy the sequence including the header beginning with the > symbol.

Point a new browser window to http://www.ncbi.nlm.nih.gov/tools/cobalt/cobalt.cgi?link_loc=BlastHomeLink

In the “Enter at least 2 protein sequences” area, paste your FASTA sequence from above into the box.

Next, go back to the protein page, retrieve the sequences for goat serum albumin and ovalbumin, and paste their FASTA sequences into the box after the BSA sequence (with a space between entries). Click ALIGN and print the results.
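Before you click ALIGN, it can help to confirm that the box really holds three separate FASTA entries, each with its own header line. The following is a minimal Python sketch of such a check; the function name `parse_fasta` and the two example entries are made up for illustration and are not real albumin sequences.

```python
def parse_fasta(text):
    """Split multi-FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between entries are ignored
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up placeholder entries (hypothetical, NOT real albumin sequences).
example = """>entry_one hypothetical
MKWVTF
ISLLLL

>entry_two hypothetical
MKWVTFISLL
"""

records = parse_fasta(example)
for header, seq in records:
    print(header, len(seq))
```

If the number of records printed does not match the number of sequences you pasted, a header line beginning with > is probably missing.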

Do the results correspond with the Western results? Can you tell if the anti-BSA antibody recognizes the N-terminal portion or the C-terminal portion of the protein? Explain.

Add these data and observations into your Western experiment results and put them together as one lab report.

Part 2.

Your assignment: follow the instructions below for your assigned gene, and submit everything I've highlighted in bold.

Your laboratory has isolated the cDNA for a gene. You want to know things like "What protein is made from this gene?" and "What does the protein do?" Following the protocol below, you will be able to determine the hypothetical structure of the protein, related proteins, and possibly the function of the protein from knowing the primary nucleotide sequence. This is Bioinformatics.

1. You will be emailed a DNA sequence that is your hypothetical cDNA clone.

2. Select the entire sequence, and "copy" it (ctrl-C).

3. Paste (ctrl-V) your sequence into the box located at the following website:

http://web.expasy.org/translate/. Below the box is a drop down menu. Use the “compact” feature and click “Translate Sequence.” The tool translates your sequence in all six possible reading frames (three on each strand). You must figure out which one makes the most sense. (This will be explained in more detail in class.) Indicate the most likely protein sequence you chose. Also, copy that sequence electronically so you can use it later in this assignment.
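To see what the six-frame translation is doing, here is a rough Python sketch using the standard genetic code. It is not the Expasy tool's own code; the function names and the "longest stretch beginning with M" rule for picking a frame are assumptions made for illustration, and you should still judge the frames yourself.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; '*' is a stop, 'X' an unrecognized codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return the three forward and three reverse-strand translations."""
    dna = "".join(dna.split()).upper()
    frames = {}
    for label, strand in (("5'3'", dna), ("3'5'", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{label} Frame {offset + 1}"] = translate(strand[offset:])
    return frames


def longest_orf(protein):
    """Longest stretch that starts at M and runs to a stop or the sequence end."""
    best = ""
    for segment in protein.split("*"):
        start = segment.find("M")
        if start != -1 and len(segment) - start > len(best):
            best = segment[start:]
    return best


frames = six_frames("ATGGCCTAA")  # toy sequence, not an assigned clone
for name, protein in frames.items():
    print(name, protein, longest_orf(protein))
```

A frame with one long uninterrupted stretch from M to a stop is usually the best candidate, but short or partial clones can be ambiguous.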

4. Now, point your browser at the National Center for Biotechnology Information (NCBI) homepage: http://www.ncbi.nlm.nih.gov/

Click on the “DNA and RNA” link. On the next page scroll down to the "BLAST" button under the “tools” section. Scroll down and click "Nucleotide BLAST.” When the next page loads, paste (ctrl-v) your DNA sequence into the form box and then click "BLAST!" A new page will pop up telling you how long it should take for your request to be filled. Click on "Format!"

When your results arrive, look them over to see what known sequences look like yours. This will give you an idea of whether your gene is related to any known genes and what its function might be. Print out the first few summary pages (NOT the sequence alignment pages!).

5. Go back to your translated protein sequence and copy it and go to the "Expasy" homepage: http://www.expasy.org/

Click on “proteomics” under the “Categories” section. Click “similarity search…" Once there, you’ll do a protein BLAST in France. Click on the “BLAST-PBIL" button on the right side and proceed similarly to what you did with your DNA sequence (but obviously use your protein sequence for this part). You will get results from this search in a few minutes. Again, print out the first few summary pages of similar protein sequences. Do these homologies agree with the DNA homologies? Discuss.

6. Go "back" a few pages to www.expasy.org and go back to the proteomics area again and look for “sequence sites, features, and motifs.” Within the Tools area click on “Scan Prosite." This will take you to the EMBL website. Paste your protein sequence into the field, then "Search" using Option 1. This will return a page that informs you of any potential post-translational modification sites, based on homology to other protein motifs. (This may take a few minutes for the search to complete.) Print this page.
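A motif scan of this kind is essentially pattern matching against the protein sequence. As an illustration only, the sketch below searches for a single PROSITE pattern, the N-glycosylation site N-{P}-[ST]-{P} (entry PS00001), written as a Python regular expression. The function name `find_motif` and the test sequence are made up; the real Scan Prosite tool checks many more patterns and profiles than this.

```python
import re

# PROSITE PS00001 (N-glycosylation site): N-{P}-[ST]-{P}
# {P} means "any residue except proline". The lookahead lets
# overlapping sites be reported as well.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(protein, pattern=N_GLYCOSYLATION):
    """Return (1-based position, matched residues) for every site found."""
    return [
        (match.start() + 1, match.group(1))
        for match in pattern.finditer(protein.upper())
    ]


# Made-up test sequence (hypothetical, not a real protein).
print(find_motif("MNGSAANPTQNKTL"))
```

Keep in mind that a match only marks a potential site. Whether the residue is actually modified in the cell is a separate question.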

7. Go to http://web.expasy.org/protparam/ which is the "ProtParam" page. Paste in your sequence as above, and click "Compute Parameters." Your results will come back to you quickly. It should tell you the number of residues, molecular weight, isoelectric point, amino acid composition, extinction coefficient, and predicted stability (some proteins have "tell-tale" instability sequences) of your protein. Print this out.
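Two of these values are simple enough to check by hand. The sketch below counts the amino acid composition and estimates the extinction coefficient at 280 nm from the numbers of Trp, Tyr, and cystine (disulfide) residues, using 5500, 1490, and 125 M^-1 cm^-1 respectively. The function names are made up, and the number of cystines is something you must supply yourself, since it cannot be read from the sequence alone.

```python
from collections import Counter


def composition(protein):
    """Map each residue to (count, percent of total)."""
    protein = protein.upper()
    counts = Counter(protein)
    return {
        aa: (n, 100.0 * n / len(protein)) for aa, n in sorted(counts.items())
    }


def extinction_coefficient(protein, cystines=0):
    """Estimated molar extinction coefficient at 280 nm, in M^-1 cm^-1.

    `cystines` is the number of disulfide bonds you assume are present;
    the default of 0 treats every Cys as reduced.
    """
    counts = Counter(protein.upper())
    return 5500 * counts["W"] + 1490 * counts["Y"] + 125 * cystines


toy = "MWYCC"  # made-up sequence for illustration
print(len(toy), "residues")
print(composition(toy))
print(extinction_coefficient(toy))              # all Cys reduced
print(extinction_coefficient(toy, cystines=1))  # one disulfide assumed
```

Compare your hand calculation with the two extinction coefficients ProtParam reports, one assuming all Cys residues form cystines and one assuming all are reduced.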

8. Go "back" a few pages to www.expasy.org and go back to the proteomics area again and look for “protein structure.” Click on “TMPred" on the right side. Paste in your sequence as before and "Run TMPred." The results will come back with a hydropathy plot, plus evaluation of that plot. Print this page out. Is your protein expected to be transmembrane or cytoplasmic or secreted?
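To get a feel for what a hydropathy plot shows, here is a minimal sliding-window sketch using the Kyte-Doolittle hydropathy scale. Note that this is a stand-in: TMPred uses its own scoring method, so the numbers will not match its output. The function names are made up, and the 19-residue window and 1.6 cutoff are the commonly quoted Kyte-Doolittle rule of thumb for membrane-spanning segments, used here as an assumption.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Average hydropathy of each full window along the sequence.

    Raises KeyError if the sequence contains a non-standard residue.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(protein, window=19, threshold=1.6):
    """1-based start positions of windows scoring above the cutoff."""
    profile = hydropathy_profile(protein, window)
    return [i + 1 for i, score in enumerate(profile) if score > threshold]


# Made-up sequence: a hydrophobic stretch flanked by charged residues.
toy = "DDDDD" + "L" * 19 + "KKKKK"
print(candidate_tm_windows(toy))
```

A run of consecutive high-scoring windows suggests a possible membrane-spanning segment, which you can then compare with the TMPred evaluation.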

9. Go back to "Protein Structure" again and click on "Phyre2". Type your e-mail address to tell the program where to send your results, give the sequence a name, paste your protein sequence into the box and click "Quick Phyre Search." You will get an e-mail telling you where to retrieve your results. While you're waiting, the computer will be doing a primary, secondary and tertiary structure alignment with all known proteins!

Your results will be too cumbersome to print out, but you can look through the names of the matches to see how this search compares to the others. Discuss. Also, you can click on the proteins to "see" how they might fold. The "normal" size lines indicate well-matched structure alignments, thick lines indicate that your sequence is missing some sequence found in the structure that it's being compared to, and thin lines indicate that your sequence has extra sequence not found in the structure that it's being compared to.

You should now be able to tell me what your DNA encodes and what the protein product probably does. You may have to look up data for homologous proteins in textbooks or in PubMed to answer the "what does it do" parts.