


Teacher Version

Bioinformatics is emerging as a hugely important field affecting all areas of biology. While bioinformatics is formally the application of computer technologies to biological sciences - ranging from automated analysis of microarrays containing thousands of individual experiments to the development of browser tools for looking at whole genomes - students in all areas of biology need to be familiar with software tools developed by bioinformaticians to accomplish routine tasks in biology.

PART ONE

1. First you will do the guided activity asking the question, "Is Euglena a plant or an animal?" using Cytochrome C as a demonstration exercise.
2. You will then have the tools to answer the question: "Are whales and dolphins a sister group to Artiodactyls (even-toed ungulates)? Or should they be placed within the Artiodactyls as a sister group to Hippopotami?" You will answer that question during part two of the lab (see Reading).

It is impossible to provide a reasonable guide to even a small section of this tremendous resource... You will have to explore it yourself...
As you can see, there is a vast amount of information cataloged even for this monachine phocid...
"Sequential megafaunal collapse in the North Pacific Ocean:

For these next steps use a pencil or pen and put a check next to each step as you complete it. To see what is available for Euglena, enter that search term instead of Mirounga. Refine the search a bit by clicking "Protein" and adding the search modifier for "organism" like this:
That should reduce the number of hits a bit. Adding "cytochrome c" with quotes like this should help a lot:
Finally, if you add the search modifier for "protein" like this:
...the list should be reduced to a very few hits that include the Cytochrome C sequences for Euglena viridis and Euglena gracilis.
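The three refinement steps above can be sketched as a small query builder. This is only an illustration: the bracketed field-tag syntax (`[organism]`, `[protein]`) is an assumption modeled on the modifiers named in the text, the helper name `build_query` is hypothetical, and the exact tags accepted by the search page may differ.

```python
def build_query(organism, phrase=None, phrase_field=None):
    """Compose a search string from an organism filter and an optional quoted phrase.

    The bracketed field tags are an assumption about the search syntax;
    check the search page's own help for the tags it accepts.
    """
    terms = [f"{organism}[organism]"]
    if phrase:
        quoted = f'"{phrase}"'  # quotes keep the phrase together as one term
        terms.append(f"{quoted}[{phrase_field}]" if phrase_field else quoted)
    return " AND ".join(terms)


# Each step narrows the previous one, mirroring the guided activity.
broad = build_query("Euglena")
narrower = build_query("Euglena", "cytochrome c")
narrowest = build_query("Euglena", "cytochrome c", "protein")

for query in (broad, narrower, narrowest):
    print(query)
```

Writing the query down in your lab book alongside the accession numbers makes the search easy to repeat later.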
At this point it is worth starting to take some notes... If you examine the sequence record you will see that there is an accession number. Writing down "Euglena viridis Cytochrome C P22342" in your lab book can make your life much easier later. Using the accession number P22342 whenever you search or communicate your results will ensure that the exact sequence you used is known. Write down "Euglena gracilis Cytochrome C P00076" as well.
What do you see? You should see that the results have been narrowed to about 45 items (2006) on 3 pages. One of the first ones in the list should be Q7YR71, the Silvered Leaf Monkey version of Cytochrome C. Once again, you see a lot of "metadata": additional information associated with the actual sequence data. We can save this file again as HTML, just as we did for the two Euglena sequences. This makes it easier to cite the authors of that data when (and if) we are writing the research paper. But this time we will also save the sequences in FASTA format. Sequences are available in a variety of formats, which are selected via the "Display" button. The sequences can also be sent to "text" for printing or saved in a file. Copying and pasting into Notepad also works. There is also information associated with structure, taxonomy, other genes and publications, etc. A FASTA format file looks like this:
Try clicking on the DISPLAY button and selecting FASTA for the monkey sequence Q7YR71.
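If you would like to check a saved FASTA file by program rather than by eye, here is a minimal sketch of a reader. It assumes the usual FASTA layout: a header line beginning with ">" followed by one or more lines of sequence. The two records are made-up placeholders, not real Cytochrome C data, and the function name `parse_fasta` is hypothetical.

```python
def parse_fasta(text):
    """Return a {header: sequence} dictionary from FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines between records
        if line.startswith(">"):
            header = line[1:]  # drop the ">" marker
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may wrap over several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Placeholder records for illustration only (NOT real sequence data).
example = """>demo_seq_1 hypothetical record
MKTAYIAK
QRQISFVK
>demo_seq_2 hypothetical record
MKTAYLAKQR
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

Counting the records and their lengths is a quick way to confirm that a master file contains every sequence you meant to include.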
To save time, I have downloaded an additional five sequences in FASTA format and saved them into a single master file for us to use in this exercise. Follow the numbered steps below the sequences.

The Cytochrome C sequences we will use:
Preparing sequences for comparison by aligning them using ClustalX
Preparing phylogenetic trees based on the sequence comparison
The program also compares the aligned sequences and measures how different they are from each other. The more differences, the less related they should be, and the more distant they should appear on a phylogenetic tree. The program first finds the two most related sequences then adds the next most related "neighbor" sequence. It calculates a difference score and outputs a little file of brackets and numbers that show the relationships and degree of relationship in the form of "branch lengths."
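To make "measures how different they are" concrete, here is a minimal sketch that scores each pair of aligned sequences by the fraction of positions that differ and then picks the most similar pair. It is a simplification under stated assumptions: the alignment program's own distance calculation may apply corrections not shown here, the sequences are made-up placeholders, and the function name `p_distance` is hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions that differ, skipping positions with a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # a gap in either sequence is not counted
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


# Made-up aligned sequences for illustration only ("-" marks a gap).
aligned = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYLAKQR",
    "seqC": "MRTSYLA-QR",
}

distances = {
    (m, n): p_distance(aligned[m], aligned[n])
    for m, n in combinations(aligned, 2)
}
closest = min(distances, key=distances.get)  # the pair joined first

for pair, d in distances.items():
    print(pair, round(d, 3))
print("most similar pair:", closest)
```

The pair with the smallest score plays the role of the "two most related sequences" that the program starts from.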
The distance from Euglena to Monkey is very slightly less than the distance to the plants... and it's the same as the distance to the Mosquito... so, one could argue that based on a comparison of Cytochrome C sequences, Euglena is an animal. Are you convinced?!!
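Comparing distances on a tree means adding up the branch lengths along the path between two tips. The sketch below reads a file of "brackets and numbers" (assumed here to be the Newick format, with no spaces) and sums those lengths. The tree and its tip names A to D are hypothetical, and the helper names are not part of any program used in this lab.

```python
def parse_newick(text):
    """Parse a Newick string into nested (name, branch_length, children) tuples."""
    text = text.strip().rstrip(";") + ";"  # guarantee a terminating ";"
    pos = 0

    def read_until(stops):
        nonlocal pos
        start = pos
        while text[pos] not in stops:
            pos += 1
        return text[start:pos]

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume "("
            children.append(node())
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
        name = read_until(",():;")
        length = 0.0
        if text[pos] == ":":
            pos += 1  # consume ":"
            length = float(read_until(",();"))
        return (name, length, children)

    return node()


def leaf_paths(tree, here=(), edges=()):
    """Map each tip name to the (edge_id, length) pairs between root and tip."""
    name, length, children = tree
    edges = edges + ((here, length),)
    if not children:
        return {name: edges}
    paths = {}
    for i, child in enumerate(children):
        paths.update(leaf_paths(child, here + (i,), edges))
    return paths


def tree_distance(paths, a, b):
    """Sum the branch lengths on the path joining tips a and b."""
    edges_a, edges_b = dict(paths[a]), dict(paths[b])
    lengths = {**edges_a, **edges_b}
    # Edges shared by both root-to-tip paths are not on the path between the tips.
    return sum(lengths[e] for e in set(edges_a) ^ set(edges_b))


# Hypothetical tree for illustration only.
tree = parse_newick("((A:0.1,B:0.2):0.05,(C:0.3,D:0.4):0.1);")
paths = leaf_paths(tree)
print("A to B:", round(tree_distance(paths, "A", "B"), 3))
print("A to C:", round(tree_distance(paths, "A", "C"), 3))
```

Running the same calculation on your own tree file lets you compare the Euglena-to-animal and Euglena-to-plant distances as numbers rather than judging them by eye.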

PART TWO

Now for the more interesting question, which you will answer on your own!
For a project like this, Cytochrome C would probably be useless. It changes so little during evolution that it is essentially the same for all mammals. On the other hand, Pancreatic Ribonuclease is an enzyme that exhibits just the right amount of variability in mammals.
Example Pancreatic Ribonuclease sequences for this project:

You may choose a different project for your lab... some of you may choose to work with fish, insects or plants... Or, perhaps the most challenging and interesting of all, comparing whales, seals, bears, weasels... However, be aware that it will take you a lot of extra time since you will have to find your own sequences to compare and confirm that they are what you think they are... not always easy for beginners.
How about another interesting dataset to try? Remember, you can pick and choose from among this set... no need to run them all (see the paper from the O'Brien Lab on the Origins of Placentals [2001]). Important! Notice that these sequences are cDNA! You can try to run alignments using the DNA and check your results, or you can click on the accession number, then on the protein ID link (it's down in the CDS section), and finally convert them to FASTA.

>Rhinoceros (white) ATP7A [Ceratotherium simum]
>Horse ATP7A [Equus caballus]
>Hippopotamus ATP7A [Hippopotamus amphibius]
>Elephant (African) ATP7A [Loxodonta africana]
>Whale (Humpback) ATP7A [Megaptera novaeangliae]
>Okapia (Giraffe family) ATP7A [Okapia johnstoni]
>Pig (note "X" at 2nd to last residue) ATP7A [Sus scrofa]
>Manatee (Caribbean) ATP7A [Trichechus manatus]
>Dolphin (Bottle-nosed) ATP7A [Tursiops truncatus]
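Because the sequences in this set are cDNA, it may help to see how a coding sequence relates to its protein. The sketch below translates DNA using the standard genetic code. It assumes the input begins in frame at the first codon, which is often not true of a full cDNA record, so the protein ID link in the CDS section remains the more reliable route. The demo sequence is made up, and the function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna):
    """Translate in frame from the first base, stopping at the first stop codon.

    Codons containing unrecognized letters are reported as "X".
    """
    seq = cdna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break  # stop codon ends the protein
        protein.append(aa)
    return "".join(protein)


# Made-up coding sequence for illustration only.
demo = "ATGGCCAAGTGGTAA"
print(translate(demo))
```

Aligning proteins rather than DNA avoids counting silent codon changes as differences, which is one reason to compare the results of both approaches.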
Preparing a nicer-looking figure of your tree using Adobe Acrobat

When you are done with alignment and tree-building, you may be interested in producing a nicer image of your tree than a screenshot will provide. If your computer has the software "Adobe Acrobat," or other software that can read PostScript files, follow the steps below:

PART THREE

Some proteins have had their structures determined by X-ray crystallography or Nuclear Magnetic Resonance. This is an arduous but rewarding endeavor and especially important for understanding enzyme mechanisms or for drug discovery.
© Henrik Kibak 2004