


PART ONE
Getting Euglena, animal, and plant sequences to compare...
We are going to use the National Center for Biotechnology Information (NCBI) to obtain the data we need to try to answer this question. Enter this location in your browser: http://www.ncbi.nlm.nih.gov/ or RIGHT CLICK on the link to open it in a new browser window or tab. To see what is available for Euglena let's enter that search term into the NCBI search box:
Also refine the search a bit by clicking "Protein" on the drop-down menu and adding the search modifier for "organism" like this: Euglena [orgn]
That should reduce the number of hits a bit. Adding "cytochrome c" with quotes like this should help a lot:
Finally, if you add the search modifier for "protein" like this:
...the list should be reduced to a very few hits that include the Cytochrome C sequences for Euglena viridis and Euglena gracilis (see below).
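If you would rather script the search than type it into the search box, the same kind of query can be written as a URL for NCBI's E-utilities "esearch" service. The sketch below only builds the URL and does not contact NCBI. The function name `build_protein_search_url` and the `retmax` default are assumptions made for illustration, and the query combines only the two modifiers spelled out above (the organism tag and the quoted phrase).

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_protein_search_url(organism, phrase, retmax=20):
    """Return an esearch URL for a protein search (hypothetical helper).

    organism -- organism name, tagged with [orgn] as in the text above
    phrase   -- exact phrase to search for; it is wrapped in double quotes
    retmax   -- maximum number of hits to ask for (assumed default)
    """
    term = f'{organism}[orgn] AND "{phrase}"'
    params = {"db": "protein", "term": term, "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


url = build_protein_search_url("Euglena", "cytochrome c")
print(url)
```

Pasting the printed URL into a browser should return a short XML list of record identifiers rather than the usual results page.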
At this point it is worth beginning to take some notes... If you examine the sequence record you will see that there is an accession number. Writing down "Euglena viridis Cytochrome C P22342" in your lab book can make your life much easier later. Using the accession number P22342 whenever you search or communicate your results will ensure that the exact sequence you used is known. Write down "Euglena gracilis Cytochrome C P00076" as well.
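Accession numbers are also what a script would use to ask for exactly the same records again. Here is a minimal sketch, assuming the E-utilities "efetch" service and its `rettype=fasta` option. The names `NOTEBOOK` and `build_fasta_fetch_url` are made up for this example, and nothing is downloaded. The code only prepares the URL.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities record-retrieval service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# The two lab-book entries from the text, keyed by accession number.
NOTEBOOK = {
    "P22342": "Euglena viridis Cytochrome C",
    "P00076": "Euglena gracilis Cytochrome C",
}


def build_fasta_fetch_url(accessions):
    """Return an efetch URL asking for the given protein records as FASTA."""
    params = {
        "db": "protein",
        "id": ",".join(accessions),  # several accessions, comma-separated
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


fetch_url = build_fasta_fetch_url(NOTEBOOK)
for accession, label in NOTEBOOK.items():
    print(accession, label)
print(fetch_url)
```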
As you can see, a FASTA format file looks like this:
It is important to respect this format because in our next step we will put the sequences we want to compare all into the same Notepad text file. The format helps the computer programs distinguish one sequence from another. To save time, we will copy and paste the five sequences below into the Euglena viridis file and save them into a single master file for us to use in this exercise. Follow the numbered steps below the sequences.
The Cytochrome C sequences we will use:
>Arabidopsis gi|4539007 Cytochrome c [Arabidopsis thaliana]
MASFDEAPPGNPKAGEKIFRTKCAQCHTVEKGAGHKQGPNLNGLFGRQSGTTPGYSYSAA
NKSMAVNWEEKTLYDYLLNPKKYIPGTKMVFPGLKKPQDRADLIAYLKEGTA
>Monkey (Silvered Leaf) Q7YR71 Cytochrome c [Trachypithecus cristatus]
MGDVEKGKKILIMKCSQCHTVEKGGKHKTGPNHHGLFGRKTGQAPGYSYTAANKNKGITWGEDTLMEYLE
NPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE
>E_gracilis P00076 Cytochrome c [Euglena gracilis]
GDAERGKKLFESRAAQCHSAQKGVNSTGPSLWGVYGRTSGSVPGYAYSNANKNAAIVWEE
ETLHKFLENPKKYVPGTKMAFAGIKAKKDRQDIIAYMKTLKD
>Mosquito gi|31202411|ref|XP_310154.1| [Anopheles gambiae]
MGVPAGDVEKGKKLFVQRCAQCHTVEAGGKHKVGPNLHGLFGRKTGQAAGFSYTDANKAK
GITWNEDTLFEYLENPKKYIPGTKMVFAGLKKPQERGDLIAYLKSATK
>Rice gi|218249 Cytochrome C [Oryza sativa (japonica cultivar-group)]
MASFSEAPPGNPKAGEKIFKTKCAQCHTVDKGAGHKQGPNLNGLFGRQSGTTPGYSYSTA
NKNMAVIWEENTLYDYLLNPKKYIPGTKMVFPGLKKPQERADLISYLKEATS
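To see why the format matters to a program, here is a rough sketch of how software might read a master file like this one. It is not the code Clustal uses, and the function name `parse_fasta` is an assumption for this example. Each line starting with ">" opens a new record, and every following line is glued onto that record's sequence until the next ">" appears.

```python
def parse_fasta(text):
    """Split FASTA text into a dict mapping each header to its sequence."""
    records = {}
    header = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines carry no information
        if line.startswith(">"):
            header = line[1:]
            if header in records:
                raise ValueError(f"duplicate header: {header}")
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            records[header].append(line)
    # Join the wrapped lines of each record into one continuous sequence.
    return {name: "".join(parts) for name, parts in records.items()}


# One record copied from the list above, wrapped over two lines as in the file.
example = """>E_gracilis P00076 Cytochrome c [Euglena gracilis]
GDAERGKKLFESRAAQCHSAQKGVNSTGPSLWGVYGRTSGSVPGYAYSNANKNAAIVWEE
ETLHKFLENPKKYVPGTKMAFAGIKAKKDRQDIIAYMKTLKD
"""

sequences = parse_fasta(example)
for name, residues in sequences.items():
    print(name, len(residues))
```

If a ">" line were missing, the parser would silently merge two proteins into one long sequence, which is exactly the kind of mistake the format is meant to prevent.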
Preparing the comparison by aligning the Cytochrome C sequences from two plants, two animals, and two Euglenas
Although there can be some machine errors... Clustal does a fairly good job of aligning the sequences. Notice that the program has rearranged the order of the sequences and grouped the two plants together, the two animals together, and the two Euglena together. Not surprisingly the software has been able to identify the organisms that are most closely related on the basis of their Cytochrome C amino acid sequences. Notice that there is a region "NPKKYIPGTKM" that is nearly identical in all six organisms! That can be interpreted as a region that was already present in the common ancestor to plants and animals and which cannot be changed without affecting the survival of the organism. Often these sites are critical to the function of the protein and good drug targets.
Preparing phylogenetic trees based on the sequence comparison
The distance from Euglena to Monkey is very slightly less than the distance to the plants... and it's the same as the distance to the Mosquito... so, one could argue that based on a comparison of Cytochrome C sequences, Euglena is an animal. Are you convinced?!!
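Both observations, the conserved "NPKKYIPGTKM" region and the idea of a "distance" between two sequences, can be explored by hand with a few lines of Python. The sketch below uses 22-residue excerpts copied from the five sequences listed earlier (Euglena viridis is left out because its sequence is not printed on this page). It assumes that no alignment gaps fall inside these short windows. The distance shown is a simple fraction of differing positions, which is a simplification and not necessarily what the tree-building software calculates, so do not expect it to reproduce the tree.

```python
MOTIF = "NPKKYIPGTKM"

# Short excerpts around the conserved region, copied from the sequences above.
EXCERPTS = {
    "Arabidopsis": "DYLLNPKKYIPGTKMVFPGLKK",
    "Monkey": "EYLENPKKYIPGTKMIFVGIKK",
    "E_gracilis": "KFLENPKKYVPGTKMAFAGIKA",
    "Mosquito": "EYLENPKKYIPGTKMVFAGLKK",
    "Rice": "DYLLNPKKYIPGTKMVFPGLKK",
}


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def best_window(sequence, motif):
    """Return (mismatch count, start index, window) of the closest match."""
    k = len(motif)
    if len(sequence) < k:
        raise ValueError("sequence is shorter than the motif")
    candidates = (
        (mismatches(sequence[i:i + k], motif), i, sequence[i:i + k])
        for i in range(len(sequence) - k + 1)
    )
    return min(candidates)


def p_distance(a, b):
    """Fraction of aligned positions that differ (simplified distance)."""
    return mismatches(a, b) / len(a)


for name, excerpt in EXCERPTS.items():
    count, start, window = best_window(excerpt, MOTIF)
    print(f"{name:12s} {window} mismatches={count}")

for name in ("Monkey", "Mosquito", "Arabidopsis", "Rice"):
    d = p_distance(EXCERPTS["E_gracilis"], EXCERPTS[name])
    print(f"E_gracilis vs {name:12s} {d:.2f}")
```

A window this short is far too little data to decide where Euglena belongs. The point is only to show what "counting differences" means before you trust the numbers on a tree.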
PART TWO
Now for the more interesting question which you will answer on your own!
For a project like this, Cytochrome C would probably be useless. It changes so little during evolution that it is essentially the same for all mammals. On the other hand, Pancreatic Ribonuclease is an enzyme that exhibits just the right amount of variability in mammals. It is found only in organisms with a pancreas... which rules out plants and mosquitoes, and pretty much leaves chordates.
Let's review your "Plan of Attack!"
If we are short on time you may click on these sequences to speed your completion of this project.
If there is time, you are encouraged to choose a different project for your lab... some of you may choose to work with fish, insects or plants... Or, perhaps the most challenging and interesting of all, comparing whales, seals, bears, weasels... However, be aware that it will take you a lot of extra time since you will have to find your own sequences to compare and confirm that they are what you think they are... not always easy for beginners.
How about another interesting dataset to try? Remember, you can pick and choose from among this set... no need to run them all (see the paper from the O'Brien Lab on the Origins of Placentals [2001]). Important! Notice that these sequences are cDNA! You can try to run alignments using the DNA and check your results, or you can click on the accession number, and then on the protein ID link (it's down in the CDS section)... finally convert them to FASTA.
>Rhinoceros (white) ATP7A [Ceratotherium simum]
>Horse ATP7A [Equus caballus]
>Hippopotamus ATP7A [Hippopotamus amphibius]
>Elephant (African) ATP7A [Loxodonta africana]
>Whale (Humpback) ATP7A [Megaptera novaeangliae]
>Okapia (Giraffe family) ATP7A [Okapia johnstoni]
>Pig (note "X" at 2nd to last residue) ATP7A [Sus scrofa]
>Manatee (Caribbean) ATP7A [Trichechus manatus]
>Dolphin (Bottle-nosed) ATP7A [Tursiops truncatus]
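To get a feel for how a cDNA sequence relates to the protein behind the protein ID link, here is a minimal translation sketch using the standard genetic code. It assumes the sequence is already in the correct reading frame, which is not guaranteed for a database record, so the protein link in the CDS section remains the safer route. The function name `translate` and the short test sequence are hypothetical and are not taken from the ATP7A records.

```python
BASES = "TCAG"
# Standard genetic code, one letter per codon, codons ordered TTT, TTC, TTA, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna, frame=0, to_stop=True):
    """Translate a coding DNA sequence into one-letter amino acid codes.

    frame   -- 0, 1 or 2: how many leading bases to skip (assumed known)
    to_stop -- stop at the first stop codon instead of writing '*'
    Codons containing anything other than A, C, G, T become 'X'.
    """
    seq = "".join(cdna.split()).upper().replace("U", "T")
    protein = []
    for pos in range(frame, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if amino_acid == "*" and to_stop:
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence, used only to show the call.
print(translate("ATGGCCAAGTAA"))
```

An unreadable base in the DNA shows up as an "X" in the protein, which is one way an "X" can end up in a translated sequence.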
Preparing a nicer-looking figure of your tree using Adobe Acrobat
When you are done with alignment and tree-building, you may be interested in producing a nicer image of your tree than a screenshot will provide. If your computer has Adobe Acrobat, or other software that can read PostScript files, follow the steps below:
© Henrik Kibak 2004