
Interpreting Sequence Files Using Chromas

Download Chromas.

Download the example sequence file by right-clicking on the link and saving the file to your desktop.

Open Chromas.

Select FILE --> OPEN --> example_trace.ab1

Recall how sequencing machines operate and how the program has predicted which of the four nucleotides is at each point in the sequence based on the fluorescent trace peaks. Notice how messy the trace looks at the start. Now scroll to the right and notice how clean the trace becomes at about position 110, and then how the last few nucleotides, at about position 488, become difficult to assign.

Select EDIT --> COPY SEQUENCE --> FASTA FORMAT

Open Notepad (the simple ASCII text editor that comes with Windows) or TextEdit (the simple text editor that comes on a Mac). We always use simple text editors rather than Word because there must be no hidden characters inserted into our sequence files.

Select EDIT --> PASTE

Be careful with the next two steps. Usually at least one end of every sequence trace is "chemically messy." We cannot be certain of those nucleotides, and Chromas also has doubts, so it designates them "N" instead of A, G, C or T. Often, when we look at the second trace, generated by the sequencing primer in the other direction, we will be able to assign a nucleotide to the "N." The GOAL of this step is to prepare the largest possible chunk of clean sequence containing only A's, G's, T's and C's.

Use Ctrl-F (Cmd-F on a Mac) to find the group of N's at the beginning of the sequence, and delete everything up to and including the final "N" in that group (see the example below).

Then use Ctrl-F to find the next "N" and delete that N and everything after it, as was done in the example below:

>Example raw sequence from trace file
  CGGGNNGGGCCGGCGCCGGGNAACCGNCCACTNNCACTCCTCGCCGGGGGGGTNCGCCTC
  CNNTCCANGACGCANCCNTGGGTACGCTCTATNTTGAANAGCATCNACAATCGTATGCGG
  GACCGTGTTTTTCTTGGTACAACTGCGGGAATATTACTGAAACTCCTACACTATTGCAGA
  TAGGGTTTATGGTAGGGTTTTCTATTTACTAACTGGATTCCATGGGATACATGTTGTCGT
  AGGGACTATTTGGCTAATGGTAAGGTTAGTTCGACTATGACGCGGGGAGTTTTCTAGTCA
  ACGACACTTTGGGTTTGAGGCTTGTATTTGGTACTGACATTTTGTAGATGTGGTATGAGT
  GGCATTGTGGTGCTTAGTGTACGTGTGGTTTGGAGGATGATTATATATGTGATGGTTTAA
  GATATGGGACGGGGATGTTTATACATTTAAGTATCCGGATGCCAAACCTTCCTGGTACGC
  TTATTGNAAANAGCATANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
  NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
  NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
  NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
  NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

BECOMES

>Example raw sequence from trace file
ACAATCGTATGCGG
GACCGTGTTTTTCTTGGTACAACTGCGGGAATATTACTGAAACTCCTACACTATTGCAGA
TAGGGTTTATGGTAGGGTTTTCTATTTACTAACTGGATTCCATGGGATACATGTTGTCGT
AGGGACTATTTGGCTAATGGTAAGGTTAGTTCGACTATGACGCGGGGAGTTTTCTAGTCA
ACGACACTTTGGGTTTGAGGCTTGTATTTGGTACTGACATTTTGTAGATGTGGTATGAGT
GGCATTGTGGTGCTTAGTGTACGTGTGGTTTGGAGGATGATTATATATGTGATGGTTTAA
GATATGGGACGGGGATGTTTATACATTTAAGTATCCGGATGCCAAACCTTCCTGGTACGC
TTATTG
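
The manual trimming shown above can also be sketched in a few lines of Python. This is only an illustration, not part of Chromas: the function name `trim_to_clean_run` and its `width` parameter are made up for this example, and it assumes that the chunk you want is simply the longest stretch with no "N" in it. On the example above that should be the same stretch kept by hand, but on other traces the two approaches may differ, so always check the result against the trace.

```python
import re


def trim_to_clean_run(fasta_text, width=60):
    """Keep only the longest run of A/C/G/T from a single FASTA record.

    Hypothetical helper for illustration: it assumes one record per input
    and treats the longest N-free run as the "clean" sequence.
    """
    lines = fasta_text.strip().splitlines()
    if lines and lines[0].startswith(">"):
        header, body = lines[0], lines[1:]
    else:
        # No header line found, so supply a placeholder one.
        header, body = ">trimmed sequence", lines

    # Join the sequence lines and drop spaces and line breaks.
    seq = "".join("".join(body).split()).upper()

    # Every maximal stretch of unambiguous bases is a candidate.
    runs = re.findall(r"[ACGT]+", seq)
    clean = max(runs, key=len) if runs else ""

    # Re-wrap the sequence at a fixed line width, as in the example above.
    wrapped = [clean[i:i + width] for i in range(0, len(clean), width)]
    return "\n".join([header] + wrapped)


raw = ">demo\n  CGNNACGTAC\n  GTTTNAANNN\n"
print(trim_to_clean_run(raw))
```

Choosing the longest run keeps the sketch short. If a single stray "N" sits in the middle of otherwise clean sequence, a by-hand edit guided by the second trace is still the better choice.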

Then select FILE --> SAVE AS --> example_fasta.txt
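
Because hidden characters are the reason for using a plain text editor, it can be reassuring to check the saved text for anything unexpected. The sketch below is an illustration only: the function name `find_unexpected_characters` and the `allowed` parameter are invented for this example, and it assumes that sequence lines should contain nothing but A, C, G, T, N, spaces and tabs. Header lines (those starting with ">") are skipped.

```python
def find_unexpected_characters(fasta_text, allowed="ACGTN"):
    """Return (line, column, character) for each suspicious character.

    Hypothetical helper: only sequence lines are checked, and only
    ordinary spaces and tabs are tolerated as whitespace.
    """
    problems = []
    for line_no, line in enumerate(fasta_text.splitlines(), start=1):
        if line.startswith(">"):
            continue  # header line, not sequence
        for col, ch in enumerate(line, start=1):
            if ch in " \t":
                continue  # plain indentation is harmless
            if ch.upper() not in allowed:
                problems.append((line_no, col, ch))
    return problems


# A zero-width space (U+200B) is invisible on screen but is reported here.
sample = ">demo\nACGT\nAC\u200bGT\n"
print(find_unexpected_characters(sample))
```

To check the file saved in the step above, read it with `open("example_fasta.txt", encoding="utf-8").read()` and pass the text to the function. An empty list means nothing unexpected was found.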

 


Marine Biotechnology and Bioinformatics is a teacher professional development program of the Innovative Technology Experiences for Students and Teachers (ITEST) program. This material is based upon work supported by the National Science Foundation (NSF) under Grant No. 0323175 (2004-2006) and Grant No. 0525224 (2006-2009). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.