JEvTrace File Formats

Sequence coloring format (SCF) and seqsel refer to types of files written by JEvTrace:

"JEvTrace: refinement and variations of the evolutionary trace in JAVA," M. P. Joachimiak and F. E. Cohen, Genome Biology, 3(12) (2002).
There are older and newer versions of the format; both contain information for coloring regions in sequence alignments and in any associated structures.

The older format can be read by both MSF Viewer and Multalign Viewer. Each line gives the position in the alignment of the residue to be colored, written as the position minus one (the alignment position is not necessarily the same as the residue's sequence number); the number of the sequence in which it should be colored (0 means all sequences in the alignment); and the red, green, and blue components of the color, each on a scale from 0 to 255. Example:

337    0      0     0   255
340    1      0   255   255
338    9    255   255     0
Columns are separated by spaces and do not need to be aligned. The first line indicates that the residues at the 338th position in the alignment in all of the sequences should be colored blue. The second line indicates that the residue at position 341 of the alignment in the first (top) sequence should be colored cyan. The third line indicates that the residue at position 339 of the alignment in the 9th sequence should be colored yellow.
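
As an illustration of the layout described above, the following Python sketch parses one line of the older format. It is not part of JEvTrace, MSF Viewer, or Multalign Viewer; the names OldScfEntry and parse_old_scf_line are hypothetical, and the range check on the color components is an assumption based on the stated 0 to 255 scale.

```python
from typing import NamedTuple, Tuple


class OldScfEntry(NamedTuple):
    """One line of the older format (hypothetical container, not a JEvTrace type)."""
    position: int                 # 1-based alignment position (file value plus one)
    sequence: int                 # sequence number; 0 means all sequences
    color: Tuple[int, int, int]   # (red, green, blue), each 0-255


def parse_old_scf_line(line: str) -> OldScfEntry:
    """Parse one whitespace-separated line: position-1, sequence, R, G, B."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    pos_minus_one, sequence, red, green, blue = (int(f) for f in fields)
    for component in (red, green, blue):
        # Assumption: values outside 0-255 are treated as errors.
        if not 0 <= component <= 255:
            raise ValueError(f"color component out of range: {component}")
    # The file stores the alignment position minus one, so add one back.
    return OldScfEntry(pos_minus_one + 1, sequence, (red, green, blue))
```

Applied to the first example line, `parse_old_scf_line("337    0      0     0   255")` yields position 338, sequence 0 (all sequences), and the color (0, 0, 255), which is blue.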

The format accommodates any 24-bit digital color and allows any subset of residues in any subset of sequences to be highlighted. The residue/coloring information is ordered hierarchically: from smallest to largest sequence ID, then within each sequence from smallest to largest sequence position, and then within each position from smallest to largest color values. This sort order makes the data easier to read. Importantly, only the positions to be colored are encoded, which minimizes disk space usage and maximizes the speed at which the information can be read.
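
The hierarchical ordering can be sketched in Python as a sort on a compound key. This is only an illustration: the function name sort_old_scf_entries is hypothetical, entries are assumed to be plain (position, sequence, red, green, blue) tuples in file order, and comparing colors as (red, green, blue) tuples is an assumption, since the text does not say how "color values" are compared.

```python
def sort_old_scf_entries(entries):
    """Order entries by sequence ID, then position, then color.

    Each entry is a (position, sequence, red, green, blue) tuple.
    Assumption: colors are compared component-wise as (red, green, blue).
    """
    return sorted(entries, key=lambda e: (e[1], e[0], e[2:]))


unsorted_entries = [
    (338, 9, 255, 255, 0),
    (337, 0, 0, 0, 255),
    (340, 1, 0, 255, 255),
]
sorted_entries = sort_old_scf_entries(unsorted_entries)
```

With these three entries, the result follows the same order as the example above: sequence 0 first, then sequence 1, then sequence 9.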

The newer format can be read by Multalign Viewer. It is similar to the older format, but allows easier specification of residue and sequence ranges. Example:

8    8    -1     -1    0   255   255      //
8    8    9     9    255   175   175      //
8    8    10     10    255   175   175      //
The first two fields in each line indicate a range of positions in the alignment: starting position minus one and ending position minus one. The next two fields indicate a range of sequences in the alignment, starting sequence and ending sequence; 0 means all sequences, while -1 means the line is for internal use by JEvTrace (and will be ignored by Multalign Viewer). The next three fields specify the red, green, and blue components of the color, each on a scale from 0 to 255. In the example, the first line has -1 in the sequence fields and is therefore ignored by Multalign Viewer; the second and third lines specify a pinkish color for the ninth position in the alignment within the ninth and tenth sequences. The last required field is either "//" or "#" and may be followed by a comment. If a comment is present, it is used as the region name. Lines in the SCF file with the same comment define a single region.
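
The following Python sketch shows one way the newer format might be read; it is not the Multalign Viewer implementation. The names NewScfEntry, parse_new_scf_line, and group_regions are hypothetical. Two behaviors are assumptions not stated in the format description: a line is skipped if either sequence field is -1, and lines without a comment are grouped together under the empty string.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class NewScfEntry(NamedTuple):
    """One line of the newer format (hypothetical container)."""
    first_position: int           # 1-based start of the alignment range
    last_position: int            # 1-based end of the alignment range
    first_sequence: int           # starting sequence; 0 means all sequences
    last_sequence: int            # ending sequence
    color: Tuple[int, int, int]   # (red, green, blue), each 0-255
    comment: str                  # region name, or "" if no comment


def parse_new_scf_line(line: str) -> Optional[NewScfEntry]:
    """Parse one line; return None for lines reserved for JEvTrace (-1)."""
    fields = line.split(None, 7)  # seven numbers, then marker plus comment
    if len(fields) < 8:
        raise ValueError(f"expected 7 numbers and a '//' or '#' field: {line!r}")
    start0, end0, seq_first, seq_last, red, green, blue = (int(f) for f in fields[:7])
    tail = fields[7].strip()
    for marker in ("//", "#"):
        if tail.startswith(marker):
            comment = tail[len(marker):].strip()
            break
    else:
        raise ValueError(f"last field must start with '//' or '#': {line!r}")
    # Assumption: -1 in either sequence field marks an internal line.
    if -1 in (seq_first, seq_last):
        return None
    # Positions are stored minus one, so add one back.
    return NewScfEntry(start0 + 1, end0 + 1, seq_first, seq_last,
                       (red, green, blue), comment)


def group_regions(lines: Iterable[str]) -> Dict[str, List[NewScfEntry]]:
    """Collect entries that share a comment into one region."""
    regions: Dict[str, List[NewScfEntry]] = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        entry = parse_new_scf_line(line)
        if entry is not None:
            regions[entry.comment].append(entry)
    return dict(regions)
```

For the three example lines, group_regions returns a single unnamed region holding two entries, both for position 9 with color (255, 175, 175): one for sequence 9 and one for sequence 10. The first line is dropped because its sequence fields are -1.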

For further examples, see Figure 5 of the JEvTrace paper and the author's format descriptions.


UCSF Computer Graphics Laboratory / June 2003