Sequence Logos
A sequence logo is a graphical representation of aligned
sequences in which, at each position, the height of each residue is
proportional to its frequency at that position, and the total height of all
the residues at the position is proportional to the conservation
(information content) of the position
(TD Schneider & RM Stephens, "Sequence Logos: A New Way to Display
Consensus Sequences", NAR 18:6097-6100 (1990)).
Tom Schneider's Sequence Logos site
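The relationship between residue frequency, information content, and letter height can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name logo_heights is hypothetical, the column is assumed to contain no gap characters, the alphabet is taken to be the 20 amino acids, and the small-sample correction described by Schneider & Stephens is left out for brevity.

```python
import math
from collections import Counter

def logo_heights(column, alphabet_size=20):
    """Return {residue: letter height in bits} for one alignment column.

    Assumptions (not from the Blocks server): no gaps in the column,
    unweighted sequences, and no small-sample correction.
    """
    counts = Counter(column)
    n = sum(counts.values())
    freqs = {residue: count / n for residue, count in counts.items()}
    # Shannon entropy of the observed residue distribution, in bits.
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    # Information content: maximum possible entropy minus observed entropy.
    info = math.log2(alphabet_size) - entropy
    # Each letter gets a share of the stack height equal to its frequency.
    return {residue: f * info for residue, f in freqs.items()}

# A fully conserved column yields a single tall letter; a mixed column
# yields a shorter stack split between its residues.
conserved = logo_heights("AAAA")
mixed = logo_heights("ACAC")
```

The heights in one column always sum to that column's information content, which is why conserved positions stand out as tall stacks.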
Logos from Related Sequences, Blocks, and Multiple Alignments
Blocks can be displayed as logos to examine sequence
conservation.
Start with a set of related sequences or a multiple alignment in Blocks, Clustal or FASTA-alignment format. For a set of related
sequences, either get a global ClustalW multiple alignment, from which
blocks are then made, using the
EBI Clustal or the
BCM Search Launcher multiple sequence alignment site, or use
Block Maker to make blocks directly. Copy and paste an alignment in Clustal
format (include the word 'CLUSTAL' from the heading) or FASTA-alignment
format into the
Multiple Alignment Processor window and choose 'Submit the sequences'. The processor carves out blocks from fully ungapped regions that are
at least 10 residues wide and provides an automatic link for making logos from
the full set of blocks.
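The carving step described above can be illustrated with a short Python sketch. It is not the Multiple Alignment Processor's own code: the function name ungapped_blocks, the choice of '-' and '.' as gap characters, and the representation of the alignment as a list of equal-length strings are all assumptions made for the example.

```python
def ungapped_blocks(alignment, min_width=10, gap_chars="-."):
    """Return (start, end) column ranges that contain no gaps in any sequence.

    Ranges are 0-based with an exclusive end. Only runs at least
    `min_width` columns wide are kept.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    blocks = []
    start = None  # first column of the current ungapped run, if any
    # Iterate one past the last column so that a trailing run is closed.
    for col in range(width + 1):
        gapped = col == width or any(seq[col] in gap_chars for seq in alignment)
        if not gapped:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_width:
                blocks.append((start, col))
            start = None
    return blocks

# Hypothetical two-sequence alignment: a 12-column ungapped run,
# one gapped column, then a 4-column run that is too short to keep.
example = ["ACDEFGHIKLMN-PQRS",
           "ACDEFGHIKLMNWPQRS"]
print(ungapped_blocks(example))  # [(0, 12)]
```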
Get Blocks (for Blocks/Prints Database entries) and Block Maker (which uses Motif or Gibbs sampling)
provide blocks that are ready to go. In each case, there are links to
view logos or other displays.
The logo for a block is computed from the position-specific
scoring matrix (PSSM) that is used to score the block against a
query sequence. The PSSM is based on sequence-weighted counts
of each amino acid in each column of the block, normalized by
dividing by the expected frequency of each amino acid in a protein
sequence database
(S. Henikoff, J. G. Henikoff, W. J. Alford & S. Pietrokovski, "Automated
construction and graphical presentation of protein blocks from unaligned
sequences", Gene-COMBIS, Gene 163 (1995) GC 17-26).
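As a rough illustration of the normalization just described, the following Python sketch computes one PSSM-like column from sequence-weighted counts divided by expected frequencies. It is a simplified sketch, not the Blocks server's implementation: the function name weighted_odds_column is hypothetical, the sequence weights and background frequencies are supplied by the caller (the values in the example are made up), the weighted counts are first converted to fractions (an added assumption), and any further processing used in practice is omitted.

```python
def weighted_odds_column(column, weights, background):
    """Weighted residue frequencies in one column over background frequencies.

    column:     residues in one block column, one per sequence
    weights:    one weight per sequence (how they are derived is out of scope)
    background: expected frequency of each residue (caller-supplied)
    """
    if len(column) != len(weights):
        raise ValueError("need exactly one weight per sequence")
    total = sum(weights)
    counts = {}
    # Sequence-weighted counts: each sequence contributes its weight.
    for residue, weight in zip(column, weights):
        counts[residue] = counts.get(residue, 0.0) + weight
    # Normalize by dividing the weighted frequency by the expected frequency.
    return {r: (c / total) / background[r] for r, c in counts.items()}

# Made-up weights and background values, for illustration only.
odds = weighted_odds_column("AAG", [1.0, 1.0, 2.0], {"A": 0.5, "G": 0.25})
print(odds)  # {'A': 1.0, 'G': 2.0}
```

A value above 1 means the residue occurs in the column more often than expected from the background; a value below 1 means it is underrepresented.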
Tom Schneider's makelogo program is used to create the sequence logo from the
PSSM.
The logo is in PostScript or PDF format.
The amino acids in the logo are colored
according to their chemical and physical characteristics.
Page last modified January 15, 1998