Figure 1: Focus of ClustalW on the ExPASy website


Figure 2: Where to insert sequences on ClustalW.


Figure 3: Example of a sequence that can be loaded into ClustalW


ExPASy, or Expert Protein Analysis System, is a resource developed by the Swiss Institute of Bioinformatics (SIB).  The site offers several options for sequence alignment, including ClustalW, MUSCLE, and T-Coffee.  This article focuses on ClustalW, a program designed for multiple sequence alignment of nucleic acid and protein sequences (Figure 1).  It is useful for examining relationships between sequences, such as how a particular gene varies among different species or how two different genes compare.  Up to 30 FASTA-formatted sequences can be submitted at a time.  If only two sequences are being analyzed, a pairwise alignment can be performed.  FASTA format is a text-based way of representing nucleotide or peptide sequences.  The format originated with the FASTA software package but is now commonly used throughout bioinformatics.  DNA sequences can be obtained from GenBank, a genetic sequence database maintained by the National Center for Biotechnology Information (NCBI) at the NIH.  GenBank includes all DNA sequences that are publicly available.
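As a minimal sketch of what "FASTA-formatted" means in practice, the Python below splits pasted text into records and checks the 30-sequence limit mentioned above before submission. The function names, the `MAX_SEQUENCES` constant, and the two example records are hypothetical and chosen only for illustration; the sketch assumes the usual convention that each record's description line starts with `>`.

```python
MAX_SEQUENCES = 30  # submission limit stated in the text above


def parse_fasta(text):
    """Split FASTA text into a list of (header, sequence) pairs."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_submission(text):
    """Parse the text and refuse more sequences than the limit allows."""
    records = parse_fasta(text)
    if len(records) > MAX_SEQUENCES:
        raise ValueError(
            f"{len(records)} sequences given; the limit is {MAX_SEQUENCES}"
        )
    return records


# Hypothetical records, not real GenBank entries.
example = """>geneA_human hypothetical example record
ATGGCCTTA
GGCTAA
>geneA_mouse hypothetical example record
ATGGCTTTAGGCTAA
"""

for header, sequence in check_submission(example):
    print(header, len(sequence))
```

Running a check like this before pasting sequences into the form can catch a missing description line or an oversized batch early.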

For an example of a published article that uses ExPASy, see Buel et al. (2010), cited in the references section.

How to use the database

On the ClustalW page, FASTA-formatted sequences for multiple genes and species can be entered for alignment, as shown in Figure 2 [2].  In each FASTA record, the first line (circled in blue in Figure 3) gives the gene information, and the sequence begins on the following line.  Figure 4 shows an example of a multiple sequence alignment.  An asterisk (*) beneath a column indicates that the residue at that position is identical in all of the aligned sequences.

Figure 4: An example of a multiple sequence alignment
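To make the meaning of the asterisks concrete, the sketch below builds a marker line for a set of already-aligned sequences, placing `*` under every column in which all sequences carry the same residue. It is an illustration only, not ClustalW's own code: the function name and the three short aligned rows are hypothetical, gaps are assumed to be written as `-`, and the additional symbols ClustalW prints for similar (rather than identical) protein residues are ignored.

```python
def conservation_line(aligned):
    """Return a string with '*' under each fully identical column."""
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(row) != length for row in aligned):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    # zip(*aligned) walks the alignment column by column.
    for column in zip(*aligned):
        first = column[0]
        if first != "-" and all(residue == first for residue in column):
            marks.append("*")
        else:
            marks.append(" ")
    return "".join(marks)


# Hypothetical aligned rows; '-' marks a gap inserted by the aligner.
rows = [
    "ATGGCC-TA",
    "ATGGCTTTA",
    "ATGACC-TA",
]

for row in rows:
    print(row)
print(conservation_line(rows))
```

Columns that contain a gap or any mismatch are left blank, which is why long runs of asterisks in an alignment point to strongly conserved regions.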




Buel GR, Rush J, Ballif BA (2010) Fyn promotes phosphorylation of collapsin response mediator protein 1 at tyrosine 504, a novel, isoform-specific regulatory site. J Cell Biochem 111(1): 20-28.

(PDF available upon request)