Overview

Mutation Assessor is a database and scoring system for predicting the functional impact of mutations in translated proteins in human cancers. It combines data from other databases, such as COSMIC, UniProt, and Pfam, with its own "functional impact score" for a mutation, which is based on the evolutionary conservation of the protein sequence containing the mutation. The score can also be used to identify "driver" mutations in human cancers. A public server annotates and scores user-submitted data against the database.

Scores are currently available for all positions in the hg18 and hg19 reference genomes. The database also classifies each mutation as having high, medium, low, or neutral functional impact.

The Scoring Algorithm

The algorithm calculates the "entropy" of each column in a set of aligned sequences. The entropy of column $i$ is

$S_i = \ln \frac{N!}{\prod_\alpha n_i(\alpha)!}$

where $\alpha$ indexes the residue type (e.g. 1 through 21, covering the 20 amino acids and a gap), $n_i(\alpha)$ is the number of residues of type $\alpha$ in column $i$, and $N$ is the total number of residues in the column. Replacing one residue of type $\alpha$ with one of type $\beta$ lowers $n_i(\alpha)$ by one and raises $n_i(\beta)$ by one, so all other factorials cancel and the entropy difference for a residue change from $\alpha$ to $\beta$ is

$\Delta S_i^c( \alpha \to \beta ) = - \ln \frac{n_i(\beta)+1}{n_i(\alpha)}$
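The two formulas above can be checked against each other numerically. The following is a minimal Python sketch, not the Mutation Assessor implementation: the function names and the example column are hypothetical, and it assumes $\alpha \neq \beta$ and that residue $\alpha$ occurs at least once in the column. It computes the column entropy with log-factorials and confirms that recomputing the entropy after the substitution gives the same value as the closed-form expression.

```python
import math
from collections import Counter


def column_entropy(counts: Counter) -> float:
    """S_i = ln(N! / prod_alpha n_i(alpha)!), using lgamma(k + 1) = ln(k!)."""
    n_total = sum(counts.values())
    return math.lgamma(n_total + 1) - sum(
        math.lgamma(n + 1) for n in counts.values()
    )


def _check(counts: Counter, alpha: str, beta: str) -> None:
    # Assumptions of this sketch: alpha != beta, and alpha is present in the column.
    if alpha == beta:
        raise ValueError("alpha and beta must differ")
    if counts[alpha] < 1:
        raise ValueError(f"residue {alpha!r} does not occur in this column")


def entropy_change_direct(counts: Counter, alpha: str, beta: str) -> float:
    """Recompute the entropy after moving one residue from alpha to beta."""
    _check(counts, alpha, beta)
    mutated = Counter(counts)  # copy, so the input counts stay unchanged
    mutated[alpha] -= 1
    mutated[beta] += 1
    return column_entropy(mutated) - column_entropy(counts)


def entropy_change_closed(counts: Counter, alpha: str, beta: str) -> float:
    """Closed form: -ln((n_i(beta) + 1) / n_i(alpha))."""
    _check(counts, alpha, beta)
    return -math.log((counts[beta] + 1) / counts[alpha])


# Hypothetical alignment column: one character per aligned sequence, '-' is a gap.
counts = Counter("LLLLLLIIV-")
print(entropy_change_direct(counts, "L", "V"))
print(entropy_change_closed(counts, "L", "V"))
```

In this sketch a change from a common residue to a rare one gives a positive value, and a change from a rare residue to a common one gives a negative value.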

Since the residue counts are already stored in the Mutation Assessor database, each score needs only a lookup of two counts, so the scores for $n$ SNPs can be calculated in $O(n)$ time.[1][2]
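The linear-time claim can be illustrated with a short Python sketch. The storage layout of the real database is not described here, so the table `COLUMN_COUNTS`, its keys, and the function `score_mutations` are hypothetical stand-ins. The sketch covers only the entropy difference defined above, and each mutation costs one table lookup plus one logarithm.

```python
import math
from collections import Counter

# Hypothetical precomputed table: (protein id, alignment column) -> residue counts.
COLUMN_COUNTS = {
    ("P1", 10): Counter({"L": 6, "I": 2, "V": 1, "-": 1}),
    ("P1", 42): Counter({"G": 10}),
}


def score_mutations(mutations, column_counts):
    """Score (protein, position, ref, alt) tuples in a single pass.

    Each mutation needs one lookup and one logarithm, so n mutations
    take O(n) time overall.
    """
    scores = []
    for protein, position, ref, alt in mutations:
        counts = column_counts[(protein, position)]
        if counts[ref] < 1:
            raise ValueError(f"{ref!r} not observed at {protein}:{position}")
        # Entropy difference: -ln((n(alt) + 1) / n(ref)); a Counter returns 0
        # for residues never seen in the column.
        scores.append(-math.log((counts[alt] + 1) / counts[ref]))
    return scores


mutations = [("P1", 10, "L", "V"), ("P1", 42, "G", "A")]
scores = score_mutations(mutations, COLUMN_COUNTS)
print(scores)
```

A dictionary keyed by protein and position is used here only because it gives constant-time lookups; any indexed store with the same property would preserve the $O(n)$ bound.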

References

1. Reva, Antipin, and Sander (2011). Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Research. doi:10.1093/nar/gkr407
2. Reva, Antipin, and Sander (2007). Determinants of protein function revealed by combinatorial entropy optimization. Genome Biology. doi:10.1186/gb-2007-8-11-r232