## A novel protein fingerprint

# Mihaly Mezei

Mihaly.Mezei@mssm.edu

# Theory

For a protein of *n* residues, the fingerprint is defined by up
to three *n* x *n* binary matrices.
The elements of the primary fingerprint matrix, **FP**^{0},
are defined by the angle (f_{o}
in the figure below) that the line
connecting the carbonyl carbons of residues *i* and *j*
forms with the direction of the C=O bond on residue *i*:

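$$
\mathrm{FP}^{0}_{ij} = \operatorname{sign}\left\{ [\mathbf{r}(\mathrm{O}_{i}) - \mathbf{r}(\mathrm{C}_{i})] \cdot [\mathbf{r}(\mathrm{C}_{j}) - \mathbf{r}(\mathrm{C}_{i})] \right\}.
$$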
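The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `primary_fingerprint` and the input layout (one `(x, y, z)` tuple per residue for the carbonyl C and O atoms) are assumptions. The diagonal, where the C-to-C vector vanishes and the sign is undefined, is set to 0 here as a convention of this sketch.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    """Component-wise difference a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    """Scalar product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def sign(x: float) -> int:
    """Return +1, -1, or 0 (0 only for an exactly vanishing argument)."""
    return (x > 0) - (x < 0)


def primary_fingerprint(c_xyz: Sequence[Vec], o_xyz: Sequence[Vec]) -> List[List[int]]:
    """FP0[i][j] = sign{[r(O_i) - r(C_i)] . [r(C_j) - r(C_i)]}.

    c_xyz, o_xyz: carbonyl C and O coordinates, one entry per residue
    (hypothetical input layout). Diagonal elements come out as 0 because
    r(C_i) - r(C_i) is the zero vector.
    """
    n = len(c_xyz)
    fp = [[0] * n for _ in range(n)]
    for i in range(n):
        co_bond = sub(o_xyz[i], c_xyz[i])  # C=O direction on residue i
        for j in range(n):
            fp[i][j] = sign(dot(co_bond, sub(c_xyz[j], c_xyz[i])))
    return fp
```

An element is +1 when the angle between the C=O bond of residue *i* and the line to the carbonyl carbon of residue *j* is acute, and -1 when it is obtuse.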

Since the C=O directions essentially alternate by 180° in sheets,
**FP**^{0} is dominated by
alternating white and black bars in such regions.
In helices, on the other hand, the C=O directions are essentially
parallel,
resulting in black isosceles right triangles
located above the diagonal.

The secondary fingerprint matrices **FP**^{1}
and **FP**^{2} are defined to allow differentiation
between parallel and antiparallel sheets, and between different
packings of helices, respectively.
**FP**^{1} is defined by the angle between the line
connecting the carbonyl carbons of residues *i* and *j*
and the line connecting the C and N atoms of residue *i*:

$$
\mathrm{FP}^{1}_{ij} = \operatorname{sign}\left\{ [\mathbf{r}(\mathrm{N}_{i}) - \mathbf{r}(\mathrm{C}_{i})] \cdot [\mathbf{r}(\mathrm{C}_{j}) - \mathbf{r}(\mathrm{C}_{i})] \right\},
$$

while **FP**^{2} is defined by the angle between the
line
connecting the carbonyl carbons of residues *i* and *j*
and the normal to the plane formed by the C and O atoms of residue
*i* and the N of residue *i*+1:

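$$
\mathrm{FP}^{2}_{ij} = \operatorname{sign}\left\{ [\mathbf{r}(\mathrm{N}_{i+1}) - \mathbf{r}(\mathrm{C}_{i})] \times [\mathbf{r}(\mathrm{O}_{i}) - \mathbf{r}(\mathrm{C}_{i})] \cdot [\mathbf{r}(\mathrm{C}_{j}) - \mathbf{r}(\mathrm{C}_{i})] \right\}.
$$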
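Both secondary matrices follow the same pattern: a per-residue reference vector is projected onto the line connecting the carbonyl carbons. The sketch below illustrates this under stated assumptions. The function names and the input layout (lists of `(x, y, z)` tuples for the backbone C, O and N atoms) are hypothetical, and the last row of **FP**^{2}, for which residue *i*+1 does not exist, is simply left at 0.

```python
from typing import List, Optional, Sequence, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def sign(x: float) -> int:
    return (x > 0) - (x < 0)


def fingerprint_from_reference(c_xyz: Sequence[Vec],
                               refs: Sequence[Optional[Vec]]) -> List[List[int]]:
    """FP[i][j] = sign{ref_i . [r(C_j) - r(C_i)]}; rows with no reference stay 0."""
    n = len(c_xyz)
    fp = [[0] * n for _ in range(n)]
    for i in range(n):
        ref = refs[i]
        if ref is None:
            continue
        for j in range(n):
            fp[i][j] = sign(dot(ref, sub(c_xyz[j], c_xyz[i])))
    return fp


def secondary_fingerprint_1(c_xyz: Sequence[Vec], n_xyz: Sequence[Vec]) -> List[List[int]]:
    """FP1: reference vector is r(N_i) - r(C_i)."""
    refs = [sub(n_xyz[i], c_xyz[i]) for i in range(len(c_xyz))]
    return fingerprint_from_reference(c_xyz, refs)


def secondary_fingerprint_2(c_xyz: Sequence[Vec], o_xyz: Sequence[Vec],
                            n_xyz: Sequence[Vec]) -> List[List[int]]:
    """FP2: reference vector is [r(N_{i+1}) - r(C_i)] x [r(O_i) - r(C_i)]."""
    n = len(c_xyz)
    refs: List[Optional[Vec]] = [
        cross(sub(n_xyz[i + 1], c_xyz[i]), sub(o_xyz[i], c_xyz[i]))
        for i in range(n - 1)
    ]
    refs.append(None)  # assumption of this sketch: no N_{i+1} for the last residue
    return fingerprint_from_reference(c_xyz, refs)
```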

The information in **FP**^{1} encodes the direction the
backbone path takes.
Generally, **FP**^{0} contains the most
information, and in several cases it suffices by itself to
characterize the fold.
Combining two maps results in a matrix whose elements can
take four values.
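One way to picture the four-valued combination is to pack the two signs into a single integer. The text does not specify an encoding, so the scheme below (`2 * [a > 0] + [b > 0]`, giving values 0 to 3) and the function name `combine_maps` are purely illustrative assumptions.

```python
from typing import List, Sequence


def combine_maps(fp_a: Sequence[Sequence[int]],
                 fp_b: Sequence[Sequence[int]]) -> List[List[int]]:
    """Combine two sign matrices into one matrix with values 0..3.

    Hypothetical encoding: 2 * (a > 0) + (b > 0), so
    (-1, -1) -> 0, (-1, +1) -> 1, (+1, -1) -> 2, (+1, +1) -> 3.
    Elements that are 0 in an input (e.g. a degenerate diagonal)
    are treated like -1 by this encoding.
    """
    n = len(fp_a)
    return [
        [2 * int(fp_a[i][j] > 0) + int(fp_b[i][j] > 0) for j in range(n)]
        for i in range(n)
    ]
```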

# Examples

The examples below show that the minima (as a function of the
alignment of a motif with a protein containing that motif)
in the fingerprint fit track the minima in the RMSD.
Note that a 50% difference in fingerprint maps corresponds to
random alignment.
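A scan of this kind can be sketched as follows, assuming that the fingerprint difference is the fraction of off-diagonal elements that disagree between the motif's matrix and the corresponding diagonal block of the protein's matrix, and that the motif is slid along the sequence without gaps. Both assumptions, and the function names, are illustrative and not taken from the text.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]


def fingerprint_difference(fp_motif: Matrix, fp_protein: Matrix, offset: int) -> float:
    """Fraction of off-diagonal motif elements that differ from the
    protein block starting at residue `offset` (motif needs >= 2 residues)."""
    m = len(fp_motif)
    mismatches = 0
    compared = 0
    for i in range(m):
        for j in range(m):
            if i == j:
                continue  # diagonal carries no information
            compared += 1
            if fp_motif[i][j] != fp_protein[offset + i][offset + j]:
                mismatches += 1
    return mismatches / compared


def scan_alignments(fp_motif: Matrix, fp_protein: Matrix) -> List[float]:
    """Difference for every gapless placement of the motif along the protein."""
    m, n = len(fp_motif), len(fp_protein)
    return [fingerprint_difference(fp_motif, fp_protein, k) for k in range(n - m + 1)]
```

With this measure, two unrelated binary maps are expected to disagree on about half of their elements, which is consistent with the 50% figure quoted above; a placement that matches the motif gives a value near 0.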

# 3-helix bundle

# PDZ domain

# Checking for false positives

In the example below, the two four-helix bundle proteins differ
only in the orientation of a single helix.
The resulting fingerprints have clearly different patterns.

Last modified: 11/29/02 (MM)