Icahn School of Medicine at Mount Sinai
New York, NY 10029
July 25, 2003.
For a protein of n residues, its fingerprint is defined by up to three n × n binary matrices. The elements of the primary fingerprint matrix FP0 are defined by the angle that the line connecting the carbonyl carbons of residues i and j forms with the direction of the C=O bond on residue i:
The secondary fingerprint matrices FP1 and FP2 are defined to allow differentiation between parallel and antiparallel sheets, and between different packings of helices, respectively. FP1 is defined by the angle between the line connecting the carbonyl carbons of residues i and j and the line connecting the C and N atoms of residue i:
while FP2 is defined by the angle between the line connecting the carbonyl carbons of residues i and j and the normal to the plane formed by the C and O atoms of residue i and the N of residue i+1:
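As a rough illustration of the three angles described above, the following Python sketch computes them from backbone coordinates. It is not taken from the pfp source: the function names, the residue data layout, the sign conventions (C to N direction, orientation of the plane normal) and the 90° cutoff used to turn an angle into a binary matrix element are all assumptions made for the example.

```python
import math

def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angle_deg(u, v):
    """Angle between vectors u and v, in degrees (0..180)."""
    c = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def fingerprint_angles(res, i, j):
    """Return the angles underlying FP0, FP1 and FP2 for the pair (i, j).

    res is a list of dicts holding the 'C', 'O' and 'N' coordinates of
    each residue (hypothetical layout). The FP2 angle is None for the
    last residue, which has no residue i+1.
    """
    cc = sub(res[j]["C"], res[i]["C"])   # carbonyl carbon i -> carbonyl carbon j
    co = sub(res[i]["O"], res[i]["C"])   # C=O bond direction on residue i
    cn = sub(res[i]["N"], res[i]["C"])   # C -> N on residue i (assumed direction)
    a0 = angle_deg(cc, co)
    a1 = angle_deg(cc, cn)
    a2 = None
    if i + 1 < len(res):
        # Normal to the plane through C_i, O_i and N_(i+1); its sign is an assumption.
        normal = cross(co, sub(res[i + 1]["N"], res[i]["C"]))
        a2 = angle_deg(cc, normal)
    return a0, a1, a2

def binary_matrix(res, which=0, cutoff_deg=90.0):
    """Build an n x n 0/1 matrix: 1 where the chosen angle is below the cutoff.

    The cutoff value is a placeholder, not the program's actual criterion.
    Diagonal elements (i == j) and undefined angles are set to 0.
    """
    n = len(res)
    fp = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            a = fingerprint_angles(res, i, j)[which]
            if a is not None and a < cutoff_deg:
                fp[i][j] = 1
    return fp
```

Calling `binary_matrix(res, which=0)` gives an FP0-like map, while `which=1` and `which=2` give the analogues of FP1 and FP2 under the same assumed cutoff.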
Since the C=O directions essentially alternate by 180° in sheets, FP0 will be dominated by alternating white and black bars in such regions. On the other hand, the C=O directions are essentially parallel in helices, resulting in black isosceles right-angled triangles located above the diagonal. The information in FP1 encodes the direction the backbone path takes, making it possible to distinguish parallel from antiparallel sheets. The information in FP2 encodes the relative position of the backbone segments, making it possible to distinguish differently packed helices. Generally, FP0 contains the most information, and in several cases it can by itself characterize the fold. Combining two maps results in a matrix whose elements can take four values.
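The four-valued combination of two binary maps can be sketched as follows. The encoding used here (first map as the low bit, second map as the high bit) and the function name are assumptions chosen for illustration; the text does not state how the program labels the four values.

```python
def combine_maps(fp_a, fp_b):
    """Combine two equally sized 0/1 matrices into one with values 0..3.

    Element value = fp_a[i][j] + 2 * fp_b[i][j] (assumed encoding), so each
    of the four possible (fp_a, fp_b) pairs maps to a distinct value.
    """
    if len(fp_a) != len(fp_b):
        raise ValueError("maps must have the same size")
    return [[a + 2 * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(fp_a, fp_b)]
```

For example, an element that is set in both maps becomes 3, and one set only in the second map becomes 2.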
The fingerprint matrices and the result of the comparison are written to ASCII-formatted files.
The fingerprints are also plotted in PostScript format and,
For runs comparing two proteins, the fingerprint of the smaller one is compared with the fingerprint of the larger one in all possible positions. Both the difference in the fingerprints and the RMSD of the overlaid backbone segments are printed to the output file and plotted in the PostScript file. Furthermore, the overlaid backbones of the best matches are also plotted (in stereo).
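The position scan can be pictured with the short sketch below. It assumes that "all possible positions" means sliding the smaller fingerprint along the diagonal of the larger one, and that the difference is measured as the fraction of mismatched elements; both are assumptions, since the exact measure used by the program is not given here. The RMSD calculation is not included.

```python
def block_difference(small, large, offset):
    """Fraction of elements that differ between the small fingerprint and
    the diagonal block of the large fingerprint starting at `offset`."""
    m = len(small)
    mismatches = 0
    for i in range(m):
        for j in range(m):
            if small[i][j] != large[offset + i][offset + j]:
                mismatches += 1
    return mismatches / (m * m)

def scan_positions(small, large):
    """Difference at every offset where the small map fits inside the large one."""
    m, n = len(small), len(large)
    return [block_difference(small, large, k) for k in range(n - m + 1)]

def best_offset(small, large):
    """Offset with the smallest difference (first one in case of ties)."""
    diffs = scan_positions(small, large)
    return min(range(len(diffs)), key=diffs.__getitem__)
```

A protein of m residues scanned against one of n residues yields n - m + 1 difference values, one per starting residue in the larger protein.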
The program is written in Fortran-77. It has been tested on several platforms (including Linux). To include the Iris-GL calls (when compiling on an SGI graphics workstation), the C@GL comment markers have to be removed from the source file and a reference to the graphics libraries has to be included:
cat pfp.f | sed 's/C@GL//' > pfp_tmp.f
f77 pfp_tmp.f -o pfp.bin -lfgl -lgl