# Circular variance for macromolecular topography

### Mihaly Mezei

Published in the
*Journal of Molecular Graphics and Modelling*,
**21,** 463-472 (2003).

The basic idea is that, given a set of points and a query
point (R_{o}),
the sum of the vectors from the query point to the points of the set
approaches zero when the query point is in the 'middle' of the
set and approaches the sum of the vector lengths when the query
point is far outside the set.

Circular variance, used to characterize the spread in a set of
angles, is defined as
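
A sketch of the standard textbook form, assuming $n$ angles $\theta_i$ (this notation is an assumption and may differ from the paper's):

$$
CV = 1 - \frac{1}{n}\sqrt{\Big(\sum_{i=1}^{n}\cos\theta_i\Big)^{2} + \Big(\sum_{i=1}^{n}\sin\theta_i\Big)^{2}}
$$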

This can be written in the more general form of
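
One way to write this general form, assuming $\mathbf{r}_i$ denotes the $i$-th of $n$ vectors (notation assumed here, chosen to match the sum of vectors described above):

$$
CV = 1 - \frac{1}{n}\left|\sum_{i=1}^{n}\frac{\mathbf{r}_i}{|\mathbf{r}_i|}\right|
$$

For unit vectors in the plane, $\mathbf{r}_i/|\mathbf{r}_i| = (\cos\theta_i, \sin\theta_i)$, and the expression reduces to the angular definition.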

This form is valid in three dimensions as well (in that case it
is also referred to as __spherical variance__).
Applying this form to the vectors from the query point gives
a smooth 0-1 scale for the extent of burial of the query point in
the set.
In the example below, a 6 Å slice of bacteriorhodopsin simulated in
water is shown, where each atom is color-coded by its circular
variance w.r.t. the atoms of the protein.
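
The burial measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the Simulaid implementation; the function name `circular_variance` and the toy cube of points are assumptions introduced here.

```python
import math

def circular_variance(query, points):
    """CV of `query` w.r.t. `points`: near 0 far outside the set, near 1 when buried."""
    sx = sy = sz = 0.0
    n = 0
    for px, py, pz in points:
        dx, dy, dz = px - query[0], py - query[1], pz - query[2]
        length = math.sqrt(dx * dx + dy * dy + dz * dz)
        if length == 0.0:
            continue  # a point coinciding with the query has no direction
        # accumulate the unit vector pointing from the query to this point
        sx += dx / length
        sy += dy / length
        sz += dz / length
        n += 1
    if n == 0:
        raise ValueError("need at least one point distinct from the query")
    return 1.0 - math.sqrt(sx * sx + sy * sy + sz * sz) / n

# Toy point set (hypothetical): the 8 corners of a cube centred at the origin.
cube = [(x, y, z) for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)]
cv_center = circular_variance((0.0, 0.0, 0.0), cube)   # unit vectors cancel
cv_far = circular_variance((100.0, 0.0, 0.0), cube)    # unit vectors nearly parallel
```

At the cube centre the unit vectors cancel and the value is essentially 1; from far away they are nearly parallel and the value is close to 0.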

**Detecting pocket regions** (steps 1-2)

1. Overlay a grid on the macromolecule and
remove the gridpoints that are covered by an atom
of the macromolecule
(see also Stahl et al., *Prot. Engng.,* **13,** 218
(2001)).

2. Find the connected clusters of the remaining gridpoints.
One of these clusters will include
all the gridpoints external to the macromolecule
while the rest will delineate various cavities.
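
Steps 1-2 can be sketched as follows. This is an illustrative outline, not the MMC code: the function names, the single atomic `radius` used as the coverage test, the face-neighbour definition of connectivity, and the hollow-box toy "molecule" are all assumptions made for the example.

```python
from collections import deque
from itertools import product
import math

def free_gridpoints(atoms, radius, spacing, margin):
    """Step 1: indices (i, j, k) of gridpoints farther than `radius` from every atom."""
    lo = [min(a[d] for a in atoms) - margin for d in range(3)]
    hi = [max(a[d] for a in atoms) + margin for d in range(3)]
    shape = [int(math.floor((hi[d] - lo[d]) / spacing)) + 1 for d in range(3)]
    free = set()
    for idx in product(*(range(s) for s in shape)):
        p = [lo[d] + idx[d] * spacing for d in range(3)]
        if all(sum((p[d] - a[d]) ** 2 for d in range(3)) > radius ** 2 for a in atoms):
            free.add(idx)
    return free

def connected_clusters(cells):
    """Step 2: face-connected clusters of grid cells, largest first (breadth-first search)."""
    remaining = set(cells)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, queue = {seed}, deque([seed])
        while queue:
            i, j, k = queue.popleft()
            for nb in ((i + 1, j, k), (i - 1, j, k), (i, j + 1, k),
                       (i, j - 1, k), (i, j, k + 1), (i, j, k - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    cluster.add(nb)
                    queue.append(nb)
        clusters.append(cluster)
    return sorted(clusters, key=len, reverse=True)

# Toy "macromolecule" (hypothetical): atoms on the surface of a 5x5x5 lattice cube,
# which encloses a 3x3x3 empty cavity.
shell_atoms = [(float(x), float(y), float(z))
               for x, y, z in product(range(5), repeat=3)
               if 0 in (x, y, z) or 4 in (x, y, z)]
free = free_gridpoints(shell_atoms, radius=0.6, spacing=1.0, margin=2.0)
clusters = connected_clusters(free)
```

In this toy case the largest cluster is the exterior and the second cluster is the enclosed cavity.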

**Detecting pocket regions** (steps 3-5)

3. Calculate the CV for the remaining gridpoints
with respect to the macromolecule.

4. Eliminate all gridpoints where the value of CV
is below a threshold value, CV_{max}.

5. Each connected cluster will represent one pocket.
The more gridpoints are in the cluster, the larger the pocket is.
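
Steps 3-5 can be sketched in the same spirit. Everything specific here is an assumption for illustration: the open-box toy "molecule", the threshold value `cv_max = 0.5`, the simplified stand-in for steps 1-2 (gridpoints that do not coincide with an atom), and the choice to evaluate CV on all free gridpoints.

```python
import math
from collections import deque
from itertools import product

def circular_variance(query, points):
    """1 - |sum of unit vectors from query to points| / n (points must not be empty)."""
    s, n = [0.0, 0.0, 0.0], 0
    for p in points:
        d = [p[k] - query[k] for k in range(3)]
        length = math.sqrt(sum(c * c for c in d))
        if length > 0.0:
            s = [s[k] + d[k] / length for k in range(3)]
            n += 1
    return 1.0 - math.sqrt(sum(c * c for c in s)) / n

def connected_clusters(cells):
    """Face-connected clusters of grid cells, largest first."""
    remaining, clusters = set(cells), []
    while remaining:
        seed = remaining.pop()
        cluster, queue = {seed}, deque([seed])
        while queue:
            i, j, k = queue.popleft()
            for nb in ((i + 1, j, k), (i - 1, j, k), (i, j + 1, k),
                       (i, j - 1, k), (i, j, k + 1), (i, j, k - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    cluster.add(nb)
                    queue.append(nb)
        clusters.append(cluster)
    return sorted(clusters, key=len, reverse=True)

# Toy "macromolecule" (hypothetical): a lattice box whose top face (z = 4) is open.
atoms = [p for p in product(range(5), repeat=3)
         if (0 in p or 4 in p)
         and not (p[2] == 4 and 1 <= p[0] <= 3 and 1 <= p[1] <= 3)]
atom_set = set(atoms)
# Stand-in for steps 1-2: gridpoints (spacing 1) that do not coincide with an atom.
free = [g for g in product(range(-2, 7), repeat=3) if g not in atom_set]

cv_max = 0.5                                              # hypothetical threshold
cv = {g: circular_variance(g, atoms) for g in free}      # step 3
buried = [g for g in free if cv[g] >= cv_max]            # step 4: drop CV below threshold
pockets = connected_clusters(buried)                      # step 5: one cluster per pocket
```

The gridpoint at the centre of the box survives the threshold and ends up in a pocket cluster, while gridpoints outside the box, such as the far grid corner, are eliminated in step 4.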

**Example**: pockets of a bromodomain.
One of the pockets is the binding site.

**Detecting domain-separation**

Domain separation can be detected by examining the circular
variance
map, determined from the representative atoms of each residue
(*e.g.,* the alpha carbons for proteins):

where *r*_{ij} is the vector from the representative
atom of residue *i* to that of residue *j*. Since a
domain-separating segment is outside both domains it separates, a
low-CV swath of this CV map is diagnostic of domain-separating
regions.

**Example:** CV map of bacteriorhodopsin.
Loops connecting the transmembrane helices are
domain-separators. Black bars below the map
show the positions of the (transmembrane) helices.
The loop between helices 3 and 4 is buried.

The calculations were performed with the programs
Simulaid
(insideness labeling and domain-separation map) and
MMC (pocket regions),
available at this website.

Last modified: 11/15/2004 (MM)