More in-depth details on the Structural Filter (BETA version)
Accessibility and secondary structure assignment
The solvent accessibility and secondary structure values are taken from DSSP
files. The solvent exposure of a residue is expressed as a relative (normalized) value:
the ratio of the residue's DSSP accessibility value to the accessible surface area
of that residue type as defined by Miller and co-workers (Miller et al., 1987). The latter is
calculated for the residue in a Gly-Xaa-Gly tripeptide in extended conformation.
The relative accessibility varies between 0 and 1.
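The normalization described above can be sketched as follows. This is a minimal illustration, not the filter's actual code: the function name, the idea of passing the reference area as an argument, and the clamping to the 0 to 1 range are assumptions, and the numbers in the usage note are placeholders rather than values from Miller et al.

```python
def relative_accessibility(dssp_acc: float, reference_area: float) -> float:
    """Ratio of a residue's DSSP accessibility to its reference surface area.

    reference_area is the accessible surface area of the same residue type
    in an extended Gly-Xaa-Gly tripeptide (to be looked up in the published
    table; no values are hard-coded here).
    """
    if reference_area <= 0:
        raise ValueError("reference_area must be positive")
    ratio = dssp_acc / reference_area
    # Assumption: clamp the ratio so that it stays between 0 and 1.
    return max(0.0, min(1.0, ratio))
```

For example, with a placeholder DSSP value of 50.0 and a placeholder reference area of 100.0, the function returns 0.5.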
The secondary structure assignments are: H = alpha helix, B = residue in isolated beta-bridge,
E = extended strand (participates in beta ladder), G = 3-helix (3/10 helix), I = 5 helix (pi helix),
T = hydrogen bonded turn and S = bend.
Score of an individual position
Based on a structural study of true motifs, the accessibility score (SA (p)) and
the secondary structure score (SSSE (p)) of a position p are assigned as follows:
Accessibility:
relative accessibility >= 0.7: SA (p) = 1
relative accessibility < 0.7: SA (p) = relative accessibility
Secondary structure:
helix: SSSE (p) = 0.3
strand: SSSE (p) = 0.5
G-helix: SSSE (p) = 0.7
loop: SSSE (p) = 1.0
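The per-position rules above could be written as two small functions. This is a sketch under stated assumptions: the mapping from DSSP codes to the four classes (H and I as helix, E and B as strand, G as G-helix, everything else as loop) is a guess for illustration and is not given in the text, and all names are hypothetical.

```python
# Scores per structural class, as listed in the text.
SSSE_BY_CLASS = {"helix": 0.3, "strand": 0.5, "g_helix": 0.7, "loop": 1.0}

# Assumption: how DSSP codes are grouped into the four classes.
DSSP_TO_CLASS = {
    "H": "helix", "I": "helix",
    "E": "strand", "B": "strand",
    "G": "g_helix",
    "T": "loop", "S": "loop",
}


def sa_score(rel_acc: float) -> float:
    """SA(p): 1 at or above 0.7, otherwise the relative accessibility itself."""
    return 1.0 if rel_acc >= 0.7 else rel_acc


def ssse_score(dssp_code: str) -> float:
    """SSSE(p) for one DSSP code; unknown or blank codes are treated as loop."""
    structure_class = DSSP_TO_CLASS.get(dssp_code, "loop")
    return SSSE_BY_CLASS[structure_class]
```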
Score of a match
For a motif match of length N (len(match) = N) that can be modelled onto a
structural domain, the global SA and SSSE scores are the averages of the
per-position scores:
SA (match) = (∑p SA (p)) / N
SSSE (match) = (∑p SSSE (p)) / N
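Both match scores are plain averages over the N positions of the match, so one helper covers both. The sketch below uses a hypothetical function name and made-up per-position scores.

```python
from typing import Iterable


def match_score(position_scores: Iterable[float]) -> float:
    """Average of the per-position scores (SA(p) or SSSE(p)) over a match."""
    scores = list(position_scores)
    if not scores:
        raise ValueError("a match must contain at least one position")
    return sum(scores) / len(scores)


# Made-up SSSE(p) values for a four-residue match: loop, strand, strand, loop.
example_ssse = match_score([1.0, 0.5, 0.5, 1.0])
```

With these illustrative values the average is 3.0 / 4 = 0.75.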
Benchmark
The benchmark consists of all ELM true motifs (true positives, TPs) that can be mapped
onto a domain structure (at ≥ 70% sequence similarity). A set of reliable false
positives (FPs) was determined as well.
The benchmark comprises 218 TPs and 28790 FPs.
Score calibration
The percentages of TPs and FPs were calculated for SA and SSSE score bins
of width 0.1 and plotted (figures 1 and 2).
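The binning step might look like the sketch below. It is illustrative only: the function name is hypothetical, the input scores are made up, and placing a score of exactly 1.0 in the last bin is an assumption.

```python
from typing import List, Sequence


def bin_percentages(scores: Sequence[float], n_bins: int = 10) -> List[float]:
    """Percentage of scores falling in each of n_bins equal bins over [0, 1]."""
    if not scores:
        raise ValueError("no scores to bin")
    counts = [0] * n_bins
    for score in scores:
        # Assumption: a score of exactly 1.0 is counted in the last bin.
        index = min(int(score * n_bins), n_bins - 1)
        counts[index] += 1
    return [100.0 * count / len(scores) for count in counts]


# Made-up SA scores for five matches.
percentages = bin_percentages([0.05, 0.15, 0.15, 0.95, 1.0])
```

Running the same function separately on the TP scores and on the FP scores gives the two series that are compared in each plot.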

Figure 1 - TP and FP frequencies versus SA score bins

Figure 2 - TP and FP frequencies versus SSSE score bins
Based on these findings, we divided the score range into three categories:
| range | colour | accessibility and 2D structure conditions |
| poor context | grey | (0.0 ≤ SA < 0.3) or (0.3 ≤ SA ≤ 0.4 and 0.0 ≤ SSSE < 0.5) |
| quite good context | half blue, half grey | (0.3 ≤ SA ≤ 0.4 and 0.5 ≤ SSSE ≤ 1.0) or (0.4 < SA ≤ 0.7 and 0.0 ≤ SSSE ≤ 0.5) |
| best context | blue | (0.4 < SA ≤ 0.7 and 0.5 < SSSE ≤ 1.0) or (SA ≥ 0.7) |
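The category conditions can be expressed as a single function. This is a sketch with a hypothetical name, not the filter's implementation. The listed conditions overlap at SA = 0.7; the code assumes that SA ≥ 0.7 takes precedence and always yields "best context".

```python
def context_category(sa: float, ssse: float) -> str:
    """Map the SA and SSSE scores of a match to one of the three categories."""
    # Assumption: SA >= 0.7 wins where the listed conditions overlap at 0.7.
    if sa >= 0.7:
        return "best context"
    if sa < 0.3:
        return "poor context"
    if sa <= 0.4:
        # 0.3 <= SA <= 0.4: the secondary structure score decides.
        return "quite good context" if ssse >= 0.5 else "poor context"
    # 0.4 < SA < 0.7
    return "best context" if ssse > 0.5 else "quite good context"
```

For instance, a match with SA = 0.35 and SSSE = 0.4 falls in "poor context", while SA = 0.5 and SSSE = 0.6 gives "best context".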
Evaluating the percentages of TPs and FPs that fall into each of these categories
gave the results shown in figure 3.

Figure 3 - TP and FP frequencies versus range categories.