DISPHOS
Disorder-Enhanced Phosphorylation Sites Predictor

About DISPHOS

DISPHOS computationally predicts serine, threonine and tyrosine phosphorylation sites in proteins. The current version of the predictor (DISPHOS 1.3) was trained on over 2,000 non-redundant, experimentally confirmed protein phosphorylation sites (1,079 serine sites, 666 threonine sites, and 375 tyrosine sites). This set of phosphorylation sites was augmented with entries from SwissProt R44, the Phospho.ELM database, and the literature.

The amino acid composition, sequence complexity, hydrophobicity, charge and other sequence attributes of regions adjacent to phosphorylation sites are very similar to those of intrinsically disordered protein regions. This observation suggests that disorder in and around the potential phosphorylation target site is an important prerequisite for phosphorylation. DISPHOS therefore uses disorder information to improve the discrimination between phosphorylation and non-phosphorylation sites. The accuracy of DISPHOS reaches 81.3% ± 2.2% for serine, 74.8% ± 2.5% for threonine, and 79.0% ± 2.4% for tyrosine. The application of DISPHOS to ordered and disordered protein regions, as well as to various functional protein categories and proteomes, provides strong support for the hypothesis that protein phosphorylation predominantly occurs in regions of intrinsic disorder.

An executable version of DISPHOS 1.3 was developed in collaboration with Molecular Kinetics, Inc. The predictor is also available on the Molecular Kinetics website: http://www.pondr.com


Usage

Input

First, paste your query sequence in FASTA format (only the 20 symbols of the conventional amino acid code are supported). Then choose the kingdom from which the protein originates. If it is not known, please check the "unknown" box. If your protein is eukaryotic and you know the organism from which it originates, please check the appropriate boxes. If the organism is unknown, leave the field blank. If you know the functional category to which your query protein belongs, check the appropriate "functional category" box. If the category is unknown or cannot be found in the list provided, please leave the field blank.
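The input constraint above (FASTA format, 20 standard amino acid symbols only) can be checked before submitting a sequence. The following is a minimal Python sketch, not part of DISPHOS itself. The function names `parse_fasta` and `invalid_residues` are hypothetical, and the sketch assumes a single-record input and that lowercase letters can safely be converted to uppercase.

```python
# The 20 symbols of the conventional amino acid code.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must start with a '>' header line")
    if any(ln.startswith(">") for ln in lines[1:]):
        raise ValueError("expected exactly one FASTA record")
    header = lines[0][1:].strip()
    # Assumption: lowercase input is acceptable once uppercased.
    sequence = "".join(lines[1:]).upper()
    return header, sequence


def invalid_residues(sequence):
    """Return (1-based position, symbol) pairs outside the 20-letter code."""
    return [
        (pos, symbol)
        for pos, symbol in enumerate(sequence, start=1)
        if symbol not in VALID_RESIDUES
    ]
```

An empty list from `invalid_residues` means the sequence uses only supported symbols. Ambiguity codes such as X or B are reported with their positions so they can be corrected by hand.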

Although DISPHOS was trained on eukaryotic phosphorylation sites, it was also applied to estimate phosphorylation rates in different proteomes and protein functional categories. These estimates were incorporated into the predictor to improve precision and to allow prediction for organisms other than eukaryotes.

Output

The output consists of 5 columns: position, residue, phosphorylation score, surrounding sequence, and phosphorylation annotation ("yes" for a positive result). Predictions are made for all serine, threonine and tyrosine residues of the query sequence. Only residues with a score >0.5 are considered to be phosphorylated; a higher score implies a more confident positive prediction. The graphical output shows only residues that are predicted to be phosphorylated (DISPHOS score >0.5).

The score generally approximates the probability that the residue is phosphorylated, given the information supplied about the protein sequence (such as kingdom, organism, and functional category). Outputs are adjusted to the estimated class priors (the relative frequencies of non-phosphorylated and phosphorylated residues) of each group, so that the predictor tries to decrease the number of misclassified residues. Note that since the probability that a site is phosphorylated is usually smaller than 0.5 (except for some functional categories), this adjustment will typically reduce the number of sites predicted to be phosphorylated. If there is no information about the sequence, or if the user needs to know which class the query site is closer to (the phosphorylated, P, or the non-phosphorylated, NP, class), the Default Predictor should be used (assumed class priors are 0.5 and 0.5).
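To illustrate how adjusting to class priors lowers the number of predicted sites, the sketch below applies the standard Bayes correction for a change in class prior to a score obtained under equal priors (0.5 and 0.5), then applies the >0.5 threshold. This is an assumption made for illustration: the source does not state the exact formula DISPHOS uses. The function names, the flank width, the empty annotation for negative results, and the example scores are all hypothetical.

```python
def adjust_for_prior(score, prior, train_prior=0.5):
    """Rescale a probability estimated under train_prior to a new class prior.

    Standard prior-shift correction; not confirmed to be the DISPHOS method.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if not 0.0 < prior < 1.0 or not 0.0 < train_prior < 1.0:
        raise ValueError("priors must lie strictly between 0 and 1")
    pos = score * prior / train_prior
    neg = (1.0 - score) * (1.0 - prior) / (1.0 - train_prior)
    return pos / (pos + neg)


def report_rows(sequence, scores, prior=0.5, flank=4, threshold=0.5):
    """Build 5-column rows for every S, T and Y residue.

    scores maps 1-based positions to equal-prior scores (placeholders here).
    """
    rows = []
    for pos, residue in enumerate(sequence, start=1):
        if residue not in "STY":
            continue
        adjusted = adjust_for_prior(scores[pos], prior)
        # Surrounding sequence: up to `flank` residues on each side.
        context = sequence[max(0, pos - 1 - flank):pos + flank]
        annotation = "yes" if adjusted > threshold else ""
        rows.append((pos, residue, round(adjusted, 3), context, annotation))
    return rows


# Placeholder scores for the toy sequence MASTKY (not real predictions).
example_scores = {3: 0.9, 4: 0.6, 6: 0.3}
default_rows = report_rows("MASTKY", example_scores, prior=0.5, flank=2)
adjusted_rows = report_rows("MASTKY", example_scores, prior=0.2, flank=2)
```

With equal priors the scores are unchanged, so two of the three toy residues pass the threshold. With an assumed prior of 0.2 for the phosphorylated class, only the residue with the highest score remains above 0.5, which mirrors the reduction in predicted sites described above.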


References

Iakoucheva LM, Radivojac P, Brown CJ, O'Connor TR, Sikes JG, Obradovic Z, 
Dunker AK. Intrinsic disorder and protein phosphorylation. Nucleic Acids
Research, 2004, 32 (3), 1037-1049.

Dunker AK, Brown CJ, Lawson JD, Iakoucheva LM, Obradovic Z. Intrinsic 
disorder and protein function. Biochemistry, 2002, 41, 6573-6582.

Iakoucheva LM, Brown CJ, Lawson JD, Obradovic Z, Dunker AK. Intrinsic 
disorder in cell signaling and cancer-associated proteins. J. Mol. Biol., 
2002, 323, 573-584.

Dunker AK, Lawson JD, Brown CJ, et al. Intrinsically disordered protein. 
J. Mol. Graph. Model., 2001, 19, 26-59.

Diella F, Cameron S, Gemund C, Linding R, Via A, Kuster B, Sicheritz-Ponten T,
Blom N, Gibson TJ. Phospho.ELM: a database of experimentally verified
phosphorylation sites in eukaryotic proteins. BMC Bioinformatics, 2004, 5 (1), 79.


Datasets

The datasets are available upon request from Predrag Radivojac.
