prfx

Name prfx
Description

prfx is part of the fasta3 package. FASTA contains many programs for searching DNA and protein databases and for evaluating statistical significance from randomly shuffled sequences.

prfx evaluates the significance of a translated-DNA:protein sequence similarity score. It first compares the two sequences and calculates optimal similarity scores, then repeatedly shuffles the second sequence and recalculates the optimal similarity scores for each shuffle using the Smith-Waterman algorithm. An extreme value distribution is then fitted to the shuffled-sequence scores. The characteristic parameters of this distribution are used to estimate the probability that each of the unshuffled sequence scores would be obtained by chance in one sequence, or in a number of sequences equal to the number of shuffles. prss is a related program for evaluating DNA:DNA and protein:protein matches.
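The shuffle-and-fit procedure described above can be illustrated with a minimal Python sketch. This is not the prfx implementation, and it rests on several assumptions: sequences are compared as plain strings (no DNA translation or frameshift handling), scoring uses a simple match/mismatch scheme with a linear gap penalty rather than a protein substitution matrix, and the extreme value (Gumbel) parameters are estimated by the method of moments, which may differ from the fitting method FASTA uses. The function names sw_score, fit_gumbel and shuffle_significance are hypothetical.

```python
import math
import random


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Optimal local alignment score (Smith-Waterman, linear gap penalty).

    The scoring values are illustrative assumptions, not FASTA defaults.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def fit_gumbel(scores):
    """Method-of-moments estimate of the Gumbel location (mu) and scale (beta)."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    if var == 0:
        raise ValueError("shuffled scores have zero variance; cannot fit")
    beta = math.sqrt(6.0 * var) / math.pi
    mu = mean - 0.5772156649 * beta  # Euler-Mascheroni constant
    return mu, beta


def shuffle_significance(seq1, seq2, shuffles=200, seed=0):
    """Return (observed score, P(chance) in one sequence, expectation in `shuffles`)."""
    rng = random.Random(seed)
    observed = sw_score(seq1, seq2)

    # Score seq1 against repeatedly shuffled copies of seq2.
    residues = list(seq2)
    shuffled_scores = []
    for _ in range(shuffles):
        rng.shuffle(residues)
        shuffled_scores.append(sw_score(seq1, "".join(residues)))

    # Fit the extreme value distribution to the shuffled-sequence scores.
    mu, beta = fit_gumbel(shuffled_scores)

    # P(S >= observed) = 1 - exp(-exp(-z)), with z = (observed - mu) / beta.
    # The exponent is clamped to avoid overflow; expm1 keeps tiny values precise.
    z = (observed - mu) / beta
    p_one = -math.expm1(-math.exp(min(-z, 700.0)))

    # Expected number of chance scores this high among `shuffles` sequences.
    expect = shuffles * p_one
    return observed, p_one, expect
```

Calling shuffle_significance(seq1, seq2) with two related sequences should give a small probability, while unrelated sequences should give a value closer to 1. The seed argument only makes the shuffles reproducible.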

References:
Pearson, W.R. Flexible sequence similarity searching with the FASTA3 program package. Methods Mol Biol. 2000;132:185-219.

Pearson, W.R. Empirical statistical estimates for sequence similarity searches. J Mol Biol. 1998 Feb 13;276(1):71-84.

Pearson, W.R., Wood, T., Zhang, Z., Miller, W. Comparison of DNA sequences with protein sequences. Genomics. 1997 Nov 15;46(1):24-36.


Homepage http://www.people.virginia.edu/~wrp/pearson.html  
Remote Documentation http://www.people.virginia.edu/~wrp/papers/ismb2000.pdf
 
Local Documentation
fasta3x.doc.txt