Cross_match 


Cross_match search for term

Phil Green's (University of Washington) general purpose utility for comparing any two sets of DNA sequences. cross_match is a program for rapid protein and nucleic acid sequence comparison and database search, based on an efficient implementation of the Smith-Waterman-Gotoh algorithm (Smith & Waterman 1981, Gotoh 1982). By revising the recursion relations slightly and making efficient use of word-packing, we have been able to significantly reduce the number of machine instructions executed per Smith-Waterman matrix cell, resulting in a speed enhancement of 2-3 fold over the previous best implementation (Pearson and Miller, 1992). In addition, cross_match allows the comparison of a pair of sequences to be constrained to bands of the Smith-Waterman matrix that surround one or more matching words in the sequences. This substantially increases speed for large-scale nucleotide sequence comparisons without significantly compromising sensitivity. We have found cross_match useful for the following tasks: comparing a set of reads to a set of vector sequences to produce vector-masked versions of the reads; comparing contig sequences found by two alternative assembly procedures (e.g. phrap and xbap) to each other; and comparing assembled contigs to the final edited cosmid sequence, in retrospective studies of the accuracy of fully automated assembly. We also now routinely use cross_match to find repeats in finished cosmid sequences and mask them out prior to database searches. Cross_match can also be used to search a profile against a database. (http://www.phrap.org/phredphrap/general.html)