POZITIV: Normalised Smith-Waterman alignments
---------------------------------------------

Elements of the system
----------------------

Data sources, for example

    database = 'data/yeast.aa'
    queries = 'data/aravindque'
    true_positives = 'data/aravindtp'

Search Method, for example

    pozitiv, smiwat, FASTA, BLAST

All search methods take as input one query q, a database D and a list
of (expected) true positives. The query is a string representing a
protein sequence; D is a dictionary of the form {id: sequence}.

A search returns a list of protein objects sorted by score, largest
first. Each protein has an attribute tp, which is either 'T' or 'F'
depending on whether the protein was found in the true positives file
of that query.

Note on performance
-------------------

The pozitiv algorithm is about half as fast as raw Smith-Waterman.
The reason is that it requires the actual alignments in order to
compute the statistics used for normalising the swscore. The extra
time is spent in two operations:

1: Storing states in the smiwat_align algorithm
2: Tracing back and building alignments

EXAMPLE
-------

See the file doit.py
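
As a rough illustration of the search interface described under
"Elements of the system" (this is not the code in doit.py), the sketch
below shows one way a search method could take a query, a database
dictionary and a list of expected true positives, and return protein
objects sorted by score with the tp attribute set. The names Protein,
toy_score and search are hypothetical, and toy_score is a placeholder
that counts shared 3-mers; it stands in for pozitiv or smiwat and is
not Smith-Waterman.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    """Hypothetical result record; the real class in this package may differ."""
    id: str
    sequence: str
    score: float
    tp: str  # 'T' if the id is among the expected true positives, else 'F'


def kmers(sequence, k=3):
    """Return the set of distinct k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def toy_score(query, sequence):
    """Placeholder score: number of distinct 3-mers shared (NOT Smith-Waterman)."""
    return float(len(kmers(query) & kmers(sequence)))


def search(query, database, true_positives, score_fn=toy_score):
    """Score every database entry against the query; return hits, best first."""
    expected = set(true_positives)
    hits = [
        Protein(pid, seq, score_fn(query, seq), 'T' if pid in expected else 'F')
        for pid, seq in database.items()
    ]
    hits.sort(key=lambda p: p.score, reverse=True)  # largest score first
    return hits


# Made-up data in the {id: sequence} form described above.
database = {"P1": "MKTAYIAKQR", "P2": "GGGGGGGGGG", "P3": "MKTAYLLLLL"}
results = search("MKTAYIAKQR", database, true_positives=["P1", "P3"])
for protein in results:
    print(protein.id, protein.score, protein.tp)
```

Passing the scoring function as a parameter keeps the ranking and
true-positive labelling identical across methods, so only the score
computation changes when one method is swapped for another.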