Fast protein classification by using the most significant pairs
Date
2010-11-17
Abstract
This study introduces a new approach to speeding up protein classification. The basic idea is to rewrite the sequences of each family using only the most significant residue pairs, out of the 400 distinct pairs (20 × 20 amino acids) that can appear in a protein sequence. Using the 100, 200, and 300 most significant pairs reduces the sequence length to 0.86, 0.91, and 0.95 of the original, respectively. The average time reduction is 0.53 %, 0.33 %, and 0.22 % for 100, 200, and 300 pairs, respectively. In all three cases the suggested procedure can be adopted to shorten testing time. However, to obtain the same classification rate as the previous profile HMM, at least 300 pairs must be used.
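
The pair-based rewriting described in the abstract can be illustrated with a minimal sketch. The abstract does not say how significance is measured or what happens to pairs outside the selected set, so this sketch makes two loud assumptions: a pair's significance is its frequency within the family, and pairs outside the top k are simply dropped. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def overlapping_pairs(seq):
    """Return all adjacent residue pairs of a sequence, in order."""
    return [seq[i:i + 2] for i in range(len(seq) - 1)]


def top_pairs(family, k):
    """Pick the k most frequent pairs in a family.

    Assumption: frequency stands in for 'significance'.
    """
    counts = Counter()
    for seq in family:
        counts.update(overlapping_pairs(seq))
    return {pair for pair, _ in counts.most_common(k)}


def rewrite(seq, significant):
    """Keep only the significant pairs, preserving their order.

    Assumption: pairs outside the set are dropped.
    """
    return [p for p in overlapping_pairs(seq) if p in significant]


def length_ratio(family, significant):
    """Rewritten length (in pairs) divided by original length (in residues)."""
    original = sum(len(s) for s in family)
    rewritten = sum(len(rewrite(s, significant)) for s in family)
    return rewritten / original


# Toy sequences, invented for illustration only.
family = ["MKTLLVAGLL", "MKTLAVAGLA", "MKSLLVAGLL"]
significant = top_pairs(family, 5)

print(len(AMINO_ACIDS) ** 2)  # 400 possible pairs
print(rewrite(family[0], significant))
print(round(length_ratio(family, significant), 2))
```

Under these assumptions a smaller pair set yields a shorter rewritten sequence, which is the trade-off the abstract reports between speed and classification rate.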
Keywords
G-Protein coupled receptor, hidden Markov, multi alignment, significant pairs