Fast protein classification by using the most significant pairs

Date

2010-11-17

Abstract

This study introduces a new approach to speeding up protein classification. The basic idea is to rewrite the sequences of each family using only the most significant pairs, out of the 400 distinct pairs that can appear in protein sequences. Using the 100, 200 and 300 most significant pairs reduces the sequence length to 0.86, 0.91 and 0.95 of its original length, respectively. The corresponding average time reductions are 0.53 %, 0.33 % and 0.22 %. In all three cases the suggested procedure can be adopted to reduce testing time. However, to match the classification rate of the previous profile HMM, at least 300 pairs must be used.
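
The pair-selection step can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: that the 400 pairs are the 20 × 20 ordered pairs of standard amino acids, that pairs are counted over overlapping adjacent residues, and that "significance" can be approximated by raw frequency within a family. The names `ALL_PAIRS` and `top_pairs` are hypothetical, and the rewriting step itself is not shown because the abstract does not specify it.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All ordered residue pairs: 20 * 20 = 400, matching the count in the abstract.
ALL_PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
PAIR_SET = set(ALL_PAIRS)


def top_pairs(sequences, k):
    """Return the k most frequent adjacent residue pairs in a family.

    Assumption: significance is approximated by raw frequency, counted
    over overlapping pairs; the paper may use a different criterion.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            pair = seq[i:i + 2]
            # Skip pairs containing non-standard symbols such as 'X'.
            if pair in PAIR_SET:
                counts[pair] += 1
    return [pair for pair, _ in counts.most_common(k)]


if __name__ == "__main__":
    family = ["ACACAC", "ACGT"]  # toy sequences, not real data
    print(len(ALL_PAIRS))
    print(top_pairs(family, 2))
```

Calling `top_pairs` with `k` set to 100, 200 or 300 would give the candidate pair sets for the three cases compared in the abstract, under the frequency assumption stated above.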

Keywords

G-protein-coupled receptor, hidden Markov model, multiple alignment, significant pairs
