Title

Application of Modified General Regression Model to Cluster Protein Sequences

Author

G Lavanya Devi, Allam Appa Rao, A Damodaram, GR Sridhar, G Jaya Suma

Citation

Vol. 8, No. 4, pp. 225-231

Abstract

Cluster analysis is the study of techniques for finding the most representative cluster prototypes. A linear relation between two sequences can be modeled perfectly by the classical linear regression model. Protein sequence clustering has many applications, such as classifying a new sequence, predicting the structure of an unknown sequence, and finding family and subfamily relationships among protein sequences. When clustering a repository of protein sequences into groups whose members have a strong linear relationship with each other, comparing sequences one by one is prohibitively expensive. In this paper, we propose a new technique, the General Regression Model Technique (GRMT1), to test the linearity of sequences. We then apply the General Regression Model Technique Clustering Algorithm (GRMTCA) to cluster the protein sequences. The performance of the algorithm was evaluated on 50 protein sequences. We used BLAST to annotate the clusters obtained by GRMTCA and observed that the clusters have biological significance.
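
The idea of grouping sequences by the strength of their linear relationship can be illustrated with a minimal sketch. This is not the authors' GRMT or GRMTCA, whose details the abstract does not give. It assumes each protein sequence has already been encoded as an equal-length numeric vector (the encoding is not specified here), uses the coefficient of determination of a simple linear regression as the linearity test, and compares each sequence only against one prototype per cluster rather than against every other sequence. The names `r_squared`, `greedy_cluster`, and the `threshold` value are hypothetical.

```python
from typing import List, Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination of the simple linear regression of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        # A constant sequence has no defined linear fit; treat as unrelated.
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def greedy_cluster(seqs: List[Sequence[float]], threshold: float = 0.9) -> List[List[int]]:
    """Assign each sequence to the first cluster whose prototype it fits.

    The prototype of a cluster is its first member. The threshold of 0.9
    is an illustrative assumption, not a value from the paper.
    """
    clusters: List[List[int]] = []
    for i, seq in enumerate(seqs):
        for cluster in clusters:
            if r_squared(seqs[cluster[0]], seq) >= threshold:
                cluster.append(i)
                break
        else:
            # No prototype fits well enough, so start a new cluster.
            clusters.append([i])
    return clusters


# Hypothetical numeric encodings of four short sequences.
encoded = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 1, 3, 2], [3, 5, 7, 9]]
clusters = greedy_cluster(encoded)
```

Comparing against prototypes only keeps the number of regressions proportional to the number of sequences times the number of clusters, which is the kind of saving over exhaustive pairwise comparison that the abstract motivates.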

Keywords

Clustering, BLAST, General Regression Model, Protein Sequences

URL

http://paper.ijcsns.org/07_book/200804/20080432.pdf