Development and testing of algorithmic solutions for problems in computational genomics and proteomics

Ramaraj, Thiruvarangan

Files

RamarajT0810.pdf (6.61 MB)

Date

2010

Authors

Ramaraj, Thiruvarangan

Publisher

Montana State University - Bozeman, College of Engineering

Abstract

This dissertation covers three subjects: (i) computational characterization of antigen (Ag)-antibody (Ab) interactions; (ii) a novel and effective algorithm to predict the epitope of a protein based on an antibody imprinting technique; and (iii) a comparison of existing de novo genome assembly algorithms targeted specifically at data generated by Illumina (Solexa) short-read sequencing technology, with suggestions for their improvement. The first part focuses on identifying, characterizing, and understanding the ways in which antibodies and antigens interact. We analyze epitope and paratope regions using a large dataset of Ag-Ab complex structures taken from the Protein Data Bank (PDB). The epitope and paratope regions in our dataset are characterized in terms of their size, average amino acid residue composition, residue-residue pairing preferences, and residue dispersion. This analysis provides a more up-to-date picture of the Ag-Ab interface and new insights into the role of residue composition and distribution in Ag-Ab recognition. It also yields a refined substitution matrix optimized for the antibody imprinting technique, which is used to improve the effectiveness of the epitope prediction algorithms that were developed as the second focus of the thesis. The third and final part focuses on the de novo genome assembly problem. Genome assembly programs take the short reads generated by whole-genome shotgun sequencing and computationally reconstruct the genome. For this problem, the relationships among read length, read type, repeat complexity, quality score, and coverage, and the ways in which these parameters improve or diminish the ability of assembly programs to assemble sequence data, were studied in depth. These experiments give a better understanding of the impact of these parameters on the complexity of genome assembly and help ascertain the margins on these parameters within which the programs can assemble sequence data efficiently and accurately.

URI

https://scholarworks.montana.edu/xmlui/handle/1/2100

Collections

Theses and Dissertations at Montana State University (MSU)
