Major Research Activities

  • Categorization and classification of bioinformatics software applicable to the agricultural domain.
  • Phylogenetic and structure predictions of hypothetical proteins.
  • Development of a miRNA prediction tool.
  • Comparative studies of halophytes and glycophytes using bioinformatics tools.
  • Missing gene identification in the thiamine biosynthesis pathway.
  • tRNA study using a comparative genomics approach.
  • Analyses of database on pedigree management of wheat, maize, rice, chickpea and mung


New Research Activities

The application of bioinformatics tools is being initiated for nucleotide and protein sequence alignment, gene prediction, establishment of genetic markers, and protein structure prediction and modeling. We aim to identify and sequence genes, and their products, that are responsible for quality traits, pest and disease resistance, drought resistance and salinity resistance in different crops, as well as genes responsible for pesticide resistance in important crop pest species. In addition, ICT tools are being used to design user-friendly application software on agri-informatics, which will ultimately be made available online. We are also planning to initiate work on NGS data analysis and mining. A proposal to initiate an M.Sc. program is under consideration with the Director, IARI. A proposal to initiate work on wheat bioinformatics is also being processed in consultation with the Director, IARI, and the Advisor, DBT.


Major activities at the centre during the last 5 years

  • Micro RNA Designer: A tool for designing microRNAs for a cDNA/mRNA sequence has been developed. After a cDNA/mRNA sequence is pasted into the tool, it designs a microRNA based on defined parameters and existing rules. Validation of the tool is in progress.
  • Prediction of microRNA: This tool, developed at the Centre, predicts the possible microRNAs present in a query sequence. The software is useful for predicting candidate microRNAs before starting wet-lab experiments on microRNAs.

  • Comparative study between glycophytes and halophytes: Proteins of glycophytes and halophytes have been compared using an in silico approach. Salt stress is one of the most serious abiotic stress factors limiting crop productivity. Accordingly, a few halophytes (salt-tolerant) and glycophytes (salt-sensitive) were chosen to study the salt stress genes and their phylogenetic relationships. The 3D structures of all the proteins were generated and superimposed to examine these relationships. It was observed that the mechanism of salt tolerance is broadly similar in both categories and that the difference lies only in gene expression. It was also observed that the proteins of glycophytes and halophytes show 100 percent similarity.

    Multiple Sequence Alignment generated by Clustal X of Glycophytes and Halophytes
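
The similarity figure reported above comes from aligned sequences. As a minimal sketch of how such a figure can be computed from two rows of a multiple sequence alignment, the Python function below calculates percent identity over the columns where neither sequence has a gap. This is an illustration only, not the Centre's method: percent identity is used here as a simple stand-in for similarity, the function name `percent_identity` is hypothetical, and the toy sequences are made up.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither row has a gap.

    Both inputs are rows taken from the same alignment, so they must have
    equal length; '-' marks a gap. (Hypothetical helper for illustration.)
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both rows
    matches = 0   # columns where the two residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy rows (made up, not real glycophyte or halophyte proteins).
print(percent_identity("MKT-LV", "MKTALV"))  # 100.0 (gap column is skipped)
print(percent_identity("MKTAL", "MRTAL"))    # 80.0
```

Skipping gapped columns is one of several common conventions; counting gaps as mismatches would give a lower value for the first pair.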

  • Prediction of structure and function of hypothetical proteins: In silico work has been carried out on hypothetical proteins of rice (Oryza sativa) to predict their structure and function. The rice genome (430 Mb) contains about 40,000 to 50,000 genes. The majority of its proteins are still unannotated and are termed unknown, putative or hypothetical. Using in silico methods, we have tried to predict the structure and function of such proteins; for example, a new domain with similarity to the SAD1/UNC-84 domain was found in the protein BAC78599. However, our predictions require wet-lab validation for confirmation.
  • Missing gene identification in the thiamine biosynthesis pathway: This work is an attempt to build signature profiles for all enzymes involved in the thiamine biosynthesis pathway, which shows many missing genes in various genomes. An attempt was made to fill these gaps by generating enzyme-specific profiles. These profiles were then used to find the missing links in the genome of Thermotoga maritima MSB8, a eubacterium. This genome shows evidence of a thiamine biosynthesis pathway, as all of its enzymes except thiD and thiG are present. We searched the genome for the enzyme phosphomethylpyrimidine kinase (thiD activity) using profiles generated with MEME and MAST. Two significant hits with phosphomethylpyrimidine kinase activity were obtained in the genome. We obtained similar results in a few other organisms.
  • In silico prediction of structure and function of hypothetical proteins: In this study we have tried to predict the structure and function of a hypothetical protein (accession no. AAG52239) of Arabidopsis thaliana. Arabidopsis thaliana is a mustard weed whose genome contains 27,029 protein-coding genes (TAIR7). The majority of its proteins are still unannotated and are termed unknown, putative or hypothetical. Predicting the structure and function of a hypothetical protein is difficult because of its low similarity to known proteins. Therefore, more emphasis was given to fold recognition and domain identification. A number of online and offline tools were used in this study. We found that our query sequence is quite similar to the 2bw3A sequence and that this template has the same domain, i.e. the hATC dimerisation domain.
  • Novel gene identification in Medicago truncatula using in silico approaches: The present work was carried out on completed sequences of Medicago truncatula, a forage legume. Genes from M. truncatula share identity with those of other legumes, and the plant establishes symbiotic relationships with nitrogen-fixing rhizobia and is colonized by mycorrhizal fungi. In this study several gene-finding programs, each based on one or another gene-finding algorithmic strategy, were used to find genes in the sequences. To increase gene prediction accuracy we combined intrinsic and extrinsic approaches by using ab initio programs and BLAST searches. We also used TWINSCAN, which combines both approaches to improve prediction. The novel gene predicted by most of the programs consisted of a single exon, so NNSplice and NetPlantGene reported no splice sites for it, which further supports our result. The gene-finding programs used included GENSCAN, GeneID, GrailEXP, GeneMark.hmm and Augustus. BLAST analysis was performed on the genes predicted by these programs to infer homology. WU-BLAST was also run against the EST sequences of M. truncatula to strengthen the evidence for the gene. The result was further confirmed with splice-site detection programs and gene prediction programs well trained on dicotyledon datasets (Diogenes, TWINSCAN). From the study, nineteen new genes were predicted from the seven sequences. Of these 19 sequences, 10 had an assigned function, 8 were of unknown function and 1 was a novel gene.
  • tRNA study using a comparative genomics approach: Transfer ribonucleic acids (tRNAs) are small molecules of ~73-90 nucleotides that play a key role in translating genetic information from mRNA into proteins. In the Genomic tRNA Database (GtRNAdb), the number of predicted tRNA genes is 639 in Arabidopsis, 764 in Oryza and, to date, 858 in Populus trichocarpa. In this study, we predicted the tRNAs of Populus trichocarpa and compared them with the tRNA sequences of Arabidopsis thaliana and Oryza sativa. A total of 765 tRNAs were predicted in Populus trichocarpa by computational approaches. We found that 68 tRNAs of Populus trichocarpa are absent in both Arabidopsis and rice, while 16 are present in Populus but absent in Arabidopsis and 11 are present in Populus but absent in Oryza. We also observed a tRNA for the 21st amino acid, selenocysteine (Sec), which is absent in both annual plants, Arabidopsis thaliana and rice. A suppressor tRNA was found in Populus but not in Arabidopsis or rice. Our results suggest that some tRNAs are present only in Populus and may have a specific role in plant development. Sec tRNAs were earlier reported only in animals and algae, but in our study a Sec tRNA was also observed in Populus trichocarpa, which suggests that this tRNA has some role in plants as well.
  • In silico analysis of hevein-like protein (Arabidopsis): Carbohydrate-binding proteins are known to be important in a variety of biological processes, mediated through their carbohydrate specificities. Some of the well-characterized roles of carbohydrate-binding proteins are in cell–cell communication, host–pathogen interactions, cancer metastasis, embryogenesis and tissue development. Detailed knowledge of the molecular mechanisms of carbohydrate recognition by these proteins is therefore required not only to understand the prime events in various biological processes but also to translate them into applications in medicine and biotechnology. Arabidopsis thaliana contains a small chitin-binding protein that strongly resembles hevein from the rubber tree and the hevein domains of the win proteins and tobacco CBP20 with respect to its primary structure and physicochemical properties. It is known to act as an antimicrobial compound, and its transcript level increases on pest attack; it has therefore been categorized as a defense-related protein. Analysis of the deduced amino acid sequence of the hevein-like protein revealed an N-terminal domain with striking sequence similarity to previously reported chitin-binding domains, whereas the C-terminal domain showed extensive similarity to the catalytic domain of chitinases. We carried out a systematic database analysis extending to several hevein-like molecules to derive common minimum principles characterizing the features that generate carbohydrate recognition capability as well as the determinants of specificity. The large number of sequences, the significant amount of biochemical data and the few crystal structures reported enable a simultaneous analysis of all known members of the family, giving a broader perspective on the functionalities and potential uses of these proteins.
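
The tRNA comparison described in the list above (tRNAs present in Populus but absent in Arabidopsis, in rice, or in both) is at heart a set-difference calculation. The Python sketch below illustrates the idea under stated assumptions: each tRNA is reduced to a hypothetical (isotype, anticodon) key, the function name `compare_trna_sets` is invented for this example, and the three small lists are made-up toy data, not the predictions reported here. The Centre's actual comparison may have used full sequences or other criteria.

```python
def compare_trna_sets(reference, others):
    """Find tRNA keys of a reference species that are missing elsewhere.

    reference: iterable of hashable tRNA keys, e.g. (isotype, anticodon)
    others:    dict mapping species name -> iterable of tRNA keys
    Returns (absent_in, absent_in_all):
      absent_in[name] = reference keys not found in that species
      absent_in_all   = reference keys not found in any other species
    """
    ref = set(reference)
    absent_in = {name: ref - set(keys) for name, keys in others.items()}

    absent_in_all = set(ref)
    for keys in others.values():
        absent_in_all -= set(keys)
    return absent_in, absent_in_all


# Toy data (assumed for illustration only).
populus = [("Ala", "AGC"), ("Gly", "GCC"), ("Sec", "TCA"), ("Sup", "CTA")]
arabidopsis = [("Ala", "AGC"), ("Gly", "GCC")]
rice = [("Ala", "AGC"), ("Gly", "GCC"), ("Sup", "CTA")]

absent_in, absent_in_all = compare_trna_sets(
    populus, {"arabidopsis": arabidopsis, "rice": rice}
)
for name, missing in absent_in.items():
    print(name, sorted(missing))
print("absent in all:", sorted(absent_in_all))
```

Keying on (isotype, anticodon) ignores copy number and sequence variation within an anticodon family; a sequence-level comparison would need an alignment or clustering step instead.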