Vandana Sreedharan,1 Olac Fuentes,2 and Stephen Aley1
(1) Bioinformatics Program and Department of Biological Sciences, University of Texas at El Paso
(2) Computer Science Department, University of Texas at El Paso
Abstract
Mapping genes to biological pathways is important because understanding how a pathway functions, and where it can malfunction, requires knowing its components in detail. Studies of gene expression patterns have shown that a gene's expression is linked to the pathways to which the gene belongs, so expression can be used to predict a gene's pathway. Because the correlation between expression pattern and pathway is complex, classifiers such as Support Vector Machines and Artificial Neural Networks are suited to making these predictions.
Yeast has many genes whose pathways are not known, so finding pathways for these genes could be highly beneficial. Considering 17 major pathways, 639 microarray experiments, and 1169 genes in yeast, we made predictions with Artificial Neural Networks and Support Vector Machines using both the complete data set and a reduced data set. Prediction and validation results were somewhat pathway dependent: the true positive rate ranged from 33% to 53% for the overall data but was around 67% for eight pathways, and the false positive rate ranged from 10% to 25% for the overall data set and from 1% to 15% for the eight pathways. The same models were then used to predict pathways for genes with no known pathway. These predictions were validated by two techniques: GO term analysis and BLAST. Most of the results agree with each other, giving an indication of the pathways in which the unknown genes might be involved.
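The per-pathway true positive and false positive rates quoted in this abstract can be computed in a one-vs-rest fashion from true and predicted pathway labels. The following is a minimal sketch of that bookkeeping only, not the authors' evaluation code; the function name and the toy labels are hypothetical.

```python
def per_pathway_rates(true_labels, predicted_labels):
    """Return {pathway: (true_positive_rate, false_positive_rate)}, one-vs-rest."""
    rates = {}
    for pathway in sorted(set(true_labels)):
        tp = fp = fn = tn = 0
        for truth, pred in zip(true_labels, predicted_labels):
            if truth == pathway and pred == pathway:
                tp += 1  # gene in the pathway, predicted in it
            elif truth == pathway:
                fn += 1  # gene in the pathway, predicted elsewhere
            elif pred == pathway:
                fp += 1  # gene outside the pathway, predicted in it
            else:
                tn += 1  # gene outside the pathway, predicted outside it
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        rates[pathway] = (tpr, fpr)
    return rates


# Toy labels (hypothetical, for illustration only).
true_labels = ["glycolysis", "glycolysis", "glycolysis",
               "TCA cycle", "TCA cycle", "TCA cycle"]
predicted_labels = ["glycolysis", "glycolysis", "TCA cycle",
                    "TCA cycle", "TCA cycle", "glycolysis"]

rates = per_pathway_rates(true_labels, predicted_labels)
for pathway, (tpr, fpr) in rates.items():
    print(f"{pathway}: TPR={tpr:.2f} FPR={fpr:.2f}")
```

Reporting the two rates per pathway, rather than a single overall accuracy, is what makes pathway-dependent performance of the kind described above visible.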
Linkage Analysis with Sib-Pair Data
Lizette Ortega,1 Jaclyn Scholl,2 Jaime Ramos,3 Javier Rojo4
(1) University of Arizona (2) Providence College (3) University of Texas at El Paso (4) Rice University, Houston, TX
Abstract
Many statistical tests have been devised to help identify connections between a genotype of interest and a phenotype. The purpose of this work is not only to acknowledge the contributions of several sib-pair methods but also to move beyond the isolated conclusions of each paper by presenting more general comparisons and establishing more comprehensive results. Using computer simulation, five regression methods proposed by Haseman, Elston, and Drigalenko are compared under different conditions, including three sample sizes and three phenotype distributions. The methods are evaluated in terms of Type I error and power.
Research done at Rice University Summer Institute of Statistics
Three Dimensional Structural Analysis of the Intact
Thermus thermophilus ATP Synthase by Electron Microscopy
(1) Department of Chemistry, University of Texas at El Paso (2) MRC-LMB Cambridge, UK
Abstract
ATPases/synthases are proton-translocating molecular machines that are essential to all living organisms because of their role in energy interconversion. These molecular rotary motors are not only responsible for the synthesis of ATP but, depending on their location, can also function in the acidification of intracellular compartments at the expense of ATP. The most intensively studied and best understood is the eukaryotic F-type ATPase, which synthesizes ATP via the proton motive force that is generated by photosynthesis or respiration. V-type ATPases are a related group of enzymes that rotate in the opposite direction to that of F-type ATPases and hydrolyze ATP in order to pump protons across the membrane. A-type ATPases make up the third group, which is mostly composed of ATPases of archaebacterial origin. All three ATPase types share a gross conservation of structure and are evolutionarily related. The general arrangement is that of a multi-subunit water-soluble domain (V1/F1/A1) that is connected by a central stalk to a multi-subunit integral membrane domain (Vo/Fo/Ao).
The exact stoichiometry and location of various ATPase proteins remain in question, particularly for those in the peripheral stalk. To gain insight into the structure and function of the bacterial ATPase, we have examined the structure of the ATPase from the hyperthermophilic eubacterium Thermus thermophilus using electron microscopy. A three-dimensional negative stain reconstruction has revealed the presence of not one but two peripheral stalks. The central stalk is well resolved, especially with respect to its interaction with a single catalytic subunit in the soluble sector, giving rise to an asymmetry comparable to the three catalytic states identified in the F-ATPase. Moreover, density corresponding to the membrane domain reveals 6-fold symmetry, indicating that there are probably 12 proteolipids in the membrane component of the rotor. As a whole, the ATPase appears to be about 20 Å longer along the long axis than the X-ray structure of the F1c10 ATPase. The increased length appears to be due solely to a longer central stalk and not to a larger soluble or membrane domain.
Poster: Biochemical and Computational Analyses of
Calcium Binding Proteins in Bacteria
Charmy Gandhi1,2 and Delfina C. Domínguez2
The function of calcium (Ca2+) as a cell regulator is well documented in eukaryotes. However, little is known about the role of Ca2+ in prokaryotes. Calcium ions play a pivotal role in eukaryotes by maintaining and regulating many vital functions, including cell differentiation, gene expression, transport, motility, and cell division. Ca2+ homeostasis depends on the existence of calcium binding proteins (CaBPs) as well as other mechanisms. Recent studies suggest that bacteria, like eukaryotes, keep tight control of cytosolic free Ca2+ and have Ca2+ transporters and CaBPs. We hypothesize that CaBPs play an important role in Ca2+ homeostasis and that Ca2+ ions are involved in the regulation of several intracellular processes in bacteria. An essential step toward an increased understanding of the role of Ca2+ in prokaryotes is the identification of intracellular CaBPs. Our preliminary data indicate that several CaBPs are present in bacteria (E. coli, B. subtilis and B. pertussis). These proteins share characteristics with eukaryotic CaBPs, including calmodulin (CaM). The identified proteins are acidic, have low molecular weight, cross-react with both monoclonal anti-calmodulin and anti-calerythrin antibodies, and bind radioactive calcium (45CaCl2). In an effort to identify and sequence these proteins, we analyzed crude cell lysates by 2D electrophoresis followed by mass spectrometry. Most of the proteins with CaBP characteristics are associated with stress responses (including DnaK, EF-Tu/Ts, AhpC, L7/L12, and GroEL). Based on these findings and other published data, the purpose of this research is to perform a computational analysis to investigate the presence of Ca2+ binding domains (including EF-hand, C2 domain, Gla domain, ANX domain) in these bacterial protein sequences. The long-term goal of this research is to illuminate the role of Ca2+ in bacteria.
Development, Implementation and Testing of
a DNA Microarray Test Suite
Ehsanul Haque
Bioinformatics Program, University of Texas at El Paso
Abstract
Affymetrix GeneChip technology for measuring gene expression is among the most popular in medical science and basic biology research. After the experiment has been performed, a series of computational processing steps converts the raw image data file to one intensity value per gene. The number of competing microarray data processing methods is large and growing, and each has its own strengths and weaknesses. I initiated the development of a test suite to help users identify the best method of microarray data analysis for their particular purpose. The test suite includes graphics and summary statistics for parameters such as CV (coefficient of variation) and RA (relative accuracy) and helps the user compare different processing methods. I used the test suite to compare the results of four microarray data processing methods.
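As an illustration of one of the summary statistics named above, the coefficient of variation of a gene's intensity across replicate arrays is the sample standard deviation divided by the mean. The sketch below is not the test suite itself; the method names and intensity values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of the values."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical intensities for one gene on three replicate arrays,
# as summarized by two hypothetical processing methods.
replicates = {
    "method_A": [100.0, 110.0, 90.0],
    "method_B": [100.0, 150.0, 50.0],
}

cv = {name: coefficient_of_variation(vals) for name, vals in replicates.items()}
for name, value in cv.items():
    print(f"{name}: CV = {value:.2f}")
```

A lower CV across replicates indicates that a processing method gives more reproducible intensity values for that gene; it says nothing by itself about accuracy.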
Using Proteomic Approach to Identify Tumor Associated Antigens as Markers in Hepatocellular Carcinoma (HCC)
Kok Sun Looi and Jianying Zhang
Department of Biological Sciences, University of Texas at El Paso
Abstract
Liver cancer, especially hepatocellular carcinoma (HCC), affects the Hispanic population of the United States at a rate double that of the white population. The majority of people with HCC die within one year of its detection. This high case-fatality rate can in part be attributed to the lack of diagnostic methods that allow early detection. In this project, we identified tumor-associated antigens (TAAs) in HCC using two-dimensional polyacrylamide gel electrophoresis (2-DE) and mass spectrometry. Twenty-nine identified proteins immunoreacted with HCC sera. Of these 29 proteins, 17 have been reported to be related to cancer and five to apoptosis. The molecular identification and characterization of TAAs in HCC will also contribute to our understanding of their role in malignant transformation of the liver, thereby providing attractive candidates for early diagnosis and targeted therapies.
Computational Data (Physical Properties) of Structurally Modified Lead Compounds of Thiophene Derivatives
Rama Krishna Empati,1 Suman Sirimulla,2 G. Nagrajan,3 K.S. Manjunath,3 and S. Mohan3
(1) Department of Chemistry, University of Texas at El Paso (2) PES College of Pharmacy, Bangalore, India (3) SSR College of Pharmacy, Mahabubnagar, India
Abstract
A novel series of thiophene compounds with chlorine as a substituent is considered. These compounds were chosen because chlorine-containing drugs, such as the β-lactam antibiotics cloxacillin and dicloxacillin and the azole antifungals clotrimazole, miconazole, and ketoconazole, have been synthesized and screened for antifungal and antimicrobial activity. Using computational software (Gaussian, PC Model, Titan), we have calculated physical properties of the structurally modified lead compounds of thiophene derivatives, including melting point, heat of formation, HOMO, LUMO, dipole, area, volume, electronic charges, and energy.
Applying a Hybrid Data Mining Approach to
Tumor Malignancy Prediction
Tzu-Liang (Bill) Tseng, Udayvarun Konada, Alexander Nadackal,
and Kalyan Aleti
Department of Mechanical and Industrial Engineering
University of Texas at El Paso
Abstract
Automated decision support for clinicians has been proposed in recent years. However, little work has been devoted to the development of computer-based systems to support clinicians' judgments and diagnoses. This paper presents a new hybrid approach to automated clinical decision support. The approach consists of a novel rough-set method for feature selection and an enhanced support vector machine algorithm for accurate prediction. The approach can derive decision rules and identify the most significant features simultaneously, which makes it useful for medical decision problems. We tested the approach using data from diagnoses of real patients with solitary pulmonary nodules, an indication of potential lung malignancy. Variants of the approach achieved over 90 percent diagnostic accuracy, and the derived rules were shown to assist further examination effectively. This research thus contributes to developing and validating a useful approach to automated clinical decision support.
"Histrionics": A Database Mining Approach for Classification of Functional Disorders of the Autonomic Nervous System
Elise Marshall
Bioinformatics Program, University of Texas at El Paso
Abstract
A statistical association approach applied to medical history information provides a means to characterize syndromes, potentially facilitating identification of pathophysiological mechanisms. In dysautonomias, altered function of one or more components of the autonomic nervous system adversely affects health. Chronic orthostatic intolerance (COI) syndromes exemplify dysautonomias in which the patient cannot tolerate prolonged standing. Postural tachycardia syndrome (POTS) is characterized by an excessive increment in heart rate during standing, and neurocardiogenic syncope (NCS), the most common cause of acute loss of consciousness in adults, can be evoked by orthostasis. For instance, the symptom cluster in POTS could reflect decreased venous return to the heart and compensatory activation of the sympathetic nervous and adrenomedullary hormonal systems.
The Function of Protein Disulfide Isomerase
Yu-Hsiang Wang1 and Mahesh Narayan2
(1) Department of Biological Sciences, University of Texas at El Paso (2) Department of Chemistry, University of Texas at El Paso
Abstract
Multi-disulfide-bond-containing proteins acquire their native structures through an oxidative folding reaction involving the formation of native disulfide bonds and native structure through thiol-disulfide exchange reactions and a conformational folding event, respectively. In many proteins, the rate-determining step in oxidative folding involves the formation of a structured intermediate from its unstructured isomers through isomerisation of non-native disulfide bonds to the native ones coupled with the conformational folding reaction; the ensuing native-like tertiary structure protects the formed native disulfides from further thiol-disulfide isomerisation reactions. In vivo, the 56-kDa oxidoreductase protein disulfide isomerase (PDI) catalyzes oxidative protein folding of “substrate proteins” before export to their respective extracellular environments.
We have studied the PDI-catalyzed formation of des [40-95], a three-disulfide-bond-containing structured intermediate of the four-disulfide-bond-containing protein bovine pancreatic ribonuclease A (RNase A) from its unstructured isomers as a function of pH. Our data indicate that PDI has the greatest impact on the reaction-rate at pH 7, with decreasing influence as the pH of the reaction environment is increased.
Given the anomalously low pKa (6.7) of a PDI thiol, our results demonstrate that the isomerisation activity of PDI is ideally suited to the environs of the lumen of the ER, where the pH is ~7 and uncatalyzed thiol-disulfide reactions are inherently slow. These results have important implications for the development of PDI mimics that might eventually be used as chemotherapeutics for alleviating misfolding-related diseases such as Alzheimer’s, Parkinson’s and Creutzfeldt-Jakob disease.
Small-molecule Catalyzed Oxidative Protein Folding:
The Quest for In Vivo Chemotherapeutics
Paul Nieves,1,2 Saemin Chang,2 Matthew Fink,2 Luis Martínez,2
and Mahesh Narayan2
(1) Universidad Metropolitana, PR (2) Department of Chemistry, University of Texas at El Paso
Abstract
Multi-disulfide-bond-containing proteins acquire their native structures through an oxidative folding reaction, a process involving the formation of the native set of protein disulfide bonds through thiol-disulfide exchange reactions (oxidation, isomerisation and reduction) of their cysteines/disulfides coupled with a conformational folding event. In vivo, the 56-kDa oxidoreductase protein disulfide isomerase (PDI) catalyzes oxidative folding reactions in the lumen of the ER prior to export of the “substrates” (disulfide-bond-containing proteins) to their extracellular environs.
The oxidative folding rate of the four-disulfide-bond-containing protein bovine pancreatic ribonuclease A (RNase A) was examined in the presence of a synthetic small-molecule dithiol, (+/-)-trans-1,2-bis(2-mercaptoacetamido)cyclohexane (BMC), alone and in combination with a naturally occurring osmolyte, trimethylamine-N-oxide (TMAO). The results indicate that the oxidative folding rate of RNase A is enhanced 2-fold by the presence of BMC (0.4 mM) and 3-fold by the combined presence of the dithiol (0.4 mM) and the osmolyte (0.2 M) relative to the control experiment.
Current efforts are geared towards the synthesis of a second-generation small-molecule mimic of PDI, viz., (+/-)-trans-1,2,4,5-tetra(2-mercaptoacetamido)cyclohexane, which will be tested for its efficacy in catalyzing oxidative folding reactions. The ultimate objective is the synthesis of a small-molecule chemotherapeutic that can be used to catalyze in vivo protein folding, thereby alleviating misfolding-related diseases such as Alzheimer’s, Parkinson’s and Creutzfeldt-Jakob disease.
Cross-validated QSAR Studies of a Systematic Simple
Traditional Protocol versus Fallacious and Complicated
Suman Sirimulla, Carrie Ash-Mott, and William C. Herndon
Department of Chemistry, University of Texas at El Paso
Abstract
A common procedure for QSAR analysis consists of data selection (generally sets of congeneric compounds and their corresponding biological activities), tabulation of trial physicochemical or ad hoc molecular structural descriptors, and a multilinear statistical analysis that derives a statistically valid QSAR correlation of the activity data from a subset of the trial descriptors. A final important step is cross-validation to assess the putative predictive (rather than just correlative) capabilities of the derived QSAR model equation.
The results presented in this study consist of an analysis of three recent cross-validated studies in which antimalarial activities of a set of aromatic mefloquine derivatives are correlated with calculated atomic charges using increasingly complex statistical procedures. The reported conclusions are that these methods give high-quality statistical results, providing useful techniques with very good predictive power. However, these conclusions are negated by the fact that over 60% of the compounds (13 out of 21) in the study are assumed to have nonsensical, fictitious structures.
The perceived high quality of the overall statistical results may indicate deficiencies in the modeling protocols used in the above studies and in the rationales that have been used to justify cross-validation procedures. In particular, the interpretation of cross-validation results as a measure of the predictive power of a QSAR model will be criticized. We argue that cross-validation is valuable primarily for establishing the robustness of the fit to a model equation and that the leave-one-out procedure, in particular, gives useful information about outliers.
Finally, we present the results of a very successful elementary QSAR study that uses substituent indicator variables, coupled with two calculated theoretical AM1 parameters, for the actual compounds used in the work outlined above.
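The leave-one-out procedure mentioned in this abstract can be sketched for the simplest case, a one-descriptor linear model: each compound is left out in turn, the line is refit on the rest, and the left-out activity is predicted. This is a minimal illustration under stated assumptions, not the authors' protocol; the descriptor values and activities are made up, and the summary statistic is the commonly used cross-validated q2 = 1 - PRESS / SS_total.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out(xs, ys):
    """Return (per-compound LOO residuals, cross-validated q2)."""
    residuals = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        residuals.append(ys[i] - (slope * xs[i] + intercept))
    press = sum(r * r for r in residuals)          # predictive residual sum of squares
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)  # total sum of squares
    return residuals, 1.0 - press / ss_total


# Hypothetical descriptor values and activities; the last compound is an outlier.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.0, 4.0, 6.0, 8.0, 20.0]

residuals, q2 = leave_one_out(descriptor, activity)
worst = max(range(len(residuals)), key=lambda i: abs(residuals[i]))
print("LOO residuals:", [round(r, 2) for r in residuals])
print("compound with largest LOO residual:", worst)
```

In this toy set the outlying compound has the largest leave-one-out residual, which is the kind of diagnostic information about outliers that the abstract attributes to the procedure.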
OGPET v1.0: Prediction of mucin-type O-glycosylation residues using variation profiling.
Rafael Torres, Jr.,1 Yash Dayal,2 Ming-Ying Leung,2 and Igor Almeida1
(1) Department of Biological Sciences, Border Biomedical Research Center, University of Texas at El Paso. (2) Department of Mathematical Sciences, Bioinformatics Program, Border Biomedical Research Center, University of Texas at El Paso.
Abstract
O-Glycosylation (OG) is a key post-translational modification of proteins that is considerably altered in certain pathologies (e.g., cancer). Owing to its potential diagnostic and therapeutic relevance, a few algorithms for prediction of OG sites have been developed. However, these algorithms exhibit rather low specificity in predicting true OG sites. Based on experimentally mapped mucin-type OG residues, we have developed an algorithm, the O-Glycosylation Prediction Electronic Tool (OGPET), which shows very high sensitivity and specificity. OGPET builds amino acid (aa) prediction motifs from 5 relevant positions (-3, -1, +1, +3, and +4) around the candidate Thr/Ser residue (position 0); these positions are known to influence the interaction of the polypeptide GalNAc-transferase (ppGalNAcT) with the target protein. Furthermore, analysis of the physical and chemical properties of aa allowed the algorithm to interchange aa in any of the 5 relevant positions without increasing the rate of false-positive predictions. Our results showed a sensitivity of 0.97 and a specificity of 0.98 in standard performance tests. Using the aa variation approach (variation profiling), OGPET predicted true-positive sites despite mutations in the protein primary sequence. Finally, a new set of prediction constraints was able to find novel sites that were not originally included in the training sets. OGPET is currently available through the WWW (http://129.108.112.23/OGPET/).
Project supported by Grant #5G12RR008124 from the National Center for Research Resources (NCRR)/NIH. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of NCRR or NIH. R.T., Jr. is the recipient of an NIH/MARC U*STAR scholarship.
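The positional scheme described in the OGPET abstract (five positions around each candidate Ser/Thr) and the two reported performance measures can be sketched as follows. This only illustrates how such a context window and the sensitivity/specificity ratios are formed; it does not reproduce OGPET's motifs or variation profiling, and the function names, toy sequence, and counts are hypothetical.

```python
# Offsets relative to the candidate Ser/Thr (position 0), as listed in the abstract.
RELEVANT_OFFSETS = (-3, -1, 1, 3, 4)


def candidate_windows(sequence):
    """Yield (position, residue, context) for every Ser/Thr in the sequence.

    context holds the residues at the relevant offsets; "-" marks offsets
    that fall outside the sequence.
    """
    windows = []
    for pos, residue in enumerate(sequence):
        if residue not in "ST":
            continue
        context = tuple(
            sequence[pos + off] if 0 <= pos + off < len(sequence) else "-"
            for off in RELEVANT_OFFSETS
        )
        windows.append((pos, residue, context))
    return windows


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Toy peptide (hypothetical) with one Ser and one Thr.
windows = candidate_windows("APGSTAPPA")
for pos, residue, context in windows:
    print(pos, residue, context)

# Hypothetical confusion-matrix counts, for illustration only.
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=9, fp=1)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A predictor built on such windows would compare each context tuple against its learned motifs; how OGPET does that comparison is not specified in the abstract.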
RNAVLab: An Open-source User-friendly Virtual Laboratory
for the Study of RNA Secondary Structures
Michela Taufer,1 Ming-Ying Leung,2 Kyle Johnson,3 Abel Licon,1 Prayook Tungjatooronrusamee,2 Yash Dayal,2 Daniel Catarino,1 Hao Lei2
(1) Computer Science Department, University of Texas at El Paso (2) Bioinformatics Program, University of Texas at El Paso (3) Department of Biological Sciences, University of Texas at El Paso
Abstract
The goal of the RNAVLab project is to design and build an adaptive grid computing system that, at runtime, identifies and exploits computer resources across the University of Texas at El Paso (UTEP) campus to study secondary structures of large numbers of RNA segments using a variety of prediction programs. The grid environment at UTEP is based on a unified software tool for RNA secondary structure prediction, alignment, comparison, and classification. Our tool uses grid computing to build the computing power needed for predictions of large RNA sequences. New features are easy to integrate into the tool because of its modularity. We are currently using the tool to study the prediction accuracy of a variety of codes for RNA secondary structure prediction, including pseudoknots; to identify common motifs and their functions in virus secondary structures, e.g., in viral replication; and to identify common pseudoknots across viruses within the same family, species, or genus.
DAPLDS: Dynamically Adaptive Protein-ligand Docking System Using Volunteer Computing
Michela Taufer,1 Patricia J. Teller,1 Martine Ceberio,1 David Anderson,2 Charles L. Brooks III,3 Andre Kerstens,1 Trilce Estrada,1 David Flores,1 Richard Zamudio,1 Karina Escapita,1 Guillermo Lopez,1 Roger Armen3
(1) Computer Science Department, University of Texas at El Paso (2) Space Sciences Laboratory, The University of California at Berkeley (3) Department of Molecular Biology, The Scripps Research Institute
Abstract
DAPLDS or Dynamically Adaptive Protein-Ligand Docking System is a project that involves collaboration among the University of Texas - El Paso, The Scripps Research Institute (TSRI), and the University of California - Berkeley. This project, through implementation and use of a cyber tool, DAPLDS, that enables adaptive multi-scale modeling in a GC environment, will further knowledge of the atomic details of protein-ligand interactions and, by doing so, will accelerate the discovery of novel pharmaceuticals. The goals of the project are: (1) to explore the multi-scale nature of algorithmic adaptations in protein-ligand docking and (2) to develop cyber infrastructures based on computational methods and models that efficiently accommodate these adaptations.
Topaz: A Friendly Tool for Scientists to Access Data
on Grid Repositories
Richard Zamudio,1 Daniel Catarino,1 Michela Taufer,1 Karan Bhatia,2 and Brent Stern2
(1) Computer Science Department, University of Texas at El Paso (2) San Diego Supercomputer Center, University of California at San Diego
Abstract
As grid infrastructures mature, an increasing challenge is to provide end-user scientists with intuitive interfaces to computational services, data management capabilities, and visualization tools. The current approach used in a number of cyber-infrastructure projects is to leverage the capabilities of the Mozilla framework to provide rich end-user tools that seamlessly integrate with remote resources such as web/grid services and data repositories.
The goal of this project is to provide the scientific community with a user-friendly, efficient interface to grid technologies. To that end, we are designing and implementing Topaz, an open-source GridFTP protocol extension to the Firefox browser. In the design, implementation and performance analysis of Topaz, we have been guided by rigorous software engineering tools such as Data Flow Diagrams (DFDs). GridFTP servers, similar to FTP servers used on the Internet, provide a data repository for files and are optimized for grid use (support for very large file sizes, high-performance data transfer, third-party transfer, integration with Grid Security Infrastructure). Topaz gives scientists a familiar and user-friendly interface for accessing arbitrary GridFTP servers: it provides upload and download functionality and supports obtaining and managing certificates.