OVERVIEW

1. About the resource for array genes.

2. Linking to cumulative gene information through database, spreadsheet or data analysis software

3. About data collection, storage, updates and data relationship.

4. Frequently asked questions about the data.

5. Literature citations for this resource and how to cite or acknowledge it.

 

About this resource

This resource, BioRag (Bio Resource for Array Genes), is an online platform designed to help researchers analyze the results of microarray experiments and develop a biological interpretation of them. Its main purpose is to help interpret unique gene expression patterns as biological changes that can lead to new diagnostic procedures and drug targets.

This interactive site allows users to selectively view a variety of information about gene functions that is stored in an underlying database. Although other online resources provide comprehensive annotation and summaries of genes, this resource differs from them by also enabling researchers to mine biological relationships among the genes captured in the database using new query tools. This provides a unique way of interpreting microarray results based on what is known about the cellular roles of genes and proteins.

Six query tools are provided, each offering different search features, analysis options, and forms of data display and visualization. The data is collected in a relational database from public resources: Unigene, Locus Link, OMIM, NCBI dbEST, protein domains from NCBI CDD, Gene Ontology, pathways (KEGG, GenMAPP and BioCarta) and BIND (protein interactions). Data is collected and compiled dynamically from these public databases twice a week. The search options can organize and cluster genes by their interactions in biological pathways, their association with Gene Ontology terms, tissue- or organ-specific expression, or any other user-chosen functional grouping of genes. A color-coding scheme highlights differential gene expression patterns against a background of gene functional information. The Anatomy and Diseases concept hierarchies of MeSH (Medical Subject Headings) terms are used to organize and display the data related to tissue-specific expression and diseases.

For more details on the functionality of the available tools, click the Help button in the top bar. Step-by-step instructions for entering microarray data and an explanation of the results are provided for each individual tool.

Because this resource is still in development, many more features will be added over time.

Linking to cumulative gene information through database, spreadsheet or data analysis software

Detailed gene information can be downloaded for a user-chosen dataset using the Gene Info tool. The collective data held in this resource for any gene can also be accessed directly from Excel spreadsheets or other software by linking to one of the following URLs:

By Genbank accession : http://www.biorag.org/perl/biorag.pl?id=xyz

By Unigene : http://www.biorag.org/perl/biorag.pl?uid=xyz

By Locus Link : http://www.biorag.org/perl/biorag.pl?lid=xyz

where xyz is a Genbank accession, a Unigene ID (hs.xyz or mm.xyz) or a Locus Link ID, respectively.
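
As a minimal sketch of generating such links for a spreadsheet, the Python snippet below builds the three URL forms listed above and can write them to a CSV file that Excel opens directly. Only the base URL and the id, uid and lid parameters come from this page; the function names, the CSV column names and the sample identifiers are hypothetical placeholders.

```python
import csv
from urllib.parse import urlencode

BASE_URL = "http://www.biorag.org/perl/biorag.pl"

# Query parameter for each identifier type, as listed on this page.
PARAM_BY_TYPE = {"genbank": "id", "unigene": "uid", "locuslink": "lid"}


def biorag_url(id_type, identifier):
    """Return the link for one identifier (the function name is illustrative)."""
    if id_type not in PARAM_BY_TYPE:
        raise ValueError(f"unknown identifier type: {id_type!r}")
    # strip() removes stray spaces that would otherwise end up in the URL
    query = urlencode({PARAM_BY_TYPE[id_type]: identifier.strip()})
    return f"{BASE_URL}?{query}"


def write_link_table(rows, path):
    """Write (id_type, identifier, url) rows to a CSV file for a spreadsheet."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id_type", "identifier", "url"])
        for id_type, identifier in rows:
            writer.writerow([id_type, identifier, biorag_url(id_type, identifier)])


if __name__ == "__main__":
    # Placeholder identifiers in the style of the "xyz" examples; use real ones.
    for id_type, identifier in [("genbank", "xyz"), ("unigene", "hs.xyz"), ("locuslink", "xyz")]:
        print(biorag_url(id_type, identifier))
```

Calling write_link_table with a list of (type, identifier) pairs and an output path of your choice produces a three-column table whose last column can be turned into clickable links in the spreadsheet.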

To link Genbank accessions from the Gene Inspector tool of GeneSpring:
Follow the instructions on the GeneSpring FAQ site at http://www.sigenetics.com/cgi/HelpFaqGen.cgi?how_buttons

Go to GeneSpring/data and locate the folder of the genome you are working on (e.g. data/Demo Chips/Human/). At the top level of this folder, find the file whose name ends in the extension ".genomedef" (not ".genomedef.backup"). Open this file in Notepad or a similar text editor.
To add a web link to this resource, add the following line:
GeneHypertextLinks : BIORAG:http://www.biorag.org/perl/biorag.pl?id=<genbank>
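
The manual edit described above could also be scripted. The following Python sketch appends the link line to a .genomedef file only if it is not already there, after saving a copy of the original. The function name and the ".bak" copy suffix are assumptions made for illustration; only the GeneHypertextLinks line itself is taken from this page, and the layout GeneSpring expects inside the file should be checked against its own documentation.

```python
import shutil
from pathlib import Path

# The link line given on this page.
LINK_LINE = "GeneHypertextLinks : BIORAG:http://www.biorag.org/perl/biorag.pl?id=<genbank>"


def add_biorag_link(genomedef_path):
    """Append LINK_LINE to the file unless present; return True if the file changed."""
    path = Path(genomedef_path)
    text = path.read_text()
    if any(line.strip() == LINK_LINE for line in text.splitlines()):
        return False  # already configured, leave the file untouched
    # Keep a copy of the original next to it (the ".bak" suffix is our own choice).
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    # Make sure the new entry starts on its own line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + separator + LINK_LINE + "\n")
    return True
```

The function would be called with the full path of the ".genomedef" file found in the genome folder. Running it a second time leaves the file unchanged.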

About data collection, storage, relationship and updates.

To provide an up-to-date version of the genes, the data in this resource is updated twice a week from NCBI dbEST, Unigene and Locus Link. The date of the latest update is given under the Update link. The data associated with the genes (that is, with their Unigene, Locus Link and Genbank accessions), namely pathways, diseases and protein interactions, is updated according to the latest changes (e.g. a changed Unigene entry) delivered by the update. The relational database that stores the functional relationships resides on a Sun Fire 280R Unix server with a 900 MHz processor and a 700-gigabyte RAID system for data storage. MySQL is used as the relational database management system, and the website is served by the Apache server.

All data is collected from external resources, so annotation quality and accuracy are as provided by the parent annotation resources. This resource collects data from various open public resources and integrates it in a relational schema that can then be accessed through a common platform. Some of these external data resources have terms governing the use of their site and data. Wherever applicable, you should review their terms of agreement before using the data.

Relationships between the biological entities have been set up using the following entity-relationship (ER) diagram. The database was developed from this ER schema.

Frequently asked questions about the data

1. No annotation found for a gene?

We have tried to capture all genes based on the combined information from Unigene and Locus Link. A gene is represented in our local database only if it is present in at least one of these two NCBI resources. If a query accession you provide is not included in either of them, no information will be available for that accession.

2. The NCBI version of a record differs from the one shown on this site on a given day?

The data is updated from NCBI twice a week, so changes made between updates are not reflected in this resource until the next update. The updates are performed through the NCBI FTP site, and the contents are taken from the NCBI data directories. We rely on NCBI for accurate and up-to-date information.

3. You know that certain data for your gene of interest is present at a third-party website, but it does not appear in this resource even though data from that site is included in our database?

All information about a gene from other databases, such as interactions and pathways, is collected using identifiers: Locus Link IDs, Unigene IDs, Genbank accessions or the HUGO gene nomenclature. At least one of these must be present in the external resource for the related information to be integrated with a given gene. We may miss information on a few genes if proper identifiers are not found, but every effort has been made to capture all the information.
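
The identifier-based integration described above can be pictured with a small Python sketch. It is an illustration only and not the code behind this resource: the dictionary keys, the function names and the sample values are all made up, and the real database performs this matching in its relational schema.

```python
# Identifier types named in the text; the dictionary keys are our own naming.
ID_FIELDS = ("locus_link", "unigene", "genbank", "hugo_symbol")


def matching_field(gene, record):
    """Return the first identifier field on which gene and record agree, or None."""
    for field in ID_FIELDS:
        value = record.get(field)
        if value and gene.get(field) == value:
            return field
    return None


def integrate(genes, records):
    """Attach external records to genes that share at least one identifier.

    Returns (annotations, missed): annotations[i] lists the records matched to
    genes[i]; missed lists the records that matched no gene at all.
    """
    annotations = [[] for _ in genes]
    missed = []
    for record in records:
        matched = False
        for i, gene in enumerate(genes):
            if matching_field(gene, record) is not None:
                annotations[i].append(record)
                matched = True
        if not matched:
            missed.append(record)
    return annotations, missed


if __name__ == "__main__":
    # Made-up genes and external records, purely for illustration.
    genes = [
        {"locus_link": "L1", "unigene": "hs.1", "genbank": "ACC1", "hugo_symbol": "GENEA"},
        {"locus_link": "L2", "unigene": "hs.2", "genbank": "ACC2", "hugo_symbol": "GENEB"},
    ]
    records = [
        {"pathway": "example pathway", "hugo_symbol": "GENEA"},
        {"interaction": "example interaction", "unigene": "hs.2"},
        {"pathway": "orphan pathway", "name": "free-text name only"},
    ]
    annotations, missed = integrate(genes, records)
    print(len(annotations[0]), len(annotations[1]), len(missed))
```

The third sample record carries only a free-text name and none of the supported identifiers, so it ends up in the missed list. This is the situation the answer above describes.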

Literature Citations and how to Cite or acknowledge this resource

Publications and presentations that have cited/used this site:

Technology in Cancer Research and Treatment 5(6):553-64 Dec 2006. Ignatenko NA, Yerushalmi HF, Watts GS, Futscher BW, Stringer DE, Marton LJ, Gerner EW. Pharmacogenomics of the Polyamine Analog 3,8,13,18-tetraaza-10,11-[(E)-1,2-cyclopropyl]eicosane Tetrahydrochloride, CGC-11093, in the Colon Adenocarcinoma Cell Line HCT116.

Virology 348(1):242-252 April 2006. Thomas MJ, Agy MB, Proll SC, Paeper BW, et al. Functional gene analysis of individual response to challenge of SIVmac239 in M. mulatta PBMC culture.

Inflammatory Bowel Diseases 12(4):278-293, April 2006. Bernstein H, Holubec HMS, Bernstein C, Ignatenko N, Gerner E, et al. Unique Dietary-Related Mouse Model of Colitis.

Pharmacogenomics 7(3):407-419 April 2006. Broderick G, Craddock C, Whistler T, Taylor R, et al. Identifying illness parameters in fatiguing syndromes using classical projection methods.

Proteomics March 2006. Segura V, Podhorski A, Guruceaga E, Sevilla JL, et al. GARBAN II: An integrative framework for extracting biological information from proteomic and genomic data.

Cancer Research 66, 1114-1122, January 15, 2006. Chemnitz JM, Driesen J, Classen S, et al. Prostaglandin E2 Impairs CD4+ T Cell Activation by Inhibition of lck: Implications in Hodgkin's Lymphoma.

Trends in Genetics 21(10):553-8 Oct 2005. Seifert M, Scherf M, Epple A, Werner T. Multievidence microarray mining.

Trends In Biotechnology 23 (8): 429-435 Aug 2005. Curtis RK, Oresic M, Vidal-Puig A. Pathways to the analysis of microarray data.

Nucleic Acids Res 33: p. W633-W637 Jul 1 2005. Mlecnik B, Scheideler M, Hackl H, et al. PathwayExplorer: web service for visualizing high-throughput expression data on biological pathways.

BMC Bioinformatics 6: Art. No. 163 Jun 29 2005. Breslin T, Krogh M, Peterson C, et al. Signal transduction pathway profiling of individual tumor samples.

Drug Discovery Today 10 (10): 727-734 May 15 2005. Cavalieri D, De Filippo C. Bioinformatic methods for integrating whole-genome expression results into cellular networks.

British Journal Of Nutrition 93 (4): 425-432 Apr 2005. Garosi P, De Filippo C, van Erk M, et al. Defining best practice for microarray analyses in nutrigenomic studies.

Chemical Research In Toxicology 18 (3): 403-414 Mar 2005. Hayes KR, Bradfield RA. Advances in toxicogenomics.

Current Molecular Medicine 5 (1): 11-21 Feb 2005. Yue L, Reisdorf WC. Pathway and ontology analysis: Emerging approaches connecting transcriptome data and clinical endpoints.

J Biol Chem. 279(2):937-44. Jan 9 2004. Chauhan S, Davis K, et al. Androgen control of cell proliferation and cytoskeletal reorganization in human fibrosarcoma cells: Role of RhoB signaling.

Toxicology And Industrial Health, Oct 2003, Vol. 19, No. 7-10, 157-163. Sun NN, Fastje CD, et al. Dose-dependent transcriptome changes by metal ores on a human acute lymphoblastic leukemia cell line.

Biochem Biophys Res Commun. Oct 2003;310(2):421-32. Chauhan S, Pandey R, Way JF, et al. Androgen regulation of the human FERM domain encoding gene EHM2 in a cell model of steroid-induced differentiation.

J Steroid Biochem Mol Biol. 84(4):441-52 Mar 2003. Chauhan S, Leach CH, Kunz S, et al. Glucocorticoid regulation of human eosinophil gene expression.

Gene Expression of DU-145 Cells Stimulated with Human Laminin 5 or Laminin 10. Beck SK, Hoying J, Pandey R, Calaluce R, Barrera J, Mount DW, Nagle RB. (43rd ASCB annual meeting 03 Poster presentation)

Publications and presentations on this resource:

Bioinformatics 2004 20(13):2156-8. Pandey R, Guru RK and Mount DW. Pathway Miner: Extracting Gene Association Networks from Molecular Pathways for Predicting the Biological Significance of Gene Expression Microarray Data.

BioRag (Bio Resource for Array Genes): An Online Resource for Analyzing and interpreting Microarray data.
Pandey R, Guru RK and Mount DW (ISMB 03 Poster presentation).

Steroid Regulated Gene Expression Database. Pandey R, Appikatla V, Chauhan S, Mount DW and Miesfeld RL (2002). (Genomics & Proteomics in Endocrinology 02 Poster Presentation)

Citing Bioresource: If you have used this website for your research, please cite or acknowledge this resource in your publications or presentations as "Bioresource for array genes at www.biorag.org."

Citing BioRag: If you have used or are using BioRag for your research, please cite or acknowledge this resource in your publications as "BioRag (Bioresource for array genes) at www.biorag.org".


For any comments or questions, contact Dr. Ritu Pandey or Prof. David Mount.
The BioRag database is maintained by the Bioinformatics group at the Arizona Cancer Center. The material presented here is compiled from different public databases. BioRag is hosted by the Biotechnology Computing Facility of the University of Arizona. © 2002, 2003 University of Arizona. All Rights Reserved.