
INBRE BIOINFORMATICS CORE

          Bioinformatics Core Director David Dyer, Ph.D.

           Professor
           Microbial Pathogenesis and Microbial Genomics

           University of Oklahoma Health Sciences Center

           Biomedical Research Center, room 362

           975 NE 10th Street

           Oklahoma City, OK  73104

           405.271.1201  x1  

           david-dyer@ouhsc.edu 

 

OKRA Project    

Bioinformatics Graduate Program    

Core Profile    

User Fees   

Quarterly Newsletter

Genomics/Proteomics/Bioinformatics Discussion Group

 

 

CORE PROFILE

The Bioinformatics Core Facility is located within the University of Oklahoma Health Sciences Center (OUHSC) Laboratory for Genomics and Bioinformatics (LGB).  The original Oklahoma BRIN bioinformatics core focused specifically on the analysis and database management of microarray data.  The INBRE Bioinformatics Core, by contrast, is more flexible and diverse in its offerings.  The core facility provides genomics and informatics support to the OUHSC community, DNA sequence analysis and data cleanup, and advice on taxonomic analysis of this DNA sequence data, in addition to supporting proposed microarray experiments.  These efforts are coordinated with the INBRE Microarray Core, located at the Oklahoma Medical Research Foundation (OMRF), and with the satellite bioinformatics cores located on the University of Oklahoma Norman campus, at Oklahoma State University, and at The University of Tulsa.  Activities are directed towards integrating these groups into a coherent whole.  The Bioinformatics Core acts as a central repository of data for submission to the national repository GEO.  The Core works directly with the newly hired bioinformatics faculty members in the OUHSC Department of Biostatistics and Epidemiology to provide training and advice on the design of microarray experiments.

The Core recently implemented the Oklahoma Bioinformatics User's Group (OBUG) (http://obug.ouhsc.edu), which serves as a clearinghouse for bioinformatics exchanges in Oklahoma.  In addition, a quarterly newsletter and other information are available at http://microgen.ouhsc.edu.

Hardware.  Currently, the Bioinformatics Core has two available servers.  i) A Sun Enterprise 450 Unix server with RAID array and dedicated backup.  This is used for DNA sequence analysis and assembly, and is essential for some services that are provided by the Bioinformatics Core.   ii) A Dell PowerEdge 4400 running Red Hat Linux.  A BASE database is installed on this server for supporting microarray data capture and archival.  The BASE database format can export microarray data in MIAME format, using the MAGE-ML standard.

Staff.  Two personnel, Mr. Matt Carson, a certified Sun Systems Administrator, and Mr. Joshua Orvis, an Oracle-certified database manager, staff the Bioinformatics Core.  These individuals have worked in the OUHSC LGB for over two years, and Mr. Orvis has been supported by the Oklahoma BRIN to provide database and microarray analysis services to the OUHSC campus.  Both individuals have worked with members of the bioinformatics group in the OMRF Microarray Core facility during the Oklahoma BRIN funding period, have participated in the monthly BRIN programmers meetings, and will continue to do so during the proposed INBRE funding period.  These individuals also work directly with the newly hired bioinformatics faculty members in the OUHSC Department of Biostatistics and Epidemiology, to provide training and advice on the design of microarray experiments.  Lastly, these individuals have had extensive experience in genome sequence analysis and annotation, having participated in the sequencing and annotation of the Neisseria gonorrhoeae and Actinobacillus actinomycetemcomitans genome projects.  

 


 

Oklahoma Re-Annotation (OKRA) Project    

The Bioinformatics Core will engage students from the INBRE undergraduate campuses in the re-annotation of microbial genome sequence databases.  The one-semester project directs students to the primary literature to evaluate experimental “proof” of function.  Students will update and enhance the annotation of existing microbial genome sequence databases.  The re-annotated genome sequence database will be made publicly available on the website of the Bioinformatics Core (http://microgen.ouhsc.edu).  At the end of each project period, a one-day symposium will be held where two or three nationally recognized investigators who work on the organism in question will present seminars and meet with students.  Database papers will be published with students as co-authors.

Currently, one of the most vexing problems faced by the discipline of microbial genomics is the paucity of efforts for ongoing curation of the existing microbial genome sequence databases.  For instance, the Haemophilus influenzae genome sequence database was first published in 1995, and has not been updated since then.  This leads to a gradual degradation of the usefulness of this resource since no new studies are being incorporated into this database and the scientific community thus cannot derive its full benefit.  Funding agencies have typically been reluctant to devote resources to curation or “re-annotation” of these genome sequence databases, once established.  Re-annotation of these resources has therefore typically been an ad hoc process, generally performed by the original group that sequenced the genome. The process of updating these databases has generally been haphazard and non-uniform.  Further, re-annotation of a genome sequence database typically involves considerable effort but little explicit academic reward, which acts as a disincentive.  The OKRA project will allow for microbial genome sequence re-annotation as a way to introduce Oklahoma undergraduate students to bioinformatic analyses of genome sequence data, and at the same time update and enhance the annotation of existing microbial genome sequence databases.   


  The OKRA project will proceed as follows:

1) Using the facilities of the INBRE Bioinformatics Core and in consultation with its personnel, the Multi-campus Bioinformatics Education (MBE) specialist will download the genome sequence of a microbial genome of interest.  This will then be sent via ftp to The Institute for Genomic Research (TIGR) for analysis by their automated Annotation Engine (see below).  The results of this automated process, in a MySQL database, will be returned to the Bioinformatics Core at OUHSC.  This MySQL database can be viewed using a web-based TIGR application called Manatee, so that manual annotators can examine each predicted gene and the relevant information used to predict the function of the protein encoded by that gene.  This system is already in place on the OUHSC campus, as Dr. Dyer, the Bioinformatics Core Director, is currently using this two-step automated/manual annotation process for annotation of several ongoing genome-sequencing projects.

2) On each of the participating INBRE undergraduate campuses, the campus representatives will recruit outstanding undergraduate students (typically, at least five per campus) who will participate in a manual second-pass annotation of this genome sequence data.  This will begin with a one- or two-day initial training session in which the MBE specialist will train participating undergraduate faculty and students on genome sequence annotation.  This training also will include presentations by Dr. Dyer and co-workers on the process of microbial genome sequencing.

3) The aggregate student group from all campuses will be divided into two annotation teams, which will work in parallel to manually annotate the genome sequence.  That is, each open reading frame will be annotated independently by at least two students.  This arrangement will allow the INBRE MBE specialist and undergraduate faculty to arbitrate differences between the annotation decisions made by each undergraduate student.  Each student will be assigned a specific group of predicted genes, and will examine the results of the automated annotation to determine whether to accept the automated gene call, or modify this appropriately.  As required, each student will be expected to go to the primary literature and obtain publications that support the annotation decision made by that student.  Since the basis for this manual annotation process uses the web-based TIGR utility Manatee, the annotation can be performed over the Internet at each undergraduate campus, with each student supported by faculty on that campus.  Additional support and advice will be provided by the MBE specialist and will include periodic visits to each campus.

4) Annotation of a typical 2 Mb microbial genome should take approximately one semester.  At the end of this period, all students and faculty will be invited to the OUHSC campus for a one-day symposium focused on the organism whose genome sequence has been re-annotated by the students.  During this symposium, outstanding scientists from throughout the nation who work on the organism in question will be invited to give a lecture on their work for the students, and will attend a luncheon with the students.  This symposium also will include a poster session where all students will present a discussion of the region of the genome that they have annotated.

5) The re-annotated genome sequence database will be made publicly available on the website of the Bioinformatics Core (http://microgen.ouhsc.edu) and the contribution of each undergraduate campus and each student will be noted on this website.  We also will publish a description of this database in the annual database issue of Nucleic Acids Research, and the contribution of each student will be noted by co-authorship on this publication.
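The double-annotation arrangement in step 3 can be illustrated with a short sketch.  This is not the Manatee software or its database schema; it simply assumes that each team's calls have been exported as a mapping from an open reading frame identifier to a product name.  The identifiers, the product names, and the function names normalize and find_disagreements are all hypothetical.  The sketch lists the open reading frames on which the two teams disagree, which is the set that the MBE specialist and faculty would need to arbitrate.

```python
import re
from typing import Dict, List, Tuple

MISSING = "<missing>"  # placeholder used when one team has no call for an ORF


def normalize(call: str) -> str:
    """Lower-case a product name and collapse whitespace so that
    purely cosmetic differences are not reported as conflicts."""
    return re.sub(r"\s+", " ", call.strip().lower())


def find_disagreements(
    team_a: Dict[str, str], team_b: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (orf_id, call_a, call_b) for every ORF where the two
    independent annotations differ, or where only one team made a call."""
    conflicts = []
    for orf_id in sorted(set(team_a) | set(team_b)):
        call_a = team_a.get(orf_id, MISSING)
        call_b = team_b.get(orf_id, MISSING)
        if normalize(call_a) != normalize(call_b):
            conflicts.append((orf_id, call_a, call_b))
    return conflicts


# Hypothetical example data: ORF identifiers and product names are invented.
team_a_calls = {
    "ORF0001": "DNA gyrase subunit A",
    "ORF0002": "hypothetical protein",
    "ORF0003": "ABC transporter permease",
}
team_b_calls = {
    "ORF0001": "DNA gyrase  subunit a",
    "ORF0002": "transferrin-binding protein A",
}

for orf_id, call_a, call_b in find_disagreements(team_a_calls, team_b_calls):
    print(f"{orf_id}: team A = {call_a!r}, team B = {call_b!r}")
```

In this example only ORF0002 and ORF0003 are flagged; ORF0001 differs only in spacing and capitalization and is treated as agreement.  A real comparison would likely also weigh gene symbols, functional role categories, and the supporting literature cited by each student.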

We are currently exploring how to provide these students with coursework credit for this experience, so that it will become a part of their regular curriculum.   

 


 

 

Bioinformatics Graduate Program

A graduate program in Bioinformatics is currently under development on the University of Oklahoma Norman campus and will be expanded to other INBRE campuses over time.  This program will operate as an M.S. and Ph.D. degree-granting program.  The program is not an administratively independent unit; rather, it is an overarching graduate program residing in five host departments.  Studies in Bioinformatics at the University of Oklahoma are centered primarily in the Departments of Botany/Microbiology, Chemistry/Biochemistry, Zoology, Health Sciences Center Microbiology and Immunology, and Computer Science.

http://www.ou.edu/cas/zoology/Bioinformatics

The primary academic goals of the proposed program are to prepare students to:

1.  Serve as faculty in colleges and universities of the state, nation, and international community

2.  Serve in private biotechnology industry

3.  Assume leadership roles in their academic disciplines

4.  Develop the sophisticated abilities needed to understand complex genomic data

5.  Develop helpful and novel ways of applying this knowledge and disseminating it to the public.


User Fees

For several years, the OUHSC LGB has had a user reimbursement target of 50%.  While user fees are used to support a significant portion of the cost of the facility, it is unrealistic to expect that all costs will be supported, particularly those costs for equipment and software upgrades, training of new personnel and development of new services.  Equipment and software upgrades have been paid for by equipment grants.  Training of newly hired personnel can be costly and time-consuming: the training period and its success are variable, and the costs are difficult to estimate and even more difficult to amortize over a fluctuating user base.  Thus, these costs have been borne by the university with the understanding that, without this investment, the core facility will "die on the vine", with a consequent negative impact on the university research environment.  Routine user fees are published on the facility web page; specialty services are negotiated with users after consultation with LGB personnel.  Bioinformatics support is charged at $45/hour.  Microarray scanning services are currently $20/scan.
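As a rough illustration of how the two published rates combine, the sketch below estimates the charge for a small project.  The rates are the ones quoted above; the function name estimate_charge and the example workload are hypothetical, and specialty services, which are negotiated separately, are not covered.

```python
BIOINFORMATICS_RATE_PER_HOUR = 45  # dollars per hour, as quoted above
MICROARRAY_RATE_PER_SCAN = 20      # dollars per scan, as quoted above


def estimate_charge(support_hours: float, scans: int) -> float:
    """Estimate the routine user fee for a project.

    Covers only hourly bioinformatics support and microarray scans;
    negotiated specialty services are outside this sketch.
    """
    if support_hours < 0 or scans < 0:
        raise ValueError("hours and scans must be non-negative")
    return (support_hours * BIOINFORMATICS_RATE_PER_HOUR
            + scans * MICROARRAY_RATE_PER_SCAN)


# Hypothetical workload: 6 hours of support and 12 scans.
# 6 x $45 + 12 x $20 = $270 + $240 = $510
print(estimate_charge(6, 12))
```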

BRIN BIOINFORMATICS CORE PROGRESS

During the BRIN funding period, the Bioinformatics Core was located on the University of Oklahoma Norman campus.  The Core completed its first macroarray database, written in Oracle, and a second microarray database is nearing completion; similar, coordinated projects are underway at OUHSC, OSU and OMRF (written in MySQL and alternative database languages).  In addition, the international microarray database project, BASE, has been transferred to OMRF for distribution; OU staff have contributed significantly to debugging BASE for local installations.  The statistical approach developed at OMRF and the fruits of interactions of the OU Core with Statistica, an Oklahoma-based statistical software company, were implemented during the third year of the BRIN.