PMID- 21370079
OWN - NLM
STAT- MEDLINE
DCOM- 20110609
LR  - 20110303
IS  - 1940-6029 (Electronic)
IS  - 1064-3745 (Linking)
VI  - 719
DP  - 2011
TI  - Omics data management and annotation.
PG  - 71-96
LID - 10.1007/978-1-61779-027-0_3 [doi]
AB  - Technological Omics breakthroughs, including next generation sequencing, bring 
      avalanches of data which need to undergo effective data management to ensure 
      integrity, security, and maximal knowledge-gleaning. Data management system 
      requirements include flexible input formats, diverse data entry mechanisms and 
      views, user friendliness, attention to standards, hardware and software platform 
      definition, as well as robustness. Relevant solutions elaborated by the 
      scientific community include Laboratory Information Management Systems (LIMS) and 
      standardization protocols that facilitate data sharing and management. In project 
      planning, special consideration must be given to the choice of relevant Omics 
      annotation sources, since many of them overlap and require sophisticated 
      integration heuristics. The data modeling step defines and categorizes the data 
      into objects (e.g., genes, articles, disorders) and creates an application flow. 
      A data storage/warehouse mechanism must be selected, such as file-based systems 
      and relational databases, the latter typically used for larger projects. Omics 
      project life cycle considerations must include the definition and deployment of 
      new versions, incorporating either full or partial updates. Finally, quality 
      assurance (QA) procedures must validate data and feature integrity, as well as 
      system performance expectations. We illustrate these data management principles 
      with examples from the life cycle of the GeneCards Omics project 
      (http://www.genecards.org), a comprehensive, widely used compendium of annotative 
      information about human genes. For example, the GeneCards infrastructure has 
      recently been changed from text files to a relational database, enabling better 
      organization and views of the growing data. Omics data handling benefits from the 
      wealth of Web-based information, the vast amount of public domain software, 
      increasingly affordable hardware, and effective use of data management and 
      annotation principles as outlined in this chapter.
FAU - Harel, Arye
AU  - Harel A
AD  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.
FAU - Dalah, Irina
AU  - Dalah I
FAU - Pietrokovski, Shmuel
AU  - Pietrokovski S
FAU - Safran, Marilyn
AU  - Safran M
FAU - Lancet, Doron
AU  - Lancet D
LA  - eng
PT  - Journal Article
PT  - Research Support, Non-U.S. Gov't
PL  - United States
TA  - Methods Mol Biol
JT  - Methods in molecular biology (Clifton, N.J.)
JID - 9214969
SB  - IM
MH  - Animals
MH  - Computational Biology/*methods/standards
MH  - Data Display
MH  - Databases, Genetic
MH  - Humans
MH  - Information Management/*methods/standards
MH  - Molecular Sequence Annotation/*methods/standards
MH  - Quality Control
MH  - Research Personnel
MH  - Software
EDAT- 2011/03/04 06:00
MHDA- 2011/06/10 06:00
CRDT- 2011/03/04 06:00
PHST- 2011/03/04 06:00 [entrez]
PHST- 2011/03/04 06:00 [pubmed]
PHST- 2011/06/10 06:00 [medline]
AID - 10.1007/978-1-61779-027-0_3 [doi]
PST - ppublish
SO  - Methods Mol Biol. 2011;719:71-96. doi: 10.1007/978-1-61779-027-0_3.