KEGG Overview
1. Genomes to Biological System
KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from genomic and molecular-level information. It is a computer representation of the biological system, consisting of molecular building blocks of genes and proteins (genomic information) and chemical substances (chemical information) that are integrated with the knowledge on molecular wiring diagrams of interaction, reaction and relation networks (systems information).
2. The KEGG Database
KEGG is an integrated database resource consisting of the seventeen databases shown below. They are broadly categorized into systems information, genomic information and chemical information and further subcategorized by color coding of web pages.
| Category | Database | Content |
| Systems information | KEGG PATHWAY | Pathway maps |
| Systems information | KEGG BRITE | Functional hierarchies |
| Systems information | KEGG MODULE | KEGG modules of functional units |
| Systems information | KEGG DISEASE | Human diseases |
| Systems information | KEGG DRUG | Drugs |
| Systems information | KEGG ENVIRON | Crude drugs and health-related substances |
| Genomic information | KEGG ORTHOLOGY | KEGG Orthology (KO) groups |
| Genomic information | KEGG GENOME | KEGG organisms with complete genomes |
| Genomic information | KEGG GENES | Gene catalogs with curation |
| Genomic information | KEGG DGENES | Gene catalogs with automatic annotation |
| Genomic information | KEGG SSDB | Sequence similarity database for GENES |
| Chemical information | KEGG COMPOUND | Metabolites and other small molecules |
| Chemical information | KEGG GLYCAN | Glycans |
| Chemical information | KEGG REACTION | Biochemical reactions |
| Chemical information | KEGG RPAIR | Reactant pair chemical transformations |
| Chemical information | KEGG RCLASS | Reaction class defined by RPAIR |
| Chemical information | KEGG ENZYME | Enzyme nomenclature |
These databases contain various data objects for computer representation of biological systems. Each database entry is therefore called a KEGG object and is identified by a KEGG object identifier, which typically consists of a database-dependent prefix and a five-digit number (see: KEGG objects).
| Release | Database | Object Identifier |
| 1995 | KEGG PATHWAY | map number |
| 1995 | KEGG GENES | locus_tag / GeneID |
| 1995 | KEGG ENZYME | EC number |
| 1995 | KEGG COMPOUND | C number |
| 2000 | KEGG GENOME | organism code / T number |
| 2001 | KEGG REACTION | R number |
| 2002 | KEGG ORTHOLOGY | K number |
| 2003 | KEGG GLYCAN | G number |
| 2004 | KEGG RPAIR | RP number |
| 2005 | KEGG BRITE | br number |
| 2005 | KEGG DRUG | D number |
| 2007 | KEGG MODULE | M number |
| 2008 | KEGG DISEASE | H number |
| 2010 | KEGG ENVIRON | E number |
| 2010 | KEGG RCLASS | RC number |
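The prefix-plus-number scheme can be illustrated with a small lookup. The following is a minimal sketch, not part of KEGG itself: the function name `classify_kegg_identifier` is hypothetical, the prefix table is copied from the identifier table above, and it assumes that every prefixed identifier is exactly a prefix followed by five digits. KEGG GENES entries (locus_tag / GeneID), EC numbers and organism codes do not follow this pattern and are deliberately left unrecognized. The identifier strings used in the usage lines are illustrative only.

```python
import re

# Prefix -> database, copied from the identifier table above.
# Prefixes are case-sensitive ("map" and "br" are lowercase).
PREFIX_TO_DATABASE = {
    "map": "KEGG PATHWAY",
    "br": "KEGG BRITE",
    "C": "KEGG COMPOUND",
    "T": "KEGG GENOME",
    "R": "KEGG REACTION",
    "K": "KEGG ORTHOLOGY",
    "G": "KEGG GLYCAN",
    "RP": "KEGG RPAIR",
    "RC": "KEGG RCLASS",
    "D": "KEGG DRUG",
    "M": "KEGG MODULE",
    "H": "KEGG DISEASE",
    "E": "KEGG ENVIRON",
}

# Assumption: a run of letters followed by exactly five digits.
_IDENTIFIER = re.compile(r"^([A-Za-z]+)(\d{5})$")


def classify_kegg_identifier(identifier):
    """Return the database name for a prefixed identifier, or None."""
    match = _IDENTIFIER.match(identifier)
    if match is None:
        return None
    # The whole letter run is looked up, so "RP" is never confused with "R".
    return PREFIX_TO_DATABASE.get(match.group(1))


if __name__ == "__main__":
    for example in ("C00031", "RP00001", "map00010", "1.1.1.1"):
        print(example, "->", classify_kegg_identifier(example))
```

Capturing the full letter run before the lookup avoids having to order the prefixes by length, at the cost of rejecting anything that is not strictly letters followed by five digits.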
3. KEGG Molecular Networks
The most distinctive data objects in KEGG are the molecular networks: molecular interaction, reaction and relation networks representing systemic functions of the cell and the organism. Experimental knowledge on such systemic functions is captured from literature and organized in the following three forms:
- Pathway map - in KEGG PATHWAY (see: Pathway maps)
- Hierarchical list (ontology) - in KEGG BRITE (see: Brite hierarchies)
- Membership list - in KEGG MODULE and KEGG DISEASE
In 1995 the concept of mapping was first introduced in KEGG for linking genomes to metabolic pathways (metabolic reconstruction) using the EC number. Once EC numbers were assigned to the enzyme genes in a genome, organism-specific pathways could be generated automatically by matching them against the enzyme (EC number) networks of the KEGG reference metabolic pathways. The EC number is no longer used as the identifier for this mapping; the KEGG Orthology (KO) system is now the basis for genome annotation and KEGG mapping.
| Period | Identifier | Reference knowledge | Assignment |
| 1995-1999 | EC number | Metabolic pathways | Domain based |
| 2000-2002 | Ortholog ID | Metabolic and regulatory pathways | Domain based |
| 2003- | KO | Pathways and BRITE hierarchies | Gene based |
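The matching step behind KEGG mapping can be sketched as a set intersection between the identifiers assigned to an organism's genes and the identifiers that make up a reference pathway. This is only an illustration under stated assumptions: the function name `map_genes_to_pathway`, the gene names and the K numbers are all hypothetical toy values, and a real reference pathway is a network rather than the flat set used here.

```python
def map_genes_to_pathway(reference_pathway, gene_annotations):
    """Match annotated genes against a reference pathway.

    reference_pathway: set of identifiers (EC numbers or KOs) in the pathway.
    gene_annotations:  dict mapping each gene to the set of identifiers
                       assigned to it by genome annotation.
    Returns a dict mapping each matched identifier to a sorted list of genes.
    """
    mapped = {}
    for gene, identifiers in gene_annotations.items():
        # Keep only the identifiers that occur in the reference pathway.
        for identifier in identifiers & reference_pathway:
            mapped.setdefault(identifier, []).append(gene)
    return {identifier: sorted(genes) for identifier, genes in mapped.items()}


if __name__ == "__main__":
    # Toy data: hypothetical identifiers and gene names.
    reference = {"K00001", "K00002", "K00003"}
    annotations = {
        "geneA": {"K00001"},
        "geneB": {"K00001", "K00009"},
        "geneC": {"K00003"},
    }
    organism_pathway = map_genes_to_pathway(reference, annotations)
    print(organism_pathway)
    # Reference nodes with no gene in this organism.
    print(sorted(reference - organism_pathway.keys()))
```

The same function works whether the identifiers are EC numbers or KOs, which mirrors how the mapping concept stayed the same while the identifier changed over the periods in the table above.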
Last updated: January 1, 2012
