KEGG Objects

KEGG objects are biological entities, from the molecular to higher levels, that are represented as database entries in KEGG, such as genes and proteins, small molecules, reactions, pathways, diseases and drugs.

KEGG Object Identifier

The identifier of each database entry is generally in the form of

    db:entry

where "db" is the database name and "entry" is the entry name or the accession number (see DBGET for the list of database names and abbreviations). However, "db" may be omitted in fourteen of the sixteen databases shown below, because the entry name, called the KEGG object identifier, consists of a database-dependent prefix and a five-digit number and is unique across the databases. The KEGG object identifiers are often called D numbers, K numbers, C numbers, etc. The D number shown below, D01441, is thus equivalent to dr:D01441 or drug:D01441.
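The equivalence described above can be sketched in a few lines of Python. This is a minimal illustration, not part of KEGG: the function names, the `DB_ALIASES` table (which holds only the dr/drug pair named in the text) and the regular expression for "prefix plus five-digit number" are all assumptions made for this example.

```python
import re

# Abbreviation -> full database name. Only the pair named in the text
# (dr / drug) is included; other abbreviations would have to be added.
DB_ALIASES = {"dr": "drug"}

# Assumed shape of a KEGG object identifier: a prefix of letters
# (optionally with "_") followed by exactly five digits.
OBJECT_ID = re.compile(r"[A-Za-z_]+\d{5}")


def split_identifier(identifier):
    """Split 'db:entry' into (db, entry); db is None when it is omitted."""
    db, sep, entry = identifier.partition(":")
    if not sep:
        return None, identifier
    return DB_ALIASES.get(db, db), entry


def same_entry(a, b):
    """Return True when two identifiers name the same database entry."""
    db_a, entry_a = split_identifier(a)
    db_b, entry_b = split_identifier(b)
    if entry_a != entry_b:
        return False
    if db_a is None or db_b is None:
        # Omitting "db" is only safe for prefix-plus-five-digit identifiers.
        return OBJECT_ID.fullmatch(entry_a) is not None
    return db_a == db_b
```

With these definitions, `same_entry("D01441", "dr:D01441")` and `same_entry("dr:D01441", "drug:D01441")` both hold, while a bare gene entry name such as `3643` is not treated as equivalent to `hsa:3643`, because it lacks a database-dependent prefix.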

Database        Object                  Prefix                    Example
KEGG PATHWAY    Pathway map             map, ko, ec, rn, (org)    hsa04930
KEGG BRITE      Functional hierarchy    br, jp, ko, (org)         ko01003
KEGG MODULE     KEGG module             M, (org)_M                M00010
KEGG GENOME     KEGG organism           T                         T01001 (hsa)
KEGG GENES      Gene / protein                                    hsa:3643
KEGG COMPOUND   Small molecule          C                         C00031
KEGG GLYCAN     Glycan                  G                         G00109
KEGG REACTION   Reaction                R                         R00259
KEGG RPAIR      Reactant pair           RP                        RP04458
KEGG RCLASS     Reaction class          RC                        RC00046
KEGG ENZYME     Enzyme                                            ec:
KEGG DISEASE    Human disease           H                         H00004
KEGG DRUG       Drug                    D                         D01441
KEGG DGROUP     Drug group              DG                        DG00710
KEGG ENVIRON    Crude drug, etc.        E                         E00048
(org) represents a three-, four-, or five-letter organism code.
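Because the prefix is database-dependent, the table can be read as a lookup from prefix to database. The sketch below covers only the rows whose prefix is an unambiguous run of capital letters; the lowercase pathway and BRITE prefixes (which overlap, for example "ko") and organism-specific module identifiers are deliberately left out. The names `PREFIX_TO_DATABASE` and `database_for` are hypothetical, chosen for this example.

```python
import re

# Prefix -> database, copied from the rows of the table that have a
# capital-letter prefix. Pathway, BRITE, GENES and ENZYME are omitted.
PREFIX_TO_DATABASE = {
    "M": "KEGG MODULE",
    "T": "KEGG GENOME",
    "C": "KEGG COMPOUND",
    "G": "KEGG GLYCAN",
    "R": "KEGG REACTION",
    "RP": "KEGG RPAIR",
    "RC": "KEGG RCLASS",
    "H": "KEGG DISEASE",
    "D": "KEGG DRUG",
    "DG": "KEGG DGROUP",
    "E": "KEGG ENVIRON",
}

# One or two capital letters followed by exactly five digits.
_OBJECT_ID = re.compile(r"([A-Z]{1,2})(\d{5})")


def database_for(identifier):
    """Return the database name for an identifier, or None if not covered."""
    match = _OBJECT_ID.fullmatch(identifier)
    if match is None:
        return None
    return PREFIX_TO_DATABASE.get(match.group(1))
```

For instance, `database_for("DG00710")` yields "KEGG DGROUP", while `database_for("hsa04930")` yields `None`, since pathway identifiers are outside the scope of this sketch.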

The fourteen databases are all manually created by KEGG. The remaining two databases, KEGG GENES derived from RefSeq and KEGG ENZYME derived from ExplorEnz, are also given KEGG-original annotations.

KEGG Organism Code

In addition to the T number shown above, an organism in KEGG is given a three-letter KEGG organism code (with prefix "d" for automatically annotated genomes), which is treated like a database name. Therefore, individual genes in an organism can be identified in the following way:

    org:gene

where "org" is the KEGG organism code and "gene" is the KEGG GENES or DGENES entry name (see below). The KEGG organism code is also used as a prefix to identify organism-specific pathway maps or BRITE functional hierarchies (see: Pathway maps and Brite hierarchies).
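The two uses of the organism code can be illustrated with a short Python sketch. The helper names are hypothetical, and the sketch assumes that an organism-specific pathway map keeps the five-digit number of the reference map and differs only in its prefix; it does not check that a given map actually exists for a given organism.

```python
import re

# Lowercase prefix (map, ko, ec, rn, or an organism code) plus five digits.
_PATHWAY_ID = re.compile(r"([a-z]+)(\d{5})")


def gene_id(org, gene):
    """Build a gene identifier from an organism code and an entry name."""
    return f"{org}:{gene}"


def organism_pathway(pathway_id, org):
    """Replace the prefix of a pathway identifier with an organism code."""
    match = _PATHWAY_ID.fullmatch(pathway_id)
    if match is None:
        raise ValueError(f"not a pathway identifier: {pathway_id!r}")
    return org + match.group(2)
```

Here `gene_id("hsa", "3643")` gives the gene identifier listed in the table, `hsa:3643`, and `organism_pathway("map04930", "hsa")` gives `hsa04930`.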

Links to Outside Databases

The KEGG objects are linked to numerous outside database entries with the same biological meaning, such as the same gene or protein in the same organism, enabling the user to integrate information from different sources (see the list of outside links in the GenomeNet LinkDB page).

The KEGG GENES entry names, which are usually NCBI GeneIDs or locus_tags, can be converted to/from the identifiers of the major sequence databases through the GenomeNet LinkDB system.
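Separately from the LinkDB web pages, this kind of conversion can also be sketched programmatically. The example below does not use LinkDB; it assumes the `conv` operation of the KEGG REST API, with outside database names such as `ncbi-geneid` and `uniprot`, and a response of tab-separated identifier pairs, one per line. The function names are hypothetical, no network request is made, and the URL pattern and response layout should be checked against the KEGG API documentation before use.

```python
def conv_url(target_db, source):
    """Build a conversion URL (assumed KEGG REST 'conv' pattern)."""
    return f"https://rest.kegg.jp/conv/{target_db}/{source}"


def parse_conv(text):
    """Parse a response of 'source<TAB>target' lines into a list of pairs."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. a trailing newline
        source_id, target_id = line.split("\t")
        pairs.append((source_id, target_id))
    return pairs
```

A caller would fetch `conv_url("hsa", "ncbi-geneid")` with any HTTP client and pass the response body to `parse_conv`. Keeping URL construction and parsing separate from the request itself lets the parsing be tested against saved responses.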

