KEGG Objects

KEGG objects are biological entities from molecular to higher levels that are represented as database entries in KEGG, such as genes and proteins, small molecules, reactions, pathways, diseases and drugs.

KEGG Identifier

The KEGG object identifier, or simply the KEGG identifier (kid), is a unique identifier for each KEGG object, and it is also the database entry identifier. Generally it takes the form of a prefix followed by a five-digit number, as shown below:

Database        Object                Prefix                  Example
KEGG PATHWAY    Pathway map           map, ko, ec, rn, (org)  hsa04930
KEGG BRITE      Functional hierarchy  br, jp, ko, (org)       ko01003
KEGG MODULE     KEGG module           M, (org)_M              M00010
KEGG ORTHOLOGY  KO                    K                       K04527
KEGG GENOME     KEGG organism         T                       T01001 (hsa)
KEGG GENES      Gene / protein        -                       hsa:3643
KEGG COMPOUND   Small molecule        C                       C00031
KEGG GLYCAN     Glycan                G                       G00109
KEGG REACTION   Reaction              R                       R00259
KEGG RCLASS     Reaction class        RC                      RC00046
KEGG ENZYME     Enzyme                -                       ec:2.7.10.1
KEGG DISEASE    Human disease         H                       H00004
KEGG DRUG       Drug                  D                       D01441
KEGG DGROUP     Drug group            DG                      DG00710
KEGG ENVIRON    Crude drug, etc.      E                       E00048
(org) represents a three- or four-letter organism code.
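
The prefix convention in the table above can be checked mechanically. The following is a minimal Python sketch, not part of KEGG itself: the names PREFIX_TO_DATABASE and classify_kid are hypothetical, and the sketch covers only the upper-case prefixes followed by a five-digit number. Pathway and BRITE identifiers are left out on purpose, because prefixes such as "ko" and (org) are shared between the two databases and cannot be told apart by prefix alone.

```python
import re

# Upper-case prefixes transcribed from the table above.
# PATHWAY, BRITE, GENES and ENZYME are intentionally not covered here.
PREFIX_TO_DATABASE = {
    "M": "KEGG MODULE",
    "K": "KEGG ORTHOLOGY",
    "T": "KEGG GENOME",
    "C": "KEGG COMPOUND",
    "G": "KEGG GLYCAN",
    "R": "KEGG REACTION",
    "RC": "KEGG RCLASS",
    "H": "KEGG DISEASE",
    "D": "KEGG DRUG",
    "DG": "KEGG DGROUP",
    "E": "KEGG ENVIRON",
}

# One or two upper-case letters followed by exactly five digits.
_KID_PATTERN = re.compile(r"([A-Z]{1,2})(\d{5})")


def classify_kid(kid):
    """Return the database name for a prefixed identifier, or None if unknown."""
    match = _KID_PATTERN.fullmatch(kid)
    if match is None:
        return None
    return PREFIX_TO_DATABASE.get(match.group(1))


if __name__ == "__main__":
    for kid in ("D01441", "DG00710", "RC00046", "hsa:3643"):
        print(kid, "->", classify_kid(kid))
```

Identifiers that do not fit the pattern, such as hsa:3643 or ec:2.7.10.1, return None rather than a guess.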

Exceptions are GENES identifiers and EC numbers for enzymes, which follow the general form of the DBGET retrieval system (see About DBGET). Each entry in DBGET is identified by
db:entry
where "db" is the database name and "entry" is the entry name or the accession number (see DBGET for the list of database names and abbreviations).

The prefixed versions of KEGG identifiers are often called D numbers, K numbers, C numbers, etc. Since these numbers are unique across the KEGG databases, "db" may be omitted in the DBGET system. For example,
D01441
dr:D01441
drug:D01441
are all equivalent.
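
The equivalence of these three forms can be illustrated with a small normalization routine. This is a sketch under stated assumptions: DB_ALIASES, PREFIX_TO_DB and normalize are hypothetical names, and the two lookup tables contain only the drug database from the example above. The complete list of database names and abbreviations is the one given in the DBGET documentation.

```python
# Names and abbreviations for a single database, taken from the example above.
# This table is deliberately partial; it is an illustration, not the DBGET list.
DB_ALIASES = {"dr": "drug", "drug": "drug"}

# Database that can be inferred from the identifier prefix when "db" is omitted.
PREFIX_TO_DB = {"D": "drug"}


def normalize(identifier):
    """Return a canonical (db, entry) pair for a DBGET-style identifier."""
    db, sep, entry = identifier.partition(":")
    if sep:
        # Explicit "db:entry" form: map the name or abbreviation to one spelling.
        db = DB_ALIASES.get(db)
    else:
        # "db" omitted: infer it from the letters that precede the digits.
        entry = identifier
        db = PREFIX_TO_DB.get(entry.rstrip("0123456789"))
    if db is None:
        raise ValueError(f"unknown database for identifier: {identifier!r}")
    return db, entry


if __name__ == "__main__":
    forms = ["D01441", "dr:D01441", "drug:D01441"]
    print({normalize(form) for form in forms})
```

All three spellings collapse to the same pair, so the set printed at the end has a single element.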

KEGG Organism Code

In addition to the T number shown above, an organism in KEGG is given a three- or four-letter KEGG organism code, which is treated like a database name. Therefore, individual genes in an organism can be identified in the following way:
org:gene
where "org" is the KEGG organism code and "gene" is the KEGG GENES entry name. The KEGG organism code is also used as a prefix to identify organism-specific pathway maps or BRITE functional hierarchies (see: Pathway maps and Brite hierarchies).
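
As a rough illustration of both uses of the organism code, the sketch below splits an org:gene identifier and builds an organism-specific pathway map identifier such as hsa04930. The function names split_gene_id and organism_pathway_id are hypothetical, and the check that an organism code consists of three or four lower-case letters is an assumption based on the examples in this document.

```python
import re

# Assumption: organism codes are three or four lower-case letters.
_ORG_CODE = re.compile(r"[a-z]{3,4}")
_MAP_NUMBER = re.compile(r"\d{5}")


def split_gene_id(gene_id):
    """Split an "org:gene" identifier into (org, gene)."""
    org, sep, gene = gene_id.partition(":")
    if not sep or not gene or _ORG_CODE.fullmatch(org) is None:
        raise ValueError(f"not an org:gene identifier: {gene_id!r}")
    return org, gene


def organism_pathway_id(org, map_number):
    """Prefix a five-digit pathway map number with an organism code."""
    if _ORG_CODE.fullmatch(org) is None:
        raise ValueError(f"not an organism code: {org!r}")
    if _MAP_NUMBER.fullmatch(map_number) is None:
        raise ValueError(f"not a five-digit map number: {map_number!r}")
    return org + map_number


if __name__ == "__main__":
    print(split_gene_id("hsa:3643"))
    print(organism_pathway_id("hsa", "04930"))
```

Because the organism code is treated like a database name, a purely syntactic check of this kind cannot distinguish an organism code from a four-letter database name such as "drug"; a real application would need the list of KEGG organisms to decide.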


Links to Outside Databases

The KEGG objects are linked to numerous outside database entries that are related in certain ways (see the list of outside links in the GenomeNet LinkDB page), enabling the user to integrate information from different sources.

The sequence data of the KEGG GENES entries are derived from the NCBI resource, and the links are given by ProteinIDs for GenBank- and RefSeq-derived data and by GeneIDs for RefSeq-derived data. The Conv ID tool in KEGG Mapper may be used to convert ProteinIDs and GeneIDs, as well as UniProt IDs, to KEGG identifiers.


Last updated: November 1, 2016