How KEGG can be used for genome/metagenome annotation

The KEGG database contains three main components for genome/metagenome annotation:
  • the collection of internally annotated gene catalogs for the complete genomes (called KEGG organisms) and additional protein sequences in the KEGG GENES database
  • the knowledge base of high-level functions represented as the molecular interaction, reaction and relation networks in the KEGG PATHWAY, BRITE and MODULE databases, and
  • the knowledge base of molecular-level functions associated with ortholog groups in the KO database, where most KO entries are defined in a context-dependent manner as nodes of the KEGG molecular networks.
In general, KO entries (identified by K numbers) also represent sequence similarity groups. Thus, a sequence similarity search of a query genome against KEGG GENES is a search for the most appropriate K numbers. The assigned set of K numbers can then be used to reconstruct KEGG pathway maps, BRITE hierarchies and KEGG modules, enabling interpretation of high-level functions (see BlastKOALA and GhostKOALA as examples).
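As a minimal sketch of the first step, the Python snippet below collects the set of assigned K numbers from an annotation result. It assumes a two-column, tab-delimited listing (gene identifier, then K number, with the second column absent for unassigned genes); the function name `read_k_assignments` and the gene identifiers are hypothetical and used only for illustration.

```python
from collections import defaultdict

def read_k_assignments(lines):
    """Collect K number -> gene list from tab-delimited 'gene<TAB>K number' lines."""
    genes_by_ko = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumption: genes without an assignment have no second column.
        if len(fields) < 2 or not fields[1].strip():
            continue
        gene, ko = fields[0].strip(), fields[1].strip()
        genes_by_ko[ko].append(gene)
    return dict(genes_by_ko)

# Hypothetical annotation result for four query genes.
example = [
    "gene_0001\tK14579\n",
    "gene_0002\n",
    "gene_0003\tK14580\n",
    "gene_0004\tK14579\n",
]
assignments = read_k_assignments(example)
k_numbers = sorted(assignments)
print(k_numbers)  # ['K14579', 'K14580']
```

The resulting list of K numbers is the input that pathway, BRITE and module reconstruction would start from.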

Continuous efforts are being made to expand the repertoire of KO entries and to improve the collection of KEGG modules (identified by M numbers) for automated interpretation of high-level functions.

Ortholog Table

The ortholog table (OT) has existed from the beginning of the KEGG project. For a given set of K numbers it displays the current assignment of genes in KEGG organisms.

Enter K numbers      (Example) K14579 K14580 K14578 K14581 K14582 K14583 K18242 K18243
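The idea behind the ortholog table can be sketched as a simple pivot: one row per organism, one column per queried K number. The Python example below is an illustration only, not the KEGG implementation; the function `build_ortholog_table`, the organism codes and the gene identifiers are all hypothetical.

```python
def build_ortholog_table(k_numbers, gene_assignments):
    """Return {organism: {K number: [genes]}} restricted to the queried K numbers."""
    wanted = set(k_numbers)
    table = {}
    for organism, gene, ko in gene_assignments:
        if ko not in wanted:
            continue
        # Each organism row starts with an empty cell for every queried K number.
        row = table.setdefault(organism, {k: [] for k in k_numbers})
        row[ko].append(gene)
    return table

# Hypothetical assignments: (organism code, gene identifier, K number).
gene_assignments = [
    ("org1", "org1_0101", "K14579"),
    ("org1", "org1_0102", "K14580"),
    ("org2", "org2_0550", "K14579"),
    ("org2", "org2_0999", "K00001"),  # not part of the query, so ignored
]
query = ["K14579", "K14580", "K14578"]
table = build_ortholog_table(query, gene_assignments)

# Print a tab-delimited table; "-" marks K numbers with no assigned gene.
print("organism\t" + "\t".join(query))
for organism, row in sorted(table.items()):
    cells = [",".join(row[k]) or "-" for k in query]
    print(organism + "\t" + "\t".join(cells))
```

Empty cells make gaps visible, which is what allows related organisms to be compared column by column.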

Module Table

The module table (MT) is a new addition to the KEGG annotation resource. For a given set of M and/or K numbers it identifies organisms that contain complete modules and/or KO groups. The list of organisms may be collapsed to the species or genus level.

Enter M/K numbers      (Example) M00595 K16952 M00596

Taxonomic Distribution

The KO entry page (such as K08940) contains the Taxonomy button for mapping against the NCBI taxonomy (br08610), while the module entry page (such as M00165) and the show_module page (such as M00165) contain the Taxonomy button for mapping against the KEGG taxonomy (br08601). The following is a more flexible interface that can be used to examine taxonomic distribution of a single K number, a single M number, or a logical expression of M and K numbers.

Enter a single M/K number or a logical expression of M/K numbers
Completeness options:
  • complete only
  • include 1 block missing
  • include 1 or 2 blocks missing
(Examples)
1. M00165
2. K08940+K08941+K08942+K08943
3. (K02575,M00438) M00531
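To make the logical expressions above concrete, here is a minimal Python sketch of an evaluator. It assumes the notation follows the convention of KEGG module definitions, where a space or a plus sign means AND, a comma means OR, and parentheses group subexpressions; this reading is an assumption, not a description of the KEGG implementation. The functions `tokenize` and `evaluate` and the taxon names are hypothetical, and the "blocks missing" options are not modelled.

```python
import re

def tokenize(expr):
    """Split an expression into identifiers, parentheses and operators."""
    expr = re.sub(r"\s+", " ", expr.strip())
    expr = re.sub(r" ?([,+]) ?", r"\1", expr)  # spaces next to , or + carry no meaning
    expr = expr.replace("( ", "(").replace(" )", ")")
    tokens = re.findall(r"[MK]\d{5}|[(),+ ]", expr)
    if "".join(tokens) != expr:
        raise ValueError(f"unrecognized text in expression: {expr!r}")
    return tokens

def evaluate(expr, present):
    """Return True if the set of present M/K identifiers satisfies the expression."""
    tokens = tokenize(expr)
    pos = 0

    def parse_or():  # comma-separated alternatives (lowest precedence)
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == ",":
            pos += 1
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():  # items joined by a space or a plus sign
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] in ("+", " "):
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():  # a single identifier or a parenthesized subexpression
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token in (")", ",", "+", " "):
            raise ValueError(f"unexpected token {token!r}")
        return token in present

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result

# Hypothetical taxa with the identifiers considered present in each.
taxa = {
    "taxon_A": {"M00438", "M00531"},
    "taxon_B": {"K02575"},
    "taxon_C": {"K02575", "M00531", "K08940"},
}
hits = [name for name, ids in taxa.items() if evaluate("(K02575,M00438) M00531", ids)]
print(hits)  # ['taxon_A', 'taxon_C']
```

An M number is treated here as present only if the module has already been judged complete for that taxon; how completeness is decided is outside the scope of this sketch.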

Annotation Guide

The KEGG Annotation Guide is a collection of HTML tables showing summary views of the current annotation of the KEGG GENES database, such as how K numbers are defined and assigned for distinguishing related genes and for comparing different subunit structures.

  • Distinguishing related genes
  • Comparing subunit structures
  • Characterizing metabolic and other capacities

