KEGG Annotation - Assign KO

Assign KO tool

The Assign KO tool is an interactive equivalent of the BlastKOALA server, intended for use with a small dataset. BlastKOALA assigns KOs (K numbers) to a given set of amino acid sequences for subsequent analysis with the KEGG Mapper Reconstruct tool.

Enter query sequence data in FASTA format (up to 10 sequences may be given)
Select the reference GENES dataset to be searched:
  Eukaryotes
  Prokaryotes
  Viruses
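
Before submitting, it can help to confirm locally that the query stays within the 10-sequence limit. The following is a minimal sketch, not part of the tool itself: the function names read_fasta and check_query and the constant MAX_SEQUENCES are hypothetical, and the parser only assumes the usual FASTA convention of a ">" header line followed by sequence lines.

```python
MAX_SEQUENCES = 10  # limit stated for the Assign KO input form


def read_fasta(text):
    """Split FASTA text into (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_query(text, limit=MAX_SEQUENCES):
    """Return the parsed records, or raise if the query is empty or too large."""
    records = read_fasta(text)
    if not records:
        raise ValueError("no FASTA records found")
    if len(records) > limit:
        raise ValueError(f"{len(records)} sequences given, limit is {limit}")
    return records
```

Larger sets would need to be split into batches of at most 10 sequences or submitted to the BlastKOALA server instead.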


More details about this tool

Processing of BLAST search result
  1. Run BLASTP with a given evalue threshold (0.1 for example) and extract the following fields from output format 6 (tabular output): qseqid (query sequence id), sseqid (target sequence id), pident (percentage of identical matches), length (alignment length) and bitscore.
  2. For each query sequence, create a GFIT table containing a single top-scoring sequence from each target genome.
  3. Compute the modified identity score: identity = pident * min(1, length*2/(aalen1+aalen2)), where aalen1 and aalen2 are the query and target sequence lengths, respectively.
  4. Group the GFIT table according to the assigned K numbers (including none assigned), and compute the weighted sum of identity scores for each group. This may be done by looking at only up to 10 hits in the GFIT table.
  5. Assign a K number or none according to the weighted sum.
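
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the KEGG implementation. The page does not give the weights or the decision rule, so the sketch uses uniform weights and picks the group with the largest sum. It also assumes that the genome is the prefix of sseqid before the colon (as in "hsa:1234"), that the GFIT table is ordered by bitscore, and that sequence lengths and the gene-to-K-number mapping are supplied by the caller. All function and parameter names are hypothetical.

```python
from collections import defaultdict


def modified_identity(pident, length, aalen1, aalen2):
    """Step 3: identity = pident * min(1, length*2/(aalen1+aalen2))."""
    return pident * min(1.0, length * 2 / (aalen1 + aalen2))


def parse_hits(tabular_text):
    """Step 1: read tab-separated rows of qseqid, sseqid, pident, length, bitscore."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip():
            continue
        qseqid, sseqid, pident, length, bitscore = line.split("\t")
        hits.append((qseqid, sseqid, float(pident), int(length), float(bitscore)))
    return hits


def build_gfit(hits):
    """Step 2: per query, keep the single top-scoring hit from each target genome."""
    best = defaultdict(dict)  # query id -> genome -> best hit
    for hit in hits:
        qseqid, sseqid, bitscore = hit[0], hit[1], hit[4]
        genome = sseqid.split(":", 1)[0]  # assumption: sseqid looks like "org:gene"
        current = best[qseqid].get(genome)
        if current is None or bitscore > current[4]:
            best[qseqid][genome] = hit
    # Assumption: order each GFIT table by descending bitscore.
    return {
        q: sorted(by_genome.values(), key=lambda h: h[4], reverse=True)
        for q, by_genome in best.items()
    }


def assign_ko(gfit_rows, query_len, target_len, gene_to_ko, max_hits=10, weights=None):
    """Steps 4 and 5: weighted sum of identity scores per K number, then pick one.

    Uniform weights and "largest sum wins" are placeholders for the
    unspecified weighting and decision rule. A result of None means
    that no K number is assigned.
    """
    rows = gfit_rows[:max_hits]
    if weights is None:
        weights = [1.0] * len(rows)  # hypothetical: every hit counts equally
    sums = defaultdict(float)
    for w, (_, sseqid, pident, length, _) in zip(weights, rows):
        ko = gene_to_ko.get(sseqid)  # None groups hits with no K number
        sums[ko] += w * modified_identity(pident, length, query_len, target_len[sseqid])
    if not sums:
        return None, {}
    return max(sums, key=sums.get), dict(sums)
```

The five fields of step 1 can be requested from BLAST+ with -outfmt "6 qseqid sseqid pident length bitscore". Passing a different weights list to assign_ko is the place to experiment with, for example, weights that decrease with rank.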

Reference GENES dataset
Reference genomes in KEGG are well-curated genomes and/or representative genomes in taxonomic groups. They are not subject to automatic KOALA annotation. The reference GENES dataset is a collection of protein coding genes in reference genomes and publication-based sequence data associated with KOs, many of which are in the GENES Addendum (ag) category.

Last updated: December 1, 2025