KEGG in Keg

KEGG Syntax – Synteny Analysis


VOG Alignment Dataset

The newly developed gene order alignment method is applied to a comprehensive comparison of all KEGG organisms and viruses. Precomputed results of alignments by VOGs are made available here. For a given organism or virus, the following interface searches the precomputed alignment datasets for locally similar gene orders represented as VOG sequences.

Gene order alignment for improving viral gene annotation

As of January 2025, only about 8% of virus genes are annotated with KOs, compared with over 50% of genes in cellular organisms. There appear to be candidate genes that could be annotated, but their sequence similarity is not high enough for the KO assignment tools used internally.

Efforts have therefore been initiated to use conserved gene orders as additional evidence for KO assignments. The current procedure is as follows.
  1. All viral proteins are examined for classification into Viral Ortholog Groups (VOGs). Roughly 90% of 680 thousand proteins are assigned VOG identifiers.
  2. All proteins of KEGG organisms are examined to see whether they can be considered to belong to any VOG. About 1.3% of 50 million proteins are given VOG identifiers.
  3. Each cellular organism genome is compared against all virus genomes to detect locally similar gene orders (represented as VOG sequences) by the newly developed gene order alignment tool. About 68% of over 10,000 genomes share synteny regions of 3 or more genes with viruses.
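
Step 3 can be illustrated with a minimal sketch. The KEGG gene order alignment tool itself is not described on this page, so the code below substitutes a standard Smith-Waterman-style local alignment over VOG identifier sequences. The function names, the scoring parameters, and the example genomes are all assumptions made for illustration. Genes without a VOG are represented as None and never count as a match.

```python
from typing import List, Optional, Tuple

Genome = List[Optional[str]]  # one VOG identifier (or None) per gene, in genome order


def local_gene_order_alignment(
    a: Genome,
    b: Genome,
    match: int = 2,      # assumed score for two genes in the same VOG
    mismatch: int = -1,  # assumed penalty for aligning genes in different VOGs
    gap: int = -1,       # assumed penalty for skipping a gene
) -> Tuple[int, List[Tuple[int, int]]]:
    """Return the best local score and the (index in a, index in b) matched pairs."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = a[i - 1] is not None and a[i - 1] == b[j - 1]
            diag = score[i - 1][j - 1] + (match if same else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the local alignment score drops to zero.
    pairs: List[Tuple[int, int]] = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        same = a[i - 1] is not None and a[i - 1] == b[j - 1]
        if score[i][j] == score[i - 1][j - 1] + (match if same else mismatch):
            if same:
                pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


def shares_synteny(a: Genome, b: Genome, min_genes: int = 3) -> bool:
    """True if the best local alignment matches at least min_genes genes."""
    _, pairs = local_gene_order_alignment(a, b)
    return len(pairs) >= min_genes


# Hypothetical VOG sequences for one cellular genome and one virus genome.
cell = ["VOG1", None, "VOG2", "VOG3", "VOG4", None]
virus = ["VOG9", "VOG2", "VOG3", "VOG4"]
print(local_gene_order_alignment(cell, virus))  # (6, [(2, 1), (3, 2), (4, 3)])
print(shares_synteny(cell, virus))              # True
```

The default threshold of three matched genes mirrors the "3 or more genes" criterion in step 3; the scoring values are arbitrary and would need tuning for real data.
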
The results of all genome comparisons can be viewed from the search box above. Assuming that virus genes without KOs can be given the KOs already assigned to the cellular organism genes they align with, the current repertoire of KOs for the "vg" (virus gene) category in the GENES database can be expanded. The expanded dataset is given the prefix "vx" (virus extension) and would increase the virus gene annotation ratio from 8% to 10%. The pathway maps with significant increases include:
  • 00520 [ vg | vx ] Amino sugar and nucleotide sugar metabolism
  • 00541 [ vg | vx ] Biosynthesis of various nucleotide sugars
Note that the vx dataset is temporary and will eventually disappear once the computationally expanded assignments have been manually examined.
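
The KO transfer behind the vx dataset can be sketched as follows, under the assumption stated above that a virus gene without a KO inherits the KO of the cellular gene it is aligned with. This is an illustration only, not the KEGG implementation: the function names, gene identifiers, and KO labels are hypothetical placeholders.

```python
from typing import Dict, List, Optional, Tuple


def transfer_kos(
    aligned_pairs: List[Tuple[str, str]],  # (cellular gene id, virus gene id)
    cell_ko: Dict[str, Optional[str]],     # existing KO per cellular gene, None if absent
    virus_ko: Dict[str, Optional[str]],    # existing "vg" KO per virus gene, None if absent
) -> Dict[str, str]:
    """Return candidate "vx" assignments for virus genes that have no KO yet."""
    candidates: Dict[str, str] = {}
    for cell_gene, virus_gene in aligned_pairs:
        ko = cell_ko.get(cell_gene)
        if ko is None:
            continue  # nothing to transfer from an unannotated cellular gene
        if virus_ko.get(virus_gene) is not None:
            continue  # never overwrite an existing vg annotation
        candidates.setdefault(virus_gene, ko)  # keep the first candidate seen
    return candidates


def annotation_ratio(virus_ko: Dict[str, Optional[str]], extra: Dict[str, str]) -> float:
    """Fraction of virus genes with a KO after adding the candidate assignments."""
    annotated = sum(1 for g, ko in virus_ko.items() if ko is not None or g in extra)
    return annotated / len(virus_ko)


# Hypothetical data: four virus genes, one already annotated.
virus_ko = {"v1": "K_demo1", "v2": None, "v3": None, "v4": None}
cell_ko = {"c1": "K_demo1", "c2": "K_demo2", "c3": None}
pairs = [("c1", "v1"), ("c2", "v2"), ("c3", "v3")]

vx = transfer_kos(pairs, cell_ko, virus_ko)
print(vx)                              # {'v2': 'K_demo2'}
print(annotation_ratio(virus_ko, {}))  # 0.25
print(annotation_ratio(virus_ko, vx))  # 0.5
```

Keeping the candidates in a separate mapping, rather than writing them into the existing annotations, reflects the provisional status of vx entries until they have been manually examined.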


Last updated: January 1, 2025