KEGG Virus Resource

KEGG Virus is a resource for integrating viruses and cellular organisms from an evolutionary perspective. Virus data are part of the PATHWAY, BRITE, KO, GENES, GENOME, NETWORK, DISEASE and DRUG databases as summarized below.

Database  Virus data                                   Remark
PATHWAY   Disease pathway maps for viral infections    Manually created KEGG pathway maps
          Genetic information processing in viruses
BRITE     Viral protein classifications                Manually created hierarchical classifications
KO        Viral proteins                               Manually defined KOs (K number entries) for viral proteins
GENES     KEGG viruses in the NCBI taxonomy            Linked to virus gene (vg) entries created from NCBI RefSeq
GENOME    KEGG selected viruses                        Completely sequenced viral genomes for human pathogens
NETWORK   Network variation maps for viral infections  Viruses viewed as perturbants to human signaling networks
DISEASE   Viral infectious diseases and genomes        KEGG DISEASE (H number) entries
DRUG      Antivirals and targets                       KEGG DRUG (D number) entries

Pathway maps for viruses

The pathway maps for viral infections are interaction networks of both human proteins (colored in green and linked to human gene entries) and viral proteins (colored in blue and linked to virus KOs). The new category of genetic information processing in viruses will contain virus-specific genes and proteins.

Category               Pathway map                                            Virus
Viral infections       05166 Human T-cell leukemia virus 1 infection          HTLV-1
                       05170 Human immunodeficiency virus 1 infection         HIV-1
                       05161 Hepatitis B                                      HBV
                       05160 Hepatitis C                                      HCV
                       05171 Coronavirus disease - COVID-19                   SARS-CoV-2
                       05164 Influenza A                                      IAV
                       05162 Measles                                          MV
                       05168 Herpes simplex virus 1 infection                 HSV-1
                       05163 Human cytomegalovirus infection                  HCMV
                       05167 Kaposi sarcoma-associated herpesvirus infection  KSHV
                       05169 Epstein-Barr virus infection                     EBV
                       05165 Human papillomavirus infection                   HPV
                       05203 Viral carcinogenesis                             HBV, HCV, HPV, HTLV-1, KSHV
Genetic information    03230 Viral genome structure                           HIV-1, HTLV-1, HBV, HCV, SARS-CoV-2
processing in viruses
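
The table above can be treated as a small lookup structure. The following is a minimal sketch, assuming the map numbers and virus abbreviations exactly as listed; the names PATHWAY_MAPS and maps_for_virus are hypothetical and are not part of any KEGG tool or API.

```python
# Pathway map number -> (map name, virus abbreviations), copied from the table.
# PATHWAY_MAPS and maps_for_virus are illustrative names, not KEGG identifiers.
PATHWAY_MAPS = {
    "05166": ("Human T-cell leukemia virus 1 infection", ["HTLV-1"]),
    "05170": ("Human immunodeficiency virus 1 infection", ["HIV-1"]),
    "05161": ("Hepatitis B", ["HBV"]),
    "05160": ("Hepatitis C", ["HCV"]),
    "05171": ("Coronavirus disease - COVID-19", ["SARS-CoV-2"]),
    "05164": ("Influenza A", ["IAV"]),
    "05162": ("Measles", ["MV"]),
    "05168": ("Herpes simplex virus 1 infection", ["HSV-1"]),
    "05163": ("Human cytomegalovirus infection", ["HCMV"]),
    "05167": ("Kaposi sarcoma-associated herpesvirus infection", ["KSHV"]),
    "05169": ("Epstein-Barr virus infection", ["EBV"]),
    "05165": ("Human papillomavirus infection", ["HPV"]),
    "05203": ("Viral carcinogenesis", ["HBV", "HCV", "HPV", "HTLV-1", "KSHV"]),
    "03230": ("Viral genome structure",
              ["HIV-1", "HTLV-1", "HBV", "HCV", "SARS-CoV-2"]),
}


def maps_for_virus(abbreviation):
    """Return the sorted map numbers whose Virus column lists the abbreviation."""
    return sorted(
        number
        for number, (_name, viruses) in PATHWAY_MAPS.items()
        if abbreviation in viruses
    )


if __name__ == "__main__":
    for number in maps_for_virus("HBV"):
        print(number, PATHWAY_MAPS[number][0])
```

Map numbers are kept as strings so that leading zeros such as in 03230 are preserved.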

Brite hierarchies and tables for viruses

Virus-specific Brite hierarchy files and Brite table files are being developed.

Category                   Brite file
Functional classification  03200 Viral proteins (all viral KOs)
                           03210 Viral fusion proteins
Taxonomy                   08620 KEGG viruses in the NCBI taxonomy (ICTV and Baltimore classifications)
                           08621 KEGG viruses (simplified version with family, genus and species only)

Virus genes and proteins

The virus category of the GENES database is generated from the RefSeq database and annotated with KOs. RefSeq GeneIDs are used as gene identifiers, and the organism code "vg" (with the T40000 identifier) is used for the entire set of viral genes. Individual viruses can be distinguished by their NCBI taxonomy identifiers.

Selected viruses with medical relevance are given T4 identifiers and are part of the GENOME database. A new category of the GENES database, "vp" for virus mature peptides, has been introduced for these selected viruses to define mature peptides cleaved from polyproteins. These peptides are often important for defining protein interaction networks in KEGG pathway maps for viral infections.
Example                             KEGG identifier  Remark
SARS-CoV-2                          T40361           TAX:2697049
SARS-CoV-2 S (spike) protein        vg:43740568      K24152
SARS-CoV-2 S1 (attachment) peptide  vp:43740568-1    K25001
SARS-CoV-2 S2 (fusion) peptide      vp:43740568-2    K25002
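
The identifier scheme in the examples above can be sketched in code. This is a minimal illustration, assuming that "vg" entries have the form vg:GeneID and that "vp" entries append a hyphen and a peptide number to the GeneID of the parent polyprotein. The "vp" format is inferred only from the SARS-CoV-2 rows and may not cover every entry. VirusGeneId and parse_virus_gene_id are hypothetical names, not part of any KEGG software.

```python
import re
from typing import NamedTuple, Optional


class VirusGeneId(NamedTuple):
    category: str                 # "vg" (virus gene) or "vp" (virus mature peptide)
    gene_id: int                  # RefSeq GeneID
    peptide_index: Optional[int]  # peptide number, present only for "vp" entries


# Assumed format: "vg:<GeneID>" or "vp:<GeneID>-<n>", inferred from the examples.
_ID_PATTERN = re.compile(r"^(vg|vp):(\d+)(?:-(\d+))?$")


def parse_virus_gene_id(identifier: str) -> VirusGeneId:
    """Split a vg/vp identifier into category, GeneID and peptide number."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a vg/vp identifier: {identifier!r}")
    category, gene_id, index = match.groups()
    # Under the assumed format, a peptide number appears on "vp" entries only.
    if (category == "vp") != (index is not None):
        raise ValueError(f"unexpected form for {category!r}: {identifier!r}")
    return VirusGeneId(
        category,
        int(gene_id),
        int(index) if index is not None else None,
    )


if __name__ == "__main__":
    spike = parse_virus_gene_id("vg:43740568")
    s1 = parse_virus_gene_id("vp:43740568-1")
    # Both peptides share the GeneID of the spike protein they are cleaved from.
    print(s1.gene_id == spike.gene_id)
```

Keeping the GeneID as a separate field makes it straightforward to group mature peptides under their parent gene entry.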

Virus taxonomy

All the viruses present in KEGG GENES are classified according to the NCBI taxonomy, which is based on the ICTV (International Committee on Taxonomy of Viruses) classification, supplemented by KEGG with the traditional Baltimore classification (see br08620). The correspondence between the ICTV classification (realm, kingdom, phylum, class, order and family) and the seven types of the Baltimore classification is shown below.
         Blubervirales  (VII dsDNA-RT) 
         Ortervirales   (VI ssRNA-RT)
           Caulimoviridae (VII dsDNA-RT)
     Duplornaviricota   (III dsRNA)
       Duplopiviricetes (III dsRNA)
           Hypoviridae    (IV +ssRNA)
       Pisoniviricetes  (IV +ssRNA)
       Stelpaviricetes  (IV +ssRNA)
     Kitrinoviricota    (IV +ssRNA)
     Lenarviricota      (IV +ssRNA)
     Negarnaviricota    (V -ssRNA)
 Ribozyviria            (V -ssRNA)
 Duplodnaviria       (I dsDNA)
 Varidnaviria        (I dsDNA)
 Adnaviria           (I dsDNA)
 Monodnaviria        (II ssDNA) 
       Papovaviricetes (I dsDNA)
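
Each line of the listing above pairs a taxon with a Baltimore type, with indentation indicating depth. The following is a minimal sketch of reading one such line, assuming the "Taxon (numeral genome-type)" layout shown; BALTIMORE and parse_taxon_line are hypothetical names, and the sketch does not reconstruct parent-child relations.

```python
import re

# The seven Baltimore types as they appear in the listing.
BALTIMORE = {
    "I": "dsDNA",
    "II": "ssDNA",
    "III": "dsRNA",
    "IV": "+ssRNA",
    "V": "-ssRNA",
    "VI": "ssRNA-RT",
    "VII": "dsDNA-RT",
}

# Leading spaces, taxon name, then "(<numeral> <genome type>)".
_LINE_PATTERN = re.compile(
    r"^(\s*)(\S+)\s+\((VII|VI|V|IV|III|II|I)\s+([^)\s]+)\)\s*$"
)


def parse_taxon_line(line):
    """Return (indent, taxon, baltimore_numeral, genome_type) for one line."""
    match = _LINE_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    indent, taxon, numeral, genome = match.groups()
    # Check that the numeral and the genome type agree with each other.
    if BALTIMORE[numeral] != genome:
        raise ValueError(f"{numeral} does not match {genome} in {line!r}")
    return len(indent), taxon, numeral, genome


if __name__ == "__main__":
    sample = [
        " Monodnaviria        (II ssDNA) ",
        "       Papovaviricetes (I dsDNA)",
    ]
    for line in sample:
        print(parse_taxon_line(line))
```

In the listing, a more deeply indented taxon can carry a different type from the entry above it (for example Papovaviricetes under Monodnaviria), so the type is best read per line and not inherited from a shallower entry.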

Last updated: June 18, 2021