=========================================
README for the "genes/fasta" subdirectory
=========================================

This directory contains the complete sets of GENES protein sequences in
FASTA format, together with the associated KO assignment and sequence
length data:

   eukaryotes.pep.gz  - Eukaryotic amino acid sequences and associated data
   eukaryotes.dat.gz
   prokaryotes.pep.gz - Prokaryotic amino acid sequences and associated data
   prokaryotes.dat.gz
   viruses.pep.gz     - Viral amino acid sequences and associated data
   viruses.dat.gz

   ncgenes.rna.gz  - Non-coding RNA sequences and associated data
   ncgenes.dat.gz    for eukaryotes, prokaryotes and viruses

It also contains small representative subsets of GENES used by the
BlastKOALA server, consisting of KEGG reference genomes and individual
sequences linked from PubMed records of KO entries:

   refeuk.pep.gz   - Representative eukaryotic sequences and associated data
   refeuk.dat.gz
   refprok.pep.gz  - Representative prokaryotic sequences and associated data
   refprok.dat.gz
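
The sequence files listed above are gzip-compressed FASTA, so they can be
read as a stream without decompressing them to disk first. The following is
a minimal sketch in Python, not an official tool: the function name
read_fasta_gz is hypothetical, and the only assumption made about the file
contents is the general FASTA convention (a header line starting with ">"
followed by one or more sequence lines). The layout of the header line and
of the accompanying .dat.gz files is not described here, so the sketch does
not try to interpret them.

```python
import gzip
from typing import Iterator, Tuple


def read_fasta_gz(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a gzip-compressed FASTA file.

    Hypothetical helper: it relies only on the generic FASTA convention
    and makes no assumption about how the header line is structured.
    """
    header = None
    chunks = []
    # "rt" opens the compressed stream in text mode, line by line.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header = line[1:]
                chunks = []
            elif header is not None and line:
                # Sequence lines may be wrapped; collect and join them.
                chunks.append(line)
    # Emit the final record after the loop ends.
    if header is not None:
        yield header, "".join(chunks)
```

For example, iterating over read_fasta_gz("refprok.pep.gz") and taking
len(sequence) for each record gives sequence lengths that could be compared
with the length data in the corresponding .dat.gz file.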

---
Updated: July 1, 2025

(c) Kanehisa Laboratories
