The database is composed of 16 "smaller" databases that can be organized, based on content, into 3 categories: systems information, chemical information, and genomic information. These databases are integrated to form visual representations of various biological pathways. The pathways are built from available genomic sequences and the functional data associated with those sequences.
One of the most useful aspects of this database is the ability to visualize different biological pathways (individually or in the context of a particular organism) along with the enzymes, reactions, and substrates/products involved. KEGG is also useful for determining which enzymes are present in different organisms and how those enzymes are used to produce a given metabolite.
From a human biology perspective, KEGG has been used by researchers to gather genetic information on different diseases. Examples include the genetic interactions in the commonly used breast cancer cell line MCF-7 and susceptibility to coronary artery disease (2, 3).
This section is a quick guide to the basics of the KEGG Pathway feature. Please refer to the slideshow for screenshots of the major steps described.
For the purpose of this demonstration, we will look at histidine metabolism in the hyperthermophilic bacterium Thermotoga maritima. Start at the KEGG homepage (http://www.genome.jp/kegg/). The link titled "KEGG Pathway" brings us to the KEGG Pathway database, where we can select from a set of categories focusing on biological systems. Selecting metabolism brings us to a list of metabolic pathways for which genomic and functional data are available. When we select "Histidine metabolism" under section 1.5 Amino Acid Metabolism, we are brought to a map depicting the enzymes and substrates needed to generate histidine, its precursors, and the products created from histidine.
From here, we can select any of the metabolites or enzymes, for example the page for EC 4.2.1.19 (imidazoleglycerol-phosphate dehydratase). These pages include a large amount of information on each enzyme, including name, orthology, associated genes, reactions carried out, and papers relating to the particular enzyme. Also included are links to other useful databases. Included in the slideshow is an image of the page for the product of this enzyme, imidazole-acetol phosphate.
We can go back to the histidine map and select an organism to see which enzymes are encoded in that organism's genome and which substrates it can utilize. KEGG contains complete genomes for over 2500 organisms (201 eukaryotes, 2458 bacteria, and 160 archaea). When we choose Thermotoga maritima, we see the exact same map, except that the genes found in that organism are highlighted in green. When we select EC 4.2.1.19 while on the organism's map, the information shown is now specific to the enzyme in that particular organism, including amino acid and nucleotide sequences.
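The same lookups can also be scripted rather than clicked through. The following is a minimal sketch, not part of the original walkthrough. It assumes the KEGG REST service at https://rest.kegg.jp, the organism code tma for Thermotoga maritima, the pathway identifier tma00340 for its histidine metabolism map, and a two-column, tab-separated response from the link operation. The helper names (kegg_url, fetch, parse_link_table) and the gene identifiers in the sample text are hypothetical placeholders chosen for illustration.

```python
from urllib.request import urlopen

# Assumed base URL of the KEGG REST service.
KEGG_REST = "https://rest.kegg.jp"


def kegg_url(operation, *args):
    """Build a KEGG REST URL, e.g. https://rest.kegg.jp/get/ec:4.2.1.19."""
    return "/".join([KEGG_REST, operation, *args])


def fetch(url):
    """Download a response as text. Requires network access."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_link_table(text):
    """Parse two-column, tab-separated text into {first column: [second column, ...]}."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        source, target = line.split("\t", 1)
        mapping.setdefault(source, []).append(target)
    return mapping


# URL for the enzyme page used in the walkthrough.
enzyme_url = kegg_url("get", "ec:4.2.1.19")

# URL for the T. maritima genes linked to the histidine metabolism map
# (the "tma" and "tma00340" identifiers are assumptions, see above).
genes_url = kegg_url("link", "tma", "path:tma00340")

# Offline illustration with placeholder gene identifiers (not real data).
sample = "path:tma00340\ttma:GENE_A\npath:tma00340\ttma:GENE_B\n"
genes_by_pathway = parse_link_table(sample)
```

With network access, parse_link_table(fetch(genes_url)) would be expected to list the genes that appear highlighted in green on the organism-specific map, and fetch(enzyme_url) would return the flat-file record for the enzyme.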