-F.Chang, unpublished). A total of 1,054 additional reactions and three shatter libraries were necessary to close gaps and to raise the quality of the finished sequence. Illumina reads were also used to correct potential base errors and increase consensus quality using Polisher, a software tool developed at JGI. The error rate of the completed genome sequence is less than 1 in 100,000. Together, the combination of the Illumina and 454 sequencing platforms provided 199.5 × coverage of the genome. The final assembly contained 697,305 pyrosequence and 20,331,123 Illumina reads.

Genome annotation

Genes were identified using Prodigal as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline.
The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGR-Fam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes – Expert Review (IMG-ER) platform.

Genome properties

The genome consists of a 4,888,353 bp long chromosome with a GC content of 33.8% (Table 3 and Figure 3). Of the 4,347 genes predicted, 4,285 were protein-coding genes and 62 were RNA genes; 122 pseudogenes were also identified. The majority of the protein-coding genes (59.5%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.
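The gene tallies above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted in the text; the variable and function names are illustrative, and the number of genes with a putative function is derived from the rounded 59.5% value (taken, as the sentence states, as a share of the protein-coding genes), so it is approximate.

```python
# Figures quoted in the text; names are illustrative, not from the paper.
GENOME_LENGTH_BP = 4_888_353
TOTAL_GENES = 4_347
PROTEIN_CODING = 4_285
RNA_GENES = 62
FRACTION_WITH_FUNCTION = 0.595  # rounded percentage from the text


def summarize_genome():
    """Recompute derived statistics from the quoted genome figures."""
    # Protein-coding and RNA genes should add up to the total gene count.
    tally_consistent = PROTEIN_CODING + RNA_GENES == TOTAL_GENES

    # Approximate split between annotated and hypothetical proteins.
    with_function = round(PROTEIN_CODING * FRACTION_WITH_FUNCTION)
    hypothetical = PROTEIN_CODING - with_function

    # Gene density in genes per kilobase of chromosome.
    genes_per_kb = TOTAL_GENES / (GENOME_LENGTH_BP / 1000)

    return {
        "tally_consistent": tally_consistent,
        "with_function": with_function,
        "hypothetical": hypothetical,
        "genes_per_kb": genes_per_kb,
    }


if __name__ == "__main__":
    for key, value in summarize_genome().items():
        print(f"{key}: {value}")
```

The pseudogene count is left out of the sketch because the text does not say whether the 122 pseudogenes are included in the protein-coding total.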
Table 3. Genome Statistics

Figure 3. Graphical circular map of the chromosome. From outside to the center: Genes on forward strand (colored by COG categories), Genes on reverse strand (colored by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.

Table 4. Number of genes associated with the general COG functional categories

Insights from genome sequence

A closer look at the genome sequence of strain IC166T revealed a set of genes that might be responsible for the yellow-orange color of C. algicola cells by encoding enzymes involved in the synthesis of carotenoids. Carotenoids are produced by the action of geranylgeranyl pyrophosphate synthase (Celal_1770), phytoene synthase (Celal_2446), phytoene desaturase (Celal_2447), lycopene cyclase (Celal_1771) and carotene hydroxylase (Celal_2445).
Geranylgeranyl pyrophosphate synthases start the biosynthesis of carotenoids by combining farnesyl pyrophosphate with a C5 isoprenoid unit to form the C20 molecule geranylgeranyl pyrophosphate. Phytoene synthase catalyzes the condensation of two geranylgeranyl pyrophosphate molecules, followed by the removal of diphosphate and a proton shift, leading to the formation of phytoene.
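The carbon bookkeeping behind these first two steps can be written out explicitly. The sketch below is only an illustration: the C5 and C20 counts and the locus tags come from the text, the C15 count for farnesyl pyrophosphate is standard biochemistry rather than something stated here, and all function and variable names are hypothetical.

```python
# Carbon counts of the isoprenoid building blocks.
C5_UNIT = 5        # C5 isoprenoid unit (from the text)
FARNESYL_PP = 15   # farnesyl pyrophosphate; assumed from standard biochemistry

# Enzyme and locus tag pairs, in the order the text lists them.
CAROTENOID_GENES = [
    ("geranylgeranyl pyrophosphate synthase", "Celal_1770"),
    ("phytoene synthase", "Celal_2446"),
    ("phytoene desaturase", "Celal_2447"),
    ("lycopene cyclase", "Celal_1771"),
    ("carotene hydroxylase", "Celal_2445"),
]


def ggpp_carbons():
    """Farnesyl pyrophosphate plus one C5 unit gives geranylgeranyl pyrophosphate."""
    return FARNESYL_PP + C5_UNIT


def phytoene_carbons():
    """Two geranylgeranyl pyrophosphate molecules condense into phytoene."""
    return 2 * ggpp_carbons()


def locus_tag(enzyme):
    """Look up the locus tag reported for a given enzyme name."""
    return dict(CAROTENOID_GENES)[enzyme]


if __name__ == "__main__":
    print(f"GGPP carbons: C{ggpp_carbons()}")
    print(f"Phytoene carbons: C{phytoene_carbons()}")
    for enzyme, tag in CAROTENOID_GENES:
        print(f"{tag}: {enzyme}")
```

Removing diphosphate does not change the carbon count, so the condensation of two C20 molecules yields a C40 backbone.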