Draft genome sequence of Bosea sp. WAO an arsenite and sulfide oxidizer isolated from a pyrite rock outcrop in New Jersey.
Journal: 2018/November - Standards in genomic sciences
ISSN: 1944-3277
Abstract:
This genome report describes the draft genome and physiological characteristics of Bosea sp. WAO (=DSM 102914), a novel strain of the genus Bosea in the family Bradyrhizobiaceae. Bosea sp. WAO was isolated from pulverized pyritic shale containing elevated levels of arsenic. This aerobic, Gram-negative microorganism is capable of facultative chemolithoautotrophic growth by oxidizing the electron donors arsenite, elemental sulfur, thiosulfate, polysulfide, and amorphous sulfur. The draft genome is of a single circular chromosome 6,125,776 bp long consisting of 21 scaffolds with a G + C content of 66.84%. A total of 5727 genes were predicted, of which 5665 (98.92%) are protein-coding genes and 62 are RNA genes. We identified the genes aioA and aioB, which encode the large and small subunits of the arsenite oxidase, respectively. We also identified the genes for the complete sulfur oxidation pathway (sox), which is used to oxidize thiosulfate to sulfate.
Standards in Genomic Sciences. Dec/31/2017; 13
Published online Apr/9/2018


Introduction

Bosea sp. WAO (white arsenic oxidizer) was enriched from a pulverized sample of weathered black shale obtained from an outcropping near Trenton, NJ that contained high levels of arsenic [1]. Bosea sp. WAO belongs to the class Alphaproteobacteria and the family Bradyrhizobiaceae, which currently consists of 12 genera: Bradyrhizobium, Afipia, Agromonas, Balneimonas, Blastobacter, Bosea, Nitrobacter, Oligotropha, Rhodoblastus, Rhodopseudomonas, Salinarimonas, and Tardiphaga [2]. This phenotypically diverse family is composed of microorganisms that are involved in nitrogen cycling, human diseases, phototrophy in non-sulfur environments, plant commensalism, and chemolithoautotrophic growth [2]. 16S rRNA gene analysis of the Bradyrhizobiaceae family indicates that the genus Bosea is most closely related to the genus Salinarimonas, which currently consists of two species, Salinarimonas rosea and Salinarimonas ramus [2]. Members of the genus Bosea have been isolated from a variety of environments such as soils, sediments, hospital water systems, and digester sludge [3–5]. The type strain Bosea thiooxidans BI-42T is capable of thiosulfate oxidation, and the initial genus definition included this characteristic [3]. In 2003 La Scola emended the genus description to remove thiosulfate oxidation as a key descriptor after isolation of several other Bosea spp. that were unable to oxidize thiosulfate [4]. These organisms have a very diverse metabolism, but their common characteristics include being Gram-negative, aerobic, rod shaped, motile, and heterotrophic, growing well between 25 and 35 °C, and being intolerant of salt concentrations above 6% NaCl [3–5]. Using selective enrichment and isolation techniques with arsenite [As(III)] as the sole electron donor, Bosea sp. WAO was isolated under autotrophic conditions [1]. Here we summarize the physiological features together with the draft genome sequence and data analysis of Bosea sp. WAO.

Organism information

Classification and features

The genus Bosea has nine species with validly published names isolated from various environments: B. thiooxidans BI-42T (AF508803) from agricultural soil [3]; B. eneae 34614T (AF288300), B. vestrisii 34635T (AF288306), and B. massiliensis 63287T (AF288309) from a hospital water system [4]; B. minatitlanensis AMX51T (AF273081) from anaerobic digester sludge [5]; B. lupini R-45681T (FR774992), B. lathyri R-46060T (FR774993), and B. robiniae R-46070T (FR774994) from the root nodules of legumes [6]; and B. vaviloviae Vaf-18T (KJ848741) from the root nodules of Vavilovia formosa [7]. Strain WAO’s previously published identity was confirmed using the EzTaxon server [8]. The highest 16S rRNA pairwise similarities for strain WAO were found with the type strains B. vestrisii 34635T (99.72%), B. eneae 34614T (99.65%), B. lupini R-45681T (99.65%), B. thiooxidans BI-42T (99.24%), B. robiniae R-46070T (98.88%), B. massiliensis 63287T (98.81%), B. minatitlanensis AMX51T (98.48%) and B. lathyri R-46060T (98.18%). Phylogenetic analysis based on the 16S rRNA gene of Bosea spp. and phylogenetically related organisms placed Bosea sp. WAO closest to the type strain B. lupini DSM 26673T, with B. vestrisii 34635T and B. eneae 34614T in the same cluster (Fig. 1, Table 1). An average nucleotide identity (ANI) score of 84.64% between strain WAO and B. lupini DSM 26673T was computed using IMG/ER [9]. This value is lower than the ANI species demarcation threshold range (95–96%) [10]. To further identify Bosea sp. WAO to the species level, phylogenetic trees based on the housekeeping genes atpD, dnaK, recA, gyrB and rpoB were produced from available Bosea and related Bradyrhizobiaceae type strains using MEGA7 (Figs. 2, 3, 4, 5, 6 and 7). Strain WAO did not consistently group with any of the type strains for all five genes, further suggesting that it is a separate species. The ability of B. lupini to oxidize thiosulfate has not been determined [6]; however, B. vestrisii, B. eneae, and B. massiliensis do not oxidize thiosulfate to sulfate [4]. These results suggest that strain WAO represents a distinct species in the genus Bosea.
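
The ANI comparison described above reduces to a simple threshold test. The following minimal Python sketch illustrates it using only the values reported in the text (84.64% for strain WAO versus B. lupini DSM 26673T, and the 95–96% demarcation range [10]). The function name classify_ani and its three labels are illustrative assumptions and are not part of IMG/ER or any published tool.

```python
# Threshold range for species demarcation by ANI, in percent, as cited in the text [10].
ANI_SPECIES_THRESHOLD_RANGE = (95.0, 96.0)


def classify_ani(ani_percent: float,
                 threshold_range: tuple[float, float] = ANI_SPECIES_THRESHOLD_RANGE) -> str:
    """Place an ANI value relative to the demarcation range (hypothetical helper)."""
    low, high = threshold_range
    if ani_percent < low:
        return "below threshold range"
    if ani_percent > high:
        return "above threshold range"
    return "within threshold range"


# ANI between strain WAO and B. lupini DSM 26673T, as reported in the text.
wao_vs_lupini = 84.64
print(classify_ani(wao_vs_lupini))
```

A value below the lower bound is consistent with, but does not by itself establish, separate species status, which is why the housekeeping-gene trees are considered as well.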

Fig. 1

Molecular Phylogenetic analysis by Maximum Likelihood method of the 16S rRNA gene. A phylogenetic tree highlighting the position of Bosea sp. WAO relative to the other Bosea spp. based on the 16S rRNA gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model [19]. The tree with the highest log likelihood (− 4792.5378) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 19 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 1376 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 [20]. Type strains are indicated with a superscript T

Table 1

Classification and general features of Bosea sp. WAO [22]

| MIGS ID | Property | Term | Evidence code^a |
|---|---|---|---|
| | Classification | Domain Bacteria | TAS [23, 24] |
| | | Phylum Proteobacteria | TAS [25] |
| | | Class Alphaproteobacteria | TAS [26, 27] |
| | | Order Rhizobiales | TAS [27, 28] |
| | | Family Bradyrhizobiaceae | TAS [27, 29] |
| | | Genus Bosea | TAS [3, 30] |
| | | Species Bosea sp. | TAS [24] |
| | | Strain: WAO (DSM 102914) | TAS [1] |
| | Gram stain | Negative | IDA |
| | Cell shape | Rod | TAS [1] |
| | Motility | Motile | TAS [1] |
| | Sporulation | Not reported | NAS |
| | Temperature range | Mesophile | IDA |
| | Optimum temperature | 25–30 °C | IDA |
| | pH range; Optimum | 6–9; 8 | IDA |
| | Carbon source | D-glucose, lactose, acetate, bicarbonate | TAS [1] |
| MIGS-6 | Habitat | Terrestrial, Black shale | TAS [1] |
| MIGS-6.3 | Salinity | No growth with > 3.5% NaCl (w/v) | IDA |
| MIGS-22 | Oxygen requirement | Aerobic | TAS [1] |
| MIGS-15 | Biotic relationship | free-living | TAS [1] |
| MIGS-14 | Pathogenicity | Not reported | NAS |
| MIGS-4 | Geographic location | Lockatong formation, New Jersey, USA | TAS [1] |
| MIGS-5 | Sample collection | 2005 | IDA |
| MIGS-4.1 | Latitude | 40.289329 | IDA |
| MIGS-4.2 | Longitude | −74.814366 | IDA |
| MIGS-4.4 | Altitude | 60 m | IDA |

^a Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [31]

Fig. 2

Molecular Phylogenetic analysis by Maximum Likelihood method of aligned concatenated atpD, dnaK, gyrB, recA, and rpoB. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model [19]. The tree with the highest log likelihood (− 13842.8588) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 16 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 1413 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 [20]

Fig. 3

Molecular Phylogenetic analysis by Maximum Likelihood method of the atpD gene. A phylogenetic tree highlighting the position of Bosea sp. WAO relative to the other Bosea spp. and related organisms based on the atpD gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model [19]. The tree with the highest log likelihood (− 2412.0185) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 18 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 361 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 [20]. Type strains are indicated with a superscript T

Fig. 4

Molecular Phylogenetic analysis by Maximum Likelihood method of the dnaK gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model [19]. The tree with the highest log likelihood (− 613.9292) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 18 nucleotide sequences. Codon positions included were 1st + 2nd + 3rd + Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 103 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 [20]. Type strains are indicated with a superscript T

Fig. 5

Molecular Phylogenetic analysis by Maximum Likelihood method of the gyrB gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model [19]. The tree with the highest log likelihood (− 4279.1901) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 18 nucleotide sequences. Codon positions included were 1st + 2nd + 3rd + Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 508 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 [20]. Type strains are indicated with a superscript T

Fig. 6

Molecular Phylogenetic analysis by Maximum Likelihood method of the recA gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model [19]. The tree with the highest log likelihood (− 1263.1252) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 17 nucleotide sequences. Codon positions included were 1st + 2nd + 3rd + Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 190 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 [20]. Type strains are indicated with a superscript T

Fig. 7

Molecular Phylogenetic analysis by Maximum Likelihood method of the rpoB gene. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model [19]. The tree with the highest log likelihood (− 419.8311) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 18 nucleotide sequences. Codon positions included were 1st + 2nd + 3rd + Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 76 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 [20]. Type strains are indicated with a superscript T

Extended feature descriptions

Bosea sp. WAO cells are Gram-negative, aerobic, motile, and rod shaped. Colonies on trypticase soy agar are smooth, mucoid, round, convex, and beige, with a diameter as large as 10 mm after 2 weeks at 30 °C. Colonies on minimal salts medium supplemented with 5 mM sodium thiosulfate are smooth, round, and white, and only grow to a diameter of 2 mm after 2 weeks at 30 °C. Optimal growth occurs at 25 to 30 °C; growth occurs from pH 6 to 9, with an optimum at pH 8 (Table 1). Growth did not occur at salinity > 3.5% w/v of NaCl. Cells grow either freely floating or attached to a mineral surface, as shown in Fig. 8.

Fig. 8

Confocal microscopy of Bosea sp. WAO. Bosea sp. WAO (green) was stained with DAPI and imaged growing on the surface of a cadmium sulfide particle (faint white/grey) in a mostly black background

Strain WAO is a strict aerobe that can grow heterotrophically on acetate, glucose, and lactate in addition to autotrophically on carbon dioxide with the electron donors arsenite, thiosulfate, polysulfide, and elemental sulfur. The organism is also able to grow on the mineral arsenopyrite (FeAsS) by oxidizing both the arsenic and sulfur to produce sulfate and arsenate. No growth was observed under aerobic conditions with the aromatic compounds phenol, benzoate or ferulic acid or with the electron donors sulfite, ammonium, nitrite, selenite, or chromium(III). This organism was enriched from pulverized black shale that contained high levels of arsenic. The initial enrichment cultures using the shale material were amended with 5 mM arsenite and then serially diluted until purity was obtained [1].

Genome sequencing information

Genome project history

Bosea sp. WAO was selected for sequencing based on the organism’s ability to grow both heterotrophically and chemolithoautotrophically with arsenite and reduced sulfur compounds. Sequencing and assembly were completed at the Rutgers School of Environmental and Biological Sciences Genome Cooperative. A paired-end library was constructed using an Illumina Nextera Kit and sequenced using an Illumina Genome Analyzer IIX (Illumina Inc., San Diego, CA). The sequence assembly was performed using CLC Genomics Workbench 5.1 (CLC Bio, Cambridge, MA). The draft genome was submitted to NCBI Whole Genome Shotgun (WGS) and to the JGI Integrated Microbial Genomes/Expert Review (IMG/ER). A summary of the project is shown in Table 2.

Table 2
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Draft |
| MIGS-28 | Libraries used | One paired-end |
| MIGS-29 | Sequencing platforms | Illumina Genome Analyzer IIX |
| MIGS-31.2 | Fold coverage | 240× |
| MIGS-30 | Assemblers | CLC Genomics Workbench 5.1 |
| MIGS-32 | Gene calling method | Glimmer |
| | Locus Tag | DK26 |
| | GenBank ID | JXTJ00000000 |
| | GenBank Date of Release | January 8, 2016 |
| | GOLD ID | Gp0113237 |
| | BIOPROJECT | PRJNA243637 |
| MIGS-13 | Source Material Identifier | DSM 102914 |
| | Project relevance | Environmental, biogeochemical cycling of arsenic and sulfur |

Growth conditions and genomic DNA preparation

A culture of Bosea sp. WAO (GenBank: DQ986321.1, DSM 102914) was grown in dilute (50% of normal strength) trypticase soy broth amended with 5 mM sodium arsenite and 5 mM sodium thiosulfate and incubated at 30 °C on an orbital shaker for maximum oxygen exchange. Once the culture was turbid, genomic DNA was extracted using the MoBio PowerSoil Kit following the manufacturer’s directions, with the modification that DNA was eluted into 100 µL of water instead of buffer.

Genome sequencing and assembly

A paired-end library was constructed using an Illumina Nextera Kit and sequenced using an Illumina Genome Analyzer IIX (Illumina Inc., San Diego, CA). The sequence assembly was performed using CLC Genomics Workbench 5.1 (CLC Bio, Cambridge, MA). An average coverage of 240× and a mean read length of 106 bp were obtained. The genome was assembled into 42 contigs with no additional gap closures.
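
The reported coverage and mean read length allow a rough back-calculation of the sequencing effort. The sketch below assumes the standard relation coverage = (number of reads × mean read length) / genome size and uses the draft assembly length of 6,125,776 bp. The result is only an estimate, because the actual read count is not reported here, and the function name is hypothetical.

```python
def estimate_read_count(coverage: float, genome_size_bp: int,
                        mean_read_length_bp: float) -> float:
    """Estimate the number of reads from coverage = reads * read_length / genome_size."""
    return coverage * genome_size_bp / mean_read_length_bp


GENOME_SIZE_BP = 6_125_776   # draft genome length reported in this article
COVERAGE = 240               # average fold coverage
MEAN_READ_LENGTH_BP = 106    # mean read length

reads = estimate_read_count(COVERAGE, GENOME_SIZE_BP, MEAN_READ_LENGTH_BP)
print(f"Estimated reads: about {reads / 1e6:.1f} million")
```

The estimate treats every sequenced base as contributing to coverage, so reads discarded during quality trimming or left unassembled would make the true number of raw reads somewhat higher.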

Genome annotation

Genes were identified using the standard operating procedures of the DOE-JGI Microbial Genome Annotation pipeline [9] and the RAST server (Rapid Annotation using Subsystem Technology) [11, 12]. JGI-IMG/ER was used to obtain COG identities and overall statistics of the genome. RAST was used to identify functional genes of interest involved in sulfur and arsenic metabolism.

Genome properties

The draft genome is 6,125,776 bp with 66.84% G + C content. There are 62 RNA genes: one each of 5S rRNA, 16S rRNA, and 23S rRNA, 46 tRNA, and 13 unclassified RNA (Table 3). Of the predicted 5727 genes, 5665 (98.92%) are protein-coding genes, with 82.77% assigned a predicted function. The draft genome contains no identified pseudogenes. Of the protein-coding genes, 4193 were sorted into COG functional categories. The COG categories are broken down in Table 4. COG analysis assigned a large number of genes to amino acid transport and metabolism (13.76%), transcription (8.13%), inorganic ion transport and metabolism (8.06%), and energy production and conversion (6.97%). Bosea sp. WAO has 53 genes encoding cytochromes alone. RAST subsystem analysis placed 44% of the protein-coding genes into subsystem categories, with the largest percentage assigned to amino acids and derivatives. The genome sequence was deposited in GenBank under accession number JXTJ00000000.
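
The percentages quoted for the gene counts can be re-derived from the raw numbers. The short Python sketch below is an illustrative consistency check, not part of the annotation pipeline: it confirms that the RNA gene classes sum to 62, that protein-coding plus RNA genes account for all 5727 predicted genes, and it recomputes each share of the total. The variable and function names are assumptions made for this example.

```python
TOTAL_GENES = 5727
PROTEIN_CODING = 5665
COG_ASSIGNED = 4193
RNA_GENES = {
    "5S rRNA": 1,
    "16S rRNA": 1,
    "23S rRNA": 1,
    "tRNA": 46,
    "unclassified RNA": 13,
}


def percent_of_total(count: int, total: int = TOTAL_GENES) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


rna_total = sum(RNA_GENES.values())
# Protein-coding and RNA genes together should account for every predicted gene.
assert rna_total + PROTEIN_CODING == TOTAL_GENES

print("RNA genes:", rna_total, f"({percent_of_total(rna_total)}%)")
print("Protein-coding genes:", PROTEIN_CODING, f"({percent_of_total(PROTEIN_CODING)}%)")
print("Genes assigned to COGs:", COG_ASSIGNED, f"({percent_of_total(COG_ASSIGNED)}%)")
```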

Table 3
Genome statistics
| Attribute | Value | % of total |
|---|---|---|
| Genome size (bp) | 6,125,776 | 100.00 |
| DNA coding (bp) | 5,469,601 | 89.29 |
| DNA G + C (bp) | 4,094,621 | 66.84 |
| DNA scaffolds | 42 | 100.00 |
| Total genes | 5727 | 100.00 |
| Protein coding genes | 5665 | 98.92 |
| RNA genes | 62 | 1.08 |
| Pseudo genes | 0 | 0 |
| Genes in internal clusters | | |
| Genes with function prediction | 4740 | 82.77 |
| Genes assigned to COGs | 4193 | 73.21 |
| Genes with Pfam domains | 4837 | 84.46 |
| Genes with signal peptides | 635 | 11.09 |
| Genes with transmembrane helices | 1391 | 24.29 |
| CRISPR repeats | 0 | 0 |

Table 4
Number of genes associated with general COG functional categories
| Code | Value | Percentage | Description |
|---|---|---|---|
| J | 212 | 4.44 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0 | RNA processing and modification |
| K | 388 | 8.13 | Transcription |
| L | 112 | 2.35 | Replication, recombination and repair |
| B | 3 | 0.06 | Chromatin structure and dynamics |
| D | 29 | 0.61 | Cell cycle control, cell division, chromosome partitioning |
| V | 109 | 2.28 | Defense mechanisms |
| T | 213 | 4.46 | Signal transduction mechanisms |
| M | 239 | 5.01 | Cell wall/membrane biogenesis |
| N | 75 | 1.57 | Cell motility |
| U | 60 | 1.26 | Intracellular trafficking and secretion |
| O | 175 | 3.66 | Posttranslational modification, protein turnover, chaperones |
| C | 333 | 6.97 | Energy production and conversion |
| G | 276 | 5.78 | Carbohydrate transport and metabolism |
| E | 657 | 13.76 | Amino acid transport and metabolism |
| F | 102 | 2.14 | Nucleotide transport and metabolism |
| H | 233 | 4.88 | Coenzyme transport and metabolism |
| I | 249 | 5.21 | Lipid transport and metabolism |
| P | 385 | 8.06 | Inorganic ion transport and metabolism |
| Q | 155 | 3.25 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 470 | 9.84 | General function prediction only |
| S | 268 | 5.61 | Function unknown |
| | 1534 | 26.79 | Not in COGs |

The total is based on the total number of protein coding genes in the genome

Extended insights

Ten other genome sequences of Bosea spp. are publicly available, of which four are validly named and characterized to the species level (B. thiooxidans CGMCC 9174 V5_1, B. lathyri DSM 26656T, B. lupini DSM 26673T, and B. vaviloviae strain SD260) and six are uncharacterized (Bosea sp. 117, Bosea sp. UNC402CLCol, Bosea sp. LC85, Bosea sp. OK403, Bosea sp. AAP35, and Bosea sp. AAP25). Only B. thiooxidans CGMCC 9174 V5_1 and B. vaviloviae strain SD260 are complete genomes. Table 5 details the basic characteristics of the ten genomes. The genomes range in size from 4.4 Mb to 6.6 Mb, in G + C content from 64 to 68%, and in predicted gene number from 3984 to 6267. For Bosea sp. WAO, the genome size (6.1 Mb), number of predicted genes (5727), number of genes with a predicted function (4570), and number placed in COGs (4193) are all higher than the average for the draft genomes. However, the percentage of genes with a functional prediction (79.8%) and the percentage in COGs (73.2%) are both similar to the average values for the draft genomes. B. thiooxidans CGMCC 9174 V5_1, B. vaviloviae strain SD260, Bosea sp. 117 and Bosea sp. UNC402CLCol contain pseudogenes. None of the genomes in the IMG database have been finished; their scaffold numbers range between 16 and 72.

Table 5

Comparison of basic genome features of Bosea spp.

| Genome Name | Status^a | Genome Size (Mbp) | G + C Content (%) | Gene Count | No. of protein coding genes w/ function prediction | Percentage (%) | No. of protein coding genes in COGs | Percentage (%) | IMG Taxon ID |
|---|---|---|---|---|---|---|---|---|---|
| Bosea sp. WAO [1] | D | 6.12 | 67 | 5727 | 4570 | 79.8 | 4193 | 73.2 | 2615840542 |
| Bosea sp. LC85 [32] | PD | 6.56 | 65 | 6267 | 4975 | 79.4 | 4548 | 72.6 | 2609460206 |
| Bosea sp. UNC402CLCol | PD | 5.61 | 67 | 5389 | 4375 | 81.1 | 4067 | 75.5 | 2579779168 |
| Bosea lupini DSM 26673 [6] | D | 6.13 | 67 | 5985 | 4752 | 79.4 | 4396 | 73.4 | 2634166302 |
| Bosea sp. OK403 | D | 6.64 | 65 | 6099 | 5066 | 83.1 | 4396 | 77.0 | 2609459641 |
| Bosea sp. AAP25 | PD | 4.14 | 64 | 3984 | 3023 | 75.9 | 2651 | 66.5 | 2636415410 |
| Bosea lathyri DSM 26656 [21] | D | 5.91 | 65 | 5559 | 4476 | 80.5 | 4120 | 74.1 | 2622736433 |
| Bosea sp. 117 | PD | 4.63 | 68 | 4344 | 3639 | 83.8 | 3366 | 77.5 | 2562617052 |
| Bosea sp. AAP35 | PD | 4.46 | 66 | 4298 | 3435 | 79.9 | 3144 | 73.1 | 2636415883 |
| Bosea vaviloviae strain SD260 | F | 5.60 | 66 | 5487 | 4260 | 77.6 | 3839 | 69.9 | 2654587694 |
| Bosea thiooxidans CGMCC 9174 V5_1 | F | 5.46 | 67 | 5176 | N/A | | | | |

These data were obtained from the IMG/ER platform [9] and NCBI genomes

^a Status: D draft, PD permanent draft, F finished
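
The comparison of strain WAO against the draft-genome averages can be reproduced from the Table 5 values. The Python sketch below is an illustration only: it copies the status, genome size and gene count columns from the table and assumes that "draft genomes" means all entries with status D or PD, including strain WAO itself. That grouping and the variable names are assumptions made for this example.

```python
from statistics import mean

# (name, status, genome size in Mbp, gene count), copied from Table 5.
GENOMES = [
    ("Bosea sp. WAO", "D", 6.12, 5727),
    ("Bosea sp. LC85", "PD", 6.56, 6267),
    ("Bosea sp. UNC402CLCol", "PD", 5.61, 5389),
    ("Bosea lupini DSM 26673", "D", 6.13, 5985),
    ("Bosea sp. OK403", "D", 6.64, 6099),
    ("Bosea sp. AAP25", "PD", 4.14, 3984),
    ("Bosea lathyri DSM 26656", "D", 5.91, 5559),
    ("Bosea sp. 117", "PD", 4.63, 4344),
    ("Bosea sp. AAP35", "PD", 4.46, 4298),
    ("Bosea vaviloviae strain SD260", "F", 5.60, 5487),
    ("Bosea thiooxidans CGMCC 9174 V5_1", "F", 5.46, 5176),
]

# Assumption: "draft" covers both draft (D) and permanent draft (PD) entries.
drafts = [g for g in GENOMES if g[1] in ("D", "PD")]
mean_size_mbp = mean(g[2] for g in drafts)
mean_gene_count = mean(g[3] for g in drafts)

wao = next(g for g in GENOMES if g[0] == "Bosea sp. WAO")
print(f"Draft mean size: {mean_size_mbp:.2f} Mbp; WAO: {wao[2]:.2f} Mbp")
print(f"Draft mean gene count: {mean_gene_count:.0f}; WAO: {wao[3]}")
```

Under these assumptions both the genome size and the gene count of strain WAO lie above the draft-genome mean, in line with the statement in the text.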

Arsenite oxidation

Bosea sp. WAO is able to grow under chemolithoautotrophic conditions with arsenite in addition to growing under heterotrophic conditions. Metabolic studies indicated that the organism was able to stoichiometrically oxidize the electron donor As(III) to As(V). Aerobic arsenite oxidation is carried out by the products of the aio genes, a name adopted to replace aso, aro and aox, which were formerly used to identify these genes in different organisms and caused confusion [13]. aioA encodes a large molybdopterin-containing subunit with a guanosine dinucleotide at the active site, and aioB encodes a small Rieske subunit [13–15]. This pathway has a two-component regulatory system that includes a sensor histidine kinase encoded by aioS (aoxS, aroS) and a transcriptional regulator encoded by aioR (aoxR, aroR) [13–15]. For the initial publication of Bosea sp. WAO, only the large subunit gene of the arsenite oxidation pathway, aioA (EF015463), was amplified by traditional PCR [1, 16]. Analysis of the genome herein revealed that the arsenite oxidation pathway is complete: Bosea sp. WAO possesses the small subunit gene aioB, the previously reported large subunit gene aioA, and the remaining genes in the pathway. Of the available genomes, only those of Bosea sp. WAO and Bosea sp. 117 contain both the large and small arsenite oxidase subunits, with an amino acid similarity of 78% for AioA and 73% for AioB. The genes within the arsenite oxidation operon are in the same order in both strains (Fig. 9). The operon begins with a sensor histidine kinase, aioS, followed by a transcriptional response regulator, aioR, and then aioB, followed by aioA.

Fig. 9

Operon structure for arsenite oxidation viewed in the 5′-3′ direction on the plus strand. The gene order is the same in both Bosea sp. WAO and Bosea sp. 117, with a sensor histidine kinase, aioS, then a transcriptional response regulator, aioR, followed by the aioB and aioA genes

Reduced sulfur compound oxidation

Bosea sp. WAO is also able to grow under chemolithoautotrophic conditions with thiosulfate, polysulfide, and elemental sulfur. Metabolic studies indicated that the organism is able to stoichiometrically oxidize the electron donor S₂O₃²⁻ to SO₄²⁻. The sox gene cluster consists of seven essential genes, soxXYZABCD, that code for proteins required for direct oxidation from sulfide to sulfate in vivo [17]. The genome analysis indicated that strain WAO possesses all the genes necessary for the sulfur oxidation pathway. KEGG analysis indicated that the genes coding for the enzymes SoxB, SoxX, SoxY, SoxA, SoxC, and SoxD are all present, allowing complete oxidation of S₂O₃²⁻ to SO₄²⁻. Bosea sp. WAO, B. thiooxidans CGMCC 9174 V5_1, Bosea sp. 117, Bosea sp. LC85, and B. lupini all contain the complete sox system. For the four genomes available in IMG, the overall gene order in the operons is the same; however, Bosea sp. WAO and B. lupini have soxA and soxX on the plus strand and soxY, soxZ, soxB, soxC, and soxD on the minus strand, whereas Bosea sp. 117 and Bosea sp. LC85 have the genes on the opposite strands, with soxY, soxZ, soxB, soxC, and soxD on the plus strand and soxA and soxX on the minus strand (Fig. 10). Comparison of the translated nucleotide sequence of soxB from Bosea sp. WAO with the translated soxB of the other four organisms showed that the protein sequence is 90% similar to that of Bosea sp. LC85, 88% similar to those of B. lupini and B. thiooxidans CGMCC 9174 V5_1, and 70% similar to that of Bosea sp. 117. The presence of all the genes in the same order suggests that other strains, in addition to the experimentally confirmed Bosea thiooxidans BI-42T, may be able to perform thiosulfate oxidation.

Fig. 10

Operon structure for the sox genes for thiosulfate oxidation. The orientation is 5′-3′ with the plus strand on top. Bosea sp. WAO and Bosea lupini share one gene orientation, and Bosea sp. LC85 and Bosea sp. 117 share the other; the operons of the two pairs are inverted between the plus and minus strands
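
The strand relationship between the two sox arrangements can be expressed as a small data-structure check. The Python sketch below is illustrative only: it encodes the strand assignments described in the text for Bosea sp. WAO and B. lupini on one hand and for Bosea sp. 117 and Bosea sp. LC85 on the other, and shows that swapping plus and minus strands converts one layout into the other. The dictionary and function names are assumptions made for this example, and no genomic coordinates are implied.

```python
# Strand of each sox gene in Bosea sp. WAO and B. lupini, as described in the text.
SOX_STRANDS_WAO = {
    "soxA": "+", "soxX": "+",
    "soxY": "-", "soxZ": "-", "soxB": "-", "soxC": "-", "soxD": "-",
}

# Strand of each sox gene in Bosea sp. 117 and Bosea sp. LC85, as described in the text.
SOX_STRANDS_117 = {
    "soxY": "+", "soxZ": "+", "soxB": "+", "soxC": "+", "soxD": "+",
    "soxA": "-", "soxX": "-",
}


def flip_strands(layout: dict[str, str]) -> dict[str, str]:
    """Return the layout as seen from the opposite strand (plus and minus swapped)."""
    swap = {"+": "-", "-": "+"}
    return {gene: swap[strand] for gene, strand in layout.items()}


# The two arrangements differ only by an inversion between the strands.
print(flip_strands(SOX_STRANDS_WAO) == SOX_STRANDS_117)
```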

Additional metabolic pathways

The Calvin Cycle consists of 13 enzymatic reactions, with the enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) responsible for the carbon fixation step [18]. For the initial publication of Bosea sp. WAO, the type II ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) gene was amplified by traditional PCR [1, 16]. Analysis for the remaining genes of the Calvin-Benson-Bassham Cycle indicated that all the other genes required for carbon fixation are present. Nine of the available genomes have a match for strain WAO’s ribulose-1,5-bisphosphate carboxylase amino acid sequence: B. thiooxidans CGMCC 9174 V5_1 (85%), B. lathyri DSM 26656T (86%), B. lupini DSM 26673T (82%), B. vaviloviae strain SD260 (85%), Bosea sp. 117 (72%), Bosea sp. UNC402CLCol (85%), Bosea sp. LC85 (84%), Bosea sp. OK403 (87%), and Bosea sp. AAP35 (84%). Since RuBisCO is considered a biomarker for the Calvin Cycle, this suggests that carbon fixation may be widespread in this genus despite the limited experimental evidence.

Additional KEGG analysis indicated incomplete pathways for nitrogen reduction. Bosea sp. WAO possesses some genes for each of the reductive pathways, but each is incomplete, supporting the observation that no growth occurred when nitrate was provided as an electron acceptor. No genes involved in ammonia oxidation were identified, again supporting the absence of growth when cultivated under those conditions [1]. Using IMG/ER Pipeline analysis, Bosea sp. WAO was determined to be prototrophic for L-aspartate, L-glutamate, and glycine; auxotrophic for L-lysine, L-alanine, L-phenylalanine, L-tyrosine, L-tryptophan, L-histidine, L-arginine, L-isoleucine, L-leucine, and L-valine; and not able to synthesize selenocysteine or biotin, based on the draft of the genome [9]. According to the SEED viewer, Bosea sp. WAO has complete pathways for the tricarboxylic acid cycle, pentose phosphate pathway, acetyl-CoA acetogenesis pathway, methylglyoxal metabolism, dihydroxyacetone kinases, catechol branch of the beta-ketoadipate pathway, glycerol and glycerol-3-phosphate uptake and utilization, D-ribose utilization, deoxyribose and deoxynucleoside catabolism, and lactate utilization.

Conclusions

Bosea sp. WAO is able to grow chemolithoautotrophically on both arsenite and reduced sulfur compounds. It was originally enriched from pyritic shale obtained from a rock outcropping containing arsenic in the Lockatong geological formation in the Newark Basin near Trenton, New Jersey [1]. The draft genome is 6.1 Mbp with a G + C content of 66.84%. COG analysis for Bosea sp. WAO assigned a large number of genes to amino acid transport and metabolism (13.76%), transcription (8.13%), inorganic ion transport and metabolism (8.06%), and energy production and conversion (6.97%). Bosea sp. WAO has 53 genes encoding cytochromes alone. Strain WAO is able to engage in the oxidative part of biogeochemical cycling and grow autotrophically when nutrient conditions are low. When conditions favor heterotrophic growth, however, the organism is able to rapidly increase in biomass and maintain its population under the varying conditions that are expected to prevail at an oxic mineral surface.

Acknowledgements

Technical support was provided by the Rutgers School of Environmental and Biological Sciences Genome Cooperative.

Funding

This work was supported in part by the New Jersey Water Resources Research Institute at Rutgers University.

Authors’ contributions

ABW performed the laboratory experiments, analysed the assembled genome sequence data and wrote the draft manuscript. NY and LYY participated in the design of the study and revised the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Rhine ED, Onesios KM, Serfes ME, Reinfelder JR, Young LY. Arsenic transformation and mobilization from minerals by the arsenite oxidizing strain WAO. Environ Sci Technol. 2008;42:1423–1429.
  • 2. de Souza JAM, Carrareto Alves LM, de Mello Varani A, de Macedo Lemos E. The family Bradyrhizobiaceae. In: Rosenberg E, DeLong EF, Lory S, Stackebrandt E, Thompson F, editors. The prokaryotes. Berlin Heidelberg: Springer; 2014. p. 135–154.
  • 3. Das SK, Mishra AK, Tindall B, Rainey FA. Oxidation of thiosulfate by a new bacterium, Bosea thiooxidans (strain BI-42) gen. nov., sp. nov.: analysis of phylogeny based on chemotaxonomy and 16S ribosomal DNA sequencing. Int J Syst Bacteriol. 1996;46:981–987.
  • 4. La Scola B, Mallet M, Grimont PAD, Raoult D. Bosea eneae sp. nov., Bosea massiliensis sp. nov. and Bosea vestrisii sp. nov., isolated from hospital water supplies, and emendation of the genus Bosea (Das et al. 1996). Int J Syst Evol Microbiol. 2003;53:15–20.
  • 5. Ouattara AS, Assih EA, Thierry S, Cayol J, Labat M, Monroy O. Bosea minatitlanensis sp. nov., a strictly aerobic bacterium isolated from an anaerobic digester. Int J Syst Evol Microbiol. 2003;53:1247–1251.
  • 6. De Meyer SE, Willems A. Multilocus sequence analysis of Bosea species and description of Bosea lupini sp. nov., Bosea lathyri sp. nov. and Bosea robiniae sp. nov., isolated from legumes. Int J Syst Evol Microbiol. 2012;62:2505–2510.
  • 7. Safronova VI, Kuznetsova IG, Sazanova AL, Kimeklis AK, Belimov AA, Andronov EE. Bosea vaviloviae sp. nov., a new species of slow-growing rhizobia isolated from nodules of the relict species Vavilovia formosa (Stev.) Fed. Antonie Van Leeuwenhoek J Microb. 2015;107:911–920.
  • 8. Kim OS, Cho YJ, Lee K, Yoon SH, Kim M, Na H. Introducing EzTaxon-e: a prokaryotic 16S rRNA gene sequence database with phylotypes that represent uncultured species. Int J Syst Evol Microbiol. 2012;62:716–721.
  • 9. Markowitz VM, Chen IA, Palaniappan K, Chu K, Szeto E, Grechkin Y. IMG: the integrated microbial genomes database and comparative analysis system. Nucleic Acids Res. 2012;40:D115–D122.
  • 10. Kim M, Oh H, Park S, Chun J. Towards a taxonomic coherence between average nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes. Int J Syst Evol Microbiol. 2014;64:346–351.
  • 11. Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang H, Cohoon M. The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res. 2005;33:5691–5702.
  • 12. Aziz R, Bartels D, Best A, DeJongh M, Disz T, Edwards R. The RAST server: rapid annotations using subsystems technology. BMC Genomics. 2008;9:75.
  • 13. Lett M, Muller D, Lièvremont D, Silver S, Santini J. Unified nomenclature for genes involved in prokaryotic aerobic arsenite oxidation. J Bacteriol. 2012;194:207–208.
  • 14. Silver S, Phung LT. Genes and enzymes involved in bacterial oxidation and reduction of inorganic arsenic. Appl Environ Microbiol. 2005;71:599–608.
  • 15. Amend JP, Saltikov C, Lu G, Hernandez J. Microbial arsenic metabolism and reaction energetics. Rev Mineral Geochem. 2014;79:391–433.
  • 16. Rhine ED, Ní Chadhain SM, Zylstra GJ, Young LY. The arsenite oxidase genes (aroAB) in novel chemoautotrophic arsenite oxidizers. Biochem Biophys Res Commun. 2007;354:662–667.
  • 17. Friedrich CG, Rother D, Bardischewsky F, Quentmeier A, Fischer J. Oxidation of reduced inorganic sulfur compounds by bacteria: emergence of a common mechanism? Appl Environ Microbiol. 2001;67:2873–2882.
  • 18. Shively JM, van Keulen G, Meijer WG. Something from almost nothing: carbon dioxide fixation in chemoautotrophs. Annu Rev Microbiol. 1998;52:191–230.
  • 19. Tamura K, Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol. 1993;10:512–526.
  • 21. Meyer B, Imhoff JF, Kuever J. Molecular analysis of the distribution and phylogeny of the soxB gene among sulfur-oxidizing bacteria - evolution of the sox sulfur oxidation enzyme system. Environ Microbiol. 2007;9:2957–2977.
  • 22. Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol. 2008;26:541–547.
  • 23. Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains archaea, bacteria, and eucarya. Proc Natl Acad Sci. 1990;87:4576–4579.
  • 24. Wang Q, Garrity GM, Tiedje JM, Cole JR. Naïve Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007;73:5261–5267.
  • 25. Garrity GM, Bell J, Lilburn T. Phylum XIV. Proteobacteria phyl. nov. Bergey’s manual of systematic bacteriology. 2005. p. 1.
  • 26. Garrity GM, Bell JA, Lilburn T. Class I. Alphaproteobacteria class. nov. In: Garrity G, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s manual® of systematic bacteriology. US: Springer; 2005. p. 1–574.
  • 27. List Editor. Validation List No. 107. List of new names and new combinations previously effectively, but not validly, published. 2006. p. 1–6.
  • 28. Kuykendall LD. Order VI. Rhizobiales ord. nov. In: Garrity G, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s manual® of systematic bacteriology. US: Springer; 2005. p. 324.
  • 29. Garrity GM, Bell JA, Lilburn T. Family VII. Bradyrhizobiaceae fam. nov. In: Garrity G, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s manual® of systematic bacteriology. US: Springer; 2005. p. 438.
  • 30. Das SK. Genus V. Bosea Das, Mishra, Tindall, Rainey and Stackebrandt 1996, 985VP. In: Garrity G, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s manual® of systematic bacteriology. US: Springer; 2005. p. 459–461.
  • 31. Gene Ontology Consortium. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 2004;32:D258–D261.