Some of the sensor domains identified, such as MASE, CHASE,
CACHE and the CSS-motif, have not been well characterized to date. In contrast to other well-studied microorganisms, such as C. crescentus and P. aeruginosa, no REC domains were identified. The phylogenetic analysis also indicated similarity with GGDEF proteins from other bacteria, which raises questions regarding the origin and distribution of these copies among multiple bacterial species. This analysis therefore shows parallels and differences with other bacteria, and reveals multiple proteins with diverse domain architectures, which is indicative of a complex c-di-GMP network in K. pneumoniae. Future studies focused on the function of many of these DGC and PDE proteins might shed light on the processes involving growth and survival of this bacterium in different environmental settings.

Methods

The analysis was carried out with the following genomes: K. pneumoniae Kp342, K. pneumoniae MGH 78578 and K. pneumoniae NTUH-K2044 (GenBank NC_011283, NC_009648 and NC_012731, respectively). Genes coding for proteins with the GG(D/E)EF and E(A/V)L sequence motifs were identified with PSI-BLAST [38] using reference sequences available at NCBI Gene Entrez [39] [See
Additional file 1], against the three K. pneumoniae genomes. Input sensory domains were identified using the databases CDD at the NCBI [40], InterProScan [41], Pfam [42] and SMART [43]. Transmembrane segments were identified using SMART and SOSUIsignal [43, 44], and the presence and localization of signal peptides were predicted using the SignalP 3.0 Server and SOSUIsignal [44, 45]. Multiple alignments were done with the program MUSCLE [46] to identify the I site in each of the K. pneumoniae GGDEF
domain proteins. Finally, the Genomic BLAST database from NCBI [38] was used to identify homologous GGDEF/EAL proteins in these three genomes. For all homologous proteins, BLASTP was performed and the following parameters were considered: E-value greater than 10⁻⁶, identity percentage less than 85% and query coverage greater than 95%. Each homologous protein obtained was validated by random shuffling with PRSS/PRFX, using 500 shuffles [47]. The phylogenetic reconstruction was done with MEGA 5.05 [48], using 73 amino acid sequences and the neighbor-joining method with 1000 bootstrap replicates. Sequences from other families of Bacteria were selected from the Signaling Census database [20]. The sequence logos were generated using WebLogo 3.0 [49]. For DGCs we used an alignment of 9 DGC sequences [GenBank: YP_653766.1, YP_002517919.1, YP_258266.1, NP_252391.1, YP_631414.1, YP_471572.1, NP_459380.1, NP_463410.1, NP_416465.2] and 40 K. pneumoniae single-domain DGCs identified here. The logo for the PDE domain was generated from an alignment of 7 PDE sequences [GenBank: AAC23902.1, AAC76550.2, ABJ13888.1, AAG07334.1, ACP09769.1, AAC73418.1, CAB13282.1] and 40 K.