Investigating genome composition in multiple bee species

The honey bee Apis mellifera was the first eusocial animal to have its genome assembled. Analysis of the draft genome sequence revealed several features that distinguish it from other metazoan genomes: a low but heterogeneous GC content, an overabundance of CpG dinucleotides, and a relative lack of repetitive elements. The average GC content of the honey bee genome is only 33%, but GC content is highly heterogeneous, ranging from 11% to 67%, with a bimodal distribution. Furthermore, unlike genes in most other metazoans, honey bee genes are overrepresented in regions of low GC content (<30%). Some studies have suggested that the high-GC regions of the honey bee genome are associated with high rates of meiotic recombination; indeed, the honey bee exhibits the highest known recombination rate among eukaryotes. Other studies have suggested that the nucleotide composition of the honey bee genome is associated with DNA methylation, which occurs at a low frequency at CpG sites within exons. However, the reasons for the highly heterogeneous base composition are not well understood, and it is not known whether any of these unusual genome features are related to the emergence of eusociality in bees. Since the publication of the honey bee genome, genomes of several other bee species have become available. I am investigating the composition and organization of the genomes of multiple bee species with different levels of social complexity to identify features that are unique to eusocial bees. The results of this exploratory analysis will allow me to develop a hypothesis about the relationship between genome composition and the evolution of eusociality.
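
The two composition measures discussed above, GC content and CpG abundance, can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. The function names, the non-overlapping window scheme, and the toy sequence are assumptions made for illustration. The observed/expected CpG ratio is taken here as count(CG) x length / (count(C) x count(G)), a commonly used normalization; values above 1 indicate more CpG dinucleotides than the base composition alone would predict.

```python
from collections import Counter


def gc_content(seq):
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C."""
    counts = Counter(seq.upper())
    acgt = sum(counts[base] for base in "ACGT")
    if acgt == 0:
        return float("nan")  # no unambiguous bases in this sequence
    return (counts["G"] + counts["C"]) / acgt


def cpg_obs_exp(seq):
    """Return the observed/expected CpG ratio for one strand of a sequence."""
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return float("nan")  # ratio is undefined without both C and G
    n_cg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return n_cg * len(seq) / (n_c * n_g)


def windowed_gc(seq, window=1000):
    """Return GC content for each full, non-overlapping window of the sequence.

    The default window size is an arbitrary assumption; a trailing partial
    window is ignored.
    """
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, window)
    ]


if __name__ == "__main__":
    toy = "GGCCAATT"  # hypothetical toy sequence, not real bee data
    print(gc_content(toy))
    print(windowed_gc(toy, window=4))
    print(cpg_obs_exp("ACGT"))
```

Applied to windows along each assembled scaffold, a function like windowed_gc would produce the kind of per-window GC distribution whose shape (for example, bimodality) could then be compared across species.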