Current knowledge about plant and seed microbiota

Practicum site

My internship lasted two and a half months in the Emersys team at the IRHS (Horticulture and Seeds Research Institute) in Beaucouzé (France). The IRHS is part of a larger structure, the SFR QUASAV (Federative Research Structure in Plant Quality and Health). The SFR QUASAV (Appendix I) brings together scientific teams from three institutions: i) the INRA (National Institute of Agronomic Research), ii) the engineering school Agrocampus Ouest, and iii) the University of Angers. In addition to these three institutions, the SFR QUASAV federates the other plant biology teams of the Pays de la Loire region. Since 2008, this clustering of laboratories has allowed the institutions to pursue a common scientific project and to pool resources. The SFR gathers 380 people, including 150 researchers and 60 PhD students. Its scientific project is split into three research axes: the sustainable management of plant health; seed biology, quality and health; and the quality of horticultural plant products. In addition, four technical facilities and two platforms are shared (Appendix I).
The IRHS (Figure 1) is one of the joint research units (UMR) of the SFR QUASAV and, with its 220 employees, one of its biggest partners. The IRHS focuses on the biology of horticultural plants and on seed production. The institute shares its technological resources and expertise among thirteen joint research teams from the INRA, Agrocampus Ouest and the University of Angers. Among these teams, the Emersys team deals with the emergence, systematics and ecology of pathogenic bacteria, hence its name. The team currently has 22 members, including researchers, laboratory technicians, post-docs, PhD students and interns. It also includes one bioinformatician dedicated to data processing, since research on bacterial ecology requires a lot of data processing.
The team's research focuses on plant-associated bacteria and follows three main axes. The first is the identification of the processes leading to the emergence of plant diseases. The second is the study of the molecular mechanisms involved in the transmission of bacteria from and to seeds. The third is the transfer and sharing of results. The team also hosts a genetic resource center (Figure 1): the CIRM-CFBP, i.e. the French Collection of Phytopathogenic Bacteria. Most of the bacteria studied here belong to the genus Xanthomonas. The Emersys team therefore develops national and international collaborations related to Xanthomonas, such as the FNX (French Network on Xanthomonads) and the Xanthomonas Genomics Conference. The team also takes part in other research programs such as the SEEDS project, which studies the evolution of the bacterial community of seeds in partnership with the seed company Vilmorin and the University of California, Berkeley.

Current knowledge about plant and seed microbiota

Plants are not only made of plant cells but can be considered as holobionts (Figure 2; Vandenkoornhuyse et al., 2015). Indeed, they shelter and interact with many other organisms, including bacteria, viruses, fungi and archaea, both inside and outside their tissues. Together, all the microbes associated with a plant form the plant microbiota (Bulgarelli et al., 2013). Distinct microbiota can be considered separately according to the plant organ, because living conditions differ between organs. In that respect, from eight (Shade et al., 2017) to 17 (Nelson, 2017) different plant-associated microbial habitats can be distinguished (Table I). A habitat can be defined as “a specific place occupied by a community of organisms for growth and reproduction” (Bulgarelli et al., 2013). The plant microbiota is of interest because plant-associated microbes can have many positive effects on plants, such as resistance to biotic or abiotic stresses, nutrient acquisition and biomass accumulation (Sugiyama et al., 2012). For example, plant growth-promoting rhizobacteria (PGPR) (Spaepen et al., 2009) can help the plant to assimilate nutrients such as nitrogen, phosphorus or iron. Moreover, they can synthesize phytohormones such as auxin and interfere with their activity (Bulgarelli et al., 2013). Some microorganisms, instead of directly promoting plant growth, stimulate PGPR activity (Combes-Meynet et al., 2010). Further, other members of the plant microbiota can provide biocontrol against biotic stresses such as pathogens (Mendes et al., 2011; et al., 2013; Santhanam et al., 2015; Busby et al., 2016).
Along the same lines, microbial communities can produce antimicrobial compounds against other microorganisms (Emmert & Handelsman, 2006; Weller, 2007; Berg, 2009; Pérez-García et al., 2011), or they can activate what is known as induced systemic resistance, which increases plant resistance against a broad spectrum of pathogens through the ethylene or jasmonate pathway (De Vleesschauwer & Höfte, 2009; Zamioudis & Pieterse, 2011). The seed microbiota influences the life of the seed, as it affects seed preservation (Chee-Sanford et al., 2006), the release of seed dormancy (Goggin et al., 2015) and the germination rate (Nelson, 2017). It is thus essential to better characterise the seed core microbiota, since it influences the very first step of the plant life cycle, not only at the lowest taxonomic rank but also at the functional level of the community (Shade & Handelsman, 2012). Detailed knowledge of the assembly and composition of the seed microbiota opens promising agricultural approaches such as the introduction of microorganisms. Manipulating the microbiota can provide plant growth-promoting effects or biocontrol activity. Manipulating the seed core microbiota therefore requires detailed knowledge and, consequently, accurate analysis methods.

Objective of this study and strategy

In this study, we focus on the bacterial seed microbiota. Datasets describing seed-associated bacterial communities from different plants were studied and compared. We defined two main aims for this work. The first aim is to identify the main factors driving the composition of the seed microbiota. Since the data come from seeds of different plants and environmental conditions, we analyse the possible relationships between these treatments and the seed microbiota by comparing observed richness, alpha diversity and beta diversity. The second aim is to identify ubiquitous strains and to establish seed-specific bacterial taxa. These would be the bacterial taxa present in all the seeds from the different plants, i.e. the seed core microbiota. This taxonomic composition analysis is carried out with two housekeeping genes: the V4 region of the 16S rRNA gene and a portion of the bacterial gyrase gene gyrB.

Material and Methods

Initial data

All the data analysed in this report were collected in seven different studies carried out by the host group. To analyse these data together, the prerequisite was that they came from the same amplified gene portion. A total of 641 samples representing 16 plant species and nine organs or developmental stages were gathered. The plants were grown at 16 sites in 8 countries over 10 different years.

Clustering MiSeq reads into Amplicon Sequence Variants (ASVs)

The data analysis pipeline is divided into several steps. In the initial step (demultiplexing), raw reads are assigned to their original sample by the sequencer. In the second step, the Cutadapt software (Martin, 2011) is used to remove the Illumina adapter sequences from the reads and to assign each read to one gene if several genes were amplified in the run (e.g. 16S and gyrB). This produces one fastq file per gene for each sample, which is used as input for Dada2.

Dada2 (Callahan et al., 2016) then counts the reads in each fastq file, and each sample file with fewer than 1,000 reads is manually removed. Next, Dada2 produces a sequencing quality plot, from which we determine the position at which to truncate the reads. Dada2 recommends keeping only the nucleotides with a sequencing quality score above Q30. Note that at this step some nucleotides with a quality score below Q30 were kept to allow the forward and reverse reads to be merged. The truncated reads are then filtered, trimmed and merged to produce the ASVs. The interest of Dada2 lies in its error correction model, which appears to be the most accurate so far (Callahan et al., 2016). During this step, the Dada2 algorithm removes the ASVs that it considers false, i.e. produced by sequencing errors. Finally, two rds files are produced: the first gathers all the ASVs of the run with their abundances; the second gathers all the ASVs of the run with their taxonomy. Taxonomy was assigned with the 16S RDP database. In this study, the obtained sequences were analysed as ASVs (Callahan et al., 2017).
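
The read-count filter and the choice of a truncation position can be illustrated with a short Python sketch. It is not the Dada2 implementation (Dada2 is an R package), and in this work the truncation position was read from the quality plot. The function names, the toy data and the use of a per-position mean Phred score are assumptions made for illustration only.

```python
from statistics import mean

MIN_READS = 1000   # samples with fewer reads are discarded (threshold from the text)
MIN_QUALITY = 30   # Q30 threshold from the text


def keep_samples(read_counts, min_reads=MIN_READS):
    """Return only the samples whose read count reaches the minimum."""
    return {name: n for name, n in read_counts.items() if n >= min_reads}


def truncation_position(quality_by_read, min_quality=MIN_QUALITY):
    """Return how many leading positions have a mean Phred score >= min_quality.

    quality_by_read: list of equal-length lists of Phred scores, one per read.
    Using the mean score per position is an assumption of this sketch.
    """
    n_positions = len(quality_by_read[0])
    for pos in range(n_positions):
        if mean(read[pos] for read in quality_by_read) < min_quality:
            return pos
    return n_positions


# Hypothetical toy data: three samples and three short reads.
read_counts = {"sample_A": 35210, "sample_B": 640, "sample_C": 1098}
qualities = [
    [38, 37, 36, 34, 32, 28, 22],
    [37, 37, 35, 33, 31, 27, 20],
    [38, 36, 36, 32, 30, 25, 18],
]

kept = keep_samples(read_counts)      # sample_B is dropped
cut = truncation_position(qualities)  # first position whose mean falls below Q30
print(kept, cut)
```

In practice the truncation position is a compromise: cutting earlier removes low-quality bases, but the forward and reverse reads must still overlap enough to be merged.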

Phylogenetic analysis of mock communities

To analyse the mock communities phylogenetically, we first extracted their ASV sequences into a fasta file (mock fasta file). In addition, a fasta file with the 16S gene V4 region sequence of each strain present in the mock community samples was retrieved from the NCBI (reference fasta file). These two fasta files were combined and aligned using the Clustal software (Chenna et al., 2003). The alignment was visualized and edited using Jalview (Clamp et al., 2004; Waterhouse et al., 2009). All the sequences in the alignment were trimmed to the same length (253 nt). Still using Jalview, the alignment was used to build a phylogenetic tree by the Neighbour Joining method. This tree was exported as a newick file and visualized with the Figtree software (Rambaut, 2007). The tree was used to manually list the ASVs and reference sequences that match at 100% sequence identity. Finally, we deduced the number of references that were detected by ASVs. These data are not shown because of their size.
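
The final matching step can also be expressed programmatically. The sketch below is a stand-in for the manual reading of the tree, not the procedure used here: it compares trimmed sequences directly as strings. It assumes that ASVs and references start at the same position, so that trimming to a common length makes them comparable. All names and sequences are hypothetical.

```python
def exact_matches(asvs, references, length=253):
    """Map each reference name to the ASV names identical to it over `length` nt."""
    trimmed_refs = {name: seq[:length].upper() for name, seq in references.items()}
    matches = {name: [] for name in references}
    for asv_name, asv_seq in asvs.items():
        asv_trimmed = asv_seq[:length].upper()
        for ref_name, ref_seq in trimmed_refs.items():
            if asv_trimmed == ref_seq:
                matches[ref_name].append(asv_name)
    return matches


def detected_fraction(matches):
    """Fraction of reference sequences matched by at least one ASV."""
    detected = sum(1 for hits in matches.values() if hits)
    return detected / len(matches)


# Hypothetical toy sequences, compared over the first 8 nt instead of 253.
references = {
    "strain_1": "ACGTACGTAA",
    "strain_2": "TTGCATGCAT",
    "strain_3": "GGGCCCAATT",
}
asvs = {
    "ASV1": "ACGTACGTCC",
    "ASV2": "ttgcatgcgg",
    "ASV3": "ACGTTTTTAA",
}

matches = exact_matches(asvs, references, length=8)
fraction = detected_fraction(matches)
print(matches, round(100 * fraction, 2))
```

A reference matched by several ASVs appears once in the detected count, which is consistent with strains carrying several polymorphic copies of the 16S gene.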

Seed microbial community analysis

Several ecological indices were used in this project to study the structure and composition of the seed microbiota (Hill, 1973): i) the observed richness, which corresponds to the number of detected ASVs, and ii) the Shannon and iii) the inverse Simpson indices, which reflect alpha diversity. Alpha diversity represents the species diversity within one habitat or one condition. For both indices, the higher the value, the higher the diversity. These indices are, however, affected by differences in sample size. We therefore homogenised the sample sizes by rarefying to 5,000 reads. The rarefaction curve was obtained with the rarecurve function of the Phyloseq package. Differences in richness and alpha diversity were evaluated globally with a Kruskal-Wallis test, followed by a post hoc Dunn test for each variable.
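
To make these indices concrete, here is a minimal Python sketch of observed richness, the Shannon index, the inverse Simpson index and rarefaction to a fixed depth. The analysis itself was done in R; the function names and toy counts below are hypothetical, and the natural logarithm is assumed for the Shannon index.

```python
import math
import random


def observed_richness(counts):
    """Number of ASVs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over ASVs with p_i > 0."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p_i ** 2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts)


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement; return counts per ASV."""
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    if depth > len(pool):
        raise ValueError("sample has fewer reads than the rarefaction depth")
    picked = random.Random(seed).sample(pool, depth)
    rarefied = [0] * len(counts)
    for i in picked:
        rarefied[i] += 1
    return rarefied


# Hypothetical sample: read counts for five ASVs (10,000 reads in total).
counts = [4000, 3000, 2000, 1000, 0]
rarefied = rarefy(counts, depth=5000)

print(observed_richness(rarefied), shannon(rarefied), inverse_simpson(rarefied))
```

For a perfectly even community of four ASVs, the Shannon index equals ln(4) and the inverse Simpson index equals 4, which gives a quick sanity check of the formulas.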

Beta diversity represents the species diversity among different habitats or conditions. It was investigated with the Bray-Curtis and Jaccard dissimilarities (Whittaker, 1972). These two indices were used because they do not highlight the same part of the diversity: the Jaccard index is based on a presence/absence matrix, whereas the Bray-Curtis index is based on abundance. Both indices were calculated on normalised ASV abundances, i.e. ASV counts were divided by the number of reads per sample and multiplied by 10^6. To evaluate the impact of the variables on the dissimilarity, a constrained analysis of principal coordinates was performed with the capscale function of the vegan package (version 2.5-1) on the following model: “distance ~ Site + Plant + Genotype + Experience + Harvest + Inoculation + Plant Family + Plant Genus + Pollination + Process + Year”. To assess the significance of the constraints, a permutation test was performed on the model with the anova.cca function of the vegan package. To assess the importance of each variable for the dissimilarity, a permutational multivariate analysis of variance (PERMANOVA; Anderson, 2001) was run with the adonis function of the vegan package in RStudio. Both dissimilarity matrices were then ordinated by Principal Coordinate Analysis (PCoA) with the plot_ordination function of the Phyloseq package, highlighting the effect of one variable.
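
The normalisation and the two dissimilarities can be sketched in a few lines of Python. This is a hand-written illustration of the formulas, not the vegan routines used in the analysis. The function names and toy counts are hypothetical, and the Jaccard index is computed on presence/absence, as described above.

```python
def normalise(counts, scale=1_000_000):
    """Divide ASV counts by the number of reads in the sample, then multiply by 10^6."""
    total = sum(counts)
    return [c / total * scale for c in counts]


def bray_curtis(x, y):
    """Abundance-based dissimilarity: sum|x_i - y_i| / sum(x_i + y_i)."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator


def jaccard(x, y):
    """Presence/absence dissimilarity: 1 - |shared ASVs| / |ASVs in either sample|."""
    present_x = {i for i, a in enumerate(x) if a > 0}
    present_y = {i for i, b in enumerate(y) if b > 0}
    return 1 - len(present_x & present_y) / len(present_x | present_y)


# Hypothetical read counts for four ASVs in two samples.
sample_1 = normalise([50, 30, 20, 0])
sample_2 = normalise([10, 0, 10, 80])

bc = bray_curtis(sample_1, sample_2)
jc = jaccard(sample_1, sample_2)
print(bc, jc)
```

The two samples share two of their four ASVs, so the Jaccard dissimilarity is moderate, while the Bray-Curtis value is higher because most reads belong to ASVs that differ strongly in abundance. This is the sense in which the two indices highlight different parts of the diversity.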

Results

Mock communities

Mock communities are routinely used in sequencing projects to assess how well taxa are detected by the different sequencing protocols. In our project, we analysed the percentage of taxa detected by Dada2 in mock communities composed of 69 bacterial strains. According to our analysis, 81 16S rRNA ASVs were detected in the mock communities, representing 88.46% of the taxa present in the mock community. The fact that we identified more ASVs than there are strains in the mock community can be explained by the presence, in some bacterial strains, of more than one copy of the 16S gene in the genome (e.g. Bacillus sp., Rhodococcus sp. and Erwinia sp.), and these copies may be polymorphic.

Data set display

Our study includes 417 seed samples representing 13,235 ASVs (Figure 8). Seed samples account for 63.15% of the ASVs of the initial data set. As a whole, the seed samples gathered 17.5×10^6 reads. The number of reads per seed sample ranges from 1,098 to 199,203. The median sample size is 33,609 reads, with a standard deviation of 38,040. ASV abundance is highly variable, ranging from 1 to 5.4×10^6 reads. The median seed ASV abundance is 14, with a standard deviation of 52,768. These numbers show that seed samples make up a large part of the original dataset, which allows a meta-analysis of the seed microbiota. They also clearly highlight the very high variability of sample size and ASV abundance.

Conclusion

By comparing and summarizing different microbiota studies, we were able to show that production site is one of the main factors driving the seed microbiota and that a group of 10 bacterial taxa is present in all the seed samples analysed. It would be interesting to further analyse the role of these 10 taxa in community assembly on sterile seeds. For this, representative strains of these taxa need to be isolated from seeds and tested on sterile seeds. Using a similar approach in roots, Niu et al. (2017) observed that the removal of only one of the dominant taxa led to the complete loss of the community. A similar approach could be applied to seeds using part of the results of this master's thesis. This would be a good system for studying how bacterial interactions affect the assembly of the seed microbiota.

Table of contents

1. Introduction
1.1. Practicum site
1.2. Current knowledge about plant and seed microbiota
1.3. Objective of this study and strategy
2. Material and Methods
2.1. Initial data
2.2. Clustering MiSeq reads into Amplicon Sequence Variants (ASVs)
2.3. Data subsetting
2.4. Phylogenetic analysis of mock communities
2.5. Seed microbial community analysis
3. Results
3.1. Mock communities
3.2. Data set display
3.3. Factors influencing the richness and diversity of the seed microbiota
3.3.1. Seed production site influences seed microbiota richness and alpha diversity
3.3.2. Seed production site is the main factor influencing seed microbiota beta diversity
3.4. Taxonomic composition of the seed microbiota
3.4.1. Seed microbiota is composed of bacteria belonging to Proteobacteria, Bacteroidetes, Actinobacteria and Firmicutes phyla
3.4.2. Pantoea and Pseudomonas genera are the main members of the seed core microbiota
3.4.3. Pantoea agglomerans and Pseudomonas viridiflava are the main representative species of their genera in the seed core microbiota
4. Discussion
5. Conclusions