Chapter 14 Examples

Here, we give users two examples to practice the data analysis workflow by XMAS 2.0. By the way, we also recommend users handling your own microbiota data in a reasonable manner when you utilize this package or workflow. Pay attention to whether your data fit the methods this package provided.

14.1 Loading Packages

library(XMAS2)
library(dplyr)
library(tibble)
library(phyloseq)
library(ggplot2)
library(ggpubr)
library(SummarizedExperiment)

14.2 Workflow description

The standard data analysis of 16S and MGS data by XMAS 2.0.

  • 16S
    Functions of XMAS 2.0 in 16s

    Figure 14.1: Functions of XMAS 2.0 in 16s

  • MGS
    Functions of XMAS 2.0 in MGS

    Figure 14.2: Functions of XMAS 2.0 in MGS

14.3 Amplicon sequencing (16s)

The upstream process is performed by in-house pipeline. and this example just shows how to perform downstream data analysis. In briefly, the example comprises the following steps:

  1. Converting inputs into phyloseq object;

  2. Quality Evaluation;

  3. Pre-Processing Data;

  4. Diversity analysis;

  5. Ordination analysis;

  6. Composition analysis;

  7. Differential analysis.

14.3.1 Converting inputs into phyloseq-class object

dada2 result from standardized_analytics_workflow_R_function.

  1. /home/xuxiaomin/project/standardized_analytics_workflow_R_function/demo_data/16S/process/xdada2/dada2_res.rds

  2. /home/xuxiaomin/project/standardized_analytics_workflow_R_function/demo_data/16S/process/fasta2tree/tree.nwk

  3. /home/xuxiaomin/project/standardized_analytics_workflow_R_function/demo_data/16S/metadata.txt

# dada2 results from in-house 16s pipeline
dada2_res <- readRDS(
  system.file(
    "extdata", "dada2_res.rds",
    package = "XMAS2"    
    )
)

# the metadata matches to dada2 result
sam_tab <- read.table(
    system.file(
        "extdata", "dada2_metadata.tsv",
        package = "XMAS2"
    ),
    sep = "\t",
    header = TRUE,
    stringsAsFactors = FALSE
)

# tree file from dada2 reference data silva
tree <- phyloseq::read_tree(
  system.file(
    "extdata", "tree.nwk",
    package = "XMAS2"    
    )
)

tax_tab <- import_dada2_taxa(dada2_taxa = dada2_res$tax_tab)
otu_tab <- dada2_res$seq_tab
sam_tab <- sam_tab %>% tibble::column_to_rownames("seqID")

# Shouldn't use the Total Number as SampleID (wrong: 123456; right: X123456)
rownames(otu_tab) <- paste0("S", rownames(otu_tab))
rownames(sam_tab) <- paste0("S", rownames(sam_tab))

dada2_ps <- get_dada2_phyloseq(
                seq_tab = otu_tab, 
                tax_tab = tax_tab, 
                sam_tab = sam_tab, 
                phy_tree = tree)
dada2_ps
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 896 taxa and 24 samples ]
## sample_data() Sample Data:       [ 24 samples by 1 sample variables ]
## tax_table()   Taxonomy Table:    [ 896 taxa by 7 taxonomic ranks ]
## phy_tree()    Phylogenetic Tree: [ 896 tips and 893 internal nodes ]
## refseq()      DNAStringSet:      [ 896 reference sequences ]

Here, the phyloseq object comprises five components (OTU Table, Sample Data, Taxonomy Table, Phylogenetic Tree and DNAStringSet).

14.3.1.1 Summarize phyloseq-class object

summarize_phyloseq(ps = dada2_ps)
## Compositional = NO2
## 1] Min. number of reads = 511812] Max. number of reads = 936223] Total number of reads = 15025374] Average number of reads = 62605.70833333335] Median number of reads = 619157] Sparsity = 0.8653738839285716] Any OTU sum to 1 or less? NO8] Number of singletons = 09] Percent of OTUs that are singletons
##         (i.e. exactly one read detected across all samples)010] Number of sample variables are: 1Group2
## [[1]]
## [1] "1] Min. number of reads = 51181"
## 
## [[2]]
## [1] "2] Max. number of reads = 93622"
## 
## [[3]]
## [1] "3] Total number of reads = 1502537"
## 
## [[4]]
## [1] "4] Average number of reads = 62605.7083333333"
## 
## [[5]]
## [1] "5] Median number of reads = 61915"
## 
## [[6]]
## [1] "7] Sparsity = 0.865373883928571"
## 
## [[7]]
## [1] "6] Any OTU sum to 1 or less? NO"
## 
## [[8]]
## [1] "8] Number of singletons = 0"
## 
## [[9]]
## [1] "9] Percent of OTUs that are singletons\n        (i.e. exactly one read detected across all samples)0"
## 
## [[10]]
## [1] "10] Number of sample variables are: 1"
## 
## [[11]]
## [1] "Group"

The minus account of the OTU counts is 51181 in the phyloseq object, and we can use it as the threshold to rarefy.

Notice the Sparsity (0.865), indicating the data has many zeros and pay attention to the downstream data analysis. A common property of amplicon based microbiota data generated by sequencing.

14.3.2 Quality Control

Quality control of DADA2 results will help us have more rational determinations on the further data analysis.

14.3.2.1 Reads’ track by DADA2

plot_Dada2Track(data = dada2_res$reads_track)
Reads' track by DADA2 (16s example)

Figure 14.3: Reads’ track by DADA2 (16s example)

The percentage of the final remained read counts approximate 70%, indicating that we should consider the sequence depth for analysis when we build the sequence library.

14.3.2.2 Spike-in sample (BRS) assessment

  • Extract the genus level phyloseq and getting the BRS_ID
dada2_ps_genus <- summarize_taxa(ps = dada2_ps, 
                                 taxa_level = "Genus")
sample_data(dada2_ps_genus)
##       Group
## S6030    BB
## S6032    BB
## S6033    BB
## S6035    AA
## S6036    BB
## S6037    AA
## S6040    BB
## S6043    AA
## S6045    BB
## S6046    BB
## S6048    BB
## S6049    AA
## S6050    BB
## S6054    BB
## S6055    BB
## S6058    BB
## S6059    AA
## S6060    AA
## S6061    AA
## S6063    BB
## S6065    AA
## S6066    AA
## S6068    BB
## S8005    QC

The BRS_ID is S8005 .

  • Run run_RefCheck
run_RefCheck(
    ps = dada2_ps_genus,
    BRS_ID = "S8005",
    Ref_type = "16s")
## Noting: the Reference Matrix is for 16s
## S8005 is in the Reference Matrix's samples and remove it to run
## 
## ############Matched baterica of the BRS sample#############
## The number of BRS' bacteria matched the Reference Matrix is [15]
## g__Bifidobacterium
## g__Bacteroides
## g__Faecalibacterium
## g__Lactobacillus
## g__Parabacteroides
## g__Collinsella
## g__Coprococcus_3
## g__Dorea
## g__Streptococcus
## g__Roseburia
## g__Anaerostipes
## g__Escherichia_Shigella
## g__Enterococcus
## g__Prevotella_9
## g__Eggerthella
## 
## The number of the additional bacteria compared to Reference Matrix is [1]
## ###########################################################
## 
## ##################Status of the BRS sample##################
## Whether the BRS has the all bateria of Reference Matrix: TRUE
## Correlation Coefficient of the BRS is: 0.9714
## Bray Curtis of the BRS is: 0.07607
## Impurity of the BRS is: 0.06409
## ###########################################################
## #####Final Evaluation Results of the BRS #######
## The BRS of sequencing dataset passed the cutoff of the Reference Matrix 
## Cutoff of Coefficient is 0.8946
## Cutoff of BrayCurtis is 0.3878
## Cutoff of Impurity is 0.1565
## ###########################################################
##                             8002        8003        8004       8006        8007        8008        8009        8005        mean
## Bifidobacterium      31.11079015 30.88310969 32.31232692 18.4930259 20.20409870 17.96225391 18.03588291 27.22437034 24.52823232
## Bacteroides          20.44753484 14.46581958 24.57151411 26.7370147 25.85863655 27.51353663 26.99272343 24.23896093 23.85321759
## Faecalibacterium      0.79850615  0.62937893  1.05531023  1.7487249  1.64282727  1.96346413  1.81219797  1.04035376  1.33634542
## Lactobacillus         2.61732573  3.36856272  3.44379163  5.9292703  5.78000836  5.78189064  6.32672332  3.87088505  4.63980722
## Parabacteroides       7.11124408  7.45952579  5.36075144  8.7149995  8.01840234  8.74899584  8.62634005  5.61833757  7.45732457
## Collinsella           0.12792605  0.88271385  0.55665744  1.2764130  0.67921372  1.89356397  1.26367828  0.45502126  0.89189845
## Coprococcus_3         1.00380683  0.97969362  0.80270938  1.6693557  1.56419908  1.67969035  1.71221463  0.87586251  1.28594151
## Dorea                 2.80715148  3.45564660  2.46613277  3.9684605  3.99163530  3.81529666  3.69382881  2.34880690  3.31836988
## Streptococcus         2.91960260  3.43387563  2.59149764  3.4818362  3.31409452  3.51378702  3.37721491  2.68740253  3.16491388
## Roseburia             0.03404484  0.04750030  0.02806676  0.0338295  0.03178586  0.02503886  0.02499583  0.03311188  0.03229673
## Anaerostipes          0.32291011  0.43245062  0.31528329  0.5386697  0.44500209  0.53416240  0.48047548  0.32471000  0.42420796
## Escherichia_Shigella 15.27581475 16.00265210 12.36527954 10.8423545 13.43203680 10.15743185 11.59945565 14.03516268 12.96377349
## Enterococcus         14.51444842 14.66472707 11.04239952 13.1674820 12.07360937 12.88771113 12.84924735 11.61906390 12.85233610
## Prevotella_9          0.77374627  3.07465463  2.75709154  2.9145415  2.52446675  3.04952478  2.84952508  5.40257632  2.91826586
## Eggerthella           0.04951976  0.15437596  0.27131203  0.4840221  0.43998327  0.47365181  0.35549631  0.16128688  0.29870601
## Impurity_level        0.08562792  0.06531291  0.05987576  0.0000000  0.00000000  0.00000000  0.00000000  0.06409000  0.03436332

14.3.2.3 Spike-in sample’s remove

dada2_ps_remove_BRS <- get_GroupPhyloseq(
                         ps = dada2_ps,
                         group = "Group",
                         group_names = "QC",
                         discard = TRUE)
dada2_ps_remove_BRS
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 896 taxa and 23 samples ]
## sample_data() Sample Data:       [ 23 samples by 1 sample variables ]
## tax_table()   Taxonomy Table:    [ 896 taxa by 7 taxonomic ranks ]
## phy_tree()    Phylogenetic Tree: [ 896 tips and 893 internal nodes ]
## refseq()      DNAStringSet:      [ 896 reference sequences ]

14.3.2.4 Rarefaction curves

plot_RarefCurve(ps = dada2_ps_remove_BRS,
                taxa_level = "OTU",
                step = 400,
                label = "Group",
                color = "Group")
## rarefying sample S6030
## rarefying sample S6032
## rarefying sample S6033
## rarefying sample S6035
## rarefying sample S6036
## rarefying sample S6037
## rarefying sample S6040
## rarefying sample S6043
## rarefying sample S6045
## rarefying sample S6046
## rarefying sample S6048
## rarefying sample S6049
## rarefying sample S6050
## rarefying sample S6054
## rarefying sample S6055
## rarefying sample S6058
## rarefying sample S6059
## rarefying sample S6060
## rarefying sample S6061
## rarefying sample S6063
## rarefying sample S6065
## rarefying sample S6066
## rarefying sample S6068
Rarefaction curves (16s example)

Figure 14.4: Rarefaction curves (16s example)

The result showed that all the samples had different sequence depth but had the full sample richness.

14.3.3 Data processing

This part has too may procedures and we only choose some of them. Please go to Chapter 6 to see more approaches and details for being familiar with this part.

14.3.3.1 Rarefy otu counts

From previous results of quality evaluation, the sequence depth of samples are different which have effects on the downstream analysis. Here, choosing the rarefy (Normaliztion method: random subsampling counts to the smallest library size) to get the equal sample sums.

dada2_ps_rarefy <- norm_rarefy(object = dada2_ps_remove_BRS, 
                               size = 51181)
dada2_ps_rarefy
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 891 taxa and 23 samples ]
## sample_data() Sample Data:       [ 23 samples by 1 sample variables ]
## tax_table()   Taxonomy Table:    [ 891 taxa by 7 taxonomic ranks ]
## phy_tree()    Phylogenetic Tree: [ 891 tips and 888 internal nodes ]
## refseq()      DNAStringSet:      [ 891 reference sequences ]

In addition, we could also perform some other normalization methods on the rarefied phyloseq object. By the way, we didn’t transform the data by using log algorithm because the count matrix is required by the following data analysis methods.

14.3.3.2 Extracting specific taxonomic level

dada2_ps_rare_genus <- summarize_taxa(ps = dada2_ps_rarefy, 
                                      taxa_level = "Genus")
dada2_ps_rare_genus
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 198 taxa and 23 samples ]
## sample_data() Sample Data:       [ 23 samples by 1 sample variables ]
## tax_table()   Taxonomy Table:    [ 198 taxa by 6 taxonomic ranks ]

14.3.3.3 Filtering the low relative abundance or unclassified taxa by the threshold (total counts < 10)

dada2_ps_rare_genus_filter <- run_filter(ps = dada2_ps_rare_genus, 
                                         cutoff = 10, 
                                         unclass = TRUE)
dada2_ps_rare_genus_filter 
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 149 taxa and 23 samples ]
## sample_data() Sample Data:       [ 23 samples by 1 sample variables ]
## tax_table()   Taxonomy Table:    [ 149 taxa by 6 taxonomic ranks ]

14.3.3.4 Trimming the taxa with low occurrence less than threshold

dada2_ps_rare_genus_filter_trim <- run_trim(object = dada2_ps_rare_genus_filter, 
                                            cutoff = 0.2, 
                                            trim = "feature")
dada2_ps_rare_genus_filter_trim
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 100 taxa and 23 samples ]
## sample_data() Sample Data:       [ 23 samples by 1 sample variables ]
## tax_table()   Taxonomy Table:    [ 100 taxa by 6 taxonomic ranks ]

Finally, we obtained the final phyloseq-class object dada2_ps_rare_genus_filter_trim and changed its name.

14.3.4 Diversity analysis

14.3.4.1 Alpha diveristy

  • Calculate the alpha diversity

Notes: the otu table must be counts matrix (rarefied but not trim counts matrix) when you choose Observed etc measures.

dada_ps_rare_genus_alpha <- run_alpha_diversity(ps = dada2_ps_rare_genus, 
                                                measures = c("Shannon", "Chao1", "Observed"))
print(dada_ps_rare_genus_alpha)
##    TempRowNames Group Observed    Chao1  se.chao1  Shannon
## 1         S6030    BB       74 74.00000 0.0000000 2.622800
## 2         S6032    BB       38 38.00000 0.0000000 1.711068
## 3         S6033    BB       80 80.16667 0.5431879 2.719495
## 4         S6035    AA       35 35.00000 0.1232013 1.953807
## 5         S6036    BB       95 95.00000 0.0000000 2.942657
## 6         S6037    AA       35 35.00000 0.0000000 1.933007
## 7         S6040    BB       58 58.00000 0.0000000 2.490198
## 8         S6043    AA       65 65.33333 0.9246628 2.098003
## 9         S6045    BB       60 60.00000 0.0000000 2.920363
## 10        S6046    BB       71 71.00000 0.1241166 2.264433
## 11        S6048    BB       66 66.00000 0.0000000 2.630421
## 12        S6049    AA       60 60.00000 0.0000000 2.506885
## 13        S6050    BB       53 53.00000 0.0000000 2.431334
## 14        S6054    BB       84 84.00000 0.0000000 2.769973
## 15        S6055    BB       75 75.00000 0.0000000 2.297388
## 16        S6058    BB       37 37.00000 0.2465985 2.262059
## 17        S6059    AA       71 71.00000 0.0000000 2.682960
## 18        S6060    AA       48 48.00000 0.0000000 1.883809
## 19        S6061    AA       80 80.00000 0.0000000 3.088651
## 20        S6063    BB       38 38.00000 0.0000000 1.543267
## 21        S6065    AA       77 77.00000 0.0000000 2.877568
## 22        S6066    AA       67 67.00000 0.0000000 2.638564
## 23        S6068    BB       40 40.00000 0.0000000 2.100363
  • visualization
plot_boxplot(data = dada_ps_rare_genus_alpha,
             y_index = c("Shannon", "Chao1", "Observed"),
             group = "Group",
             group_names = c("AA", "BB"),
             group_color = c("red", "blue"),
             method = "wilcox.test")
Alpha diversity (16s example)

Figure 14.5: Alpha diversity (16s example)

14.3.4.2 Beta diversity

  • beta dipersion
dada2_ps_beta <- run_beta_diversity(ps = dada2_ps_rare_genus_filter_trim, 
                                    method = "bray", 
                                    group = "Group")
## 
## Permutation test for homogeneity of multivariate dispersions
## Permutation: free
## Number of permutations: 999
## 
## Response: Distances
##           Df  Sum Sq   Mean Sq      F N.Perm Pr(>F)
## Groups     1 0.01412 0.0141198 1.6675    999  0.201
## Residuals 21 0.17783 0.0084679                     
## 
## Pairwise comparisons:
## (Observed p-value below diagonal, permuted p-value above diagonal)
##         AA   BB
## AA         0.21
## BB 0.21063
dada2_ps_beta$BetaDispersion
Beta diversity (16s example)

Figure 14.6: Beta diversity (16s example)

14.3.5 PERMANOVA + Ordination

14.3.5.1 PERMANOVA

dada2_ps_per <- run_permanova(ps = dada2_ps_rare_genus_filter_trim, 
                              method = "bray", 
                              columns = "Group")
print(dada2_ps_per)
##       SumsOfSample Df SumsOfSqs   MeanSqs  F.Model         R2 Pr(>F) AdjustedPvalue
## Group           23  1 0.2290643 0.2290643 1.331995 0.05964515  0.225          0.225

The PERMANOVA result of the Group (Pr(>F) > 0.05) revealed that the two groups had not the distinct patterns of microbial community.

14.3.5.2 Ordination

We performed ordination by using Principal Coordinate Analysis (PCoA). If you want to try other methods please go to see Chapter 8 for more details.

dada2_ps_ordination <- run_ordination(
                         ps = dada2_ps_rare_genus_filter_trim,
                         group = "Group",
                         method = "PCoA")

plot_Ordination(ResultList = dada2_ps_ordination, 
                group = "Group", 
                group_names = c("AA", "BB"),
                group_color = c("blue", "red"))
PCoA (16s example)

Figure 14.7: PCoA (16s example)

14.3.6 Microbial composition

A whole picture of the microbial composition.

14.3.6.1 Stacked barplot

  • XMAS package
plot_StackBarPlot(
        ps = dada2_ps_rarefy,
        taxa_level = "Phylum",
        group = "Group",
        cluster = TRUE)
## [1] "This palatte have 20 colors!"
Microbial composition (16s example)

Figure 14.8: Microbial composition (16s example)

  • XVIZ package
plot_stacked_bar_XIVZ(
        phyloseq = dada2_ps_rarefy,
        level = "Phylum",
        feature = "Group")
Microbial composition (16s example) XVIZ

Figure 14.9: Microbial composition (16s example) XVIZ

14.3.6.2 Core microbiota

  • convert absolute abundance into relative abundance
dada2_ps_rare_genus_filter_trim_rb <- XMAS2::normalize(object = dada2_ps_rare_genus_filter_trim, 
                                                       method = "TSS")
dada2_ps_rare_genus_filter_trim_rb
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 100 taxa and 23 samples ]
## sample_data() Sample Data:       [ 23 samples by 1 sample variables ]
## tax_table()   Taxonomy Table:    [ 100 taxa by 6 taxonomic ranks ]
  • visualization
prevalences <- seq(0.05, 1, 0.05)
detections <- 10^seq(log10(1e-3), log10(0.2), length = 10)

pl_core <- plot_core_taxa(dada2_ps_rare_genus_filter_trim_rb, 
                    plot.type = "heatmap", 
                    colours = gray(seq(0, 1, length=5)),
                    prevalences = prevalences, 
                    detections = detections, 
                    min.prevalence = 0.5)+
    xlab("Detection Threshold (Relative Abundance (%))")

pl_core
Core taxa (16s example)

Figure 14.10: Core taxa (16s example)

The degree of color indicates the size of abundance and prevalence.

  • Use core_members to obtain the core taxa. detection for abundance and prevalence for occurrence.
core_taxa_name <- core_members(dada2_ps_rare_genus_filter_trim_rb, detection = 0.01, prevalence = 0.8)
print(core_taxa_name)
## [1] "g__Bifidobacterium" "g__Blautia"

Result:

Only 2 genera (g__Bifidobacterium and g__Blautia) passed the threshold of detection and prevalence which we choose.

14.3.7 Differential Analysis

There are more than 10 approaches to perform differential analysis. Here, we choose two of them and recommend users going to Chapter 10 to see more detials.

14.3.7.1 Liner discriminant analysis (LDA) effect size (LEfSe)

  • Calculation
dada2_ps_lefse <- run_lefse(
                      ps = dada2_ps_rare_genus_filter_trim,
                      group = "Group",
                      group_names = c("AA", "BB"),
                      norm = "CPM",
                      Lda = 2)
head(dada2_ps_lefse)
##                           TaxaID         Block Enrichment LDA_Score EffectSize Log2FoldChange (Median)\nAA_vs_BB
## 1 g__Clostridium_sensu_stricto_1 9_AA vs 14_BB         BB  3.677589   2.679298                                NA
## 2             g__Intestinibacter 9_AA vs 14_BB         BB  3.177811   2.384308                                NA
## 3               g__Lactobacillus 9_AA vs 14_BB         BB  4.093463   2.450656                         -3.336128
## 4                 g__Odoribacter 9_AA vs 14_BB         BB  2.058258   1.876320                                NA
## 5              g__Parasutterella 9_AA vs 14_BB         AA -3.662466   2.234915                          4.402050
## 6                  g__Romboutsia 9_AA vs 14_BB         BB  3.699671   2.929140                         -6.856424
##   Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All)
## 1               225.11921              0.00000           1173.88222                       -4.746617             5861.9051
## 2               383.50911              0.00000           1398.79740                       -1.964554             1883.3386
## 3              1283.85401            520.46061           5256.08763                       -5.173067            19938.7655
## 4                21.93175            604.99989              0.00000                        2.099583              578.0758
## 5               172.77125           1080.35695             51.09966                        4.481148             3901.0977
## 6              2189.79596             50.35627           5835.02623                       -3.334490             7094.1306
##   Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB
## 1           350.3379          9405.0554                    60.87                 22.22                 85.71
## 2           680.6440          2656.4994                    60.87                 22.22                 85.71
## 3           892.0297         32183.0957                    86.96                 77.78                 92.86
## 4          1083.9017           252.9020                    52.17                 77.78                 35.71
## 5          9320.3061           417.3209                    65.22                 88.89                 50.00
## 6          1086.1419         10956.4090                    78.26                 55.56                 92.86
##         Odds Ratio (95% CI)
## 1          4900 (5000;4900)
## 2               3.4 (5.8;1)
## 3 1.1e+08 (1.1e+08;1.1e+08)
## 4             0.31 (-2;2.6)
## 5           0.0025 (-12;12)
## 6                76 (85;68)
  • Visualization
# # don't run this code when you do lefse in reality
# dada2_ps_lefse$LDA_Score <- dada2_ps_lefse$LDA_Score * 1000

plot_lefse(
  da_res = dada2_ps_lefse,
  x_index = "LDA_Score",
  x_index_cutoff = 1,
  group_color = c("green", "red"))
Lefse analysis (16s example)

Figure 14.11: Lefse analysis (16s example)

14.3.7.2 Wilcoxon Rank-Sum test

  • Calculation
dada2_ps_wilcox <- run_wilcox(
                      ps = dada2_ps_rare_genus_filter_trim,
                      group = "Group",
                      group_names = c("AA", "BB"))

head(dada2_ps_wilcox)
##               TaxaID         Block Enrichment EffectSize Statistic    Pvalue AdjustedPvalue Log2FoldChange (Median)\nAA_vs_BB
## 1 g__Acidaminococcus 9_AA vs 14_BB  Nonsignif   70.56349      63.5 1.0000000      1.0000000                                NA
## 2     g__Actinomyces 9_AA vs 14_BB  Nonsignif   39.65079      44.5 0.2556616      0.7014746                        -0.9577718
## 3   g__Adlercreutzia 9_AA vs 14_BB  Nonsignif   23.17460      44.0 0.1980169      0.7014746                                NA
## 4    g__Agathobacter 9_AA vs 14_BB  Nonsignif 1222.58730      71.0 0.6293971      0.8964389                         0.1524904
## 5     g__Akkermansia 9_AA vs 14_BB  Nonsignif  104.78571      64.5 0.9304707      0.9692403                                NA
## 6       g__Alistipes 9_AA vs 14_BB  Nonsignif   23.03175      67.0 0.8239942      0.9363570                        -0.1085245
##   Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Rank)\nAA_vs_BB Mean Rank Abundance\nAA
## 1                       0                    0                  0.0                      0.01201252                   12.06
## 2                      30                   26                 50.5                     -0.42227633                    9.94
## 3                       0                    0                  9.0                     -0.43387758                    9.89
## 4                     316                  329                296.0                      0.17342686                   12.89
## 5                       0                    0                  0.0                      0.03358045                   12.17
## 6                      64                   64                 69.0                      0.08724541                   12.44
##   Mean Rank Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1                   11.96                    21.74                 22.22                 21.43    0.67 (-0.11;1.5)
## 2                   13.32                    82.61                 88.89                 78.57       4.1 (6.8;1.3)
## 3                   13.36                    43.48                 33.33                 50.00        8.6 (13;4.3)
## 4                   11.43                    65.22                 55.56                 71.43     0.39 (-1.4;2.2)
## 5                   11.89                    21.74                 22.22                 21.43      1.5 (2.3;0.71)
## 6                   11.71                    73.91                 77.78                 71.43     0.88 (0.64;1.1)
  • Volcano
plot_volcano(
    da_res = dada2_ps_wilcox,
    group_names = c("AA", "BB"),
    x_index = "Log2FoldChange (Rank)\nAA_vs_BB",
    x_index_cutoff = 0.5,
    y_index = "Pvalue",
    y_index_cutoff = 0.05,
    group_color = c("red", "grey", "blue"),
    topN = 5)
Wilcoxon Rank-Sum test (16s example)

Figure 14.12: Wilcoxon Rank-Sum test (16s example)

14.4 Metagenomics (MGS)

The metagenomic data analysis pipeline is just the same as 16s. In briefly, the example comprises the following steps:

  1. Converting inputs into phyloseq object;

  2. Quality Evaluation;

  3. Pre-Processing Data;

  4. Diversity analysis;

  5. Ordination analysis;

  6. Composition analysis;

14.4.1 Converting inputs into phyloseq-class object

The result of the in-house Metaphlan2/3 pipeline:

  1. /home/xuxiaomin/project/standardized_analytics_workflow_R_function/demo_data/MGS/metaphlan2_merged.tsv

  2. /home/xuxiaomin/project/standardized_analytics_workflow_R_function/demo_data/MGS/metadata.txt

metaphlan2_res <- read.table(
    system.file(
        "extdata", "metaphlan2_merged.tsv",
        package = "XMAS2"
    ),
    header = TRUE, 
    stringsAsFactors = FALSE
)
metaphlan2_sam <- read.table(
    system.file(
        "extdata", "metaphlan2_metadata.tsv",
        package = "XMAS2"
    ),
    sep = "\t",
    header = TRUE,
    stringsAsFactors = FALSE
)

metaphlan2_res_list <- import_metaphlan_taxa(data_metaphlan2 = metaphlan2_res, 
                                             taxa_level = "Species")
otu_tab <- metaphlan2_res_list$abu_tab
tax_tab <- metaphlan2_res_list$tax_tab
sam_tab <- metaphlan2_sam %>% tibble::column_to_rownames("SampleID")

metaphlan2_ps <- get_metaphlan_phyloseq(
                    otu_tab = otu_tab, 
                    sam_tab = sam_tab,
                    tax_tab = tax_tab)
metaphlan2_ps
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 326 taxa and 23 samples ]
## sample_data() Sample Data:       [ 23 samples by 2 sample variables ]
## tax_table()   Taxonomy Table:    [ 326 taxa by 7 taxonomic ranks ]

Here, the phyloseq object comprises three components (OTU Table, Sample Data and Taxonomy Table).

14.4.1.1 Summarize phyloseq-class object

summarize_phyloseq(ps = metaphlan2_ps)
## Compositional = NO2
## 1] Min. number of reads = 0.98218212] Max. number of reads = 1.00000053] Total number of reads = 22.94511414] Average number of reads = 0.9976136565217395] Median number of reads = 0.99980247] Sparsity = 0.7131234995998936] Any OTU sum to 1 or less? YES8] Number of singletons = 3239] Percent of OTUs that are singletons
##         (i.e. exactly one read detected across all samples)010] Number of sample variables are: 2Groupphynotype2
## [[1]]
## [1] "1] Min. number of reads = 0.9821821"
## 
## [[2]]
## [1] "2] Max. number of reads = 1.0000005"
## 
## [[3]]
## [1] "3] Total number of reads = 22.9451141"
## 
## [[4]]
## [1] "4] Average number of reads = 0.997613656521739"
## 
## [[5]]
## [1] "5] Median number of reads = 0.9998024"
## 
## [[6]]
## [1] "7] Sparsity = 0.713123499599893"
## 
## [[7]]
## [1] "6] Any OTU sum to 1 or less? YES"
## 
## [[8]]
## [1] "8] Number of singletons = 323"
## 
## [[9]]
## [1] "9] Percent of OTUs that are singletons\n        (i.e. exactly one read detected across all samples)0"
## 
## [[10]]
## [1] "10] Number of sample variables are: 2"
## 
## [[11]]
## [1] "Group"     "phynotype"

Notice the Sparsity (0.713), indicating the data has many zeros and pay attention to the downstream data analysis.

14.4.2 Quality Control

14.4.2.1 Spike-in sample (BRS) assessment

  • Extract the species level phyloseq and obtain the BRS_ID
metaphlan2_ps_species <- summarize_taxa(ps = metaphlan2_ps, 
                                        taxa_level = "Species")
metaphlan2_ps_species@sam_data
##      Group phynotype
## s1      BB      0.00
## s2      AA      2.50
## s3      BB      0.00
## s4      AA      1.25
## s5      AA     30.00
## s6      AA     15.00
## s7      BB      8.75
## s8      BB      0.00
## s9      BB      3.75
## s10     BB      2.50
## s11     BB     15.00
## s12     BB      2.50
## s13     BB      2.50
## s14     BB      0.00
## s15     BB      1.07
## s16     BB      2.50
## s17     AA      5.00
## s18     BB     35.00
## s19     BB      7.50
## s20     BB     15.00
## s21     AA      3.75
## s22     AA      3.75
## refE    QC        NA
  • Run run_RefCheck
run_RefCheck(
    ps = metaphlan2_ps_species,
    BRS_ID = "refE",
    Ref_type = "MGS")
## Noting: the Reference Matrix is for MGS
## 
## ############Matched baterica of the BRS sample#############
## The number of BRS' bacteria matched the Reference Matrix is [16]
## s__Bifidobacterium_longum
## s__Bacteroides_uniformis
## s__Faecalibacterium_prausnitzii
## s__Bifidobacterium_adolescentis
## s__Bacteroides_thetaiotaomicron
## s__Collinsella_aerofaciens
## s__Coprococcus_comes
## s__Dorea_formicigenerans
## s__Streptococcus_salivarius
## s__Bacteroides_xylanisolvens
## s__Bacteroides_ovatus
## s__Roseburia_hominis
## s__Bacteroides_vulgatus
## s__Prevotella_copri
## s__Bifidobacterium_pseudocatenulatum
## s__Lachnospiraceae_bacterium_5_1_63FAA
## The number of bacteria unmatched the Reference Matrix is [9]
## s__Lactobacillus_salivarius
## s__Bacteroides_fragilis
## s__Parabacteroides_goldsteinii
## s__Escherichia_coli
## s__Enterococcus_faecalis
## s__Bifidobacterium_bifidum
## s__Lactobacillus_pentosus
## s__Bacteroides_intestinalis
## s__Eggerthella_unclassified
## The number of the additional bacteria compared to the Reference Matrix is [56]
## ###########################################################
## 
## ##################Status of the BRS sample##################
## Whether the BRS has the all bateria of Reference Matrix: FALSE
## Correlation Coefficient of the BRS is: -0.06471
## Bray Curtis of the BRS is: 0.8752
## Impurity of the BRS is: 32.69
## ###########################################################
## #####Final Evaluation Results of the BRS #######
## The BRS of sequencing dataset didn't pass the cutoff of the Reference Matrix
## ###########################################################
##                                         7682    7683    7684    7685     7842     7843     7844     7845     refE        mean
## Bifidobacterium_longum              10.31563 9.25812 9.69184 7.76031 11.03311 11.61484 12.29030 11.69019  0.01646  9.29675556
## Bacteroides_uniformis                2.24195 2.24035 1.92015 2.18435  2.43230  2.38180  2.13830  2.41437  0.21061  2.01824222
## Faecalibacterium_prausnitzii         0.65615 0.60153 0.60112 0.62079  0.54147  0.55383  0.58806  0.54655  1.61939  0.70321000
## Bifidobacterium_adolescentis         7.05426 6.28460 6.57297 6.25448  4.69357  4.80628  4.94943  4.84278  0.04649  5.05609556
## Bacteroides_thetaiotaomicron         3.25076 3.31897 3.22418 3.43809  3.35611  3.38323  3.29098  3.30355  1.47422  3.11556556
## Collinsella_aerofaciens              0.56220 0.53249 0.60476 0.47934  0.65513  0.68833  0.76063  0.66896  0.07251  0.55826111
## Coprococcus_comes                    2.28581 2.42978 2.25227 2.82904  1.21527  1.16485  1.07160  1.07834  0.01444  1.59348889
## Dorea_formicigenerans                4.83509 5.02149 5.18268 5.56891  3.34720  3.09877  3.08677  2.70732  0.02775  3.65288667
## Streptococcus_salivarius             3.54266 3.74119 3.62036 4.01546  2.90216  2.70193  2.61760  2.57348  0.02846  2.86036667
## Bacteroides_xylanisolvens            1.55648 1.84824 1.91166 1.85273  1.75220  1.74002  1.69811  1.67676  0.32466  1.59565111
## Bacteroides_ovatus                   3.08489 3.27226 3.10904 3.24565  3.51376  3.50063  3.37872  3.42300  0.25782  2.97619667
## Roseburia_hominis                    0.04383 0.04183 0.04107 0.02304  0.03853  0.03464  0.03532  0.03597  0.01307  0.03414444
## Bacteroides_vulgatus                 3.06713 3.20369 3.14979 3.15352  3.24822  3.09280  3.13038  3.06113  2.14684  3.02816667
## Prevotella_copri                     2.03128 1.99619 1.92504 2.15422  1.57638  1.60584  1.57913  1.59224 60.84109  8.36682333
## Bifidobacterium_pseudocatenulatum    6.89310 6.36177 7.39605 5.76804  6.19464  6.47094  7.41615  6.27837  0.15023  5.88103222
## Lachnospiraceae_bacterium_5_1_63FAA  0.22002 0.26062 0.28058 0.33329  0.07093  0.06778  0.05229  0.04781  0.06825  0.15573000
## Impurity_level                       7.33876 7.34799 7.14891 6.47250  8.88190  9.72198  9.72162 10.57600 32.69000 11.09996222
##                                                                                Evaluation
## Bifidobacterium_longum              refE didn't pass the threshold (2022-10-08 15:36:53).
## Bacteroides_uniformis               refE didn't pass the threshold (2022-10-08 15:36:53).
## Faecalibacterium_prausnitzii        refE didn't pass the threshold (2022-10-08 15:36:53).
## Bifidobacterium_adolescentis        refE didn't pass the threshold (2022-10-08 15:36:53).
## Bacteroides_thetaiotaomicron        refE didn't pass the threshold (2022-10-08 15:36:53).
## Collinsella_aerofaciens             refE didn't pass the threshold (2022-10-08 15:36:53).
## Coprococcus_comes                   refE didn't pass the threshold (2022-10-08 15:36:53).
## Dorea_formicigenerans               refE didn't pass the threshold (2022-10-08 15:36:53).
## Streptococcus_salivarius            refE didn't pass the threshold (2022-10-08 15:36:53).
## Bacteroides_xylanisolvens           refE didn't pass the threshold (2022-10-08 15:36:53).
## Bacteroides_ovatus                  refE didn't pass the threshold (2022-10-08 15:36:53).
## Roseburia_hominis                   refE didn't pass the threshold (2022-10-08 15:36:53).
## Bacteroides_vulgatus                refE didn't pass the threshold (2022-10-08 15:36:53).
## Prevotella_copri                    refE didn't pass the threshold (2022-10-08 15:36:53).
## Bifidobacterium_pseudocatenulatum   refE didn't pass the threshold (2022-10-08 15:36:53).
## Lachnospiraceae_bacterium_5_1_63FAA refE didn't pass the threshold (2022-10-08 15:36:53).
## Impurity_level                      refE didn't pass the threshold (2022-10-08 15:36:53).

The spike-in samples didn’t pass the cutoff and failed to add the the Reference Matrix.

14.4.2.2 Spike-in sample’s remove

metaphlan2_ps_remove_BRS <- get_GroupPhyloseq(
                               ps = metaphlan2_ps,
                               group = "Group",
                               group_names = "QC",
                               discard = TRUE)
metaphlan2_ps_remove_BRS
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 326 taxa and 22 samples ]
## sample_data() Sample Data:       [ 22 samples by 2 sample variables ]
## tax_table()   Taxonomy Table:    [ 326 taxa by 7 taxonomic ranks ]

14.4.3 Data processing

This part has too may procedures and we only choose some of them. Please go to Chapter 6 to see more approaches and details for being familiar with this part.

14.4.3.1 Extracting specific taxonomic level

metaphlan2_ps_species <- summarize_taxa(ps = metaphlan2_ps_remove_BRS, 
                                        taxa_level = "Species")
metaphlan2_ps_species
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 325 taxa and 22 samples ]
## sample_data() Sample Data:       [ 22 samples by 2 sample variables ]
## tax_table()   Taxonomy Table:    [ 325 taxa by 7 taxonomic ranks ]

14.4.3.2 Filtering the low relative abundance or unclassified taxa by the threshold (total counts < 1e-4)

The condition to filter low relative abundance is according to this article (Thingholm et al. 2019).

Species from taxonomic profiles were retained for further analysis if their mean relative abundance exceeded 0.005 (0.5%) across the dataset with a minimum abundance of 0.05 (5%) in at least one sample and non-zero abundance in at least 60% of samples.

There are three conditions

  1. Mean relative abundance: 0.005;

  2. Minimum relative abundance: 0.05;

  3. Occurrence: 60%.

Here, we use 0.01 (the 1e-4 regarded as 0.01 compared to the Referece because Metaphlan2 data had been divided 100)

metaphlan2_ps_species_filter <- run_filter(ps = metaphlan2_ps_species, 
                                           cutoff = 1e-4, 
                                           unclass = TRUE)
metaphlan2_ps_species_filter 
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 233 taxa and 22 samples ]
## sample_data() Sample Data:       [ 22 samples by 2 sample variables ]
## tax_table()   Taxonomy Table:    [ 233 taxa by 7 taxonomic ranks ]

14.4.3.3 Trimming the taxa with low occurrence less than threshold

metaphlan2_ps_species_filter_trim <- run_trim(object = metaphlan2_ps_species_filter, 
                                              cutoff = 0.1, 
                                              trim = "feature")
metaphlan2_ps_species_filter_trim
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 179 taxa and 22 samples ]
## sample_data() Sample Data:       [ 22 samples by 2 sample variables ]
## tax_table()   Taxonomy Table:    [ 179 taxa by 7 taxonomic ranks ]

Finally, we obtained the final phyloseq-class object metaphlan2_ps_species_filter_trim and changed its name.

14.4.4 Diversity analysis

14.4.4.1 Alpha diveristy

  • Calculate the alpha diversity

Notes: choosing the measures (Shannon, Simpson and InvSimpson) only for relative abundance.

metaphlan2_ps_genus_alpha <- run_alpha_diversity(ps = metaphlan2_ps_remove_BRS, 
                                                 measures = c("Shannon", "Simpson", "InvSimpson"))
head(metaphlan2_ps_genus_alpha)
##   TempRowNames Group phynotype  Shannon   Simpson InvSimpson
## 1           s1    BB      0.00 2.876002 0.9108857  11.221549
## 2           s2    AA      2.50 2.045392 0.8105742   5.279111
## 3           s3    BB      0.00 3.441176 0.9439371  17.837114
## 4           s4    AA      1.25 2.746917 0.8290768   5.850579
## 5           s5    AA     30.00 1.450722 0.6412178   2.787207
## 6           s6    AA     15.00 2.619951 0.8950369   9.527154
  • visualization
plot_boxplot(data = metaphlan2_ps_genus_alpha,
             y_index = c("Shannon", "Simpson", "InvSimpson"),
             group = "Group",
             group_names = c("AA", "BB"),
             group_color = c("red", "blue"),
             method = "wilcox.test")
Alpha diversity (MGS example)

Figure 14.13: Alpha diversity (MGS example)

14.4.4.2 Beta diversity

  • beta dipersion
metaphlan2_ps_beta <- run_beta_diversity(ps = metaphlan2_ps_species_filter_trim, 
                                         method = "bray", 
                                         group = "Group")
## 
## Permutation test for homogeneity of multivariate dispersions
## Permutation: free
## Number of permutations: 999
## 
## Response: Distances
##           Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
## Groups     1 0.005966 0.0059663 0.7914    999  0.372
## Residuals 20 0.150783 0.0075392                     
## 
## Pairwise comparisons:
## (Observed p-value below diagonal, permuted p-value above diagonal)
##         AA    BB
## AA         0.365
## BB 0.38426
metaphlan2_ps_beta$BetaDispersion
Beta diversity (MGS example)

Figure 14.14: Beta diversity (MGS example)

14.4.5 PERMANOVA + Ordination

14.4.5.1 PERMANOVA

metaphlan2_ps_per <- run_permanova(ps = metaphlan2_ps_species_filter_trim, 
                                   method = "bray", 
                                   columns = "Group")
head(metaphlan2_ps_per)
##       SumsOfSample Df SumsOfSqs   MeanSqs  F.Model        R2 Pr(>F) AdjustedPvalue
## Group           22  1 0.8452127 0.8452127 2.728018 0.1200289  0.002          0.002

The PERMANOVA result of the Group (AdjustedPvalue < 0.05) revealed that the two groups had the distinct patterns of microbial community.

14.4.5.2 Ordination

We performed ordination by using Uniform Manifold Approximation and Projection for Dimension Reduction (UMAP). If you want to try other methods please go to see Chapter 8 for more details.

metaphlan2_ps_ordination <- run_ordination(
                       ps = metaphlan2_ps_species_filter_trim,
                       group = "Group",
                       method = "UMAP")

plot_Ordination(ResultList = metaphlan2_ps_ordination, 
                group = "Group", 
                group_names = c("AA", "BB"),
                group_color = c("blue", "red"),
                sample = TRUE,
                sidelinechart = FALSE,
                circle_type = "ellipse_line",
                sideboxplot = TRUE)
PCoA (MGS example)

Figure 14.15: PCoA (MGS example)

14.4.6 Microbial composition

14.4.6.1 Stacked barplot

  • XVIZ package
plot_stacked_bar_XIVZ(
        phyloseq = metaphlan2_ps_species,
        level = "Phylum",
        feature = "Group")
Microbial composition (MGS) XVIZ

Figure 14.16: Microbial composition (MGS) XVIZ

  • XMAS package
plot_StackBarPlot(
        ps = metaphlan2_ps_species,
        taxa_level = "Phylum",
        group = "Group",
        cluster = TRUE)
## [1] "This palatte have 20 colors!"
Microbial composition (MGS example)

Figure 14.17: Microbial composition (MGS example)

14.4.6.2 Core microbiota

  • visualization
library(RColorBrewer)

prevalences <- seq(0.05, 1, 0.05)
detections <- 10^seq(log10(1e-3), log10(.2), length = 10)

plot_core_taxa(metaphlan2_ps_species_filter_trim, 
               plot.type = "heatmap", 
               colours = rev(brewer.pal(5, "Spectral")),
               prevalences = prevalences, 
               detections = detections, 
               min.prevalence = 0.5)+
    xlab("Detection Threshold (Relative Abundance (%))") +
    theme(axis.text.y = element_text(face="italic"))
Core taxa (MGS example)

Figure 14.18: Core taxa (MGS example)

The degree of color indicates the size of abundance and prevalence.

  • Use core_members to obtain the core taxa. detection for abundance and prevalence for occurrence.
core_taxa_name <- core_members(metaphlan2_ps_species_filter_trim, 
                               detection = 0.001, 
                               prevalence = 0.5)
print(core_taxa_name)
##  [1] "s__Bacteroides_ovatus"                "s__Bacteroides_thetaiotaomicron"      "s__Bacteroides_uniformis"            
##  [4] "s__Bacteroides_vulgatus"              "s__Bifidobacterium_longum"            "s__Bifidobacterium_pseudocatenulatum"
##  [7] "s__Collinsella_aerofaciens"           "s__Escherichia_coli"                  "s__Eubacterium_eligens"              
## [10] "s__Eubacterium_hallii"                "s__Faecalibacterium_prausnitzii"      "s__Roseburia_inulinivorans"          
## [13] "s__Ruminococcus_gnavus"               "s__Ruminococcus_obeum"                "s__Ruminococcus_sp_5_1_39BFAA"       
## [16] "s__Ruminococcus_torques"              "s__Streptococcus_salivarius"

Result:

17 species passed the threshold of detection and prevalence which we choose.

14.4.7 Differential Analysis

There are more than 10 approaches to perform differential analysis. Here, we choose two of them and recommend users going to Chapter 10 to see more detials.

14.4.7.1 Liner discriminant analysis (LDA) effect size (LEfSe)

  • Calculation
metaphlan2_ps_lefse <- run_lefse(
                          ps = metaphlan2_ps_species_filter_trim,
                          group = "Group",
                          group_names = c("AA", "BB"),
                          norm = "CPM",
                          Lda = 2)
head(metaphlan2_ps_lefse)
##                            TaxaID         Block Enrichment LDA_Score EffectSize Log2FoldChange (Median)\nAA_vs_BB
## 1    s__Actinomyces_odontolyticus 7_AA vs 15_BB         BB  2.354203 0.09521043                                NA
## 2 s__Bacteroides_thetaiotaomicron 7_AA vs 15_BB         AA -4.429029 4.25590505                          5.422189
## 3 s__Bifidobacterium_adolescentis 7_AA vs 15_BB         BB  4.443556 4.18357937                                NA
## 4       s__Bifidobacterium_longum 7_AA vs 15_BB         BB  4.446101 2.06541430                         -4.651371
## 5       s__Clostridium_bartlettii 7_AA vs 15_BB         BB  2.927763 1.73866565                                NA
## 6      s__Collinsella_aerofaciens 7_AA vs 15_BB         BB  3.839013 3.22865691                                NA
##   Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All)
## 1                  0.0000                0.000             5.595339                              NA              16.83488
## 2               2878.5476            46631.686          1087.526189                        3.198416           24113.15854
## 3                  0.0000                0.000          3843.156200                              NA           36041.41284
## 4              11049.6932             1203.167         30236.246304                       -3.717363           41205.25130
## 5                287.0742                0.000           459.271930                       -3.712705            1293.74022
## 6               7678.6818                0.000         13208.610915                       -3.239758           12628.30960
##   Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB
## 1             0.0000           24.69116                    40.91                  0.00                 60.00
## 2         61441.3764         6693.32352                    90.91                100.00                 86.67
## 3             0.0000        52860.73882                    40.91                  0.00                 60.00
## 4          4437.1597        58363.69406                    86.36                 85.71                 86.67
## 5           139.7505         1832.26875                    68.18                 42.86                 80.00
## 6          1868.4023        17649.59965                    63.64                 42.86                 73.33
##   Odds Ratio (95% CI)
## 1                <NA>
## 2    0.064 (-5.3;5.5)
## 3                <NA>
## 4       270 (280;260)
## 5       450 (470;440)
## 6          28 (35;22)
  • Visualization
# # don't run this code when you do lefse in reality
# metaphlan2_ps_lefse$LDA_Score <- metaphlan2_ps_lefse$LDA_Score * 100

plot_lefse(
  da_res = metaphlan2_ps_lefse,
  x_index = "LDA_Score",
  x_index_cutoff = 2,
  group_color = c("green", "red"))
Lefse analysis (MGS example)

Figure 14.19: Lefse analysis (MGS example)

14.4.7.2 Wilcoxon Rank-Sum test

  • Calculation
metaphlan2_ps_wilcox <- run_wilcox(
                          ps = metaphlan2_ps_species_filter_trim,
                          group = "Group",
                          group_names = c("AA", "BB"))

head(metaphlan2_ps_wilcox)
##                           TaxaID         Block Enrichment EffectSize Statistic     Pvalue AdjustedPvalue
## 1  s__Acidaminococcus_fermentans 7_AA vs 15_BB  Nonsignif 0.09951572      49.0 0.75333110      0.9364324
## 2   s__Acidaminococcus_intestini 7_AA vs 15_BB  Nonsignif 0.75222128      51.0 0.91657416      1.0000000
## 3   s__Actinomyces_odontolyticus 7_AA vs 15_BB  Nonsignif 4.45220244      21.0 0.01422085      0.2314119
## 4 s__Adlercreutzia_equolifaciens 7_AA vs 15_BB  Nonsignif 0.60623026      28.0 0.05769072      0.4414970
## 5     s__Akkermansia_muciniphila 7_AA vs 15_BB  Nonsignif 1.45465974      63.0 0.23762783      0.5826765
## 6        s__Alistipes_finegoldii 7_AA vs 15_BB  Nonsignif 0.66890792      73.5 0.11454477      0.5119018
##   Log2FoldChange (Median)\nAA_vs_BB Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB
## 1                                NA                       0            0.0000000             0.00e+00
## 2                                NA                       0            0.0000000             0.00e+00
## 3                                NA                       0            0.0000000             3.90e-06
## 4                                NA                       0            0.0000000             5.58e-05
## 5                                NA                       0            0.0000000             0.00e+00
## 6                                NA                       0            0.0027362             0.00e+00
##   Log2FoldChange (Rank)\nAA_vs_BB Mean Rank Abundance\nAA Mean Rank Abundance\nBB Occurrence (100%)\n(All)
## 1                     -0.09269949                   11.00                   11.73                    18.18
## 2                     -0.03907932                   11.29                   11.60                    18.18
## 3                     -0.95817982                    7.00                   13.60                    40.91
## 4                     -0.71479501                    8.00                   13.13                    40.91
## 5                      0.26748031                   13.00                   10.80                    13.64
## 6                      0.52169761                   14.50                   10.10                    45.45
##   Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1                 14.29                 20.00      1.5 (2.4;0.69)
## 2                 14.29                 20.00     0.77 (0.26;1.3)
## 3                  0.00                 60.00                <NA>
## 4                 14.29                 53.33       Inf (Inf;Inf)
## 5                 28.57                  6.67     0.83 (0.48;1.2)
## 6                 57.14                 40.00     0.0028 (-12;12)
  • Volcano
plot_volcano(
    da_res = metaphlan2_ps_wilcox,
    group_names = c("AA", "BB"),
    x_index = "Log2FoldChange (Rank)\nAA_vs_BB",
    x_index_cutoff = 0.5,
    y_index = "Pvalue",
    y_index_cutoff = 0.05,
    group_color = c("red", "grey", "blue"),
    topN = 5)
Wilcoxon Rank-Sum test (MGS example)

Figure 14.20: Wilcoxon Rank-Sum test (MGS example)

14.5 Systematic Information

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.2 (2021-11-01)
##  os       macOS Monterey 12.2.1
##  system   x86_64, darwin17.0
##  ui       RStudio
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Asia/Shanghai
##  date     2022-10-08
##  rstudio  2022.07.1+554 Spotted Wakerobin (desktop)
##  pandoc   2.18 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
##  package              * version  date (UTC) lib source
##  abind                  1.4-5    2016-07-21 [1] CRAN (R 4.1.0)
##  ade4                   1.7-18   2021-09-16 [1] CRAN (R 4.1.0)
##  ALDEx2                 1.26.0   2021-10-26 [1] Bioconductor
##  annotate               1.72.0   2021-10-26 [1] Bioconductor
##  AnnotationDbi          1.56.2   2021-11-09 [1] Bioconductor
##  ape                  * 5.6-2    2022-03-02 [1] CRAN (R 4.1.2)
##  askpass                1.1      2019-01-13 [1] CRAN (R 4.1.0)
##  assertthat             0.2.1    2019-03-21 [1] CRAN (R 4.1.0)
##  backports              1.4.1    2021-12-13 [1] CRAN (R 4.1.0)
##  base64enc              0.1-3    2015-07-28 [1] CRAN (R 4.1.0)
##  bayesm                 3.1-4    2019-10-15 [1] CRAN (R 4.1.0)
##  Biobase              * 2.54.0   2021-10-26 [1] Bioconductor
##  BiocGenerics         * 0.40.0   2021-10-26 [1] Bioconductor
##  BiocParallel           1.28.3   2021-12-09 [1] Bioconductor
##  biomformat             1.22.0   2021-10-26 [1] Bioconductor
##  Biostrings             2.62.0   2021-10-26 [1] Bioconductor
##  bit                    4.0.4    2020-08-04 [1] CRAN (R 4.1.0)
##  bit64                  4.0.5    2020-08-30 [1] CRAN (R 4.1.0)
##  bitops                 1.0-7    2021-04-24 [1] CRAN (R 4.1.0)
##  blob                   1.2.2    2021-07-23 [1] CRAN (R 4.1.0)
##  bookdown               0.27     2022-06-14 [1] CRAN (R 4.1.2)
##  brio                   1.1.3    2021-11-30 [1] CRAN (R 4.1.0)
##  broom                  1.0.0    2022-07-01 [1] CRAN (R 4.1.2)
##  bslib                  0.3.1    2021-10-06 [1] CRAN (R 4.1.0)
##  cachem                 1.0.6    2021-08-19 [1] CRAN (R 4.1.0)
##  callr                  3.7.0    2021-04-20 [1] CRAN (R 4.1.0)
##  car                    3.0-12   2021-11-06 [1] CRAN (R 4.1.0)
##  carData                3.0-5    2022-01-06 [1] CRAN (R 4.1.2)
##  caTools                1.18.2   2021-03-28 [1] CRAN (R 4.1.0)
##  checkmate              2.0.0    2020-02-06 [1] CRAN (R 4.1.0)
##  class                  7.3-20   2022-01-13 [1] CRAN (R 4.1.2)
##  classInt               0.4-3    2020-04-07 [1] CRAN (R 4.1.0)
##  cli                    3.3.0    2022-04-25 [1] CRAN (R 4.1.2)
##  cluster                2.1.2    2021-04-17 [1] CRAN (R 4.1.2)
##  codetools              0.2-18   2020-11-04 [1] CRAN (R 4.1.2)
##  coin                   1.4-2    2021-10-08 [1] CRAN (R 4.1.0)
##  colorspace             2.0-3    2022-02-21 [1] CRAN (R 4.1.2)
##  compositions           2.0-4    2022-01-05 [1] CRAN (R 4.1.2)
##  conflicted           * 1.1.0    2021-11-26 [1] CRAN (R 4.1.0)
##  corrplot               0.92     2021-11-18 [1] CRAN (R 4.1.0)
##  cowplot                1.1.1    2020-12-30 [1] CRAN (R 4.1.0)
##  crayon                 1.5.0    2022-02-14 [1] CRAN (R 4.1.2)
##  crosstalk              1.2.0    2021-11-04 [1] CRAN (R 4.1.0)
##  data.table             1.14.2   2021-09-27 [1] CRAN (R 4.1.0)
##  DBI                    1.1.2    2021-12-20 [1] CRAN (R 4.1.0)
##  DelayedArray           0.20.0   2021-10-26 [1] Bioconductor
##  DEoptimR               1.0-10   2022-01-03 [1] CRAN (R 4.1.2)
##  desc                   1.4.1    2022-03-06 [1] CRAN (R 4.1.2)
##  DESeq2                 1.34.0   2021-10-26 [1] Bioconductor
##  devtools             * 2.4.3    2021-11-30 [1] CRAN (R 4.1.0)
##  digest                 0.6.29   2021-12-01 [1] CRAN (R 4.1.0)
##  dplyr                * 1.0.8    2022-02-08 [1] CRAN (R 4.1.2)
##  DT                     0.21     2022-02-26 [1] CRAN (R 4.1.2)
##  e1071                  1.7-9    2021-09-16 [1] CRAN (R 4.1.0)
##  edgeR                  3.36.0   2021-10-26 [1] Bioconductor
##  ellipsis               0.3.2    2021-04-29 [1] CRAN (R 4.1.0)
##  evaluate               0.15     2022-02-18 [1] CRAN (R 4.1.2)
##  FactoMineR             2.4      2020-12-11 [1] CRAN (R 4.1.0)
##  fansi                  1.0.2    2022-01-14 [1] CRAN (R 4.1.2)
##  farver                 2.1.0    2021-02-28 [1] CRAN (R 4.1.0)
##  fastmap                1.1.0    2021-01-25 [1] CRAN (R 4.1.0)
##  flashClust             1.01-2   2012-08-21 [1] CRAN (R 4.1.0)
##  foreach                1.5.2    2022-02-02 [1] CRAN (R 4.1.2)
##  foreign                0.8-82   2022-01-13 [1] CRAN (R 4.1.2)
##  forestplot             2.0.1    2021-09-03 [1] CRAN (R 4.1.0)
##  formatR                1.11     2021-06-01 [1] CRAN (R 4.1.0)
##  Formula                1.2-4    2020-10-16 [1] CRAN (R 4.1.0)
##  fs                     1.5.2    2021-12-08 [1] CRAN (R 4.1.0)
##  futile.logger        * 1.4.3    2016-07-10 [1] CRAN (R 4.1.0)
##  futile.options         1.0.1    2018-04-20 [1] CRAN (R 4.1.0)
##  genefilter             1.76.0   2021-10-26 [1] Bioconductor
##  geneplotter            1.72.0   2021-10-26 [1] Bioconductor
##  generics               0.1.2    2022-01-31 [1] CRAN (R 4.1.2)
##  GenomeInfoDb         * 1.30.1   2022-01-30 [1] Bioconductor
##  GenomeInfoDbData       1.2.7    2022-03-09 [1] Bioconductor
##  GenomicRanges        * 1.46.1   2021-11-18 [1] Bioconductor
##  ggiraph                0.8.2    2022-02-22 [1] CRAN (R 4.1.2)
##  ggiraphExtra           0.3.0    2020-10-06 [1] CRAN (R 4.1.2)
##  ggplot2              * 3.3.5    2021-06-25 [1] CRAN (R 4.1.0)
##  ggpubr               * 0.4.0    2020-06-27 [1] CRAN (R 4.1.0)
##  ggrepel                0.9.1    2021-01-15 [1] CRAN (R 4.1.0)
##  ggsci                  2.9      2018-05-14 [1] CRAN (R 4.1.0)
##  ggsignif               0.6.3    2021-09-09 [1] CRAN (R 4.1.0)
##  ggVennDiagram          1.2.1    2022-04-13 [1] Github (gaospecial/ggVennDiagram@db6742d)
##  glmnet                 4.1-3    2021-11-02 [1] CRAN (R 4.1.0)
##  glue                 * 1.6.2    2022-02-24 [1] CRAN (R 4.1.2)
##  Gmisc                * 3.0.0    2022-01-03 [1] CRAN (R 4.1.2)
##  gplots                 3.1.1    2020-11-28 [1] CRAN (R 4.1.0)
##  gridExtra              2.3      2017-09-09 [1] CRAN (R 4.1.0)
##  gtable                 0.3.0    2019-03-25 [1] CRAN (R 4.1.0)
##  gtools                 3.9.2    2021-06-06 [1] CRAN (R 4.1.0)
##  highr                  0.9      2021-04-16 [1] CRAN (R 4.1.0)
##  Hmisc                  4.6-0    2021-10-07 [1] CRAN (R 4.1.0)
##  hms                    1.1.1    2021-09-26 [1] CRAN (R 4.1.0)
##  htmlTable            * 2.4.0    2022-01-04 [1] CRAN (R 4.1.2)
##  htmltools              0.5.2    2021-08-25 [1] CRAN (R 4.1.0)
##  htmlwidgets            1.5.4    2021-09-08 [1] CRAN (R 4.1.0)
##  httr                   1.4.2    2020-07-20 [1] CRAN (R 4.1.0)
##  igraph                 1.2.11   2022-01-04 [1] CRAN (R 4.1.2)
##  insight                0.17.0   2022-03-29 [1] CRAN (R 4.1.2)
##  IRanges              * 2.28.0   2021-10-26 [1] Bioconductor
##  iterators              1.0.14   2022-02-05 [1] CRAN (R 4.1.2)
##  jpeg                   0.1-9    2021-07-24 [1] CRAN (R 4.1.0)
##  jquerylib              0.1.4    2021-04-26 [1] CRAN (R 4.1.0)
##  jsonlite               1.8.0    2022-02-22 [1] CRAN (R 4.1.2)
##  kableExtra             1.3.4    2021-02-20 [1] CRAN (R 4.1.2)
##  KEGGREST               1.34.0   2021-10-26 [1] Bioconductor
##  KernSmooth             2.23-20  2021-05-03 [1] CRAN (R 4.1.2)
##  knitr                  1.39     2022-04-26 [1] CRAN (R 4.1.2)
##  labeling               0.4.2    2020-10-20 [1] CRAN (R 4.1.0)
##  lambda.r               1.2.4    2019-09-18 [1] CRAN (R 4.1.0)
##  lattice              * 0.20-45  2021-09-22 [1] CRAN (R 4.1.2)
##  latticeExtra           0.6-29   2019-12-19 [1] CRAN (R 4.1.0)
##  leaps                  3.1      2020-01-16 [1] CRAN (R 4.1.0)
##  libcoin                1.0-9    2021-09-27 [1] CRAN (R 4.1.0)
##  lifecycle              1.0.1    2021-09-24 [1] CRAN (R 4.1.0)
##  limma                  3.50.1   2022-02-17 [1] Bioconductor
##  locfit                 1.5-9.5  2022-03-03 [1] CRAN (R 4.1.2)
##  LOCOM                  1.1      2022-08-05 [1] Github (yijuanhu/LOCOM@c181e0f)
##  lubridate              1.8.0    2021-10-07 [1] CRAN (R 4.1.0)
##  magrittr             * 2.0.2    2022-01-26 [1] CRAN (R 4.1.2)
##  MASS                   7.3-55   2022-01-13 [1] CRAN (R 4.1.2)
##  Matrix                 1.4-0    2021-12-08 [1] CRAN (R 4.1.0)
##  MatrixGenerics       * 1.6.0    2021-10-26 [1] Bioconductor
##  matrixStats          * 0.61.0   2021-09-17 [1] CRAN (R 4.1.0)
##  mbzinb                 0.2      2022-03-16 [1] local
##  memoise                2.0.1    2021-11-26 [1] CRAN (R 4.1.0)
##  metagenomeSeq          1.36.0   2021-10-26 [1] Bioconductor
##  mgcv                   1.8-39   2022-02-24 [1] CRAN (R 4.1.2)
##  microbiome             1.16.0   2021-10-26 [1] Bioconductor
##  modeltools             0.2-23   2020-03-05 [1] CRAN (R 4.1.0)
##  multcomp               1.4-18   2022-01-04 [1] CRAN (R 4.1.2)
##  multtest               2.50.0   2021-10-26 [1] Bioconductor
##  munsell                0.5.0    2018-06-12 [1] CRAN (R 4.1.0)
##  mvtnorm                1.1-3    2021-10-08 [1] CRAN (R 4.1.0)
##  mycor                  0.1.1    2018-04-10 [1] CRAN (R 4.1.0)
##  NADA                   1.6-1.1  2020-03-22 [1] CRAN (R 4.1.0)
##  nlme                 * 3.1-155  2022-01-13 [1] CRAN (R 4.1.2)
##  nnet                   7.3-17   2022-01-13 [1] CRAN (R 4.1.2)
##  openssl                2.0.0    2022-03-02 [1] CRAN (R 4.1.2)
##  permute              * 0.9-7    2022-01-27 [1] CRAN (R 4.1.2)
##  pheatmap             * 1.0.12   2019-01-04 [1] CRAN (R 4.1.0)
##  phyloseq             * 1.38.0   2021-10-26 [1] Bioconductor
##  picante              * 1.8.2    2020-06-10 [1] CRAN (R 4.1.0)
##  pillar                 1.7.0    2022-02-01 [1] CRAN (R 4.1.2)
##  pkgbuild               1.3.1    2021-12-20 [1] CRAN (R 4.1.0)
##  pkgconfig              2.0.3    2019-09-22 [1] CRAN (R 4.1.0)
##  pkgload                1.2.4    2021-11-30 [1] CRAN (R 4.1.0)
##  plyr                   1.8.6    2020-03-03 [1] CRAN (R 4.1.0)
##  png                    0.1-7    2013-12-03 [1] CRAN (R 4.1.0)
##  ppcor                  1.1      2015-12-03 [1] CRAN (R 4.1.0)
##  prettyunits            1.1.1    2020-01-24 [1] CRAN (R 4.1.0)
##  processx               3.5.2    2021-04-30 [1] CRAN (R 4.1.0)
##  protoclust             1.6.3    2019-01-31 [1] CRAN (R 4.1.0)
##  proxy                  0.4-26   2021-06-07 [1] CRAN (R 4.1.0)
##  ps                     1.6.0    2021-02-28 [1] CRAN (R 4.1.0)
##  pscl                   1.5.5    2020-03-07 [1] CRAN (R 4.1.0)
##  purrr                * 0.3.4    2020-04-17 [1] CRAN (R 4.1.0)
##  qvalue                 2.26.0   2021-10-26 [1] Bioconductor
##  R6                     2.5.1    2021-08-19 [1] CRAN (R 4.1.0)
##  RAIDA                  1.0      2022-03-14 [1] local
##  RColorBrewer         * 1.1-2    2014-12-07 [1] CRAN (R 4.1.0)
##  Rcpp                 * 1.0.8.2  2022-03-11 [1] CRAN (R 4.1.2)
##  RcppZiggurat           0.1.6    2020-10-20 [1] CRAN (R 4.1.0)
##  RCurl                  1.98-1.6 2022-02-08 [1] CRAN (R 4.1.2)
##  readr                * 2.1.2    2022-01-30 [1] CRAN (R 4.1.2)
##  remotes                2.4.2    2021-11-30 [1] CRAN (R 4.1.0)
##  reshape2               1.4.4    2020-04-09 [1] CRAN (R 4.1.0)
##  reticulate             1.24     2022-01-26 [1] CRAN (R 4.1.2)
##  Rfast                  2.0.6    2022-02-16 [1] CRAN (R 4.1.2)
##  rhdf5                  2.38.1   2022-03-10 [1] Bioconductor
##  rhdf5filters           1.6.0    2021-10-26 [1] Bioconductor
##  Rhdf5lib               1.16.0   2021-10-26 [1] Bioconductor
##  rlang                  1.0.2    2022-03-04 [1] CRAN (R 4.1.2)
##  rmarkdown              2.14     2022-04-25 [1] CRAN (R 4.1.2)
##  robustbase             0.93-9   2021-09-27 [1] CRAN (R 4.1.0)
##  rpart                  4.1.16   2022-01-24 [1] CRAN (R 4.1.2)
##  rprojroot              2.0.2    2020-11-15 [1] CRAN (R 4.1.0)
##  RSpectra               0.16-0   2019-12-01 [1] CRAN (R 4.1.0)
##  RSQLite                2.2.10   2022-02-17 [1] CRAN (R 4.1.2)
##  rstatix                0.7.0    2021-02-13 [1] CRAN (R 4.1.0)
##  rstudioapi             0.13     2020-11-12 [1] CRAN (R 4.1.0)
##  Rtsne                  0.15     2018-11-10 [1] CRAN (R 4.1.0)
##  RVenn                  1.1.0    2019-07-18 [1] CRAN (R 4.1.0)
##  rvest                  1.0.2    2021-10-16 [1] CRAN (R 4.1.0)
##  S4Vectors            * 0.32.3   2021-11-21 [1] Bioconductor
##  sandwich               3.0-1    2021-05-18 [1] CRAN (R 4.1.0)
##  sass                   0.4.0    2021-05-12 [1] CRAN (R 4.1.0)
##  scales                 1.1.1    2020-05-11 [1] CRAN (R 4.1.0)
##  scatterplot3d          0.3-41   2018-03-14 [1] CRAN (R 4.1.0)
##  sessioninfo            1.2.2    2021-12-06 [1] CRAN (R 4.1.0)
##  sf                     1.0-7    2022-03-07 [1] CRAN (R 4.1.2)
##  shape                  1.4.6    2021-05-19 [1] CRAN (R 4.1.0)
##  sjlabelled             1.2.0    2022-04-10 [1] CRAN (R 4.1.2)
##  sjmisc                 2.8.9    2021-12-03 [1] CRAN (R 4.1.0)
##  stringi                1.7.6    2021-11-29 [1] CRAN (R 4.1.0)
##  stringr                1.4.0    2019-02-10 [1] CRAN (R 4.1.0)
##  SummarizedExperiment * 1.24.0   2021-10-26 [1] Bioconductor
##  survival               3.3-1    2022-03-03 [1] CRAN (R 4.1.2)
##  svglite                2.1.0    2022-02-03 [1] CRAN (R 4.1.2)
##  systemfonts            1.0.4    2022-02-11 [1] CRAN (R 4.1.2)
##  tensorA                0.36.2   2020-11-19 [1] CRAN (R 4.1.0)
##  testthat               3.1.2    2022-01-20 [1] CRAN (R 4.1.2)
##  TH.data                1.1-0    2021-09-27 [1] CRAN (R 4.1.0)
##  tibble               * 3.1.6    2021-11-07 [1] CRAN (R 4.1.0)
##  tidyr                * 1.2.0    2022-02-01 [1] CRAN (R 4.1.2)
##  tidyselect             1.1.2    2022-02-21 [1] CRAN (R 4.1.2)
##  truncnorm              1.0-8    2018-02-27 [1] CRAN (R 4.1.0)
##  tzdb                   0.2.0    2021-10-27 [1] CRAN (R 4.1.0)
##  umap                   0.2.8.0  2022-03-23 [1] CRAN (R 4.1.2)
##  units                  0.8-0    2022-02-05 [1] CRAN (R 4.1.2)
##  usethis              * 2.1.5    2021-12-09 [1] CRAN (R 4.1.0)
##  utf8                   1.2.2    2021-07-24 [1] CRAN (R 4.1.0)
##  uuid                   1.0-3    2021-11-01 [1] CRAN (R 4.1.0)
##  vctrs                  0.3.8    2021-04-29 [1] CRAN (R 4.1.0)
##  vegan                * 2.5-7    2020-11-28 [1] CRAN (R 4.1.0)
##  VennDiagram          * 1.7.3    2022-04-12 [1] CRAN (R 4.1.2)
##  viridis              * 0.6.2    2021-10-13 [1] CRAN (R 4.1.0)
##  viridisLite          * 0.4.0    2021-04-13 [1] CRAN (R 4.1.0)
##  vroom                  1.5.7    2021-11-30 [1] CRAN (R 4.1.0)
##  webshot                0.5.3    2022-04-14 [1] CRAN (R 4.1.2)
##  withr                  2.5.0    2022-03-03 [1] CRAN (R 4.1.2)
##  Wrench                 1.12.0   2021-10-26 [1] Bioconductor
##  xfun                   0.30     2022-03-02 [1] CRAN (R 4.1.2)
##  XMAS2                * 2.1.8    2022-10-08 [1] local
##  XML                    3.99-0.9 2022-02-24 [1] CRAN (R 4.1.2)
##  xml2                   1.3.3    2021-11-30 [1] CRAN (R 4.1.0)
##  xtable                 1.8-4    2019-04-21 [1] CRAN (R 4.1.0)
##  XVector                0.34.0   2021-10-26 [1] Bioconductor
##  yaml                   2.3.5    2022-02-21 [1] CRAN (R 4.1.2)
##  zCompositions          1.4.0    2022-01-13 [1] CRAN (R 4.1.2)
##  zlibbioc               1.40.0   2021-10-26 [1] Bioconductor
##  zoo                    1.8-9    2021-03-09 [1] CRAN (R 4.1.0)
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

References

Thingholm, Louise B, Malte C Rühlemann, Manja Koch, Brie Fuqua, Guido Laucke, Ruwen Boehm, Corinna Bang, et al. 2019. “Obese Individuals with and Without Type 2 Diabetes Show Different Gut Microbial Functional Capacity and Composition.” Cell Host & Microbe 26 (2): 252–64.