Chapter 10 Differential Analysis
Identifying the significant taxa between case and control group is necessary. There are too many differential analysis (DA) approaches to choose (Nearing et al. 2022), we provide some of them which focusing on microbial data, including:
Tool.version. | Input | Normalization | Transformation | Distribution | MicrobialData |
---|---|---|---|---|---|
ALDEx2 (1.26.0) | Counts | None | CLR | Dirichlet-multinormial | 16s |
limma voom (3.50.1) | Counts/Relative | None/TMM | Log; Precision weighting | Normal | 16s/MGS |
mbzinb (0.2) | Counts | RLE | None | Zero-inflated negative binomial | 16s |
omnibus (0.2) | Counts | GMPR(Geometric Mean of Pairwise Ratios) | None | Zero-inflated negative binomial | 16s |
RAIDA (1.0) | Counts | None | zero-inflated Log | Modified t-test | 16s |
Wilcox(rare/CLR) | Counts/Relative | None | None/CLR | Non-parametric | 16s/MGS |
LEfSe | Rarefied Counts/Relative | TSS | None | Non-parametric | 16s/MGS |
t-test (rare) | Counts/Relative | None | None | Normal | 16s/MGS |
metagenomeSeq (1.36.0) | Counts | CSS | Log | Zero-inflated (log) Normal | 16s |
DESeq2 (1.34.0) | Counts | RLE | None | Negative binomial | 16s |
edgeR (3.36.0) | Counts | RLE/TMM | None | Negative binomial | 16s |
ANCOM-II (2.1) | Counts/Relative | None | ALR | Non-parametric | 16s/MGS |
Corncob (0.2.0) | Counts | None | None | Beta-binomial | 16s |
MaAslin2 (1.8.0) | Counts/Relative | None/TSS | AST | Normal | 16s/MGS |
Outline of this Chapter:
10.1 Loading Packages
library(XMAS2)
library(dplyr)
library(tibble)
library(phyloseq)
library(ggplot2)
library(ggpubr)
10.2 Importing Data
10.2.1 16s data
data("dada2_ps")
# step1: Removing samples of specific group in phyloseq-class object
<- get_GroupPhyloseq(
dada2_ps_remove_BRS ps = dada2_ps,
group = "Group",
group_names = "QC",
discard = TRUE)
# step2: Rarefying counts in phyloseq-class object
<- norm_rarefy(object = dada2_ps_remove_BRS,
dada2_ps_rarefy size = 51181)
# step3: Extracting specific taxa phyloseq-class object
<- summarize_taxa(ps = dada2_ps_rarefy,
dada2_ps_rare_genus taxa_level = "Genus",
absolute = TRUE)
# step4: Aggregating low relative abundance or unclassified taxa into others
# dada2_ps_genus_LRA <- summarize_LowAbundance_taxa(ps = dada2_ps_rare_genus,
# cutoff = 10,
# unclass = TRUE)
# step4: Filtering the low relative abundance or unclassified taxa by the threshold
<- run_filter(ps = dada2_ps_rare_genus,
dada2_ps_genus_filter cutoff = 10,
unclass = TRUE)
# step5: Trimming the taxa with low occurrence less than threshold
<- run_trim(object = dada2_ps_genus_filter,
dada2_ps_genus_filter_trim cutoff = 0.2,
trim = "feature")
dada2_ps_genus_filter_trim
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 100 taxa and 23 samples ]
## sample_data() Sample Data: [ 23 samples by 1 sample variables ]
## tax_table() Taxonomy Table: [ 100 taxa by 6 taxonomic ranks ]
10.2.2 Metagenomic data
data("metaphlan2_ps")
# step1: Removing samples of specific group in phyloseq-class object
<- get_GroupPhyloseq(
metaphlan2_ps_remove_BRS ps = metaphlan2_ps,
group = "Group",
group_names = "QC",
discard = TRUE)
# step2: Extracting specific taxa phyloseq-class object
<- summarize_taxa(ps = metaphlan2_ps_remove_BRS,
metaphlan2_ps_genus taxa_level = "Genus")
# step3: Aggregating low relative abundance or unclassified taxa into others
# dada2_ps_genus_LRA <- summarize_LowAbundance_taxa(ps = dada2_ps_genus,
# cutoff = 10,
# unclass = TRUE)
# step4: Filtering the low relative abundance or unclassified taxa by the threshold
<- run_filter(ps = metaphlan2_ps_genus,
metaphlan2_ps_genus_filter cutoff = 1e-4,
unclass = TRUE)
# step5: Trimming the taxa with low occurrence less than threshold
<- run_trim(object = metaphlan2_ps_genus_filter,
metaphlan2_ps_genus_filter_trim cutoff = 0.2,
trim = "feature")
metaphlan2_ps_genus_filter_trim
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 54 taxa and 22 samples ]
## sample_data() Sample Data: [ 22 samples by 2 sample variables ]
## tax_table() Taxonomy Table: [ 54 taxa by 6 taxonomic ranks ]
10.3 Amplicon sequencing microbial data (16s)
10.3.1 ALDEx2
ALDEx2 package is from Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis (Fernandes et al. 2014), and its principle is using log-ratio transformation and statistical testing to find the significant Taxa. (Caution: the otu_table must be integers).
run_aldex
provides 11 parameters. For instance, norm and transform are used to normalization and transformation input data. More details to see help(run_aldex)
.
<- run_aldex(
DA_ALDEx2 ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
method = "t.test")
## |------------(25%)----------(50%)----------(75%)----------|
colnames(DA_ALDEx2)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Pvalue" "AdjustedPvalue"
## [7] "Median CLR \n(All)" "Median CLR\nAA" "Median CLR\nBB"
## [10] "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)" "Median Abundance\nAA"
## [13] "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB" "Mean Abundance\n(All)"
## [16] "Mean Abundance\nAA" "Mean Abundance\nBB" "Occurrence (100%)\n(All)"
## [19] "Occurrence (100%)\nAA" "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
head(DA_ALDEx2)
## TaxaID Block Enrichment EffectSize Pvalue AdjustedPvalue Median CLR \n(All) Median CLR\nAA
## 1 g__Acidaminococcus 9_AA vs 14_BB Nonsignif 0.03216861 0.6548749 0.9018534 -3.4244631 -3.555905
## 2 g__Actinomyces 9_AA vs 14_BB Nonsignif 0.09330854 0.6658567 0.9138848 2.7645196 2.167638
## 3 g__Adlercreutzia 9_AA vs 14_BB Nonsignif 0.30832148 0.3184500 0.7716171 -0.7734251 -2.223805
## 4 g__Agathobacter 9_AA vs 14_BB Nonsignif -0.08089118 0.6408318 0.9042170 5.4463483 6.081374
## 5 g__Akkermansia 9_AA vs 14_BB Nonsignif -0.15579201 0.5972170 0.8723074 -2.9580004 -2.541838
## 6 g__Alistipes 9_AA vs 14_BB Nonsignif -0.09579649 0.6356202 0.9052481 3.7226931 4.768802
## Median CLR\nBB Log2FoldChange (Median)\nAA_vs_BB Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB
## 1 -3.3102684 NA 0 0 0.0
## 2 3.1218379 -0.9577718 30 26 50.5
## 3 0.3041529 NA 0 0 9.0
## 4 5.3302183 0.1524904 316 329 296.0
## 5 -3.2004909 NA 0 0 0.0
## 6 3.4586946 -0.1085245 64 64 69.0
## Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All) Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All)
## 1 1.7051442 58.82609 101.777778 31.21429 21.74
## 2 -1.3107676 50.91304 26.777778 66.42857 82.61
## 3 -2.4683647 19.21739 5.111111 28.28571 43.48
## 4 1.7488297 996.26087 1740.444444 517.85714 65.22
## 5 -2.1676332 93.78261 30.000000 134.78571 21.74
## 6 0.2000385 163.86957 177.888889 154.85714 73.91
## Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 22.22 21.43 0.67 (-0.11;1.5)
## 2 88.89 78.57 4.1 (6.8;1.3)
## 3 33.33 50.00 8.6 (13;4.3)
## 4 55.56 71.43 0.39 (-1.4;2.2)
## 5 22.22 21.43 1.5 (2.3;0.71)
## 6 77.78 71.43 0.88 (0.64;1.1)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
EffectSize: effect size of taxa by ALDEx2;
Pvalue and AdjustedPvalue: significant level of Pvalue and Adjusted-pvalue by ALDEx2;
Median CLR (All)/(group AA)/(group BB): Median CLR (normalization by ALDEx2) in all, group AA and group BB, respectively;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
Please open the below buttons, if you want to see other options for differential analysis in ALDEx2.
run_da() pattern
We also provide another function run_da
to run ALDEx2 differential analysis.
<- run_da(
DA_ALDEx2 ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "aldex",
method = "t.test")
## |------------(25%)----------(50%)----------(75%)----------|
colnames(DA_ALDEx2)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Pvalue" "AdjustedPvalue"
## [7] "Median CLR \n(All)" "Median CLR\nAA" "Median CLR\nBB"
## [10] "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)" "Median Abundance\nAA"
## [13] "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB" "Mean Abundance\n(All)"
## [16] "Mean Abundance\nAA" "Mean Abundance\nBB" "Occurrence (100%)\n(All)"
## [19] "Occurrence (100%)\nAA" "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
otu_table or sample_table as inputdata
We also provide data_otu and data_sam as input data to run run_aldex
.
<- run_aldex(
DA_ALDEx2 data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"),
method = "t.test")
## |------------(25%)----------(50%)----------(75%)----------|
colnames(DA_ALDEx2)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Pvalue" "AdjustedPvalue"
## [7] "Median CLR \n(All)" "Median CLR\nAA" "Median CLR\nBB"
## [10] "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)" "Median Abundance\nAA"
## [13] "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB" "Mean Abundance\n(All)"
## [16] "Mean Abundance\nAA" "Mean Abundance\nBB" "Occurrence (100%)\n(All)"
## [19] "Occurrence (100%)\nAA" "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
taxa_level option
We also provide taxa_level for choosing the specific taxonomic level to run run_aldex
.
<- run_aldex(
DA_ALDEx2 ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"),
method = "t.test")
## |------------(25%)----------(50%)----------(75%)----------|
colnames(DA_ALDEx2)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Pvalue" "AdjustedPvalue"
## [7] "Median CLR \n(All)" "Median CLR\nAA" "Median CLR\nBB"
## [10] "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)" "Median Abundance\nAA"
## [13] "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB" "Mean Abundance\n(All)"
## [16] "Mean Abundance\nAA" "Mean Abundance\nBB" "Occurrence (100%)\n(All)"
## [19] "Occurrence (100%)\nAA" "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
10.3.2 limma_voom
limma package is from voom: Precision weights unlock linear model analysis tools for RNA-seq read counts (Law et al. 2014). Firstly, transforming count data to log2-counts per million (logCPM), estimate the mean-variance relationship and use this to compute appropriate observation-level weights. Secondly, fitting multiple linear models by weighted or generalized least squares. Finally, performing empirical bayes statistics for differential expression.
run_limma_voom
provides 11 parameters. For instance, norm and transform are used to normalization and transformation input data. More details to see help(run_limma_voom)
.
<- run_limma_voom(
DA_limma_voom ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
colnames(DA_limma_voom)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "logFC" "Pvalue"
## [7] "AdjustedPvalue" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB"
## [13] "Mean Abundance\n(All)" "Mean Abundance\nAA" "Mean Abundance\nBB"
## [16] "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA" "Occurrence (100%)\nBB"
## [19] "Odds Ratio (95% CI)"
head(DA_limma_voom)
## TaxaID Block Enrichment EffectSize logFC Pvalue AdjustedPvalue
## 1 g__Acidaminococcus 9_AA vs 14_BB Nonsignif 0.09424356 0.09424356 0.9451632 0.9895209
## 2 g__Actinomyces 9_AA vs 14_BB Nonsignif 0.42751966 0.42751966 0.7781674 0.9489846
## 3 g__Adlercreutzia 9_AA vs 14_BB Nonsignif 1.76792390 1.76792390 0.1726319 0.7804151
## 4 g__Agathobacter 9_AA vs 14_BB Nonsignif 0.27153770 0.27153770 0.8944583 0.9722372
## 5 g__Akkermansia 9_AA vs 14_BB Nonsignif -0.24224151 -0.24224151 0.8457202 0.9610457
## 6 g__Alistipes 9_AA vs 14_BB Nonsignif -0.41145677 -0.41145677 0.8141009 0.9491533
## Log2FoldChange (Median)\nAA_vs_BB Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB
## 1 NA 0 0 0.0
## 2 -0.9577718 30 26 50.5
## 3 NA 0 0 9.0
## 4 0.1524904 316 329 296.0
## 5 NA 0 0 0.0
## 6 -0.1085245 64 64 69.0
## Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All) Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All)
## 1 1.7051442 58.82609 101.777778 31.21429 21.74
## 2 -1.3107676 50.91304 26.777778 66.42857 82.61
## 3 -2.4683647 19.21739 5.111111 28.28571 43.48
## 4 1.7488297 996.26087 1740.444444 517.85714 65.22
## 5 -2.1676332 93.78261 30.000000 134.78571 21.74
## 6 0.2000385 163.86957 177.888889 154.85714 73.91
## Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 22.22 21.43 0.67 (-0.11;1.5)
## 2 88.89 78.57 4.1 (6.8;1.3)
## 3 33.33 50.00 8.6 (13;4.3)
## 4 55.56 71.43 0.39 (-1.4;2.2)
## 5 22.22 21.43 1.5 (2.3;0.71)
## 6 77.78 71.43 0.88 (0.64;1.1)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
EffectSize: effect size of taxa by limma;
logFC: LogFC from groups’ coefficient by limma;
Pvalue and AdjustedPvalue: significant level of Pvalue and Adjusted-pvalue by limma;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
Please open the below buttons, if you want to see other options for differential analysis in limma.
other options for limma-voom
if(0) {
<- run_limma_voom(
DA_limma_voom data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"))
<- run_limma_voom(
DA_limma_voom ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"))
<- run_da(
DA_limma_voom ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
head(DA_limma_voom)
}
10.3.3 mbzinb
mbzinb package is from An omnibus test for differential distribution analysis of microbiome sequencing data (J. Chen et al. 2018). It uses zeroinflated negative binomial model to investigate the significant taxa. (Caution: the otu_table must be integers).
<- run_mbzinb(
DA_mbzinb ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
colnames(DA_mbzinb)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "Pvalue" "AdjustedPvalue" "EffectSize"
## [7] "base.mean" "mean.LFC" "base.abund"
## [10] "abund.LFC" "base.prev" "prev.change"
## [13] "base.disp" "disp.LFC" "statistic"
## [16] "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)" "Median Abundance\nAA"
## [19] "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB" "Mean Abundance\n(All)"
## [22] "Mean Abundance\nAA" "Mean Abundance\nBB" "Occurrence (100%)\n(All)"
## [25] "Occurrence (100%)\nAA" "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
head(DA_mbzinb)
## TaxaID Block Enrichment Pvalue AdjustedPvalue EffectSize base.mean mean.LFC
## 1 g__Adlercreutzia 9_AA vs 14_BB Nonsignif 0.032781937 0.2341567 23.1746032 4.161223 2.6824340
## 2 g__Butyricicoccus 9_AA vs 14_BB Nonsignif 0.019009756 0.1564333 61.0000000 231.439059 -0.1466880
## 3 g__Clostridium_sensu_stricto_1 9_AA vs 14_BB Nonsignif 0.007406534 0.1564333 399.5714286 10.618571 4.5552518
## 4 g__Enterococcus 9_AA vs 14_BB Nonsignif 0.020336331 0.1564333 60.7460317 32.658840 1.5654364
## 5 g__Faecalibacterium 9_AA vs 14_BB Nonsignif 0.068916032 0.3828668 2742.6031746 3229.238369 -1.1278995
## 6 g__Gordonibacter 9_AA vs 14_BB Nonsignif 0.017121597 0.1564333 0.3888889 7.074759 0.1774837
## base.abund abund.LFC base.prev prev.change base.disp disp.LFC statistic Log2FoldChange (Median)\nAA_vs_BB
## 1 12.483819 2.0941415 0.3333293 0.16782006 0.05915642 3.206884 8.751636 NA
## 2 297.532163 0.1283709 0.7778623 -0.13502400 0.04079382 3.619904 9.948415 2.174498
## 3 46.181531 2.4345380 0.2299311 0.77006016 1.20536377 1.475574 11.993187 NA
## 4 97.952015 0.4498616 0.3334167 0.38903504 0.04624961 4.624979 9.800932 NA
## 5 4844.268667 -1.2274983 0.6666101 0.04764622 0.22640364 1.505613 7.095478 2.400169
## 6 9.837094 0.9244114 0.7191920 -0.29064622 0.96279691 -5.167700 10.176792 NA
## Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All)
## 1 0 0 9.0 -2.46836474 19.217391
## 2 282 316 70.0 0.36332246 236.869565
## 3 11 0 39.5 -4.88463779 257.217391
## 4 36 0 51.5 -1.42341563 73.086957
## 5 1373 4537 859.5 1.43691832 2679.478261
## 6 2 2 0.0 -0.06030051 9.347826
## Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB
## 1 5.111111 28.28571 43.48 33.33 50.00
## 2 274.000000 213.00000 69.57 77.78 64.29
## 3 14.000000 413.57143 60.87 22.22 85.71
## 4 36.111111 96.85714 56.52 33.33 71.43
## 5 4348.888889 1606.28571 69.57 66.67 71.43
## 6 9.111111 9.50000 52.17 66.67 42.86
## Odds Ratio (95% CI)
## 1 8.6 (13;4.3)
## 2 0.76 (0.2;1.3)
## 3 11000 (11000;11000)
## 4 1.9 (3.2;0.64)
## 5 0.37 (-1.6;2.3)
## 6 1.1 (1.2;0.93)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
EffectSize: effect size of taxa by mbzinb;
Pvalue and AdjustedPvalue: significant level of Pvalue and Adjusted-pvalue by mbzinb;
base.mean: fitted mean abundance parameter times the fitted prevalence in baseline group;
mean.LFC: log2-fold change in fitted mean between other group and baseline;
base.abund: fitted mean abundance parameter in baseline group;
abund.LFC: log2-fold change in fitted mean abundance parameter between other group and baseline;
base.prev: fitted prevalence in baseline group;
prev.change: (linear) difference in prevalence between baseline group and other group (other-baseline);
base.disp: fitted dispersion parameter in baseline group;
disp.LFC: log2-fold change in fitted dispersion parameter between other group and baseline;
statistic: value of likelihood ratio test statistic;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
Please open the below buttons, if you want to see other options for differential analysis in mbzinb.
other options for mbzinb
if(0) {
<- run_mbzinb(
DA_mbzinb data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"))
<- run_mbzinb(
DA_mbzinb ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"))
<- run_da(
DA_mbzinb ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
head(DA_mbzinb)
}
10.3.4 omnibus
This approach is also from mbzinb (J. Chen et al. 2018) package. it uses GMPR (Geometric Mean of Pairwise Ratios) (L. Chen et al. 2018) to get the size factors.
where we specify models for count (abundance), zero (prevalence) and dispersion part. We also provide likelihood ratio test (zinb.lrt) for different models.
(Caution: the otu_table must be integers).
<- run_omnibus(
DA_omnibus ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
method = "omnibus")
## Start GMPR normalization ...
## Start Winsorization ...
## Perform filtering ...
## --A total of 92 taxa will be tested with a sample size of 23 !
## --Omnibus test is selected!
## --Dispersion is treated as a parameter of interest!
## Start testing ...
## 10 %
## 20 %
## 30 %
## 40 %
## 50 %
## 60 %
## 70 %
## 80 %
## 90 %
## 100%!
## Handle failed taxa using permutation test!
## Permutation test ....
## Completed!
colnames(DA_omnibus)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "Pvalue" "AdjustedPvalue" "EffectSize"
## [7] "chi.stat" "df" "abund.baseline"
## [10] "prev.baseline" "dispersion.baseline" "abund.LFC.CompvarBB.est"
## [13] "abund.LFC.CompvarBB.se" "prev.LOD.CompvarBB.est" "prev.LOD.CompvarBB.se"
## [16] "dispersion.LFC.CompvarBB.est" "dispersion.LFC.CompvarBB.se" "method"
## [19] "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)" "Median Abundance\nAA"
## [22] "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB" "Mean Abundance\n(All)"
## [25] "Mean Abundance\nAA" "Mean Abundance\nBB" "Occurrence (100%)\n(All)"
## [28] "Occurrence (100%)\nAA" "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
head(DA_omnibus)
## TaxaID Block Enrichment Pvalue AdjustedPvalue EffectSize chi.stat df abund.baseline prev.baseline
## 1 g__Acidaminococcus 9_AA vs 14_BB Nonsignif 0.26476211 0.6245670 70.56349 3.9696315 3 0.0050842054 0.2910300
## 2 g__Actinomyces 9_AA vs 14_BB Nonsignif 0.13905545 0.4437791 39.65079 5.4930407 3 0.0006912502 0.8894593
## 3 g__Adlercreutzia 9_AA vs 14_BB Nonsignif 0.06293962 0.3795000 23.17460 7.2995223 3 0.0003647798 0.3333734
## 4 g__Agathobacter 9_AA vs 14_BB Nonsignif 0.12024562 0.4437791 1222.58730 5.8287526 3 0.0553800205 0.5555645
## 5 g__Akkermansia 9_AA vs 14_BB Nonsignif 0.86344681 0.9142217 104.78571 0.7413110 3 0.0031212219 0.2357130
## 6 g__Alistipes 9_AA vs 14_BB Nonsignif 0.97967175 0.9796718 23.03175 0.1869261 3 0.0049314397 0.7815505
## dispersion.baseline abund.LFC.CompvarBB.est abund.LFC.CompvarBB.se prev.LOD.CompvarBB.est prev.LOD.CompvarBB.se
## 1 0.2041100 -0.36978708 1.4040484 -0.4088983 0.9811747
## 2 3.1647486 0.73754098 0.3062639 -0.7727912 1.2480678
## 3 11.0696374 1.29262130 0.3614490 0.6959681 0.8863794
## 4 1.6324698 -1.15572656 0.4756819 0.6968830 0.8947568
## 5 0.5149302 -0.83042868 1.7432216 2.3976744 1.0114415
## 6 1.0012623 0.03215791 0.4770876 -0.3541131 1.0007538
## dispersion.LFC.CompvarBB.est dispersion.LFC.CompvarBB.se method Log2FoldChange (Median)\nAA_vs_BB Median Abundance\n(All)
## 1 2.8343543 1.1361534 omnibus NA 0
## 2 -0.5318657 0.6766040 omnibus -0.9577718 30
## 3 -1.7738398 1.4812754 omnibus NA 0
## 4 -0.5257607 0.7011677 omnibus 0.1524904 316
## 5 -2.4675480 1.0580187 omnibus NA 0
## 6 0.1662740 0.6265328 omnibus -0.1085245 64
## Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All) Mean Abundance\nAA
## 1 0 0.0 1.7051442 58.82609 101.777778
## 2 26 50.5 -1.3107676 50.91304 26.777778
## 3 0 9.0 -2.4683647 19.21739 5.111111
## 4 329 296.0 1.7488297 996.26087 1740.444444
## 5 0 0.0 -2.1676332 93.78261 30.000000
## 6 64 69.0 0.2000385 163.86957 177.888889
## Mean Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 31.21429 21.74 22.22 21.43 0.67 (-0.11;1.5)
## 2 66.42857 82.61 88.89 78.57 4.1 (6.8;1.3)
## 3 28.28571 43.48 33.33 50.00 8.6 (13;4.3)
## 4 517.85714 65.22 55.56 71.43 0.39 (-1.4;2.2)
## 5 134.78571 21.74 22.22 21.43 1.5 (2.3;0.71)
## 6 154.85714 73.91 77.78 71.43 0.88 (0.64;1.1)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
EffectSize: effect size of taxa by glm function to assess the pvalue effect;
Pvalue and AdjustedPvalue: significant level of Pvalue and Adjusted-pvalue by omnibus;
chi.stat: chisquare statistics;
df: degree freedom;
abund.baseline: mean abundance in baseline group;
prev.baseline: prevalence in baseline group;
dispersion.baseline: dispersion in baseline group;
abund.LFC.CompvarBB.est: log2-fold change in abundance between group BB and other group;
abund.LFC.CompvarBB.se: log2-fold change of standard errors in abundance between group BB and other group;
prev.LOD.CompvarBB.est: prevalence of low of detect value in group BB;
prev.LOD.CompvarBB.se: prevalence’s standard errors of low of detect value in group BB;
dispersion.LFC.CompvarBB.est: log2-fold change in dispersion in group BB;
dispersion.LFC.CompvarBB.se: log2-fold change in dispersion’s standard errors in group BB;
method: methods of test;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
Please open the below buttons, if you want to see other options for differential analysis in omnibus.
other options for omnibus
if(0) {
<- run_omnibus(
DA_omnibus data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"),
method = "omnibus")
<- run_omnibus(
DA_omnibus ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"),
method = "omnibus")
<- run_da(
DA_omnibus ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "omnibus",
method = "omnibus")
head(DA_omnibus)
}
10.3.5 RAIDA
RAIDA package is from A robust approach for identifying differentially abundant features in metagenomic samples (Sohn, Du, and An 2015). It uses Ratio Approach for Identifying Differential Abundance (RAIDA). (Caution: the otu_table must be integers).
<- run_raida(
DA_RAIDA ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
colnames(DA_RAIDA)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Pvalue" "AdjustedPvalue"
## [7] "eta_AA" "mean_AA" "sd_AA"
## [10] "eta_BB" "mean_BB" "sd_BB"
## [13] "mod.pool.var" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [16] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB"
## [19] "Mean Abundance\n(All)" "Mean Abundance\nAA" "Mean Abundance\nBB"
## [22] "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA" "Occurrence (100%)\nBB"
## [25] "Odds Ratio (95% CI)"
head(DA_RAIDA)
## TaxaID Block Enrichment EffectSize Pvalue AdjustedPvalue eta_AA mean_AA sd_AA eta_BB
## 1 g__Acidaminococcus 9_AA vs 14_BB Nonsignif 70.563492 0.3816518 0.7379802 0.7777778 -1.1787353 3.4581703 0.7857143
## 2 g__Actinomyces 9_AA vs 14_BB Nonsignif 39.650794 0.3655566 0.7379802 0.1111111 -3.5858713 1.3594755 0.2142857
## 3 g__Adlercreutzia 9_AA vs 14_BB Nonsignif 23.174603 0.1074479 0.7353741 0.6666667 -4.4427957 0.8570999 0.5000000
## 4 g__Agathobacter 9_AA vs 14_BB Nonsignif 1222.587302 0.1143915 0.7353741 0.4444444 0.1696923 1.1949839 0.2857143
## 5 g__Akkermansia 9_AA vs 14_BB Nonsignif 104.785714 0.6418257 0.8494752 0.7777778 -3.7046260 2.3018074 0.7857143
## 6 g__Allisonella 9_AA vs 14_BB Nonsignif 3.603175 0.8515585 0.9461761 0.7777778 -3.0106395 0.5628666 0.7142857
## mean_BB sd_BB mod.pool.var Log2FoldChange (Median)\nAA_vs_BB Median Abundance\n(All) Median Abundance\nAA
## 1 -2.5738907 1.3069089 2.756248 NA 0 0
## 2 -2.9198320 1.8200871 2.410794 -0.9577718 30 26
## 3 -3.0988711 1.0974531 1.280317 NA 0 0
## 4 -0.8127871 0.9057351 1.174178 0.1524904 316 329
## 5 -4.4749687 2.5368159 3.071523 NA 0 0
## 6 -2.8074387 1.3346198 1.491825 NA 0 0
## Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All) Mean Abundance\nAA Mean Abundance\nBB
## 1 0.0 1.7051442 58.82609 101.777778 31.21429
## 2 50.5 -1.3107676 50.91304 26.777778 66.42857
## 3 9.0 -2.4683647 19.21739 5.111111 28.28571
## 4 296.0 1.7488297 996.26087 1740.444444 517.85714
## 5 0.0 -2.1676332 93.78261 30.000000 134.78571
## 6 0.0 -0.4052144 13.30435 11.111111 14.71429
## Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 21.74 22.22 21.43 0.67 (-0.11;1.5)
## 2 82.61 88.89 78.57 4.1 (6.8;1.3)
## 3 43.48 33.33 50.00 8.6 (13;4.3)
## 4 65.22 55.56 71.43 0.39 (-1.4;2.2)
## 5 21.74 22.22 21.43 1.5 (2.3;0.71)
## 6 26.09 22.22 28.57 1.1 (1.4;0.89)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
EffectSize: effect size of taxa by glm function to assess the pvalue effect;
Pvalue and AdjustedPvalue: significant level of Pvalue and Adjusted-pvalue by RAIDA;
eta_AA: vector containing estimated probabilities of the false zero state for group AA;
mean_AA: vector containing estimated means of log ratios for group AA;
sd_AA: vector containing estimated standard deviations of log ratios for group AA;
eta_BB: vector containing estimated probabilities of the false zero state for group BB;
mean_BB: vector containing estimated means of log ratios for group BB;
sd_BB: vector containing estimated standard deviations of log ratios for group BB;
mod.pool.var: vector containing estimated posterior variances of log ratios;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
Please open the below buttons, if you want to see other options for differential analysis in RAIDA.
other options for RAIDA
if (0) {
<- run_raida(
DA_RAIDA data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"))
<- run_raida(
DA_RAIDA ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"))
<- run_da(
DA_RAIDA ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "raida")
head(DA_RAIDA)
}
10.3.6 Wilcoxon Rank Sum and Signed Rank Tests
Wilcoxon Rank Sum and Signed Rank Tests, which are nonparametric test methods, use the rank of taxa abundance to find the significant taxa.
- Ordinary pattern
<- run_wilcox(
DA_wilcox ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
colnames(DA_wilcox)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Statistic" "Pvalue"
## [7] "AdjustedPvalue" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Rank)\nAA_vs_BB"
## [13] "Mean Rank Abundance\nAA" "Mean Rank Abundance\nBB" "Occurrence (100%)\n(All)"
## [16] "Occurrence (100%)\nAA" "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
head(DA_wilcox)
## TaxaID Block Enrichment EffectSize Statistic Pvalue AdjustedPvalue Log2FoldChange (Median)\nAA_vs_BB
## 1 g__Acidaminococcus 9_AA vs 14_BB Nonsignif 70.56349 63.5 1.0000000 1.0000000 NA
## 2 g__Actinomyces 9_AA vs 14_BB Nonsignif 39.65079 44.5 0.2556616 0.7014746 -0.9577718
## 3 g__Adlercreutzia 9_AA vs 14_BB Nonsignif 23.17460 44.0 0.1980169 0.7014746 NA
## 4 g__Agathobacter 9_AA vs 14_BB Nonsignif 1222.58730 71.0 0.6293971 0.8964389 0.1524904
## 5 g__Akkermansia 9_AA vs 14_BB Nonsignif 104.78571 64.5 0.9304707 0.9692403 NA
## 6 g__Alistipes 9_AA vs 14_BB Nonsignif 23.03175 67.0 0.8239942 0.9363570 -0.1085245
## Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Rank)\nAA_vs_BB Mean Rank Abundance\nAA
## 1 0 0 0.0 0.01201252 12.06
## 2 30 26 50.5 -0.42227633 9.94
## 3 0 0 9.0 -0.43387758 9.89
## 4 316 329 296.0 0.17342686 12.89
## 5 0 0 0.0 0.03358045 12.17
## 6 64 64 69.0 0.08724541 12.44
## Mean Rank Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 11.96 21.74 22.22 21.43 0.67 (-0.11;1.5)
## 2 13.32 82.61 88.89 78.57 4.1 (6.8;1.3)
## 3 13.36 43.48 33.33 50.00 8.6 (13;4.3)
## 4 11.43 65.22 55.56 71.43 0.39 (-1.4;2.2)
## 5 11.89 21.74 22.22 21.43 1.5 (2.3;0.71)
## 6 11.71 73.91 77.78 71.43 0.88 (0.64;1.1)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
EffectSize: effect size of taxa by glm function to assess the pvalue effect;
Pvalue and AdjustedPvalue: significant level of Pvalue and Adjusted-pvalue;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean Rank)_vs_BB**: Log2FoldChange (Mean Rank Abundance) between group AA and group BB;
Mean Rank Abundance (All)/(group AA)/(group BB): Mean Rank Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
other options for wilcox
if (0) {
<- run_wilcox(
DA_wilcox data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"))
<- run_wilcox(
DA_wilcox ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"))
<- run_da(
DA_wilcox ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "wilcox")
head(DA_wilcox)
}
- wilcox_rarefy: random subsampling counts to the smallest library size in the data set.
# summary
summarize_phyloseq(ps = dada2_ps_genus_filter_trim)
## [[1]]
## [1] "1] Min. number of reads = 32359"
##
## [[2]]
## [1] "2] Max. number of reads = 49328"
##
## [[3]]
## [1] "3] Total number of reads = 1008016"
##
## [[4]]
## [1] "4] Average number of reads = 43826.7826086957"
##
## [[5]]
## [1] "5] Median number of reads = 45596"
##
## [[6]]
## [1] "7] Sparsity = 0.501304347826087"
##
## [[7]]
## [1] "6] Any OTU sum to 1 or less? NO"
##
## [[8]]
## [1] "8] Number of singletons = 0"
##
## [[9]]
## [1] "9] Percent of OTUs that are singletons\n (i.e. exactly one read detected across all samples)0"
##
## [[10]]
## [1] "10] Number of sample variables are: 1"
##
## [[11]]
## [1] "Group"
# run
<- run_wilcox(
DA_wilcox_rarefy ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
norm = "rarefy")
colnames(DA_wilcox_rarefy)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Statistic" "Pvalue"
## [7] "AdjustedPvalue" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Rank)\nAA_vs_BB"
## [13] "Mean Rank Abundance\nAA" "Mean Rank Abundance\nBB" "Occurrence (100%)\n(All)"
## [16] "Occurrence (100%)\nAA" "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
The output is the same as the previous results of ordinary pattern.
other options for wilcox_rarefy
if (0) {
<- run_da(
DA_wilcox_rarefy ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "wilcox",
norm = "rarefy")
head(DA_wilcox_rarefy)
}
- wilcox_CLR: centered log-ratio normalization.
<- run_wilcox(
DA_wilcox_CLR ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
norm = "CLR")
colnames(DA_wilcox_CLR)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Statistic" "Pvalue"
## [7] "AdjustedPvalue" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Rank)\nAA_vs_BB"
## [13] "Mean Rank Abundance\nAA" "Mean Rank Abundance\nBB" "Occurrence (100%)\n(All)"
## [16] "Occurrence (100%)\nAA" "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
The output is the same as the previous results of ordinary pattern.
other options for wilcox_CLR
if(0) {
<- run_da(
DA_wilcox_CLR ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "wilcox",
norm = "CLR")
head(DA_wilcox_CLR)
}
10.3.7 Liner discriminant analysis (LDA) effect size (LEfSe)
LEfSe method is from Metagenomic biomarker discovery and explanation (Segata et al. 2011). It uses Liner discriminant analysis model to identify Differential Taxa.
<- run_lefse(
DA_lefse ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
norm = "CPM",
Lda = 2)
colnames(DA_lefse)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "LDA_Score" "EffectSize" "Log2FoldChange (Median)\nAA_vs_BB"
## [7] "Median Abundance\n(All)" "Median Abundance\nAA" "Median Abundance\nBB"
## [10] "Log2FoldChange (Mean)\nAA_vs_BB" "Mean Abundance\n(All)" "Mean Abundance\nAA"
## [13] "Mean Abundance\nBB" "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA"
## [16] "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
head(DA_lefse)
## TaxaID Block Enrichment LDA_Score EffectSize Log2FoldChange (Median)\nAA_vs_BB
## 1 g__Clostridium_sensu_stricto_1 9_AA vs 14_BB BB 3.671138 2.679298 NA
## 2 g__Intestinibacter 9_AA vs 14_BB BB 3.146337 2.384308 NA
## 3 g__Lactobacillus 9_AA vs 14_BB BB 4.256710 2.450656 -3.336128
## 4 g__Odoribacter 9_AA vs 14_BB BB 2.340309 1.876320 NA
## 5 g__Parasutterella 9_AA vs 14_BB AA -3.604579 2.234915 4.402050
## 6 g__Romboutsia 9_AA vs 14_BB BB 3.705317 2.929140 -6.856424
## Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All)
## 1 225.11921 0.00000 1173.88222 -4.746617 5861.9051
## 2 383.50911 0.00000 1398.79740 -1.964554 1883.3386
## 3 1283.85401 520.46061 5256.08763 -5.173067 19938.7655
## 4 21.93175 604.99989 0.00000 2.099583 578.0758
## 5 172.77125 1080.35695 51.09966 4.481148 3901.0977
## 6 2189.79596 50.35627 5835.02623 -3.334490 7094.1306
## Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB
## 1 350.3379 9405.0554 60.87 22.22 85.71
## 2 680.6440 2656.4994 60.87 22.22 85.71
## 3 892.0297 32183.0957 86.96 77.78 92.86
## 4 1083.9017 252.9020 52.17 77.78 35.71
## 5 9320.3061 417.3209 65.22 88.89 50.00
## 6 1086.1419 10956.4090 78.26 55.56 92.86
## Odds Ratio (95% CI)
## 1 4900 (5000;4900)
## 2 3.4 (5.8;1)
## 3 1.1e+08 (1.1e+08;1.1e+08)
## 4 0.31 (-2;2.6)
## 5 0.0025 (-12;12)
## 6 76 (85;68)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
EffectSize: effect size of taxa by lefse;
LDA_Score: significant level of Pvalue and Adjusted-pvalue;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
other options for lefse
if (0) {
<- run_lefse(
DA_lefse data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"),
norm = "CPM",
Lda = 0)
<- run_lefse(
DA_lefse ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"),
norm = "CPM",
Lda = 0)
<- run_da(
DA_lefse ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "lefse",
norm = "CPM",
Lda = 0)
head(DA_lefse)
}
10.3.8 t-test
T test, a parametric test method, identifies the significant taxa.
- Ordinary pattern
<- run_ttest(
DA_ttest ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
colnames(DA_ttest)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Statistic" "Pvalue"
## [7] "AdjustedPvalue" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB"
## [13] "Mean Abundance\n(All)" "Mean Abundance\nAA" "Mean Abundance\nBB"
## [16] "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA" "Occurrence (100%)\nBB"
## [19] "Odds Ratio (95% CI)"
head(DA_ttest)
## TaxaID Block Enrichment EffectSize Statistic Pvalue AdjustedPvalue Log2FoldChange (Median)\nAA_vs_BB
## 1 g__Acidaminococcus 9_AA vs 14_BB Nonsignif 70.56349 0.6845254 0.51170983 0.7107081 NA
## 2 g__Actinomyces 9_AA vs 14_BB Nonsignif 39.65079 -1.9079866 0.07478702 0.6397026 -0.9577718
## 3 g__Adlercreutzia 9_AA vs 14_BB Nonsignif 23.17460 -2.1601919 0.04743155 0.6397026 NA
## 4 g__Agathobacter 9_AA vs 14_BB Nonsignif 1222.58730 1.4034718 0.19417159 0.6397026 0.1524904
## 5 g__Akkermansia 9_AA vs 14_BB Nonsignif 104.78571 -0.7632086 0.45788140 0.6983360 NA
## 6 g__Alistipes 9_AA vs 14_BB Nonsignif 23.03175 0.2691871 0.79125424 0.9059862 -0.1085245
## Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All)
## 1 0 0 0.0 1.7051442 58.82609
## 2 30 26 50.5 -1.3107676 50.91304
## 3 0 0 9.0 -2.4683647 19.21739
## 4 316 329 296.0 1.7488297 996.26087
## 5 0 0 0.0 -2.1676332 93.78261
## 6 64 64 69.0 0.2000385 163.86957
## Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB
## 1 101.777778 31.21429 21.74 22.22 21.43
## 2 26.777778 66.42857 82.61 88.89 78.57
## 3 5.111111 28.28571 43.48 33.33 50.00
## 4 1740.444444 517.85714 65.22 55.56 71.43
## 5 30.000000 134.78571 21.74 22.22 21.43
## 6 177.888889 154.85714 73.91 77.78 71.43
## Odds Ratio (95% CI)
## 1 0.67 (-0.11;1.5)
## 2 4.1 (6.8;1.3)
## 3 8.6 (13;4.3)
## 4 0.39 (-1.4;2.2)
## 5 1.5 (2.3;0.71)
## 6 0.88 (0.64;1.1)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
EffectSize: effect size of taxa by glm function to assess the pvalue effect;
Pvalue and AdjustedPvalue: significant level of Pvalue and Adjusted-pvalue;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (GeometricMean)_vs_BB**: Log2FoldChange (GeometricMean Abundance) between group AA and group BB;
GeometricMean Abundance (All)/(group AA)/(group BB): GeometricMean Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
other options for t-test
if (0) {
<- run_ttest(
DA_ttest data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"))
<- run_ttest(
DA_ttest ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"))
<- run_da(
DA_ttest ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "ttest")
head(DA_ttest)
}
- t-test_rarefy: random subsampling counts to the smallest library size in the data set.
<- run_ttest(
DA_ttest_rarefy ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
norm = "rarefy")
colnames(DA_ttest_rarefy)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Statistic" "Pvalue"
## [7] "AdjustedPvalue" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB"
## [13] "Mean Abundance\n(All)" "Mean Abundance\nAA" "Mean Abundance\nBB"
## [16] "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA" "Occurrence (100%)\nBB"
## [19] "Odds Ratio (95% CI)"
The output is the same as the previous results of ordinary pattern.
other options for ttest_rarefy
if (0) {
<- run_da(
DA_ttest_rarefy ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "ttest",
norm = "rarefy")
head(DA_ttest_rarefy)
}
10.3.9 MetagenomeSeq
MetagenomeSeq package is from Differential abundance analysis for microbial marker-gene surveys (Paulson et al. 2013). It uses zero-inflated Log-Normal mixture model or Zero-inflated Gaussian mixture model to identify the significant taxa between groups. (Caution: the otu_table should be integers).
<- run_metagenomeseq(
DA_metagenomeseq ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
norm = "CSS",
method = "ZILN")
colnames(DA_metagenomeseq)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "logFC" "Pvalue" "AdjustedPvalue"
## [7] "se" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB"
## [13] "Mean Abundance\n(All)" "Mean Abundance\nAA" "Mean Abundance\nBB"
## [16] "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA" "Occurrence (100%)\nBB"
## [19] "Odds Ratio (95% CI)"
head(DA_metagenomeseq)
## TaxaID Block Enrichment logFC Pvalue AdjustedPvalue se Log2FoldChange (Median)\nAA_vs_BB
## 1 g__Acidaminococcus 9_AA vs 14_BB Nonsignif -1.1453226 NA NA NA NA
## 2 g__Actinomyces 9_AA vs 14_BB <NA> 0.6216876 NA NA NA -0.9965674
## 3 g__Adlercreutzia 9_AA vs 14_BB <NA> 0.4652896 NA NA NA NA
## 4 g__Agathobacter 9_AA vs 14_BB <NA> -0.2406551 NA NA NA -0.1149914
## 5 g__Akkermansia 9_AA vs 14_BB Nonsignif -0.8318088 NA NA NA NA
## 6 g__Alistipes 9_AA vs 14_BB <NA> 0.2229196 NA NA NA 1.2561674
## Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All)
## 1 0.00000 0.00000 0.000000 3.2876853 148.45156
## 2 40.00000 27.28823 54.446764 -0.5016143 82.10636
## 3 0.00000 0.00000 7.846556 -2.7675469 26.03497
## 4 301.55820 301.55820 326.578060 1.1416934 867.24692
## 5 0.00000 0.00000 0.000000 -2.6101945 72.25398
## 6 83.50731 175.82418 73.609609 0.2975842 189.53970
## Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB
## 1 327.248231 33.51085 21.74 22.22 21.43
## 2 65.522916 92.76714 82.61 88.89 78.57
## 3 5.739364 39.08214 43.48 33.33 50.00
## 4 1299.865944 589.13469 65.22 55.56 71.43
## 5 17.588969 107.39578 21.74 22.22 21.43
## 6 213.795329 173.94680 73.91 77.78 71.43
## Odds Ratio (95% CI)
## 1 0.49 (-0.92;1.9)
## 2 1.4 (1.9;0.76)
## 3 8.7 (13;4.5)
## 4 0.54 (-0.68;1.8)
## 5 1.6 (2.6;0.68)
## 6 0.83 (0.47;1.2)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
logFC: fitted coefficient represents the fold-change for group AA and group BB;
se: standard error;
Pvalue and AdjustedPvalue: significant level of Pvalue and Adjusted-pvalue;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
other options for metagenomeseq
if (0) {
<- run_metagenomeseq(
DA_metagenomeseq data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"),
norm = "CSS",
method = "ZILN")
<- run_metagenomeseq(
DA_metagenomeseq ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"),
norm = "CSS",
method = "ZILN")
<- run_da(
DA_metagenomeseqps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "metagenomeseq",
norm = "CSS",
method = "ZILN")
head(DA_metagenomeseq)
}
10.3.10 DESeq2
DESeq2 package is from Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 (Love, Huber, and Anders 2014). Differential expression analysis based on the Negative Binomial (a.k.a. Gamma-Poisson) distribution.(Caution: the otu_table must be integers).
<- run_deseq2(
DA_deseq2 ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
colnames(DA_deseq2)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "Pvalue" "AdjustedPvalue" "logFC"
## [7] "Statistic" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB"
## [13] "Mean Abundance\n(All)" "Mean Abundance\nAA" "Mean Abundance\nBB"
## [16] "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA" "Occurrence (100%)\nBB"
## [19] "Odds Ratio (95% CI)"
head(DA_deseq2)
## TaxaID Block Enrichment Pvalue AdjustedPvalue logFC Statistic
## 1 g__Acidaminococcus 9_AA vs 14_BB Nonsignif 0.003314859 0.02209906 -3.1129168 -2.9369235
## 2 g__Actinomyces 9_AA vs 14_BB Nonsignif 0.122064731 0.31069563 -1.3609640 -1.5461650
## 3 g__Adlercreutzia 9_AA vs 14_BB BB 0.008736333 0.04408614 -2.4112622 -2.6222032
## 4 g__Agathobacter 9_AA vs 14_BB Nonsignif 0.301287313 0.51946088 1.1696067 1.0336767
## 5 g__Akkermansia 9_AA vs 14_BB Nonsignif 0.395896745 0.60907192 0.5625950 0.8489722
## 6 g__Alistipes 9_AA vs 14_BB Nonsignif 0.535255838 0.68622543 0.5817917 0.6200030
## Log2FoldChange (Median)\nAA_vs_BB Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB
## 1 NA 0 0 0.0
## 2 -0.9577718 30 26 50.5
## 3 NA 0 0 9.0
## 4 0.1524904 316 329 296.0
## 5 NA 0 0 0.0
## 6 -0.1085245 64 64 69.0
## Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All) Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All)
## 1 1.7051442 58.82609 101.777778 31.21429 21.74
## 2 -1.3107676 50.91304 26.777778 66.42857 82.61
## 3 -2.4683647 19.21739 5.111111 28.28571 43.48
## 4 1.7488297 996.26087 1740.444444 517.85714 65.22
## 5 -2.1676332 93.78261 30.000000 134.78571 21.74
## 6 0.2000385 163.86957 177.888889 154.85714 73.91
## Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 22.22 21.43 0.67 (-0.11;1.5)
## 2 88.89 78.57 4.1 (6.8;1.3)
## 3 33.33 50.00 8.6 (13;4.3)
## 4 55.56 71.43 0.39 (-1.4;2.2)
## 5 22.22 21.43 1.5 (2.3;0.71)
## 6 77.78 71.43 0.88 (0.64;1.1)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
logFC: fitted coefficient represents the fold-change for group AA and group BB;
Statistic: test statistic (negative binomial model);
Pvalue and AdjustedPvalue: significant level of Pvalue and Adjusted-pvalue;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
other options for deseq2
if (0) {
<- run_deseq2(
DA_deseq2 data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"))
<- run_deseq2(
DA_deseq2 ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"))
<- run_da(
DA_deseq2 ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "deseq2")
head(DA_deseq2)
}
10.3.11 EdgeR
EdgeR package is from A scaling normalization method for differential expression analysis of RNA-seq data (Robinson and Oshlack 2010). Differential expression analysis based on the Negative Binomial (a.k.a. Gamma-Poisson) distribution. (Caution: the otu_table must be integers).
<- run_edger(
DA_edger ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))

Figure 10.1: EdgeR (BVC distance)
colnames(DA_edger)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "Pvalue" "AdjustedPvalue" "logFC"
## [7] "logCPM" "LR" "Log2FoldChange (Median)\nAA_vs_BB"
## [10] "Median Abundance\n(All)" "Median Abundance\nAA" "Median Abundance\nBB"
## [13] "Log2FoldChange (Mean)\nAA_vs_BB" "Mean Abundance\n(All)" "Mean Abundance\nAA"
## [16] "Mean Abundance\nBB" "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA"
## [19] "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
head(DA_edger)
## TaxaID Block Enrichment Pvalue AdjustedPvalue logFC logCPM LR
## 1 g__Acidaminococcus 9_AA vs 14_BB Nonsignif 0.01399827 0.0874892 -3.2265874 11.40297 6.03836132
## 2 g__Actinomyces 9_AA vs 14_BB Nonsignif 0.06553105 0.2113905 1.7971377 10.73043 3.39155685
## 3 g__Adlercreutzia 9_AA vs 14_BB Nonsignif 0.02277119 0.1265066 2.1869009 8.75611 5.18587640
## 4 g__Agathobacter 9_AA vs 14_BB Nonsignif 0.21979989 0.4070368 -1.3582478 13.73655 1.50567724
## 5 g__Akkermansia 9_AA vs 14_BB Nonsignif 0.09219654 0.2622316 2.3965438 10.73275 2.83559687
## 6 g__Alistipes 9_AA vs 14_BB Nonsignif 0.81597305 0.9320193 -0.2248336 11.56947 0.05416207
## Log2FoldChange (Median)\nAA_vs_BB Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB
## 1 NA 0 0 0.0
## 2 -0.9577718 30 26 50.5
## 3 NA 0 0 9.0
## 4 0.1524904 316 329 296.0
## 5 NA 0 0 0.0
## 6 -0.1085245 64 64 69.0
## Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All) Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All)
## 1 1.7051442 58.82609 101.777778 31.21429 21.74
## 2 -1.3107676 50.91304 26.777778 66.42857 82.61
## 3 -2.4683647 19.21739 5.111111 28.28571 43.48
## 4 1.7488297 996.26087 1740.444444 517.85714 65.22
## 5 -2.1676332 93.78261 30.000000 134.78571 21.74
## 6 0.2000385 163.86957 177.888889 154.85714 73.91
## Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 22.22 21.43 0.67 (-0.11;1.5)
## 2 88.89 78.57 4.1 (6.8;1.3)
## 3 33.33 50.00 8.6 (13;4.3)
## 4 55.56 71.43 0.39 (-1.4;2.2)
## 5 22.22 21.43 1.5 (2.3;0.71)
## 6 77.78 71.43 0.88 (0.64;1.1)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
logFC: fitted coefficient represents the fold-change for group AA and group BB;
logCPM: is the average expression of all samples for that particular gene across all samples on the log-scale expressed in counts per million (cpm, as calculated by edgeR after normalization);
LR: the signed likelihood ratio test statistic;
Pvalue and AdjustedPvalue: significant level of Pvalue and Adjusted-pvalue;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
other options for EdgeR
if (0) {
<- run_edger(
DA_edger data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"))
<- run_edger(
DA_edger ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"))
<- run_da(
DA_edger ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "edger")
head(DA_edger)
}
10.3.12 ANCOM
ANCOM (Analysis of composition of microbiomes) is from Analysis of composition of microbiomes: a novel method for studying microbial composition”, Microbial Ecology in Health (Mandal et al. 2015). ANCOM makes no distributional assumptions and can be implemented in a linear model framework. (Caution: the otu_table must be integers).
<- run_ancom(
DA_ancom ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))

Figure 10.2: ANCOM (Structure Zero)
colnames(DA_ancom)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "(W)q-values < alpha" "W_ratio"
## [7] "detected_0.7" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB"
## [13] "Mean Abundance\n(All)" "Mean Abundance\nAA" "Mean Abundance\nBB"
## [16] "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA" "Occurrence (100%)\nBB"
## [19] "Odds Ratio (95% CI)"
head(DA_ancom)
## TaxaID Block Enrichment EffectSize (W)q-values < alpha W_ratio detected_0.7
## 1 g__Acidaminococcus 9_AA vs 14_BB Nonsignif 0.02320439 0 0 FALSE
## 2 g__Actinomyces 9_AA vs 14_BB Nonsignif 0.73520463 0 0 FALSE
## 3 g__Adlercreutzia 9_AA vs 14_BB Nonsignif 0.30465472 0 0 FALSE
## 4 g__Agathobacter 9_AA vs 14_BB Nonsignif -0.48457752 0 0 FALSE
## 5 g__Akkermansia 9_AA vs 14_BB Nonsignif -0.07760459 0 0 FALSE
## 6 g__Alistipes 9_AA vs 14_BB Nonsignif -0.01131933 0 0 FALSE
## Log2FoldChange (Median)\nAA_vs_BB Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB
## 1 NA 0 0 0.0
## 2 -0.9577718 30 26 50.5
## 3 NA 0 0 9.0
## 4 0.1524904 316 329 296.0
## 5 NA 0 0 0.0
## 6 -0.1085245 64 64 69.0
## Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All) Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All)
## 1 1.7051442 58.82609 101.777778 31.21429 21.74
## 2 -1.3107676 50.91304 26.777778 66.42857 82.61
## 3 -2.4683647 19.21739 5.111111 28.28571 43.48
## 4 1.7488297 996.26087 1740.444444 517.85714 65.22
## 5 -2.1676332 93.78261 30.000000 134.78571 21.74
## 6 0.2000385 163.86957 177.888889 154.85714 73.91
## Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 22.22 21.43 0.67 (-0.11;1.5)
## 2 88.89 78.57 4.1 (6.8;1.3)
## 3 33.33 50.00 8.6 (13;4.3)
## 4 55.56 71.43 0.39 (-1.4;2.2)
## 5 22.22 21.43 1.5 (2.3;0.71)
## 6 77.78 71.43 0.88 (0.64;1.1)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
Enrichment: enriched direction based on Median Abundance and AdjustedPvalue;
EffectSize: effect size by ANCOM;
(W)q-values < alpha: q-values less than alpha;
W_ratio: the ratio of W values;
detected_0.7: W_ratio more than 0.7;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
other options for ANCOM
if (0) {
<- run_ancom(
DA_ancom data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"))
<- run_ancom(
DA_ancom ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"))
<- run_da(
DA_ancom ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "ancom")
head(DA_ancom)
}
10.3.13 Corncob
Corncob package is from Modeling microbial abundances and dysbiosis with beta-binomial regression”, Microbial Ecology in Health (Martin, Witten, and Willis 2020). Corncob is based on beta-binomial regression. (Caution: the otu_table must be integers).
if (0) {
<- run_corncob(
DA_corncob ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
method = "Wald")
colnames(DA_ancom)
head(DA_ancom)
}
other options for Corncob
if (0) {
<- run_corncob(
DA_corncob data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"),
method = "Wald")
<- run_corncob(
DA_corncob ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"),
method = "Wald")
<- run_da(
DA_corncob ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "corncob",
method = "Wald")
head(DA_corncob)
}
10.3.14 Maaslin2 (Microbiome Multivariable Association with Linear Models)
Maaslin2 package is from Multivariable association discovery in population-scale meta-omics studies (Mallick et al. 2021). Maaslin2 relies on general linear models to accommodate most modern epidemiological study designs, including cross-sectional and longitudinal, along with a variety of filtering, normalization, and transform methods.
if (0) {
<- run_maaslin2(
DA_maaslin2 ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
transform = "LOG",
norm = "TMM",
method = "LM",
outdir = "./demo_output")
<- run_maaslin2(
DA_maaslin2 ps = dada2_ps,
taxa_level = "Genus",
group = "Group",
group_names = c("AA", "BB"),
transform = "LOG",
norm = "TMM",
method = "LM",
outdir = "./demo_output")
<- run_maaslin2(
DA_maaslin2 data_otu = phyloseq::otu_table(dada2_ps_genus_filter_trim),
data_sam = phyloseq::sample_data(dada2_ps_genus_filter_trim),
group = "Group",
group_names = c("AA", "BB"),
transform = "LOG",
norm = "TMM",
method = "LM",
outdir = "./demo_output")
<- run_da(
DA_maaslin2 ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "maaslin2",
transform = "LOG",
norm = "TMM",
method = "LM",
outdir = "./demo_output")
head(DA_maaslin2)
}
10.3.15 LOCOM
LOCOM: A logistic regression model for testing differential abundance in compositional microbiome data with false discovery rate control (Hu, Satten, and Hu 2022).
Cautions: It reports an error information which the matrix is singularError when user uses low prev.cut.
library(permute)
<- run_locom(ps = dada2_ps_genus_filter_trim,
DA_locom group = "Group",
group_names = c("AA", "BB"),
fdr.nominal = 0.2,
prev.cut = 0.8)
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## 88 OTU(s) with fewer than 18.4 in all samples are removed
## permutations: 1
DA_locom
## TaxaID Block EffectSize Pvalue AdjustedPvalue Log2FoldChange (Median)\nAA_vs_BB Median Abundance\n(All)
## 1 g__Bacteroides 9_AA vs 14_BB 1.030 0.024 0.114 1.5367266 3807
## 2 g__Blautia 9_AA vs 14_BB 0.570 0.038 0.114 0.2641697 8199
## 3 g__Lachnoclostridium 9_AA vs 14_BB 0.672 0.032 0.114 0.7874146 422
## 4 g__Lactobacillus 9_AA vs 14_BB -3.180 0.022 0.114 -3.0334230 56
## Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All) Mean Abundance\nAA
## 1 9103 3137.5 0.9841180 5745.5652 8219.44444
## 2 8645 7198.5 0.3178904 9145.9565 10397.55556
## 3 485 281.0 0.4658394 417.9565 502.33333
## 4 24 196.5 -5.0905713 791.2609 37.44444
## Mean Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 4155.2143 100.00 100.00 100.00 0.45 (-1.1;2)
## 2 8341.3571 100.00 100.00 100.00 0.62 (-0.3;1.5)
## 3 363.7143 95.65 100.00 92.86 0.63 (-0.27;1.5)
## 4 1275.8571 86.96 77.78 92.86 1.8e+08 (1.8e+08;1.8e+08)
Results:
The results comprises more than 10 columns, and the details as follow:
TaxaID: taxa name;
Block: groups’ names and numbers;
EffectSize: effect size by ANCOM;
Pvalue: p-values for OTU-specific tests;
AdjustedPvalue : q-values (adjusted p-values by the BH procedure) for OTU-specific tests;
**Log2FoldChange (Median)_vs_BB**: Log2FoldChange (Median Abundance) between group AA and group BB;
Median Abundance (All)/(group AA)/(group BB): Median Abundance in all, group AA and group BB, respectively;
**Log2FoldChange (Mean)_vs_BB**: Log2FoldChange (Mean Abundance) between group AA and group BB;
Mean Abundance (All)/(group AA)/(group BB): Mean Abundance in all, group AA and group BB, respectively;
Occurrence (All)/(group AA)/(group BB): Occurrence in all, group AA and group BB, respectively;
Odds Ratio (95% CI): 95% confidence interval odds ratio between group AA and group BB.
10.4 Metagenomic sequencing microbial data (metaphlan2/3)
10.4.1 Wilcoxon Rank Sum and Signed Rank Tests
Wilcoxon Rank Sum and Signed Rank Tests, which are nonparameter test methods, use the rank of taxa abundance to find the significant taxa.
<- run_wilcox(
DA_wilcox_mgs ps = metaphlan2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
colnames(DA_wilcox_mgs)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Statistic" "Pvalue"
## [7] "AdjustedPvalue" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Rank)\nAA_vs_BB"
## [13] "Mean Rank Abundance\nAA" "Mean Rank Abundance\nBB" "Occurrence (100%)\n(All)"
## [16] "Occurrence (100%)\nAA" "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
head(DA_wilcox_mgs)
## TaxaID Block Enrichment EffectSize Statistic Pvalue AdjustedPvalue Log2FoldChange (Median)\nAA_vs_BB
## 1 g__Acidaminococcus 7_AA vs 15_BB Nonsignif 0.9253506 36.0 0.19074499 0.4830467 NA
## 2 g__Actinomyces 7_AA vs 15_BB Nonsignif 4.3383273 25.0 0.04564844 0.1896166 NA
## 3 g__Adlercreutzia 7_AA vs 15_BB Nonsignif 0.4045684 28.0 0.05769072 0.2076866 NA
## 4 g__Alistipes 7_AA vs 15_BB Nonsignif 1.3860231 71.0 0.20193465 0.4830467 2.061604
## 5 g__Anaerostipes 7_AA vs 15_BB Nonsignif 0.5796404 31.0 0.13834560 0.3931928 -4.964322
## 6 g__Anaerotruncus 7_AA vs 15_BB Nonsignif 0.8015454 47.5 0.71288708 0.8019980 NA
## Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Rank)\nAA_vs_BB Mean Rank Abundance\nAA
## 1 0.00000000 0.0000000 0.0000000 -0.4631577 9.14
## 2 0.00000280 0.0000000 0.0000061 -0.8163116 7.57
## 3 0.00000000 0.0000000 0.0000558 -0.7147950 8.00
## 4 0.00711780 0.0236665 0.0056693 0.4613459 14.14
## 5 0.00031435 0.0000229 0.0007149 -0.6171177 8.43
## 6 0.00000000 0.0000000 0.0000000 -0.1327552 10.79
## Mean Rank Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 12.60 36.36 14.29 46.67 8700 (8700;8700)
## 2 13.33 54.55 14.29 73.33 1.7 (2.7;0.66)
## 3 13.13 40.91 14.29 53.33 Inf (Inf;Inf)
## 4 10.27 77.27 71.43 80.00 0.22 (-2.7;3.2)
## 5 12.93 86.36 71.43 93.33 6 (9.5;2.5)
## 6 11.83 36.36 28.57 40.00 1.3 (1.8;0.78)
wilcox Metagenomic sequencing in run_da
if (0) {
<- run_da(
DA_wilcox_mgs ps = metaphlan2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "wilcox")
}
10.4.2 Liner discriminant analysis (LDA) effect size (LEfSe)
LEfSe method is from Metagenomic biomarker discovery and explanation (Segata et al. 2011). It uses Liner discriminant analysis model to identify Differential Taxa.
<- run_lefse(
DA_lefse_mgs ps = metaphlan2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
norm = "CPM",
Lda = 2)
colnames(DA_lefse_mgs)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "LDA_Score" "EffectSize" "Log2FoldChange (Median)\nAA_vs_BB"
## [7] "Median Abundance\n(All)" "Median Abundance\nAA" "Median Abundance\nBB"
## [10] "Log2FoldChange (Mean)\nAA_vs_BB" "Mean Abundance\n(All)" "Mean Abundance\nAA"
## [13] "Mean Abundance\nBB" "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA"
## [16] "Occurrence (100%)\nBB" "Odds Ratio (95% CI)"
head(DA_lefse_mgs)
## TaxaID Block Enrichment LDA_Score EffectSize Log2FoldChange (Median)\nAA_vs_BB Median Abundance\n(All)
## 1 g__Actinomyces 7_AA vs 15_BB BB 2.233078 1.226382 NA 2.866622e+00
## 2 g__Bacteroides 7_AA vs 15_BB AA -5.081345 2.546304 1.190750 2.541675e+05
## 3 g__Bifidobacterium 7_AA vs 15_BB BB 4.938251 2.809341 -6.573979 3.906942e+04
## 4 g__Blautia 7_AA vs 15_BB BB 4.137194 0.559013 -3.203536 2.181765e+04
## 5 g__Collinsella 7_AA vs 15_BB BB 3.837502 2.585562 NA 7.735890e+03
## 6 g__Dorea 7_AA vs 15_BB BB 3.830972 2.149879 -5.725323 8.554293e+03
## Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All) Mean Abundance\nAA
## 1 0.0000 7.245027e+00 -1.246758 29.78669 15.38419
## 2 464614.8499 2.035362e+05 1.141368 291673.92734 464989.41139
## 3 917.8126 8.744173e+04 -3.459698 114201.49672 14604.58489
## 4 3704.8960 3.413002e+04 -2.334726 25756.09859 6854.31058
## 5 0.0000 1.164172e+04 -2.928116 12495.99111 2268.85954
## 6 270.5150 1.431151e+04 -3.454344 11339.49845 1455.31524
## Mean Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB Odds Ratio (95% CI)
## 1 36.50785 54.55 14.29 73.33 1.7 (2.8;0.66)
## 2 210793.36812 100.00 100.00 100.00 0.32 (-1.9;2.5)
## 3 160680.05558 100.00 100.00 100.00 150 (160;140)
## 4 34576.93300 100.00 100.00 100.00 91 (100;82)
## 5 17268.65252 68.18 42.86 80.00 20 (25;14)
## 6 15952.11728 90.91 71.43 100.00 28 (34;21)
lefse Metagenomic sequencing in run_da
if(0) {
<- run_da(
DA_lefse_mgs ps = metaphlan2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "lefse",
norm = "CPM",
Lda = 0)
}
10.4.3 t-test
T test, a parametric test method, identifies the significant taxa.
<- run_ttest(
DA_ttest_mgs ps = metaphlan2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
colnames(DA_ttest_mgs)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "Statistic" "Pvalue"
## [7] "AdjustedPvalue" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB"
## [13] "Mean Abundance\n(All)" "Mean Abundance\nAA" "Mean Abundance\nBB"
## [16] "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA" "Occurrence (100%)\nBB"
## [19] "Odds Ratio (95% CI)"
head(DA_ttest_mgs)
## TaxaID Block Enrichment EffectSize Statistic Pvalue AdjustedPvalue Log2FoldChange (Median)\nAA_vs_BB
## 1 g__Acidaminococcus 7_AA vs 15_BB Nonsignif 0.9253506 -1.0515590 0.31074638 0.5630606 NA
## 2 g__Actinomyces 7_AA vs 15_BB Nonsignif 4.3383273 -1.0131776 0.32823254 0.5717599 NA
## 3 g__Adlercreutzia 7_AA vs 15_BB Nonsignif 0.4045684 -2.1929042 0.04570343 0.2243623 NA
## 4 g__Alistipes 7_AA vs 15_BB Nonsignif 1.3860231 1.5963681 0.15951558 0.5060914 2.061604
## 5 g__Anaerostipes 7_AA vs 15_BB Nonsignif 0.5796404 -1.6970563 0.10937430 0.4218723 -4.964322
## 6 g__Anaerotruncus 7_AA vs 15_BB Nonsignif 0.8015454 -0.5871152 0.56369794 0.6617324 NA
## Median Abundance\n(All) Median Abundance\nAA Median Abundance\nBB Log2FoldChange (Mean)\nAA_vs_BB Mean Abundance\n(All)
## 1 0.00000000 0.0000000 0.0000000 -4.6280567 9.548995e-03
## 2 0.00000280 0.0000000 0.0000061 -1.2053189 2.846364e-05
## 3 0.00000000 0.0000000 0.0000558 -9.2633214 8.266409e-04
## 4 0.00711780 0.0236665 0.0056693 2.1761109 1.995760e-02
## 5 0.00031435 0.0000229 0.0007149 -2.2827694 1.665177e-03
## 6 0.00000000 0.0000000 0.0000000 -0.9025366 2.137000e-04
## Mean Abundance\nAA Mean Abundance\nBB Occurrence (100%)\n(All) Occurrence (100%)\nAA Occurrence (100%)\nBB
## 1 5.558857e-04 0.0137457800 36.36 14.29 46.67
## 2 1.505714e-05 0.0000347200 54.55 14.29 73.33
## 3 1.971429e-06 0.0012114867 40.91 14.29 53.33
## 4 4.254910e-02 0.0094149000 77.27 71.43 80.00
## 5 4.579714e-04 0.0022285400 86.36 71.43 93.33
## 6 1.341714e-04 0.0002508133 36.36 28.57 40.00
## Odds Ratio (95% CI)
## 1 8700 (8700;8700)
## 2 1.7 (2.7;0.66)
## 3 Inf (Inf;Inf)
## 4 0.22 (-2.7;3.2)
## 5 6 (9.5;2.5)
## 6 1.3 (1.8;0.78)
ttest Metagenomic sequencing in run_da
if(0) {
<- run_da(
DA_ttest_mgs ps = metaphlan2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "ttest")
}
10.4.4 Maaslin2 (Microbiome Multivariable Association with Linear Models)
Maaslin2 package is from Multivariable association discovery in population-scale meta-omics studies (Mallick et al. 2021). Maaslin2 relies on general linear models to accommodate most modern epidemiological study designs, including cross-sectional and longitudinal, along with a variety of filtering, normalization, and transform methods.
if (0) {
<- run_maaslin2(
DA_maaslin2_mgs ps = metaphlan2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
transform = "NONE",
norm = "NONE",
method = "LM",
outdir = "./demo_output")
head(DA_maaslin2_mgs)
}
maaslin2 Metagenomic sequencing in run_da
if (0) {
<- run_da(
DA_maaslin2_mgs ps = metaphlan2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = "maaslin2",
transform = "NONE",
norm = "NONE",
method = "LM",
outdir = "./demo_output")
}
10.5 Visualization
The Volcano plot is used to display differential analysis. plot_volcano
provides multiple parameters for plotting volcano. More details to see help(plot_volcano)
.
The barplot is used to display the lefse results.
10.5.1 Volcano plot
The X and Y coordinate axis are flexible to choose for Volcano. Here, we choose logFC and AdjustedPvalue to visualize the results
<- run_limma_voom(
DA_limma_voom ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
colnames(DA_limma_voom)
## [1] "TaxaID" "Block" "Enrichment"
## [4] "EffectSize" "logFC" "Pvalue"
## [7] "AdjustedPvalue" "Log2FoldChange (Median)\nAA_vs_BB" "Median Abundance\n(All)"
## [10] "Median Abundance\nAA" "Median Abundance\nBB" "Log2FoldChange (Mean)\nAA_vs_BB"
## [13] "Mean Abundance\n(All)" "Mean Abundance\nAA" "Mean Abundance\nBB"
## [16] "Occurrence (100%)\n(All)" "Occurrence (100%)\nAA" "Occurrence (100%)\nBB"
## [19] "Odds Ratio (95% CI)"
<- plot_volcano(
DA_limma_voom_volcano da_res = DA_limma_voom,
group_names = c("AA", "BB"),
x_index = "logFC",
x_index_cutoff = 0.5,
y_index = "Pvalue",
y_index_cutoff = 0.05,
group_color = c("red", "grey", "blue"),
topN = 5)
DA_limma_voom_volcano

Figure 10.3: Volcano (limma-voom: logFC and AdjustedPvalue)
DA_limma_voom_volcano effectsize
plot_volcano(
da_res = DA_limma_voom,
group_names = c("AA", "BB"),
x_index = "EffectSize",
x_index_cutoff = 1,
y_index = "Pvalue",
y_index_cutoff = 0.05,
group_color = c("red", "grey", "blue"),
topN = 8)

Figure 10.4: Volcano (limma-voom: EffectSize and Pvalue)
The logFC and EffectSize are the same values in limma-voom.
10.5.2 barplot in lefse
<- run_lefse(
DA_lefse ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
norm = "CPM",
Lda = 2)
# # don't run this code when you do lefse in reality
# DA_lefse$LDA_Score <- DA_lefse$LDA_Score * 1000
plot_lefse(
da_res = DA_lefse,
x_index = "LDA_Score",
x_index_cutoff = 1,
group_color = c("green", "red"))

Figure 10.5: Barplot (Lefse)
10.6 Dominant taxa
Display the significant taxa with selection using boxplot.
<- run_wilcox(
DA_wilcox ps = metaphlan2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"))
plot_topN_boxplot(
ps = metaphlan2_ps_genus_filter_trim,
da_res = DA_wilcox,
x_index = "Log2FoldChange (Median)\nAA_vs_BB",
x_index_cutoff = 0.2,
y_index = "Pvalue",
y_index_cutoff = 0.3,
topN = 3,
taxa_name = "s__Ruminococcus_torques",
group = "Group")

Figure 10.6: Dominant Taxa
10.7 Multiple differential analysis by one function
here, we provide the run_multiple_da
for obtaining the results list from multiple differential analysis methods.
<- run_multiple_da(
multiple_res ps = dada2_ps_genus_filter_trim,
group = "Group",
group_names = c("AA", "BB"),
da_method = c("aldex", "limma_voom", "mbzinb", "omnibus"),
p_adjust = "none")
## |------------(25%)----------(50%)----------(75%)----------|
## Start GMPR normalization ...
## Start Winsorization ...
## Perform filtering ...
## --A total of 92 taxa will be tested with a sample size of 23 !
## --Omnibus test is selected!
## --Dispersion is treated as a parameter of interest!
## Start testing ...
## 10 %
## 20 %
## 30 %
## 40 %
## 50 %
## 60 %
## 70 %
## 80 %
## 90 %
## 100%!
## Handle failed taxa using permutation test!
## Permutation test ....
## Completed!
names(multiple_res)
## [1] "aldex" "limma_voom" "mbzinb" "omnibus"
- plot results
plot_multiple_DA(
Multip_DA_res = multiple_res,
x_index_list = c("EffectSize", "logFC", "mean.LFC", "abund.LFC.CompvarBB.est"),
x_index_cutoff = 0,
y_index = "AdjustedPvalue",
y_index_cutoff = 0.5,
cellwidth = 50,
cellheight = 10)

Figure 10.7: Multiple DA results
10.8 Comparing outputs from XMAS2, lefse-conda, and lefse-galaxy using the same in-house datasets (amplicon_ps|Zeybel_Gut)
10.8.1 Introduction
In this document, Comparing the output from lefse through different applications:
- XMAS2 (R package)
- lefse-conda (command line)
- lefse-galaxy (from the galaxy platfrom)
In all cases, using the same dataset, amplicon_ps and Zeybel_Gut, which are included in the XMAS package.
library(XMAS2)
library(dplyr)
library(ggplot2)
library(devtools)
library(tibble)
library(tidyr)
library(magrittr)
library(readr)
library(VennDiagram)
library(purrr)
# rm(list = ls())
options(stringsAsFactors = F)
options(future.globals.maxSize = 1000 * 1024^2)
10.8.2 Dataset
10.8.2.1 16s genus
data("amplicon_ps")
<- summarize_taxa(amplicon_ps, taxa_level = "Genus")
amplicon_ps_genus amplicon_ps_genus
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 406 taxa and 34 samples ]
## sample_data() Sample Data: [ 34 samples by 8 sample variables ]
## tax_table() Taxonomy Table: [ 406 taxa by 6 taxonomic ranks ]
10.8.2.2 metagenomics species
data("Zeybel_Gut")
<- summarize_taxa(Zeybel_Gut, taxa_level = "Species")
Zeybel_ps_species Zeybel_ps_species
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 372 taxa and 42 samples ]
## sample_data() Sample Data: [ 42 samples by 51 sample variables ]
## tax_table() Taxonomy Table: [ 372 taxa by 7 taxonomic ranks ]
10.8.2.3 Preparing for lefse galaxy and conda
1st row: class (required)
2nd row: subclass (optional)
3rd row: sampleID (required)
rownames: taxon
data format: splitted by “
<- function(ps,
prepare_lefse
Class,
Class_names,Subclass = NULL,
cutoff = 10) {
# ps = amplicon_ps_genus
# Class = "SampleType"
# Class_names = c("gut", "tongue")
# Subclass = NULL
# cutoff = 10
<- phyloseq::sample_data(ps) %>%
sam_tab data.frame()
colnames(sam_tab)[which(colnames(sam_tab) == Class)] <- "CompClass"
if (is.null(Subclass)) {
<- sam_tab %>%
sam_tab_final ::select(CompClass) %>%
dplyr::rownames_to_column("TempRowNames") %>%
tibble::filter(CompClass %in% Class_names) %>%
dplyr::select(all_of(c("TempRowNames", "CompClass"))) %>%
dplyr::column_to_rownames("TempRowNames")
tibbleelse {
} <- sam_tab %>%
sam_tab_final ::select(all_of(c("CompClass", Subclass))) %>%
dplyr::rownames_to_column("TempRowNames") %>%
tibble::filter(CompClass %in% Class_names) %>%
dplyr::select(all_of(c("TempRowNames", "CompClass", Subclass))) %>%
dplyr::column_to_rownames("TempRowNames")
tibble
}
colnames(sam_tab_final)[which(colnames(sam_tab_final) == "CompClass")] <- Class
::sample_data(ps) <- phyloseq::sample_data(sam_tab_final)
phyloseq<- phyloseq::otu_table(ps) %>%
otu_tab data.frame()
<- otu_tab[rowSums(otu_tab) > cutoff, colSums(otu_tab) > cutoff, F]
otu_tab_final ::otu_table(ps) <- phyloseq::otu_table(as.matrix(otu_tab_final), taxa_are_rows = TRUE)
phyloseq
<- sam_tab_final %>%
lefse_data ::rownames_to_column("Sample") %>%
tibble::inner_join(otu_tab_final %>%
dplyrt() %>% data.frame() %>%
::rownames_to_column("Sample"),
tibbleby = "Sample") %>%
::select(all_of(Class), Sample, all_of(Subclass), everything()) %>%
dplyr#stats::setNames(c(Class, "Sample", Subclass, rownames(otu_tab_final))) %>%
t() %>% data.frame()
<- sam_tab_final %>%
lefse_data_nosub ::rownames_to_column("Sample") %>%
tibble::inner_join(otu_tab_final %>%
dplyrt() %>% data.frame() %>%
::rownames_to_column("Sample"),
tibbleby = "Sample") %>%
::select(-Sample) %>%
dplyr::select(all_of(Class), all_of(Subclass), everything()) %>%
dplyrt() %>% data.frame()
<- list(ps=ps,
res lefse=lefse_data,
lefse_nosub=lefse_data_nosub)
return(res)
}
<- prepare_lefse(
amplicon_ps_genus_lefse ps = amplicon_ps_genus,
Class = "SampleType",
Class_names = c("gut", "tongue"),
cutoff = 10)
write.table(amplicon_ps_genus_lefse$lefse, "amplicon_ps_genus_lefse.tsv", quote = F, sep = "\t", col.names = F)
write.table(amplicon_ps_genus_lefse$lefse_nosub, "amplicon_ps_genus_lefse_nosub.tsv", quote = F, sep = "\t", col.names = F)
<- prepare_lefse(
Zeybel_ps_species_lefse ps = Zeybel_ps_species,
Class = "LiverFatClass",
Class_names = c("Mild", "Moderate"),
cutoff = 1e-4)
write.table(Zeybel_ps_species_lefse$lefse, "Zeybel_ps_species_lefse.tsv", quote = F, sep = "\t", col.names = F)
write.table(Zeybel_ps_species_lefse$lefse_nosub, "Zeybel_ps_species_lefse_nosub.tsv", quote = F, sep = "\t", col.names = F)
10.8.3 Run lefse independently with the three applications (R, conda, galaxy)
10.8.3.1 Running lefse in R (XMAS2)
Perform the analysis with the run_lefse2
function:
- amplicon_ps_genus
# run_lefse
<- run_lefse(
amplicon_xmas2_output ps = amplicon_ps_genus_lefse$ps,
group = "SampleType",
group_names = c("gut", "tongue"),
norm = "CPM") %>%
::mutate(app_name = "xmas_lefse") %>%
dplyr::arrange(LDA_Score)
dplyrhead(amplicon_xmas2_output)
## TaxaID Block Enrichment LDA_Score EffectSize Log2FoldChange (Median)\ngut_vs_tongue
## 1 g__Bacteroides 8_gut vs 9_tongue gut -5.475334 5.734982 8.706188
## 2 g__Lachnospiraceae_unclassified 8_gut vs 9_tongue gut -4.445346 2.296951 3.048798
## 3 g__Ruminococcaceae_unclassified 8_gut vs 9_tongue gut -4.430034 5.962548 8.833464
## 4 g__Lachnospira 8_gut vs 9_tongue gut -4.327495 6.503718 NA
## 5 g__Phascolarctobacterium 8_gut vs 9_tongue gut -4.214235 6.206589 NA
## 6 g__Faecalibacterium 8_gut vs 9_tongue gut -4.136991 5.492180 NA
## Median Abundance\n(All) Median Abundance\ngut Median Abundance\ntongue Log2FoldChange (Mean)\ngut_vs_tongue
## 1 4552.3520 604210.78 1446.655 7.995845
## 2 11431.9387 64739.42 7823.286 3.186478
## 3 945.6265 53763.30 117.855 7.701566
## 4 0.0000 50368.43 0.000 NA
## 5 0.0000 18238.84 0.000 NA
## 6 0.0000 19194.92 0.000 NA
## Mean Abundance\n(All) Mean Abundance\ngut Mean Abundance\ntongue Occurrence (100%)\n(All) Occurrence (100%)\ngut
## 1 277146.67 586352.50 2297.0454 100.00 100
## 2 34604.22 65446.48 7188.8734 100.00 100
## 3 26586.43 56192.47 269.9452 76.47 100
## 4 22767.22 48380.35 0.0000 47.06 100
## 5 13444.82 28570.25 0.0000 47.06 100
## 6 13991.79 29732.55 0.0000 47.06 100
## Occurrence (100%)\ntongue Odds Ratio (95% CI) app_name
## 1 100.00 4.7e-14 (-60;60) xmas_lefse
## 2 100.00 4.4e-24 (-110;110) xmas_lefse
## 3 55.56 7.2e-41 (-180;180) xmas_lefse
## 4 0.00 <NA> xmas_lefse
## 5 0.00 <NA> xmas_lefse
## 6 0.00 <NA> xmas_lefse
# run_lefse2
<- run_lefse2(
amplicon_xmas2_output2 ps = amplicon_ps_genus_lefse$ps,
group = "SampleType",
group_names = c("gut", "tongue"),
norm = "CPM") %>%
::mutate(app_name = "xmas_lefse2") %>%
dplyr::arrange(LDA_Score)
dplyrhead(amplicon_xmas2_output2)
## TaxaID Block LDA_Score Enrichment EffectSize Pvalue
## 1 g__Bacteroides 8_gut vs 9_tongue -5.763040 gut 5.734982 0.0005320055
## 2 g__Lachnospiraceae_unclassified 8_gut vs 9_tongue -4.774847 gut 2.296951 0.0005320055
## 3 g__Ruminococcaceae_unclassified 8_gut vs 9_tongue -4.756686 gut 5.962548 0.0004911726
## 4 g__Lachnospira 8_gut vs 9_tongue -4.684106 gut 6.503718 0.0001762277
## 5 g__Faecalibacterium 8_gut vs 9_tongue -4.447011 gut 5.492180 0.0001762277
## 6 g__Phascolarctobacterium 8_gut vs 9_tongue -4.441230 gut 6.206589 0.0001762277
## Log2FoldChange (Median)\ngut_vs_tongue Median Abundance\n(All) Median Abundance\ngut Median Abundance\ntongue
## 1 8.706188 4552.3520 604210.78 1446.655
## 2 3.048798 11431.9387 64739.42 7823.286
## 3 8.833464 945.6265 53763.30 117.855
## 4 NA 0.0000 50368.43 0.000
## 5 NA 0.0000 19194.92 0.000
## 6 NA 0.0000 18238.84 0.000
## Log2FoldChange (Mean)\ngut_vs_tongue Mean Abundance\n(All) Mean Abundance\ngut Mean Abundance\ntongue
## 1 7.995845 277146.67 586352.50 2297.0454
## 2 3.186478 34604.22 65446.48 7188.8734
## 3 7.701566 26586.43 56192.47 269.9452
## 4 NA 22767.22 48380.35 0.0000
## 5 NA 13991.79 29732.55 0.0000
## 6 NA 13444.82 28570.25 0.0000
## Occurrence (100%)\n(All) Occurrence (100%)\ngut Occurrence (100%)\ntongue Odds Ratio (95% CI) app_name
## 1 100.00 100 100.00 4.7e-14 (-60;60) xmas_lefse2
## 2 100.00 100 100.00 4.4e-24 (-110;110) xmas_lefse2
## 3 76.47 100 55.56 7.2e-41 (-180;180) xmas_lefse2
## 4 47.06 100 0.00 <NA> xmas_lefse2
## 5 47.06 100 0.00 <NA> xmas_lefse2
## 6 47.06 100 0.00 <NA> xmas_lefse2
- Zeybel_ps_species
# run_lefse
<- run_lefse(
MGS_xmas2_output ps = Zeybel_ps_species_lefse$ps,
group = "LiverFatClass",
group_names = c("Mild", "Moderate"),
norm = "CPM") %>%
::mutate(app_name = "xmas_lefse") %>%
dplyr::arrange(LDA_Score)
dplyrhead(MGS_xmas2_output)
## TaxaID Block Enrichment LDA_Score EffectSize
## 1 s__Butyricimonas_virosa 12_Mild vs 12_Moderate Moderate 2.445966 1.668501
## 2 s__Bacteroides_salyersiae 12_Mild vs 12_Moderate Moderate 2.659689 1.573816
## 3 s__Bacteroides_clarus 12_Mild vs 12_Moderate Moderate 3.191513 2.331714
## 4 s__Bacteroides_thetaiotaomicron 12_Mild vs 12_Moderate Moderate 3.210748 2.775490
## 5 s__Bacteroides_coprocola 12_Mild vs 12_Moderate Moderate 3.314238 1.367262
## 6 s__Bacteroides_cellulosilyticus 12_Mild vs 12_Moderate Moderate 3.408064 3.145457
## Log2FoldChange (Median)\nMild_vs_Moderate Median Abundance\n(All) Median Abundance\nMild Median Abundance\nModerate
## 1 NA 0.8999998 0 262.9238
## 2 NA 0.0000000 0 0.0000
## 3 NA 0.0000000 0 0.0000
## 4 NA 223.8648101 0 1123.0082
## 5 NA 0.0000000 0 0.0000
## 6 NA 82.9748375 0 2707.7169
## Log2FoldChange (Mean)\nMild_vs_Moderate Mean Abundance\n(All) Mean Abundance\nMild Mean Abundance\nModerate
## 1 -2.329681 353.2093 117.2101 589.2084
## 2 NA 262.0187 0.0000 524.0375
## 3 NA 1283.6658 0.0000 2567.3315
## 4 -2.241224 2413.3440 842.6510 3984.0371
## 5 NA 1704.3673 0.0000 3408.7346
## 6 -3.325019 3170.2327 575.2844 5765.1810
## Occurrence (100%)\n(All) Occurrence (100%)\nMild Occurrence (100%)\nModerate Odds Ratio (95% CI) app_name
## 1 50.00 25.00 75.00 3.9 (6.6;1.2) xmas_lefse
## 2 16.67 0.00 33.33 <NA> xmas_lefse
## 3 20.83 0.00 41.67 <NA> xmas_lefse
## 4 66.67 41.67 91.67 3.1 (5.3;0.87) xmas_lefse
## 5 16.67 0.00 33.33 <NA> xmas_lefse
## 6 62.50 41.67 83.33 17 (23;11) xmas_lefse
# run_lefse2
<- run_lefse2(
MGS_xmas2_output2 ps = Zeybel_ps_species_lefse$ps,
group = "LiverFatClass",
group_names = c("Mild", "Moderate"),
norm = "CPM") %>%
::mutate(app_name = "xmas_lefse2") %>%
dplyr::arrange(LDA_Score)
dplyrhead(MGS_xmas2_output2)
## TaxaID Block LDA_Score Enrichment EffectSize Pvalue
## 1 s__Butyricimonas_virosa 12_Mild vs 12_Moderate 2.709350 Moderate 1.668501 0.02432292
## 2 s__Bacteroides_salyersiae 12_Mild vs 12_Moderate 2.763901 Moderate 1.573816 0.03286923
## 3 s__Bacteroides_clarus 12_Mild vs 12_Moderate 3.354092 Moderate 2.331714 0.01473169
## 4 s__Bacteroides_thetaiotaomicron 12_Mild vs 12_Moderate 3.502262 Moderate 2.775490 0.01050852
## 5 s__Bacteroides_intestinalis 12_Mild vs 12_Moderate 3.524676 Moderate 1.184353 0.03286923
## 6 s__Bacteroides_coprocola 12_Mild vs 12_Moderate 3.538227 Moderate 1.367262 0.03286923
## Log2FoldChange (Median)\nMild_vs_Moderate Median Abundance\n(All) Median Abundance\nMild Median Abundance\nModerate
## 1 NA 0.8999998 0 262.9238
## 2 NA 0.0000000 0 0.0000
## 3 NA 0.0000000 0 0.0000
## 4 NA 223.8648101 0 1123.0082
## 5 NA 0.0000000 0 0.0000
## 6 NA 0.0000000 0 0.0000
## Log2FoldChange (Mean)\nMild_vs_Moderate Mean Abundance\n(All) Mean Abundance\nMild Mean Abundance\nModerate
## 1 -2.329681 353.2093 117.2101 589.2084
## 2 NA 262.0187 0.0000 524.0375
## 3 NA 1283.6658 0.0000 2567.3315
## 4 -2.241224 2413.3440 842.6510 3984.0371
## 5 NA 1745.9150 0.0000 3491.8300
## 6 NA 1704.3673 0.0000 3408.7346
## Occurrence (100%)\n(All) Occurrence (100%)\nMild Occurrence (100%)\nModerate Odds Ratio (95% CI) app_name
## 1 50.00 25.00 75.00 3.9 (6.6;1.2) xmas_lefse2
## 2 16.67 0.00 33.33 <NA> xmas_lefse2
## 3 20.83 0.00 41.67 <NA> xmas_lefse2
## 4 66.67 41.67 91.67 3.1 (5.3;0.87) xmas_lefse2
## 5 16.67 0.00 33.33 <NA> xmas_lefse2
## 6 16.67 0.00 33.33 <NA> xmas_lefse2
10.8.3.2 Running lefse-conda (command line)
10.8.3.2.0.1 lefse-conda installation and version
Note: I installed lefse following the instructions from this site after installing conda.
## Add channels
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --add channels biobakery
## Install lefse
conda create -n lefse -c biobakery lefse -y
Conda and lefse versions:
conda --version
#> conda 4.12.0
conda list | grep -e "lefse"
# packages in environment at /home/samuel/miniconda3/envs/lefse:
#> lefse 1.1.2 pyhdfd78af_0 bioconda
10.8.3.2.1 Run lefse-conda
Generate a tabular dataset (amplicon_ps_genus_lefse or Zeybel_ps_species_lefse) compatible with lefse-conda and lefse-galaxy using the
get_dataset.R
script.Run the script
run_lefse.sh
(linux) with the following parameters:
# In general
# ./run_lefse.sh <path/to/conda/activate> <env_name> <path/to/Rscript> <filename prefix>
# in my case (Hua Zou)
## amplicon_ps_genus_lefse
./run_lefse.sh /Users/zouhua/opt/anaconda3/bin/activate lefse /usr/local/bin/R amplicon_ps_genus_lefse
## Zeybel_ps_species_lefse
./run_lefse.sh /Users/zouhua/opt/anaconda3/bin/activate lefse /usr/local/bin/R Zeybel_ps_species_lefse
Note: All script files, get_dataset.R
and run_lefse.sh
, and this rmarkdown
document must be in the same directory.
10.8.3.2.2 Import output from lefse-conda into R
<- function(datres,
get_lefse_python
Class_names,name = "lefse_conda",
LDA_names = "lefse_conda_LDA",
LDA_cutoff = 2) {
# datres = "amplicon_ps_genus_lefse.res"
# Class_names = c("gut", "tongue")
# LDA_cutoff = 2
<- c(
col_names "TaxaID", "log_hi_class_avg", "Enrichment", "lefse_conda_LDA", "pval")
<- readr::read_tsv(datres, show_col_types = FALSE, col_names = FALSE ) %>%
lefse_conda ::set_colnames(col_names) %>%
magrittr::filter(!is.na(lefse_conda_LDA)) %>%
dplyr::mutate(
dplyrlefse_conda_LDA = ifelse(
== Class_names[1], -lefse_conda_LDA, lefse_conda_LDA),
Enrichment app_name = name) %>%
::filter(abs(lefse_conda_LDA) >= LDA_cutoff) %>%
dplyr::arrange(lefse_conda_LDA)
dplyr
colnames(lefse_conda)[which(colnames(lefse_conda) == "lefse_conda_LDA")] <- LDA_names
return(lefse_conda)
}
<- get_lefse_python(
amplicon_ps_genus_lefse_conda datres = "amplicon_ps_genus_lefse.res",
Class_names = c("gut", "tongue"),
LDA_names = "lefse_conda_LDA")
head(amplicon_ps_genus_lefse_conda)
## # A tibble: 6 × 6
## TaxaID log_hi_class_avg Enrichment lefse_conda_LDA pval app_name
## <chr> <dbl> <chr> <dbl> <chr> <chr>
## 1 g__Bacteroides 5.77 gut -5.47 0.0005320055051392497 lefse_conda
## 2 g__Lachnospiraceae_unclassified 4.82 gut -4.47 0.0005320055051392497 lefse_conda
## 3 g__Ruminococcaceae_unclassified 4.75 gut -4.46 0.0004911726274481772 lefse_conda
## 4 g__Lachnospira 4.68 gut -4.36 0.0001762276980982556 lefse_conda
## 5 g__Faecalibacterium 4.47 gut -4.20 0.0001762276980982556 lefse_conda
## 6 g__Phascolarctobacterium 4.46 gut -4.15 0.0001762276980982556 lefse_conda
<- get_lefse_python(
Zeybel_ps_species_lefse_conda datres = "Zeybel_ps_species_lefse.res",
Class_names = c("Mild", "Moderate"),
LDA_names = "lefse_conda_LDA")
head(Zeybel_ps_species_lefse_conda)
## # A tibble: 6 × 6
## TaxaID log_hi_class_avg Enrichment lefse_conda_LDA pval app_name
## <chr> <dbl> <chr> <dbl> <chr> <chr>
## 1 s__Butyricimonas_virosa 2.77 Moderate 2.46 0.02432291566171718 lefse_conda
## 2 s__Bacteroides_salyersiae 2.72 Moderate 2.47 0.032869234300658745 lefse_conda
## 3 s__Bacteroides_clarus 3.41 Moderate 3.07 0.014731687897477879 lefse_conda
## 4 s__Bacteroides_thetaiotaomicron 3.60 Moderate 3.26 0.01050851971947486 lefse_conda
## 5 s__Bacteroides_coprocola 3.53 Moderate 3.28 0.032869234300658745 lefse_conda
## 6 s__Bacteroides_intestinalis 3.54 Moderate 3.32 0.032869234300658745 lefse_conda
10.8.3.3 Running lefse from galaxy
Using the amplicon_ps_genus_lefse_nosub.txt
or Zeybel_ps_species_lefse_nosub.txt
file (no subjects included) as input for lefse from the galaxy platform of the Huttenhower lab at galaxy.
The conditions as follow:
alpha were 0.05 for both KW and Wilcox,
2.0 for LDA.
TSS normalization was applied as well.
converting the output into compared format:
amplicon_ps_genus_lefse_nosub.res
Zeybel_ps_species_lefse_nosub.res
<- get_lefse_python(
amplicon_ps_genus_lefse_galaxy datres = "amplicon_ps_genus_lefse_nosub.res",
name = "lefse_galaxy",
Class_names = c("gut", "tongue"),
LDA_names = "lefse_galaxy_LDA")
head(amplicon_ps_genus_lefse_galaxy)
## # A tibble: 6 × 6
## TaxaID log_hi_class_avg Enrichment lefse_galaxy_LDA pval app_name
## <chr> <dbl> <chr> <dbl> <chr> <chr>
## 1 g__Bacteroides 5.77 gut -5.46 0.000532005505139 lefse_galaxy
## 2 g__Lachnospiraceae_unclassified 4.82 gut -4.45 0.000532005505139 lefse_galaxy
## 3 g__Ruminococcaceae_unclassified 4.75 gut -4.41 0.000491172627448 lefse_galaxy
## 4 g__Lachnospira 4.68 gut -4.38 0.000176227698098 lefse_galaxy
## 5 g__Phascolarctobacterium 4.46 gut -4.18 0.000176227698098 lefse_galaxy
## 6 g__Faecalibacterium 4.47 gut -4.18 0.000176227698098 lefse_galaxy
<- get_lefse_python(
Zeybel_ps_species_lefse_galaxy datres = "Zeybel_ps_species_lefse_nosub.res",
name = "lefse_galaxy",
Class_names = c("Mild", "Moderate"),
LDA_names = "lefse_galaxy_LDA")
head(amplicon_ps_genus_lefse_galaxy)
## # A tibble: 6 × 6
## TaxaID log_hi_class_avg Enrichment lefse_galaxy_LDA pval app_name
## <chr> <dbl> <chr> <dbl> <chr> <chr>
## 1 g__Bacteroides 5.77 gut -5.46 0.000532005505139 lefse_galaxy
## 2 g__Lachnospiraceae_unclassified 4.82 gut -4.45 0.000532005505139 lefse_galaxy
## 3 g__Ruminococcaceae_unclassified 4.75 gut -4.41 0.000491172627448 lefse_galaxy
## 4 g__Lachnospira 4.68 gut -4.38 0.000176227698098 lefse_galaxy
## 5 g__Phascolarctobacterium 4.46 gut -4.18 0.000176227698098 lefse_galaxy
## 6 g__Faecalibacterium 4.47 gut -4.18 0.000176227698098 lefse_galaxy
10.8.3.4 Extracting results from XMAS2 results
run_lefse (lefser R package)
run_lefse2 (microbiomeMarker R package)
<- function(datres,
get_lefse_R name = "Rrun_lefse",
LDA_names = "lefse_R_LDA",
LDA_cutoff = 2) {
# datres = amplicon_xmas2_output
# name = "Rrun_lefse"
# LDA_cutoff = 2
<- c(
col_names "TaxaID", "Block", "Enrichment", "LDA_Score", "EffectSize")
<- datres %>%
lefse_R ::select(all_of(col_names)) %>%
dplyr::mutate(app_name = name) %>%
dplyr::filter(abs(LDA_Score) >= LDA_cutoff) %>%
dplyr::arrange(LDA_Score)
dplyr
colnames(lefse_R)[which(colnames(lefse_R) == "LDA_Score")] <- LDA_names
return(lefse_R)
}
<- get_lefse_R(
amplicon_ps_genus_lefse_R datres = amplicon_xmas2_output,
name = "Rrun_lefse",
LDA_names = "lefse_R_LDA")
head(amplicon_ps_genus_lefse_R)
## TaxaID Block Enrichment lefse_R_LDA EffectSize app_name
## 1 g__Bacteroides 8_gut vs 9_tongue gut -5.475334 5.734982 Rrun_lefse
## 2 g__Lachnospiraceae_unclassified 8_gut vs 9_tongue gut -4.445346 2.296951 Rrun_lefse
## 3 g__Ruminococcaceae_unclassified 8_gut vs 9_tongue gut -4.430034 5.962548 Rrun_lefse
## 4 g__Lachnospira 8_gut vs 9_tongue gut -4.327495 6.503718 Rrun_lefse
## 5 g__Phascolarctobacterium 8_gut vs 9_tongue gut -4.214235 6.206589 Rrun_lefse
## 6 g__Faecalibacterium 8_gut vs 9_tongue gut -4.136991 5.492180 Rrun_lefse
<- get_lefse_R(
amplicon_ps_genus_lefse_R2 datres = amplicon_xmas2_output2,
name = "Rrun_lefse2",
LDA_names = "lefse_R2_LDA")
head(amplicon_ps_genus_lefse_R2)
## TaxaID Block Enrichment lefse_R2_LDA EffectSize app_name
## 1 g__Bacteroides 8_gut vs 9_tongue gut -5.763040 5.734982 Rrun_lefse2
## 2 g__Lachnospiraceae_unclassified 8_gut vs 9_tongue gut -4.774847 2.296951 Rrun_lefse2
## 3 g__Ruminococcaceae_unclassified 8_gut vs 9_tongue gut -4.756686 5.962548 Rrun_lefse2
## 4 g__Lachnospira 8_gut vs 9_tongue gut -4.684106 6.503718 Rrun_lefse2
## 5 g__Faecalibacterium 8_gut vs 9_tongue gut -4.447011 5.492180 Rrun_lefse2
## 6 g__Phascolarctobacterium 8_gut vs 9_tongue gut -4.441230 6.206589 Rrun_lefse2
<- get_lefse_R(
Zeybel_ps_species_lefse_R datres = MGS_xmas2_output,
name = "Rrun_lefse",
LDA_names = "lefse_R_LDA")
head(Zeybel_ps_species_lefse_R)
## TaxaID Block Enrichment lefse_R_LDA EffectSize app_name
## 1 s__Butyricimonas_virosa 12_Mild vs 12_Moderate Moderate 2.445966 1.668501 Rrun_lefse
## 2 s__Bacteroides_salyersiae 12_Mild vs 12_Moderate Moderate 2.659689 1.573816 Rrun_lefse
## 3 s__Bacteroides_clarus 12_Mild vs 12_Moderate Moderate 3.191513 2.331714 Rrun_lefse
## 4 s__Bacteroides_thetaiotaomicron 12_Mild vs 12_Moderate Moderate 3.210748 2.775490 Rrun_lefse
## 5 s__Bacteroides_coprocola 12_Mild vs 12_Moderate Moderate 3.314238 1.367262 Rrun_lefse
## 6 s__Bacteroides_cellulosilyticus 12_Mild vs 12_Moderate Moderate 3.408064 3.145457 Rrun_lefse
<- get_lefse_R(
Zeybel_ps_species_lefse_R2 datres = MGS_xmas2_output2,
name = "Rrun_lefse2",
LDA_names = "lefse_R2_LDA")
head(Zeybel_ps_species_lefse_R2)
## TaxaID Block Enrichment lefse_R2_LDA EffectSize app_name
## 1 s__Butyricimonas_virosa 12_Mild vs 12_Moderate Moderate 2.709350 1.668501 Rrun_lefse2
## 2 s__Bacteroides_salyersiae 12_Mild vs 12_Moderate Moderate 2.763901 1.573816 Rrun_lefse2
## 3 s__Bacteroides_clarus 12_Mild vs 12_Moderate Moderate 3.354092 2.331714 Rrun_lefse2
## 4 s__Bacteroides_thetaiotaomicron 12_Mild vs 12_Moderate Moderate 3.502262 2.775490 Rrun_lefse2
## 5 s__Bacteroides_intestinalis 12_Mild vs 12_Moderate Moderate 3.524676 1.184353 Rrun_lefse2
## 6 s__Bacteroides_coprocola 12_Mild vs 12_Moderate Moderate 3.538227 1.367262 Rrun_lefse2
10.8.4 Comparison of lefse-conda with XMAS2
10.8.4.1 Number of features reported as significant
- amplicon_ps_genus_lefse
<- function(dat1, dat2, dat3, dat4) {
plot_signif_taxa_num
# dat1 = amplicon_ps_genus_lefse_conda
# dat2 = amplicon_ps_genus_lefse_galaxy
# dat3 = amplicon_ps_genus_lefse_R
# dat4 = amplicon_ps_genus_lefse_R2
<- dplyr::bind_rows(dat1, dat2, dat3, dat4) %>%
combined_outputs ::mutate(LDA = coalesce(lefse_conda_LDA,
dplyr
lefse_galaxy_LDA,
lefse_R_LDA,
lefse_R2_LDA))
<- combined_outputs %>%
pl ::count(app_name) %>%
dplyrggplot(aes(app_name, n)) +
geom_col() +
geom_label(aes(label = n)) +
ggtitle('Number of significiant features identified by the different applications using lefse')
return(pl)
}
plot_signif_taxa_num(dat1 = amplicon_ps_genus_lefse_conda,
dat2 = amplicon_ps_genus_lefse_galaxy,
dat3 = amplicon_ps_genus_lefse_R,
dat4 = amplicon_ps_genus_lefse_R2)

Figure 10.8: Number of significiant features identified by the different applications using lefse (16s)
- Zeybel_ps_species_lefse
plot_signif_taxa_num(dat1 = Zeybel_ps_species_lefse_conda,
dat2 = Zeybel_ps_species_lefse_galaxy,
dat3 = Zeybel_ps_species_lefse_R,
dat4 = Zeybel_ps_species_lefse_R2)

Figure 10.9: Number of significiant features identified by the different applications using lefse (MGS)
10.8.4.2 Overlap of features reported as significant
- amplicon_ps_genus_lefse
<- function(dat1, dat2, dat3, dat4) {
plot_signif_taxa_venn
# dat1 = amplicon_ps_genus_lefse_conda
# dat2 = amplicon_ps_genus_lefse_galaxy
# dat3 = amplicon_ps_genus_lefse_R
# dat4 = amplicon_ps_genus_lefse_R2
= dat1$TaxaID
set1 = dat2$TaxaID
set2 = dat3$TaxaID
set3 = dat4$TaxaID
set4
grid.newpage()
<- venn.diagram(
venn_object x = list(set1, set2, set3, set4),
category.names = c("lefse-conda", "lefse-galaxy",
"run_lefse(lefser)", "run_lefse2(microbiomeMarker)"),
filename = NULL
)grid.draw(venn_object)
}
plot_signif_taxa_venn(dat1 = amplicon_ps_genus_lefse_conda,
dat2 = amplicon_ps_genus_lefse_galaxy,
dat3 = amplicon_ps_genus_lefse_R,
dat4 = amplicon_ps_genus_lefse_R2)

Figure 10.10: Venn (16s)
- Zeybel_ps_species_lefse
plot_signif_taxa_venn(dat1 = Zeybel_ps_species_lefse_conda,
dat2 = Zeybel_ps_species_lefse_galaxy,
dat3 = Zeybel_ps_species_lefse_R,
dat4 = Zeybel_ps_species_lefse_R2)

Figure 10.11: Venn (MGS)
10.8.4.3 LDA scores’ comparison
LDA scores of the 14 overlapping features are similar.
- amplicon_ps_genus_lefse
<- purrr::reduce(
amplicon_joint_output .x = list(amplicon_ps_genus_lefse_conda, amplicon_ps_genus_lefse_galaxy,
amplicon_ps_genus_lefse_R, amplicon_ps_genus_lefse_R2),.f = ~ inner_join(.x, .y, by = "TaxaID")) %>%
::select(TaxaID, lefse_conda_LDA, lefse_galaxy_LDA,
dplyr
lefse_R_LDA, lefse_R2_LDA) amplicon_joint_output
## # A tibble: 57 × 5
## TaxaID lefse_conda_LDA lefse_galaxy_LDA lefse_R_LDA lefse_R2_LDA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 g__Bacteroides -5.47 -5.46 -5.48 -5.76
## 2 g__Lachnospiraceae_unclassified -4.47 -4.45 -4.45 -4.77
## 3 g__Ruminococcaceae_unclassified -4.46 -4.41 -4.43 -4.76
## 4 g__Lachnospira -4.36 -4.38 -4.33 -4.68
## 5 g__Faecalibacterium -4.20 -4.18 -4.14 -4.45
## 6 g__Phascolarctobacterium -4.15 -4.18 -4.21 -4.44
## 7 g__Clostridiales_unclassified -4.03 -4.06 -4.02 -4.37
## 8 g__Akkermansia -3.91 -4.04 -4.08 -4.22
## 9 g__Oscillospira -3.87 -3.89 -3.87 -4.20
## 10 g__Rikenellaceae_unclassified -3.87 -3.91 -3.85 -4.27
## # … with 47 more rows
- Zeybel_ps_species_lefse
<- purrr::reduce(
MGS_joint_output .x = list(Zeybel_ps_species_lefse_conda, Zeybel_ps_species_lefse_galaxy,
Zeybel_ps_species_lefse_R, Zeybel_ps_species_lefse_R2),.f = ~ inner_join(.x, .y, by = "TaxaID")) %>%
::select(TaxaID, lefse_conda_LDA, lefse_galaxy_LDA,
dplyr
lefse_R_LDA, lefse_R2_LDA) MGS_joint_output
## # A tibble: 11 × 5
## TaxaID lefse_conda_LDA lefse_galaxy_LDA lefse_R_LDA lefse_R2_LDA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 s__Butyricimonas_virosa 2.46 2.82 2.45 2.71
## 2 s__Bacteroides_salyersiae 2.47 3.16 2.66 2.76
## 3 s__Bacteroides_clarus 3.07 3.29 3.19 3.35
## 4 s__Bacteroides_thetaiotaomicron 3.26 3.28 3.21 3.50
## 5 s__Bacteroides_coprocola 3.28 3.24 3.31 3.54
## 6 s__Bacteroides_intestinalis 3.32 3.31 3.41 3.52
## 7 s__Bacteroides_cellulosilyticus 3.41 3.36 3.41 3.71
## 8 s__Bacteroides_eggerthii 3.52 3.59 3.47 3.80
## 9 s__Parabacteroides_merdae 3.59 3.68 3.54 3.93
## 10 s__Barnesiella_intestinihominis 3.68 3.72 3.67 3.95
## 11 s__Prevotella_sp_CAG_520 4.04 4.26 4.38 4.63
10.8.4.3.1 XMAS2 LDA scores vs lefse-conda LDA scores
10.8.4.3.1.1 amplicon_ps_genus_lefse
- run_lefse (lefser R package) vs lefse-conda
%>%
amplicon_joint_output ggplot(aes(lefse_conda_LDA, lefse_R_LDA)) +
geom_point(size = 3, shape = 1) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_vline(xintercept = 0, linetype = "dashed") +
ggtitle("Comparison of LDA scores of features reported as significant
by both lefse-conda and run_lefse")

Figure 10.12: Comparison of LDA scores of features reported as significant by both lefse-conda and run_lefse (16s)
- run_lefse2 (microbiomeMarker R package) vs lefse-conda
%>%
amplicon_joint_output ggplot(aes(lefse_conda_LDA, lefse_R2_LDA)) +
geom_point(size = 3, shape = 1) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_vline(xintercept = 0, linetype = "dashed") +
ggtitle("Comparison of LDA scores of features reported as significant
by both lefse-conda and run_lefse2")

Figure 10.13: Comparison of LDA scores of features reported as significant by both lefse-conda and run_lefse2 (16s)
10.8.4.3.1.2 Zeybel_ps_species_lefse
- run_lefse (lefser R package) vs lefse-conda
%>%
MGS_joint_output ggplot(aes(lefse_conda_LDA, lefse_R_LDA)) +
geom_point(size = 3, shape = 1) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_vline(xintercept = 0, linetype = "dashed") +
ggtitle("Comparison of LDA scores of features reported as significant
by both lefse-conda and run_lefse")

Figure 10.14: Comparison of LDA scores of features reported as significant by both lefse-conda and run_lefse (MGS)
- run_lefse2 (microbiomeMarker R package) vs lefse-conda
%>%
MGS_joint_output ggplot(aes(lefse_conda_LDA, lefse_R2_LDA)) +
geom_point(size = 3, shape = 1) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_vline(xintercept = 0, linetype = "dashed") +
ggtitle("Comparison of LDA scores of features reported as significant
by both lefse-conda and run_lefse2")

Figure 10.15: Comparison of LDA scores of features reported as significant by both lefse-conda and run_lefse2 (MGS)
10.8.4.3.2 XMAS2 LDA scores vs lefse-galaxy LDA scores
10.8.4.3.2.1 amplicon_ps_genus_lefse
- run_lefse (lefser R package) vs lefse-galaxy
%>%
amplicon_joint_output ggplot(aes(lefse_galaxy_LDA, lefse_R_LDA)) +
geom_point(size = 3, shape = 1) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_vline(xintercept = 0, linetype = "dashed") +
ggtitle("Comparison of LDA scores of features reported as significant
by both lefse-galaxy and run_lefse")

Figure 10.16: Comparison of LDA scores of features reported as significant by both lefse-galaxy and run_lefse (16s)
- run_lefse2 (microbiomeMarker R package) vs lefse-galaxy
%>%
amplicon_joint_output ggplot(aes(lefse_galaxy_LDA, lefse_R2_LDA)) +
geom_point(size = 3, shape = 1) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_vline(xintercept = 0, linetype = "dashed") +
ggtitle("Comparison of LDA scores of features reported as significant
by both lefse-galaxy and run_lefse2")

Figure 10.17: Comparison of LDA scores of features reported as significant by both lefse-galaxy and run_lefse2 (16s)
10.8.4.3.2.2 Zeybel_ps_species_lefse
- run_lefse (lefser R package) vs lefse-galaxy
%>%
MGS_joint_output ggplot(aes(lefse_galaxy_LDA, lefse_R_LDA)) +
geom_point(size = 3, shape = 1) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_vline(xintercept = 0, linetype = "dashed") +
ggtitle("Comparison of LDA scores of features reported as significant
by both lefse-galaxy and run_lefse")

Figure 10.18: Comparison of LDA scores of features reported as significant by both lefse-galaxy and run_lefse (MGS)
- run_lefse2 (microbiomeMarker R package) vs lefse-galaxy
%>%
MGS_joint_output ggplot(aes(lefse_galaxy_LDA, lefse_R2_LDA)) +
geom_point(size = 3, shape = 1) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_vline(xintercept = 0, linetype = "dashed") +
ggtitle("Comparison of LDA scores of features reported as significant
by both lefse-galaxy and run_lefse2")

Figure 10.19: Comparison of LDA scores of features reported as significant by both lefse-galaxy and run_lefse2 (MGS)
Results:
- The overlap between
run_lefse2
(microbiomeMarker R package) and lefse-conda or lefse-galaxy have the similar LDA scores. However, the overlap betweenrun_lefse
(lefser R package) and lefse-conda or lefse-galaxy seem have slightly different LDA scores.
10.9 Systematic Information
::session_info() devtools
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 4.1.2 (2021-11-01)
## os macOS Monterey 12.2.1
## system x86_64, darwin17.0
## ui RStudio
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz Asia/Shanghai
## date 2022-10-08
## rstudio 2022.07.1+554 Spotted Wakerobin (desktop)
## pandoc 2.18 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## abind 1.4-5 2016-07-21 [1] CRAN (R 4.1.0)
## ade4 1.7-18 2021-09-16 [1] CRAN (R 4.1.0)
## ALDEx2 1.26.0 2021-10-26 [1] Bioconductor
## annotate 1.72.0 2021-10-26 [1] Bioconductor
## AnnotationDbi 1.56.2 2021-11-09 [1] Bioconductor
## ape * 5.6-2 2022-03-02 [1] CRAN (R 4.1.2)
## askpass 1.1 2019-01-13 [1] CRAN (R 4.1.0)
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0)
## backports 1.4.1 2021-12-13 [1] CRAN (R 4.1.0)
## base64enc 0.1-3 2015-07-28 [1] CRAN (R 4.1.0)
## bayesm 3.1-4 2019-10-15 [1] CRAN (R 4.1.0)
## Biobase * 2.54.0 2021-10-26 [1] Bioconductor
## BiocGenerics * 0.40.0 2021-10-26 [1] Bioconductor
## BiocParallel 1.28.3 2021-12-09 [1] Bioconductor
## biomformat 1.22.0 2021-10-26 [1] Bioconductor
## Biostrings 2.62.0 2021-10-26 [1] Bioconductor
## bit 4.0.4 2020-08-04 [1] CRAN (R 4.1.0)
## bit64 4.0.5 2020-08-30 [1] CRAN (R 4.1.0)
## bitops 1.0-7 2021-04-24 [1] CRAN (R 4.1.0)
## blob 1.2.2 2021-07-23 [1] CRAN (R 4.1.0)
## bookdown 0.27 2022-06-14 [1] CRAN (R 4.1.2)
## brio 1.1.3 2021-11-30 [1] CRAN (R 4.1.0)
## broom 1.0.0 2022-07-01 [1] CRAN (R 4.1.2)
## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.1.0)
## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0)
## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.0)
## car 3.0-12 2021-11-06 [1] CRAN (R 4.1.0)
## carData 3.0-5 2022-01-06 [1] CRAN (R 4.1.2)
## caTools 1.18.2 2021-03-28 [1] CRAN (R 4.1.0)
## checkmate 2.0.0 2020-02-06 [1] CRAN (R 4.1.0)
## class 7.3-20 2022-01-13 [1] CRAN (R 4.1.2)
## classInt 0.4-3 2020-04-07 [1] CRAN (R 4.1.0)
## cli 3.3.0 2022-04-25 [1] CRAN (R 4.1.2)
## cluster 2.1.2 2021-04-17 [1] CRAN (R 4.1.2)
## codetools 0.2-18 2020-11-04 [1] CRAN (R 4.1.2)
## coin 1.4-2 2021-10-08 [1] CRAN (R 4.1.0)
## colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.1.2)
## compositions 2.0-4 2022-01-05 [1] CRAN (R 4.1.2)
## conflicted * 1.1.0 2021-11-26 [1] CRAN (R 4.1.0)
## corrplot 0.92 2021-11-18 [1] CRAN (R 4.1.0)
## cowplot 1.1.1 2020-12-30 [1] CRAN (R 4.1.0)
## crayon 1.5.0 2022-02-14 [1] CRAN (R 4.1.2)
## crosstalk 1.2.0 2021-11-04 [1] CRAN (R 4.1.0)
## data.table 1.14.2 2021-09-27 [1] CRAN (R 4.1.0)
## DBI 1.1.2 2021-12-20 [1] CRAN (R 4.1.0)
## DelayedArray 0.20.0 2021-10-26 [1] Bioconductor
## DEoptimR 1.0-10 2022-01-03 [1] CRAN (R 4.1.2)
## desc 1.4.1 2022-03-06 [1] CRAN (R 4.1.2)
## DESeq2 1.34.0 2021-10-26 [1] Bioconductor
## devtools * 2.4.3 2021-11-30 [1] CRAN (R 4.1.0)
## digest 0.6.29 2021-12-01 [1] CRAN (R 4.1.0)
## dplyr * 1.0.8 2022-02-08 [1] CRAN (R 4.1.2)
## DT 0.21 2022-02-26 [1] CRAN (R 4.1.2)
## e1071 1.7-9 2021-09-16 [1] CRAN (R 4.1.0)
## edgeR 3.36.0 2021-10-26 [1] Bioconductor
## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0)
## evaluate 0.15 2022-02-18 [1] CRAN (R 4.1.2)
## FactoMineR 2.4 2020-12-11 [1] CRAN (R 4.1.0)
## fansi 1.0.2 2022-01-14 [1] CRAN (R 4.1.2)
## farver 2.1.0 2021-02-28 [1] CRAN (R 4.1.0)
## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0)
## flashClust 1.01-2 2012-08-21 [1] CRAN (R 4.1.0)
## foreach 1.5.2 2022-02-02 [1] CRAN (R 4.1.2)
## foreign 0.8-82 2022-01-13 [1] CRAN (R 4.1.2)
## forestplot 2.0.1 2021-09-03 [1] CRAN (R 4.1.0)
## formatR 1.11 2021-06-01 [1] CRAN (R 4.1.0)
## Formula 1.2-4 2020-10-16 [1] CRAN (R 4.1.0)
## fs 1.5.2 2021-12-08 [1] CRAN (R 4.1.0)
## futile.logger * 1.4.3 2016-07-10 [1] CRAN (R 4.1.0)
## futile.options 1.0.1 2018-04-20 [1] CRAN (R 4.1.0)
## genefilter 1.76.0 2021-10-26 [1] Bioconductor
## geneplotter 1.72.0 2021-10-26 [1] Bioconductor
## generics 0.1.2 2022-01-31 [1] CRAN (R 4.1.2)
## GenomeInfoDb * 1.30.1 2022-01-30 [1] Bioconductor
## GenomeInfoDbData 1.2.7 2022-03-09 [1] Bioconductor
## GenomicRanges * 1.46.1 2021-11-18 [1] Bioconductor
## ggiraph 0.8.2 2022-02-22 [1] CRAN (R 4.1.2)
## ggiraphExtra 0.3.0 2020-10-06 [1] CRAN (R 4.1.2)
## ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0)
## ggpubr * 0.4.0 2020-06-27 [1] CRAN (R 4.1.0)
## ggrepel 0.9.1 2021-01-15 [1] CRAN (R 4.1.0)
## ggsci 2.9 2018-05-14 [1] CRAN (R 4.1.0)
## ggsignif 0.6.3 2021-09-09 [1] CRAN (R 4.1.0)
## ggVennDiagram 1.2.1 2022-04-13 [1] Github (gaospecial/ggVennDiagram@db6742d)
## glmnet 4.1-3 2021-11-02 [1] CRAN (R 4.1.0)
## glue * 1.6.2 2022-02-24 [1] CRAN (R 4.1.2)
## Gmisc * 3.0.0 2022-01-03 [1] CRAN (R 4.1.2)
## gplots 3.1.1 2020-11-28 [1] CRAN (R 4.1.0)
## gridExtra 2.3 2017-09-09 [1] CRAN (R 4.1.0)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0)
## gtools 3.9.2 2021-06-06 [1] CRAN (R 4.1.0)
## highr 0.9 2021-04-16 [1] CRAN (R 4.1.0)
## Hmisc 4.6-0 2021-10-07 [1] CRAN (R 4.1.0)
## hms 1.1.1 2021-09-26 [1] CRAN (R 4.1.0)
## htmlTable * 2.4.0 2022-01-04 [1] CRAN (R 4.1.2)
## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.0)
## htmlwidgets 1.5.4 2021-09-08 [1] CRAN (R 4.1.0)
## httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0)
## igraph 1.2.11 2022-01-04 [1] CRAN (R 4.1.2)
## insight 0.17.0 2022-03-29 [1] CRAN (R 4.1.2)
## IRanges * 2.28.0 2021-10-26 [1] Bioconductor
## iterators 1.0.14 2022-02-05 [1] CRAN (R 4.1.2)
## jpeg 0.1-9 2021-07-24 [1] CRAN (R 4.1.0)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.0)
## jsonlite 1.8.0 2022-02-22 [1] CRAN (R 4.1.2)
## kableExtra 1.3.4 2021-02-20 [1] CRAN (R 4.1.2)
## KEGGREST 1.34.0 2021-10-26 [1] Bioconductor
## KernSmooth 2.23-20 2021-05-03 [1] CRAN (R 4.1.2)
## knitr 1.39 2022-04-26 [1] CRAN (R 4.1.2)
## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.1.0)
## lambda.r 1.2.4 2019-09-18 [1] CRAN (R 4.1.0)
## lattice * 0.20-45 2021-09-22 [1] CRAN (R 4.1.2)
## latticeExtra 0.6-29 2019-12-19 [1] CRAN (R 4.1.0)
## leaps 3.1 2020-01-16 [1] CRAN (R 4.1.0)
## libcoin 1.0-9 2021-09-27 [1] CRAN (R 4.1.0)
## lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.0)
## limma 3.50.1 2022-02-17 [1] Bioconductor
## locfit 1.5-9.5 2022-03-03 [1] CRAN (R 4.1.2)
## LOCOM 1.1 2022-08-05 [1] Github (yijuanhu/LOCOM@c181e0f)
## lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.1.0)
## magrittr * 2.0.2 2022-01-26 [1] CRAN (R 4.1.2)
## MASS 7.3-55 2022-01-13 [1] CRAN (R 4.1.2)
## Matrix 1.4-0 2021-12-08 [1] CRAN (R 4.1.0)
## MatrixGenerics * 1.6.0 2021-10-26 [1] Bioconductor
## matrixStats * 0.61.0 2021-09-17 [1] CRAN (R 4.1.0)
## mbzinb 0.2 2022-03-16 [1] local
## memoise 2.0.1 2021-11-26 [1] CRAN (R 4.1.0)
## metagenomeSeq 1.36.0 2021-10-26 [1] Bioconductor
## mgcv 1.8-39 2022-02-24 [1] CRAN (R 4.1.2)
## microbiome 1.16.0 2021-10-26 [1] Bioconductor
## modeltools 0.2-23 2020-03-05 [1] CRAN (R 4.1.0)
## multcomp 1.4-18 2022-01-04 [1] CRAN (R 4.1.2)
## multtest 2.50.0 2021-10-26 [1] Bioconductor
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0)
## mvtnorm 1.1-3 2021-10-08 [1] CRAN (R 4.1.0)
## mycor 0.1.1 2018-04-10 [1] CRAN (R 4.1.0)
## NADA 1.6-1.1 2020-03-22 [1] CRAN (R 4.1.0)
## nlme * 3.1-155 2022-01-13 [1] CRAN (R 4.1.2)
## nnet 7.3-17 2022-01-13 [1] CRAN (R 4.1.2)
## openssl 2.0.0 2022-03-02 [1] CRAN (R 4.1.2)
## permute * 0.9-7 2022-01-27 [1] CRAN (R 4.1.2)
## pheatmap * 1.0.12 2019-01-04 [1] CRAN (R 4.1.0)
## phyloseq * 1.38.0 2021-10-26 [1] Bioconductor
## picante * 1.8.2 2020-06-10 [1] CRAN (R 4.1.0)
## pillar 1.7.0 2022-02-01 [1] CRAN (R 4.1.2)
## pkgbuild 1.3.1 2021-12-20 [1] CRAN (R 4.1.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0)
## pkgload 1.2.4 2021-11-30 [1] CRAN (R 4.1.0)
## plyr 1.8.6 2020-03-03 [1] CRAN (R 4.1.0)
## png 0.1-7 2013-12-03 [1] CRAN (R 4.1.0)
## ppcor 1.1 2015-12-03 [1] CRAN (R 4.1.0)
## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0)
## processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.0)
## protoclust 1.6.3 2019-01-31 [1] CRAN (R 4.1.0)
## proxy 0.4-26 2021-06-07 [1] CRAN (R 4.1.0)
## ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.0)
## pscl 1.5.5 2020-03-07 [1] CRAN (R 4.1.0)
## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0)
## qvalue 2.26.0 2021-10-26 [1] Bioconductor
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0)
## RAIDA 1.0 2022-03-14 [1] local
## RColorBrewer * 1.1-2 2014-12-07 [1] CRAN (R 4.1.0)
## Rcpp * 1.0.8.2 2022-03-11 [1] CRAN (R 4.1.2)
## RcppZiggurat 0.1.6 2020-10-20 [1] CRAN (R 4.1.0)
## RCurl 1.98-1.6 2022-02-08 [1] CRAN (R 4.1.2)
## readr * 2.1.2 2022-01-30 [1] CRAN (R 4.1.2)
## remotes 2.4.2 2021-11-30 [1] CRAN (R 4.1.0)
## reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.1.0)
## reticulate 1.24 2022-01-26 [1] CRAN (R 4.1.2)
## Rfast 2.0.6 2022-02-16 [1] CRAN (R 4.1.2)
## rhdf5 2.38.1 2022-03-10 [1] Bioconductor
## rhdf5filters 1.6.0 2021-10-26 [1] Bioconductor
## Rhdf5lib 1.16.0 2021-10-26 [1] Bioconductor
## rlang 1.0.2 2022-03-04 [1] CRAN (R 4.1.2)
## rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.1.2)
## robustbase 0.93-9 2021-09-27 [1] CRAN (R 4.1.0)
## rpart 4.1.16 2022-01-24 [1] CRAN (R 4.1.2)
## rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.1.0)
## RSpectra 0.16-0 2019-12-01 [1] CRAN (R 4.1.0)
## RSQLite 2.2.10 2022-02-17 [1] CRAN (R 4.1.2)
## rstatix 0.7.0 2021-02-13 [1] CRAN (R 4.1.0)
## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0)
## Rtsne 0.15 2018-11-10 [1] CRAN (R 4.1.0)
## RVenn 1.1.0 2019-07-18 [1] CRAN (R 4.1.0)
## rvest 1.0.2 2021-10-16 [1] CRAN (R 4.1.0)
## S4Vectors * 0.32.3 2021-11-21 [1] Bioconductor
## sandwich 3.0-1 2021-05-18 [1] CRAN (R 4.1.0)
## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.0)
## scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0)
## scatterplot3d 0.3-41 2018-03-14 [1] CRAN (R 4.1.0)
## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.0)
## sf 1.0-7 2022-03-07 [1] CRAN (R 4.1.2)
## shape 1.4.6 2021-05-19 [1] CRAN (R 4.1.0)
## sjlabelled 1.2.0 2022-04-10 [1] CRAN (R 4.1.2)
## sjmisc 2.8.9 2021-12-03 [1] CRAN (R 4.1.0)
## stringi 1.7.6 2021-11-29 [1] CRAN (R 4.1.0)
## stringr 1.4.0 2019-02-10 [1] CRAN (R 4.1.0)
## SummarizedExperiment * 1.24.0 2021-10-26 [1] Bioconductor
## survival 3.3-1 2022-03-03 [1] CRAN (R 4.1.2)
## svglite 2.1.0 2022-02-03 [1] CRAN (R 4.1.2)
## systemfonts 1.0.4 2022-02-11 [1] CRAN (R 4.1.2)
## tensorA 0.36.2 2020-11-19 [1] CRAN (R 4.1.0)
## testthat 3.1.2 2022-01-20 [1] CRAN (R 4.1.2)
## TH.data 1.1-0 2021-09-27 [1] CRAN (R 4.1.0)
## tibble * 3.1.6 2021-11-07 [1] CRAN (R 4.1.0)
## tidyr * 1.2.0 2022-02-01 [1] CRAN (R 4.1.2)
## tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.1.2)
## truncnorm 1.0-8 2018-02-27 [1] CRAN (R 4.1.0)
## tzdb 0.2.0 2021-10-27 [1] CRAN (R 4.1.0)
## umap 0.2.8.0 2022-03-23 [1] CRAN (R 4.1.2)
## units 0.8-0 2022-02-05 [1] CRAN (R 4.1.2)
## usethis * 2.1.5 2021-12-09 [1] CRAN (R 4.1.0)
## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0)
## uuid 1.0-3 2021-11-01 [1] CRAN (R 4.1.0)
## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0)
## vegan * 2.5-7 2020-11-28 [1] CRAN (R 4.1.0)
## VennDiagram * 1.7.3 2022-04-12 [1] CRAN (R 4.1.2)
## viridis * 0.6.2 2021-10-13 [1] CRAN (R 4.1.0)
## viridisLite * 0.4.0 2021-04-13 [1] CRAN (R 4.1.0)
## vroom 1.5.7 2021-11-30 [1] CRAN (R 4.1.0)
## webshot 0.5.3 2022-04-14 [1] CRAN (R 4.1.2)
## withr 2.5.0 2022-03-03 [1] CRAN (R 4.1.2)
## Wrench 1.12.0 2021-10-26 [1] Bioconductor
## xfun 0.30 2022-03-02 [1] CRAN (R 4.1.2)
## XMAS2 * 2.1.8 2022-10-08 [1] local
## XML 3.99-0.9 2022-02-24 [1] CRAN (R 4.1.2)
## xml2 1.3.3 2021-11-30 [1] CRAN (R 4.1.0)
## xtable 1.8-4 2019-04-21 [1] CRAN (R 4.1.0)
## XVector 0.34.0 2021-10-26 [1] Bioconductor
## yaml 2.3.5 2022-02-21 [1] CRAN (R 4.1.2)
## zCompositions 1.4.0 2022-01-13 [1] CRAN (R 4.1.2)
## zlibbioc 1.40.0 2021-10-26 [1] Bioconductor
## zoo 1.8-9 2021-03-09 [1] CRAN (R 4.1.0)
##
## [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library
##
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────