This is a subclass that is compatible with data obtained from either 16S rRNA marker-gene sequencing or shotgun metagenomics sequencing.
It inherits all methods from the abstract class omics and only adapts the initialize function.
It supports BIOM format data (v2.1.0, http://biom-format.org/) in both HDF5 and JSON format. Pre-existing data structures or text files can also be used.
When omics data is very large, loading it becomes expensive. It is therefore recommended to undo changes with the reset() method rather than reloading the data.
Because every omics class keeps an internal, memory-efficient backup of the data, resetting changes is effectively instant.
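A minimal sketch of that workflow follows. It assumes that reset() takes no arguments (its signature is not shown on this page); the input files and the foldchange() call are copied from the examples further down, and the guard only makes the snippet a no-op when the packages are not installed.

```r
# Only run when both packages are available.
has_omicflow <- requireNamespace("OmicFlow", quietly = TRUE) &&
  requireNamespace("ggplot2", quietly = TRUE)

if (has_omicflow) {
  library("ggplot2")
  library("OmicFlow")

  # Load the example data shipped with the package (the expensive step).
  obj <- metagenomics$new(
    metaData = system.file("extdata", "metadata.tsv", package = "OmicFlow"),
    countData = system.file("extdata", "counts.tsv", package = "OmicFlow"),
    featureData = system.file("extdata", "features.tsv", package = "OmicFlow")
  )

  # An analysis step that may alter the loaded tables.
  dfe <- obj$foldchange(
    feature_rank = "Genus",
    paired = FALSE,
    condition.group = "treatment",
    condition_A = c("healthy"),
    condition_B = c("tumor")
  )

  # Restore the object from its internal backup instead of reloading the files.
  # Assumption: reset() is called without arguments.
  obj$reset()
}
```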
Super class
omics -> metagenomics
Active bindings
treeData: A "phylo" class, see as.phylo.
Methods
Inherited methods
metagenomics$new()
Initializes the metagenomics class object.
Usage
metagenomics$new(
  countData = NULL,
  metaData = NULL,
  featureData = NULL,
  treeData = NULL,
  biomData = NULL,
  feature_names = c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species")
)
Arguments
countData: A path to an existing file or a dense/sparse Matrix.
metaData: A path to an existing file, a data.table or a data.frame.
featureData: A path to an existing file, a data.table or a data.frame.
treeData: A path to an existing newick file or an object of class "phylo", see read.tree.
biomData: A path to an existing biom file, version 2.1.0 (http://biom-format.org/), see h5read.
feature_names: A character vector of feature names that fit the supplied featureData.
metagenomics$write_biom()
Creates a BIOM file in HDF5 format from the items loaded via new(). The file is compatible with the Python biom-format version 2.1, see http://biom-format.org.
Arguments
filename: A character variable giving either the full path or the filename of the biom file (e.g. output.biom).
Examples
library("OmicFlow")
metadata_file <- system.file("extdata", "metadata.tsv", package = "OmicFlow")
counts_file <- system.file("extdata", "counts.tsv", package = "OmicFlow")
features_file <- system.file("extdata", "features.tsv", package = "OmicFlow")
tree_file <- system.file("extdata", "tree.newick", package = "OmicFlow")
taxa <- metagenomics$new(
  metaData = metadata_file,
  countData = counts_file,
  featureData = features_file,
  treeData = tree_file
)
taxa$write_biom(filename = "output.biom")
file.remove("output.biom")
metagenomics$foldchange()
Differential feature expression (DFE) on Total Sample Sum (TSS) transformed values, for both paired and non-paired tests.
The function performs feature agglomeration, subsets the data to remove NAs in condition.group, finds sample pairs, and finally scales the features prior to fold-change computation.
Depending on the transform method, fold-changes are computed either via subtraction or via division.
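The relationship between the two fold-change variants can be sketched in base R on a toy count matrix. This is not OmicFlow's implementation: the data are made up, the direction (condition B relative to condition A) is an assumption, and so is the reading that subtraction applies to log-transformed values while division applies to untransformed proportions.

```r
# Hypothetical count matrix: 3 features (rows) x 4 samples (columns).
counts <- matrix(
  c(10, 20, 40, 80,
    30, 30, 30, 30,
    60, 50, 30, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("f1", "f2", "f3"), c("s1", "s2", "s3", "s4"))
)
condition <- c("healthy", "healthy", "tumor", "tumor")

# Total Sample Sum (TSS) scaling: divide every column by its column total,
# so each sample sums to 1.
tss <- sweep(counts, 2, colSums(counts), "/")

# Mean relative abundance per feature within each condition.
mean_A <- rowMeans(tss[, condition == "healthy", drop = FALSE])
mean_B <- rowMeans(tss[, condition == "tumor", drop = FALSE])

# Untransformed values: fold-change as a ratio (division).
fc_ratio <- mean_B / mean_A

# Log-transformed values: fold-change as a difference (subtraction).
fc_log2 <- log2(mean_B) - log2(mean_A)

# Both routes describe the same change: log2 of the ratio equals the difference.
stopifnot(isTRUE(all.equal(log2(fc_ratio), fc_log2)))
```

Real count tables usually contain zeros, in which case the logarithm needs a pseudocount or a similar safeguard; the toy data avoid that on purpose.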
Usage
metagenomics$foldchange(
  condition.group,
  condition_A,
  condition_B,
  group_by = NULL,
  feature_rank = "FEATURE_ID",
  feature_filter = NULL,
  paired = FALSE,
  normalize = FALSE,
  pvalue.threshold = 0.05,
  logfold.threshold = 0.06,
  abundance.threshold = 0
)
Arguments
condition.group: A character variable of an existing column name in metaData, wherein the conditions A and B are located.
condition_A: A character value or vector of characters.
condition_B: A character value or vector of characters.
group_by: A character variable of an existing column in metaData to split the table in chunks prior to fold-change computation (default: NULL).
feature_rank: A character or vector of characters in the featureData to aggregate via feature_merge() (default: "FEATURE_ID").
feature_filter: A character or vector of characters to remove features via regex pattern (default: NULL).
paired: A boolean value; pairing is only applicable when a SAMPLEPAIR_ID column exists within the metaData (default: FALSE). See wilcox.test and samplepair_subset().
normalize: A boolean value indicating whether to normalize via Total Sample Sums (TSS) or not (default: FALSE).
pvalue.threshold: A numeric value used as a p-value threshold to label and color significant features (default: 0.05).
logfold.threshold: A numeric value used as a fold-change threshold to label and color significantly expressed features (default: 0.06).
abundance.threshold: A numeric value used as an abundance threshold to size the scatter dots based on their mean abundance (default: 0).
Returns
data: A long data.table table.
volcano_plot: A ggplot object.
A: A data.table table for (each) condition A.
B: A data.table table for (each) condition B.
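To illustrate what the paired argument and the two thresholds govern, here is a base R sketch on made-up values for a single feature. It is not the package's code: the data are hypothetical, and the rule for calling a feature significant (p-value below pvalue.threshold and absolute log2 fold-change above logfold.threshold) is an assumption.

```r
# Hypothetical relative abundances of one feature in six matched sample pairs.
healthy <- c(0.10, 0.12, 0.08, 0.15, 0.11, 0.09)
tumor   <- c(0.20, 0.25, 0.13, 0.31, 0.19, 0.24)

# Non-paired: Wilcoxon rank-sum test, ignores which samples belong together.
p_unpaired <- wilcox.test(healthy, tumor, paired = FALSE)$p.value

# Paired: Wilcoxon signed-rank test on the within-pair differences.
p_paired <- wilcox.test(healthy, tumor, paired = TRUE)$p.value

# Log2 fold-change of tumor relative to healthy.
log2fc <- log2(mean(tumor) / mean(healthy))

# Assumed labeling rule, using the default thresholds from the usage above.
pvalue.threshold <- 0.05
logfold.threshold <- 0.06
significant <- p_paired < pvalue.threshold && abs(log2fc) > logfold.threshold
```

The paired test is only meaningful when the two vectors are ordered so that position i in both refers to the same sample pair, which is what a SAMPLEPAIR_ID column provides.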
Examples
library("ggplot2")
library("OmicFlow")
metadata_file <- system.file("extdata", "metadata.tsv", package = "OmicFlow")
counts_file <- system.file("extdata", "counts.tsv", package = "OmicFlow")
features_file <- system.file("extdata", "features.tsv", package = "OmicFlow")
obj <- metagenomics$new(
  metaData = metadata_file,
  countData = counts_file,
  featureData = features_file
)
dfe <- obj$foldchange(feature_rank = "Genus",
                      paired = FALSE,
                      condition.group = "treatment",
                      condition_A = c("healthy"),
                      condition_B = c("tumor"))
Examples
## ------------------------------------------------
## Method `metagenomics$write_biom()`
## ------------------------------------------------
library("OmicFlow")
metadata_file <- system.file("extdata", "metadata.tsv", package = "OmicFlow")
counts_file <- system.file("extdata", "counts.tsv", package = "OmicFlow")
features_file <- system.file("extdata", "features.tsv", package = "OmicFlow")
tree_file <- system.file("extdata", "tree.newick", package = "OmicFlow")
taxa <- metagenomics$new(
  metaData = metadata_file,
  countData = counts_file,
  featureData = features_file,
  treeData = tree_file
)
#> ✔ metaData template passed the JSON validation.
#> ℹ Checking for duplicated identifiers ..
#> ✔ featureData is loaded.
#> ✔ countData is loaded.
#> ✔ treeData is loaded.
#> ℹ Final steps .. cleaning & creating back-up
#>
#> ── <metagenomics> object
#> metaData: 9 variables × 4 samples
#> countData: 4 samples × 242 features
#> featureData: 7 attributes × 242 features
#> treeData: 242 tips × 241 nodes
taxa$write_biom(filename = "output.biom")
file.remove("output.biom")
#> [1] TRUE
## ------------------------------------------------
## Method `metagenomics$foldchange()`
## ------------------------------------------------
library("ggplot2")
library("OmicFlow")
metadata_file <- system.file("extdata", "metadata.tsv", package = "OmicFlow")
counts_file <- system.file("extdata", "counts.tsv", package = "OmicFlow")
features_file <- system.file("extdata", "features.tsv", package = "OmicFlow")
obj <- metagenomics$new(
  metaData = metadata_file,
  countData = counts_file,
  featureData = features_file
)
#> ✔ metaData template passed the JSON validation.
#> ℹ Checking for duplicated identifiers ..
#> ✔ featureData is loaded.
#> ✔ countData is loaded.
#> ℹ Final steps .. cleaning & creating back-up
#>
#> ── <metagenomics> object
#> metaData: 9 variables × 4 samples
#> countData: 4 samples × 242 features
#> featureData: 7 attributes × 242 features
dfe <- obj$foldchange(feature_rank = "Genus",
                      paired = FALSE,
                      condition.group = "treatment",
                      condition_A = c("healthy"),
                      condition_B = c("tumor"))
#>
#> ── <metagenomics> object
#> metaData: 9 variables × 4 samples
#> countData: 4 samples × 64 features
#> featureData: 7 attributes × 64 features
