Title: | Conduct Simulation Studies with a Minimal Amount of Source Code |
---|---|
Description: | Tool for statistical simulations that have two components. One component generates the data and the other one analyzes the data. The main aims of the package are the reduction of the administrative source code (mainly loops and management code for the results) and a simple applicability of the package that allows the user to quickly learn how to work with it. Parallel computing is also supported. Finally, convenient functions are provided to summarize the simulation results. |
Authors: | Marsel Scheer [aut, cre] |
Maintainer: | Marsel Scheer |
License: | GPL-3 |
Version: | 1.1.7.9000 |
Built: | 2025-03-10 05:05:55 UTC |
Source: | https://github.com/marselscheer/simtool |
Generates data according to all provided constellations in data_grid and applies all provided constellations in proc_grid to them.
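To make concrete what kind of administrative code this is meant to replace, the following base R sketch writes out the loops by hand: each row of a data grid generates a data set, and each row of a procedure grid is applied to it. It is an illustration only, not the package's implementation; the names `results` and `replication`, and the choice to reuse one generated data set for all procedures, are assumptions made for this sketch.

```r
set.seed(1)

# Grids built with base R; stringsAsFactors = FALSE keeps function names as characters.
data_grid <- expand.grid(fun = "rnorm", n = c(5L, 10L), stringsAsFactors = FALSE)
proc_grid <- expand.grid(proc = c("median", "length"), stringsAsFactors = FALSE)
replications <- 2

results <- list()
for (i in seq_len(nrow(data_grid))) {
  for (r in seq_len(replications)) {
    # Generate one data set for this constellation and replication.
    dat <- do.call(data_grid$fun[i], list(n = data_grid$n[i]))
    for (j in seq_len(nrow(proc_grid))) {
      # Apply every analyzing function to the same generated data.
      results[[length(results) + 1L]] <- data.frame(
        fun = data_grid$fun[i],
        n = data_grid$n[i],
        replication = r,
        proc = proc_grid$proc[j],
        result = do.call(proc_grid$proc[j], list(dat))
      )
    }
  }
}
results <- do.call(rbind, results)
results
```

Three nested loops plus result bookkeeping are needed even for this tiny study, which is the overhead the package description refers to.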
eval_tibbles(
  data_grid,
  proc_grid = expand_tibble(proc = "length"),
  replications = 1,
  discard_generated_data = FALSE,
  post_analyze = identity,
  summary_fun = NULL,
  group_for_summary = NULL,
  ncpus = 1L,
  cluster = NULL,
  cluster_seed = rep(12345, 6),
  cluster_libraries = NULL,
  cluster_global_objects = NULL,
  envir = globalenv(),
  simplify = TRUE
)
Argument | Description |
---|---|
data_grid | a |
proc_grid | similar as |
replications | number of replications for the simulation |
discard_generated_data | if |
post_analyze | this is a convenience function that is applied directly after the data-analyzing function. If this function has an argument |
summary_fun | named list of univariate functions to summarize the results (numeric or logical) over the replications, e.g. list(mean = mean, sd = sd). |
group_for_summary | if the result returned by the data-analyzing function or |
ncpus | a cluster of |
cluster | a cluster generated by the |
cluster_seed | if the simulation is run in parallel, then the combined multiple-recursive generator from L'Ecuyer (1999) is used to generate random numbers. Thus |
cluster_libraries | a character vector specifying the packages that should be loaded by the workers. |
cluster_global_objects | a character vector specifying the names of R objects in the global environment that should be exported to the global environment of every worker. |
envir | must be provided if the functions specified in |
simplify | usually the result column is nested; by default, an attempt is made to unnest it. |
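The cluster_seed entry mentions L'Ecuyer's combined multiple-recursive generator. The base R sketch below shows what separate random number streams of that generator look like, using only the `parallel` package. How eval_tibbles turns cluster_seed into worker streams is not shown in this page, so this is background on the generator and not a description of the package internals; the helper `draw_from` is hypothetical.

```r
# Remember the current RNG settings so they can be restored afterwards.
old_kind <- RNGkind()

RNGkind("L'Ecuyer-CMRG")
set.seed(12345)

# With this generator, .Random.seed holds one kind code plus six seed integers.
stream1 <- .Random.seed
# A second, separate stream, e.g. for a second worker.
stream2 <- parallel::nextRNGStream(stream1)

# Hypothetical helper: draw n uniform numbers starting from a given stream state.
draw_from <- function(stream, n = 3) {
  assign(".Random.seed", stream, envir = globalenv())
  runif(n)
}

a1 <- draw_from(stream1)
a2 <- draw_from(stream2)
a1_again <- draw_from(stream1)

# Restore the previous RNG settings.
do.call(RNGkind, as.list(old_kind))

identical(a1, a1_again)  # same stream state gives the same draws
identical(a1, a2)        # different streams give different draws
```

Fixing the stream states is what makes a parallel simulation reproducible regardless of which worker finishes first.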
The returned object is a list of class eval_tibbles, where the element simulations contains the results of the simulation.
If cluster is provided by the user, the function eval_tibbles will NOT stop the cluster; this has to be done by the user. Conducting parallel simulations by specifying ncpus will internally create a cluster and stop it after the simulation is done.
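A minimal sketch of the user-managed case, assuming simTool is installed and that two local worker processes can be started. The cluster is created with `parallel::makeCluster()` and stopped in a `finally` clause so that the workers are shut down even if the simulation fails; the grids are the small ones used elsewhere in this page.

```r
library(simTool)

# User-created cluster: eval_tibbles() will not stop it.
cl <- parallel::makeCluster(2L)

res <- tryCatch(
  eval_tibbles(
    expand_tibble(fun = "rnorm", n = c(5L, 10L)),
    expand_tibble(proc = "median"),
    replications = 2,
    cluster = cl
  ),
  # Stop the workers whether or not the simulation succeeded.
  finally = parallel::stopCluster(cl)
)

res
```

When only ncpus is given, no such cleanup is needed because the cluster is created and stopped internally.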
Marsel Scheer
library(simTool)

rng <- function(data, ...) {
  ret <- range(data)
  names(ret) <- c("min", "max")
  ret
}

### The following line is only necessary
### if the examples are not executed in the global
### environment, which for instance is the case when
### the online documentation
### http://marselscheer.github.io/simTool/reference/eval_tibbles.html
### is built. In such a case eval_tibbles() would search for the
### above defined function rng() in the global environment, where
### it does not exist!
eval_tibbles <- purrr::partial(eval_tibbles, envir = environment())

dg <- expand_tibble(fun = "rnorm", n = c(5L, 10L))
pg <- expand_tibble(proc = c("rng", "median", "length"))

eval_tibbles(dg, pg, rep = 2, simplify = FALSE)
eval_tibbles(dg, pg, rep = 2)
eval_tibbles(dg, pg,
  rep = 2,
  post_analyze = purrr::compose(as.data.frame, t)
)
eval_tibbles(dg, pg, rep = 2, summary_fun = list(mean = mean, sd = sd))

regData <- function(n, SD) {
  data.frame(
    x = seq(0, 1, length = n),
    y = rnorm(n, sd = SD)
  )
}

eg <- eval_tibbles(
  expand_tibble(fun = "regData", n = 5L, SD = 1:2),
  expand_tibble(proc = "lm", formula = c("y~x", "y~I(x^2)")),
  replications = 3
)
eg

### keep the row names of the coefficient matrix as a column
preserve_rownames <- function(mat) {
  rn <- rownames(mat)
  ret <- tibble::as_tibble(mat)
  ret$term <- rn
  ret
}

eg <- eval_tibbles(
  expand_tibble(fun = "regData", n = 5L, SD = 1:2),
  expand_tibble(proc = "lm", formula = c("y~x", "y~I(x^2)")),
  post_analyze = purrr::compose(preserve_rownames, coef, summary),
  # post_analyze = broom::tidy, # is a nice out of the box alternative
  summary_fun = list(mean = mean, sd = sd),
  group_for_summary = "term",
  replications = 3
)
eg$simulation

dg <- expand_tibble(fun = "rexp", rate = c(10, 100), n = c(50L, 100L))
pg <- expand_tibble(proc = c("t.test"), conf.level = c(0.8, 0.9, 0.95))
et <- eval_tibbles(dg, pg,
  ncpus = 1,
  replications = 10^1,
  post_analyze = function(ttest, .truth) {
    mu <- 1 / .truth$rate
    ttest$conf.int[1] <= mu && mu <= ttest$conf.int[2]
  },
  summary_fun = list(mean = mean, sd = sd)
)
et

dg <- dplyr::bind_rows(
  expand_tibble(fun = "rexp", rate = 10, .truth = 1 / 10, n = c(50L, 100L)),
  expand_tibble(fun = "rnorm", .truth = 0, n = c(50L, 100L))
)
pg <- expand_tibble(proc = c("t.test"), conf.level = c(0.8, 0.9, 0.95))
et <- eval_tibbles(dg, pg,
  ncpus = 1,
  replications = 10^1,
  post_analyze = function(ttest, .truth) {
    ttest$conf.int[1] <= .truth && .truth <= ttest$conf.int[2]
  },
  summary_fun = list(mean = mean, sd = sd)
)
et

### need to remove the locally adapted eval_tibbles()
### otherwise executing the examples would mask
### eval_tibbles from simTool-namespace.
rm(eval_tibbles)
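The examples pass `summary_fun = list(mean = mean, sd = sd)` together with a post_analyze function that returns a logical coverage indicator. As a rough base R illustration of what summarizing over replications means, the sketch below applies each function in the named list to a vector of per-replication results. The vector `covered` is made-up illustration data, and the use of `vapply` is an assumption for this sketch, not the package's internal code.

```r
summary_fun <- list(mean = mean, sd = sd)

# Hypothetical results of four replications: did the interval cover the truth?
covered <- c(TRUE, TRUE, FALSE, TRUE)

# Apply every summary function to the replication results;
# the names of the list become the names of the summaries.
summarized <- vapply(summary_fun, function(f) f(covered), numeric(1))
summarized
```

Here the `mean` entry is the estimated coverage probability, which is how the confidence-interval examples above can be read.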
tibble from All Combinations
Actually a wrapper for expand.grid, but character vectors will stay as characters.
expand_tibble(...)
Argument | Description |
---|---|
... | vectors, factors or a list containing these. |
See expand.grid, but instead of a data.frame a tibble is returned.
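The difference to plain expand.grid can be seen in base R alone. By default, `expand.grid()` converts character vectors to factors; setting `stringsAsFactors = FALSE` keeps them as characters, which is the behaviour described for expand_tibble. That expand_tibble is implemented exactly this way is an assumption; the sketch only demonstrates the base R side.

```r
# Base R default: character vectors are turned into factors.
grid_default <- expand.grid(fun = "rnorm", mean = 1:2, sd = 2:3)
class(grid_default$fun)  # "factor"

# Characters stay characters when stringsAsFactors = FALSE.
grid_chr <- expand.grid(
  fun = "rnorm", mean = 1:2, sd = 2:3,
  stringsAsFactors = FALSE
)
class(grid_chr$fun)  # "character"
nrow(grid_chr)       # 4: one row per combination of mean and sd
```

Keeping function names as characters matters because the first column of a grid holds names of functions that are later looked up and called.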
Marsel Scheer
expand_tibble(fun = "rnorm", mean = 1:4, sd = 2:5)
Prints objects created by eval_tibbles()
## S3 method for class 'eval_tibbles'
print(x, ...)
Argument | Description |
---|---|
x | object of class eval_tibbles |
... | not used; only necessary to define the function consistently with respect to |
Marsel Scheer