Title: | Data Package Containing Only Data and Data Information |
---|---|
Description: | Setup package for the LEEF pipeline which loads / installs all necessary packages and functions to run the pipeline. |
Authors: | Rainer M. Krug [aut, cre], SNF Project 310030_188431 [fnd] |
Maintainer: | Rainer M. Krug <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.9.1 |
Built: | 2024-11-05 02:57:48 UTC |
Source: | https://github.com/LEEF-UZH/LEEF |
Functions starting with `add_...` add a function to a queue which is processed by the corresponding `run_...` command. The functions require exactly two arguments, the first named `input` and the second named `output`. They should return either `TRUE` when run successfully, or `FALSE` when they fail, although this checking is not yet implemented.
add(fun, funname, queue)
add_additor(fun)
add_archiver(fun)
add_extractor(fun)
add_pre_processor(fun)
fun | function which is run when the queue is processed |
funname | name of the function |
queue | name of the queue to which the function is added (`"additor"`, `"archiver"`, `"extractor"`, or `"pre_processor"`) |
add: the generic function which does the adding; normally the specific `add_*` functions are used instead.
add_additor: adds a named function to the queue of additors. If the named function already exists, it will be replaced.
add_archiver: adds a named function to the queue of archivers. If the named function already exists, it will be replaced. These functions archive the results of the current processing step.
add_extractor: adds a named function to the queue of extractors. If the named function already exists, it will be replaced. These functions should extract data from the pre-processed data; the extracted data should be usable for the actual analysis addressing the research question.
add_pre_processor: adds a named function to the queue of pre-processors. If the named function already exists, it will be replaced. These functions should pre-process the raw data. The pre-processed data should be archive-ready, i.e. contain the same information as the raw data, be in an open format, and be compressed if possible.
invisibly the function queue, a `list` which is processed by the corresponding `run_...` command.
## Not run:
## To add the function `cat` to the `additor` queue
add(fun = cat, funname = "cat", queue = "additor")
## To add the function `paste` to the `extractor` queue
add(paste, "paste", "extractor")
## End(Not run)
add_additor(fun = cat)
add_archiver(fun = cat)
add_extractor(fun = paste)
add_pre_processor(fun = paste)
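The contract above can be illustrated with a minimal sketch of a conforming function. The name `my_extractor` and its body are hypothetical and not part of the package; only the two-argument signature and the `TRUE`/`FALSE` return value follow the documented contract.

```r
## Hypothetical processor conforming to the required signature:
## exactly two arguments named `input` and `output`, returning
## TRUE on success and FALSE on failure.
my_extractor <- function(input, output) {
  tryCatch({
    ## read raw data from `input`, write extracted data to `output` ...
    TRUE
  },
  error = function(e) FALSE
  )
}

add_extractor(fun = my_extractor)
```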
The Control Centre app allows:
- sanity checks of the raw data
- running of the pipeline
control_center(rootdir = ".")
rootdir | directory in which all the data directories can be found |
the return value from `runApp()`
## Not run:
control_center()
## End(Not run)
The following steps are done in this function
init_LEEF(
  config_file = system.file("default_config.yml", package = "LEEF"),
  id = NULL
)
config_file | config file to use. If none is specified, the `default_config.yml` included in the package is used. |
id | id which will be appended to the name in the config file, separated by a '.' |
- the config file as specified in the argument `config_file` is read
- the folders as specified in the config file are created if they do not yet exist. If they are not specified, the following default values are used:
  - general.parameter: `00.general.parameter` - the directory containing general configuration files which are used for multiple measurements
  - raw: `0.raw.data` - the raw data
  - pre_processed: `1.pre_processed.data` - the pre-processed Archive Ready Data
  - extracted: `2.extracted.data` - the extracted Research Ready Data
  - archive: `3.archived.data` - the archived data from any of the previous steps or raw data
  - backend: `9.backend` - the backend which contains the Research Ready Data from all previous pipeline runs
  - tools: `tools` - tools needed for running the different processes in the pipeline
- it verifies that a file named `sample_metadata.yml` exists which contains the metadata of the raw data
- all `measurement`, `archive` and `backend` packages are registered
- it verifies that all tools are installed and installs them when needed. This step is specific to the bemovi measurement!
invisible TRUE
## Not run:
init_LEEF(system.file("default_config.yml", package = "LEEF"))
## End(Not run)
This function is a wrapper around `tools::package_dependencies("LEEF", which = "all", recursive = TRUE)` which returns only the packages whose names contain `.LEEF` or `LEEF.`, plus the package LEEF itself.
list_LEEF_packages(recursive = TRUE, versions = FALSE)
recursive | logical: should (reverse) dependencies of (reverse) dependencies (and so on) be included? Defaults to `TRUE`. |
versions | logical: should versions be returned as well? Defaults to `FALSE`. |
This function is a convenience function and only returns useful results when all packages which are dependencies of the LEEF package are prefixed with `LEEF.` or suffixed with `.LEEF`.
list of all installed packages whose names contain `.LEEF` or `LEEF.`, plus the package LEEF itself
## Not run:
list_LEEF_packages()
## End(Not run)
Read or write the directories to be used in the processing. Directories do not have to exist and will be created. Content will be overwritten without confirmation! If no parameter is given, the directories will be returned as a list.
opt_directories(
  general.parameter,
  raw,
  pre_processed,
  extracted,
  archive,
  tools
)
general.parameter | directory containing general configuration files which are used for multiple measurements |
raw | directory containing the raw data |
pre_processed | directory containing the pre-processed data |
extracted | directory containing the extracted data |
archive | directory containing the archived data |
tools | directory in which the tools are located |
list of directories. If values are set, the values before the change are returned.
opt_directories()
opt_directories(raw = "./temp")
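Because setting a value returns the values before the change, a script can change a directory temporarily and restore it afterwards. A sketch, assuming the returned list uses the argument names as element names:

```r
## Sketch: read the configured directories, change one, restore it.
dirs <- opt_directories()                # current directories as a list
old  <- opt_directories(raw = "./temp")  # change `raw`; returns previous values
opt_directories(raw = old$raw)           # restore (assumes a `raw` list element)
```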
This function is an example and can be used as a template for processing the queues in a script. Raw data is always archived using the "none" compression.
process(submitter, timestamp, process = TRUE, ...)
submitter | name of submitter. When provided, will override the one in the `sample_metadata.yml` file. |
timestamp | timestamp for the data. When provided, will override the one in the `sample_metadata.yml` file. |
process | if |
... | additional arguments for the different queues |
invisibly TRUE
## Not run:
process()
## End(Not run)
This function is an example and can be used as a template for processing the queues in a script. It uses the archiver "none" for the raw and pre-processed data, useful for already compressed and large data, e.g. bemovi.
process_raw_comp_none(submitter, timestamp, ...)
submitter | name of the submitter of the data to the pipeline. Will be added to the metadata. |
timestamp | timestamp of the submission of the data to the pipeline. This should be in the format |
... | additional arguments for the different queues |
invisibly TRUE
## Not run:
process_raw_comp_none()
## End(Not run)
Register the functions to be used from the packages in the config file.
register_packages(packages)
packages | list of packages. Each element must contain the elements |
invisibly a list containing the results of the register commands
## Not run:
register_packages(getOption("LEEF")$measurement_packages)
## End(Not run)
Run all the functions in the processing queue named `queue`.
run(input, output, queue)
input | directory containing the input data in folders with the name of the methodology (e.g. `bemovi`) |
output | directory in which the results will be written in a folder with the name of the methodology (e.g. `bemovi`) |
queue | name of the queue to run (`"additor"`, `"archiver"`, `"extractor"`, or `"pre_processor"`) |
returns the results of the queue as a vector of the length of the queue. If an element is `TRUE`, the corresponding function was run successfully (i.e. returned `TRUE`).
## Not run:
run(
  input = "./input",
  output = "./output",
  queue = "extractor"
)
## End(Not run)
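Since the return value is a logical vector over the queue, a script can check it after the call. A sketch (the warning text is illustrative, and the `input`/`output` directories are assumed to exist):

```r
## Sketch: run the extractor queue and flag failed functions
## (elements of the result vector that are FALSE).
results <- run(input = "./input", output = "./output", queue = "extractor")
if (!all(results)) {
  warning("some extractor functions failed")
}
```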
Run all the additors registered with `add_additor()`.
run_additors()
returns the results of the queue as a vector of the length of the queue. If an element is `TRUE`, the corresponding function was run successfully (i.e. returned `TRUE`).
## Not run:
run_additors()
## End(Not run)
Run all the archivers registered with `add_archiver()`.
run_archivers(input, output)
input | directory to be archived, including subdirectories |
output | directory in which the archive will be created |
returns the results of the queue as a vector of the length of the queue. If an element is `TRUE`, the corresponding function was run successfully (i.e. returned `TRUE`).
## Not run:
run_archivers(
  input = "./input",
  output = "./output"
)
## End(Not run)
Run all the extractors registered with `add_extractor()`.
run_extractors()
returns the results of the queue as a vector of the length of the queue. If an element is `TRUE`, the corresponding function was run successfully (i.e. returned `TRUE`).
## Not run:
run_extractors()
## End(Not run)
Run all the pre-processors registered with `add_pre_processor()`.
run_pre_processors()
returns the results of the queue as a vector of the length of the queue. If an element is `TRUE`, the corresponding function was run successfully (i.e. returned `TRUE`).
## Not run:
run_pre_processors()
## End(Not run)
Split the bemovi folder into a number of `bemovi.` folders with a maximum of `per_batch` video files each.
split_bemovi(
  per_batch = 30,
  bemovi_dir = file.path(".", "0.raw.data"),
  overwrite = TRUE
)
per_batch | maximum number of movies per batch |
bemovi_dir | base directory in which the |
overwrite | if |
the maximum id used
## Not run:
split_bemovi(per_batch = 5)
## End(Not run)