In the DSF project structure, a project is the largest element.
A project holds information such as its I/O strategy, configuration, and owner.
A project can contain many workflows. You write the data analysis pipeline logic in a workflow.
Within a project, workflows can be connected into a pipeline by having one workflow refer to another.
Within a workflow, many tasks are connected to form a task pipeline.
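
The hierarchy described above can be sketched in plain Python. This is only an illustrative model, not the DSF API: the `Project` and `Workflow` classes, their fields, and the `execution_order` helper are all hypothetical names chosen for this example. It assumes that a workflow referring to another workflow means it must run after that workflow.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Workflow:
    """Hypothetical stand-in for a workflow: an ordered list of task names."""
    name: str
    tasks: list[str] = field(default_factory=list)
    # Names of other workflows in the same project that this one refers to.
    depends_on: list[str] = field(default_factory=list)


@dataclass
class Project:
    """Hypothetical stand-in for a project, the largest element."""
    name: str
    owner: str
    config: dict = field(default_factory=dict)
    workflows: dict[str, Workflow] = field(default_factory=dict)

    def add(self, workflow: Workflow) -> None:
        self.workflows[workflow.name] = workflow

    def execution_order(self) -> list[str]:
        # Map each workflow to the set of workflows it refers to, then
        # sort so that referenced workflows come first.
        graph = {wf.name: set(wf.depends_on) for wf in self.workflows.values()}
        return list(TopologicalSorter(graph).static_order())


project = Project(name="sales_analysis", owner="alice")
project.add(Workflow("ingest", tasks=["load_csv", "clean"]))
project.add(Workflow("report", tasks=["aggregate", "export"], depends_on=["ingest"]))

order = project.execution_order()
print(order)  # ['ingest', 'report']
```

Here the `report` workflow refers to `ingest`, so the two workflows form a small pipeline inside the project.
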
A workflow can create three types of tasks:

| Task type | Input | Output | Use |
|---|---|---|---|
| loader | external | DSF data store | load external data at the start of the pipeline |
| (calculation) task | DSF data store | DSF data store | calculation |
| terminal | DSF data store | external | write the calculation result to external storage at the end of the pipeline |

DSF automatically creates a DSF data store to save all calculation results.
A loader task loads data from external storage into the DSF data store, and a terminal task writes data from the DSF data store to external storage.
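
The flow of data through the three task types can be illustrated with a minimal sketch. It does not use the real DSF API: a plain dictionary stands in for the DSF data store, two more dictionaries stand in for external storage, and the functions `loader`, `task`, and `terminal` are hypothetical helpers written only for this example.

```python
from typing import Callable

# Stand-ins (assumptions for illustration only).
external_source: dict[str, list[float]] = {"raw.csv": [1.0, 2.0, 3.0]}
external_sink: dict[str, list[float]] = {}
data_store: dict[str, list[float]] = {}  # plays the role of the DSF data store


def loader(source_key: str, store_key: str) -> None:
    """Loader: external -> data store."""
    data_store[store_key] = list(external_source[source_key])


def task(input_key: str, output_key: str,
         func: Callable[[list[float]], list[float]]) -> None:
    """Calculation task: data store -> data store."""
    data_store[output_key] = func(data_store[input_key])


def terminal(store_key: str, sink_key: str) -> None:
    """Terminal: data store -> external."""
    external_sink[sink_key] = list(data_store[store_key])


# A three-step pipeline: load, calculate, write out.
loader("raw.csv", "raw")
task("raw", "doubled", lambda xs: [x * 2 for x in xs])
terminal("doubled", "result.csv")

print(external_sink["result.csv"])  # [2.0, 4.0, 6.0]
```

Note that the intermediate result `doubled` stays in the store, which mirrors the statement that all calculation results are saved there.
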
Every calculation task has a calculation component that determines its calculation method.
DSF provides many calculation components in the dsf_opkg repositories, and users can also define their own.
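
The relationship between a calculation task and its calculation component can be pictured as a simple strategy pattern. The sketch below is an assumption about the general shape, not the actual DSF or dsf_opkg interface: `CalculationComponent`, `Scale`, `CumulativeSum`, and `CalculationTask` are hypothetical names.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class CalculationComponent(ABC):
    """Hypothetical base class: a component defines one calculation method."""

    @abstractmethod
    def calculate(self, values: list[float]) -> list[float]:
        ...


class Scale(CalculationComponent):
    """Plays the role of a ready-made component: multiply by a factor."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def calculate(self, values: list[float]) -> list[float]:
        return [v * self.factor for v in values]


class CumulativeSum(CalculationComponent):
    """Plays the role of a user-defined component: running total."""

    def calculate(self, values: list[float]) -> list[float]:
        total = 0.0
        result = []
        for v in values:
            total += v
            result.append(total)
        return result


@dataclass
class CalculationTask:
    """A task delegates its calculation to whichever component it holds."""
    name: str
    component: CalculationComponent

    def run(self, values: list[float]) -> list[float]:
        return self.component.calculate(values)


scaled = CalculationTask("scale", Scale(10)).run([1.0, 2.0, 3.0])
summed = CalculationTask("cumsum", CumulativeSum()).run([1.0, 2.0, 3.0])
print(scaled)  # [10.0, 20.0, 30.0]
print(summed)  # [1.0, 3.0, 6.0]
```

Swapping the component changes what the task computes without changing the task itself, which is the point of separating the two.
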