Anchor Families
The Family class provides the fundamental visual block of pyflow. Families serve two distinct roles within suites:
Visually grouping related families/tasks
Logically grouping related families/tasks from an execution perspective
Due to constraints imposed by the order in which ecFlow searches for scripts within the configured files location, by default all tasks with the same name must share the same script located in the files directory (if scripts are deployed by pyflow, they will be deployed to this directory). This means that tasks with the same name must either be avoided or written to share an identical script, which is a significant constraint on encapsulation in object-oriented suite design.
For simple aggregation of tasks, it is encouraged to use pf.Family or derive from it. This provides minimal encapsulation of tasks, but not of scripts. All tasks with the same name will share the same script.
We build such a library of classes and objects so that we can reuse these components (Tasks, Families, Suites) in different contexts. A given task class could be used in a research workflow and then reused in an operational workflow.
However, different contexts may require some differences in the suite execution. To ensure that we still have a concise, maintainable and easily checkable suite, we need to cater for those differences, preferably in a single entity (as opposed to spread out through the suite).
To that end, we introduce a configuration object that handles the differences, and therefore interacts with and configures our objects under each different context.
This results in suites that are configurable for different use cases and contexts, and that build fundamentally different generated suites from the same components.
A configuration object can be constructed manually for different use cases or as a result of parsing configuration files. It can be used to:
Provide constants and data that the suites need in specific cases
Switch functionality on/off or modify it
Configure the hosts on which the tasks run
Specify the locations and details of the data to process
Most importantly, these configuration objects are themselves programmable (they can include code). The suite components can delegate part of the suite definition to these configurators, so the structure of the suite can be determined by logic in the configuration object if necessary.
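As a rough illustration of the idea above, the sketch below uses plain Python with no pyflow calls. `SuiteConfig`, its fields, the example values and `build_tasks` are hypothetical names invented for this example; in a real suite the builder would create pyflow nodes rather than return a list of task names.

```python
from dataclasses import dataclass


@dataclass
class SuiteConfig:
    """Hypothetical configuration object; every name here is illustrative."""
    host: str = "localhost"      # where the tasks should run
    data_dir: str = "/tmp/data"  # location of the data to process
    archive: bool = False        # switch a piece of functionality on/off
    n_chunks: int = 1            # constant used to size the suite

    def processing_tasks(self):
        # Logic lives in the configuration: the number of tasks
        # depends on how many chunks this context asks for.
        return [f"process_{i}" for i in range(self.n_chunks)]


def build_tasks(config):
    """Stand-in for a suite component that delegates structure to the config."""
    tasks = ["retrieve"] + config.processing_tasks()
    if config.archive:
        tasks.append("archive")
    return tasks


# Two contexts, one set of components.
research = SuiteConfig(n_chunks=2)
operational = SuiteConfig(host="hpc", archive=True, n_chunks=4)

print(build_tasks(research))
print(build_tasks(operational))
```

The same `build_tasks` component yields a three-task layout for the research configuration and a six-task layout, including the archive step, for the operational one. All differences between the two contexts are held in the configuration object.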
[2]:
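# CourseSuite, LabelSetter and MyFamily are assumed to be defined earlier in the notebook
with CourseSuite('family_example') as s:
    # A plain family used directly for simple aggregation
    with pf.Family('simple', labels={'example': ''}) as f:
        LabelSetter((f.example, 'example text'))
    # A class derived from pf.Family
    MyFamily('derived_family', 5)
s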
[2]:
suite family_example
  defstatus suspended
  edit ECF_FILES '/path/to/scratch/files/family_example'
  edit ECF_HOME '/path/to/scratch/out'
  edit ECF_JOB_CMD 'bash -c 'export ECF_PORT=%ECF_PORT%; export ECF_HOST=%ECF_HOST%; export ECF_NAME=%ECF_NAME%; export ECF_PASS=%ECF_PASS%; export ECF_TRYNO=%ECF_TRYNO%; export PATH=/usr/local/apps/ecflow/%ECF_VERSION%/bin:$PATH; ecflow_client --init="$$" && %ECF_JOB% && ecflow_client --complete || ecflow_client --abort ' 1> %ECF_JOBOUT% 2>&1 &'
  edit ECF_KILL_CMD 'pkill -15 -P %ECF_RID%'
  edit ECF_STATUS_CMD 'true'
  edit ECF_OUT '%ECF_HOME%'
  label exec_host "localhost"
  family simple
    label example ""
    task set_labels
  endfamily
  family derived_family
    label total_counters "5"
    task derived_family_0
      edit HALF '0'
      edit LIMIT '0'
      label counter_label "count to 0"
    task derived_family_1
      trigger derived_family_0 eq complete
      edit HALF '1'
      edit LIMIT '2'
      label counter_label "count to 2"
    task derived_family_2
      trigger derived_family_1 eq complete
      edit HALF '2'
      edit LIMIT '4'
      label counter_label "count to 4"
    task derived_family_3
      trigger derived_family_2 eq complete
      edit HALF '3'
      edit LIMIT '6'
      label counter_label "count to 6"
    task derived_family_4
      trigger derived_family_3 eq complete
      edit HALF '4'
      edit LIMIT '8'
      label counter_label "count to 8"
  endfamily
endsuite
For more complex functionality containing groups of tasks that require encapsulation we encourage the use of AnchorFamily.
The AnchorFamily class updates the files location according to the relative path of the family from the suite (or previous AnchorFamily). Within an AnchorFamily, all script lookups are relative to this new location, providing isolation and encapsulation.
All tasks with the same name within an AnchorFamily must share the same script located in the files location for that AnchorFamily.
As such it is encouraged to:
Use AnchorFamily to encapsulate independent units within a suite. Typically these are the subtrees that make sense to deploy as a whole.
Use Family to aggregate tasks that could share scripts with each other. This can be within an AnchorFamily.
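The scoping rule described above can be modelled in a few lines of plain Python. This is only a simplified sketch of the rule as stated in the text, not ecFlow's actual search algorithm or pyflow's implementation; `script_path` and its arguments are hypothetical names introduced for illustration.

```python
import posixpath


def script_path(files, node_path, anchors):
    """Return the script path for a task under a simplified lookup model.

    files     -- the suite-level files location
    node_path -- node names from just below the suite down to the task
    anchors   -- set of family paths (as tuples) that are anchor families
    """
    *families, task = node_path
    location = files
    for depth in range(1, len(families) + 1):
        prefix = tuple(families[:depth])
        if prefix in anchors:
            # An anchor moves the files location to its own path
            # relative to the suite; the deepest anchor wins.
            location = posixpath.join(files, *prefix)
    # Plain families do not change the location, so the script is
    # identified by the task name alone within the current scope.
    return posixpath.join(location, task + ".ecf")


anchors = {("f",)}
print(script_path("<files>", ["f1", "test1"], anchors))
print(script_path("<files>", ["f", "f1", "test1"], anchors))
print(script_path("<files>", ["f", "f2", "test2"], anchors))
```

In this model a task named test1 under the plain family f1 resolves to `<files>/test1.ecf`, while a task with the same name inside the anchor f resolves to `<files>/f/test1.ecf`. The two test2 tasks inside the anchor both resolve to `<files>/f/test2.ecf`, so they share one script.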
The following example shows a suite with identical task names using different scripts, by scoping them with the AnchorFamily.
[3]:
with CourseSuite('anchor_families', files=filesdir) as s:
    with pf.Family('f1'):
        pf.Task('test1')  # Script <files>/test1.ecf
    with pf.Family('f2'):
        pf.Task('test1')  # Script <files>/test1.ecf
    with pf.AnchorFamily('f'):
        with pf.Family('f1'):
            pf.Task('test1')  # Script <files>/f/test1.ecf
            pf.Task('test2')  # Script <files>/f/test2.ecf
        with pf.Family('f2'):
            pf.Task('test2')  # Script <files>/f/test2.ecf
s
[3]:
suite anchor_families
  defstatus suspended
  edit ECF_FILES '/path/to/scratch/files'
  edit ECF_HOME '/path/to/scratch/out'
  edit ECF_JOB_CMD 'bash -c 'export ECF_PORT=%ECF_PORT%; export ECF_HOST=%ECF_HOST%; export ECF_NAME=%ECF_NAME%; export ECF_PASS=%ECF_PASS%; export ECF_TRYNO=%ECF_TRYNO%; export PATH=/usr/local/apps/ecflow/%ECF_VERSION%/bin:$PATH; ecflow_client --init="$$" && %ECF_JOB% && ecflow_client --complete || ecflow_client --abort ' 1> %ECF_JOBOUT% 2>&1 &'
  edit ECF_KILL_CMD 'pkill -15 -P %ECF_RID%'
  edit ECF_STATUS_CMD 'true'
  edit ECF_OUT '%ECF_HOME%'
  label exec_host "localhost"
  family f1
    task test1
  endfamily
  family f2
    task test1
  endfamily
  family f
    edit ECF_FILES '/path/to/scratch/files/f'
    edit ECF_INCLUDE '/path/to/scratch/files/f'
    family f1
      task test1
      task test2
    endfamily
    family f2
      task test2
    endfamily
  endfamily
endsuite
This scoping supports two ways of attaching scripts to identically named Tasks with different parameters:
Generate one script per task containing the parameters
Use one script that is parameterised by the Variables on the Families and Tasks
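The two options can be contrasted with a small plain-Python sketch. It relies on ecFlow's `%NAME%` placeholder syntax, but `render` is only a toy stand-in for the real substitution (it ignores escapes, includes and defaults). The script body, the task name `count`, the anchor names `a` and `b`, and the parameter values are all hypothetical.

```python
import re

# Hypothetical script body containing one ecFlow-style placeholder.
TEMPLATE = 'echo "count to %LIMIT%"\n'


def render(template, variables):
    """Toy substitution: replace each %NAME% with its value from `variables`."""
    return re.sub(r"%(\w+)%", lambda m: str(variables[m.group(1)]), template)


# Hypothetical parameters for a task called "count" in two anchor families.
params = {"a": {"LIMIT": 2}, "b": {"LIMIT": 4}}

# Way 1: generate one script per task with the parameters baked in.
# The same task name gets a different file under each anchor's files location.
generated = {f"{anchor}/count.ecf": render(TEMPLATE, variables)
             for anchor, variables in params.items()}

# Way 2: keep a single shared script with its placeholders intact.
# The values come from the variables set on each family or task.
shared = {"count.ecf": TEMPLATE}

for anchor, variables in params.items():
    # Both routes end in the same job text for a given task.
    assert render(shared["count.ecf"], variables) == generated[f"{anchor}/count.ecf"]

print(generated["a/count.ecf"], end="")
print(generated["b/count.ecf"], end="")
```

The first option keeps each script self-contained at the cost of one file per task. The second keeps a single file to maintain and moves the differences into the suite definition.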