Object-Oriented Suites

Suite Structural Layout

As a pyflow user, you are encouraged to use Python's with statement to build the structure of a suite, so that the code mirrors the graphical ecFlow tree. Dependencies are then added to form the directed graph that governs execution.

The example below creates a simple initial suite with interdependent tasks. In software terms, it is essentially an example of procedural programming.

[2]:
with pf.Suite('first_suite') as s:

    with pf.Family('family1') as f1:
        t1 = pf.Task('t1')
        with pf.Task('t2') as t2:
            pf.Variable('FOO', 'bar')

        t1 >> t2

    with pf.Family('family2') as f2:
        t1 = pf.Task('t1')
        t2 = pf.Task('t2')
        t1 >> t2

    f1 >> f2

s
[2]:
suite first_suite
  edit ECF_JOB_CMD 'bash -c 'export ECF_PORT=%ECF_PORT%; export ECF_HOST=%ECF_HOST%; export ECF_NAME=%ECF_NAME%; export ECF_PASS=%ECF_PASS%; export ECF_TRYNO=%ECF_TRYNO%; export PATH=/usr/local/apps/ecflow/%ECF_VERSION%/bin:$PATH; ecflow_client --init="$$" && %ECF_JOB% && ecflow_client --complete || ecflow_client --abort ' 1> %ECF_JOBOUT% 2>&1 &'
  edit ECF_KILL_CMD 'pkill -15 -P %ECF_RID%'
  edit ECF_STATUS_CMD 'true'
  edit ECF_OUT '%ECF_HOME%'
  label exec_host "default"
  family family1
    task t1
    task t2
      trigger t1 eq complete
      edit FOO 'bar'
  endfamily
  family family2
    trigger family1 eq complete
    task t1
    task t2
      trigger t1 eq complete
  endfamily
endsuite
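
To make the mechanism concrete, the following is a toy sketch of how a with statement can assemble a tree and how >> can record dependencies. It is not pyflow's implementation: the Node class, its _stack attribute and its render method are hypothetical names invented for illustration, and only the standard library is used.

```python
class Node:
    """Toy stand-in for a suite, family or task (hypothetical, not pyflow code)."""

    _stack = []  # nodes whose `with` block is currently open, innermost last

    def __init__(self, name):
        self.name = name
        self.children = []
        self.triggers = []
        # A node created inside an open `with` block attaches to that block's node.
        if Node._stack:
            Node._stack[-1].children.append(self)

    def __enter__(self):
        Node._stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        Node._stack.pop()
        return False  # never swallow exceptions

    def __rshift__(self, other):
        # `a >> b` records that b waits for a, and returns b so chains work.
        other.triggers.append(f"{self.name} eq complete")
        return other

    def render(self, indent=0):
        """Return the tree as a list of indented lines."""
        pad = "  " * indent
        lines = [f"{pad}{self.name}"]
        lines += [f"{pad}  trigger {t}" for t in self.triggers]
        for child in self.children:
            lines += child.render(indent + 1)
        return lines


with Node("first_suite") as s:
    with Node("family1") as f1:
        t1 = Node("t1")
        t2 = Node("t2")
        t1 >> t2
    with Node("family2") as f2:
        Node("t1")
    f1 >> f2

print("\n".join(s.render()))
```

The indentation of the Python source follows the indentation of the resulting tree, which is the property that makes the with-based layout easy to read.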

Procedural programming is adequate for building simple suites, but to manage complex suites over a long lifecycle we encourage drawing inspiration from object-oriented software development.

Suites can be split into objects that are derived from pyflow components. Suites can then be assembled from those configurable and reusable objects.

Deriving From Task

Probably the most important pyflow class to subclass is pf.Task. This object describes what should be carried out as one executable unit.

Consider the following non-object-oriented task definition built within a Family.

[3]:
with pf.Family('f') as f:

    variables = {
        'HALF': 7,
        'LIMIT': 2*7
    }

    labels = {
        'a_label': 'with a value'
    }

    t = pf.Task('my_task', labels=labels, defstatus=pf.state.suspended, variables=variables)

    # Note that t is incomplete at this point: it has no script until one is assigned below
    t.script = [
        'echo "This is a counting task ..."',
        'for i in $(seq 1 $HALF); do echo "count $i/$LIMIT"; done',
        'i=$((HALF+1)); while [ $i -le $LIMIT ]; do echo "count $i/$LIMIT" ; i=$((i+1)); done'
    ]

f
[3]:
  family f
    task my_task
      defstatus suspended
      edit HALF '7'
      edit LIMIT '14'
      label a_label "with a value"
  endfamily

As a suite grows, and the number of tasks increases, the complexity of managing all of these components becomes prohibitive.

We wish to encapsulate all of the functionality related to this task in a single object. Because we want to reuse that functionality, we organise such objects into classes, which should be appropriately configurable.

As the number of tasks increases, we can re-use the class to create objects with similar behaviour. This in turn will dramatically reduce the complexity of the families and then of the suites.

The task above can now be defined as a reusable class.

[4]:
class MyTask(pf.Task):

    """Counts to double a given number: the first half with a for loop, the second half with a while loop"""

    def __init__(self, name, default_value=0, **kwargs):

        variables = {
            'HALF': default_value,
            'LIMIT': 2*default_value,
        }
        variables.update(kwargs.pop('variables', {}))

        labels = {
            'counter_label': 'count to {}'.format(2*default_value)
        }

        script = [
            'echo "This is a counting task named {}"'.format(name),
            'for i in $(seq 1 $HALF); do echo "count $i/$LIMIT"; done',
            'i=$((HALF+1)); while [ $i -le $LIMIT ]; do echo "count $i/$LIMIT" ; i=$((i+1)); done'
        ]

        super().__init__(name,
                         script=script,
                         labels=labels,
                         variables=variables,
                         **kwargs)


with pf.Suite('CountingSuite', files=os.path.join(filesdir, 'CountingSuite')) as s:
    with pf.Family('F') as f:
        MyTask('Seven', 7, defstatus=pf.state.suspended)
        MyTask('Five', 5)

s
[4]:
suite CountingSuite
  edit ECF_FILES '/path/to/scratch/files/CountingSuite'
  edit ECF_JOB_CMD 'bash -c 'export ECF_PORT=%ECF_PORT%; export ECF_HOST=%ECF_HOST%; export ECF_NAME=%ECF_NAME%; export ECF_PASS=%ECF_PASS%; export ECF_TRYNO=%ECF_TRYNO%; export PATH=/usr/local/apps/ecflow/%ECF_VERSION%/bin:$PATH; ecflow_client --init="$$" && %ECF_JOB% && ecflow_client --complete || ecflow_client --abort ' 1> %ECF_JOBOUT% 2>&1 &'
  edit ECF_KILL_CMD 'pkill -15 -P %ECF_RID%'
  edit ECF_STATUS_CMD 'true'
  edit ECF_OUT '%ECF_HOME%'
  label exec_host "default"
  family F
    task Seven
      defstatus suspended
      edit HALF '7'
      edit LIMIT '14'
      label counter_label "count to 14"
    task Five
      edit HALF '5'
      edit LIMIT '10'
      label counter_label "count to 10"
  endfamily
endsuite
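
The constructor of MyTask merges caller-supplied variables over its own defaults before forwarding the remaining keyword arguments. The sketch below isolates that merge using plain dictionaries, so it runs without pyflow. The helper name build_variables is hypothetical and exists only for illustration.

```python
def build_variables(default_value, **kwargs):
    """Mirror the merge performed in MyTask.__init__, using plain dictionaries."""
    variables = {
        "HALF": default_value,
        "LIMIT": 2 * default_value,
    }
    # pop() removes 'variables' from kwargs, so it is not forwarded a second time.
    variables.update(kwargs.pop("variables", {}))
    return variables, kwargs


merged, remaining = build_variables(
    7,
    variables={"LIMIT": 10, "EXTRA": "x"},
    defstatus="suspended",
)
print(merged)
print(remaining)
```

Using pop rather than get matters here: if 'variables' stayed in kwargs, the call super().__init__(name, variables=variables, **kwargs) would receive the same keyword twice and Python would raise a TypeError. Note also that in MyTask the label text is computed from default_value, so a caller who overrides LIMIT this way would still see the label derived from the default.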

Deriving from Family and other pyflow objects

The same process can be used to derive from families or other pyflow-related classes. In this manner we can build up configurable functionality piece by piece.

Note how the family takes an input parameter, counters, which controls how many tasks it generates internally.

[5]:
class MyFamily(pf.Family):

    def __init__(self, name, counters, **kwargs):

        labels = {
            'total_counters': counters
        }

        super().__init__(name, labels=labels, **kwargs)

        with self:
            pf.sequence(MyTask('{}_{}'.format(name, i), i) for i in range(counters))


with pf.Suite('CountingSuite', files=os.path.join(filesdir, 'CountingSuite')) as s:
    f = MyFamily('TaskCounter', 7)

f
[5]:
  family TaskCounter
    label total_counters "7"
    task TaskCounter_0
      edit HALF '0'
      edit LIMIT '0'
      label counter_label "count to 0"
    task TaskCounter_1
      trigger TaskCounter_0 eq complete
      edit HALF '1'
      edit LIMIT '2'
      label counter_label "count to 2"
    task TaskCounter_2
      trigger TaskCounter_1 eq complete
      edit HALF '2'
      edit LIMIT '4'
      label counter_label "count to 4"
    task TaskCounter_3
      trigger TaskCounter_2 eq complete
      edit HALF '3'
      edit LIMIT '6'
      label counter_label "count to 6"
    task TaskCounter_4
      trigger TaskCounter_3 eq complete
      edit HALF '4'
      edit LIMIT '8'
      label counter_label "count to 8"
    task TaskCounter_5
      trigger TaskCounter_4 eq complete
      edit HALF '5'
      edit LIMIT '10'
      label counter_label "count to 10"
    task TaskCounter_6
      trigger TaskCounter_5 eq complete
      edit HALF '6'
      edit LIMIT '12'
      label counter_label "count to 12"
  endfamily
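
Judging by the output above, pf.sequence makes each task wait for the one before it. The sketch below reproduces only that chaining effect on plain strings. It is an illustration under that assumption rather than pyflow's implementation, and the function name chain is hypothetical.

```python
def chain(names):
    """Map each name to a trigger on its predecessor (None for the first name)."""
    triggers = {}
    previous = None
    for name in names:
        triggers[name] = None if previous is None else f"{previous} eq complete"
        previous = name
    return triggers


names = [f"TaskCounter_{i}" for i in range(3)]
triggers = chain(names)
for name, trigger in triggers.items():
    print(name, "->", trigger)
```

Because MyFamily iterates over range(counters), numbering starts at 0 and the first task is created with HALF and LIMIT both equal to 0, as the output above shows. If counting from one is preferred, iterating over range(1, counters + 1) would be one way to achieve it.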

Composing Suites from Reusable Components

Every object in the suite can be constructed and configured in this way. Derived classes can be used in Python with statements exactly like their base classes, so a derived class can set values or defaults without having to build the entire suite inside its constructor.

[6]:
class CourseSuite(pf.Suite):
    """
    This CourseSuite object will be used throughout the course to provide sensible
    defaults without verbosity
    """
    def __init__(self, name, **kwargs):

        config = {
            'host': pf.LocalHost(),
            'files': os.path.join(filesdir, name),
            'home': outdir,
            'defstatus': pf.state.suspended
        }
        config.update(kwargs)

        super().__init__(name, **config)


with CourseSuite('configurable_suite') as s:
    MyFamily('fam1', 3)
    MyFamily('fam2', 5)

s
[6]:
suite configurable_suite
  defstatus suspended
  edit ECF_FILES '/path/to/scratch/files/configurable_suite'
  edit ECF_HOME '/path/to/scratch/out'
  edit ECF_JOB_CMD 'bash -c 'export ECF_PORT=%ECF_PORT%; export ECF_HOST=%ECF_HOST%; export ECF_NAME=%ECF_NAME%; export ECF_PASS=%ECF_PASS%; export ECF_TRYNO=%ECF_TRYNO%; export PATH=/usr/local/apps/ecflow/%ECF_VERSION%/bin:$PATH; ecflow_client --init="$$" && %ECF_JOB% && ecflow_client --complete || ecflow_client --abort ' 1> %ECF_JOBOUT% 2>&1 &'
  edit ECF_KILL_CMD 'pkill -15 -P %ECF_RID%'
  edit ECF_STATUS_CMD 'true'
  edit ECF_OUT '%ECF_HOME%'
  label exec_host "localhost"
  family fam1
    label total_counters "3"
    task fam1_0
      edit HALF '0'
      edit LIMIT '0'
      label counter_label "count to 0"
    task fam1_1
      trigger fam1_0 eq complete
      edit HALF '1'
      edit LIMIT '2'
      label counter_label "count to 2"
    task fam1_2
      trigger fam1_1 eq complete
      edit HALF '2'
      edit LIMIT '4'
      label counter_label "count to 4"
  endfamily
  family fam2
    label total_counters "5"
    task fam2_0
      edit HALF '0'
      edit LIMIT '0'
      label counter_label "count to 0"
    task fam2_1
      trigger fam2_0 eq complete
      edit HALF '1'
      edit LIMIT '2'
      label counter_label "count to 2"
    task fam2_2
      trigger fam2_1 eq complete
      edit HALF '2'
      edit LIMIT '4'
      label counter_label "count to 4"
    task fam2_3
      trigger fam2_2 eq complete
      edit HALF '3'
      edit LIMIT '6'
      label counter_label "count to 6"
    task fam2_4
      trigger fam2_3 eq complete
      edit HALF '4'
      edit LIMIT '8'
      label counter_label "count to 8"
  endfamily
endsuite

pyflow aims to provide a library of commonly used, abstract functionality. Suite developers should in turn build up their own collection of classes that capture the functionality specific to their needs, so that each suite can be assembled from the relevant objects.