PYSGE

Import the pysge package, along with anything else your submission function needs.

In [1]:
import pysge
import time

Define the submission function. It can use imported modules or functions defined elsewhere; the task is serialized with dill, which should pick these up.

In [2]:
def my_sum(a, b):
    return a + b

def sub_func(a, b, sleep=1):
    time.sleep(sleep)
    return my_sum(a, b)

Build the list of tasks to submit. Each task is a dictionary with the keys task, args and kwargs.

In [3]:
tasks = [
    {"task": sub_func, "args": (1, 2), "kwargs": {"sleep": 10}},
    {"task": sub_func, "args": (3, 4), "kwargs": {"sleep": 10}},
    {"task": sub_func, "args": (5, 6), "kwargs": {"sleep": 10}},
]
tasks
Out[3]:
[{'task': <function __main__.sub_func(a, b, sleep=1)>,
  'args': (1, 2),
  'kwargs': {'sleep': 10}},
 {'task': <function __main__.sub_func(a, b, sleep=1)>,
  'args': (3, 4),
  'kwargs': {'sleep': 10}},
 {'task': <function __main__.sub_func(a, b, sleep=1)>,
  'args': (5, 6),
  'kwargs': {'sleep': 10}}]
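
For longer task lists, writing each dictionary by hand gets tedious. The sketch below builds the same structure from a list of argument tuples. The helper name `make_tasks` is hypothetical and not part of pysge; it only assumes the task, args and kwargs keys shown above.

```python
import time


def my_sum(a, b):
    return a + b


def sub_func(a, b, sleep=1):
    time.sleep(sleep)
    return my_sum(a, b)


def make_tasks(func, arg_tuples, **kwargs):
    """Hypothetical helper (not part of pysge): build one task dict per tuple.

    Every task shares the same function and keyword arguments; only the
    positional arguments differ.
    """
    return [
        {"task": func, "args": tuple(args), "kwargs": dict(kwargs)}
        for args in arg_tuples
    ]


# Reproduces the three tasks defined above
generated_tasks = make_tasks(sub_func, [(1, 2), (3, 4), (5, 6)], sleep=10)
```

Copying `kwargs` into a fresh dictionary per task keeps the tasks independent, so editing one task's keyword arguments later does not affect the others.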
In [4]:
help(pysge.sge_submit)
Help on function sge_submit in module pysge.interface:

sge_submit(tasks, label, tmpdir, options='-q hep.q', dryrun=False, quiet=False, sleep=5, request_resubmission_options=True, return_files=False)
    Submit jobs to an SGE batch system. Return a list of the results of each
    job (i.e. the return values of the function calls)
    
    Parameters
    ----------
    tasks : list
        A list of dictionaries with the keys: task, args and kwargs. Each
        element is run on a node as task(*args, **kwargs).
    
    label : str
        Label given to the qsub submission script through -N.
    
    tmpdir : str
        Path to temporary directory (doesn't have to exist) where pysge stores
        job information. Each call will have a unique identifier in the form
        tpd_YYYYMMDD_hhmmss_xxxxxxxx. Within this directory exists all tasks in
        separate directories with a dilled file, stdout and stderr for that
        particular job.
    
    options : str (default = "-q hep.q")
        Additional options to pass to the qsub command. Take care since the
        following options are already in use: -wd, -V, -e, -o and -t.
    
    dryrun : bool (default = False)
        Create directories and files but don't submit the jobs.
    
    quiet : bool (default = False)
        Don't print tqdm progress bars. Other prints are controlled by logging.
    
    sleep : float (default = 5)
        Minimum time between queries to the batch system.
    
    request_resubmission_options : bool (default = True)
        When a job fails the master process will expect a stdin from the user
        to alter the submission options (e.g. to increase walltime or memory
        requested). If False it will use the original options.
    
    return_files : bool (default = False)
        Instead of opening the output files and loading them into python, just
        send the paths to the output files and let the user deal with them.
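
Since the docstring warns that -wd, -V, -e, -o and -t are already in use, it may be worth checking an options string before passing it on. The following is a minimal sketch of such a check. The function `find_reserved_flags` is hypothetical and not part of pysge, and the check is a simple token match, so it would also flag a reserved string that appears as the value of another option.

```python
import shlex

# Flags the sge_submit docstring lists as already in use
RESERVED_FLAGS = {"-wd", "-V", "-e", "-o", "-t"}


def find_reserved_flags(options):
    """Hypothetical helper: return reserved qsub flags found in `options`.

    The string is split the way a POSIX shell would split it, and each
    token is compared against the reserved set.
    """
    tokens = shlex.split(options)
    return [token for token in tokens if token in RESERVED_FLAGS]


# The options string used in this notebook contains no reserved flags
clashes = find_reserved_flags("-q hep.q -l h_rt=3:0:0")
if clashes:
    raise ValueError("Options clash with flags already in use: %s" % clashes)
```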

In [5]:
results = pysge.sge_submit(tasks, "myjobs", "tmp_path", options="-q hep.q -l h_rt=3:0:0")
2019-08-22 14:32:43,897 - pysge.area - INFO - Creating paths in /vols/build/cms/sdb15/ZinvWidth/zinv-notebooks/tmp_path/tpd_20190822_143243_c5ie1j6r
2019-08-22 14:32:44,053 - pysge.submitter - INFO - Submitted 9414897.1-3:1


In [6]:
results
Out[6]:
[3, 7, 11]

The equivalent when running locally is:

In [7]:
[task["task"](*task["args"], **task["kwargs"]) for task in tasks]
Out[7]:
[3, 7, 11]
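
Running the tasks locally first can be a cheap way to catch errors before anything reaches the batch system. A minimal sketch, wrapping the list comprehension above in a reusable function; `run_locally` and the small `add` example are hypothetical names, not part of pysge.

```python
def run_locally(tasks):
    """Hypothetical helper: run each task in this process, in order.

    Mirrors how each task dictionary is described as being executed on a
    node, i.e. task(*args, **kwargs), and returns the list of results.
    """
    return [task["task"](*task["args"], **task["kwargs"]) for task in tasks]


def add(a, b):
    return a + b


# Small example without any sleep, so it returns immediately
local_tasks = [
    {"task": add, "args": (1, 2), "kwargs": {}},
    {"task": add, "args": (3, 4), "kwargs": {}},
]
local_results = run_locally(local_tasks)
print(local_results)  # prints [3, 7]
```

The results come back in the same order as the input tasks, which makes them easy to compare against the list returned by the batch submission.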