Core API

Module contents

The Longbow package. All of the usable functions are imported to the top level.

longbow.applications module

A module containing methods for processing application command-lines.

The applications module contains methods for processing the aspects of jobs that relate to external applications (such as an MD package). The following methods can be found within this module:

checkapp(jobs)

This method attempts to check that the application executables required to run one or more jobs are present on the specified hosts. It is capable of using the module system.

processjobs(jobs)

This method will process information that is given as an intended target to be passed on to the executable at run time. It will check that required parameters (provided the respective plug-in is configured correctly) have been supplied, and that all files and their dependencies (again provided that the respective plug-in is configured for this) exist on disk.

longbow.applications.checkapp(jobs)

Test that executables and their modules are launchable.

This method attempts to check that the application executable required to run one or more jobs is present on the specified host. It is capable of using the module system, either with user-specified modules supplied in configuration files or with internal defaults. Users of codes that are not supported out of the box will have to specify the modules explicitly within configuration files.

Required arguments are:

jobs (dictionary) - The Longbow jobs data structure, see configuration.py for more information about the format of this structure.

longbow.applications.processjobs(jobs)

Process the application portion of the command-line.

This method will process information that is given as an intended target to be passed on to the executable at run time. It will check that required parameters (provided the respective plug-in is configured correctly) have been supplied, and that all files and their dependencies (again provided that the respective plug-in is configured for this) exist on disk.

Required arguments are:

jobs (dictionary) - The Longbow jobs data structure, see configuration.py for more information about the format of this structure.

longbow.configuration module

A module containing methods for dealing with configuration files.

This module contains methods for loading and saving to Longbow (ini) configuration files in addition to methods for extracting the resultant information into Longbow data structures. The templates for these data structures and their forms are also declared within this module.

The following data structures can be found:

JOBTEMPLATE

The template of the job data structure. The Longbow API will assume that variables listed here are to be found in this structure.

The following methods can be found:

processconfigs(parameters)

Method for processing the raw configuration structures loaded from the configuration files into Longbow friendly configuration structures. This is where the parameter hierarchy is applied.

loadconfigs(configfile)

Method for loading and extracting data from the Longbow configuration files.

saveconfigs(configfile, params)

Method for saving data to Longbow configuration files. This method will honour comments and simply amend the file structure with new or changed data.

saveini(inifile, params)

A method to save an ini-formatted file (inifile) from a dictionary structure (params). This method is much simpler than the saveconfigs method, which has been tuned to simply update configuration files.

longbow.configuration.loadconfigs(configfile)

Load a Longbow configuration file.

Files of this format contain the following mark-up structure.

Sections of a file are marked using square brackets. Sections then contain option statements of the form “param = value”. Comments are marked using hashes.

An example of such a file would be:

[section1]
# this is the first option
option1 = value1
# this is the second option
option2 = value2

[section2]
# this is the first option
option1 = value1
# this is the second option
option2 = value2

[section3]
# this is the first option
option1 = value1
# this is the second option
option2 = value2

This method performs basic error handling to do with the structure of the ini file only. All error handling specific to Longbow should be performed elsewhere.

Required arguments are:

configfile (string): This should be an absolute path to a configuration file.

Return parameters are:

contents (list): This is the raw file structure where each line is an item in the list.

sections (list): This is a list of section headers in the data (preserves order).

data (dict of dicts): This is a structure containing the data loaded from the file. A dictionary is created for each heading in the ini file, and the parameters and values under each heading form a dictionary within the corresponding heading section (a dictionary of dictionaries).
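The three return values described above can be illustrated with a minimal sketch. This is not the Longbow implementation: the function name load_ini_text is hypothetical, and it parses a string rather than a file path so that the example is self-contained.

```python
def load_ini_text(text):
    """Parse ini-style text into (contents, sections, data).

    Hypothetical helper for illustration only; it mirrors the return
    values described for loadconfigs but takes text, not a file path.
    """
    contents = text.splitlines()  # raw file structure, one item per line
    sections = []                 # section headers, in file order
    data = {}                     # dictionary of dictionaries
    current = None
    for line in contents:
        stripped = line.strip()
        # Skip blank lines and hash comments.
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith("[") and stripped.endswith("]"):
            current = stripped[1:-1].strip()
            sections.append(current)
            data[current] = {}
        elif "=" in stripped and current is not None:
            key, value = stripped.split("=", 1)
            data[current][key.strip()] = value.strip()
        else:
            # Structural problem: an option outside a section, or a
            # line that is neither a section, option nor comment.
            raise ValueError("Malformed line: " + line)
    return contents, sections, data


example = """[section1]
# this is the first option
option1 = value1
# this is the second option
option2 = value2

[section2]
option1 = value3
"""
contents, sections, data = load_ini_text(example)
print(sections)
print(data["section1"]["option2"])
```

Keeping the raw lines in contents alongside the parsed data is what makes a later comment-preserving save possible, since the original layout is still available.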

longbow.configuration.processconfigs(parameters)

Process the raw configuration sources.

This method is used to create and populate the main “jobs” dictionary with data from various configuration sources. This is where the configuration hierarchy is enforced. Developers can use this method to create the required “jobs” dictionary for use with the rest of the Longbow library. Alternatively, developers can create the “jobs” dictionary themselves, as it is likely that all of the data that should go into this data structure already exists within other structures of their application. In this case, developers should use JOBTEMPLATE as the template to minimise problems.

Required arguments are:

parameters (dictionary): This parameter is required. It is used to provide overrides from the application command-line.

Return parameters are:

jobs (dictionary): A fully processed Longbow jobs data structure.

longbow.configuration.saveconfigs(configfile, params)

Save to a Longbow configuration file.

Files of this format contain the following mark-up structure.

Sections of a file are marked using square brackets. Sections then contain option statements of the form “param = value”. Comments are marked using hashes.

An example of such a file would be:

[section1]
# this is the first option
option1 = value1
# this is the second option
option2 = value2

[section2]
# this is the first option
option1 = value1
# this is the second option
option2 = value2

[section3]
# this is the first option
option1 = value1
# this is the second option
option2 = value2

This method is comment safe. This addresses a major downfall of the standard Python parser, which would wipe out any comments that a user had included.

Required arguments are:

configfile (string): This should be an absolute path to a configuration file.

params (dictionary): This should contain the data structure that should be saved (typically hosts or job configs structure).
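One way a comment-safe save could work is to rewrite only the option lines that changed and leave every other line alone. The sketch below assumes that approach; update_ini_lines is a hypothetical name, it operates on a list of lines rather than a file, and it should not be read as the method Longbow actually uses.

```python
def update_ini_lines(contents, params):
    """Return new file lines with the values from params applied.

    Hypothetical sketch: comments and untouched lines are kept as they
    are, new options are appended to their section, and sections that
    are not in the file yet are appended at the end.
    """
    remaining = {sec: dict(opts) for sec, opts in params.items()}
    output = []
    current = None

    def flush(section):
        # Append options of this section that were not in the file.
        for key, value in remaining.pop(section, {}).items():
            output.append("{0} = {1}".format(key, value))

    for line in contents:
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            if current is not None:
                flush(current)
            current = stripped[1:-1].strip()
            output.append(line)
        elif (stripped and not stripped.startswith("#")
              and "=" in stripped and current in remaining):
            key = stripped.split("=", 1)[0].strip()
            if key in remaining[current]:
                # Existing option with a new value: rewrite this line only.
                value = remaining[current].pop(key)
                output.append("{0} = {1}".format(key, value))
            else:
                output.append(line)
        else:
            # Comments, blank lines and untouched options pass through.
            output.append(line)
    if current is not None:
        flush(current)
    # Sections left over did not exist in the file at all.
    for section in list(remaining):
        output.append("")
        output.append("[{0}]".format(section))
        flush(section)
    return output


original = [
    "[section1]",
    "# this is the first option",
    "option1 = value1",
]
updated = update_ini_lines(
    original, {"section1": {"option1": "changed", "option2": "new"}})
print("\n".join(updated))
```

The comment line survives because the sketch never regenerates the file from the dictionary; it only edits lines that match an option being saved.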

longbow.configuration.saveini(inifile, params)

Save to a Longbow recovery file.

This method will write a simple ini formatted file.

Required arguments are:

inifile (string): This should be an absolute path to a recovery file.

params (dictionary): This should contain the data structure that should be saved (typically hosts or jobs structure).
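Writing a simple ini file from a dictionary of dictionaries might look like the sketch below. The function name write_ini and the example job name and keys are made up for illustration; nothing here is taken from the Longbow source.

```python
import os
import tempfile


def write_ini(inifile, params):
    """Write a dictionary of dictionaries as a simple ini file.

    Hypothetical sketch: each top level key becomes a [section] and
    each inner key/value pair becomes a "key = value" line.
    """
    lines = []
    for section, options in params.items():
        lines.append("[{0}]".format(section))
        for key, value in options.items():
            lines.append("{0} = {1}".format(key, value))
        lines.append("")  # blank line after each section
    with open(inifile, "w") as handle:
        handle.write("\n".join(lines))


# Usage with a temporary file; the job name and keys are invented.
path = os.path.join(tempfile.mkdtemp(), "recovery.ini")
write_ini(path, {"job1": {"jobid": "12345", "laststatus": "Running"}})
with open(path) as handle:
    written = handle.read()
print(written)
```

Unlike a comment-preserving update, this overwrites the whole file, which is acceptable for a machine-written recovery file that users are not expected to annotate.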

longbow.entrypoints module

This module contains the Longbow entry points and supporting methods.

The following gives a summary of the methods available:

launcher()

This method is the main entry point for Longbow launched as an application. Library users should not use this method when linking Longbow at a high level. Developers should be calling longbow() directly with the parameters dictionary already set up.

longbow(jobs, parameters)

This method is the upper level method of the Longbow library. Users interested in integrating Longbow into their applications without fine-grained control may invoke this method, along with creating the data structures that the main entry point of the application would normally create.

recovery(jobs, recoveryfile)

This method is for attempting to recover a Longbow session. This should be used in cases where jobs have been submitted to the host and somehow Longbow failed to stay connected. It will try to use the recovery file, written shortly after submission, to recover the whole session. Jobs that are no longer in the queue will be marked as finished and will be staged as normal.

longbow.entrypoints.launcher()

Entry point for Longbow when used as an application.

This method is the main entry point for Longbow launched as an application. Library users should not use this method when linking Longbow at a high level. Developers doing high level linking should be calling longbow() directly with the parameters dictionary already set up.

This method takes the information from sys.argv and processes this into a dictionary format ready to fire longbow().

longbow.entrypoints.longbow(jobs, parameters)

Entry point at the top level of the Longbow library.

This is the top level method that makes calls on the Longbow library. It is a good place to link against Longbow if a developer does not want to link against the executable, or if low level linking is not needed or is overkill.

Required inputs are:

parameters (dictionary): A dictionary containing the parameters and overrides from the command-line.

longbow.entrypoints.recovery(jobs, recoveryfile)

Recover a Longbow session.

This method is for attempting to recover a failed Longbow session or to reconnect to an intentionally disconnected session. It will try to use the recovery file, written shortly after submission, to recover the whole session. Once the data has been loaded from the recovery file and a new job data structure populated, this method will re-enter the monitoring function to continue where it left off. Any jobs that finished in the meantime will be marked accordingly and then file staging will continue.

Required inputs are:

recoveryfile (string): A path to the recovery file.

longbow.entrypoints.update(jobs, updatefile)

Trigger update of a disconnected Longbow session.

This method will start the update process on an existing but disconnected Longbow session. All job statuses will be checked and updated in the recovery file and all output files will be synced before disconnecting.

longbow.exceptions module

This module contains exception class definitions for Longbow.

This module contains the ‘custom’ exception classes for the Longbow core library. These exceptions are best used in methods that replace/override those of the standard library, such that error messages are more specific to Longbow.

exception longbow.exceptions.AbsolutepathError(message, path)

Bases: Exception

Exception class for absolute path errors.

Usage: AbsolutepathError(message, path)
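The (message, path) usage pattern shared by several of these exceptions can be sketched as follows. PathError and require_absolute are hypothetical stand-ins, and storing the path on a path attribute is an assumption rather than something the documentation above states.

```python
import os


class PathError(Exception):
    """Hypothetical stand-in for an exception taking (message, path)."""

    def __init__(self, message, path):
        super().__init__(message)
        # Assumption: keep the offending path so handlers can report it.
        self.path = path


def require_absolute(path):
    """Return path unchanged, or raise PathError if it is relative."""
    if not os.path.isabs(path):
        raise PathError("The path is not absolute", path)
    return path


try:
    require_absolute("relative/dir")
except PathError as err:
    caught = (str(err), err.path)
    print("{0}: {1}".format(*caught))
```

Carrying the path separately from the message lets calling code log or act on it without parsing the message text.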

exception longbow.exceptions.CommandlineargsError

Bases: Exception

Command-line arguments exception.

exception longbow.exceptions.ConfigurationError

Bases: Exception

Configuration error.

exception longbow.exceptions.DirectorynotfoundError

Bases: Exception

Directory not found exception.

exception longbow.exceptions.DisconnectException

Bases: Exception

Disconnect exception, for disconnect mode related errors.

exception longbow.exceptions.ExecutableError

Bases: Exception

Executable not found exception.

exception longbow.exceptions.HandlercheckError

Bases: Exception

Job handler checking exception.

exception longbow.exceptions.JobdeleteError

Bases: Exception

Job delete exception.

exception longbow.exceptions.JobsubmitError

Bases: Exception

Job submit exception.

exception longbow.exceptions.LocalcopyError(message, path)

Bases: Exception

Copy on local machine exception.

Usage: LocalcopyError(message, path)

exception longbow.exceptions.LocaldeleteError(message, path)

Bases: Exception

Delete on local machine exception.

Usage: LocaldeleteError(message, path)

exception longbow.exceptions.LocallistError(message, path)

Bases: Exception

List on local machine exception.

Usage: LocallistError(message, path)

exception longbow.exceptions.PluginattributeError

Bases: Exception

Missing plugin method exception.

exception longbow.exceptions.QueuemaxError

Bases: Exception

Job submit exception.

exception longbow.exceptions.RemotecopyError(message, src, dst)

Bases: Exception

Copy on remote machine exception.

Usage: RemotecopyError(message, sourcepath, destinationpath)

exception longbow.exceptions.RemotedeleteError(message, path)

Bases: Exception

Delete on remote machine exception.

Usage: RemotedeleteError(message, path)

exception longbow.exceptions.RemotelistError(message, path)

Bases: Exception

List on remote machine exception.

Usage: RemotelistError(message, path)

exception longbow.exceptions.RemoteworkdirError

Bases: Exception

Remote working directory related generic exception.

exception longbow.exceptions.RequiredinputError

Bases: Exception

Required input error exception.

exception longbow.exceptions.RsyncError(message, shellout)

Bases: Exception

Rsync exception.

Usage: RsyncError(message, (stdout, stderr, errcode))

exception longbow.exceptions.SSHError(message, shellout)

Bases: Exception

SSH exception.

Usage: SSHError(message, (stdout, stderr, errcode))
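For the exceptions that take a shell output tuple, a handler typically wants the exit code and the error stream. The sketch below shows one possible shape; ShellError and its attribute names (stdout, stderr, errorcode) are hypothetical and are not taken from the Longbow source.

```python
class ShellError(Exception):
    """Hypothetical exception taking (message, shellout).

    shellout is assumed to be a (stdout, stderr, errcode) tuple, as in
    the usage lines above.
    """

    def __init__(self, message, shellout):
        super().__init__(message)
        # Unpack the tuple so each part can be inspected by name.
        self.stdout, self.stderr, self.errorcode = shellout


try:
    raise ShellError("rsync failed", ("", "connection refused", 255))
except ShellError as err:
    report = "{0} (exit {1}): {2}".format(err, err.errorcode, err.stderr)
    print(report)
```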

exception longbow.exceptions.SchedulercheckError

Bases: Exception

Scheduler checking exception.

exception longbow.exceptions.StagingError

Bases: Exception

Generic staging error exception.

exception longbow.exceptions.UpdateExit

Bases: Exception

Exception, to exit gracefully after update of job progress.

longbow.scheduling module

A module containing generic scheduling methods.

This module contains generic methods for preparing, submitting, deleting and monitoring jobs. The methods contained within this module are all based on generic job concepts. The specific functionality that comes from each scheduler is accessed through the plug-in framework. To make use of these methods, the plug-in framework must be present alongside the core library.

checkenv(jobs, hostconf)

This method attempts to test the environment and determine, from a pre-configured list, which scheduler and job submission handler are present on the machine.

delete(job)

A method containing the generic and boilerplate Longbow code for deleting a job.

monitor(jobs)

A method containing the generic and boilerplate Longbow code for monitoring a job. This method contains the entire structure of the loop that deals with monitoring jobs.

prepare(jobs)

A method containing the generic and boilerplate Longbow code for constructing the submit file.

submit(jobs)

A method containing the generic and boilerplate Longbow code for submitting a job.

longbow.scheduling.checkenv(jobs, hostconf)

Determine the scheduler and job handler on a machine.

This method attempts to test the environment and determine, from a pre-configured list, which scheduler and job submission handler are present on the machine. These are then cached in the user's host configuration file so that this step does not have to be repeated.

Required arguments are:

hostconf (string) - The path to the host configuration file. This should be provided so that any changes that are made can be saved.

jobs (dictionary) - The Longbow jobs data structure, see configuration.py for more information about the format of this structure.

longbow.scheduling.delete(job)

Delete a job.

This method is for deleting a job; it will only delete a single job at a time. This method is the generic function calling point for the scheduler specific delete method (provided by a plugin), which contains the actual code specific to deleting a job for a given scheduler.

Required arguments are:

job (dictionary) - A single job dictionary, this is often simply passed in as a subset of the main jobs dictionary.

longbow.scheduling.monitor(jobs)

Monitor the status of jobs (loop).

A method containing the generic and boilerplate Longbow code for monitoring a job. This method contains the entire structure of the loop that deals with monitoring jobs.

Required arguments are:

jobs (dictionary) - The Longbow jobs data structure, see configuration.py for more information about the format of this structure.

longbow.scheduling.prepare(jobs)

Create job submission scripts.

This method will loop through all jobs in the “jobs” data structure and use the parameters for each job to create the submission file. This method acts as a generic interface to scheduler specific plugins, which contain the specific code to create the submit file.

Required arguments are:

jobs (dictionary) - The Longbow jobs data structure, see configuration.py for more information about the format of this structure.

longbow.scheduling.submit(jobs)

Submit all jobs.

A method containing the generic and boilerplate Longbow code for submitting a job.

Required arguments are:

jobs (dictionary) - The Longbow jobs data structure, see configuration.py for more information about the format of this structure.

longbow.shellwrappers module

A module containing methods for interacting with the Unix shell.

This module contains methods for interacting with the Unix shell, including methods for file manipulation and directory functions. Where possible, paths are checked to make sure they are absolute paths.

The following methods can be found:

checkconnections(jobs)

This method will test that connections to hosts specified in jobs can be established. Problems encountered at this stage could be due to badly configured hosts, networking problems, or even system maintenance/downtime on the HPC host.

sendtoshell(cmd)

This method is responsible for handing off commands to the Unix shell. It makes use of the subprocess library from the Python standard library.

sendtossh(job, args)

This method constructs a string containing commands to be executed via SSH. This string is then handed off to the sendtoshell() method for execution.

sendtorsync(job, src, dst, includemask, excludemask)

This method constructs a string that forms an rsync command. This string is then handed off to the sendtoshell() method for execution.

localcopy(src, dst)

This method is for copying a file/directory between two local paths. It relies on the Python standard library to perform operations.

localdelete(src)

This method is for deleting a file/directory from the local machine. It relies on the Python standard library to perform operations.

locallist(src)

This method is for constructing a list of items present within a given directory. This method relies on the Python standard library to perform operations.

remotecopy(job, src, dst)

This method is for copying a file/directory between two paths on a remote host. This is done by passing a copy command to the sendtossh() method.

remotedelete(job)

This method is for deleting a file/directory from a path on a remote host. This is done by passing a delete command to the sendtossh() method.

remotelist(job)

This method is for listing the contents of a directory on a remote host. This is done by passing a list command to the sendtossh() method.

upload(job)

This method is for uploading files to a remote host. It is responsible for specifying the direction in which the transfer takes place.

download(job)

This method is for downloading files from a remote host. It is responsible for specifying the direction in which the transfer takes place.

longbow.shellwrappers.checkconnections(jobs)

Test that connections to HPC machines can be established.

This method will test that connections to hosts specified in jobs can be established. At the same time, it will check whether the basic Linux environment looks like it is configured for a non-login shell. If not, the environment fix mode is turned on by setting

jobs["somejob"]["env-fix"] = true

which makes sure that /etc/profile is sourced on certain calls to the remote machine. Problems encountered at this stage could be due to badly configured hosts, networking problems, or even system maintenance/downtime on the HPC host.

Required arguments are:

jobs (dictionary) - The Longbow jobs data structure, see configuration.py for more information about the format of this structure.

longbow.shellwrappers.download(job)

Download file/s from a remote machine.

This method is for downloading files from a remote host. It is responsible for specifying the direction in which the transfer takes place, and will make the appropriate call to the rsync method based on data for a given job. The rsync method should not be called directly; this method should be used instead.

Required arguments are:

job (dictionary) - A single job dictionary, this is often simply passed in as a subset of the main jobs dictionary.

longbow.shellwrappers.localcopy(src, dst)

Copy files from one local path to another.

This method is for copying a file/directory between two local paths. It relies on the Python standard library to perform operations. This method will test the path and use the correct Python methods for transferring an object, whether it is a file or a directory. Note that this function requires absolute paths for both the source and destination.

Required arguments are:

src (string) - A string containing the absolute path of the file/directory to be copied.

dst (string) - A string containing the destination absolute path to be copied to.
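The behaviour described above (absolute path checks, then a file or directory copy chosen by testing the source) can be sketched with the standard library. The function name copy_local and the choice of built-in exceptions are assumptions for illustration; the Longbow method itself may differ.

```python
import os
import shutil
import tempfile


def copy_local(src, dst):
    """Copy a file or directory between two absolute local paths.

    Hypothetical sketch using only the Python standard library.
    """
    for path in (src, dst):
        if not os.path.isabs(path):
            raise ValueError("Path is not absolute: " + path)
    if os.path.isfile(src):
        # Make sure the parent directory of the destination exists.
        parent = os.path.dirname(dst)
        if parent and not os.path.isdir(parent):
            os.makedirs(parent)
        shutil.copy(src, dst)
    elif os.path.isdir(src):
        # copytree expects that the destination does not exist yet.
        shutil.copytree(src, dst)
    else:
        raise FileNotFoundError("Source does not exist: " + src)


# Usage in a scratch directory (mkdtemp returns an absolute path).
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "input.txt")
with open(source, "w") as handle:
    handle.write("data")
copy_local(source, os.path.join(workdir, "copies", "input.txt"))
```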

longbow.shellwrappers.localdelete(src)

Delete local file/directory.

This method is for deleting a file/directory from the local machine. It relies on the Python standard library to perform operations. This method will test the path and use the correct variant of the Python methods for deleting a file or a directory. Note that this function requires an absolute path as the source path.

Required arguments are:

src (string) - A string containing the absolute path of the file/directory to be deleted.
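Choosing the correct delete call for a file or a directory might be done as in the sketch below. delete_local is a hypothetical name and the built-in exceptions are placeholders for whatever the library raises.

```python
import os
import shutil
import tempfile


def delete_local(src):
    """Delete a local file or directory given by an absolute path.

    Hypothetical sketch using only the Python standard library.
    """
    if not os.path.isabs(src):
        raise ValueError("Path is not absolute: " + src)
    if os.path.isfile(src):
        os.remove(src)      # single file
    elif os.path.isdir(src):
        shutil.rmtree(src)  # directory and everything below it
    else:
        raise FileNotFoundError("Nothing to delete at: " + src)


# Usage in a scratch directory.
scratch = tempfile.mkdtemp()
target = os.path.join(scratch, "old.log")
with open(target, "w") as handle:
    handle.write("stale")
delete_local(target)   # removes the file
delete_local(scratch)  # removes the directory tree
```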

longbow.shellwrappers.locallist(src)

List the contents of a local directory.

This method is for constructing a list of items present within a given directory. It relies on the Python standard library to perform operations. Note that this method is not recursive, nor will it give information on whether an object is a file or directory; however, it is trivial to run these tests using standard Python. Note that this function requires an absolute path as the source path.

Required arguments are:

src (string) - A string containing the absolute path to a directory to be listed.

Return parameters are:

filelist (list) - A list of files within the specified directory.
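A non-recursive listing, followed by the file or directory test that the description leaves to the caller, could look like this. list_local is a hypothetical name, and sorting the result is an addition made here so that the output is predictable.

```python
import os
import tempfile


def list_local(src):
    """Return the names directly inside an absolute directory path.

    Hypothetical sketch; not recursive, and names only (no file or
    directory information).
    """
    if not os.path.isabs(src):
        raise ValueError("Path is not absolute: " + src)
    if not os.path.isdir(src):
        raise FileNotFoundError("Directory does not exist: " + src)
    # os.listdir gives no ordering guarantee, so sort for stable output.
    return sorted(os.listdir(src))


# Usage: one file and one sub-directory in a scratch directory.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "results"))
with open(os.path.join(root, "job.conf"), "w") as handle:
    handle.write("")

names = list_local(root)
# The caller can classify each entry with standard Python.
kinds = {
    name: "directory" if os.path.isdir(os.path.join(root, name)) else "file"
    for name in names
}
print(names)
print(kinds)
```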

longbow.shellwrappers.remotecopy(job, src, dst)

Copy files between paths on a remote HPC machine.

This method is for copying a file/directory between two paths on a remote host. This is done by passing a copy command to the sendtossh() method.

Required arguments are:

job (dictionary) - A single job dictionary, this is often simply passed in as a subset of the main jobs dictionary.

src (string) - A string containing the absolute path of the file/directory to be copied (on the host).

dst (string) - A string containing the destination absolute path to be copied to (on the host).

longbow.shellwrappers.remotedelete(job)

Delete a file/directory on a remote HPC machine.

This method is for deleting a file/directory from a path on a remote host. This is done by passing a delete command to the sendtossh() method.

Required arguments are:

job (dictionary) - A single job dictionary, this is often simply passed in as a subset of the main jobs dictionary.

longbow.shellwrappers.remotelist(job)

List the contents of a directory on a remote HPC machine.

This method is for listing the contents of a directory on a remote host. This is done by passing a list command to the sendtossh() method.

Required arguments are:

job (dictionary) - A single job dictionary, this is often simply passed in as a subset of the main jobs dictionary.

Return parameters are:

filelist (list) - A list of files within the specified directory.

longbow.shellwrappers.sendtorsync(job, src, dst, includemask, excludemask)

Construct Rsync commands and hand them off to the shell.

This method constructs a string that forms an rsync command. This string is then handed off to the sendtoshell() method for execution.

Required arguments are:

job (dictionary) - A single job dictionary, this is often simply passed in as a subset of the main jobs dictionary.

src (string) - A string containing the source directory for transfer. If this is a download then this should include the host information. See the download and upload methods for how this should be done (or just make use of those two methods).

dst (string) - A string containing the destination directory for transfer. If this is an upload then this should include the host information. See the download and upload methods for how this should be done (or just make use of those two methods).

includemask (string) - This is a string that should contain a comma separated list of files for transfer.

excludemask (string) - This is a string that should specify which files should be excluded from rsync transfer. This is useful for not transferring large unwanted files.
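How the include and exclude masks might be turned into rsync options is sketched below. Everything here is an assumption made for illustration: the function name build_rsync_command, the "port" job key, the -az flags, treating excludemask as comma separated, and returning a list of arguments instead of a single string.

```python
def build_rsync_command(job, src, dst, includemask, excludemask):
    """Assemble an rsync command line as a list of arguments.

    Hypothetical sketch; the command is built but not executed.
    """
    # -a preserves file attributes, -z compresses data in transit and
    # -e selects the remote shell (ssh on the port assumed in the job).
    cmd = ["rsync", "-az", "-e", "ssh -p " + str(job["port"])]
    if excludemask:
        for mask in excludemask.split(","):
            cmd.extend(["--exclude", mask.strip()])
    if includemask:
        for mask in includemask.split(","):
            cmd.extend(["--include", mask.strip()])
        # rsync applies the first matching rule, so excluding "*" last
        # limits the transfer to the names included above.
        cmd.extend(["--exclude", "*"])
    cmd.extend([src, dst])
    return cmd


# Usage: an upload that skips trajectory and temporary files. The host
# and paths are invented for the example.
command = build_rsync_command(
    {"port": "22"},
    "/local/job1/",
    "alice@hpc.example.org:/work/job1/",
    "",
    "*.dcd, *.tmp",
)
print(" ".join(command))
```

The order of the filter rules matters because rsync stops at the first pattern that matches a file, which is why the explicit excludes come before the includes in this sketch.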

longbow.shellwrappers.sendtoshell(cmd)

Send assembled commands to the Unix shell.

This method is responsible for handing off commands to the Unix shell. It makes use of the subprocess library from the Python standard library.

Required arguments are:

cmd (string) - A fully qualified Unix command.

Return parameters are:

stdout (string) - Contains the output from the standard output of the Unix shell.

stderr (string) - Contains the output from the standard error of the Unix shell.

errorstate (string) - Contains the exit code that the Unix shell exits with.
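Handing a command to the shell with subprocess and collecting the three return values might look like the following sketch. run_in_shell is a hypothetical name, and unlike the description above this version returns the exit code as an integer (the value subprocess provides) rather than a string.

```python
import subprocess


def run_in_shell(cmd):
    """Run a command string in the shell and capture its output.

    Hypothetical sketch; returns (stdout, stderr, exit code).
    """
    handle = subprocess.Popen(
        cmd,
        shell=True,                # cmd is a single command string
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,   # decode output to str
    )
    stdout, stderr = handle.communicate()
    return stdout, stderr, handle.returncode


# Usage on a Unix shell.
stdout, stderr, errorstate = run_in_shell("echo hello")
print(stdout.strip(), errorstate)
```

Returning the error stream and exit code together with the output lets the caller decide whether a non-zero exit should become an exception.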

longbow.shellwrappers.sendtossh(job, args)

Construct SSH commands and hand them off to the shell.

This method constructs a string containing commands to be executed via SSH. This string is then handed off to the sendtoshell() method for execution.

Required arguments are:

job (dictionary) - A single job dictionary, this is often simply passed in as a subset of the main jobs dictionary.

args (list) - A list containing commands to be sent to SSH, multiple commands should each be an entry in the list.

Return parameters are:

shellout (tuple of strings) - Contains the three strings returned from the sendtoshell() method. These are standard output, standard error and the exit code.
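Constructing the SSH part of a command from a job dictionary could be sketched as below. The function name build_ssh_command and the job keys "user", "host" and "port" are assumptions for illustration, and the sketch only builds the command without running it.

```python
def build_ssh_command(job, args):
    """Assemble an ssh command line as a list of arguments.

    Hypothetical sketch; the job keys used here are assumptions.
    """
    cmd = ["ssh", "-p", str(job["port"]), job["user"] + "@" + job["host"]]
    # ssh joins the remaining arguments with spaces to form the
    # command that is run on the remote machine.
    cmd.extend(args)
    return cmd


# Usage with an invented user, host and directory.
job = {"user": "alice", "host": "hpc.example.org", "port": "22"}
command = build_ssh_command(job, ["ls", "/home/alice/work"])
print(" ".join(command))
```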

longbow.shellwrappers.upload(job)

Upload file/s to a remote machine.

This method is for uploading files to a remote host. It is responsible for specifying the direction in which the transfer takes place.

Required arguments are:

job (dictionary) - A single job dictionary, this is often simply passed in as a subset of the main jobs dictionary.

longbow.staging module

A module containing methods for staging files to and from remote machines.

The staging module provides methods for processing the transfer of files between the local host and the remote host job directories.

The following methods are contained within this module:

stage_upstream(jobs)

A method for staging files for each job to the target HPC host. The underlying utility behind this transfer is rsync, so it is possible to supply rsync file masks to blacklist unwanted large files. By default rsync is configured to transfer blockwise and only transfer the newest/changed blocks, which saves a lot of time during persistent staging.

stage_downstream(job)

A method for staging files for a job from the target HPC host. The underlying utility behind this transfer is rsync, so it is possible to supply rsync file masks to blacklist unwanted large files. By default rsync is configured to transfer blockwise and only transfer the newest/changed blocks, which saves a lot of time during persistent staging.

cleanup(jobs)

A method for cleaning up the working directory on the HPC host. This method will only delete job directories that are valid for the given Longbow instance, thus avoiding data loss.

longbow.staging.cleanup(jobs)

Clean up the working directory on the HPC machine.

This method will only delete job directories that are valid for jobs within a given Longbow instance, thus avoiding catastrophic data loss. It will also fail gracefully with debug level log messages should the cleanup function be triggered at a stage prior to remote job directory creation. This method also contains the code for cleaning up the recovery file used in the session.

Required arguments are:

jobs (dictionary) - The Longbow jobs data structure, see configuration.py for more information about the format of this structure.

longbow.staging.stage_downstream(job)

Transfer all files for a job, back from the HPC machine.

A method for staging files for a job from the target HPC host. The underlying utility behind this transfer is rsync, so it is possible to supply rsync file masks to blacklist unwanted large files. By default rsync is configured to transfer blockwise and only transfer the newest/changed blocks, which saves a lot of time during persistent staging.

Required arguments are:

job (dictionary) - A single job dictionary, this is often simply passed in as a subset of the main jobs dictionary.

longbow.staging.stage_upstream(jobs)

Transfer files for all jobs, to a remote HPC machine.

A method for staging files for each job to the target HPC host. The underlying utility behind this transfer is rsync, so it is possible to supply rsync file masks to blacklist unwanted large files. By default rsync is configured to transfer blockwise and only transfer the newest/changed blocks, which saves a lot of time during persistent staging.

Required arguments are:

jobs (dictionary) - The Longbow jobs data structure, see configuration.py for more information about the format of this structure.