SJM+ and my wrapper

SJM is the best cluster manager I have worked with

Simple Job Manager (SJM) is a wrapper around Sun Grid Engine (SGE), developed by Stanford SCGPM Bioinformatics.

SJM is a powerful tool that lets us split a run into several sections. For each section, standard output and standard error are recorded in separate, timestamped files, so users can manage their jobs easily. This makes it quick to pick out a small piece of log from a large volume of output, which is especially important when hunting bugs in a test pipeline. SJM also has a friendly syntax that stays close to native shell. I guess the only drawback comes from the deployment of SGE, which has never succeeded in my hands…

SJM has been used in Zhanglab since the lab was established. I rely heavily on SJM, but writing SJM files one by one for a big project is tiring. To be fair, we do have some batching code in Perl (written in the last century), but it is hard to maintain. So I finally designed sjm-tools, a Python-based wrapper for SJM.

(To install sjm-tools, run pip install sjm-tools on the command line.)

With this package, we can simply handle our jobs like this:

Load the package:

from sjm_tools import job,check_env

Check the input variables:

ref_genome = check_env("/path/to/Homo_sapiens.GRCh37.75.dna_sm.primary_assembly.fa")
python = check_env('/share/public/apps/bin/python')
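
For readers wondering what this kind of check buys you, here is a minimal sketch of a path-validating helper. It is not the sjm-tools implementation: the name require_path and its behaviour (raise if the path is missing, otherwise hand the path back) are my assumptions about what such a check could reasonably do.

```python
import os


def require_path(path):
    """Hypothetical helper: return `path` unchanged if it exists, else raise."""
    if not os.path.exists(path):
        # Fail before any job is submitted, rather than halfway through a run.
        raise FileNotFoundError("required input not found: " + path)
    return path
```

Failing early like this means a typo in a reference path is caught when the script starts, not after the first few steps have already run on the cluster.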

Declare a job:

import os

SJM = name + ".sjm"
workpath = PATH + "/" + name + "/"
# Assumption: create the working directory if it does not exist yet.
if not os.path.isdir(workpath):
    os.makedirs(workpath)
JOB = job(workpath, SJM)

Add a step:

JOB.add_process("{cutadapt} -a NNNNNNAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -A NNNNNNAGATCGGAAGAGCGTCGTGTAGGGAAAGAG -e 0.25 -q 25 --trim-n -o read1.cutadapt.fastq -p read2.cutadapt.fastq ../{read1} ../{read2}".format(cutadapt=cutadapt,name=name,read1=read1,read2=read2))
JOB.add_process("{java} -jar {trimmomatic} PE -threads 2 -phred33 read1.cutadapt.fastq read2.cutadapt.fastq rev.fastq rev.UP.fastq fwd.fastq fwd.UP.fastq HEADCROP:10 SLIDINGWINDOW:4:22 AVGQUAL:25 MINLEN:40".format(java=java,trimmomatic=trimmomatic))
JOB.add_process("rm read1.cutadapt.fastq read2.cutadapt.fastq")
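
To give a feel for what a wrapper like this has to produce, here is a rough sketch that turns a list of shell commands into an SJM job description. It assumes the job_begin / job_end blocks and the "order A before B" lines described in the SJM documentation. The function render_sjm, the step names, and the strictly linear chaining are all hypothetical; this is not how sjm-tools itself is written.

```python
def render_sjm(commands, prefix="step"):
    """Hypothetical sketch: render commands as a linear chain of SJM job blocks."""
    names = ["{}_{}".format(prefix, i) for i in range(1, len(commands) + 1)]
    lines = []
    for name, cmd in zip(names, commands):
        lines += ["job_begin", "    name " + name, "    cmd " + cmd, "job_end"]
    # Chain the steps so each one waits for the previous one to finish.
    for before, after in zip(names, names[1:]):
        lines.append("order {} before {}".format(before, after))
    return "\n".join(lines) + "\n"


text = render_sjm(["echo trim", "echo align"])
print(text)
```

Writing these blocks by hand for every sample is exactly the repetitive part that a Python wrapper can take over.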

I also have an evil modification of SJM that submits jobs to a specific node; I personally call it SJM+.

(This package is not currently released on PyPI.)
