Simple LaTeX Table Generator

Anyone who's ever had to type up a large table in LaTeX knows that it can be a bit of work. When faced with a particularly large table myself, I of course thought "why not Python?".

It turns out there are already a few ways to generate LaTeX tables, but here's my take:

""" This short script converts a CSV table into latex table.
 
Command Line Arguments:

required positional arguments:
    infile                input file name

optional arguments:
    -h, --help            show this help message and exit
    -ncols N, --numbercolumns N
                          number of columns in file
    -vd, --verticaldivider
                          adds vertical dividers to table
    -hd, --horizontaldivider
                          adds horizontal dividers to table
"""
 
import csv
import argparse
 
# define and parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument("infile", help="input file name")
parser.add_argument("-ncols", "--numbercolumns", type=int, help="number of columns in file", default=2)
parser.add_argument("-vd", "--verticaldivider", action="store_true", help="adds vertical dividers to table")
parser.add_argument("-hd", "--horizontaldivider", action="store_true", help="adds horizontal dividers to table")
args = parser.parse_args()
 
# csv input and latex table output files
infile = args.infile
outfile = infile +".table"
 
with open(infile, 'r') as inf:
    with open(outfile, 'w') as out:
        reader = csv.reader(inf)
 
        # build the table beginning code based on number of columns and args
        # columns all left justified
        code_header = "\\begin{tabular}{"
        for i in range(args.numbercolumns):
            code_header += " l "
            if i < args.numbercolumns - 1 and args.verticaldivider:
                code_header += "|"
        code_header += "}\n\\hline\n"
        out.write(code_header)
 
        # begin writing data
        for row in reader:
            # replace "," with "&"
            if args.horizontaldivider:
                out.write(" & ".join(row) + " \\\\ \\hline\n")
            else:
                out.write(" & ".join(row) + " \\\\ \n")
 
        if not args.horizontaldivider:
            out.write("\\hline\n")
 
        out.write("\\end{tabular}")

Example input file:

1,2,3
4,5,6

Running with -ncols 3 (the default is 2) and the -vd and -hd flags to specify vertical and horizontal dividers produces:

\begin{tabular}{ l | l | l }
\hline
1 & 2 & 3 \\ \hline
4 & 5 & 6 \\ \hline
\end{tabular}

It's very minimal, and the main idea is that it does 95% of the work for you, leaving only very minor cosmetic tweaks.
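
One tweak that might be worth making is to infer the number of columns from the data instead of passing -ncols by hand. Here's a minimal sketch of that idea written as a function. The name csv_to_tabular and its parameters are hypothetical (they aren't part of the script above), and it assumes every row has as many fields as the first one.

```python
import csv
import io


def csv_to_tabular(text, vertical=False, horizontal=False):
    """Return a LaTeX tabular for CSV text (hypothetical helper, a sketch)."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return ""
    # assumption: the first row has the same number of fields as the rest
    ncols = len(rows[0])
    # all columns left-justified, optionally separated by vertical rules
    spec = (" | " if vertical else "  ").join(["l"] * ncols)
    lines = ["\\begin{tabular}{ " + spec + " }", "\\hline"]
    for row in rows:
        end = " \\\\ \\hline" if horizontal else " \\\\"
        lines.append(" & ".join(row) + end)
    if not horizontal:
        # close the table with a single rule when rows have none
        lines.append("\\hline")
    lines.append("\\end{tabular}")
    return "\n".join(lines)


print(csv_to_tabular("1,2,3\n4,5,6\n", vertical=True, horizontal=True))
```

For the example input above this should print the same tabular block as the script does, without needing to know the column count in advance.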

Custom PBS qstat output

I recently became slightly annoyed with the information being displayed by PBS's qstat command.  My main issue was that a simple qstat tends to cut off job names, which are very important if you're running multiple jobs with long, similar names that can't be distinguished when trimmed.  The other extreme, qstat -f, prints way too much information that's difficult to efficiently navigate through.

There's probably an option flag that's midway between the two, but it seemed like a fun idea to write a simple intercepting script that only printed a couple things I found useful.

First, here are the first few lines of one job from the output of qstat -f to give you an idea of what the script is working with:

Job Id: 54314.master.localdomain
    Job_Name = df-AC6hex-N2-h2-HSE1PBE-opt-gdv
    Job_Owner = bw@master.localdomain
    resources_used.cput = 113:03:48
    resources_used.mem = 3177372kb
    resources_used.vmem = 4856612kb
    resources_used.walltime = 118:20:42
    job_state = R
    queue = verylong
    ...

In the output, each job is separated by a blank line.  So, here's a Python script that strips away some of the unneeded info, while printing the full job name:

#! /usr/bin/python
 
import subprocess
 
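# get user name
user = subprocess.check_output(['whoami'], universal_newlines=True).strip()
# get all jobs data (universal_newlines=True returns text rather than bytes)
out = subprocess.check_output(['qstat', '-f'], universal_newlines=True)
lines = out.split('\n')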
 
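# build list of jobs, each job is a dictionary
jobs = []
job = None
for line in lines:
    if line.startswith("Job Id:"):  # new job
        job = {}
        s = line.split(":", 1)
        job_id = s[1].split('.')[0].strip()
        job[s[0].strip()] = job_id
    elif '=' in line and job is not None:
        # split on the first "=" only, so values containing "=" stay intact
        s = line.split("=", 1)
        job[s[0].strip()] = s[1].strip()
    elif line.strip() == '' and job is not None:
        # a blank line ends the current job; reset so it is stored only once
        jobs.append(job)
        job = None
# store the last job if the output did not end with a blank line
if job is not None:
    jobs.append(job)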
 
# print out useful information about user's jobs
print("\n   " + user + "'s jobs:\n")
for job in jobs:
    if job.get('Job_Owner', '').split('@')[0] == user:
        print("   " + job['Job_Name'])
        print("   Id: " + job['Job Id'])
        # jobs that have not started yet may have no wall time entry
        print("   Wall time: " + job.get('resources_used.walltime', 'n/a'))
        print("   State: " + job['job_state'])
        print("")

Snippet of example output:

   bw's jobs:

   df-AC6hex-N2-h2-HSE1PBE-opt-gdv
   Id: 54314
   Wall time: 118:20:42
   State: R

   df-AC6hex-N2-h1b-HSE1PBE-opt-gdv
   Id: 54317
   Wall time: 118:13:38
   State: R

   df-AC6hex-N2-h2b-HSE1PBE-opt-gdv
   Id: 54321
   Wall time: 118:13:39
   State: R

   ...

The output of the command qstat -f is captured by Python via the subprocess.check_output() function and organized into a list of dictionaries, one per job, which allows for easy customization of what's printed out.  After that, it's just some basic string processing and printing.  Note also that it only prints information about the jobs of the user who is running the script.
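
The parsing step can also be pulled out into a function, which makes it easy to try on a saved chunk of text when there's no PBS server around. This is only a sketch: parse_qstat is a hypothetical name, and the sample text is pieced together from the qstat -f excerpt and example output shown earlier, trimmed to the fields that get printed.

```python
def parse_qstat(text):
    """Split qstat -f style text into one dict per job (illustrative sketch)."""
    jobs = []
    # jobs are separated by blank lines
    for block in text.strip().split("\n\n"):
        job = {}
        for line in block.splitlines():
            if line.startswith("Job Id:"):
                # keep only the numeric part before the first "."
                job["Job Id"] = line.split(":", 1)[1].split(".")[0].strip()
            elif "=" in line:
                # split on the first "=" so the value is kept whole
                key, value = line.split("=", 1)
                job[key.strip()] = value.strip()
        if job:
            jobs.append(job)
    return jobs


sample = """Job Id: 54314.master.localdomain
    Job_Name = df-AC6hex-N2-h2-HSE1PBE-opt-gdv
    Job_Owner = bw@master.localdomain
    resources_used.walltime = 118:20:42
    job_state = R

Job Id: 54317.master.localdomain
    Job_Name = df-AC6hex-N2-h1b-HSE1PBE-opt-gdv
    Job_Owner = bw@master.localdomain
    resources_used.walltime = 118:13:38
    job_state = R
"""

for job in parse_qstat(sample):
    print(job["Job Id"], job["Job_Name"], job.get("resources_used.walltime", "n/a"))
```

Using job.get() for the wall time is a defensive choice here, so a job without that field prints "n/a" rather than raising a KeyError.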