LOCI Software Articles

Computer programming pioneer Maurice Wilkes reportedly described his own experience of the 1940s as follows:

As soon as we started programming, we found to our surprise that it wasn’t as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs.

Reducing the problem

To diagnose a problem, it is quite common that the program in question has to be run well over a dozen times, with changes made between executions. This is known as the edit-compile-debug cycle. It is essential to optimize this cycle by reducing the time it takes to reproduce the problem. Common techniques to accomplish that include:

Using a debugger

A debugger (such as the one built into Eclipse) can do wonders to diagnose problems. Single-stepping, and inspecting the local variables, just before the problem occurs, often provides enough insight to identify the problem and even to come up with a tentative fix. The trick is typically to get to the point just before the problem occurs.

Some techniques can help you get there more quickly:

Using a debugger to attach to a running program

Sometimes it seems impossible to trigger a bug in the debugger, while it is easy to trigger in an external program such as Fiji, or in unit tests run by Maven.

Eclipse’s debugger can attach to external Java processes. Open Run>Debug Configurations…, select the Remote Java Application line on the left side, and create a new launch configuration with the “new document” icon on the upper left side. This works provided that the Java Runtime Environment running the process to be debugged has been started with the option

-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=localhost:8000

where 8000 is the TCP port on which to listen (if that port is used by another program on your computer, you will have to change that both in the command line starting the JVM and the launch configuration). If the problem occurs too quickly for you to attach, you can change the n in suspend=n to y to ask the Java Runtime Environment to wait for a debugger before starting the Java main class.
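When attaching fails, it can help to first confirm that the option actually reached the JVM in question. The following is a minimal sketch, not part of the original workflow: the class name JdwpCheck and the helper findJdwpArgument are hypothetical, and the sketch assumes you are able to run a few lines of code inside the target process. It only inspects the JVM's own startup arguments through the standard java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JdwpCheck {

    /**
     * Returns the first JVM argument that enables the JDWP agent,
     * or null if no such argument is present.
     * (Hypothetical helper, introduced for illustration only.)
     */
    static String findJdwpArgument(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            // -agentlib:jdwp is the form used in this article;
            // -Xrunjdwp is the older spelling of the same agent option.
            if (arg.startsWith("-agentlib:jdwp") || arg.startsWith("-Xrunjdwp")) {
                return arg;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The arguments the running JVM was started with (not the program arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String jdwp = findJdwpArgument(jvmArgs);
        if (jdwp == null) {
            System.out.println("No JDWP agent configured; a debugger cannot attach.");
        } else {
            System.out.println("JDWP agent configured: " + jdwp);
        }
    }
}
```

If the program reports that no agent is configured, the option was most likely placed after the main class or the -jar argument, where it is treated as a program argument rather than a JVM option.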

If the problem occurs in a unit test run by Maven, you will want to select only the failing unit test via -Dtest=<class-name>. If your project is based on pom-scijava, you can configure Java to be started up appropriately by specifying

-DargLine=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=localhost:8000

(if your project is not based on pom-scijava, you will have to configure the argLine in the maven-surefire-plugin section in your pom.xml).

The GNU debugger gdb can attach to running C/C++ programs using the attach <pid> command. Microsoft Visual Studio can attach to running programs using Debug>Attach to Process..., too.

Debugging Jenkins jobs with a debugger

When a Jenkins job is failing, but you cannot reproduce the problem on your local machine (e.g. when it is a platform-specific bug), all is not lost. You can still use the attach-to-running-program strategy described above, but it is slightly more involved.

First of all, disable the Jenkins job. Really, you do not want Jenkins to interfere with your tests, or worse, your tests to interfere with Jenkins.

Open an SSH tunnel (because you are unlikely to have direct access to the TCP ports of the Jenkins node) by logging into the Jenkins node via:

ssh -L 8000:127.0.0.1:8000 <jenkins-node>

This will ask ssh to listen on port 8000 of your local machine and, once the debugger connects to it, forward the connection to the remote side’s 127.0.0.1 (localhost) port 8000. Now you can use the strategy described above for attaching to Maven while it runs unit tests.

If you cannot access the Jenkins node via ssh, all is not lost, as long as you can connect somehow (e.g. VNC, remote desktop) and call

ssh -R 8000:127.0.0.1:8000 <your-development-machine>

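on the Jenkins node. This reverse tunnel asks the SSH server on your development machine to listen on port 8000 and forward connections to port 8000 on the Jenkins node. If your development machine cannot be accessed from the Jenkins node via ssh, you can still tunnel by using a publicly accessible SSH server in the middle.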

Preventing future problems

It is human nature to assume that others’ shortcomings do not apply to oneself. However, experience shows that there are no exceptions when it comes to reliving Maurice Wilkes’ story of realizing that more time is spent on figuring out causes and fixes to problems than is spent on introducing said problems.

To that end, it is a remarkably clever idea to introduce regression tests, i.e. small pieces of code that verify that certain functionality of the software works as expected (see also Unit Testing (JUnit) in Developing Software @ LOCI).

It is also a remarkably clever idea to introduce regression tests just after fixing bugs. After all, whenever a bug was found by a user, and had to be diagnosed and fixed by a developer, it demonstrates an obvious lack of regression tests.
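As an illustration of a regression test added right after a bug fix, here is a minimal sketch in plain Java. Everything in it is hypothetical: it assumes a helper parsePort that extracts the port from an address such as localhost:8000, and an imagined bug report saying that an address without a host part (just 8000) used to be rejected. In a real project the checks would normally live in a JUnit test class; plain Java is used here only to keep the example self-contained.

```java
public class AddressParserRegressionTest {

    /**
     * Hypothetical helper under test: extracts the port from
     * "host:port" or from a plain "port" string.
     */
    static int parsePort(String address) {
        String trimmed = address.trim();
        int colon = trimmed.lastIndexOf(':');
        // The (imagined) fix: accept an address without a host part.
        String port = colon < 0 ? trimmed : trimmed.substring(colon + 1);
        return Integer.parseInt(port);
    }

    /** Tiny stand-in for a test framework's assertEquals. */
    static void assertEquals(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // Behavior that already worked before the fix must keep working.
        assertEquals(8000, parsePort("localhost:8000"));
        // The exact input from the imagined bug report, pinned down forever.
        assertEquals(8000, parsePort("8000"));
        System.out.println("all regression checks passed");
    }
}
```

The second check is the important one: it encodes the input that triggered the bug, so the same mistake cannot return unnoticed.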