Examples from this blog post currently run on a front-end automation suite running Ruby 2.2.0. Jenkins is used as a CI server to run tests from.
The automation test suite here at Venntro has been growing steadily over the past few years, and is now able to test most of the user-facing features on the platform. However, there are some features which have always been difficult for us to automate tests for.
Search and member feeds (which members we recommend to the user on different parts of the site), have been notoriously difficult because once you make changes to the data, they need to be reindexed by the Sphinx search server before changes are actually effected on the front-end. This runs every few minutes on the test server, but as any test automation specialist will tell you, a few seconds is too long for a test to be waiting, let alone a few minutes!
We needed to be able to force this index to be rebuilt, which required us to be able to access the “scary world” of the command line on the test environment host server.
First Iteration: Basic Commands
It turns out Ruby has many ways to access the command line built-in, and the most simple way to do this was to just encase the command we wanted to run in backticks:
One of the documented disadvantages of using this method is that it blocks processing until the command is complete. Thankfully, for a test suite, this actually becomes an advantage; we don’t actually want the test to continue running until the indexing is complete.
Second Iteration: Combatting Stalls
It didn’t take long using this method before tests started to get stuck at this point. Unfortunately, there are many things that can happen when using the command line to cause your test to stall. For example, our script uses a sudo command, and if the user running the test does not have sudo access for the action specified, the command line will prompt for a password. As this task is being run in the background by Ruby, you will never see this prompt.
To combat stalling we reused a tactic from our front-end tests, wrapping the call in a simple timeout:
So, if the call takes more than 30 seconds, the test will abort as a failure, and the cause for it taking so long can be investigated.
Third Iteration: Output
Debugging was the next issue to address. It was great that we now had the ability to run command line operations in our test suite, and a way to deal with the most crippling issue, stalling, but there were other issues that needed to be debugged. If a test failed for a missed expectation after appearing to have executed the command without fail, we had no way to see what had actually happened during the call.
The command to rotate all the Sphinx indices is an interesting one to inspect. It runs through each Sphinx index, performs the rotation, and moves on to the next regardless of whether it successfully performed the rotation or not. Even if a number of the rotations fail, and these failures are shown in the output to the user, once the command is completed, Ruby does not understand these fails. As far as it can tell, the command was completed successfully.
It turns out getting hold of this output is as easy as assigning it to a
variable, so that’s what I did, and immediately output it using the
Not long after implementing this functionality, our test environment was rebuilt on a new server. By seeing this output in our test automation console output, I could alert our Operations team to the fact that the Sphinx user being used did not have access to a specific database, which was causing the index rotation to quietly fail unbeknownst to anyone not closely monitoring the logs.
This worked for some specific scenarios, but one of the review comments I got from my colleague when I proposed it was:
I tested this by reducing the timeout down to a few seconds, and sure enough
found that when using this backticks methods, the output will only be written to
output on completion of the call. Even if the timeout does its
puts output is moved to a rescue block, the variable
output will be
It was time to dig deeper into Ruby’s array of command line tools.
Fourth Iteration: IO.pipes
After a bit of research into the challenge, I stumbled across an awesome StackOverflow answer, which gave a fantastic basis for what would eventually be the method we use to execute command line operations:
There’s quite a lot going on, so I’ll explain piece by piece.
The Setup and Begin block
We start by creating two IO pipes. One will record STDOUT (standard
output stream from the command), and one will record STDERR (standard error
stream). We then set variables
stderr to nil.
After printing out the command being executed, we spawn a new Ruby Process
to run the command given. We make a new process group to run it using
and assign the write ends of our two new IO.pipes to record the output and error
streams. We store the process number as variable
Next, we start our timeout block and use
Process.waitpid(pid) to wait for the
process we spawned to stop. If the time taken to do this does not exceed the
number of seconds specified (30 by default), we close the two write ends of the
IO.pipes (we cannot read from the read ends if not), and then read the read ends
of each pipe, assigning them to the specified variables to print out later.
The Rescue block
If the Begin block does time out, the rescue block will clean up. Firstly, it
will terminate and detach the process we started so it does not either become a
zombie process or affect any other tests unexpectedly. Then a similar process of
closing and reading the IO pipes is performed. We also set a local variable,
true for use in error reporting later.
The Ensure block
The ensure block is run no matter the outcome of the Begin or Rescue blocks, and simply closes any ends of IO pipes that may still be open, and then prints out all the output and all the errors to the console if they exist.
raise is performed if the rescue block was run. This will
immediately fail the test being run, with a very clear message; this particular
command being run took an unexpectedly long time to complete. This, along with
all the output and errors that were written right up to the time of failure or
timeout should give all the information needed to analyse the output.
Hints & Tips
When running commands on a remote host, whether you have access to run those commands is going to be an issue. You should investigate ssh access, tty, known hosts, and use of the sudo command; you will probably need to have a discussion with your Ops team to discuss which commands need running, by which users, and how to do that in a secure way. I found a useful way to ensure if your CI user has access to run a command was to ssh to the CI server as the user which runs automation (in our case, Jenkins uses the default user ‘jenkins’) and trying to run the command.
Any test which requires waiting for a cron job to do something is a prime
candidate for accessing the command line and kicking off the process yourself,
as cron jobs do not by default run at intervals smaller than 1 minute. I’ve also
found that if you need to hit a url in order to trigger an action, but don’t
want your test to navigate away from the page you’re currently on, using this
method to run a
curl command to hit the url works perfectly.
We use Cucumber to organise and execute our test scenarios, which supports
tagging. We tag scenarios depending on which command line operations are run
during the test, for example,
@ssh, so they can be run as a batch if required,
or excluded from test runs if we are experiencing issues with ssh.
Conditionally printed output
If you don’t want all the output printed on every test run, with a bit of tweaking you could:
- Only print the output if the call fails or the timeout is exceeded
- Add an additional argument to the
exec_with_timeoutmethod to control whether output is printed or not
- Create an environment variable to control whether output is printed or not