Why write a failing 🔴 test?

Sagar Maurya
4 min read · Mar 31, 2018

We all know the red, green, refactor technique. But I was never convinced that I should write a failing test first. I thought, “Why would I fail my test?” So here is my story of how I learnt it the hard way. It was a time when I lost all my confidence. I hope nobody has to go through the same.

My Role in the team

If someone has the role of DevOps, you can’t really say what he/she does. If I had to boast: “I work towards overall delivery of the software and make sure the services are delivered on time as per the quality standards defined.” But that’s not the truth. I was a regular system administrator when I joined the team. I didn’t work alone. There were always at least two people on the team dedicated to managing the build infrastructure. So, just like pair programming, I did Pair Administration. Because our build agents (MacBooks and Linux VMs) had become so stable, we could pick up other IT operations tasks as well, which included introducing fastlane, a team password manager, managing pipelines in GoCD, etc. It might appear from a distance that we decided our own priorities and did things according to our will. But there is a catch. If any misconfiguration was found on the agents, or a new tool had to be configured, my pair and I needed to tackle it ASAP. The whole team gets blocked by the slightest issue with the build agents. Hence, it boils down to this: any machine that is to be added to our CI build pool should be tested thoroughly beforehand.

The Task

We had only one version of Xcode installed on our agents; we stayed away from having multiple versions of Xcode on the same agent. A requirement came to move from Xcode 9.0 to Xcode 9.1. This looked pretty simple to me. I had to set up a machine as we did earlier; the only difference would be the Xcode version. I provisioned a machine and ran some jobs on it. All went well, except that one of the visual comparison tests was failing. I looked at its logs. They said that the visual comparison test broke due to the difference between the expected baseline image and the actual screen. It was way more than the acceptable difference (3%). So I thought the test might have failed because some older artefact was being fetched. Hence, I inspected all of its dependencies: code, artefacts, provisioning profiles, configuration, environment variables, network configuration, motherboard, RAM, CPU cycles (pun intended). But nothing helped. I spent over two weeks trying to understand why the job failed on the new machine, but I didn’t get any clue. I lost all my confidence. I was hesitant to tell the team, “I can’t make head or tail of why the visual tests fail on the machine set up by me.”

After investing two weeks and being in a dilemma about whether to ask for help or not, I decided to consult one of the QAs. He told me that it was a valid failure. The test broke because the baseline really was very old and the application’s UI had changed a lot.

The New Task

Now the problem had reversed. From “Why is it failing on the new machine?”, it changed to “Why was it not failing on the existing machines in the CI pool?”

I was happy and sad, both at the same time. Happy, because I did set up the machine correctly. Sad, because I lost confidence in the tests that we had. How could we not have figured out that we had a test which tested nothing at all? Every time we checked in code, we ran the tests and they went green. Nobody thought that, with so many changes in the application’s UI, at least one visual comparison test should definitely fail. For months, no one had any suspicion.


The solution was to update the ImageMagick version, as it was too old. The earlier version was not compatible with one of our Ruby gems. Hence, the visual comparison did not really run at all, and so the tests were always reported as successful.

While setting up a machine, we always installed the latest version of ImageMagick. The existing CI machines had an older version of ImageMagick, which was the latest at the time those machines were set up. The one I had recently set up had the newer version. Hence the tests were actually testing, and then failing, only on the machine under inspection (i.e. the machine I was testing).
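Here is a minimal sketch of how this kind of silent pass can happen. It is not our actual code (our setup involved a Ruby gem and ImageMagick). It is a hypothetical Python illustration, and every name in it (`compare_images_unsafe`, `compare_images_strict`, `broken_tool`, `visual_test_passes`) is made up for the example. The assumption is that the comparison helper treats a broken image tool as "no difference", so the test stays green even though nothing was compared.

```python
# Hypothetical illustration: a comparison helper that hides tool failures.

def compare_images_unsafe(baseline, actual, diff_tool):
    """Return the fraction of differing pixels, or 0.0 if the tool breaks."""
    try:
        return diff_tool(baseline, actual)
    except Exception:
        # Bug: a broken tool is silently reported as "no difference".
        return 0.0


def compare_images_strict(baseline, actual, diff_tool):
    """Return the fraction of differing pixels; let tool errors propagate."""
    return diff_tool(baseline, actual)


def broken_tool(baseline, actual):
    """Stand-in for an image tool that is incompatible with its caller."""
    raise RuntimeError("incompatible image library version")


def visual_test_passes(compare, baseline, actual, diff_tool, threshold=0.03):
    """The test passes when the difference is within the 3% threshold."""
    return compare(baseline, actual, diff_tool) <= threshold
```

With `compare_images_unsafe` and `broken_tool`, `visual_test_passes` returns `True` for any pair of images. With `compare_images_strict`, the same call raises `RuntimeError`, so the broken setup is noticed immediately instead of months later.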


Now I either write a failing test first or purposely check that the test really does fail when “expected !== actual”. For example, say I need to run a visual test. Even if it passes, I will compare the application’s UI with my photograph. And if that doesn’t fail: bye-bye, tests. Because tests which don’t test anything at all don’t need to exist at all.
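The "compare it with my photograph" check can be sketched as a small canary that runs alongside the real tests. This is only an illustration under simplifying assumptions: images are plain lists of 0-255 pixel values, and the names `diff_ratio`, `images_match` and `check_comparison_can_fail` are hypothetical, not from any library.

```python
# Hypothetical canary: prove that the comparison is able to fail.

def diff_ratio(baseline, actual):
    """Fraction of pixels that differ between two equally sized images."""
    if not baseline or len(baseline) != len(actual):
        raise ValueError("images must be non-empty and the same size")
    changed = sum(1 for b, a in zip(baseline, actual) if b != a)
    return changed / len(baseline)


def images_match(baseline, actual, threshold=0.03):
    """True when the difference is within the acceptable threshold (3%)."""
    return diff_ratio(baseline, actual) <= threshold


def check_comparison_can_fail(compare, baseline):
    """Feed the comparison an obviously wrong image; it must be rejected."""
    # Inverting every pixel guarantees that every pixel differs.
    unrelated = [255 - p for p in baseline]
    if compare(baseline, unrelated):
        raise AssertionError("visual comparison accepted an unrelated image")
```

Calling `check_comparison_can_fail(images_match, baseline)` before trusting a green run costs almost nothing. A comparison that always says "match" makes it raise `AssertionError`, which is exactly the signal we were missing for months.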
