Why write a failing 🔴 test?

We all know the red-green-refactor technique, but I was never convinced that I should write a failing test first. I used to think, “Why would I want my test to fail?” Here is the story of how I learnt the answer the hard way. It was a time when I lost all my confidence, and I wish nobody else has to go through the same.

My Role in the Team

When someone's role is “DevOps”, you can't really say what they do. If I had to boast, I would say: “I work towards the overall delivery of the software and make sure the services are delivered on time, as per the quality standards we have defined.” But that's not the whole truth. I was a regular system administrator when I joined the team, and I didn't work alone: there were always at least two people on the team dedicated to managing the build infrastructure. So, just like pair programming, I did pair administration.

Because our build agents (MacBooks and Linux VMs) had become so stable, we could pick up other IT operations tasks as well, such as introducing fastlane and a team password manager, and managing pipelines in GoCD. From a distance it might appear that we set our own priorities and did whatever we pleased, but there is a catch: if any misconfiguration was found on the agents, or a new tool had to be configured, Gopal Singhal (my pair) and I had to tackle it as soon as possible, because the whole team gets blocked by the slightest issue with the build agents. It boils down to this: any machine added to our CI build pool must be tested thoroughly beforehand.

The Task

We had only one version of Xcode installed on our agents; we deliberately kept away from having multiple versions of Xcode on the same machine. Then a requirement came in to move from Xcode 9.0 to Xcode 9.1. This looked pretty simple to me: I had to set up a machine exactly as we had done earlier, the only difference being the Xcode version. I provisioned a machine and ran some jobs on it. All went well, except that one of the visual comparison tests was failing. I looked at its logs, which said the test broke because the difference between the expected baseline image and the actual screen was way beyond the acceptable threshold (3%). I suspected the test had failed because some older artefact was being fetched, so I inspected all of its dependencies: code, artefacts, provisioning profiles, configuration, environment variables, network configuration, motherboard, RAM, CPU cycles (pun intended). Nothing helped. I spent over two weeks trying to understand why the job failed on the new machine and didn't get a single clue. I lost all my confidence. I was hesitant to tell the team, “I can't make head or tail of why the visual tests fail on the machine I set up.”
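For context, here is a minimal sketch of what such a visual comparison step boils down to. This is not our actual test code: it assumes ImageMagick's compare CLI is on the PATH, and the file names and helper are illustrative; only the 3% threshold comes from the story above.

```ruby
require "open3"

THRESHOLD = 0.03 # the acceptable difference from the story: 3%

# Compare a screenshot against the expected baseline image using
# ImageMagick's `compare` CLI. With `-metric RMSE`, compare prints
# something like "1622.9 (0.0248)" on stderr; the value in parentheses
# is the difference normalised to the 0..1 range.
def visual_diff(baseline, actual)
  _out, err, _status = Open3.capture3(
    "compare", "-metric", "RMSE", baseline, actual, "diff.png"
  )
  err[/\(([\d.]+)\)/, 1].to_f
end

diff = visual_diff("baseline.png", "screenshot.png")
raise "Visual test failed: #{(diff * 100).round(2)}% difference" if diff > THRESHOLD
```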

The New Task

Now the problem had reversed. It changed from “Why is it failing on the new machine?” to “Why was it not failing on the existing machines in the CI pool?”

Solution

The solution was to update the ImageMagick version on the existing machines, as it was too old. The older version was not compatible with one of our Ruby gems, so the visual comparison never actually ran, and the tests always reported success.
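The post doesn't show the gem's internals, but this failure mode is a classic one. Here is a hypothetical reconstruction of it, along with a stricter variant; the helper names are made up, while the exit codes are ImageMagick's documented ones (compare exits 0 when images match, 1 when they differ, and 2 when the comparison itself fails).

```ruby
require "open3"

THRESHOLD = 0.03

# BUGGY (hypothetical): a rescue-all turns "the comparison tool crashed"
# into "no difference found", so an incompatible ImageMagick makes every
# visual test silently pass.
def images_match?(baseline, actual)
  visual_diff(baseline, actual) <= THRESHOLD # visual_diff from the earlier sketch
rescue StandardError
  true
end

# STRICTER: check compare's exit status first, and treat a tool failure
# as an error, never as a pass.
def images_match_strict?(baseline, actual)
  _out, err, status = Open3.capture3(
    "compare", "-metric", "RMSE", baseline, actual, "diff.png"
  )
  raise "compare itself failed: #{err}" if status.exitstatus >= 2
  err[/\(([\d.]+)\)/, 1].to_f <= THRESHOLD
end
```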

Conclusion

Now I either write a failing test first, or I purposely check that the test really does fail when expected !== actual. For example, when I need to run a visual test, even if it passes I will also compare the application's UI with my own photograph. And if that comparison doesn't fail: bye-bye, tests. Because a test which doesn't test anything at all doesn't need to exist at all.
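As a sketch, that sanity check could look like this, reusing the visual_diff helper and threshold from above. The “unrelated photo” can be any image that obviously differs from the app's UI; the file names are illustrative.

```ruby
# Prove the test CAN fail before trusting a green run: an image that is
# obviously not the app's UI must exceed the threshold.
diff = visual_diff("my_photograph.png", "screenshot.png")
if diff <= THRESHOLD
  abort "Visual testing is broken: an unrelated photo 'matched' the UI!"
end
puts "Good: the comparison can fail (#{(diff * 100).round(2)}% difference)."
```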
