
How can successfully pre-verified changes cause regressions that should have been caught?

Submitted by: @import:stackexchange-devops

Problem

In a CI context, one commonly used measure for raising the quality of the integration branch is a mandatory set of pre-commit quality verifications (typically building some artifacts, running unit tests, and even some feature/integration tests).

Yet some regressions (build breakages, various test failures) are detected by the CI system verifications in exactly the areas which were supposed to be covered by these mandatory pre-commit verifications.

During analysis of these regressions, an argument often heard is that the developer who committed the change identified as the root cause of the regression had successfully passed all such verifications. And often the claim is supported by hard evidence indicating that:

  • after the final version of the change was reached, it was ported to a fresh workspace based on the tip of the branch
  • the required artifacts were built from scratch (so the build was totally fine, no cache-related issues, etc.)
  • all mandatory tests passed, including those that cover the area in question and should have detected the regression
  • no intermittent false positives affected the respective verifications
  • no file merges were detected when committing the change to the branch
  • none of the files being modified was touched by any other change committed to the branch since the fresh workspace was pulled

Is it really possible for a software change to cause such a regression despite correctly following all the prescribed processes and practices? How?

Solution

There's one possibility I can think of: the devs work on their own workstations, sometimes with images baked for VirtualBox, while your infrastructure doesn't use the exact same image.

While developing a feature, the dev needs to add a JVM parameter, or make some other change to the middleware, early in the work, and then forgets about it.

Before committing, all unit/integration tests run on the workstation pass, and since the baked image is shared, it works on every developer's system.

But when going through CI, it fails because the change to the middleware wasn't applied there, either because the dev forgot to ask for it, or because the team in charge of updating the base images/provisioning system didn't have the time or forgot to update the system.

It's a good thing that it breaks in CI, because it tells you early, before going into production, that the system won't work as expected. But sometimes finding the missing parameter becomes hell.
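One way to make the hunt for the missing parameter less painful is to compare the JVM options of the two environments mechanically. The following is only a minimal sketch: the function names and the sample option strings are hypothetical, and it assumes you can obtain the full option string from both the workstation image and the CI machine.

```python
import shlex


def parse_jvm_options(option_string):
    """Split a JVM option string into a {name: value} mapping.

    Only -Dkey=value system properties are split into name and value;
    any other flag (e.g. -Xmx1g) is kept whole as the name, with value None.
    """
    options = {}
    for token in shlex.split(option_string):
        if token.startswith("-D") and "=" in token:
            name, value = token.split("=", 1)
        else:
            name, value = token, None
        options[name] = value
    return options


def diff_jvm_options(workstation, ci):
    """Return (options missing in CI, options whose values differ)."""
    ws_opts = parse_jvm_options(workstation)
    ci_opts = parse_jvm_options(ci)
    missing = sorted(set(ws_opts) - set(ci_opts))
    changed = sorted(
        name for name in set(ws_opts) & set(ci_opts)
        if ws_opts[name] != ci_opts[name]
    )
    return missing, changed


if __name__ == "__main__":
    # Hypothetical option strings, for illustration only.
    workstation = "-Xmx1g -Dfile.encoding=UTF-8 -Dfeature.cache=true"
    ci = "-Xmx1g -Dfile.encoding=ISO-8859-1"
    missing, changed = diff_jvm_options(workstation, ci)
    print("missing in CI:", missing)
    print("different value:", changed)
```

Because flags such as `-Xmx` are compared as whole tokens, a different heap size shows up as a missing option rather than a changed value, which is good enough for spotting drift.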

This last point argues for not rejecting commits and instead letting CI break on a feature branch. That way it doesn't block anyone else, and the dev can fix the problem early, when the change is needed, which prevents the change from being forgotten along the way.

FWIW, we had exactly this here: developers had full access to the development machines, and releases in Q/A were failing because a parameter change had been forgotten. We moved to Chef to handle the configuration of the middleware (Tomcat now), so each needed change to the infrastructure has to be coded somewhere and will be reproduced in all environments.
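The idea of coding each infrastructure change somewhere can be illustrated without Chef itself. The sketch below is a Python stand-in, not a Chef recipe: it assumes a hypothetical, version-controlled `REQUIRED_JVM_OPTIONS` declaration and renders from it the content of Tomcat's `bin/setenv.sh`, so every environment gets the same `CATALINA_OPTS`.

```python
import shlex

# Hypothetical declaration, kept in version control next to the application.
# A value of None means the flag has no "=value" part.
REQUIRED_JVM_OPTIONS = {
    "-Xmx1g": None,
    "-Dfile.encoding": "UTF-8",
    "-Dfeature.cache": "true",
}


def render_setenv(options):
    """Render the content of a Tomcat bin/setenv.sh from declared options."""
    tokens = [
        name if value is None else f"{name}={value}"
        for name, value in sorted(options.items())
    ]
    # Quote the joined options so the shell treats them as a single value.
    return f"export CATALINA_OPTS={shlex.quote(' '.join(tokens))}\n"


if __name__ == "__main__":
    print(render_setenv(REQUIRED_JVM_OPTIONS), end="")
```

A developer who needs a new parameter adds it to the declaration in the same change as the feature, so the workstation image, CI, and Q/A are all provisioned from one source instead of relying on someone remembering to ask.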

Context

StackExchange DevOps Q#398, answer score: 5
