Posts

Showing posts from December, 2023

Terminology of CICD

To be clear, there are two types of software: libraries and applications. Libraries are used by applications to provide some key piece of functionality. Applications are always deployed to some Environment and run to perform computation. It should always be clear which of the two types a given piece of code implements. Building and publishing a new version of the source code that is ready for use is called Releasing the library or application. A release is always associated with a new version number. The act of switching a given environment to a new version of an application is called a Deployment. Other system or data changes that affect the state produced by the application, or its general behavior, may also count as deployments. All such changes should be tied to code changes, which makes them easy to track.
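One way to make these terms concrete is to model them as plain data. The sketch below is only an illustration: `Release`, `Deployment`, `deploy`, `current_version`, and the field names are hypothetical and do not describe any particular CI/CD tool.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Release:
    """A built and published version of a library or application."""
    name: str
    version: str  # every release gets a new version number
    commit: str   # the code change the release was built from


@dataclass(frozen=True)
class Deployment:
    """A record that an environment switched to a given release."""
    environment: str
    release: Release


def deploy(history: Tuple[Deployment, ...], environment: str,
           release: Release) -> Tuple[Deployment, ...]:
    """Return a new history with one more deployment appended."""
    return history + (Deployment(environment, release),)


def current_version(history: Tuple[Deployment, ...],
                    environment: str) -> Optional[str]:
    """Version most recently deployed to `environment`, or None."""
    for deployment in reversed(history):
        if deployment.environment == environment:
            return deployment.release.version
    return None
```

Because each deployment record points at a release, and each release at a commit, the question "what is running in this environment and which code change put it there" can be answered from the history alone.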

On Testing

Test a piece runs

A basic building block of a codebase is a function. So for each building block, let's at the very least test that it can be called with some reasonable input and produces the right result. If you feel that's not needed, perhaps you don't need that function at all. This is the base case for testing, and it prevents breaking what already works. It should be enabled to run on all PRs. Every test case should have a human-understandable description (LLMs will be able to generate those tests soon, so better be ready).

Test it runs end to end

Once a bunch of functions are composed, it gets tricky to check that it all works. So let's test that the thing can be run end to end and produces something sensible at the end. This test can be enabled on all PRs and should definitely be run before any release.

Test pieces produce similar results

Once a thing runs end to end it will produce a result. But it's hard to say that the result is right without anything to compare it to. So instead, we will test...
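As a minimal sketch of the first two kinds of test described above, assume a tiny text pipeline. The functions `normalize_whitespace`, `count_words`, and `summarize` are made up for illustration; the tests are plain functions with `assert`, written so that a runner such as pytest would pick them up, and each carries a human-readable description in its docstring.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return " ".join(text.split())


def count_words(text: str) -> int:
    """Count whitespace-separated words; empty input has zero words."""
    cleaned = normalize_whitespace(text)
    return len(cleaned.split(" ")) if cleaned else 0


def summarize(text: str) -> dict:
    """Compose the building blocks end to end."""
    cleaned = normalize_whitespace(text)
    return {"text": cleaned, "words": count_words(cleaned)}


def test_normalize_whitespace_collapses_runs():
    """Repeated spaces, tabs and newlines become single spaces."""
    assert normalize_whitespace("  a \t b\n c ") == "a b c"


def test_summarize_runs_end_to_end():
    """The whole pipeline runs on realistic input and returns a sensible result."""
    result = summarize("  hello   world \n")
    assert result == {"text": "hello world", "words": 2}
```

The first test covers a single building block; the second only checks that the composed pipeline runs and returns something plausible.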

Perils of classes and pointers

When trying to understand or debug a piece of code, the chance of success is inversely proportional to the amount of state one needs to keep in mind. This leads to the obvious advice: limit the scope of each function to its essential components. However, certain programming concepts also add to the amount of state one needs to track. The first is the use of classes, because class attributes are accessible in every method and may change between calls. The second is the pointer (or reference). Pointers are problematic because, unless explicitly managed as in Rust, they make it difficult to determine who owns the data at a given point in the computation. This leads to uncertainty about what state a given variable may hold. Be very careful when adding classes and references to your code. Sometimes a pure function is just better.
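A small sketch may make both hazards concrete. All names here (`RunningTotal`, `add_to_total`, `append_in_place`, `append_copy`) are hypothetical and chosen only for illustration.

```python
class RunningTotal:
    """Stateful version: the result of add() depends on every earlier call."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, amount: int) -> int:
        self.total += amount  # hidden state changes between calls
        return self.total


def add_to_total(total: int, amount: int) -> int:
    """Pure version: the result depends only on the arguments."""
    return total + amount


def append_in_place(items: list, item) -> list:
    """Mutates the caller's list through the shared reference."""
    items.append(item)
    return items


def append_copy(items: list, item) -> list:
    """Returns a new list and leaves the caller's data untouched."""
    return [*items, item]
```

Calling `RunningTotal.add(5)` twice gives two different answers, while `add_to_total(0, 5)` always gives the same one. Likewise, after `append_in_place(original, 3)` the caller's `original` list has silently changed, whereas `append_copy` leaves it as it was.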

On engineering realism

A lot of engineering advice is normative in form: “you should do X and avoid Y, because Y is bad engineering”. The nagging character of such statements can create a feeling that Y shouldn’t exist at all. After all, it’s bad engineering, so why should it be tolerated? There are of course explanations that point to user or business needs being more important than engineering concerns. Beyond that, our attitude towards bad engineering should take a different form. Rather than deny its existence, we should acknowledge that bad engineering simply exists and focus our efforts on creating an environment where it cannot flourish. So when we point out some bad piece of engineering, we are simply stating a fact. We are not trying to accuse or attack anyone. Besides fixing the one piece which we believe is broken, we should also ask: why did it happen? What can we change about our process to make it less likely to happen again? In essence, we should be realists. We should feel that ...