From black box to white box testing with AI-supported observability

Load testing should be standard practice, especially for business-critical or revenue-relevant applications. However, classic black box performance tests are reaching their limits with current architectures. An AI-supported white box approach allows better conclusions about the causes of performance problems.

Launching a new website, web shop or digital customer service without first running a load test is a gamble. Nevertheless, many companies still take that risk. This is negligent, because the worst-case outcome is well known: the system responds slowly, incompletely or only with error messages. An ad campaign can then come to nothing, or an investment in PR may not pay off, because the system collapsed under the onslaught of users. This is expensive for companies and also leaves a negative impression.

The growing complexity of modern architecture

Load tests can prevent this problem from arising in the first place. They are among the classics of quality assurance for IT systems. In the classic black box test of a web server or shop, load generators put the target system under gradually increasing stress. Its behavior is measured, and the selected parameters are logged and evaluated. In this way, it can be determined how many requests can be processed per unit of time before the system becomes too slow and the user experience suffers.
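
The stepwise ramp described above can be sketched in a few lines of Python. This is a minimal illustration, not a real load generator: the function simulated_response_time, the capacity of 200 users and the 0.5 second limit are all hypothetical stand-ins for measurements that a tool such as JMeter would collect against a real system.

```python
def simulated_response_time(concurrent_users: int) -> float:
    """Hypothetical target system (assumption): 0.2 s response time up to
    200 concurrent users, degrading quadratically beyond that."""
    base_seconds = 0.2
    capacity = 200
    if concurrent_users <= capacity:
        return base_seconds
    return base_seconds * (concurrent_users / capacity) ** 2


def find_max_load(steps, max_response_time):
    """Raise the load step by step and return the last step that stayed
    within the response-time limit, or None if even the first step failed."""
    last_ok = None
    for users in steps:
        response_time = simulated_response_time(users)
        if response_time > max_response_time:
            break  # the system is now too slow, stop ramping up
        last_ok = users
    return last_ok


# Ramp from 50 to 500 users in steps of 50 with an assumed 0.5 s limit.
max_users = find_max_load(range(50, 501, 50), max_response_time=0.5)
print("Highest load step within the limit:", max_users)
```

Note that this black box view yields only one number, the load at which the system as a whole becomes too slow. It says nothing about which component is responsible.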

However, technology stacks are becoming increasingly complex and extensive. Even an average website today integrates several frameworks, technologies and languages. Numerous microservices can be involved before the end customer puts a product in the shopping cart. If the bottleneck lies deep within such a structure, finding it becomes a complex matter. The classic black box approach to load testing can therefore establish that something is wrong, but can hardly pinpoint the causes. Such a load test delegates the error analysis to the development or operations team. If performance is inadequate at a certain load, they have to find out whether the fault lies with one of the components used or with an interface.

Load testing needs to get smarter

The smarter solution is to combine load testing with monitoring tools, turning the black box test into a white box load test. The monitoring then provides a continuous overview of the resources and the status of applications across all layers of the infrastructure.
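
What the white box view adds can be illustrated with a small Python sketch. It assumes that monitoring reports how long each component spent on a single slow request. The component names and timings below are invented for illustration and do not come from any real system or tool.

```python
def slowest_component(timings_ms):
    """Return the (name, milliseconds) pair of the component that
    contributed the most time to the request."""
    name = max(timings_ms, key=timings_ms.get)
    return name, timings_ms[name]


# Hypothetical per-component timings for one slow request (assumption).
request_timings_ms = {
    "web-frontend": 40,
    "cart-service": 35,
    "inventory-service": 610,
    "database": 55,
}

# A black box test only sees the total response time.
total_ms = sum(request_timings_ms.values())
print("Total response time (black box view):", total_ms, "ms")

# The white box view shows where the time was actually spent.
name, ms = slowest_component(request_timings_ms)
print("Largest contributor (white box view):", name, "with", ms, "ms")
```

In a real setup these numbers would come from the monitoring or tracing tool rather than a hand-written dictionary.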

With this approach, the use cases and test steps are created and parameterized in the test tool. A combination of JMeter and Jenkins can form the central control unit, with Jenkins taking on the role of the automation server. A major advantage over black box load tests is the integrated monitoring, which provides more transparency because it tracks the load behavior continuously.

AI can also help monitor application performance and identify anomalies. There is nothing wrong with developing your own AI for this, but doing so involves considerable additional effort.

The use of AI allows anomalies to be detected earlier. Using various algorithms, such a system derives a forecast from past data, for example from the past seven days. If the response time of a service deviates from this forecast, the system automatically raises an alarm and points to an unexpected increase in load (e.g. CPU usage). This is a decisive advantage over classic monitoring, which is based on threshold values, i.e. static parameters. Even during the load test, the integrated AI monitoring reveals performance weaknesses and can thus identify an application that may be undersized.
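
The difference between a static threshold and a baseline learned from past data can be shown with a deliberately simple Python sketch. It is only a statistical stand-in: commercial AI monitoring uses far more sophisticated models, and the seven daily averages, the 500 ms threshold and the tolerance of three standard deviations are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_anomaly(history, current, tolerance=3.0):
    """Flag `current` if it deviates from the mean of `history` by more
    than `tolerance` sample standard deviations."""
    expected = mean(history)
    spread = stdev(history)
    return abs(current - expected) > tolerance * spread


def exceeds_static_threshold(current, threshold):
    """Classic monitoring: alarm only when a fixed limit is crossed."""
    return current > threshold


# Hypothetical average response times in ms for the past seven days.
history_ms = [120, 118, 125, 122, 119, 121, 123]
current_ms = 180

# The static limit of 500 ms (assumption) stays silent at 180 ms ...
print("Static threshold alarm:", exceeds_static_threshold(current_ms, 500))
# ... while the learned baseline flags the jump from about 121 ms.
print("Baseline anomaly alarm:", is_anomaly(history_ms, current_ms))
```

The baseline check reacts to a value that is unusual for this particular service, even though it is still far below the fixed limit. That is the earlier detection the approach aims for.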

Intelligent monitoring that learns with the help of AI creates a holistic view of the entire IT landscape and all its applications. Operations and development teams collaborate more closely in line with the DevOps principle, since everyone involved, from every area, can now take part in analyzing a problem. This significantly speeds up troubleshooting and makes collaboration more efficient.

It is best to start during development

It has long been known, and not only since the work of Barry Boehm, that the cost of fixing errors rises the later they are discovered. It is therefore short-sighted to establish white box load tests, including monitoring and AI, only shortly before the launch of a new service. The monitoring also provides important insights during ongoing operation. It makes more sense to start the performance tests outlined here during development. Combining load testing and monitoring over the entire release cycle provides assurance that reliable, high-performance releases are rolled out at every step.

The overall system is thus continuously checked for reliability. On the one hand, users benefit because they get an optimal user experience. On the other hand, the DevOps team can discover errors and problems faster. This is a win for the entire organization, as it is no longer caught off guard by IT disruptions that are business-critical or affect revenue. Professionally set up load tests thus help to secure sales.


Raphael Pionke has been a DevOps engineer since 2013 and an IT architect in the area of application performance management since 2019. During this time he has supported numerous projects for SMEs, large companies and the public sector. Since 2020, as a quality architect, he has been working intensively on the topics of performance, DevOps, IaaC, cloud and the implementation and application of intelligent APM solutions.

Daniel Bell has been working in the field of software quality assurance since 2007. During this time he has supported numerous projects with a technical focus on performance testing. Since 2018 he has been working as a senior IT architect in the area of application performance management, dealing with the implementation and use of intelligent APM solutions. His focus is on optimizing and continuously monitoring the performance of complex IT systems.

