Exploratory Performance Testing Using Reinforcement Learning

Ahmad, Tanwir; Ashraf, Adnan; Truşcan, Dragoş; Porres, Iván

doi:10.1109/seaa.2019.00032

Cited by 16 publications

(30 citation statements)

References 18 publications

(24 reference statements)

“…In our previous work [9], the selection of actions was restricted to a finite discrete action space, which means that at every time step t, the agent could either increase or decrease the value of a single input in the current state by a fixed amount in order to get the next state. Therefore, the agent had to go through numerous unrewarding states in order to get to the rewarding regions of the input space, which resulted in decreasing the overall bottleneck detection rate of the approach.…”

Section: Action Space

mentioning

confidence: 99%
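The discrete action space described in the statement above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `(input_index, direction)` encoding, and the clipping to bounds are assumptions made for illustration. It shows why a fixed-step, single-input move forces the agent through many intermediate states before it reaches a distant region of the input space.

```python
from typing import List, Sequence, Tuple

State = Tuple[int, ...]
Action = Tuple[int, int]  # (input_index, direction), direction is +1 or -1


def discrete_actions(num_inputs: int) -> List[Action]:
    """Enumerate the finite action set: two actions (increase, decrease) per input."""
    return [(i, d) for i in range(num_inputs) for d in (+1, -1)]


def apply_action(state: State, action: Action, step: int,
                 bounds: Sequence[Tuple[int, int]]) -> State:
    """Change exactly one input by a fixed amount to obtain the next state.

    Clipping to the input bounds is an assumption of this sketch.
    """
    idx, direction = action
    low, high = bounds[idx]
    next_state = list(state)
    next_state[idx] = min(high, max(low, state[idx] + direction * step))
    return tuple(next_state)


def steps_to_reach(start: State, target: State, step: int) -> int:
    """Minimum number of single-input moves needed to get from start to target."""
    # Ceiling division per input, summed because only one input moves per step.
    return sum(-(-abs(t - s) // step) for s, t in zip(start, target))
```

With a step size of 10, moving from `(0, 0)` to `(90, 40)` takes at least `steps_to_reach((0, 0), (90, 40), 10)` moves, each of which requires executing an input combination that may carry no reward.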

“…Table 1 lists several examples of the possible actions that the agent can select and how they modify the current state in order to produce the next state. In our previous work [9], we addressed only integer input parameters, but iPerfXRL can be applied to float inputs without any modification. Furthermore, iPerfXRL can easily be extended to support other types of inputs (e.g., string, categorical) by modifying the action and the input space accordingly.…”

Section: Action Space

mentioning

confidence: 99%

“…The purpose of RQ2 is to measure the effectiveness of iPerfXRL by comparing the number of relevant combinations identified by it against the following alternative approaches: PerfXRL [9], random testing, Deterministic Grid Search (DGS), and Combinatorial Interaction Testing (CIT).…”

Section: RQ2: Effectiveness of iPerfXRL

mentioning

confidence: 99%
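The comparison metric in the statement above (the number of relevant input combinations found under a limited execution budget) can be sketched for two of the named baselines, random testing and deterministic grid search. Everything here is hypothetical: `toy_response_time` stands in for a real system under test, and the threshold, ranges, and function names are assumptions rather than the setup used in the cited article.

```python
import itertools
import random
from typing import Iterable, List, Tuple

Combo = Tuple[int, int]


def toy_response_time(x: int, y: int) -> float:
    """Hypothetical system under test: slow only in one corner of the input space."""
    return 5.0 if (x >= 80 and y >= 80) else 0.1


def count_relevant(combos: Iterable[Combo], threshold: float = 1.0) -> int:
    """Count distinct combinations whose response time exceeds the threshold."""
    return sum(1 for (x, y) in set(combos) if toy_response_time(x, y) > threshold)


def random_testing(budget: int, low: int = 0, high: int = 99, seed: int = 0) -> List[Combo]:
    """Draw `budget` input combinations uniformly at random."""
    rng = random.Random(seed)
    return [(rng.randint(low, high), rng.randint(low, high)) for _ in range(budget)]


def grid_search(points_per_axis: int, low: int = 0, high: int = 99) -> List[Combo]:
    """Evaluate an evenly spaced grid; the budget is points_per_axis squared."""
    step = (high - low) / (points_per_axis - 1)
    axis = [round(low + i * step) for i in range(points_per_axis)]
    return list(itertools.product(axis, axis))


if __name__ == "__main__":
    budget = 100
    print("random testing:", count_relevant(random_testing(budget)))
    print("grid search:   ", count_relevant(grid_search(10)))
```

Both baselines spend the same budget of 100 executions here. Neither adapts to what it observes, which is the gap a learning-based explorer is meant to close.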

“…The work presented in this article is an extension of our work published in [9], where we initially formulated the input space exploration problem for performance testing as a DRL problem. However, in the current article, we provide several improvements: 1) We reformulate the input space exploration problem in the context of DRL by redefining the action space and the reward function.…”

mentioning

confidence: 99%

“…2) We provide the tool support for our approach using Python's Stable Baselines [10] library to automate the exploration process. 3) We empirically evaluate the efficiency and effectiveness of our method against our previous approach (i.e., PerfXRL [9]), random testing, deterministic grid search, and combinatorial interaction testing. 4) We experimentally show that the improved PerfXRL (iPerfXRL) is able to detect more performance bottlenecks than the PerfXRL [9] and, at the same time, it identifies up to 9 times more bottlenecks than the other approaches.…”

mentioning

confidence: 99%

Using Deep Reinforcement Learning for Exploratory Performance Testing of Software Systems With Multi-Dimensional Input Spaces

et al. 2020

Self Cite

During exploratory performance testing, software testers evaluate the performance of a software system with different input combinations in order to identify combinations that cause performance problems in the system under test. Performance problems such as low throughput, high response times, hangs, or crashes in software applications have an adverse effect on the customer's satisfaction. Since many of today's large-scale, complex software systems (e.g., eCommerce applications, databases, web servers) exhibit very large multi-dimensional input spaces with many input parameters and large ranges, it has become costly and inefficient to explore all possible combinations of inputs in order to detect performance problems. In order to address this issue, we introduce a method for identifying input combinations that trigger performance problems in the software system under test. Our method, under the name of iPerfXRL, employs deep reinforcement learning in order to explore a given large multi-dimensional input space efficiently. The main benefit of the approach is that, during the exploration process, it learns and recognizes the problematic regions of the input space that have a higher chance of triggering performance problems. It concentrates the search in those problematic regions to find as many input combinations as possible that can trigger performance problems while executing a limited number of input combinations against the system. In addition, our approach does not require prior domain knowledge or access to the source code of the system. Therefore, it can be applied to any software system where we can interactively execute different input combinations while monitoring their performance impact on the system. We implement iPerfXRL on top of the Soft Actor-Critic algorithm. We empirically evaluate the efficiency and effectiveness of our approach against alternative state-of-the-art approaches. Our results show that iPerfXRL accurately identifies the problematic regions of the input space and finds up to 9 times more input combinations that trigger performance problems on the system under test than the alternative approaches.

INDEX TERMS Exploratory performance testing, Deep reinforcement learning, Test data generation

I. INTRODUCTION

One of the most critical and challenging tasks for developers is to identify and fix performance problems of software systems [1]. Performance problems such as low throughput, high response times, hangs, or crashes in software applications have an adverse effect on the customer's satisfaction. According to [2], there are higher chances of a software system crashing due to performance problems rather than functional failures. Recent reports [1] show that certain input combinations can trigger more than half of the performance bottlenecks identified in non-trivial software systems. The reason is that certain input combinations can invoke inefficient code sequences or resource-intensive operations, which result in overall system performance degradation commonly referred to as performan...

The integration of machine learning into automated test generation: A systematic mapping study

Fontes; Gay

2023

Software Testing Verif & Rel

Machine learning (ML) may enable effective automated test generation. We characterize emerging research in this intersection, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges, by performing a systematic mapping study on a sample of 124 publications. ML generates input for system, GUI, unit, performance, and combinatorial testing or improves the performance of existing generation methods. ML is also used to generate test verdicts, property‐based, and expected output oracles. Supervised learning (often based on neural networks) and reinforcement learning (often based on Q‐learning) are common, and some publications also employ unsupervised or semi‐supervised learning. (Semi‐/Un‐)Supervised approaches are evaluated using both traditional testing metrics and ML‐related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. The work to date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, the ML algorithms employed and how they are applied, benchmarks, and replicability. Our findings can serve as a roadmap and inspiration for researchers in this field.