§1. Overview
Apollo is a toolchain for automatically detecting and diagnosing performance regressions in database systems (DBMSs). We hope that Apollo will assist database system developers with the tedious process of testing DBMSs so that they may focus on more important problems in developing these systems.
Apollo automates the generation of regression-triggering queries, simplifies the bug reporting process for users, and enables developers to quickly pinpoint the root cause of performance regressions. We have discovered ten previously unknown and unique performance regressions in two widely-used database systems using Apollo. In all of these cases, it correctly identified branches related to the root cause. Apollo automatically reduces the query size by 4.2 times on average.
§2. Challenges in developing DBMSs
Developing DBMSs that deliver predictable performance is non-trivial because of complex interactions between different components of the system. When a user upgrades a DBMS installation, such interactions can unexpectedly slow down certain queries. We refer to these bugs that slow down the newer version of the DBMS as performance regression bugs, or regressions for short. A critical regression can reduce performance by orders of magnitude, in many cases converting an interactive query to an overnight execution. To resolve regressions in the upgraded system, users need to file regression reports to inform developers about the problem. However, users from other domains, such as data scientists, may be unfamiliar with the requirements and process for reporting a regression, which limits their productivity.
We designed the Apollo toolchain to tackle these challenges.
§3. Toolchain
The figure above illustrates the architecture of Apollo. It takes in two versions of one DBMS, and produces a set of performance regression reports.
SQLFuzz: Finding Regressions
Apollo stochastically generates SQL statements for uncovering performance regressions using fuzzing. This technique consists of bombarding a system with many randomly generated inputs. Researchers have successfully used it to find security vulnerabilities and correctness bugs. Unlike those bugs, validating performance regressions is challenging because the ground truth of the regression is unclear and may be heavily affected by the execution environment. We tackle these problems by applying a set of validation checks, incorporating the feedback from DBMS developers, to reduce false positives.
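The kind of validation check described above can be sketched as follows. This is a minimal illustration, not Apollo's actual implementation: the function names `time_query` and `is_regression`, the thresholds `min_ratio` and `min_seconds`, and the specific checks are all assumptions chosen for the example. The idea is to time the same query several times on each DBMS version and to flag a regression only when the slowdown is large and consistent across runs.

```python
import statistics
import time
from typing import Callable, List, Sequence


def time_query(run_query: Callable[[], object], repeats: int = 5) -> List[float]:
    """Run a query `repeats` times and return the wall-clock time of each run.

    `run_query` is a hypothetical callable that executes one query
    against one DBMS version.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def is_regression(
    old_times: Sequence[float],
    new_times: Sequence[float],
    min_ratio: float = 2.0,      # assumed threshold: new version at least 2x slower
    min_seconds: float = 0.001,  # assumed threshold: ignore very fast queries
) -> bool:
    """Return True only if the slowdown is large and stable across runs."""
    old_median = statistics.median(old_times)
    new_median = statistics.median(new_times)

    # Check 1: skip queries that are too fast to measure reliably.
    if new_median < min_seconds:
        return False

    # Check 2: every run on the new version must be slower than the
    # slowest run on the old version, which guards against noisy outliers.
    if min(new_times) <= max(old_times):
        return False

    # Check 3: the median slowdown must exceed the chosen ratio.
    return new_median >= min_ratio * old_median
```

Using medians and requiring non-overlapping timing ranges is one simple way to keep environmental noise from producing false positives; stricter statistical tests could be substituted.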
SQLMin: Reducing Queries
When a regression is discovered, the next challenge is for users to report it. As the queries are usually large and may span multiple files, users have to perform query reduction before reporting the bug. However, manual query reduction is time-consuming and challenging, especially for users who are non-experts in the domain of databases. Apollo solves this problem by iteratively distilling a regression-causing statement to its essence. This takes out as many elements of the statement as possible while ensuring that the reduced query still triggers the problem.
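The iterative reduction described above can be illustrated with a simple greedy loop. This sketch is an assumption about the general shape of such a reducer, not SQLMin's actual algorithm: the query is modeled as a list of removable parts, and `still_triggers` is a hypothetical oracle that reassembles the parts, runs the result on both DBMS versions, and reports whether the regression is still observed.

```python
from typing import Callable, List, Sequence


def reduce_query(
    parts: Sequence[str],
    still_triggers: Callable[[List[str]], bool],
) -> List[str]:
    """Greedily drop parts of a query while the regression still reproduces.

    `parts` is a hypothetical decomposition of the statement into
    removable elements (clauses, expressions, and so on).
    """
    current = list(parts)
    changed = True
    # Repeat full passes until no single part can be removed.
    while changed:
        changed = False
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + 1:]
            if candidate and still_triggers(candidate):
                # The part at index i was not needed; keep the smaller query
                # and retry the same index, which now holds the next part.
                current = candidate
                changed = True
            else:
                i += 1
    return current


if __name__ == "__main__":
    query = ["SELECT a", "FROM t", "WHERE a > 1", "ORDER BY a", "LIMIT 10"]

    # Toy oracle: pretend the regression needs these three parts.
    def toy_oracle(candidate: List[str]) -> bool:
        return all(p in candidate for p in ("SELECT a", "FROM t", "ORDER BY a"))

    print(reduce_query(query, toy_oracle))
```

In practice the oracle would also have to reject candidates that no longer parse or execute, since removing an arbitrary element can yield an invalid statement.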
SQLDebug: Diagnosing Regressions
Once a regression report is filed, the final challenge is for developers to diagnose its root cause. To accomplish this, a developer either manually examines the program or uses a performance profiler to determine how the CPU time is distributed across different functions. However, this process cannot explain why the time is distributed in this manner. To simplify the diagnosis process, Apollo uses two techniques to automatically identify the root cause. First, it bisects historical commits to locate the first one that introduces the performance regression. Second, it leverages statistical debugging to correlate the regression with suspicious source lines within the commit.
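Both steps can be sketched in a few lines. The code below is illustrative only and makes several assumptions: `is_slow` is a hypothetical callable that builds a commit and reports whether the query is slow on it, the commit list is ordered oldest to newest with a known-fast first entry and a known-slow last entry, and the scoring in `rank_branches` is a deliberately simplified statistical-debugging score rather than the one Apollo necessarily uses.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple


def first_bad_commit(commits: Sequence[str], is_slow: Callable[[str], bool]) -> str:
    """Binary-search for the first commit on which the query is slow.

    Assumes commits[0] is fast, commits[-1] is slow, and the regression,
    once introduced, persists in all later commits.
    """
    if len(commits) < 2:
        raise ValueError("need at least one fast and one slow commit")
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] fast, commits[hi] slow
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_slow(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


def rank_branches(runs: Sequence[Tuple[bool, Set[str]]]) -> List[Tuple[str, float]]:
    """Rank branches by how much taking them raises the chance of a slow run.

    Each run is (was_slow, branches_taken). The score of a branch is
    P(slow | branch taken) - P(slow), so higher means more suspicious.
    """
    baseline = sum(1 for slow, _ in runs if slow) / len(runs)
    counts: Dict[str, List[int]] = {}  # branch -> [slow runs, total runs]
    for slow, branches in runs:
        for branch in branches:
            entry = counts.setdefault(branch, [0, 0])
            entry[0] += int(slow)
            entry[1] += 1
    scores = {b: s / n - baseline for b, (s, n) in counts.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Bisection needs only about log2(n) builds for n commits, which is why it is practical over long histories; the ranking step then narrows attention to branches whose execution is most strongly associated with slow runs.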
§4. Publications
- Jinho Jung, Hong Hu, Joy Arulraj, Taesoo Kim, and Woon-Hak Kang, "APOLLO: Automatic Detection and Diagnosis of Performance Regressions in Database Systems," PVLDB, 13(1):57–70, 2019.
§5. People
§6. Acknowledgements
We thank the anonymous reviewers and development teams of PostgreSQL and SQLite for their helpful feedback. This research was supported, in part, by the NSF awards CNS-1563848, CNS-1704701, CRI-1629851, CNS-1749711, IIS-1850342 and IIS-1908984, ONR under grants N00014-18-1-2662, N00014-15-1-2162, N00014-17-1-2895, DARPA TC (No. DARPA FA8650-15-C-7556), and ETRI IITP/KEIT [B0101-17-0644], and gifts from Facebook, Mozilla, Intel, VMware, Alibaba, and Google.
§7. Future Plans
- We have open-sourced the Apollo toolchain.
- We are developing the next version of this toolchain.
- We are collaborating with Cockroach Labs.
- Drop us a note at: jinho.jung@gatech.edu