DoubleFaceAD: A New Datastore Driver Architecture to Optimize Fanout Query Performance

Abstract

The broad adoption of fanout queries on distributed datastores has made asynchronous event-driven datastore drivers a natural choice because they reduce multithreading overhead. However, through extensive experiments using the latest datastore drivers (e.g., for MongoDB, HBase, and DynamoDB) and the YCSB benchmark, we show that an asynchronous datastore driver can cause unexpected performance degradation, especially in fanout-query scenarios. For example, the default MongoDB asynchronous driver adopts the latest Java asynchronous I/O library, which uses a hidden, on-demand, JVM-level thread pool to process fanout query responses, causing surprising multithreading overhead when the query response size is large. As a second example, the conventional modular design that separates an application server from its embedded asynchronous datastore driver can leave the workload imbalanced between the two components because they do not coordinate, incurring frequent unnecessary system calls. To address these problems, we introduce DoubleFaceAD, a new asynchronous datastore driver architecture that integrates the management of both upstream and downstream workload traffic through a few shared reactor threads, with fanout-query-aware, priority-based scheduling to reduce the overall query waiting time. Our experimental results on three representative application scenarios (YCSB, DBLP, and microservices) show that DoubleFaceAD outperforms all other types of datastore drivers by up to 34% in throughput and 2.5× in 95th percentile response time.
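
To illustrate why fanout-aware prioritization can shorten query waiting time, the following is a minimal sketch and not the DoubleFaceAD implementation. The abstract does not state the scheduling policy, so the sketch assumes one plausible policy: a single reactor always processes the ready response whose parent query has the fewest sub-responses still outstanding. The names `FanoutAwareScheduler`, `register`, `on_response`, `run_one`, and `drain` are hypothetical.

```python
class FanoutAwareScheduler:
    """Toy single-reactor scheduler (hypothetical; the policy is an assumption)."""

    def __init__(self):
        self._remaining = {}  # query_id -> sub-responses not yet processed
        self._ready = []      # arrived but unprocessed (query_id, payload) pairs

    def register(self, query_id, fanout):
        # A fanout query completes only after all `fanout` sub-responses are processed.
        self._remaining[query_id] = fanout

    def on_response(self, query_id, payload):
        # Called when a sub-response arrives from a datastore node.
        self._ready.append((query_id, payload))

    def has_work(self):
        return bool(self._ready)

    def run_one(self):
        # Assumed policy: pick the ready response whose query is closest to
        # completion; ties are broken by arrival order (FIFO).
        idx = min(
            range(len(self._ready)),
            key=lambda i: (self._remaining[self._ready[i][0]], i),
        )
        query_id, payload = self._ready.pop(idx)
        self._remaining[query_id] -= 1
        return query_id, payload, self._remaining[query_id] == 0


def drain(scheduler):
    """Process all ready responses; return {query_id: step at which it completed}."""
    finished = {}
    step = 0
    while scheduler.has_work():
        step += 1
        query_id, _payload, done = scheduler.run_one()
        if done:
            finished[query_id] = step
    return finished


# Query A fans out to 3 nodes, query B to 1; B's response arrives third.
sched = FanoutAwareScheduler()
sched.register("A", fanout=3)
sched.register("B", fanout=1)
for qid, payload in [("A", "a1"), ("A", "a2"), ("B", "b1"), ("A", "a3")]:
    sched.on_response(qid, payload)
result = drain(sched)
print(result)  # {'B': 1, 'A': 4}
```

In plain arrival order, B would complete at step 3 and A at step 4. Under the assumed policy, B completes at step 1 while A still completes at step 4, so the mean completion step drops from 3.5 to 2.5 in this toy case.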

Publication
In Proceedings of the 21st ACM/IFIP International Middleware Conference (Middleware’20)