The Zoomdata Query Engine with Pushdown Processing


In our last post, we explained how Logi Composer’s microservices architecture makes it easier to keep pace with advancements in technology and empower users to perform speed-of-thought analytics. Now, we’ll explore the Query Engine with pushdown processing.

The Logi Composer Query Engine is a microservice that sits between the web application (or a client application built with the Logi Composer JavaScript client) and the Logi Composer Smart Data Connectors. It is purpose-built for a general user population, which can be unpredictable and demanding.

The Query Engine has three primary roles:

  1. Deconstruct and convert user requests into distributed query execution plans
  2. Optimize queries and execution plans based on data platform capabilities, in-memory cached results, and its own Query Engine capabilities
  3. Execute data functions:
    1. Communicate with Logi Composer Smart Data Connectors to execute pushdown queries (see below)
    2. Retrieve data from in-memory cached results as appropriate
    3. Use in-memory processing to combine, append, and/or manipulate one or more data sets to produce only the values needed to fulfill the user’s request

It can be deployed with all Logi Composer microservices on a single computer, or it can be deployed separately with a modern resource manager like YARN or Kubernetes (coming soon!) to take advantage of distributed processing.

Pushdown Processing

Unlike most traditional and even newer BI alternatives, Logi Composer is architected to let users interact directly with data down to row-level detail (within the context of one’s security privileges, of course!). Pushdown processing complements our implementation of websocket communication, and is necessary to support an ad-hoc, interactive user experience on fresh data. As users explore data at progressively lower levels of detail, Logi Composer pushes the processing down to the data source as new queries. This is in contrast to other solutions that query and work off data extracts. Extracted datasets, whether in a cube, flat rows, or another format, restrict what analysts can explore and how they engage with their data.

With Logi Composer, the database returns only the values that the Query Engine needs to populate the user’s visualizations. The push-down architecture also avoids scaling up the Query Engine unnecessarily when complex processing can be better executed on high-performance database engines or scalable data platforms.

Query Optimizations Reduce Database and Network Load

Logi Composer pushes down as much work to the underlying data sources as possible. The Query Engine evaluates and optimizes each end-user request, and determines whether to submit all or part of the request to the target data sources. The engine can, if appropriate, push down filtering criteria, aggregations, calculations, and offset, limit, sort, and time bucketing operations.
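
As a rough illustration of that decision, the sketch below splits a request into operations the data source can run and operations left to the engine. The operation names, the `QueryPlan` class, and `plan_request` are hypothetical and are not part of Logi Composer. The rule that everything after the first unsupported operation stays in the engine is a simplifying assumption, not a description of the real optimizer.

```python
from dataclasses import dataclass, field


@dataclass
class QueryPlan:
    """Hypothetical plan: which operations run where."""
    pushed_down: list = field(default_factory=list)  # executed by the data source
    in_engine: list = field(default_factory=list)    # executed by the query engine


def plan_request(requested_ops, source_capabilities):
    """Split ordered operations between the data source and the engine.

    Assumption: once one operation must stay in the engine, every later
    operation stays there too, because it consumes the earlier result.
    """
    plan = QueryPlan()
    pushing = True
    for op in requested_ops:
        if pushing and op in source_capabilities:
            plan.pushed_down.append(op)
        else:
            pushing = False
            plan.in_engine.append(op)
    return plan


# A source that can filter, aggregate, and limit, but cannot sort.
plan = plan_request(
    ["filter", "aggregate", "sort", "limit"],
    {"filter", "aggregate", "limit"},
)
print(plan.pushed_down)  # ['filter', 'aggregate']
print(plan.in_engine)    # ['sort', 'limit']
```

In this example the limit is not pushed down even though the source supports it, because applying it before the sort would change the result.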

Pushing down filters means the data platform engine doesn’t need to scan large datasets unnecessarily. It also reduces the amount of data transferred over the network from the data source to Logi Composer. Logi Composer can push down all filters required by a user’s security profile or that a user requests in the web application.
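
To make the idea concrete, here is a minimal sketch of filter pushdown against an in-memory SQLite database. The `sales` table, its columns, and the `build_filtered_query` helper are invented for illustration; a real connector would generate SQL in the dialect of its target platform. Table and column names are assumed to come from trusted metadata, while filter values are passed as bound parameters.

```python
import sqlite3


def build_filtered_query(table, columns, security_filters, user_filters):
    """Build a parameterized SELECT whose WHERE clause carries every filter.

    Security filters are merged last, so they override a user filter on
    the same column (an assumption made for this sketch).
    """
    filters = {**user_filters, **security_filters}
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    return sql, list(filters.values())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", "widget", 10.0), ("east", "gadget", 20.0), ("west", "widget", 30.0)],
)

# The security profile restricts the user to the east region;
# the user additionally filters on a product in the web application.
sql, params = build_filtered_query(
    "sales", ["product", "amount"], {"region": "east"}, {"product": "widget"}
)
rows = conn.execute(sql, params).fetchall()
print(sql)   # SELECT product, amount FROM sales WHERE product = ? AND region = ?
print(rows)  # [('widget', 10.0)]
```

Only the single matching row leaves the database, rather than the whole table being transferred and filtered afterwards.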

Pushdown of aggregations and calculations optimizes performance for the most resource-intensive operations. Logi Composer always pushes down aggregates: min, max, sum, avg, count, distinct count, last value, and percentiles. Where advantageous, the Query Engine combines several simpler aggregates to compute more complex metrics.
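
One familiar case of combining simpler aggregates is an average built from a pushed-down sum and count. The sketch below assumes two partial results (for example, from two partitions or sources); the `Partial` class and `combine_average` function are hypothetical names, not Logi Composer internals.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Simple aggregates returned by one pushed-down query."""
    total: float  # SUM(value)
    count: int    # COUNT(value)


def combine_average(partials):
    """Compute an overall average from per-source sums and counts."""
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    return total / count if count else None


partials = [Partial(total=30.0, count=3), Partial(total=70.0, count=1)]
overall = combine_average(partials)
print(overall)  # 25.0

# Averaging the per-source averages instead would give a different answer:
naive = sum(p.total / p.count for p in partials) / len(partials)
print(naive)  # 40.0
```

Carrying the sum and the count separately keeps the combined average correct even when the sources contribute different numbers of rows.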

Time for Something Different

Logi Composer offers automatic time bucketing, which allows users to group and filter data by time categories such as current or prior week, month and year, rolling time periods, and so on. There’s no need to pre-aggregate or model time buckets, which frees technical personnel to focus on higher value objectives. All that’s needed is a date-time field. The Query Engine does all the work of interpreting and converting user requests to one or more queries and pushing the whole operation down to the data source.
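
The following sketch shows what a pushed-down time bucket could look like, using SQLite and its `strftime` function. The `events` table, the `BUCKET_FORMATS` mapping, and `time_bucket_query` are assumptions made for this example; each data platform has its own date-truncation functions, and identifiers are assumed to come from trusted metadata.

```python
import sqlite3

# Hypothetical mapping from a requested bucket to a SQLite strftime format.
BUCKET_FORMATS = {"year": "%Y", "month": "%Y-%m", "day": "%Y-%m-%d"}


def time_bucket_query(table, time_col, value_col, bucket):
    """Build a query that groups and sums values by time bucket in the database."""
    fmt = BUCKET_FORMATS[bucket]
    return (
        f"SELECT strftime('{fmt}', {time_col}) AS bucket, "
        f"SUM({value_col}) AS total "
        f"FROM {table} GROUP BY bucket ORDER BY bucket"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("2021-01-05 10:00:00", 10.0),
        ("2021-01-20 12:30:00", 5.0),
        ("2021-02-01 09:00:00", 7.0),
    ],
)

monthly = conn.execute(time_bucket_query("events", "ts", "amount", "month")).fetchall()
print(monthly)  # [('2021-01', 15.0), ('2021-02', 7.0)]
```

The only modeling requirement in this sketch is the date-time column itself; switching from monthly to yearly buckets changes the generated query, not the stored data.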

To recap, the benefits of Logi Composer’s direct connection with pushdown processing are:

  • The user always has access to fresh data from the source
  • Compute resources are scaled and managed where they make the most sense
  • Network bandwidth is conserved
  • Hybrid-cloud deployments work well, since there’s no need for massive data movement between systems

Originally published February 12, 2019; updated on March 19, 2021