Big Data Analytics for Apache Hive

Big Data Analytics & Visualization for Hive on Tez

What is Apache Tez?

Apache™ Tez is an extensible framework for building high-performance batch and interactive data processing applications, coordinated by YARN in Apache Hadoop. Tez builds on the MapReduce paradigm, running jobs dramatically faster while retaining MapReduce's ability to scale to petabytes of data. Important Hadoop ecosystem projects such as Apache Hive and Apache Pig use Apache Tez, as do a growing number of third-party data access applications developed for the broader Hadoop ecosystem.

What is Hive on Tez?

Hive on Tez runs Hive queries on the Tez execution engine, improving performance by orders of magnitude over previous releases of Hive. Hive on Tez is the SQL-in-Hadoop technology recommended by Hortonworks for interactive big data queries.

Logi Composer & Hive on Tez

Logi Composer provides interactive visual analytics of data stored in Hadoop by leveraging Hive on Tez. Logi Composer dynamically generates HiveQL queries that are pushed to Hive for execution. This approach takes advantage of the rapidly evolving query performance innovations in Hive, from storage formats like ORC and vectorized operations to cost-based optimization.
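
To make the idea of dynamically generated HiveQL concrete, here is a minimal Python sketch of how a chart request might be turned into a query that Hive executes. It is an illustration only, not Logi Composer's actual query generator: the function names (build_chart_query, quote_identifier) and the example table and column names are hypothetical. The three SET statements use standard Hive configuration properties for the Tez engine, vectorized execution, and cost-based optimization; whether they need to be set explicitly depends on the cluster's defaults.

```python
import re

# Standard Hive session properties matching the features described above.
# Many clusters already enable these by default (assumption: yours may differ).
SESSION_SETTINGS = [
    "SET hive.execution.engine=tez",
    "SET hive.vectorized.execution.enabled=true",
    "SET hive.cbo.enable=true",
]

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_AGGREGATES = {"SUM", "AVG", "MIN", "MAX", "COUNT"}


def quote_identifier(name):
    """Backtick-quote a table or column name, rejecting anything unusual."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f"`{name}`"


def build_chart_query(table, dimension, measure, aggregate="SUM", limit=100):
    """Build a HiveQL aggregate query for a simple bar-chart style request.

    Hypothetical helper: groups `measure` by `dimension` and returns the
    top `limit` groups, so the aggregation runs inside Hive rather than
    in the visualization layer.
    """
    agg = aggregate.upper()
    if agg not in _AGGREGATES:
        raise ValueError(f"unsupported aggregate: {aggregate!r}")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")

    dim = quote_identifier(dimension)
    meas = quote_identifier(measure)
    tbl = quote_identifier(table)
    return (
        f"SELECT {dim}, {agg}({meas}) AS `value` "
        f"FROM {tbl} GROUP BY {dim} "
        f"ORDER BY `value` DESC LIMIT {limit}"
    )


if __name__ == "__main__":
    # Example request: total sales amount per region (hypothetical schema).
    for statement in SESSION_SETTINGS:
        print(statement + ";")
    print(build_chart_query("sales", "region", "amount") + ";")
```

The generated statements would then be submitted through whatever Hive client the application uses. Because only the aggregated rows come back, the heavy lifting stays in Hive, where improvements such as ORC storage, vectorization, and cost-based optimization apply without changes to the application.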

Originally published January 15, 2020; updated March 19, 2021