Features and Limitations of Apache Hive
Features of Apache Hive
Let us discuss the features of Apache Hive one by one.
- Apache Hive makes data summarization, querying, and analysis much easier.
- It stores the schema in a database and the processed data in HDFS.
- It supports OLAP (Online Analytical Processing).
- Apache Hive manages the low-level interface requirements of Hadoop.
- Partitioning and bucketing the data in tables improves query performance.
- It uses a rule-based optimizer (a set of rules followed during query execution) to produce the expected result.
- It is scalable, familiar, and extensible, i.e. it can work with huge volumes and varieties of data without affecting the performance of the system.
- Hive supports client applications (applications that run on a workstation or personal computer) written in Java, PHP, Python, C++, and Ruby.
- It is an efficient ETL (Extract, Transform, Load) tool.
- Working with HiveQL does not require knowledge of a programming language; knowledge of basic SQL queries is enough.
- We can easily process structured data in Hadoop using Hive.
- We can also run ad-hoc queries (loosely typed commands or queries whose value depends on some variable) for data analysis using Hive.
- It can be used for data visualization, and running Hive on Apache Tez can reduce query latency for more interactive processing.
- It runs on the server side of a cluster.
- Apache Hive is used in areas such as:
  - Log processing (system or network logs)
  - Text mining (deriving high-quality information from text)
  - Business analytics (investigating past business performance to improve profit)
  - Predictive modeling (using statistics to predict outcomes)
  - Document indexing (organizing and storing documents for later retrieval)
  - Data mining (analyzing large amounts of data in a database)
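The partitioning and bucketing feature listed above can be pictured with a small Python sketch. It is only an illustration of the idea, not Hive's implementation: the sample rows, the helper names (bucket_for, layout, scan), and the bucket count are hypothetical, and zlib.crc32 stands in for Hive's own hash function. Partitions are keyed in the column=value style that Hive uses for partition directories, and each row is placed in a bucket by hashing a column and taking the remainder.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count for this sketch


def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Assign a key to a bucket: hash(key) mod num_buckets.

    zlib.crc32 is a stand-in here; Hive uses its own hash function.
    """
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def layout(rows, partition_col, bucket_col):
    """Group rows by partition key, then by bucket number."""
    table = defaultdict(lambda: defaultdict(list))
    for row in rows:
        partition = f"{partition_col}={row[partition_col]}"
        table[partition][bucket_for(str(row[bucket_col]))].append(row)
    return table


def scan(table, partition):
    """Read only the requested partition and skip all others."""
    return [row for bucket in table.get(partition, {}).values() for row in bucket]


rows = [
    {"dt": "2024-01-01", "user_id": "u1", "event": "login"},
    {"dt": "2024-01-01", "user_id": "u2", "event": "click"},
    {"dt": "2024-01-02", "user_id": "u1", "event": "logout"},
]
table = layout(rows, partition_col="dt", bucket_col="user_id")
pruned = scan(table, "dt=2024-01-01")
print(len(pruned))  # 2: the row in dt=2024-01-02 is never touched
```

A query that filters on the partition column only has to read the matching partition, which is where the performance gain comes from. Bucketing spreads rows with the same key into the same bucket, which can help with sampling and joins.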
Limitations of Apache Hive
Some of the limitations of Apache Hive are as follows:
- Apache Hive does not offer real-time queries.
- The latency of Apache Hive queries is generally very high.
- Subquery support is limited.
- Materialized views are not supported in versions before Hive 3.0.
- Row-level update and delete operations are not supported on ordinary tables; Hive 0.14 and later allow them only on transactional (ACID) tables.
- It is not designed for OLTP (Online Transaction Processing).
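One way to picture why row-level changes are awkward for Hive: HDFS files are written once and not modified in place, so changing a single row generally means rewriting the data that contains it. The Python sketch below is only an analogy under that assumption. The data and the rewrite_with_update helper are hypothetical and are not part of any Hive API.

```python
def rewrite_with_update(stored_rows, key, new_value):
    """Return a new copy of the data with one row changed.

    stored_rows is a tuple to mimic a write-once file: it cannot be
    modified in place, so every row is copied to produce the new version.
    """
    rewritten = tuple(
        (k, new_value) if k == key else (k, v) for k, v in stored_rows
    )
    rows_copied = len(rewritten)
    return rewritten, rows_copied


old_file = (("u1", "login"), ("u2", "click"), ("u3", "logout"))
new_file, rows_copied = rewrite_with_update(old_file, "u2", "purchase")
print(rows_copied)  # 3: all rows were rewritten to change one
```

The cost of the change grows with the size of the data set rather than with the number of rows changed, which suits batch analytics far better than transaction-style workloads.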