Apache Hive Data Types

  • Apache Hive data types are essential to the query language and to data modeling (the representation of data structures in the tables of a company's database).
  • You need to know the data types and their usage in order to define table column types.
  • Apache Hive data types fall into two main categories:
  1. Primitive Data types
  2. Complex Data types

Figure: Hive Data Types

1. Primitive Data type

Primitive data types and their sizes are similar to those in Java or SQL. They are further classified into four types:

  • Numeric Data type
  • String Data type
  • Date/Time Data type
  • Miscellaneous Data type

1.1 Numeric Data type

The numeric data types and their memory allocation are listed below.

Type        Memory Allocation
TINYINT     1-byte signed integer (-128 to 127)
SMALLINT    2-byte signed integer (-32,768 to 32,767)
INT         4-byte signed integer (-2,147,483,648 to 2,147,483,647)
BIGINT      8-byte signed integer (-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)
FLOAT       4-byte single-precision floating-point number
DOUBLE      8-byte double-precision floating-point number
DECIMAL     User-defined precision and scale, for example DECIMAL(10, 2)
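As a minimal sketch, the numeric types above could be used as column types in a table definition like the one below. The table name numeric_demo and all column names are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table: every name here is illustrative, not from a real schema.
CREATE TABLE IF NOT EXISTS numeric_demo (
  tiny_col    TINYINT,          -- 1-byte signed integer
  small_col   SMALLINT,         -- 2-byte signed integer
  int_col     INT,              -- 4-byte signed integer
  big_col     BIGINT,           -- 8-byte signed integer
  float_col   FLOAT,            -- 4-byte single precision
  double_col  DOUBLE,           -- 8-byte double precision
  price       DECIMAL(10, 2)    -- 10 digits in total, 2 of them after the decimal point
);
```

Choosing the smallest integer type that covers the expected range of values keeps the schema explicit about what a column is meant to hold.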

1.2 String Data type

The string data types and their maximum lengths are listed below.

Type        Length
CHAR        Fixed length, up to 255 characters
VARCHAR     Variable length, 1 to 65535 characters
STRING      No fixed limit
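The three string types might be declared as follows. This is only a sketch: the table string_demo and its columns are hypothetical names.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE IF NOT EXISTS string_demo (
  country_code  CHAR(2),        -- fixed length; shorter values are padded with spaces
  city          VARCHAR(100),   -- variable length, capped at 100 characters
  notes         STRING          -- variable length, no declared maximum
);
```

CHAR and VARCHAR take a length argument, while STRING is declared without one.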

1.3 Date/Time Data type

The date and time data types and their usage are listed below.

Type        Usage
TIMESTAMP   · Supports the traditional UNIX timestamp with optional nanosecond precision.
            · Compatible with the java.sql.Timestamp format.
            · Text format is "YYYY-MM-DD HH:MM:SS.fffffffff" (up to 9 fractional digits).
DATE        · Format is YYYY-MM-DD.
            · The range of values is 0000-01-01 to 9999-12-31.
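To illustrate the formats above, string literals can be cast to DATE and TIMESTAMP. The sketch below assumes a Hive version that allows SELECT without a FROM clause (0.13 or later); the literal values and the column aliases are made up for the example.

```sql
-- Cast string literals that follow the documented formats.
SELECT
  CAST('2024-03-15' AS DATE)                          AS order_date,  -- YYYY-MM-DD
  CAST('2024-03-15 10:30:45.123456789' AS TIMESTAMP)  AS order_ts;    -- nanosecond precision
```

A string that does not match the expected format typically yields NULL rather than an error, so it is worth checking cast results when loading text data.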

1.4 Miscellaneous Data Type

Miscellaneous data types are further classified into two types:

  • BOOLEAN (true/false value)
  • BINARY (byte array)
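A short sketch of both types used as column types is shown below. The table misc_demo and its columns are hypothetical.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE IF NOT EXISTS misc_demo (
  user_id    INT,
  is_active  BOOLEAN,   -- TRUE or FALSE
  payload    BINARY     -- raw byte array
);

-- Filter on the BOOLEAN column directly.
SELECT user_id
FROM misc_demo
WHERE is_active;
```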

2. Complex Data type

Complex data types are further classified into four types, explained below.

2.1 ARRAY

  • It is an ordered collection of elements.
  • All elements must be of the same type.

Syntax: ARRAY<data_type>

Example: array(1, 4)
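As a sketch of how an ARRAY column might be declared and queried, consider the hypothetical table array_demo below. Array indexes start at 0, and the built-in size() function returns the number of elements.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE IF NOT EXISTS array_demo (
  student_id  INT,
  scores      ARRAY<INT>
);

-- Read the first element and the element count of each array.
SELECT
  student_id,
  scores[0]     AS first_score,
  size(scores)  AS score_count
FROM array_demo;
```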

2.2 MAP

  • It is an unordered collection of key-value pairs.
  • Keys must be primitive types; values may be of any type.

Syntax: MAP<primitive_type, data_type>

Example: map('a', 1, 'c', 3)
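A MAP column could be declared and read as in the sketch below. The table map_demo and its columns are hypothetical; map_keys() and map_values() are built-in Hive functions.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE IF NOT EXISTS map_demo (
  item_id  INT,
  attrs    MAP<STRING, INT>
);

-- Look up one key and list all keys and values.
SELECT
  item_id,
  attrs['a']         AS value_for_a,   -- NULL if the key is absent
  map_keys(attrs)    AS all_keys,
  map_values(attrs)  AS all_values
FROM map_demo;
```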

2.3 STRUCT

  • It is a collection of named fields, which may be of different types.

Syntax: STRUCT<col_name : data_type, ...>

Example: struct('a', 1, 1.0)
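The sketch below shows one way a STRUCT column might be declared and its fields accessed with dot notation. The table struct_demo and the field names are hypothetical. Fields created with struct() are named col1, col2, and so on, whereas named_struct() lets you supply the names.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE IF NOT EXISTS struct_demo (
  customer_id  INT,
  address      STRUCT<street:STRING, city:STRING, zip:INT>
);

-- Access individual fields with dot notation.
SELECT
  customer_id,
  address.city  AS city,
  address.zip   AS zip
FROM struct_demo;

-- Build a struct value with explicit field names (made-up values).
SELECT named_struct('street', '1 Main St', 'city', 'Springfield', 'zip', 12345);
```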

2.4 UNION

  • It holds a value of exactly one of the specified data types at a time.

Syntax: UNIONTYPE<data_type, data_type, ...>

Example: create_union(1, 'a', 63)
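As a rough sketch, a UNIONTYPE column might be declared as shown below. The table union_demo and its columns are hypothetical. In create_union(), the first argument is a zero-based tag that selects which of the remaining values the union holds.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE IF NOT EXISTS union_demo (
  event_id  INT,
  val       UNIONTYPE<INT, STRING>
);

-- Tag 0 selects the first value after the tag (1); tag 1 would select 'a'.
SELECT create_union(0, 1, 'a');
```

Note that UNIONTYPE support in Hive is incomplete, so queries that reference union fields in JOIN, WHERE, or GROUP BY clauses may fail.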

 

 

This is all about “Apache Hive Data Types”.