HBase Advanced Operations – Filters and Counters

Let’s study two advanced HBase operations: filters and counters.

Filters

HBase filters restrict the data returned by a scan or get to the cells that match a condition. They are useful in the following cases:

* They reduce the volume of data to be processed.

* They save network bandwidth, because filters are applied on the server side and only the matching data is sent to the client.

* They can select data by row key, column family, column qualifier, value and timestamp.

* They improve performance when processing bulk data.

* They can be used for ad-hoc analysis.

Note: To see the list of filters available in HBase, run the “show_filters” command in the HBase shell.

Let’s look at some of the most frequently used filters in HBase.

1. FirstKeyOnlyFilter

* FirstKeyOnlyFilter returns only the first key-value from each row. It does not take any argument.

Syntax

FirstKeyOnlyFilter ()

Example
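In the HBase shell this filter would typically be used as scan 'emp', {FILTER => "FirstKeyOnlyFilter()"}, where the table name 'emp' is only an assumed example. To make the effect concrete, here is a minimal in-memory sketch in Python. It is not the HBase client API; the sample cells and the function name first_key_only are hypothetical.

```python
# In-memory model only; this is NOT the HBase client API.
# Cells are (row_key, "family:qualifier", value) tuples, kept in sorted
# order, which mirrors how HBase orders key-values inside a table.
CELLS = [
    ("row1", "info:age", "30"),
    ("row1", "info:name", "alice"),
    ("row2", "info:age", "25"),
    ("row2", "info:name", "bob"),
]


def first_key_only(cells):
    """Keep only the first key-value of each row."""
    seen_rows = set()
    result = []
    for row, column, value in cells:
        if row not in seen_rows:
            seen_rows.add(row)
            result.append((row, column, value))
    return result


print(first_key_only(CELLS))
```

Because only one key-value per row is returned, this kind of filter is a cheap way to count rows.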

2. KeyOnlyFilter

* KeyOnlyFilter returns only the key part of every key-value. It does not take any argument.

Syntax

KeyOnlyFilter ()

Example
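In the HBase shell this would look like scan 'emp', {FILTER => "KeyOnlyFilter()"}, with 'emp' again an assumed table name. The sketch below models the behaviour in plain Python; the data and the function name key_only are hypothetical and the code does not talk to HBase.

```python
# In-memory model only; this is NOT the HBase client API.
# Cells are (row_key, "family:qualifier", value) tuples.
CELLS = [
    ("row1", "info:age", "30"),
    ("row1", "info:name", "alice"),
    ("row2", "info:name", "bob"),
]


def key_only(cells):
    """Return every key-value, but with the value replaced by an empty string."""
    return [(row, column, "") for row, column, _value in cells]


print(key_only(CELLS))
```

Every cell is still returned, but without its value, so less data travels to the client.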

3. PrefixFilter

* PrefixFilter returns only the key-values of rows whose row key starts with the specified prefix. It takes one argument, the row key prefix.

Syntax

PrefixFilter (<row_prefix>)

Example
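A shell call would look like scan 'emp', {FILTER => "PrefixFilter('user')"}, where the table name and the prefix are assumed values. The following Python sketch models the row selection in memory; the row keys and the function name prefix_filter are hypothetical.

```python
# In-memory model only; this is NOT the HBase client API.
# Cells are (row_key, "family:qualifier", value) tuples in sorted order.
CELLS = [
    ("admin1", "info:name", "carol"),
    ("user1", "info:name", "alice"),
    ("user2", "info:name", "bob"),
]


def prefix_filter(cells, row_prefix):
    """Keep the key-values of rows whose row key starts with row_prefix."""
    return [cell for cell in cells if cell[0].startswith(row_prefix)]


print(prefix_filter(CELLS, "user"))
```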

4. ColumnPrefixFilter

* ColumnPrefixFilter returns only those key-values whose column qualifier starts with the specified column prefix. It takes one argument, the column prefix.

Syntax

ColumnPrefixFilter (<column_prefix>)

Example
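In the shell this could be written as scan 'emp', {FILTER => "ColumnPrefixFilter('a')"}, with an assumed table name and prefix. The Python sketch below models the same selection on hypothetical data. Note that the prefix is matched against the qualifier, not against the column family.

```python
# In-memory model only; this is NOT the HBase client API.
# Cells are (row_key, "family:qualifier", value) tuples in sorted order.
CELLS = [
    ("row1", "info:address", "paris"),
    ("row1", "info:age", "30"),
    ("row1", "info:name", "alice"),
    ("row2", "info:age", "25"),
]


def column_prefix_filter(cells, prefix):
    """Keep key-values whose column qualifier starts with prefix."""
    result = []
    for row, column, value in cells:
        _family, qualifier = column.split(":", 1)
        if qualifier.startswith(prefix):
            result.append((row, column, value))
    return result


print(column_prefix_filter(CELLS, "a"))
```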

5. MultipleColumnPrefixFilter

* MultipleColumnPrefixFilter returns only those key-values whose column qualifier starts with any of the specified prefixes. It takes a list of column prefixes as arguments.

Syntax

MultipleColumnPrefixFilter (<column_prefix>, <column_prefix>, ..., <column_prefix>)

Example
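A shell call might look like scan 'emp', {FILTER => "MultipleColumnPrefixFilter('na', 'ad')"}, where the table name and prefixes are assumptions. The Python sketch below models the "any of these prefixes" rule on hypothetical data.

```python
# In-memory model only; this is NOT the HBase client API.
# Cells are (row_key, "family:qualifier", value) tuples in sorted order.
CELLS = [
    ("row1", "info:address", "paris"),
    ("row1", "info:age", "30"),
    ("row1", "info:name", "alice"),
]


def multiple_column_prefix_filter(cells, *prefixes):
    """Keep key-values whose qualifier starts with at least one prefix."""
    result = []
    for row, column, value in cells:
        qualifier = column.split(":", 1)[1]
        # str.startswith accepts a tuple and matches any of its items.
        if qualifier.startswith(tuple(prefixes)):
            result.append((row, column, value))
    return result


print(multiple_column_prefix_filter(CELLS, "na", "ad"))
```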

6. ColumnCountGetFilter

* ColumnCountGetFilter returns only the first N columns of a row. It takes one argument, the limit.

Syntax

ColumnCountGetFilter(<limit>)

Example
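As the name suggests, this filter is aimed at get operations, for instance get 'emp', 'row1', {FILTER => "ColumnCountGetFilter(2)"} in the shell (table and row names are assumed). A minimal Python sketch of the idea, on hypothetical data:

```python
# In-memory model only; this is NOT the HBase client API.
# Cells are (row_key, "family:qualifier", value) tuples in sorted order.
CELLS = [
    ("row1", "info:address", "paris"),
    ("row1", "info:age", "30"),
    ("row1", "info:name", "alice"),
    ("row2", "info:age", "25"),
]


def get_with_column_count(cells, row_key, limit):
    """Model a get on one row that returns at most `limit` columns."""
    row_cells = [cell for cell in cells if cell[0] == row_key]
    return row_cells[:limit]


print(get_with_column_count(CELLS, "row1", 2))
```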

7. PageFilter

* PageFilter returns at most page-size rows from the table. It takes one argument, the page size.

Syntax

PageFilter (<page_size>)

Example
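In the shell this could be scan 'emp', {FILTER => "PageFilter(2)"}, with an assumed table name. The sketch below models the row limit in Python on hypothetical data. It counts rows, not cells, so every column of an accepted row is returned.

```python
# In-memory model only; this is NOT the HBase client API.
# Cells are (row_key, "family:qualifier", value) tuples in sorted order.
CELLS = [
    ("row1", "info:age", "30"),
    ("row1", "info:name", "alice"),
    ("row2", "info:name", "bob"),
    ("row3", "info:name", "carol"),
]


def page_filter(cells, page_size):
    """Return all key-values of the first `page_size` rows."""
    rows_seen = []
    result = []
    for row, column, value in cells:
        if row not in rows_seen:
            if len(rows_seen) == page_size:
                break  # the page is full, stop scanning
            rows_seen.append(row)
        result.append((row, column, value))
    return result


print(page_filter(CELLS, 2))
```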

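8. QualifierFilter (and FamilyFilter)

* QualifierFilter filters key-values based on the column qualifier. It takes two arguments: a comparison operator (such as = or !=) and a comparator to compare the qualifier against. The related FamilyFilter works the same way on the column family.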

Example
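A shell call might look like scan 'emp', {FILTER => "QualifierFilter(=, 'binary:name')"}, where 'emp' and 'name' are assumed values. The Python sketch below models a byte-wise comparison of each qualifier against the expected value; the data, the operator table and the function name are hypothetical.

```python
import operator

# In-memory model only; this is NOT the HBase client API.
# Cells are (row_key, "family:qualifier", value) tuples in sorted order.
CELLS = [
    ("row1", "info:age", "30"),
    ("row1", "info:name", "alice"),
    ("row2", "info:name", "bob"),
]

# Comparison operators, keyed by the symbol used in the filter string.
COMPARE_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    ">": operator.gt,
}


def qualifier_filter(cells, compare_op, expected):
    """Keep key-values whose qualifier satisfies `qualifier <op> expected`."""
    compare = COMPARE_OPS[compare_op]
    result = []
    for row, column, value in cells:
        qualifier = column.split(":", 1)[1]
        # Compare as bytes, modelling a binary (byte-wise) comparator.
        if compare(qualifier.encode(), expected.encode()):
            result.append((row, column, value))
    return result


print(qualifier_filter(CELLS, "=", "name"))
```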

9. ValueFilter

* ValueFilter filters key-values based on the column value. Like QualifierFilter, it takes a comparison operator and a comparator.

Example
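In the shell this could be scan 'emp', {FILTER => "ValueFilter(=, 'binary:alice')"}, with assumed table name and value. The Python sketch below models an equality match for two comparator kinds, 'binary:' (exact match) and 'substring:' (case-insensitive containment). The parsing helper and the data are hypothetical and cover only these two kinds.

```python
# In-memory model only; this is NOT the HBase client API.
# Cells are (row_key, "family:qualifier", value) tuples in sorted order.
CELLS = [
    ("row1", "info:city", "Paris"),
    ("row1", "info:name", "alice"),
    ("row2", "info:name", "bob"),
    ("row3", "info:name", "malice"),
]


def value_matches(value, comparator):
    """Evaluate a 'kind:operand' comparator string against one value."""
    kind, _sep, operand = comparator.partition(":")
    if kind == "binary":
        return value == operand
    if kind == "substring":
        return operand.lower() in value.lower()
    raise ValueError("comparator kind not modelled in this sketch: " + kind)


def value_filter_equals(cells, comparator):
    """Keep key-values whose value matches the comparator (operator '=')."""
    return [cell for cell in cells if value_matches(cell[2], comparator)]


print(value_filter_equals(CELLS, "binary:alice"))
print(value_filter_equals(CELLS, "substring:ALICE"))
```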

Counters

HBase provides a mechanism to treat columns as counters. Counters allow us to increment a column value very easily.

Without counters, a client that wants to increment a value stored in a table has to lock the row, read the value, increment it, write it back to the table, and only then release the lock. Holding the lock this long causes contention when many clients access the same row.

Counters solve this problem by performing the read and the increment as a single atomic operation on the server.

* Counters are used in analytical systems such as digital marketing, clickstream analysis and document index models in e-commerce companies.

* HBase has two types of counters:

  1. Single counter
  2. Multiple counters

Example

In this example we will see how counters work at the column level by creating a table, incrementing a counter three times, and then querying the current value.
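In the HBase shell the sequence would be roughly: create 'counters', 'daily', then incr 'counters', '20110101', 'daily:hits', 1 run three times, then get_counter 'counters', '20110101', 'daily:hits'. The table, row and column names are assumptions. The Python sketch below models the same steps in memory; the CounterTable class is hypothetical, and the lock stands in for the atomicity that HBase provides on the server.

```python
import threading


class CounterTable:
    """In-memory model of counter columns; this is NOT the HBase client API."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def incr(self, row, column, amount=1):
        """Atomically add `amount` to a counter and return the new value."""
        with self._lock:
            new_value = self._values.get((row, column), 0) + amount
            self._values[(row, column)] = new_value
            return new_value

    def get_counter(self, row, column):
        """Return the current value (0 if the counter was never incremented)."""
        with self._lock:
            return self._values.get((row, column), 0)


table = CounterTable()
for _ in range(3):
    table.incr("20110101", "daily:hits")

print(table.get_counter("20110101", "daily:hits"))  # prints 3
```

A counter that has never been incremented starts from zero in this model, so no explicit initialisation step is needed.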

* We can read the current value of a counter with a get call.

* A counter can also be incremented by a value larger than one.

* A counter can also be decremented, by passing a negative increment.
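The points above can be sketched in a few lines of Python. In the shell, a larger step would look like incr 'counters', '20110101', 'daily:hits', 20 and a decrement like incr 'counters', '20110101', 'daily:hits', -1 (names assumed). The helper incr_many is a hypothetical stand-in for the "multiple counters" case, where several columns of one row are updated in a single call. This is an in-memory model, not the HBase API.

```python
# In-memory model only; this is NOT the HBase client API.
counters = {}


def incr(row, column, amount=1):
    """Add `amount` (may be negative or zero) and return the new value."""
    key = (row, column)
    counters[key] = counters.get(key, 0) + amount
    return counters[key]


def incr_many(row, amounts):
    """Update several counters of one row; returns the new values by column."""
    return {column: incr(row, column, amount) for column, amount in amounts.items()}


print(incr("20110101", "daily:hits", 20))  # larger increment
print(incr("20110101", "daily:hits", -1))  # negative increment = decrement
print(incr("20110101", "daily:hits", 0))   # zero increment just reads the value
print(incr_many("20110101", {"daily:hits": 1, "daily:clicks": 5}))
```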

Reference

https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/package-summary.html

https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/Counter.html

That’s all about advanced operations in HBase: filters and counters.