Integration of HBase and Hive – An Intro to Inserting JSON Data into HBase from Hive

By Anusha Jallipalli

Here is how to insert JSON data into an HBase table using Hive.

Use the HBaseStorageHandler to register HBase tables with the Hive metastore. You can optionally declare the table as EXTERNAL, in which case Hive cannot drop the underlying HBase table: dropping it in Hive removes only the metastore entry. To delete such a table, use the HBase shell (disable the table, then drop it).

Registering the table is the first step. As part of the registration, you also need to specify a column mapping, which links the Hive column names to the HBase table’s row key and columns. Specify it with the hbase.columns.mapping SerDe property.
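As a rough illustration of the registration and column mapping described above, the following sketch registers an HBase table that already exists. The Hive table name hbase_users_ext, the HBase table name users, and the column family info are assumptions made for this example, not names from the original setup.

```sql
-- Sketch: register an existing HBase table with Hive as EXTERNAL.
-- Assumed names: Hive table hbase_users_ext, HBase table 'users', column family 'info'.
CREATE EXTERNAL TABLE hbase_users_ext (
  rowkey STRING,   -- mapped to the HBase row key via ':key'
  name   STRING,   -- mapped to HBase column info:name
  age    INT       -- mapped to HBase column info:age
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  -- One entry per Hive column, in declaration order, with no spaces between entries.
  'hbase.columns.mapping' = ':key,info:name,info:age'
)
TBLPROPERTIES ('hbase.table.name' = 'users');
```

With this definition, dropping hbase_users_ext in Hive would remove only the metastore entry and leave the HBase table users in place.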

Step 1: Create a new HBase table which is to be managed by Hive
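A minimal sketch of this step, assuming a Hive table named hbase_users, an HBase table of the same name, and a single column family info (all three names are hypothetical). Without the EXTERNAL keyword, Hive creates the HBase table itself, so the HBase table should not exist beforehand.

```sql
-- Sketch: a Hive-managed table backed by a new HBase table.
-- Assumed names: hbase_users (Hive and HBase), column family 'info'.
CREATE TABLE hbase_users (
  rowkey STRING,   -- becomes the HBase row key
  name   STRING,   -- stored in info:name
  age    INT       -- stored in info:age
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,info:name,info:age')
TBLPROPERTIES ('hbase.table.name' = 'hbase_users');
```

Because the table is managed, dropping it in Hive also drops the underlying HBase table.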

Step 2: The input data file looks like:

Step 3: You can use the get_json_object function to parse the data as a JSON object. For instance, create a table named staging that holds your JSON data:
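One way the staging table could be set up is sketched below. It assumes the input file holds one JSON document per line; the column name json and the local path /tmp/users.json are placeholders chosen for this example.

```sql
-- Sketch: a single-column staging table where each row is one raw JSON line.
-- Assumed names: column json, local file /tmp/users.json (hypothetical path).
CREATE TABLE staging (json STRING);

-- Load the file as-is; no parsing happens at this point.
LOAD DATA LOCAL INPATH '/tmp/users.json' INTO TABLE staging;
```

Keeping the raw line in a single STRING column defers all parsing to query time, which is where get_json_object comes in.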

Step 4: Then use get_json_object to extract the attributes you want to load into the table:
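The extraction and load might look like the following sketch. It assumes a staging table with one STRING column named json, an HBase-backed Hive table named hbase_users with columns rowkey, name, and age, and JSON attributes named id, name, and age. All of these names are hypothetical and should be replaced with your own.

```sql
-- Sketch: pull attributes out of each JSON line and write them to the HBase-backed table.
-- Assumed input line shape (hypothetical): {"id": "1", "name": "alice", "age": 30}
INSERT OVERWRITE TABLE hbase_users
SELECT
  get_json_object(json, '$.id')               AS rowkey,  -- used as the HBase row key
  get_json_object(json, '$.name')             AS name,
  CAST(get_json_object(json, '$.age') AS INT) AS age      -- get_json_object returns a string
FROM staging;
```

get_json_object returns NULL when a path is missing or the line is not valid JSON, so it may be worth filtering out rows whose row key would be NULL before loading.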

Step 5: Scan the HBase table to verify that the data was loaded:
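The scan itself is run in the HBase shell with the scan command followed by the quoted table name. As a complementary check from the Hive side, the sketch below reads back through the storage handler; the table name hbase_users is an assumption made for this example.

```sql
-- Sketch: read a few rows back through Hive to confirm the load.
-- Assumed name: hbase_users (hypothetical HBase-backed Hive table).
-- The HBase shell equivalent would be: scan 'hbase_users'
SELECT * FROM hbase_users LIMIT 10;

-- Row count in the HBase-backed table after the load.
SELECT COUNT(*) FROM hbase_users;
```

If several input lines share the same row key, HBase keeps only the latest version of each cell by default, so the count here can be lower than the number of lines in the input file.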