Unit Testing MapReduce with the MRUnit Framework

By Anusha Jallipalli

Hadoop has become an indispensable framework for processing Big Data. Our clients have recognized the need to process their large volumes of data to generate revenue, and that has engaged us in developing solutions on the Hadoop framework.

In this discussion, I present a simple and straightforward way of unit-testing Hadoop MR programs from the Eclipse IDE.

MapReduce jobs are relatively simple. In the map phase, a function is applied to each input record, producing zero or more key-value pairs. The reduce phase receives a group of key-value pairs that share a key and applies a function to that group.
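To make the two phases concrete, here is a minimal plain-Java sketch of a word count. It does not use Hadoop at all: the class name MapReduceSketch and the methods map, reduce, and run are hypothetical names chosen for illustration, and run is only a single-process stand-in for the grouping that the framework performs between the phases.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: hypothetical names, no Hadoop classes involved.
public class MapReduceSketch {

    // Map phase: one input record in, zero or more (key, value) pairs out.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word.toLowerCase(), 1));
            }
        }
        return pairs;
    }

    // Reduce phase: one key plus every value grouped under it, one result out.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Stand-in for the framework: map each record, group by key, reduce each group.
    static Map<String, Integer> run(List<String> records) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String record : records) {
            for (Map.Entry<String, Integer> pair : map(record)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {big=2, data=1, jobs=1}
        System.out.println(run(Arrays.asList("big data", "Big jobs")));
    }
}
```

Because map and reduce are ordinary functions here, each one can be checked on its own with a known input and an expected output, which is the same idea the rest of this post applies to real Hadoop classes.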

Testing mappers and reducers is much like testing any other function: a given input should produce an expected output. …

The complexities arise from the distributed nature of Hadoop. MRUnit strips away most of the Hadoop framework for testing purposes, narrowing the focus to the map and reduce code, their inputs, and their expected outputs. With MRUnit, MapReduce code can be tested entirely in the IDE. Here is an example that uses MRUnit to test the IdentityMapper, which ships with the MapReduce framework in its lib package (org.apache.hadoop.mapred.lib). The IdentityMapper takes a key-value pair as input and emits the same key-value pair, unchanged.

package com.cloudera.MRUnit;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mrunit.MapDriver;
import org.junit.Before;
import org.junit.Test;

// JUnit 4 style: the annotations drive the run, so the class does not extend junit.framework.TestCase.
public class IdentityMapperTest {

 private Mapper<Text, Text, Text, Text> mapper;
 private MapDriver<Text, Text, Text, Text> driver;

 // Create a fresh mapper and driver before every test.
 @Before
 public void setUp() {
  mapper = new IdentityMapper<Text, Text>();
  driver = new MapDriver<Text, Text, Text, Text>(mapper);
 }

 // Passes: the expected output is identical to the input.
 @Test
 public void testIdentityMapper1() {
  driver.withInput(new Text("key"), new Text("value"))
   .withOutput(new Text("key"), new Text("value"))
   .runTest();
 }

 // Fails on purpose: the expected output differs from what IdentityMapper emits.
 @Test
 public void testIdentityMapper2() {
  driver.withInput(new Text("key"), new Text("value"))
   .withOutput(new Text("key2"), new Text("value2"))
   .runTest();
 }
}

Run IdentityMapperTest.java as a JUnit test. It executes two tests: the first passes, and the second fails because its expected output does not match what the IdentityMapper emits, as shown here.

[Figures: MRUnitTest-fig1, MRUnitTest-fig2]
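For readers who want to see why one test passes and the other fails without setting up Hadoop, the input-versus-expected-output check can be approximated in plain Java. This is only a rough sketch under my own assumptions: MiniMapDriver and its identity method are hypothetical names, the fluent method names merely mirror the ones used in the test above, and none of this is MRUnit's actual implementation.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical stand-in for a map-test driver; this is not MRUnit code.
public class MiniMapDriver<K, V> {

    private final BiFunction<K, V, List<Map.Entry<K, V>>> mapper;
    private final List<Map.Entry<K, V>> expected = new ArrayList<>();
    private K inputKey;
    private V inputValue;

    public MiniMapDriver(BiFunction<K, V, List<Map.Entry<K, V>>> mapper) {
        this.mapper = mapper;
    }

    // Record the single input pair that will be fed to the mapper.
    public MiniMapDriver<K, V> withInput(K key, V value) {
        this.inputKey = key;
        this.inputValue = value;
        return this;
    }

    // Record one pair the mapper is expected to emit.
    public MiniMapDriver<K, V> withOutput(K key, V value) {
        expected.add(new SimpleEntry<>(key, value));
        return this;
    }

    // Run the mapper once and compare emitted pairs with the expected ones.
    public void runTest() {
        List<Map.Entry<K, V>> actual = mapper.apply(inputKey, inputValue);
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    // Identity mapping: emits exactly the pair it was given.
    public static List<Map.Entry<String, String>> identity(String key, String value) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        out.add(new SimpleEntry<>(key, value));
        return out;
    }

    public static void main(String[] args) {
        // Same shape as testIdentityMapper1: expected output equals the input.
        new MiniMapDriver<String, String>(MiniMapDriver::identity)
            .withInput("key", "value")
            .withOutput("key", "value")
            .runTest();
        System.out.println("first check passed");

        // Same shape as testIdentityMapper2: expected output differs, so the check fails.
        try {
            new MiniMapDriver<String, String>(MiniMapDriver::identity)
                .withInput("key", "value")
                .withOutput("key2", "value2")
                .runTest();
        } catch (AssertionError e) {
            System.out.println("second check failed: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that the driver never touches a cluster: it feeds one record to the map function in the same process and compares what comes out against what was declared.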