Debug MapReduce Code in Eclipse

By Prasad Khode

I am involved in the development of a critical solution for a financial organization. The solution involves analyzing a multitude of transactions and data coming from multiple sources. Hadoop is a flexible option for handling big data, and it provides an efficient implementation of MapReduce.

Here I share my experience of running and debugging MapReduce code in Eclipse, just like any other Java program.

When we run MapReduce code in Eclipse, Hadoop runs in a special local mode using LocalJobRunner, in which the whole job, including its map and reduce tasks, runs in a single JVM (Java Virtual Machine) instead of being spread across several JVMs started by the Hadoop daemons.
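To make the single-JVM idea concrete, here is a minimal plain-Java sketch (no Hadoop classes) of a word count whose map, shuffle and reduce phases are ordinary method calls in one process. This is not how LocalJobRunner is implemented, and the names `InProcessWordCount`, `map`, `reduce` and `run` are made up for illustration. The only point is that when every phase lives in the JVM the debugger launched, a breakpoint in any phase gets hit.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: mimics the shape of a MapReduce word count in a single JVM.
// None of these names come from Hadoop or from the wordcount-Debug project.
final class InProcessWordCount {

    // "Map" phase: emit a (word, 1) pair for every token in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // "Reduce" phase: sum all the counts collected for one word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Runs map, shuffle (group by word) and reduce one after another in this JVM.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {eclipse=1, hadoop=1, hello=2}
        System.out.println(run(Arrays.asList("hello hadoop", "hello eclipse")));
    }
}
```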

In this mode the default file paths point to the local file system, not to HDFS.
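As a small illustration of what local path versus HDFS path means: Hadoop paths are URIs, and a path without a scheme is resolved against the default file system, which is the local one when no cluster configuration is on the classpath. The sketch below uses only `java.net.URI` to show the distinction. `fileSystemFor` is a hypothetical helper, and `hdfs://namenode:8020` is a placeholder address, not a value from this project.

```java
import java.net.URI;

// Illustrative only: shows how the scheme of a path decides which file system is meant.
final class PathSchemeDemo {

    // Returns the scheme a path would use, falling back to the scheme of the
    // default file system when the path itself has none (hypothetical helper).
    static String fileSystemFor(String path, String defaultFs) {
        String scheme = URI.create(path).getScheme();
        return scheme != null ? scheme : URI.create(defaultFs).getScheme();
    }

    public static void main(String[] args) {
        String localDefault = "file:///"; // assumed default file system in local mode

        // A bare relative path has no scheme, so the default (local) one applies.
        System.out.println(fileSystemFor("input/sample.txt", localDefault));

        // An explicit hdfs:// URI always refers to HDFS, whatever the default is.
        System.out.println(fileSystemFor("hdfs://namenode:8020/user/demo/input", localDefault));
    }
}
```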

Step1: Get the code from the Git repository by issuing the clone command as follows:

$ git clone https://github.com/prasadkhode/wordcount-Debug.git

Step2: Import the project into your Eclipse workspace

step2.fig1

 

Step3: Setting Breakpoints:

To set a breakpoint in your source code, right-click in the small left margin of the source code editor and select Toggle Breakpoint. Alternatively, you can double-click in that margin next to the line of code you want to debug.

step3.fig1

step3.fig2

Step4: Starting the Debugger:

To debug our application, select a Java file that can be executed (WordCountDriver.java), right-click on it and select Debug As → Debug Configurations.

Go to the “Arguments” tab and enter the program arguments, that is, the input file name and the output folder name, as follows:

step4.fig1
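The values entered on the Arguments tab reach the driver as the `args` array of its `main` method. The project's driver code is not reproduced in this post, so the following is only a sketch of how such a driver might validate them, assuming the input file comes first and the output folder second. `JobArgs` and `parse` are hypothetical names.

```java
// Illustrative only: a stand-in for the argument handling in a driver's main method.
final class JobArgs {
    final String inputPath;
    final String outputPath;

    private JobArgs(String inputPath, String outputPath) {
        this.inputPath = inputPath;
        this.outputPath = outputPath;
    }

    // Expects exactly two program arguments: <input file> <output folder>.
    // The order is an assumption made for this sketch.
    static JobArgs parse(String[] args) {
        if (args == null || args.length != 2) {
            throw new IllegalArgumentException("Usage: <input file> <output folder>");
        }
        return new JobArgs(args[0], args[1]);
    }

    public static void main(String[] args) {
        JobArgs parsed = parse(args);
        System.out.println("input  = " + parsed.inputPath);
        System.out.println("output = " + parsed.outputPath);
    }
}
```

With a missing argument this fails fast with the usage message, which is easier to diagnose than a null path deep inside job setup. Also note that if the job writes through Hadoop's standard file output format, it will refuse to start when the output folder already exists, so the folder has to be deleted between debug runs.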

Click Apply and then Debug.

step4.fig2

Step5: Controlling the program execution:

Eclipse provides toolbar buttons for controlling the execution of the program that we are debugging, but it is usually easier to use the corresponding keys.

We can use the F5, F6, F7 and F8 keys to step through our code. The action of each of the four keys is described below:

Key   Description
F5    Step Into: executes the currently selected line and goes to the next line in our program. If the selected line is a method call, the debugger steps into the associated code.
F6    Step Over: executes the currently selected line, including any method it calls, without stepping into that method in the debugger.
F7    Step Return: finishes executing the current method and returns to the caller of this method.
F8    Resume: tells the Eclipse debugger to resume execution of the program until it reaches the next breakpoint.
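
To try the four keys on something small before stepping through a real job, a toy program like the one below may help. It is not part of the wordcount project; `SteppingDemo`, `countWords` and `normalize` are made-up names. The comments describe what each key would do when the debugger is suspended at the marked line.

```java
// Illustrative only: a tiny program for practising F5/F6/F7/F8 in the Eclipse debugger.
final class SteppingDemo {

    // Lower-cases a line and trims surrounding whitespace.
    static String normalize(String line) {
        // Suspended here, F7 finishes normalize and returns to countWords.
        return line.trim().toLowerCase();
    }

    // Counts whitespace-separated words in a line.
    static int countWords(String line) {
        // Set a breakpoint on the next line:
        // F5 steps into normalize, F6 runs normalize in one step and stays here.
        String cleaned = normalize(line);
        if (cleaned.isEmpty()) {
            return 0;
        }
        return cleaned.split("\\s+").length;
    }

    public static void main(String[] args) {
        // With the breakpoint above, F8 after the first stop runs on until the
        // breakpoint is hit again during the second call.
        int first = countWords("  Hello Hadoop World ");
        int second = countWords("");
        System.out.println(first + " " + second); // prints "3 0"
    }
}
```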