org.apache.hadoop.mrunit
Class MapReduceDriverBase<K1,V1,K2,V2,K3,V3>

java.lang.Object
  extended by org.apache.hadoop.mrunit.TestDriver<K1,V1,K3,V3>
      extended by org.apache.hadoop.mrunit.MapReduceDriverBase<K1,V1,K2,V2,K3,V3>
Direct Known Subclasses:
org.apache.hadoop.mrunit.MapReduceDriver, org.apache.hadoop.mrunit.mapreduce.MapReduceDriver

public abstract class MapReduceDriverBase<K1,V1,K2,V2,K3,V3>
extends TestDriver<K1,V1,K3,V3>

Harness that allows you to test a Mapper and a Reducer instance together. You provide the input key and value that should be sent to the Mapper, and the outputs you expect the Reducer to send to the collector for those inputs. When you call runTest(), the harness delivers the input to the Mapper, feeds the intermediate results to the Reducer (without checking them), and checks the Reducer's outputs against the expected results. This is designed to handle a single (k, v)* -> (k, v)* case from the Mapper/Reducer pair, representing a single unit test.
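The flow described above can be sketched in plain Java. This is an illustration only, not MRUnit code: PipelineSketch, pair, map, reduce and run are hypothetical names, Map.Entry stands in for Pair, and a word-count Mapper/Reducer is assumed purely for the example.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the harness flow. Every name here is hypothetical;
// nothing in this class is part of MRUnit or Hadoop.
class PipelineSketch {

    // Convenience factory for an immutable (k, v) pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // Mapper stand-in: emit (word, 1) for each whitespace-separated word.
    static List<Map.Entry<String, Integer>> map(String key, String value) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : value.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(pair(word, 1));
            }
        }
        return out;
    }

    // Reducer stand-in: sum all values collected for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Deliver each input to the mapper, group the intermediate pairs by key
    // (sorted by key), then reduce each group. Intermediate pairs are not checked.
    static List<Map.Entry<String, Integer>> run(List<Map.Entry<String, String>> inputs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> in : inputs) {
            for (Map.Entry<String, Integer> kv : map(in.getKey(), in.getValue())) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> g : grouped.entrySet()) {
            out.add(pair(g.getKey(), reduce(g.getKey(), g.getValue())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> inputs =
                Arrays.asList(pair("line1", "b a"), pair("line2", "a"));
        List<Map.Entry<String, Integer>> expected =
                Arrays.asList(pair("a", 2), pair("b", 1));
        // Only the final reducer output is compared with the expectation.
        System.out.println(run(inputs).equals(expected) ? "PASS" : "FAIL");
    }
}
```

With the real harness, the inputs would be registered through addInput, the expectations through addOutput, and the comparison triggered by runTest.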


Field Summary
protected  List<Pair<K1,V1>> inputList
           
protected  Comparator<K2> keyGroupComparator
          Key group comparator
protected  Comparator<K2> keyValueOrderComparator
          Key value order comparator
static org.apache.commons.logging.Log LOG
           
 
Fields inherited from class org.apache.hadoop.mrunit.TestDriver
configuration, counterWrapper, expectedEnumCounters, expectedOutputs, expectedStringCounters
 
Constructor Summary
MapReduceDriverBase()
           
 
Method Summary
 void addInput(K1 key, V1 val)
          Adds an input to send to the Mapper
 void addInput(Pair<K1,V1> input)
          Adds an input to send to the Mapper
 void addInputFromString(String input)
          Deprecated. No replacement due to lack of type safety and incompatibility with non-Text Writables.
 void addOutput(K3 key, V3 val)
          Adds a (k, v) pair we expect as output from the Reducer
 void addOutput(Pair<K3,V3> outputRecord)
          Adds an output (k, v) pair we expect from the Reducer
 void addOutputFromString(String output)
          Deprecated. No replacement due to lack of type safety and incompatibility with non-Text Writables.
abstract  List<Pair<K3,V3>> run()
          Runs the test but returns the result set instead of validating it (ignores any addOutput(), etc. calls made before this)
 void runTest(boolean orderMatters)
          Runs the test and validates the results
 void setKeyGroupingComparator(org.apache.hadoop.io.RawComparator<K2> groupingComparator)
          Set the key grouping comparator. This is similar to calling the following APIs, but you pass a real instance rather than just the class. Pre-0.20.1 API: JobConf.setOutputValueGroupingComparator(Class). 0.20.1+ API: Job.setGroupingComparatorClass(Class).
 void setKeyOrderComparator(org.apache.hadoop.io.RawComparator<K2> orderComparator)
          Set the key value order comparator. This is similar to calling the following APIs, but you pass a real instance rather than just the class. Pre-0.20.1 API: JobConf.setOutputKeyComparatorClass(Class). 0.20.1+ API: Job.setSortComparatorClass(Class).
 List<Pair<K2,List<V2>>> shuffle(List<Pair<K2,V2>> mapOutputs)
          Take the outputs from the Mapper, combine all values for the same key, and sort them by key.
 
Methods inherited from class org.apache.hadoop.mrunit.TestDriver
formatValueList, getConfiguration, getExpectedEnumCounters, getExpectedOutputs, getExpectedStringCounters, parseCommaDelimitedList, parseTabbedPair, resetExpectedCounters, resetOutput, runTest, setConfiguration, validate, validate, withCounter, withCounter
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

inputList

protected List<Pair<K1,V1>> inputList

keyGroupComparator

protected Comparator<K2> keyGroupComparator
Key group comparator


keyValueOrderComparator

protected Comparator<K2> keyValueOrderComparator
Key value order comparator

Constructor Detail

MapReduceDriverBase

public MapReduceDriverBase()
Method Detail

addInput

public void addInput(K1 key,
                     V1 val)
Adds an input to send to the Mapper

Parameters:
key - The input key
val - The input value

addInput

public void addInput(Pair<K1,V1> input)
Adds an input to send to the Mapper

Parameters:
input - The (k, v) pair to add to the input list.

addOutput

public void addOutput(Pair<K3,V3> outputRecord)
Adds an output (k, v) pair we expect from the Reducer

Parameters:
outputRecord - The (k, v) pair to add

addOutput

public void addOutput(K3 key,
                      V3 val)
Adds a (k, v) pair we expect as output from the Reducer

Parameters:
key - The expected output key
val - The expected output value

addInputFromString

@Deprecated
public void addInputFromString(String input)
Deprecated. No replacement due to lack of type safety and incompatibility with non-Text Writables.

Expects an input of the form "key \t val". Forces the Mapper input types to Text.

Parameters:
input - A string of the form "key \t val". Trims any whitespace.

addOutputFromString

@Deprecated
public void addOutputFromString(String output)
Deprecated. No replacement due to lack of type safety and incompatibility with non-Text Writables.

Expects an output of the form "key \t val". Forces the Reducer output types to Text.

Parameters:
output - A string of the form "key \t val". Trims any whitespace.

run

public abstract List<Pair<K3,V3>> run()
                               throws IOException
Description copied from class: TestDriver
Runs the test but returns the result set instead of validating it (ignores any addOutput(), etc. calls made before this)

Specified by:
run in class TestDriver<K1,V1,K3,V3>
Returns:
the list of (k, v) pairs returned as output from the test
Throws:
IOException

runTest

public void runTest(boolean orderMatters)
Description copied from class: TestDriver
Runs the test and validates the results

Specified by:
runTest in class TestDriver<K1,V1,K3,V3>
Parameters:
orderMatters - Whether or not output ordering is important
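The effect of orderMatters can be illustrated with a small plain-Java comparison. This is a sketch under assumptions: OrderCheckSketch and matches are hypothetical names, and the multiset comparison used for the unordered case is one plausible reading of "ordering is not important", not a description of how the harness validates internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of MRUnit.
class OrderCheckSketch {

    // Returns true when actual matches expected.
    // orderMatters == true : same elements in the same positions.
    // orderMatters == false: same elements with the same multiplicities, any order.
    static <T> boolean matches(List<T> expected, List<T> actual, boolean orderMatters) {
        if (orderMatters) {
            return expected.equals(actual);
        }
        // Cross off each actual record from a copy of the expectations.
        List<T> remaining = new ArrayList<>(expected);
        for (T item : actual) {
            if (!remaining.remove(item)) {
                return false; // unexpected (or surplus) record
            }
        }
        return remaining.isEmpty(); // anything left was expected but missing
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("a=2", "b=1");
        List<String> actual = Arrays.asList("b=1", "a=2");
        System.out.println(matches(expected, actual, false)); // true
        System.out.println(matches(expected, actual, true));  // false
    }
}
```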

shuffle

public List<Pair<K2,List<V2>>> shuffle(List<Pair<K2,V2>> mapOutputs)
Take the outputs from the Mapper, combine all values for the same key, and sort them by key.

Parameters:
mapOutputs - An unordered list of (key, val) pairs from the mapper
Returns:
the sorted list of (key, list(val))'s to present to the reducer
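The group-then-sort behaviour described here can be sketched with a TreeMap. This is a minimal illustration, assuming keys with a natural ordering; ShuffleSketch and pair are hypothetical names, and Map.Entry stands in for Pair.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the shuffle step; not MRUnit code.
class ShuffleSketch {

    // Convenience factory for an immutable (k, v) pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // Collect all values that share a key, then return the groups sorted by key.
    static <K extends Comparable<K>, V> List<Map.Entry<K, List<V>>> shuffle(
            List<Map.Entry<K, V>> mapOutputs) {
        // TreeMap keeps keys in their natural order while we group.
        Map<K, List<V>> grouped = new TreeMap<>();
        for (Map.Entry<K, V> kv : mapOutputs) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        List<Map.Entry<K, List<V>>> result = new ArrayList<>();
        for (Map.Entry<K, List<V>> e : grouped.entrySet()) {
            result.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutputs = new ArrayList<>();
        mapOutputs.add(pair("b", 2));
        mapOutputs.add(pair("a", 1));
        mapOutputs.add(pair("b", 3));
        System.out.println(shuffle(mapOutputs)); // [a=[1], b=[2, 3]]
    }
}
```

Values within a group keep the order in which the mapper stand-in emitted them; only the keys are sorted.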

setKeyGroupingComparator

public void setKeyGroupingComparator(org.apache.hadoop.io.RawComparator<K2> groupingComparator)
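Set the key grouping comparator, similar to calling the following API calls but passing a real instance rather than just the class:
pre 0.20.1 API: JobConf.setOutputValueGroupingComparator(Class)
0.20.1+ API: Job.setGroupingComparatorClass(Class)

Parameters:
groupingComparator - The comparator instance used to group keys for the Reducer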

setKeyOrderComparator

public void setKeyOrderComparator(org.apache.hadoop.io.RawComparator<K2> orderComparator)
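Set the key value order comparator, similar to calling the following API calls but passing a real instance rather than just the class:
pre 0.20.1 API: JobConf.setOutputKeyComparatorClass(Class)
0.20.1+ API: Job.setSortComparatorClass(Class)

Parameters:
orderComparator - The comparator instance used to sort keys before they reach the Reducer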
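How an order comparator and a grouping comparator interact can be sketched in plain Java. This is an illustration under assumptions, not the harness implementation: ComparatorShuffleSketch, pair and groupOf are hypothetical names, java.util.Comparator stands in for RawComparator, and the composite "group#sequence" string keys are invented for the example. The sketch keeps the first key of each group as the group's key.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a shuffle driven by two comparators; not MRUnit code.
class ComparatorShuffleSketch {

    // Convenience factory for an immutable (k, v) pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // Example-only key layout: "group#sequence". Returns the part before '#'.
    static String groupOf(String compositeKey) {
        return compositeKey.substring(0, compositeKey.indexOf('#'));
    }

    // Sort pairs with the order comparator, then start a new group whenever the
    // grouping comparator says the next key differs from the current group's key.
    static <K, V> List<Map.Entry<K, List<V>>> shuffle(
            List<Map.Entry<K, V>> mapOutputs, Comparator<K> order, Comparator<K> group) {
        List<Map.Entry<K, V>> sorted = new ArrayList<>(mapOutputs);
        sorted.sort((a, b) -> order.compare(a.getKey(), b.getKey()));

        List<Map.Entry<K, List<V>>> result = new ArrayList<>();
        K currentKey = null;
        List<V> currentValues = null;
        for (Map.Entry<K, V> kv : sorted) {
            if (currentValues == null || group.compare(currentKey, kv.getKey()) != 0) {
                currentKey = kv.getKey();
                currentValues = new ArrayList<>();
                result.add(new AbstractMap.SimpleImmutableEntry<>(currentKey, currentValues));
            }
            currentValues.add(kv.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> mapOutputs = new ArrayList<>();
        mapOutputs.add(pair("u1#2", "x"));
        mapOutputs.add(pair("u1#1", "y"));
        mapOutputs.add(pair("u2#1", "z"));

        Comparator<String> order = Comparator.naturalOrder();
        Comparator<String> group = Comparator.comparing(ComparatorShuffleSketch::groupOf);
        System.out.println(shuffle(mapOutputs, order, group)); // [u1#1=[y, x], u2#1=[z]]
    }
}
```

Sorting on the full key while grouping on only part of it is the usual secondary-sort arrangement: values reach the reducer stand-in already ordered within each group.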


Copyright © 2012 The Apache Software Foundation. All Rights Reserved.