MapReduceDriverBase (MRUnit 1.0.0 API)

org.apache.hadoop.mrunit
Class MapReduceDriverBase<K1,V1,K2,V2,K3,V3,T extends MapReduceDriverBase<K1,V1,K2,V2,K3,V3,T>>

java.lang.Object
  org.apache.hadoop.mrunit.TestDriver<K1,V1,K3,V3,T>
      org.apache.hadoop.mrunit.MapReduceDriverBase<K1,V1,K2,V2,K3,V3,T>

Direct Known Subclasses:: MapReduceDriver (org.apache.hadoop.mrunit), MapReduceDriver (org.apache.hadoop.mrunit.mapreduce)

public abstract class MapReduceDriverBase<K1,V1,K2,V2,K3,V3,T extends MapReduceDriverBase<K1,V1,K2,V2,K3,V3,T>>
extends TestDriver<K1,V1,K3,V3,T>

Harness that allows you to test a Mapper and a Reducer instance together. You provide the input key and value that should be sent to the Mapper, and the outputs you expect the Reducer to send to the collector for those inputs. When you call runTest(), the harness delivers the input to the Mapper, feeds the intermediate results to the Reducer (without checking them), and checks the Reducer's outputs against the expected results. This is designed to handle a single (k, v)* -> (k, v)* case from the Mapper/Reducer pair, representing a single unit test.
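The flow described above can be illustrated with a minimal plain-JDK sketch. It is not MRUnit code: HarnessSketch, its map and reduce functions, and the word-count logic are all hypothetical stand-ins, and the grouping step is an assumed simplification of what the real harness does between the Mapper and the Reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-JDK illustration only; none of these names come from MRUnit.
class HarnessSketch {

    // Hypothetical "mapper": emits (word, 1) for each whitespace-separated word.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Hypothetical "reducer": sums the counts collected for one key.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Map every input, group the intermediate pairs by key (sorted), then reduce each group.
    static List<Map.Entry<String, Integer>> run(List<String> inputs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : inputs) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> g : grouped.entrySet()) {
            out.add(Map.entry(g.getKey(), reduce(g.getValue())));
        }
        return out;
    }

    // Analogue of runTest(): only the final outputs are compared with the expectation;
    // the intermediate (word, 1) pairs are never checked.
    static boolean runTest(List<String> inputs, List<Map.Entry<String, Integer>> expected) {
        return run(inputs).equals(expected);
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> expected =
                List.of(Map.entry("a", 2), Map.entry("b", 1));
        System.out.println(runTest(List.of("a b", "a"), expected)); // prints true
    }
}
```

The split between run and runTest in the sketch mirrors the split documented on this page: run() returns the result set, while runTest() validates it against the expected outputs.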

Field Summary
`protected List<Pair<K1,V1>>`	`inputList`
`protected Comparator<K2>`	`keyGroupComparator` Key group comparator
`protected Comparator<K2>`	`keyValueOrderComparator` Key value order comparator
`static org.apache.commons.logging.Log`	`LOG`
`protected org.apache.hadoop.fs.Path`	`mapInputPath`

Fields inherited from class org.apache.hadoop.mrunit.TestDriver
`counterWrapper, expectedEnumCounters, expectedOutputs, expectedStringCounters`

Constructor Summary
`MapReduceDriverBase()`

Method Summary
`void`	`addAll(List<Pair<K1,V1>> inputs)` Adds input to send to the mapper
`void`	`addInput(K1 key, V1 val)` Adds an input to send to the mapper
`void`	`addInput(Pair<K1,V1> input)` Adds an input to send to the Mapper
`void`	`addInputFromString(String input)` Deprecated. No replacement due to lack of type safety and incompatibility with non Text Writables
`org.apache.hadoop.fs.Path`	`getMapInputPath()`
`protected void`	`preRunChecks(Object mapper, Object reducer)`
`abstract List<Pair<K3,V3>>`	`run()` Runs the test but returns the result set instead of validating it (ignores any addOutput(), etc calls made before this)
`void`	`setKeyGroupingComparator(org.apache.hadoop.io.RawComparator<K2> groupingComparator)` Set the key grouping comparator, similar to calling the following API calls but passing a real instance rather than just the class: pre 0.20.1 API: `JobConf.setOutputValueGroupingComparator(Class)` 0.20.1+ API: `Job.setGroupingComparatorClass(Class)`
`void`	`setKeyOrderComparator(org.apache.hadoop.io.RawComparator<K2> orderComparator)` Set the key value order comparator, similar to calling the following API calls but passing a real instance rather than just the class: pre 0.20.1 API: `JobConf.setOutputKeyComparatorClass(Class)` 0.20.1+ API: `Job.setSortComparatorClass(Class)`
`void`	`setMapInputPath(org.apache.hadoop.fs.Path mapInputPath)`
`List<Pair<K2,List<V2>>>`	`shuffle(List<Pair<K2,V2>> mapOutputs)` Take the outputs from the Mapper, combine all values for the same key, and sort them by key.
`T`	`withAll(List<Pair<K1,V1>> inputs)` Identical to addAll() but returns self for fluent programming style
`T`	`withInput(K1 key, V1 val)` Identical to addInput() but returns self for fluent programming style
`T`	`withInput(Pair<K1,V1> input)` Identical to addInput() but returns self for fluent programming style
`T`	`withInputFromString(String input)` Deprecated. No replacement due to lack of type safety and incompatibility with non Text Writables
`T`	`withKeyGroupingComparator(org.apache.hadoop.io.RawComparator<K2> groupingComparator)` Identical to `setKeyGroupingComparator(RawComparator)`, but with a fluent programming style
`T`	`withKeyOrderComparator(org.apache.hadoop.io.RawComparator<K2> orderComparator)` Identical to `setKeyOrderComparator(RawComparator)`, but with a fluent programming style
`T`	`withMapInputPath(org.apache.hadoop.fs.Path mapInputPath)`

Methods inherited from class org.apache.hadoop.mrunit.TestDriver
addAllOutput, addCacheArchive, addCacheArchive, addCacheFile, addCacheFile, addOutput, addOutput, addOutputFromString, cleanupDistributedCache, copy, copyPair, formatValueList, getConfiguration, getExpectedEnumCounters, getExpectedOutputs, getExpectedStringCounters, getOutputSerializationConfiguration, initDistributedCache, parseCommaDelimitedList, parseTabbedPair, printPreTestDebugLog, resetExpectedCounters, resetOutput, run, runTest, runTest, setCacheArchives, setCacheFiles, setConfiguration, setOutputSerializationConfiguration, thisAsTestDriver, validate, validate, withAllOutput, withCacheArchive, withCacheArchive, withCacheFile, withCacheFile, withConfiguration, withCounter, withCounter, withOutput, withOutput, withOutputFromString, withOutputSerializationConfiguration, withStrictCounterChecking

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

inputList

protected List<Pair<K1,V1>> inputList

mapInputPath

protected org.apache.hadoop.fs.Path mapInputPath

keyGroupComparator

protected Comparator<K2> keyGroupComparator

Key group comparator

keyValueOrderComparator

protected Comparator<K2> keyValueOrderComparator

Key value order comparator

Constructor Detail

MapReduceDriverBase

public MapReduceDriverBase()

Method Detail

addInput

public void addInput(K1 key,
                     V1 val)

Adds an input to send to the mapper

Parameters:: key - the input key to send to the mapper; val - the input value to send to the mapper

addInput

public void addInput(Pair<K1,V1> input)

Adds an input to send to the Mapper

Parameters:: input - The (k, v) pair to add to the input list.

addAll

public void addAll(List<Pair<K1,V1>> inputs)

Adds input to send to the mapper

Parameters:: inputs - List of (k, v) pairs to add to the input list

addInputFromString

@Deprecated
public void addInputFromString(String input)

Deprecated. No replacement due to lack of type safety and incompatibility with non Text Writables

Expects an input of the form "key \t val". Forces the Mapper input types to Text.

Parameters:: input - A string of the form "key \t val". Trims any whitespace.

withInput

public T withInput(K1 key,
                   V1 val)

Identical to addInput() but returns self for fluent programming style

Parameters:: key - the input key to send to the mapper; val - the input value to send to the mapper
Returns:: this
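The "returns self" behavior relies on the self-referential type parameter T in the class declaration. The following is a minimal sketch of that pattern in plain Java; FluentBase, FluentDriver, self(), withLabel and describe are hypothetical names chosen for illustration and are not part of MRUnit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class mirroring the "T extends Base<T>" shape; not MRUnit code.
abstract class FluentBase<K, V, T extends FluentBase<K, V, T>> {
    protected final List<String> inputs = new ArrayList<>();

    // Subclasses return themselves so chained calls keep the concrete type.
    protected abstract T self();

    public void addInput(K key, V val) {
        inputs.add(key + "=" + val);
    }

    // Same effect as addInput, but returns the concrete driver for chaining.
    public T withInput(K key, V val) {
        addInput(key, val);
        return self();
    }
}

final class FluentDriver extends FluentBase<String, Integer, FluentDriver> {
    private String label = "";

    @Override
    protected FluentDriver self() {
        return this;
    }

    // Subclass-only method, reachable mid-chain because withInput returns FluentDriver.
    public FluentDriver withLabel(String label) {
        this.label = label;
        return this;
    }

    public String describe() {
        return label + inputs;
    }
}

class FluentDemo {
    public static void main(String[] args) {
        String s = new FluentDriver()
                .withInput("a", 1)
                .withLabel("t")
                .withInput("b", 2)
                .describe();
        System.out.println(s); // prints t[a=1, b=2]
    }
}
```

Without the T parameter, withInput would have to return the base type, and the call to withLabel in the middle of the chain would not compile.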

withInput

public T withInput(Pair<K1,V1> input)

Identical to addInput() but returns self for fluent programming style

Parameters:: input - The (k, v) pair to add
Returns:: this

withInputFromString

@Deprecated
public T withInputFromString(String input)

Deprecated. No replacement due to lack of type safety and incompatibility with non Text Writables

Identical to addInputFromString, but with a fluent programming style

Parameters:: input - A string of the form "key \t val". Trims any whitespace.
Returns:: this

withAll

public T withAll(List<Pair<K1,V1>> inputs)

Identical to addAll() but returns self for fluent programming style

Parameters:: inputs - List of (k, v) pairs to add
Returns:: this

getMapInputPath

public org.apache.hadoop.fs.Path getMapInputPath()

Returns:: the path passed to the mapper's InputSplit

setMapInputPath

public void setMapInputPath(org.apache.hadoop.fs.Path mapInputPath)

Parameters:: mapInputPath - Path which is to be passed to the mapper's InputSplit

withMapInputPath

public final T withMapInputPath(org.apache.hadoop.fs.Path mapInputPath)

Parameters:: mapInputPath - The Path object which will be given to the mapper
Returns:: this

preRunChecks

protected void preRunChecks(Object mapper,
                            Object reducer)

run

public abstract List<Pair<K3,V3>> run()
                               throws IOException

Description copied from class: TestDriver

Runs the test but returns the result set instead of validating it (ignores any addOutput(), etc calls made before this)

Specified by:: run in class TestDriver<K1,V1,K3,V3,T extends MapReduceDriverBase<K1,V1,K2,V2,K3,V3,T>>

Returns:: the list of (k, v) pairs returned as output from the test
Throws:: IOException

shuffle

public List<Pair<K2,List<V2>>> shuffle(List<Pair<K2,V2>> mapOutputs)

Take the outputs from the Mapper, combine all values for the same key, and sort them by key.

Parameters:: mapOutputs - An unordered list of (key, val) pairs from the mapper
Returns:: the sorted list of (key, list(val))'s to present to the reducer
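As a rough illustration of the grouping and sorting described for shuffle, here is a plain-JDK sketch. It assumes keys that are naturally comparable and uses Map.Entry in place of MRUnit's Pair; ShuffleSketch is a hypothetical name, and the real method may differ in details such as comparator handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-JDK illustration only; Map.Entry stands in for MRUnit's Pair.
class ShuffleSketch {

    // Collects all values that share a key and returns the groups in ascending key order.
    static <K extends Comparable<K>, V> List<Map.Entry<K, List<V>>> shuffle(
            List<Map.Entry<K, V>> mapOutputs) {
        TreeMap<K, List<V>> grouped = new TreeMap<>();
        for (Map.Entry<K, V> kv : mapOutputs) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        List<Map.Entry<K, List<V>>> out = new ArrayList<>();
        for (Map.Entry<K, List<V>> g : grouped.entrySet()) {
            out.add(Map.entry(g.getKey(), g.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutputs =
                List.of(Map.entry("b", 1), Map.entry("a", 2), Map.entry("b", 3));
        System.out.println(shuffle(mapOutputs)); // prints [a=[2], b=[1, 3]]
    }
}
```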

setKeyGroupingComparator

public void setKeyGroupingComparator(org.apache.hadoop.io.RawComparator<K2> groupingComparator)

Set the key grouping comparator, similar to calling the following API calls but passing a real instance rather than just the class:

pre 0.20.1 API: JobConf.setOutputValueGroupingComparator(Class)
0.20.1+ API: Job.setGroupingComparatorClass(Class)

Parameters:: groupingComparator - Comparator to use in the shuffle stage for key grouping

setKeyOrderComparator

public void setKeyOrderComparator(org.apache.hadoop.io.RawComparator<K2> orderComparator)

Set the key value order comparator, similar to calling the following API calls but passing a real instance rather than just the class:

pre 0.20.1 API: JobConf.setOutputKeyComparatorClass(Class)
0.20.1+ API: Job.setSortComparatorClass(Class)

Parameters:: orderComparator - Comparator to use in the shuffle stage for key value ordering

withKeyGroupingComparator

public T withKeyGroupingComparator(org.apache.hadoop.io.RawComparator<K2> groupingComparator)

Identical to setKeyGroupingComparator(RawComparator), but with a fluent programming style

Parameters:: groupingComparator - Comparator to use in the shuffle stage for key grouping
Returns:: this

withKeyOrderComparator

public T withKeyOrderComparator(org.apache.hadoop.io.RawComparator<K2> orderComparator)

Identical to setKeyOrderComparator(RawComparator), but with a fluent programming style

Parameters:: orderComparator - Comparator to use in the shuffle stage for key value ordering
Returns:: this
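To show how a key order comparator and a key grouping comparator can play different roles in a shuffle, here is a plain-JDK sketch of a secondary-sort style setup. It is an assumption-laden illustration, not MRUnit's implementation: ComparatorShuffleSketch, the "user#sequence" key format, and the choice to keep the first key of each group are all hypothetical, and java.util.Comparator stands in for RawComparator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Plain-JDK illustration only; not MRUnit's shuffle implementation.
class ComparatorShuffleSketch {

    // Sorts pairs with the order comparator, then starts a new group whenever the
    // grouping comparator reports that the key differs from the current group's key.
    static <K, V> List<Map.Entry<K, List<V>>> shuffle(
            List<Map.Entry<K, V>> mapOutputs, Comparator<K> order, Comparator<K> grouping) {
        List<Map.Entry<K, V>> sorted = new ArrayList<>(mapOutputs);
        sorted.sort((a, b) -> order.compare(a.getKey(), b.getKey()));

        List<Map.Entry<K, List<V>>> groups = new ArrayList<>();
        K current = null;
        List<V> values = null;
        for (Map.Entry<K, V> kv : sorted) {
            if (values == null || grouping.compare(current, kv.getKey()) != 0) {
                current = kv.getKey();          // assumed: the first key represents the group
                values = new ArrayList<>();
                groups.add(Map.entry(current, values));
            }
            values.add(kv.getValue());
        }
        return groups;
    }

    public static void main(String[] args) {
        // Hypothetical composite keys of the form "user#sequence".
        List<Map.Entry<String, String>> mapOutputs = List.of(
                Map.entry("u1#2", "x"), Map.entry("u2#1", "y"), Map.entry("u1#1", "z"));

        Comparator<String> order = Comparator.naturalOrder();
        Comparator<String> byUser = Comparator.comparing(k -> k.substring(0, k.indexOf('#')));

        // Sorted by full key, grouped by the user prefix only.
        System.out.println(shuffle(mapOutputs, order, byUser)); // prints [u1#1=[z, x], u2#1=[y]]
    }
}
```

Passing the same comparator for both roles yields one group per distinct key, which is three groups for the sample input above.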
