TestDriver (MRUnit 1.0.0 API)

org.apache.hadoop.mrunit
Class TestDriver<K1,V1,K2,V2,T extends TestDriver<K1,V1,K2,V2,T>>

java.lang.Object
  org.apache.hadoop.mrunit.TestDriver<K1,V1,K2,V2,T>

Direct Known Subclasses:: MapDriverBase, MapReduceDriverBase, PipelineMapReduceDriver, ReduceDriverBase

public abstract class TestDriver<K1,V1,K2,V2,T extends TestDriver<K1,V1,K2,V2,T>>
extends Object

Field Summary
`protected org.apache.hadoop.mrunit.internal.counters.CounterWrapper`	`counterWrapper`
`protected List<Pair<Enum<?>,Long>>`	`expectedEnumCounters`
`protected List<Pair<K2,V2>>`	`expectedOutputs`
`protected List<Pair<Pair<String,String>,Long>>`	`expectedStringCounters`
`static org.apache.commons.logging.Log`	`LOG`

Constructor Summary
`TestDriver()`

Method Summary

void addAllOutput(List<Pair<K2,V2>> outputRecords)
Adds output (k, v)* pairs we expect

void addCacheArchive(String path)
Adds an archive to be put on the distributed cache.

void addCacheArchive(URI uri)
Adds an archive to be put on the distributed cache.

void addCacheFile(String path)
Adds a file to be put on the distributed cache.

void addCacheFile(URI uri)
Adds a file to be put on the distributed cache.

void addOutput(K2 key, V2 val)
Adds a (k, v) pair we expect as output

void addOutput(Pair<K2,V2> outputRecord)
Adds an output (k, v) pair we expect

void addOutputFromString(String output)
Deprecated. No replacement due to lack of type safety and incompatibility with non Text Writables

protected void cleanupDistributedCache()
Cleans up the distributed cache test by deleting the temporary directory and any extracted cache archives contained within

protected <E> E copy(E object)

protected <S,E> Pair<S,E> copyPair(S first, E second)

protected static void formatValueList(List<?> values, StringBuilder sb)

org.apache.hadoop.conf.Configuration getConfiguration()

List<Pair<Enum<?>,Long>> getExpectedEnumCounters()

List<Pair<K2,V2>> getExpectedOutputs()

List<Pair<Pair<String,String>,Long>> getExpectedStringCounters()

org.apache.hadoop.conf.Configuration getOutputSerializationConfiguration()
Get the Configuration to use when copying output for use with run* methods or for the InputFormat when reading output back in when setting a real OutputFormat.

protected void initDistributedCache()
Initialises the test distributed cache if required.

protected static List<org.apache.hadoop.io.Text> parseCommaDelimitedList(String commaDelimList)
Split "val,val,val,val..." into a List of Text(val) objects.

static Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text> parseTabbedPair(String tabSeparatedPair)
Split "key \t val" into Pair(Text(key), Text(val))

protected void printPreTestDebugLog()
Overridable hook for printing pre-test debug information

void resetExpectedCounters()
Clears the list of expected counters from this driver

void resetOutput()
Clears the list of outputs expected from this driver

abstract List<Pair<K2,V2>> run()
Runs the test but returns the result set instead of validating it (ignores any addOutput(), etc calls made before this)

List<Pair<K2,V2>> run(boolean validateCounters)
Runs the test but returns the result set instead of validating it (ignores any addOutput(), etc calls made before this).

void runTest()
Runs the test and validates the results

void runTest(boolean orderMatters)
Runs the test and validates the results

void setCacheArchives(URI[] archives)
Set the list of archives to put on the distributed cache

void setCacheFiles(URI[] files)
Set the list of files to put on the distributed cache

void setConfiguration(org.apache.hadoop.conf.Configuration configuration)
Deprecated. Use getConfiguration() to set configuration items as opposed to overriding the entire configuration object as it's used internally.

void setOutputSerializationConfiguration(org.apache.hadoop.conf.Configuration configuration)
Set the Configuration to use when copying output for use with run* methods or for the InputFormat when reading output back in when setting a real OutputFormat.

protected T thisAsTestDriver()

protected void validate(org.apache.hadoop.mrunit.internal.counters.CounterWrapper counterWrapper)
Check counters.

protected void validate(List<Pair<K2,V2>> outputs, boolean orderMatters)
Checks the actual outputs against the expected outputs recorded on this driver

T withAllOutput(List<Pair<K2,V2>> outputRecords)
Functions like addAllOutput() but returns self for fluent programming style

T withCacheArchive(String archive)
Adds an archive to be put on the distributed cache.

T withCacheArchive(URI archive)
Adds an archive to be put on the distributed cache.

T withCacheFile(String file)
Adds a file to be put on the distributed cache.

T withCacheFile(URI file)
Adds a file to be put on the distributed cache.

T withConfiguration(org.apache.hadoop.conf.Configuration configuration)
Deprecated. Use getConfiguration() to set configuration items as opposed to overriding the entire configuration object as it's used internally.

T withCounter(Enum<?> e, long expectedValue)
Register expected enumeration based counter value

T withCounter(String group, String name, long expectedValue)
Register expected name based counter value

T withOutput(K2 key, V2 val)
Works like addOutput() but returns self for fluent programming style

T withOutput(Pair<K2,V2> outputRecord)
Works like addOutput(), but returns self for fluent style

T withOutputFromString(String output)
Deprecated. No replacement due to lack of type safety and incompatibility with non Text Writables

T withOutputSerializationConfiguration(org.apache.hadoop.conf.Configuration configuration)
Set the Configuration to use when copying output for use with run* methods or for the InputFormat when reading output back in when setting a real OutputFormat.

T withStrictCounterChecking()
Change counter checking.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

expectedOutputs

protected List<Pair<K2,V2>> expectedOutputs

expectedEnumCounters

protected List<Pair<Enum<?>,Long>> expectedEnumCounters

expectedStringCounters

protected List<Pair<Pair<String,String>,Long>> expectedStringCounters

counterWrapper

protected org.apache.hadoop.mrunit.internal.counters.CounterWrapper counterWrapper

Constructor Detail

TestDriver

public TestDriver()

Method Detail

addAllOutput

public void addAllOutput(List<Pair<K2,V2>> outputRecords)

Adds output (k, v)* pairs we expect

Parameters:: outputRecords - The (k, v)* pairs to add

withAllOutput

public T withAllOutput(List<Pair<K2,V2>> outputRecords)

Functions like addAllOutput() but returns self for fluent programming style

Parameters:: outputRecords - The (k, v)* pairs to add
Returns:: this

addOutput

public void addOutput(Pair<K2,V2> outputRecord)

Adds an output (k, v) pair we expect

Parameters:: outputRecord - The (k, v) pair to add

addOutput

public void addOutput(K2 key,
                      V2 val)

Adds a (k, v) pair we expect as output

Parameters:: key - the key; val - the value

withOutput

public T withOutput(Pair<K2,V2> outputRecord)

Works like addOutput(), but returns self for fluent style

Parameters:: outputRecord - The (k, v) pair to add
Returns:: this

withOutput

public T withOutput(K2 key,
                    V2 val)

Works like addOutput() but returns self for fluent programming style

Returns:: this
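
The `with*` methods can return the concrete driver type because of the self-referential type parameter `T extends TestDriver<K1,V1,K2,V2,T>`. The following is a minimal sketch of that pattern, not MRUnit's source: `SketchDriver`, `SketchMapDriver`, `thisAsDriver()` and `withInput()` are hypothetical names, and the expected outputs are stored as plain strings to keep the example free of Hadoop dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, stripped-down stand-in for TestDriver: only the self-type
// parameter T and a list of expected outputs are modelled.
abstract class SketchDriver<K, V, T extends SketchDriver<K, V, T>> {
    protected final List<String> expectedOutputs = new ArrayList<>();

    // Non-fluent form: mutates the driver and returns nothing.
    public void addOutput(K key, V val) {
        expectedOutputs.add(key + "=" + val);
    }

    // Fluent form: same effect, but returns the driver as its concrete type T.
    public T withOutput(K key, V val) {
        addOutput(key, val);
        return thisAsDriver();
    }

    // The single unchecked cast, kept in one place.
    @SuppressWarnings("unchecked")
    protected T thisAsDriver() {
        return (T) this;
    }
}

// Hypothetical subclass that adds a method of its own.
class SketchMapDriver<K, V> extends SketchDriver<K, V, SketchMapDriver<K, V>> {
    String input;

    public SketchMapDriver<K, V> withInput(String in) {
        this.input = in;
        return this;
    }
}

class FluentDemo {
    public static void main(String[] args) {
        // withOutput() returns SketchMapDriver, so the subclass-only
        // withInput() can still be chained after it.
        SketchMapDriver<String, Integer> driver = new SketchMapDriver<String, Integer>()
                .withOutput("cat", 1)
                .withInput("cat")
                .withOutput("dog", 2);
        System.out.println(driver.expectedOutputs);
    }
}
```

Without the self-type parameter, `withOutput()` would return the base type and the chain would lose access to subclass methods after the first call.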

addOutputFromString

@Deprecated
public void addOutputFromString(String output)

Deprecated. No replacement due to lack of type safety and incompatibility with non Text Writables

Expects an input of the form "key \t val". Forces the output types to Text.

Parameters:: output - A string of the form "key \t val". Trims any whitespace.

withOutputFromString

@Deprecated
public T withOutputFromString(String output)

Deprecated. No replacement due to lack of type safety and incompatibility with non Text Writables

Identical to addOutputFromString, but with a fluent programming style

Parameters:: output - A string of the form "key \t val". Trims any whitespace.
Returns:: this

getExpectedOutputs

public List<Pair<K2,V2>> getExpectedOutputs()

Returns:: the list of (k, v) pairs expected as output from this driver

resetOutput

public void resetOutput()

Clears the list of outputs expected from this driver

getExpectedEnumCounters

public List<Pair<Enum<?>,Long>> getExpectedEnumCounters()

Returns:: expected counters from this driver

getExpectedStringCounters

public List<Pair<Pair<String,String>,Long>> getExpectedStringCounters()

Returns:: expected counters from this driver

resetExpectedCounters

public void resetExpectedCounters()

Clears the list of expected counters from this driver

thisAsTestDriver

protected T thisAsTestDriver()

withCounter

public T withCounter(Enum<?> e,
                     long expectedValue)

Parameters:: e - Enumeration based counter; expectedValue - Expected value
Returns:: this

withCounter

public T withCounter(String group,
                     String name,
                     long expectedValue)

Parameters:: group - Counter group; name - Counter name; expectedValue - Expected value
Returns:: this

withStrictCounterChecking

public T withStrictCounterChecking()

Change counter checking. After this method is called, the test fails if an actual counter is not matched by an expected counter. By default, the test only checks that every expected counter is present. Strict mode lets you ensure that no unexpected counters have been declared.
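
The difference between the default and strict modes can be modelled with two maps. This is a simplified sketch under stated assumptions, not MRUnit's validation code: `CounterCheckSketch` and `check()` are hypothetical, counters are keyed by a plain name, and a counter missing from the actual map is reported as a mismatch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CounterCheckSketch {
    // Returns a list of problems; an empty list means the check passed.
    static List<String> check(Map<String, Long> expected, Map<String, Long> actual, boolean strict) {
        List<String> errors = new ArrayList<>();
        // Default mode: every expected counter must be present with the expected value.
        for (Map.Entry<String, Long> e : expected.entrySet()) {
            Long got = actual.get(e.getKey());
            if (!e.getValue().equals(got)) {
                errors.add("counter " + e.getKey() + ": expected " + e.getValue() + ", got " + got);
            }
        }
        // Strict mode: additionally, no actual counter may be left unmatched.
        if (strict) {
            for (String name : actual.keySet()) {
                if (!expected.containsKey(name)) {
                    errors.add("unexpected counter " + name);
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, Long> expected = new LinkedHashMap<>();
        expected.put("RECORDS_SEEN", 3L);

        Map<String, Long> actual = new LinkedHashMap<>();
        actual.put("RECORDS_SEEN", 3L);
        actual.put("RECORDS_SKIPPED", 1L); // never registered as expected

        System.out.println("default: " + check(expected, actual, false));
        System.out.println("strict:  " + check(expected, actual, true));
    }
}
```

In this model the default check passes, while the strict check reports the extra `RECORDS_SKIPPED` counter.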

getConfiguration

public org.apache.hadoop.conf.Configuration getConfiguration()

Returns:: The configuration object that will be given to the mapper and/or reducer associated with the driver

setConfiguration

@Deprecated
public void setConfiguration(org.apache.hadoop.conf.Configuration configuration)

Deprecated. Use getConfiguration() to set configuration items as opposed to overriding the entire configuration object as it's used internally.

Parameters:: configuration - The configuration object that will be given to the mapper and/or reducer associated with the driver. This method should only be called directly after the constructor, as the internal state of the driver depends on the configuration object

withConfiguration

@Deprecated
public T withConfiguration(org.apache.hadoop.conf.Configuration configuration)

Deprecated. Use getConfiguration() to set configuration items as opposed to overriding the entire configuration object as it's used internally.

Parameters:: configuration - The configuration object that will be given to the mapper associated with the driver. This method should only be called directly after the constructor, as the internal state of the driver depends on the configuration object
Returns:: this object for fluent coding

getOutputSerializationConfiguration

public org.apache.hadoop.conf.Configuration getOutputSerializationConfiguration()

Get the Configuration to use when copying output for use with run* methods or for the InputFormat when reading output back in when setting a real OutputFormat.

Returns:: outputSerializationConfiguration, null when no outputSerializationConfiguration is set

setOutputSerializationConfiguration

public void setOutputSerializationConfiguration(org.apache.hadoop.conf.Configuration configuration)

Set the Configuration to use when copying output for use with run* methods or for the InputFormat when reading output back in when setting a real OutputFormat. When this configuration is not set, MRUnit will use the configuration set with withConfiguration(Configuration) or setConfiguration(Configuration)

Parameters:: configuration - the Configuration to use when copying or reading back output

withOutputSerializationConfiguration

public T withOutputSerializationConfiguration(org.apache.hadoop.conf.Configuration configuration)

Parameters:: configuration - the Configuration to use when copying or reading back output
Returns:: this for fluent style

addCacheFile

public void addCacheFile(String path)

Adds a file to be put on the distributed cache. The path may be relative, in which case MRUnit tries to resolve it from the classpath of the test.

Parameters:: path - path to the file

addCacheFile

public void addCacheFile(URI uri)

Adds a file to be put on the distributed cache.

Parameters:: uri - uri of the file

setCacheFiles

public void setCacheFiles(URI[] files)

Set the list of files to put on the distributed cache

Parameters:: files - list of URIs

addCacheArchive

public void addCacheArchive(String path)

Adds an archive to be put on the distributed cache. The path may be relative, in which case MRUnit tries to resolve it from the classpath of the test.

Parameters:: path - path to the archive

addCacheArchive

public void addCacheArchive(URI uri)

Adds an archive to be put on the distributed cache.

Parameters:: uri - uri of the archive

setCacheArchives

public void setCacheArchives(URI[] archives)

Set the list of archives to put on the distributed cache

Parameters:: archives - list of URIs

withCacheFile

public T withCacheFile(String file)

Adds a file to be put on the distributed cache. The path may be relative, in which case MRUnit tries to resolve it from the classpath of the test.

Parameters:: file - path to the file
Returns:: the driver

withCacheFile

public T withCacheFile(URI file)

Adds a file to be put on the distributed cache.

Parameters:: file - uri of the file
Returns:: the driver

withCacheArchive

public T withCacheArchive(String archive)

Adds an archive to be put on the distributed cache. The path may be relative, in which case MRUnit tries to resolve it from the classpath of the test.

Parameters:: archive - path to the archive
Returns:: the driver

withCacheArchive

public T withCacheArchive(URI archive)

Adds an archive to be put on the distributed cache.

Parameters:: archive - uri of the archive
Returns:: the driver

run

public List<Pair<K2,V2>> run(boolean validateCounters)
                      throws IOException

Runs the test but returns the result set instead of validating it (ignores any addOutput(), etc calls made before this). Also optionally performs counter validation.

Parameters:: validateCounters - whether to run automatic counter validation
Returns:: the list of (k, v) pairs returned as output from the test
Throws:: IOException

initDistributedCache

protected void initDistributedCache()
                             throws IOException

Initialises the test distributed cache if required. This process is referred to as "localizing" by Hadoop, but since this is a unit test all files/archives are already local. Cached files are not moved but cached archives are extracted into a temporary directory.

Throws:: IOException

cleanupDistributedCache

protected void cleanupDistributedCache()
                                throws IOException

Cleans up the distributed cache test by deleting the temporary directory and any extracted cache archives contained within

Throws:: IOException - if the local fs handle cannot be retrieved

run

public abstract List<Pair<K2,V2>> run()
                               throws IOException

Runs the test but returns the result set instead of validating it (ignores any addOutput(), etc calls made before this)

Returns:: the list of (k, v) pairs returned as output from the test
Throws:: IOException

runTest

public void runTest()
             throws IOException

Runs the test and validates the results

Throws:: IOException

runTest

public void runTest(boolean orderMatters)
             throws IOException

Runs the test and validates the results

Parameters:: orderMatters - Whether or not output ordering is important
Throws:: IOException

printPreTestDebugLog

protected void printPreTestDebugLog()

Overridable hook for printing pre-test debug information

parseTabbedPair

public static Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text> parseTabbedPair(String tabSeparatedPair)

Split "key \t val" into Pair(Text(key), Text(val))

Parameters:: tabSeparatedPair - A string of the form "key \t val"
Returns:: Pair(Text(key), Text(val))
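
As an illustration of the "key \t val" format, the sketch below splits a string at its first tab. It is a stand-in written against plain `String` values rather than `Text`, and several details are assumptions not stated on this page: `TabbedPairSketch` is a hypothetical class, both halves are trimmed, and `null` is returned when no tab is present.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

class TabbedPairSketch {
    // Splits at the first tab and trims both halves.
    // Returning null for input without a tab is this sketch's choice.
    static Map.Entry<String, String> parse(String tabSeparatedPair) {
        int tab = tabSeparatedPair.indexOf('\t');
        if (tab < 0) {
            return null;
        }
        String key = tabSeparatedPair.substring(0, tab).trim();
        String val = tabSeparatedPair.substring(tab + 1).trim();
        return new SimpleImmutableEntry<>(key, val);
    }

    public static void main(String[] args) {
        Map.Entry<String, String> pair = parse("cat \t 3");
        System.out.println("key=[" + pair.getKey() + "] val=[" + pair.getValue() + "]");
    }
}
```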

parseCommaDelimitedList

protected static List<org.apache.hadoop.io.Text> parseCommaDelimitedList(String commaDelimList)

Split "val,val,val,val..." into a List of Text(val) objects.

Parameters:: commaDelimList - A list of values separated by commas
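
The "val,val,val,val..." format can be illustrated the same way. This is a rough sketch using plain strings instead of `Text`. `CommaListSketch` is a hypothetical class, and the trimming of each value and the handling of empty entries are assumptions rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;

class CommaListSketch {
    static List<String> parse(String commaDelimList) {
        List<String> out = new ArrayList<>();
        // A limit of -1 keeps trailing empty entries instead of dropping them.
        for (String part : commaDelimList.split(",", -1)) {
            out.add(part.trim());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("1, 2,3"));
    }
}
```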

copy

protected <E> E copy(E object)

copyPair

protected <S,E> Pair<S,E> copyPair(S first,
                                   E second)

validate

protected void validate(List<Pair<K2,V2>> outputs,
                        boolean orderMatters)

Checks the actual outputs against the expected outputs recorded on this driver

Parameters:: outputs - The actual output (k, v) pairs; orderMatters - Whether or not output ordering is important when validating test result
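
The effect of `orderMatters` can be modelled as a choice between a positional comparison and a multiset comparison. The sketch below is a simplified model, not MRUnit's validation logic: `OrderCheckSketch` and `matches()` are hypothetical, records are plain strings, and the result is a boolean rather than a failed assertion with a detailed message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OrderCheckSketch {
    static <E> boolean matches(List<E> expected, List<E> actual, boolean orderMatters) {
        if (orderMatters) {
            // Same records at the same positions.
            return expected.equals(actual);
        }
        // Same records with the same multiplicities, in any order.
        List<E> remaining = new ArrayList<>(actual);
        for (E e : expected) {
            if (!remaining.remove(e)) { // removes the first equal element, if any
                return false;           // an expected record is missing
            }
        }
        return remaining.isEmpty();     // leftovers are unexpected records
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("cat=1", "dog=2");
        List<String> actual = Arrays.asList("dog=2", "cat=1");
        System.out.println("ordered:   " + matches(expected, actual, true));
        System.out.println("unordered: " + matches(expected, actual, false));
    }
}
```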

validate

protected void validate(org.apache.hadoop.mrunit.internal.counters.CounterWrapper counterWrapper)

Check counters.

formatValueList

protected static void formatValueList(List<?> values,
                                      StringBuilder sb)
