Pipeline to execute MapReduce jobs.
Args:
job_name: job name as string.
mapper_spec: specification of mapper to use.
reducer_spec: specification of reducer to use.
input_reader_spec: specification of input reader to read data from.
output_writer_spec: specification of output writer to save reduce output to.
mapper_params: parameters to use for mapper phase.
reducer_params: parameters to use for reduce phase.
shards: number of shards to use as int.
combiner_spec: Optional. Specification of a combine function. If not
supplied, no combine step will take place. The combine function takes a
key, list of values and list of previously combined results. It yields
combined values that might be processed by another combiner call, but will
eventually end up in reducer. The combiner output key is assumed to be the
same as the input key.
Returns:
result_status: one of model.MapreduceState._RESULTS. Check this to see
if the job is successful.
default: a list of filenames if the mapreduce was sucesssful and
was outputting files. An empty list otherwise.