Runs an MRJob on your Hadoop cluster. Invoked when you run your job with -r hadoop.
Input and support files can be either local or on HDFS; use hdfs://... URLs to refer to files on HDFS.
HadoopJobRunner takes the same arguments as MRJobRunner, plus some additional options whose defaults can be set in mrjob.conf.
Return the path where Hadoop stores logs.
Parameters: hadoop_home – putative value of HADOOP_HOME, or None to use the actual value of the HADOOP_HOME environment variable. This is only used if HADOOP_LOG_DIR is not defined.
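The lookup order described above can be illustrated with a minimal sketch. This is not mrjob's actual implementation: the function name hadoop_log_dir_sketch and the environ parameter are hypothetical, and the "logs" subdirectory under HADOOP_HOME is an assumption based on Hadoop's conventional layout.

```python
import os


def hadoop_log_dir_sketch(hadoop_home=None, environ=None):
    """Illustrative lookup only; the name and the environ parameter are hypothetical."""
    if environ is None:
        environ = os.environ

    # HADOOP_LOG_DIR, when defined, takes precedence over hadoop_home.
    log_dir = environ.get('HADOOP_LOG_DIR')
    if log_dir is not None:
        return log_dir

    # Fall back to the real HADOOP_HOME only if no putative value was given.
    if hadoop_home is None:
        hadoop_home = environ['HADOOP_HOME']  # raises KeyError if unset

    # Assumption: logs live in a "logs" subdirectory of HADOOP_HOME.
    return os.path.join(hadoop_home, 'logs')
```

Passing a dictionary as environ makes the precedence easy to try out without touching the real environment.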
Return the path of the hadoop streaming jar inside the given directory tree, or None if we can’t find it.
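A search of this kind could be sketched roughly as follows. The function name find_streaming_jar_sketch and the filename pattern are assumptions made for illustration; the pattern mrjob actually uses may differ.

```python
import os
import re

# Assumption: streaming jars have names such as "hadoop-streaming-2.7.3.jar"
# or "hadoop-0.20.2-streaming.jar". The real matching rule may be stricter.
_STREAMING_JAR_RE = re.compile(r'^hadoop.*streaming.*\.jar$')


def find_streaming_jar_sketch(root):
    """Walk the tree under root; return the first matching jar path, or None."""
    for dirpath, _dirnames, filenames in os.walk(root):
        # Sort so the choice within a single directory is deterministic.
        for filename in sorted(filenames):
            if _STREAMING_JAR_RE.match(filename):
                return os.path.join(dirpath, filename)
    return None
```

Returning None rather than raising leaves it to the caller to decide whether a missing jar is an error.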
If path isn’t an hdfs:// URL, turn it into one.
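One plausible way to perform this conversion is sketched below. The function name qualify_hdfs_path_sketch and the user parameter are hypothetical, and resolving relative paths under /user/<user> is an assumption based on the conventional HDFS home directory, not something stated in this documentation.

```python
def qualify_hdfs_path_sketch(path, user='hadoop'):
    """Illustrative only; the name and the user parameter are hypothetical."""
    # Already an hdfs:// URL: leave it untouched.
    if path.startswith('hdfs://'):
        return path

    # Absolute path: prefix the scheme and leave the host empty, so the
    # cluster's default namenode is implied.
    if path.startswith('/'):
        return 'hdfs://' + path

    # Assumption: relative paths resolve under the user's HDFS home directory.
    return 'hdfs:///user/%s/%s' % (user, path)
```

For example, under these assumptions qualify_hdfs_path_sketch('/tmp/data') gives 'hdfs:///tmp/data', while an existing hdfs:// URL is returned unchanged.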