p2pMapReduce.mapreduceModule.input
Class TextFileInputFormat

java.lang.Object
  extended by p2pMapReduce.mapreduceModule.input.InputFormat<java.lang.Long,java.lang.String>
      extended by p2pMapReduce.mapreduceModule.input.TextFileInputFormat

public class TextFileInputFormat
extends InputFormat<java.lang.Long,java.lang.String>

Reads input text files and generates the InputSplits for a job. Each InputSplit is a TextFileInputSplit and always ends on a whole text line. Also provides a LineRecordReader.


Field Summary
static java.lang.String INPUT_PATHS_ATTR
          Deprecated. 
static java.lang.String INPUT_PATHS_SEPARATOR
           
static java.lang.String INPUTSPLIT_SUBDIR_ATTR
           
static java.lang.String LOCAL_INPUT_PATHS_ATTR
           
 
Constructor Summary
TextFileInputFormat()
           
 
Method Summary
static void addInputPath_old(Job job, java.lang.String path)
          Deprecated.  
static void addInputPath(Job job, java.lang.String path)
           
static void addLocalInputPath(Job job, java.lang.String path)
          Add a local file system path to the job's input.
 RecordReader<java.lang.Long,java.lang.String> createRecordReader(InputSplit split, TaskAttemptContext context)
          Create a record reader for a given split.
static java.lang.String getLocalInputPath(Job job)
           
 java.util.List<InputSplit> getSplits(JobContext jobContext)
          Logically split the set of input files for the job.
static void main(java.lang.String[] args)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOCAL_INPUT_PATHS_ATTR

public static final java.lang.String LOCAL_INPUT_PATHS_ATTR
See Also:
Constant Field Values

INPUT_PATHS_ATTR

@Deprecated
public static final java.lang.String INPUT_PATHS_ATTR
Deprecated. 
See Also:
Constant Field Values

INPUTSPLIT_SUBDIR_ATTR

public static final java.lang.String INPUTSPLIT_SUBDIR_ATTR
See Also:
Constant Field Values

INPUT_PATHS_SEPARATOR

public static final java.lang.String INPUT_PATHS_SEPARATOR
See Also:
Constant Field Values
Constructor Detail

TextFileInputFormat

public TextFileInputFormat()
Method Detail

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Throws:
java.io.IOException

addLocalInputPath

public static void addLocalInputPath(Job job,
                                     java.lang.String path)
Add a path to the job's input. The path refers to the local file system.

Parameters:
job - the Job to modify
path - the local file system path to add

getLocalInputPath

public static java.lang.String getLocalInputPath(Job job)

addInputPath_old

public static void addInputPath_old(Job job,
                                    java.lang.String path)
Deprecated. 

Add a Path to the list of inputs for the map-reduce job.

Parameters:
job - The Job to modify
path - Path to be added to the list of inputs for the map-reduce job.

addInputPath

public static void addInputPath(Job job,
                                java.lang.String path)

createRecordReader

public RecordReader<java.lang.Long,java.lang.String> createRecordReader(InputSplit split,
                                                                        TaskAttemptContext context)
                                                                 throws java.io.IOException,
                                                                        java.lang.InterruptedException
Description copied from class: InputFormat
Create a record reader for a given split. The framework will call RecordReader.initialize(InputSplit, TaskAttemptContext) before the split is used.

Specified by:
createRecordReader in class InputFormat<java.lang.Long,java.lang.String>
Parameters:
split - the split to be read
context - the information about the task
Returns:
a new record reader
Throws:
java.io.IOException
java.lang.InterruptedException
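
The lifecycle described above (create the reader, let the framework call initialize, then iterate over records) can be illustrated with a minimal sketch. Everything here is hypothetical and stands apart from p2pMapReduce: LineReaderSketch is not the LineRecordReader of this package, the split is passed as a plain (path, start, length) triple rather than an InputSplit, the method names other than initialize are borrowed from the Hadoop-style RecordReader contract, and using the line's starting byte offset as the Long key is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical line reader over a byte range of a file; not part of p2pMapReduce. */
class LineReaderSketch implements Closeable {
    private RandomAccessFile file;
    private long end;      // first byte offset past the split
    private Long key;      // assumed key: byte offset where the current line starts
    private String value;  // current line without its trailing newline

    /** Must be called before the first nextKeyValue(), mirroring the framework contract. */
    public void initialize(String path, long start, long length) throws IOException {
        file = new RandomAccessFile(path, "r");
        file.seek(start);
        end = start + length;
    }

    /** Advances to the next line; returns false once the split is exhausted. */
    public boolean nextKeyValue() throws IOException {
        long pos = file.getFilePointer();
        if (pos >= end || pos >= file.length()) {
            return false;
        }
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        // Byte-at-a-time reads keep the sketch short; real code would buffer.
        while ((b = file.read()) != -1 && b != '\n') {
            line.write(b);
        }
        key = pos;
        value = new String(line.toByteArray(), StandardCharsets.UTF_8);
        return true;
    }

    public Long getCurrentKey() { return key; }

    public String getCurrentValue() { return value; }

    @Override
    public void close() throws IOException {
        if (file != null) {
            file.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("reader-sketch", ".txt");
        Files.write(tmp, "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8));
        try (LineReaderSketch reader = new LineReaderSketch()) {
            // Read the whole file as a single split.
            reader.initialize(tmp.toString(), 0, Files.size(tmp));
            while (reader.nextKeyValue()) {
                System.out.println(reader.getCurrentKey() + "\t" + reader.getCurrentValue());
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Because the sketch assumes the split already starts and ends on line boundaries, it never has to skip a partial first line.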

getSplits

public java.util.List<InputSplit> getSplits(JobContext jobContext)
                                     throws java.io.IOException,
                                            java.lang.InterruptedException
Description copied from class: InputFormat
Logically split the set of input files for the job.

Each InputSplit is then assigned to an individual Mapper for processing.

Note: The split is a logical split of the inputs; the input files are not physically split into chunks. For example, a split could be an <input-file-path, start, offset> tuple. The InputFormat also creates the RecordReader to read the InputSplit.

Specified by:
getSplits in class InputFormat<java.lang.Long,java.lang.String>
Parameters:
jobContext - job configuration.
Returns:
a list of InputSplits for the job.
Throws:
java.io.IOException
java.lang.InterruptedException
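
To make the idea of a logical, line-aligned split concrete, here is a minimal sketch of one way to compute such splits. It is not the implementation used by TextFileInputFormat: the class LineAlignedSplitSketch, the nested Split type, and the targetSize parameter are all hypothetical, and a split is modelled as a plain (path, start, length) triple. Each split is extended forward until it ends just after a newline or at the end of the file, so no line straddles two splits.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of line-aligned logical splits; not part of p2pMapReduce. */
class LineAlignedSplitSketch {

    /** A logical split: file path, starting byte offset, and length in bytes. */
    static final class Split {
        public final String path;
        public final long start;
        public final long length;

        Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return "<" + path + ", " + start + ", " + length + ">";
        }
    }

    /** Cuts the file into splits of roughly targetSize bytes, each ending on a line boundary. */
    public static List<Split> computeSplits(Path file, long targetSize) throws IOException {
        if (targetSize <= 0) {
            throw new IllegalArgumentException("targetSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long fileLength = raf.length();
            long start = 0;
            while (start < fileLength) {
                long end = Math.min(start + targetSize, fileLength);
                if (end < fileLength) {
                    // Look at the last byte of the tentative split and scan forward
                    // until a newline (or end of file) has been consumed.
                    raf.seek(end - 1);
                    int b = raf.read();
                    while (b != '\n' && b != -1) {
                        b = raf.read();
                    }
                    end = raf.getFilePointer();
                }
                splits.add(new Split(file.toString(), start, end - start));
                start = end;
            }
        }
        return splits;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("split-sketch", ".txt");
        Files.write(tmp, "alpha\nbeta\ngamma\ndelta\n".getBytes(StandardCharsets.UTF_8));
        try {
            for (Split split : computeSplits(tmp, 8)) {
                System.out.println(split);
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The file itself is never rewritten or copied; only offsets are recorded, which is what makes the split logical rather than physical.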