Explain key-value pairing in MapReduce

8.B) Explain key-value pairing in MapReduce. – 10 Marks

Answer:-

Each phase of MapReduce (the Map phase and the Reduce phase) takes key-value pairs as input and produces key-value pairs as output. Data must first be converted into key-value pairs before it is passed to the Mapper, because the Mapper only understands data in key-value form.

Key-value pairs in Hadoop MapReduce are generated as follows:

  • InputSplit – The logical representation of the data. Each split is handed to an individual map task for processing.
  • RecordReader – Communicates with the InputSplit and converts the split into records, in the form of key-value pairs suitable for reading by the Mapper.
  • By default, the RecordReader is supplied by TextInputFormat, which treats each line as one record: the key is the byte offset of the line within the file and the value is the contents of the line.
  • The RecordReader keeps communicating with the InputSplit until the entire split has been read.
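The record-reading step above can be illustrated with a small sketch. This is not Hadoop code; it is a plain Python imitation, assuming a line-oriented reader that emits the byte offset of each line as the key and the line text as the value. The function name `read_records` and the sample data are hypothetical.

```python
from typing import Iterator, Tuple


def read_records(split: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (byte offset, line text) pairs from one split of raw bytes."""
    offset = 0
    for raw_line in split.splitlines(keepends=True):
        # Key: byte offset where the line starts.
        # Value: the line itself, without its line terminator.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)


# Hypothetical split containing two lines of text.
split = b"hello world\nhello hadoop\n"
records = list(read_records(split))
for key, value in records:
    print(key, value)
```

Each pair produced this way corresponds to one call to map(), with the offset as the input key and the line as the input value.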

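What is chosen as the key and what as the value depends on the dataset and the required output. In a word count job, for example, the Mapper typically emits each word as the key and the number 1 as the value.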

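The map() and reduce() functions use key-value pairs at four points: map() input, map() output, reduce() input and reduce() output. The reduce() input is a key together with the list of all values that the Mappers emitted for that key.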
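The four points can be traced in a small word count sketch. This is a single-process Python imitation rather than the Hadoop API, and the names `map_fn`, `shuffle` and `reduce_fn` are hypothetical. It assumes the map() input is (byte offset, line) and that the framework groups map() output by key before reduce() runs.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(key: int, value: str) -> Iterator[Tuple[str, int]]:
    """map(): input is (offset, line); output is one (word, 1) pair per word."""
    for word in value.split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group map() output by key, imitating the step between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for k, v in pairs:
        grouped[k].append(v)
    return dict(grouped)


def reduce_fn(key: str, values: List[int]) -> Tuple[str, int]:
    """reduce(): input is (word, [counts]); output is (word, total)."""
    return key, sum(values)


# 1. map() input: (byte offset, line)
map_input = [(0, "hello world"), (12, "hello hadoop")]
# 2. map() output: (word, 1)
map_output = [pair for k, v in map_input for pair in map_fn(k, v)]
# 3. reduce() input: (word, list of counts)
reduce_input = shuffle(map_output)
# 4. reduce() output: (word, total count)
reduce_output = [reduce_fn(k, vs) for k, vs in sorted(reduce_input.items())]
print(reduce_output)
```

The key and value types change between the stages: the map() input key is a number, while every later stage is keyed by the word.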
