8.B) Explain key-value pairing in MapReduce. – 10 Marks
Answer:-
Each phase of MapReduce (the Map phase and the Reduce phase) takes key-value pairs as input and produces key-value pairs as output. Data must first be converted into key-value pairs before it is passed to the Mapper, because the Mapper only understands data in key-value form.
Key-value pairs in Hadoop MapReduce are generated as follows:
- InputSplit – The logical representation of a portion of the input data; each split is handed to an individual map() task for processing.
- RecordReader – Communicates with the InputSplit and converts the split into records, in the form of key-value pairs that the Mapper can read.
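- By default, the RecordReader is the one supplied by TextInputFormat, which converts the data into key-value pairs where the key is the byte offset of a line in the file and the value is the content of that line.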
- The RecordReader keeps communicating with the InputSplit until the entire split has been read.
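The record-reading step above can be illustrated with a small sketch. This is plain Python, not Hadoop code: the function name read_records and the sample data are assumptions made for illustration only. It imitates the default text behaviour, where each line becomes one record with the byte offset as key and the line content as value.

```python
def read_records(split_bytes, start_offset=0):
    """Yield (byte_offset, line_text) pairs from a block of text bytes.

    Hypothetical helper that imitates how a line-oriented record reader
    turns a split into key-value pairs; it is not part of Hadoop.
    """
    offset = start_offset
    for raw_line in split_bytes.splitlines(keepends=True):
        # Key: offset of the line's first byte. Value: the line without its newline.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)


# Sample split contents (made up for this example).
split = b"deer bear river\ncar car river\ndeer car bear\n"
records = list(read_records(split))

for key, value in records:
    print(key, value)
```

Each printed line shows one record as it would be presented to map(): an offset followed by the text of the line.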
Generation of a key-value pair in MapReduce depends on the dataset and the required output.
The map() and reduce() functions use key-value pairs at four points: map() input, map() output, reduce() input and reduce() output.
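The four points can be seen in a word-count example. The sketch below is a single-process Python simulation, assuming hypothetical names map_fn, reduce_fn and run_job and made-up input records; a real Hadoop job would implement Mapper and Reducer classes in Java and the framework would do the grouping.

```python
from collections import defaultdict


def map_fn(key, value):
    # map() input: (byte offset, line of text)
    # map() output: (word, 1) for every word in the line
    for word in value.split():
        yield word, 1


def reduce_fn(key, values):
    # reduce() input: (word, list of counts gathered for that word)
    # reduce() output: (word, total count)
    yield key, sum(values)


def run_job(records):
    """Simulate map, group-by-key (shuffle) and reduce in one process."""
    grouped = defaultdict(list)
    for in_key, in_value in records:
        for out_key, out_value in map_fn(in_key, in_value):
            grouped[out_key].append(out_value)

    results = {}
    for key in sorted(grouped):  # keys reach reduce() in sorted order
        for out_key, out_value in reduce_fn(key, grouped[key]):
            results[out_key] = out_value
    return results


# Input records as (offset, line) pairs; the data is made up for this example.
records = [(0, "deer bear river"), (16, "car car river"), (30, "deer car bear")]
counts = run_job(records)
print(counts)
```

Here the dataset (lines of text) and the required output (a count per word) decide the pairs: the input key is the offset, the intermediate key is the word, and the final value is the total.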