Avro Secondary Sorting
In this post, we are going to look at secondary sorting with the Avro file format, using the AvroJob API. This will be a two-part series: MapReduce V1 API (org.apache.hadoop.mapred) and MapReduce V2 API (org.apache.hadoop.mapreduce). This post covers the example in the MapReduce V1 API.

So let's start. Here's the Avro input schema that we are going to use in this example.

Here is the sample data that we are going to read as the input file. For viewing purposes the data is shown in JSON format, but the MapReduce program reads it in Avro format.

Secondary sorting means that the values sent to the Reducer are sorted based on some criteria. You can read more about secondary sorting here.

In this example, the values in the Reducer iterator should be sorted by the TimeStamp of the sample data above, while the Reducer input key should be grouped by the Id of that sample data.

We need to use a MapOutput key schema for secondary sorting (Natural Key + Composite Key). Here Natural Key is Id and Composite Ke...
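
To make the difference between sorting and grouping concrete, here is a minimal sketch in plain Java. It does not use the AvroJob API or any Hadoop classes. It only imitates what the framework does during the shuffle: order records by Id and then TimeStamp, and hand each Id its values in that order. The class name SecondarySortSketch, the Event type, and the String/long field types are assumptions made for this illustration; the post only names the Id and TimeStamp fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; no Hadoop or Avro classes are used here.
class SecondarySortSketch {

    // Hypothetical record shape: field types are assumed for the example.
    static final class Event {
        final String id;
        final long timestamp;

        Event(String id, long timestamp) {
            this.id = id;
            this.timestamp = timestamp;
        }
    }

    // Sort order over the full map output key: Id first, then TimeStamp.
    static final Comparator<Event> SORT_ORDER =
            Comparator.comparing((Event e) -> e.id)
                      .thenComparingLong(e -> e.timestamp);

    // Groups by Id only, so each group sees its timestamps in sorted order,
    // which is what a reducer's value iterator would deliver.
    static Map<String, List<Long>> groupSorted(List<Event> events) {
        List<Event> sorted = new ArrayList<>(events);
        sorted.sort(SORT_ORDER);

        Map<String, List<Long>> groups = new LinkedHashMap<>();
        for (Event e : sorted) {
            groups.computeIfAbsent(e.id, k -> new ArrayList<>()).add(e.timestamp);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Event> input = Arrays.asList(
                new Event("b", 30L), new Event("a", 20L),
                new Event("a", 10L), new Event("b", 5L));
        System.out.println(groupSorted(input));
    }
}
```

With the four made-up records in main, the program prints {a=[10, 20], b=[5, 30]}: one group per Id, each with its timestamps in ascending order. In a real job the same two roles are played by the sort order of the map output key and by the grouping comparator, rather than by an in-memory sort.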