Tuesday, 17 May 2016

SORT BY vs. ORDER BY


  • ORDER BY performs a total ordering of the query result set
    • All the data is passed through a single reducer
    • Takes a long time to execute on large data sets

  • SORT BY sorts the data within each reducer (local ordering)
    • Each reducer’s output is sorted
    • Doesn't achieve a total ordering of the data set when more than one reducer is used
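
The difference can be illustrated with a small simulation. This is only a sketch in plain Python, not Hive code: the function names `order_by` and `sort_by` are made up for illustration, and the round-robin split of rows across reducers is an assumption standing in for however the engine actually distributes rows.

```python
from typing import List


def order_by(rows: List[int]) -> List[int]:
    # ORDER BY: every row goes through one reducer, which sorts everything.
    return sorted(rows)


def sort_by(rows: List[int], num_reducers: int) -> List[List[int]]:
    # SORT BY: rows are spread across several reducers and each reducer
    # sorts only its own share. Round-robin assignment is an assumption
    # made for this sketch.
    buckets: List[List[int]] = [[] for _ in range(num_reducers)]
    for i, row in enumerate(rows):
        buckets[i % num_reducers].append(row)
    return [sorted(bucket) for bucket in buckets]


rows = [7, 2, 9, 4, 1, 8]

total = order_by(rows)          # one globally sorted list
per_reducer = sort_by(rows, 2)  # one sorted list per reducer

# Reading the reducer outputs one after another gives the combined result.
concatenated = [row for bucket in per_reducer for row in bucket]

print("ORDER BY:", total)
print("SORT BY, per reducer:", per_reducer)
print("SORT BY, combined:", concatenated)
```

Each reducer's list is sorted on its own, but the combined SORT BY output is not in global order, while the single-reducer ORDER BY output is.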

