In this lesson we learn that uniq only compares adjacent lines when filtering duplicate entries out of a file, so it is best used together with sort. Why doesn’t uniq actually return unique entries, or simply call sort on our behalf?
One possible answer has to do with the so-called Unix philosophy, which, in brief, focuses on (1) designing small tools with one well-defined purpose and (2) chaining these tools together to accomplish more complex tasks. With this in mind, let’s think about uniq. An entry in a file is unique if it is not equal to any other entry. This statement is easy to make, but it invites many different potential solutions, one of which is to first sort the entries and then compare adjacent lines. Remembering that each tool should do one thing, uniq shouldn’t sort your file, since sort can already do this. It should only do the check for adjacent pairs.
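To make the division of labor concrete, here is a minimal sketch comparing uniq on its own with sort piped into uniq. The sample entries and the variable names (input, uniq_only, sorted_uniq) are made up for illustration.

```shell
# Hypothetical sample data: "apple" appears twice, but not on adjacent lines.
input=$(printf 'apple\nbanana\napple\ncherry\n')

# uniq alone only collapses adjacent duplicates, so both "apple" lines survive.
uniq_only=$(printf '%s\n' "$input" | uniq)

# Sorting first makes equal lines adjacent, so uniq can drop the repeat.
sorted_uniq=$(printf '%s\n' "$input" | sort | uniq)

printf 'uniq alone:\n%s\n\n' "$uniq_only"
printf 'sort | uniq:\n%s\n' "$sorted_uniq"
```

With this input, uniq alone prints all four lines unchanged, while sort | uniq prints apple, banana, and cherry once each.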
This decision is subjective, of course, but similar decisions are made in many command line utilities, so it is helpful to keep this perspective in mind.
Furthermore, note that this decision is not necessarily bad. Maybe you know that all duplicates are already adjacent and you don’t want the overhead of sorting your file. In that case, uniq is still appropriate.
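As a rough illustration of that case, the sketch below runs uniq without sort on input whose duplicates are already grouped together. The data and the variable names (grouped, deduped) are hypothetical.

```shell
# Hypothetical sample data: duplicates are already adjacent,
# although the groups themselves are not in sorted order.
grouped=$(printf 'pear\npear\napple\napple\napple\nfig\n')

# No sort needed: uniq collapses each run of identical lines to one line.
deduped=$(printf '%s\n' "$grouped" | uniq)

printf '%s\n' "$deduped"
```

Here uniq prints pear, apple, and fig, in that order. A side effect of skipping sort is that the original order of the groups is preserved, which sorting would not guarantee.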