Question
We learn in this lesson that, in producing the unique entries in a file, uniq will only compare lines which are adjacent, and so it is best used along with sort. Why doesn’t uniq actually return unique entries or simply call sort on our behalf?
Answer
One possible response to this question has to do with the so-called Unix Philosophy which, in brief, focuses on (1) designing small tools with one well-defined purpose and (2) chaining these tools together to accomplish more complex tasks. With this in mind, let’s think about uniq. An entry in a file is unique if it is not equal to any other entry. This statement is easy to make but invites many different potential solutions, one of which is to first sort the entries and then compare adjacent lines. Remembering that each tool should do one thing, uniq shouldn’t sort your file, since sort can already do this. It should only do the check for adjacent pairs.
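As a small illustration of the point above, here is a minimal sketch that runs uniq with and without sort on the same input. The sample data and the variable names (data, uniq_only, sorted_uniq) are made up for this example.

```shell
# Hypothetical sample data: "apple" and "banana" both repeat,
# but only the last two "banana" lines are adjacent.
data=$(printf 'apple\nbanana\napple\nbanana\nbanana\n')

# uniq on its own only collapses runs of adjacent identical lines,
# so the non-adjacent repeats survive.
uniq_only=$(printf '%s\n' "$data" | uniq)
echo "uniq alone:"
echo "$uniq_only"

# Sorting first places every duplicate next to its twin,
# so uniq can then reduce the input to truly unique entries.
sorted_uniq=$(printf '%s\n' "$data" | sort | uniq)
echo "sort | uniq:"
echo "$sorted_uniq"
```

Here uniq alone leaves four lines (apple, banana, apple, banana), while sort piped into uniq leaves two (apple, banana).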
This decision is subjective, of course, but similar decisions are made with many command line utilities, so it is helpful to keep this perspective in mind.
Furthermore, note that this decision is not necessarily bad. Maybe you know that all duplicates are already adjacent and you don’t want the overhead of sorting your file. In that case, uniq on its own is still appropriate.
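To make this case concrete, the sketch below uses made-up input in which duplicates are already grouped together. The data and the variable names (events, grouped, resorted) are assumptions for illustration only. It also shows a side effect worth knowing: skipping sort keeps the original order of the entries.

```shell
# Hypothetical input where every duplicate is already adjacent.
events=$(printf 'start\nstart\nrun\nrun\nrun\nstop\n')

# uniq alone is enough here, and the original order is preserved.
grouped=$(printf '%s\n' "$events" | uniq)
echo "$grouped"

# Adding sort yields the same set of entries, but does extra work
# and reorders them alphabetically.
resorted=$(printf '%s\n' "$events" | sort | uniq)
echo "$resorted"
```

With this input, uniq alone gives start, run, stop, whereas sort piped into uniq gives run, start, stop.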