What are some important things to consider to minimize the number of collisions?


#1

Question

In the context of this exercise, what are some important things to consider to minimize the number of collisions in a hash map?

Answer

Having hash collisions is almost inevitable because of probability, so instead of avoiding them entirely, the best we can do is minimize them.

One important thing to consider is the hash function. Ideally, the hash function should distribute your data evenly across all the indexes. In addition, a good compression function can be very important, as it applies with the hash function to determine the indices.

Another important thing to consider is the size of your hash map. Usually, we use the concept of a load factor to determine the ratio of a hash map’s size with the number of entries. The ratio is:
n / k
where n = number of entries and k = number of buckets.
A good load factor is around 0.70 to 0.75.

Having more buckets will mean less collisions, because there are more indexes for each entry to be hashed to, but there is always a tradeoff of space required for additional buckets.