⟩ What is hashing in C?
Hashing is the process of mapping strings to integers, usually in a relatively small range. A ``hash function'' maps a string (or some other data structure) to a bounded number (the ``hash bucket'') which can more easily be used as an index in an array, or for performing repeated comparisons. (Obviously, a mapping from a potentially huge set of strings to a small set of integers will not be unique. Any algorithm using hashing therefore has to deal with the possibility of ``collisions.'')
Many hashing functions and related algorithms have been developed; a full treatment is beyond the scope of this list. An extremely simple hash function for strings is simply to add up the values of all the characters:
unsigned hash(char *str)
{
unsigned int h = 0;
while(*str != '')
h += *str++;
return h % NBUCKETS;
}
A somewhat better hash function is
unsigned hash(char *str)
{
unsigned int h = 0;
while(*str != '')
h = (256 * h + *str++) % NBUCKETS;
return h;
}