c语言 - 我不明白这个哈希函数代码 - c - I don't understand this hash-function code 小贝子编程网

有人能解释一下这个散列函数是如何工作的吗？我花了很多时间试图弄清楚它，但仍然不知道它是如何工作的。

完整代码来自https://gist.github.com/choaimeloo/ffb96f7e43d67e81f0d44c08837f5944#file-dictionary-c-L30

// Hashes the word (hash function posted on reddit by delipity)
// The word you want to hash is contained within new node, arrow, word.
// Hashing that will give you the index. Then you insert word into linked list.
int hash_index(char *hash_this)
{
unsigned int hash = 0;
for (int i = 0, n = strlen(hash_this); i < n; i++)
{
hash = (hash << 2) ^ hash_this[i];
}
return hash % HASHTABLE_SIZE;
}

我不明白他为什么用(<<和^(？

此外，他为什么使用strlen(hash_this(？

哈希函数的目的是为给定序列(字节、字符…(检索唯一值。

因此，您需要序列的长度，这里使用"strlen"。

如果没有位移位运算符(<<(，序列"abc"one_answers"cba"将得到相同的结果。

xor运算符(^(进一步"加扰"/"散列"当前值，因此类似的序列会产生等效值(想象具有特定模式的序列，如"abcabc…"(，这变得更不可取

他使用strlen是因为他正在遍历字符串并处理每个字符。他还可以测试hash_this[i]不是零：

for ( int i = 0; hash_this[i] != 0; i++ )
...

这会做同样的事情。

按位运算符防止散列函数为相同字母的不同组合计算相同索引。您希望hash_index( "bat" )返回与hash_index( "tab" )不同的值。

为不同的字符串返回相同的索引被称为冲突，这是你想要避免的，所以大多数好的哈希函数都会对每个字符进行某种算术或逐位运算，以最大限度地减少这种可能性。

c语言 - 我不明白这个哈希函数代码

相关内容

最新更新

热门标签：