C#中Dictionary的内部实现剖析(C# Dictionary内部机制深度解析)

原创

ithorizon 6个月前 (10-21) 阅读数 29 #后端开发

C#中Dictionary的内部实现剖析(C# Dictionary内部机制深度解析)

在C#中，Dictionary是一个非常常用且高效的数据结构，它提供了键值对的存储做法，允许我们通过键迅速检索到对应的值。本文将深入探讨C#中Dictionary的内部实现机制，了解其工作原理和性能特性。

1. Dictionary的基本结构

Dictionary内部使用散列表（Hash Table）来实现。散列表是一种基于键的哈希码来存储和检索数据的数据结构。在C#中，Dictionary使用两个核心数组：一个是存储键的数组，另一个是存储值的数组。


public class Dictionary
{
    private readonly IEqualityComparer _comparer; // 键的比较器
    private readonly int[] _buckets; // 存储键的哈希值的数组
    private readonly Slot[] _slots; // 存储键和值的数组
    private readonly TValue[] _values; // 存储值的数组
    // ... 其他成员
}

2. 哈希函数与冲突解决

Dictionary使用哈希函数来计算每个键的哈希码，并将键存储在相应的位置。理想情况下，哈希函数应该能够将键均匀分布到散列表中，减少冲突。然而，由于哈希码大概产生冲突，Dictionary需要一种机制来解决这些冲突。

Dictionary使用链表来解决哈希冲突。如果一个键的哈希码对应的数组位置已经被占用，它会在该位置形成一个链表，将新的键值对添加到链表的末尾。

3. 添加元素

当向Dictionary添加一个元素时，首先使用哈希函数计算键的哈希码，然后检查哈希码对应的数组位置是否为空。如果为空，直接存储键和值；如果不为空，则检查链表中的元素是否已经存在。如果存在，则抛出异常；如果不存在，则将新元素添加到链表的末尾。


public void Add(TKey key, TValue value)
{
    int hash = _comparer.GetHashCode(key) & (_bucketsSize - 1);
    Slot[] slots = _slots;
    int index = hash % _buckets.Length;
    if (slots[index].key == key)
    {
        throw new ArgumentException("An element with the same key already exists.");
    }
    slots[index].key = key;
    slots[index].value = value;
    _size++;
    // ... 其他操作
}

4. 查找元素

查找元素时，同样使用哈希函数计算键的哈希码，然后查找哈希码对应的数组位置。如果数组位置为空，则描述字典中不存在该键；如果数组位置不为空，则在链表中查找具有相同键的元素。


public TValue this[TKey key]
{
    get
    {
        int hash = _comparer.GetHashCode(key) & (_bucketsSize - 1);
        Slot[] slots = _slots;
        int index = hash % _buckets.Length;
        while (slots[index].key != null)
        {
            if (slots[index].key.Equals(key))
            {
                return slots[index].value;
            }
            index = (_buckets[index] + 1) % _buckets.Length;
        }
        throw new KeyNotFoundException("The given key was not present in the dictionary.");
    }
}

5. 扩容机制

随着元素数量的提高，散列表大概会变得明显拥挤，这会影响其性能。为了解决这个问题，Dictionary会在添加元素时检查当前大小是否已经约为容量束缚。如果约为，它会自动扩容，创建一个新的更大的数组，并将现有元素重新散列到新数组中。


private void Resize()
{
    int newSize = _size * 2;
    int newBucketsSize = GetPrime(newSize);
    Slot[] newSlots = new Slot[newSize];
    TValue[] newValues = new TValue[newSize];
    int[] newBuckets = new int[newBucketsSize];
    for (int i = 0; i < _size; i++)
    {
        int hash = _comparer.GetHashCode(_slots[i].key) & (newBucketsSize - 1);
        newSlots[hash] = _slots[i];
        newValues[hash] = _values[i];
    }
    _slots = newSlots;
    _values = newValues;
    _buckets = newBuckets;
    // ... 更新其他成员
}