A Dictionary Class Which Permits Duplicate Keys

Introduction

The .NET framework contains a number of 'Dictionary' classes including.

  • Dictionary<K, V>
  • SortedDictionary<K, V>
  • Hashtable
  • ListDictionary
  • OrderedDictionary

All of these associate a key with a value and are represented internally by a collection of key/value pairs. However, none of them permit duplicate keys so, if you try to add an item whose key is already present, an exception is thrown.

There is a Lookup<K, V> class which is a collection of keys mapped to one or more values. However, this is intended for use in LINQ queries and is not really suitable for general-purpose use.

A strategy for a workaround

The absence of a general-purpose dictionary class that permits duplicate keys is a serious shortcoming of the framework, in my opinion.

The most usual way to work around this shortcoming is to associate a list of values, rather than a single value, with each key in the dictionary.

In fact, the GroupBy extension method which is used in LINQ works by reading the input elements into a temporary dictionary of lists so that all elements with the same key end up in the list associated with that key. The lists are then emitted as a sequence of objects that implement the IGrouping interface.

Although this strategy works, it is tedious to employ in practice because you need to write code to manipulate the list associated with each key every time an item is added, removed, or checked for containership. Also, when enumerating the dictionary, you have to enumerate first the keys and then their associated lists.

The Lexicon class

It would be nice if we had a class that did all the 'housekeeping' of maintaining the lists for us and which could be used in a natural way. I have therefore written such a class.

As Dictionary is already a long name, I decided to simply use the synonym 'Lexicon' for the new class rather than append yet another prefix to 'Dictionary'.

The Lexicon<K, V> class contains all the same members as the Dictionary<K, V> class but has some new ones to deal with the possibility that a key could have multiple values. The usage of some existing members has had to be modified for the same reason and some additional overloads have been added.

Although the Lexicon<K, V> class is a wrapper for a Dictionary<K, List<V>>, I decided that it would not be appropriate to formally implement the generic and non-generic IDictionary interfaces because these interfaces implicitly assume that classes which implement them will not support duplicate keys. However, it does implement all the other interfaces that Dictionary implements.

It also implements a new interface I've created called ILexicon<K, V> which contains signatures for its essential members.

This table lists its main constructors with a brief description of what they do.

Constructor Signature Description
Lexicon() creates a new empty Lexicon
Lexicon(dictionary) creates a new Lexicon from a generic IDictionary
Lexicon(lexicon) creates a new Lexicon from an ILexicon
Lexicon(capacity) creates a new Lexicon with the specified initial capacity
Lexicon(comparer) creates a new Lexicon that uses a specific IEqualityComparer to test for equality of keys

This table lists its main properties.

Property Name Description
Count gets the total number of items in the Lexicon
KeyCount gets the total number of unique keys
this[key] gets or sets the list of values with this key
this[key, index] gets or sets the value of the item at this index within this key's list of values
Keys gets a collection of all unique keys in the Lexicon
ValueLists gets the lists of values of all items in the Lexicon in the same order as the key property
Values gets an enumeration of all values in the Lexicon in the same order as the Keys and Values properties

And, finally, this table lists its main methods:

Method Name and Signature Description
Add(key, value) adds an item to the Lexicon with this key and value
AddList(key, list) adds items to the Lexicon with this key from this list of values
AddRange(key-value pairs) adds an enumerable collection of key-value pairs to the Lexicon
ChangeValue(key, oldValue, newValue) changes the value of the first item with this key and this old value to this new value
ChangeValueAt(key, index, newValue) changes the value of the item with this index in this key's list of values to newValue
Clear() clears the Lexicon of all items
Contains(key, value) indicates if an item with this key/value pair is within the Lexicon
ContainsKey(key) indicates if an item with this key is within the Lexicon
ContainsValue(value) indicates if an item with this value is within the Lexicon
ContainsValue(value, out key) indicates if an item with this value is within the Lexicon and, if so, returns the first key found with that value as an output parameter
CopyTo(array, index) copies the key /value pairs of the Lexicon to an array starting at the specified index
FindKeyIndexPairs(value) finds all key/index pairs in the Lexicon which have this value
GetValueCount(key) gets the number of all values in this key's list
IndexOfValue(key, value) gets the first index of this value within this key's list
Remove(key, value) removes the first item in the Lexicon with this key and value
RemoveAt(key, index) removes the item at this index in this key's list of values
RemoveKey(key) removes all items in the Lexicon with this key
TryGetValueList(key, out value) tries to get the list of values with this key
TryGetValueAt(key, index, out value) tries to get the value of the item at this index within this key's list of values


Notes on members

Lexicon's constructors mirror those of the generic Dictionary class except that there is an additional constructor to create a new Lexicon from an existing ILexicon.

The Count property returns the total number of key/value pairs in the Lexicon.

The indexer which takes a sole key argument gets or sets the whole list of values for that key. If there's already a list for that key, it is replaced (not merged) by the 'setter'. Otherwise, a new entry for that key is created.

The indexer which takes an additional index parameter gets or sets the individual value at that index in the key's value list. If the key doesn't exist or there's no value at that index (and it's the next index to be assigned) then a new entry is created.

The Keys property returns just the unique keys (however many times they're duplicated) and the KeyCount property returns how many such keys there are.

The ValueLists property returns the lists of values corresponding to the unique keys and the Values property returns an enumeration of all values in all key/value pairs in the Lexicon.

The Add, Contains, and Remove methods also have overloads (not shown) that take a KeyValuePair structure as a parameter.

The AddList method adds a key with a list of values to the Lexicon. If the key already exists, its existing list of values is merged with the new one, not replaced by it. This differs from the indexer's behavior.

The FindAllKeyIndexPairs method does what it says on the tin for a given value. Notice, though, that this method will be slow for a large Lexicon as it needs to iterate through every key/value pair within it. It has therefore been implemented using deferred execution (i.e. an iterator).

The 'ChangeValue' and 'Remove' family of methods all return a bool value to indicate whether the operation was a success or not.

The other members are largely self-explanatory.

When Lexicons are created (or added to) from other objects, they are copied first so that subsequent changes don't affect the original objects. However, when lists of values are retrieved by external code, they are not copied and so may be manipulated directly by that code.

Enumerating the Lexicon

The best way to enumerate a Lexicon object is to use the enumerator returned by the GetEnumerator method which returns all key/value pairs in the same order as the keys are enumerated in the underlying Dictionary object

However, it's also possible to enumerate using the Keys property and then use the indexer to get the list of values for that key and iterate through those. The ValueLists property enables the lists of values for each key to be enumerated separately.

As mentioned in the previous section, the Values property and FindKeyIndexPairs method are implemented as generic IEnumerables and so can also be enumerated using the foreach statement.

Example of usage

The code for the Lexicon class and ILexicon interface can be downloaded from the link accompanying this article as it is too long to include in the body of the article itself. Both these types are included in a namespace called Lexicons and can be used in .NET 2.0 or later.

The following console application shows how to use some of its principal members.

using System;
using System.Collections.Generic;
using Lexicons;

class Program
{
    static void Main(string[] args)
    {
        // create and populate a Lexicon object
        Lexicon<string, int> lex = new Lexicon<string, int>();
        lex.Add("Dave", 1);
        lex.Add("John", 2);
        lex.Add("Dave", 3);
        lex.Add("Stan", 4);
        lex.Add("Dave", 5);
        lex.Add(new KeyValuePair<string, int>("Fred", 6));

        // iterate through key/value pairs
        Console.WriteLine("The lexicon initially contains the following key/value pairs\n");

        foreach (KeyValuePair<string, int> kvp in lex)
        {
            Console.WriteLine("{0} : {1}", kvp.Key, kvp.Value);
        }

        // add a new entry to the Lexicon
        lex["Alan"] = new List<int> { 7 };

        // add some more values for the new key
        lex.AddList("Alan", new List<int> { 8, 9 });

        // add another new entry
        lex.Add("Mary", 10);

        // iterate the Lexicon again, this time using the Keys collection
        Console.WriteLine("\nFollowing the addition of new entries, the lexicon now contains\n");

        foreach (string key in lex.Keys)
        {
            foreach (int value in lex[key])
            {
                Console.WriteLine("{0} : {1}", key, value);
            }
        }

        Console.WriteLine("\nDave has {0} values", lex.GetValueCount("Dave"));

        lex.RemoveKey("Dave"); // remove key and all its values
        lex.Remove("Alan", 8); // remove a single value
        lex.ChangeValue("Fred", 6, 5); // change a value

        // iterate the Lexicon again
        Console.WriteLine("\nFollowing some removals and a change, the lexicon now contains\n");
        foreach (KeyValuePair<string, int> kvp in lex)
        {
            Console.WriteLine("{0} : {1}", kvp.Key, kvp.Value);
        }

        if (lex.Contains("Stan", 4))
        {
            Console.WriteLine("\nStan has a value of 4");
        }

        // create an array of key/value pairs and copy the Lexicon's contents to it
        KeyValuePair<string, int>[] kvpa = new KeyValuePair<string, int>[lex.Count];
        lex.CopyTo(kvpa, 0);

        Console.WriteLine("There are currently {0} key value pairs in the Lexicon", kvpa.Length);

        // try and get the value at index 1 for Alan
        int val;
        bool b = lex.TryGetValueAt("Alan", 1, out val);
        if (b) Console.WriteLine("Alan has a value of {0} at index 1", val);

        // create a new dictionary
        Dictionary<string, int> dict = new Dictionary<string, int>();
        dict["Nora"] = 3;
        dict["John"] = 4; // uses a key already in the Lexicon

        // create a new Lexicon from the Dictionary
        Lexicon<string, int> lex2 = new Lexicon<string, int>(dict);

        // add some more members to it
        lex2["Zena"] = new List<int> { 11 };
        lex2["Myra", 0] = 12;

        // merge with existing Lexicon
        lex.AddRange(lex2);

        lex.Remove(new KeyValuePair<string, int>("Stan", 4)); // effectively remove Stan
        lex.RemoveAt("Mary", 0); // effectively remove Mary

        // iterate the Lexicon again
        Console.WriteLine("\nFollowing a number of changes, the lexicon now contains\n");
        foreach (KeyValuePair<string, int> kvp in lex)
        {
            Console.WriteLine("{0} : {1}", kvp.Key, kvp.Value);
        }

        Console.WriteLine("\nNora has a value of 3 at index {0}", lex.IndexOfValue("Nora", 3));

        lex["Zena", 1] = 1; // add a new value for Zena

        if (lex.ContainsValue(12))
        {
            Console.WriteLine("The lexicon contains a value of 12");
        }

        string k;
        if (lex.ContainsValue(5, out k)) Console.Write("{0} had a value of 5 ", k);
        lex.ChangeValue(k, 5, 2);
        if (lex[k, 0] == 2) Console.WriteLine("but now has a value of 2", k);

        Console.WriteLine("\nThe following key/index pairs have a value of 2\n");
        foreach (KeyValuePair<string, int> kip in lex.FindKeyIndexPairs(2))
        {
            Console.WriteLine("Key : {0}  Value : 2  Index : {1}", kip.Key, kip.Value);
        }
        Console.ReadKey();
    }
}

Example output A screenshot of the output is shown below.

Output

Conclusion

I hope you will find the Lexicon class to be a useful adjunct to the other 'Dictionary' classes in the .NET Framework.

It is not intended as a replacement for the generic Dictionary class and should only be used in situations where keys may be duplicated. As a List object needs to be assigned to each key, it will clearly use more memory and be slightly slower than an 'ordinary' Dictionary. This is not ideal in situations where the duplicated keys are relatively sparse but an inevitable consequence of the strategy used.

I am currently working on sorted versions of the Lexicon and a 'ToLexicon' extension method for use in LINQ queries. I am also considering the use of a different strategy to address the issue of Lexicons with sparse duplicate keys. The results will be the subject of a future article.


Similar Articles