Tuesday, January 6, 2009

Dictionary Replication

It is very common in programming to need to do a replication of data. In SQL Sentry when we pull objects from remote systems we need to figure out if they exist in our database. We index objects by specific keys, and then during the synchronization process use these keys and compare them to the remote objects keys to figure out what objects are new, what objects have been deleted, and what objects have changed. The details of everything we do are beyond the scope of this article, but I find myself doing enough of these dictionary replications that I decided to blog about it.

Take the following scenario:
You have two dictionaries,

Dictionary<K,VTarget> targetDictionary
and
Dictionary<K,VSource> sourceDictionary

You want to in one call update targetDictionary with all the contents of sourceDictionary, and be able to generate callbacks for new entries, deleted entries, and changed entries. The value types are different, as you plan to map VSource to VTarget (more on that later), but they share a common key data type (we could later expand that to be different too if it suited us). It seems a bit overwhelming but it’s a perfect example of where generics can make your life a lot easier. Traditionally I’d find myself writing three loops. In the first loop I update changed items and add new items that are in the source but not in the target. In the second l find the keys that are in the target but not in the source. These are deleted items. I cant remove them yet though because I’m inside an enumerator and that would generate an error, so I use a temp collection then do one more loop at the end to remove them from the target.

   1: // Initialize the collections
   2: Dictionary<int, string> peopleByIDTarget = new Dictionary<int, string>();
   3: Dictionary<int, string> peopleByIDSource = new Dictionary<int, string>();
   4:  
   5: peopleByIDTarget.Add(1, "Brooke");
   6: peopleByIDTarget.Add(2, "Tommy");
   7:  
   8: peopleByIDSource.Add(2, "Rick");
   9: peopleByIDSource.Add(3, "Dom");
  10:  
  11: // Loop through the source
  12: foreach (KeyValuePair<int, string> sourceKeyValuePair in peopleByIDSource)
  13: {
  14:     string existingName;
  15:     
  16:     if (peopleByIDTarget.TryGetValue(sourceKeyValuePair.Key, out existingName))
  17:     {
  18:         // ID Exists. Change Name!
  19:         peopleByIDTarget[sourceKeyValuePair.Key] = sourceKeyValuePair.Value;
  20:     }
  21:     else
  22:     {
  23:         // ID doesn't exist, so add with name.
  24:         peopleByIDTarget.Add(sourceKeyValuePair.Key, sourceKeyValuePair.Value);
  25:     }
  26: }
  27:  
  28: // Create a temp list to hold items we need to remove. You cant remove items
  29: // while enumerating or you get an error.
  30: List<int> keysToRemove = new List<int>();
  31:  
  32: // Loop through the target to see what items dont exist in the source
  33: foreach (KeyValuePair<int, string> targetKeyValuePair in peopleByIDTarget)
  34: {
  35:     // The target item doesnt exist in the source so we must remove it.
  36:     // Add it to the removal list.
  37:     if (!peopleByIDSource.ContainsKey(targetKeyValuePair.Key))
  38:     {
  39:         keysToRemove.Add(targetKeyValuePair.Key);
  40:     }
  41: }
  42:  
  43: // Remove the keys we marked for removal
  44: foreach (int key in keysToRemove)
  45: {
  46:     peopleByIDTarget.Remove(key);
  47: }

This works but it’s quite a bit of code, especially if this is happening with a lot of collections. I like to promote code reuse so the goal was to make this routine generic so that I could simply call:

peopleByIDTarget.Merge(peopleByIDSource);

Another thing the above example lacks is support for the callbacks I mentioned earlier. I’d like to know when an item is removed, added, or changed, and specify it in a easily defined way, like

peopleByIDTarget.Merge(peopleByIDSource, itemAddedCallback, itemChangedCallback, itemRemovedCallback)

and get the item that was removed, added, or changed.

It’s pretty straightforward to convert the above code into a generic method. The following is the most advanced version, supporting lots of options, callbacks, comparisons to see whether two values are the same (you may not always wish to fire the changed event if the instance of V didn’t have any properties that are different).

First we need to define some helper classes for the callbacks:

   1: /// <summary>
   2: /// Provides event arguments for items that are added to a dictionary.
   3: /// </summary>
   4: /// <typeparam name="K">The Key type</typeparam>
   5: /// <typeparam name="V">The Value type</typeparam>
   6: public class DictionaryItemAddedEventArgs<K, V> : EventArgs
   7: {
   8:     /// <summary>
   9:     /// The Key
  10:     /// </summary>
  11:     public K Key { get; private set; }
  12:  
  13:     /// <summary>
  14:     /// The new value
  15:     /// </summary>
  16:     public V NewValue { get; private set; }
  17:  
  18:     /// <summary>
  19:     /// Creates a new DictionaryItemAddedEventArgs
  20:     /// </summary>
  21:     /// <param name="key">The key</param>
  22:     /// <param name="newValue">The new value</param>
  23:     public DictionaryItemAddedEventArgs(K key, V newValue)
  24:     {
  25:         this.Key = key;
  26:         this.NewValue = newValue;
  27:     }
  28: }
  29:  
  30: /// <summary>
  31: /// Provides event arguments for items that are changed in a dictionary.
  32: /// </summary>
  33: /// <typeparam name="K">The Key type</typeparam>
  34: /// <typeparam name="V">The Value type</typeparam>
  35: public class DictionaryItemChangedEventArgs<K, V> : EventArgs
  36: {
  37:     /// <summary>
  38:     /// The Key
  39:     /// </summary>
  40:     public K Key{ get; private set; }
  41:  
  42:     /// <summary>
  43:     /// The previous value
  44:     /// </summary>
  45:     public V OldValue { get; private set; }
  46:  
  47:     /// <summary>
  48:     /// The new value
  49:     /// </summary>
  50:     public V NewValue { get; private set; }
  51:  
  52:     /// <summary>
  53:     /// Creates a new DictionaryItemChangedEventArgs
  54:     /// </summary>
  55:     /// <param name="key">The key</param>
  56:     /// <param name="oldValue">The previous value</param>
  57:     /// <param name="newValue">The new value</param>
  58:     public DictionaryItemChangedEventArgs(K key, V oldValue, V newValue)
  59:     {
  60:         this.Key = key;
  61:         this.OldValue = oldValue;
  62:         this.NewValue = newValue;
  63:     }
  64: }
  65:  
  66: /// <summary>
  67: /// Provides event arguments for items that are deleted from a dictionary.
  68: /// </summary>
  69: /// <typeparam name="K">The Key type</typeparam>
  70: /// <typeparam name="V">The Value type</typeparam>
  71: public class DictionaryItemDeletedEventArgs<K, V> : EventArgs
  72: {
  73:     /// <summary>
  74:     /// The Key
  75:     /// </summary>
  76:     public K Key { get; private set; }
  77:  
  78:     /// <summary>
  79:     /// The previous value
  80:     /// </summary>
  81:     public V OldValue { get; private set; }
  82:  
  83:     /// <summary>
  84:     /// Creates a new DictionaryItemDeletedEventArgs
  85:     /// </summary>
  86:     /// <param name="key">The key</param>
  87:     /// <param name="oldValue">The previous value</param>
  88:     public DictionaryItemDeletedEventArgs(K key, V oldValue)
  89:     {
  90:         this.Key = key;
  91:         this.OldValue = oldValue;
  92:     }
  93: }

Then we can get to the actual dictionary extensions class. The primary work starts on line 180 and I've included a couple other helper extensions I added for other uses:

   1: /// <summary>
   2: /// Provides extention methods to the dictionary class
   3: /// </summary>
   4: public static class DictionaryExtensions
   5: {
   6:     /// <summary>
   7:     /// Creates a new Dictionary with the key and value of the current dictionary reversed.
   8:     /// This method should be not used when duplicate values are expected because collisions will occur.
   9:     /// </summary>
  10:     /// <typeparam name="K">The Key type</typeparam>
  11:     /// <typeparam name="V">The Value type</typeparam>
  12:     /// <param name="dictionary">The dictionary to use</param>
  13:     /// <returns>A Dictionary with the keys and values of the current dictionary reversed</returns>
  14:     public static Dictionary<V, K> CreateDictionaryOfValueAndKey<K, V>(this Dictionary<K, V> dictionary)
  15:     {
  16:         Dictionary<V, K> result = new Dictionary<V,K>();
  17:         foreach (KeyValuePair<K, V> keyValuePair in dictionary)
  18:         {
  19:             result.Add(keyValuePair.Value, keyValuePair.Key);
  20:         }
  21:  
  22:         return result;
  23:     }
  24:  
  25:     /// <summary>
  26:     /// Creates a new MultiDictionary with the key and value of the current dictionary reversed.
  27:     /// This method should be used when duplicate values are expected.
  28:     /// </summary>
  29:     /// <typeparam name="K">The Key type</typeparam>
  30:     /// <typeparam name="V">The Value type</typeparam>
  31:     /// <param name="dictionary">The dictionary to use</param>
  32:     /// <returns>A MultiDictionary with the keys and values of the current dictionary reversed</returns>
  33:     public static MultiDictionary<V, K> CreateMultiDictionaryOfValueAndKey<K, V>(this Dictionary<K, V> dictionary)
  34:     {
  35:         MultiDictionary<V, K> result = new MultiDictionary<V, K>();
  36:         foreach (KeyValuePair<K, V> keyValuePair in dictionary)
  37:         {
  38:             result.Add(keyValuePair.Value, keyValuePair.Key);
  39:         }
  40:  
  41:         return result;
  42:     }
  43:  
  44:     /// <summary>
  45:     /// Merges the targetDictionary with the sourceDictionary, deleting items that arent in sourceDictionary, 
  46:     /// adding items that are in sourceDictionary but not in the targetDictionary, and updating items that are in both, 
  47:     /// setting them to the value in sourceDictionary
  48:     /// </summary>
  49:     /// <typeparam name="K">The Key type</typeparam>
  50:     /// <typeparam name="V">The Value type of the source and target dictionaries</typeparam>        
  51:     /// <param name="targetDictionary">The current dictionary to merge entries into</param>
  52:     /// <param name="sourceDictionary">The new dictionary with the most recent data</param>
  53:     public static void Merge<K, V>(
  54:         this Dictionary<K, V> targetDictionary,
  55:         Dictionary<K, V> sourceDictionary)
  56:     {
  57:         Merge(targetDictionary, sourceDictionary, null, null, null, null, null);
  58:     }
  59:  
  60:     /// <summary>
  61:     /// Merges the targetDictionary with the sourceDictionary, deleting items that arent in sourceDictionary, 
  62:     /// adding items that are in sourceDictionary but not in the targetDictionary, and updating items that are in both, 
  63:     /// setting them to the value in sourceDictionary
  64:     /// </summary>
  65:     /// <typeparam name="K">The Key type</typeparam>
  66:     /// <typeparam name="VTarget">The Value type of the target dictionary</typeparam>
  67:     /// <typeparam name="VSource">The Value type of the source dictionary</typeparam>
  68:     /// <param name="targetDictionary">The current dictionary to merge entries into</param>
  69:     /// <param name="sourceDictionary">The new dictionary with the most recent data</param>
  70:     public static void Merge<K, VTarget, VSource>(
  71:         this Dictionary<K, VTarget> targetDictionary,
  72:         Dictionary<K, VSource> sourceDictionary)
  73:     {
  74:         Merge(targetDictionary, sourceDictionary, null, null, null, null, null);
  75:     }
  76:  
  77:     /// <summary>
  78:     /// Merges the targetDictionary with the sourceDictionary, deleting items that arent in sourceDictionary, 
  79:     /// adding items that are in sourceDictionary but not in the targetDictionary, and updating items that are in both, 
  80:     /// setting them to the value in sourceDictionary
  81:     /// </summary>
  82:     /// <typeparam name="K">The Key type</typeparam>
  83:     /// <typeparam name="V">The Value type of the source and target dictionaries</typeparam>
  84:     /// <param name="targetDictionary">The current dictionary to merge entries into</param>
  85:     /// <param name="sourceDictionary">The new dictionary with the most recent data</param>
  86:     /// <param name="valueUpdater">The action to use to update values that share the same key</param>
  87:     public static void Merge<K, V>(
  88:         this Dictionary<K, V> targetDictionary,
  89:         Dictionary<K, V> sourceDictionary,
  90:         Func<V, V, bool> valueUpdater)
  91:     {
  92:         Func<V, V> valueMapper = x => x;
  93:         Merge(targetDictionary, sourceDictionary, valueMapper, valueUpdater, null, null, null);
  94:     }
  95:  
  96:     /// <summary>
  97:     /// Merges the targetDictionary with the sourceDictionary, deleting items that arent in sourceDictionary, 
  98:     /// adding items that are in sourceDictionary but not in the targetDictionary, and updating items that are in both, 
  99:     /// setting them to the value in sourceDictionary
 100:     /// </summary>
 101:     /// <typeparam name="K">The Key type</typeparam>
 102:     /// <typeparam name="VTarget">The Value type of the target dictionary</typeparam>
 103:     /// <typeparam name="VSource">The Value type of the source dictionary</typeparam>
 104:     /// <param name="targetDictionary">The current dictionary to merge entries into</param>
 105:     /// <param name="sourceDictionary">The new dictionary with the most recent data</param>
 106:     /// <param name="valueMapper">The transform to convert VSource to VTarget</param>
 107:     /// <param name="valueUpdater">The action to use to update values that share the same key</param>
 108:     public static void Merge<K, VTarget, VSource>(
 109:         this Dictionary<K, VTarget> targetDictionary,
 110:         Dictionary<K, VSource> sourceDictionary,
 111:         Func<VSource, VTarget> valueMapper,
 112:         Func<VTarget, VTarget, bool> valueUpdater)
 113:     {
 114:         Merge(targetDictionary, sourceDictionary, valueMapper, valueUpdater, null, null, null);
 115:     }
 116:  
 117:     /// <summary>
 118:     /// Merges the targetDictionary with the sourceDictionary, deleting items that arent in sourceDictionary, 
 119:     /// adding items that are in sourceDictionary but not in the targetDictionary, and updating items that are in both, 
 120:     /// setting them to the value in sourceDictionary
 121:     /// </summary>
 122:     /// <typeparam name="K">The Key type</typeparam>
 123:     /// <typeparam name="V">The Value type of the source and target dictionaries</typeparam>
 124:     /// <param name="targetDictionary">The current dictionary to merge entries into</param>
 125:     /// <param name="sourceDictionary">The new dictionary with the most recent data</param>
 126:     /// <param name="itemAddedEventHandler">The event to fire for items that were added</param>
 127:     /// <param name="itemChangedEventHandler">The event to fire for items that were changed</param>
 128:     /// <param name="itemDeletedEventHandler">The event to fire for items that were deleted</param>
 129:     public static void Merge<K, V>(
 130:         this Dictionary<K, V> targetDictionary,
 131:         Dictionary<K, V> sourceDictionary,
 132:         EventHandler<DictionaryItemAddedEventArgs<K, V>> itemAddedEventHandler,
 133:         EventHandler<DictionaryItemChangedEventArgs<K, V>> itemChangedEventHandler,
 134:         EventHandler<DictionaryItemDeletedEventArgs<K, V>> itemDeletedEventHandler)
 135:     {
 136:         Func<V, V> valueMapper = x => x;
 137:         Merge(targetDictionary, sourceDictionary, valueMapper, null, itemAddedEventHandler, itemChangedEventHandler, itemDeletedEventHandler);
 138:     }
 139:  
 140:     /// <summary>
 141:     /// Merges the targetDictionary with the sourceDictionary, deleting items that arent in sourceDictionary, 
 142:     /// adding items that are in sourceDictionary but not in the targetDictionary, and updating items that are in both, 
 143:     /// setting them to the value in sourceDictionary
 144:     /// </summary>
 145:     /// <typeparam name="K">The Key type</typeparam>
 146:     /// <typeparam name="VTarget">The Value type of the target dictionary</typeparam>
 147:     /// <typeparam name="VSource">The Value type of the source dictionary</typeparam>
 148:     /// <param name="targetDictionary">The current dictionary to merge entries into</param>
 149:     /// <param name="sourceDictionary">The new dictionary with the most recent data</param>
 150:     /// <param name="valueMapper">The transform to convert VSource to VTarget</param>
 151:     /// <param name="itemAddedEventHandler">The event to fire for items that were added</param>
 152:     /// <param name="itemChangedEventHandler">The event to fire for items that were changed</param>
 153:     /// <param name="itemDeletedEventHandler">The event to fire for items that were deleted</param>
 154:     public static void Merge<K, VTarget, VSource>(
 155:         this Dictionary<K, VTarget> targetDictionary,
 156:         Dictionary<K, VSource> sourceDictionary,
 157:         Func<VSource, VTarget> valueMapper,
 158:         EventHandler<DictionaryItemAddedEventArgs<K, VTarget>> itemAddedEventHandler,
 159:         EventHandler<DictionaryItemChangedEventArgs<K, VTarget>> itemChangedEventHandler,
 160:         EventHandler<DictionaryItemDeletedEventArgs<K, VTarget>> itemDeletedEventHandler)
 161:     {
 162:         Merge(targetDictionary, sourceDictionary, valueMapper, null, itemAddedEventHandler, itemChangedEventHandler, itemDeletedEventHandler);
 163:     }        
 164:  
 165:     /// <summary>
 166:     /// Merges the targetDictionary with the sourceDictionary, deleting items that arent in sourceDictionary, 
 167:     /// adding items that are in sourceDictionary but not in the targetDictionary, and updating items that are in both, 
 168:     /// setting them to the value in sourceDictionary
 169:     /// </summary>
 170:     /// <typeparam name="K">The Key type</typeparam>
 171:     /// <typeparam name="VTarget">The Value type of the target dictionary</typeparam>
 172:     /// <typeparam name="VSource">The Value type of the source dictionary</typeparam>
 173:     /// <param name="targetDictionary">The current dictionary to merge entries into</param>
 174:     /// <param name="sourceDictionary">The new dictionary with the most recent data</param>
 175:     /// <param name="valueMapper">The transform to convert VSource to VTarget</param>
 176:     /// <param name="valueUpdater">The action to use to update values that share the same key</param>
 177:     /// <param name="itemAddedEventHandler">The event to fire for items that were added</param>
 178:     /// <param name="itemChangedEventHandler">The event to fire for items that were changed</param>
 179:     /// <param name="itemDeletedEventHandler">The event to fire for items that were deleted</param>
 180:     public static void Merge<K, VTarget, VSource>(
 181:         this Dictionary<K, VTarget> targetDictionary,
 182:         Dictionary<K, VSource> sourceDictionary,
 183:         Func<VSource, VTarget> valueMapper,
 184:         Func<VTarget, VTarget, bool> valueUpdater,
 185:         EventHandler<DictionaryItemAddedEventArgs<K, VTarget>> itemAddedEventHandler,
 186:         EventHandler<DictionaryItemChangedEventArgs<K, VTarget>> itemChangedEventHandler,
 187:         EventHandler<DictionaryItemDeletedEventArgs<K, VTarget>> itemDeletedEventHandler)
 188:     {
 189:         foreach (var keyValuePair in sourceDictionary)
 190:         {
 191:             VTarget newValue = valueMapper(keyValuePair.Value);
 192:  
 193:             VTarget oldValue;
 194:             if (targetDictionary.TryGetValue(keyValuePair.Key, out oldValue))
 195:             {
 196:                 bool changed = true;
 197:                 if (valueUpdater == null)
 198:                 {
 199:                     targetDictionary[keyValuePair.Key] = newValue;
 200:                 }
 201:                 else
 202:                 {
 203:                     changed = valueUpdater(oldValue, newValue);
 204:                 }
 205:  
 206:                 if (itemChangedEventHandler != null && changed)
 207:                 {
 208:                     itemChangedEventHandler(targetDictionary, new DictionaryItemChangedEventArgs<K, VTarget>(keyValuePair.Key, oldValue, newValue));
 209:                 }
 210:             }
 211:             else
 212:             {
 213:                 targetDictionary.Add(keyValuePair.Key, newValue);
 214:                 if (itemAddedEventHandler != null)
 215:                 {
 216:                     itemAddedEventHandler(targetDictionary, new DictionaryItemAddedEventArgs<K, VTarget>(keyValuePair.Key, newValue));
 217:                 }
 218:             }
 219:         }
 220:  
 221:         List<KeyValuePair<K, VTarget>> itemsToDelete = new List<KeyValuePair<K, VTarget>>();
 222:         foreach (var keyValuePair in targetDictionary)
 223:         {
 224:             if (!sourceDictionary.ContainsKey(keyValuePair.Key))
 225:             {
 226:                 itemsToDelete.Add(keyValuePair);
 227:             }
 228:         }
 229:  
 230:         foreach (var keyValuePair in itemsToDelete)
 231:         {
 232:             targetDictionary.Remove(keyValuePair.Key);
 233:             if (itemDeletedEventHandler != null)
 234:             {
 235:                 itemDeletedEventHandler(targetDictionary, new DictionaryItemDeletedEventArgs<K, VTarget>(keyValuePair.Key, keyValuePair.Value));
 236:             }
 237:         }
 238:     }
 239: }

It’s essentially the original code, just made generic with the callbacks included. There are some additional Func<> delegates. valueMapper is used to transform the source value type to the target value type if they are different. Note there are overloads that do not require this and they just use a simple x=>x mapping. Also, there is a valueUpdater delegate as well. It is used to compare two values to see if they are really the same, which is useful if the value type is a class with properties and you want to see if those have changed prior to calling the changed callback.

Taking the initial example we can now do this:

   1: int countRemoved = 0;
   2: int countAdded = 0;
   3: int countChanged = 0;
   4:  
   5: peopleByIDTarget.Merge(
   6: peopleByIDSource,
   7: (dictionary, itemAddedArgs) =>
   8: {
   9:     countAdded++;
  10:     System.Diagnostics.Debug.WriteLine("Item " + itemAddedArgs.NewValue + " Added");
  11: },
  12: (dictionary, itemChangedArgs) =>
  13: {
  14:     countChanged++;
  15:     System.Diagnostics.Debug.WriteLine("Item " + itemChangedArgs.OldValue + " changed to " + itemChangedArgs.NewValue);
  16: },
  17: (dictionary, itemRemovedArgs) =>
  18: {
  19:     countRemoved++;
  20:     System.Diagnostics.Debug.WriteLine("Item " + itemRemovedArgs.OldValue + " Removed");
  21: });
  22:  
  23: System.Diagnostics.Debug.WriteLine(countAdded.ToString() + " Items Added");
  24: System.Diagnostics.Debug.WriteLine(countChanged.ToString() + " Items Changed");
  25: System.Diagnostics.Debug.WriteLine(countRemoved.ToString() + " Items Removed");

Generics combined with closures allows for some very rapid development.

No comments: