Friday, February 19, 2010

Caching and Memoization

A while back Wes Dyer wrote a great article on function memoization, a technique used to speed up subsequent function evaluations by caching the results of previous executions based on input value.

In a nutshell, given a function f(x),
m(f(x)) =
if (precomputed x is stored)
return precomputed x
else
precomputed x = f(x)
store precomputed(x)
return f(x)

Memoization is frequently used when the following conditions are true:
1. Input values of x are frequently recalculated
2. f(x) takes enough time to execute that the speedup for computing f(x) outweighs by a large margin the lookup times for finding the precomputed value of f(x).
3. There are not so many distinct values of x as to cause memory to overflow for all the precomputed values that will be cached. Memoization would not be helpful to calculate across a set of 10^10 distinct input values, for example.

The examples Wes gave are great and I strongly encourage you to read his article. This article was written to expand on that article by applying some additional concepts. The primary issue I had with straightforward memoization was item #3. Let us assume we have a long running function f(x). We then want to apply the following logic:

Define a cache timeout time t.
If we have computed f(x) within t, then return the cached value of f(x).
If we have not, then recalculate f(x) and update the cache.

This solves item #3 fairly gracefully. As long as the number of input values of x do not vary by more than n new items in time t, you can have support for an infinite values of x, as long as you understand how x is changing in time t. It also solves the issue of eviction of stale values from the cache. So knowing what I wanted to do let’s look at the implementation.

The class I created was called CachedFunction<T, TKey, TResult> where T is the input type that can be a class, TKey is a struct that can be reliably used as an key for the caching dictionary, and TResult is the output type of the function we are caching. I also created a simpler version when you have a function that takes a struct as an input value vs a class. In that case you can just use CachedFunction<T, TResult>. Internally that class just maps T->TKey via a 1-1 mapping function.

Let’s look at an example:

Func<int, int> addOne = x => { System.Threading.Thread.Sleep(1000); return x + 1; }; // Wait second and add a value
Action<int, TimeSpan> printTime =
(x, time) =>
{
string message = string.Format("result={0}, computationtime={1}", x, time);
System.Diagnostics.Debug.WriteLine(message);
}; // Helper for printing output

var addOneCached = addOne.CreateCachedFunction(new TimeSpan(0, 1, 0)); // Create a caching version of the function with a one minute timeout
System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
sw.Start();
var result = addOneCached(1); // Compute the value. Should take 1 second because of sleep.
printTime(result, sw.Elapsed);
sw.Reset();
sw.Start();
var result2 = addOneCached(1); // Compute the value. Should be instantaneous because it's cached.
printTime(result2, sw.Elapsed);
sw.Stop();


And the results:
result=2, computationtime=00:00:01.0039687
result=2, computationtime=00:00:00.0005103

Let’s look at the code, starting with the caching function itself:

using System;
using System.Collections.Generic;

namespace Intercerve
{
/// <summary>
/// Provides functionality for wrapping functions and caching computed values
/// </summary>
/// <typeparam name="T"></typeparam>
/// <typeparam name="TKey"></typeparam>
/// <typeparam name="TResult"></typeparam>
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Design", "CA1005:AvoidExcessiveParametersOnGenericTypes")]
public class CachedFunction<T, TKey, TResult> where TKey : struct
{
/// <summary>
/// An internal class used to hold the time an individual result was cached and the corresponding value.
/// </summary>
private class ResultAndCacheTime
{
/// <summary>
/// The value of the cached result.
/// </summary>
public TResult Result { get; set; }
/// <summary>
/// The time the value was cached.
/// </summary>
public DateTime CacheTime { get; set; }
}

private readonly Dictionary<TKey, ResultAndCacheTime> _Cache = new Dictionary<TKey, ResultAndCacheTime>();
private readonly Func<T, TResult> _Function;
private readonly Func<T, TKey> _KeyMap;
private readonly TimeSpan _CacheTimeout;
private readonly object _SyncLock = new object();

/// <summary>
/// Creates a new CachedFunction to provide automatic caching and eviction for computed values.
/// </summary>
/// <param name="function">The function to wrap.</param>
/// <param name="keyMap">A mapping function that returns a key of type TKey for an input value of type T.</param>
/// <param name="cacheTimeout">The cache timeout threshold for flushing the cache.</param>
public CachedFunction(Func<T, TResult> function, Func<T, TKey> keyMap, TimeSpan cacheTimeout)
{
_Function = function;
_KeyMap = keyMap;
_CacheTimeout = cacheTimeout;
}

/// <summary>
/// Computes the value of f(value) or returns the cached value if within the cache timeout threshold.
/// </summary>
/// <param name="value">The value to retrieve the result for.</param>
/// <returns>The value of f(value) or the last cached value.</returns>
public TResult Compute(T value)
{
TKey key = _KeyMap(value);
ResultAndCacheTime resultAndCacheTime;

lock (_SyncLock)
{
// Aquire the lock and see if we have the value already cached.
if (_Cache.TryGetValue(key, out resultAndCacheTime))
{
// We already have the value. How old is it?
TimeSpan elapsedTime = DateTime.UtcNow.Subtract(resultAndCacheTime.CacheTime);
if (elapsedTime >= _CacheTimeout)
{
// The value is too old so remove it.
_Cache.Remove(key);
}
else
{
// The value is within the cache threshold so return it.
return resultAndCacheTime.Result;
}
}
}

// We don't have the value cached. Compute the value. Note we don't hold the lock here.
// This can result in the operating executing twice vs only computing the value once when we don't have it cached
// but if we don't do this and hold _SyncLock while computing, it would make Compute(T value) a
// blocking operation for the duration of _Function(value).
TResult computedResult = _Function(value);
resultAndCacheTime = new ResultAndCacheTime { Result = computedResult, CacheTime = DateTime.UtcNow };

lock (_SyncLock)
{
ResultAndCacheTime resultAndCacheTimeExisting;
if (_Cache.TryGetValue(key, out resultAndCacheTimeExisting))
{
// This is for thread synchronization. _Function(value) could potentially take a long time, so
// we cant hold the lock during it. Because of that we use a last win algorithm to see if two threads updated at the same time
// If they did the last computed time wins
if (resultAndCacheTime.CacheTime > resultAndCacheTimeExisting.CacheTime)
{
_Cache.Remove(key);
_Cache.Add(key, resultAndCacheTime);
}
}
else
{
_Cache.Add(key, resultAndCacheTime);
}
}

return computedResult;
}

/// <summary>
/// Clears the entire cache all at once for all values.
/// </summary>
public void ClearCache()
{
lock (_SyncLock)
{
_Cache.Clear();
}
}

/// <summary>
/// Clears a specific value from the cache.
/// </summary>
public void ClearCacheForValue(T value)
{
TKey key = _KeyMap(value);
lock (_SyncLock)
{
if (_Cache.ContainsKey(key))
{
_Cache.Remove(key);
}
}
}
}

/// <summary>
/// Provides functionality for wrapping functions and caching computed values
/// </summary>
public class CachedFunction<T, TResult> where T : struct
{
private CachedFunction<T, T, TResult> _CachedFunction;

/// <summary>
/// Creates a new CachedFunction to provide automatic caching and eviction for computed values.
/// </summary>
/// <param name="function">The function to wrap.</param>
/// <param name="cacheTimeout">The cache timeout threshold for flushing the cache.</param>
public CachedFunction(Func<T, TResult> function, TimeSpan cacheTimeout)
{
_CachedFunction = new CachedFunction<T, T, TResult>(function, GetKey, cacheTimeout);
}

private T GetKey(T value)
{
return value;
}

/// <summary>
/// Computes the result of value.
/// </summary>
/// <param name="value">The value to evaluate.</param>
/// <returns>The result.</returns>
public TResult Compute(T value)
{
return _CachedFunction.Compute(value);
}

/// <summary>
/// Clears the entire cache all at once for all values.
/// </summary>
public void ClearCache()
{
_CachedFunction.ClearCache();
}

/// <summary>
/// Clears a specific value from the cache.
/// </summary>
public void ClearCacheForValue(T value)
{
_CachedFunction.ClearCacheForValue(value);
}
}
}


That is relatively user friendly, but to create a caching function using that method we have to call:

var addOneCachedLong = new CachedFunction<int, int>(addOne, new TimeSpan(0, 1, 0)).GetFunction();

To reduce the complexity we can create helper extension methods and use type inference to ease in the construction. To do so I defined the following:

using System;

namespace Intercerve
{
/// <summary>
/// Various extension methods for creating alternate versions of functions.
/// </summary>
public static class FunctionExtensions
{
/// <summary>
/// Creates a caching wrapper around a function.
/// </summary>
/// <typeparam name="T">The function argument type</typeparam>
/// <typeparam name="TResult">The function return type</typeparam>
/// <param name="function">The function to wrap</param>
/// <param name="cacheTimeout">The cache timeout for cached results</param>
/// <returns></returns>
public static Func<T, TResult> CreateCachedFunction<T, TResult>(this Func<T, TResult> function, TimeSpan cacheTimeout) where T : struct
{
CachedFunction<T, TResult> cachedFunction = new CachedFunction<T, TResult>(function, cacheTimeout);
return cachedFunction.Compute;
}

/// <summary>
/// Creates a caching wrapper around a function.
/// </summary>
/// <typeparam name="T">The function argument type</typeparam>
/// <typeparam name="TKey">The cached item key type</typeparam>
/// <typeparam name="TResult">The function return type</typeparam>
/// <param name="function">The function to wrap</param>
/// <param name="keyMap">The mapping function from T to TKey</param>
/// <param name="cacheTimeout">The cache timeout for cached results</param>
/// <returns></returns>
public static Func<T, TResult> CreateCachedFunction<T, TKey, TResult>(this Func<T, TResult> function, Func<T, TKey> keyMap, TimeSpan cacheTimeout) where TKey : struct
{
CachedFunction<T, TKey, TResult> cachedFunction = new CachedFunction<T, TKey, TResult>(function, keyMap, cacheTimeout);
return cachedFunction.Compute;
}
}
}


We can then do what we did in the example, which is simply:
var addOneCached = addOne.CreateCachedFunction(new TimeSpan(0, 1, 0));


I also added support for eviction of individual items on the cache if need be, or flushing the cache entirely. The primary usage I’ve found for the CachedFunction() was to add a bounded caching solution around a function that already had been written. It’s much easier to extend a method like this than to go into the guts and reorganize. During an optimization phase we noticed that certain database function calls were running very often. It was a tricky problem because we wanted to make a method call, but depending on various factors that method could run very often or not very often. We wanted to restrict how often the inner function ran.

Put another way, take a function outer(x) that calls inner(x). We can’t control how often outer(x) runs. Sometimes it could execute multiple times per second, sometimes once per minute. We can’t change that behavior. It needs to run as often as it needs to run. However inner(x) shouldn’t change that often. Most of the time its value is the same as it was the previous execution. Sometimes it can change but rarely. So we used this function to wrap inner(x) and make cached_inner(x) with a threshold of five minutes or so. Voila. Problem solved. No matter how often outer(x) runs, inner(x) will only run once every five minutes, but will always yield a value, and better yet, we did this without having to modify the original function and in one line of code.



Of course, right now this method only supports functions with one input parameter, however it wouldn’t be too hard to extend the class to support additional variables. Wes demonstrates how to do this easily in another excellent blog post: http://blogs.msdn.com/wesdyer/archive/2007/02/11/baby-names-nameless-keys-and-mumbling.aspx



Until next time…

No comments: