Flashcache - part 1

Dorus Verhoeckx

The need for fixed size temporal cache data structure

A while ago I was writing a program to generate an optimal deck of cards for a tournament that I was going to play with friends. This card game is Called "Magic The Gathering". You create a deck of 60 cards, each card can only occur 4 times (with some exceptions). The game is such that there are over 20K unique cards and each card is designed to undermine the basic rules of the game. It's a strategy game, 1 part deck building and 1 part playing.

So you can imagine that writing a program to generate an optimal deck can be quite complex. The program needed iterate and compare and analyze each unique combination.

Everything the program needs is in memory; for optimal performance. The problem was that I needed a caching mechanism that didn't grow too fast (imagine at least 20K factorial potential text comparisons in total). Furthermore, the caching mechanism needed to be 'temporal', this means that the same stuff only needs to stay cached for a relatively short time-span. I didn’t care about complex expiring or optimizing memory usage.

Speed and guaranteed size are the only thing required.

One more thing, it should be thread safe. Multiple readers, multiple writers, don't care about ejecting items from cache when it is full.

When I was starting this project I was considering Kotlin, because a colleague @mplatvoet is a big fan and from what he told me the language is really great. (and he is a hardcore data structure and optimization "enthusiast"). But I chose my comfort zone VS2015 Community Edition with Resharper Ultimate (still some JetBrains in there) and the CSharp language. Yep that was lame, but I still learned new interesting things.

My first idea was a ring cache (@mplatvoet was explaining some stuff to me about that and it sounded like a possible solution). Couldn't find an existing library that fitted my needs, so I started to think about the simplest fixed size cache solution. A ConcurrentDictionary worked fine for a while but turned out to be too slow with multiple readers and writers (lot of wait time) and the eventually OutOfMemoryExceptions. Then I looked into the System.Runtime.Caching.MemoryCache. You can pass a parameter with the maximum size and you can setup some auto expiration. I wrapped this, modified the access and tested. This was a little faster, a bit less memory, but still no easy control over the fixed size.

So what would be the best solution in this situation? I talked for 5 minutes with @mplatvoet and he told me some stuff about hashmaps, array's with size power of two and how this very simple concept could work. I went home started to code, looked up some details about bit-shifting and binary operators (last time I used this was in college) and built a prototype. He called it the FlashCache. I would really recommend his in depth explanations of these concepts!

The concept is very simple: You have a fixed size cache (an array internally), you use hashing to store the value. You use an array with fixed size of a power of two to optimize calulating the index: & operator instead of % (modulo). Use a ThreadBarrier for volatile read and write access, and you're done!

You can view the first version here

The version I ended up with can be viewed on github github project nuget package