Showing posts from September, 2010

MapReduce in C# using Task Parallel Library

Back in August I started playing with a C# implementation of Google's MapReduce algorithm. The implementation was based on work by Stephan Brenner, although I completely refactored it. Today I added a little bit of logic to split up the actual execution of Map and Reduce in this implementation using the Task Parallel Library in .NET 4.0. Check out the source code for MapReduce in C# on GitHub. Below is an excerpt from the tests showing how to use the library.

Counting Words in Files

    public static List<KeyValuePair<string, int>> Map(FileInfo document, string text)
    {
        var items = text.Split('\n', ' ', '.', ',', '\r');
        return items.Select(item => new KeyValuePair<string, int>(item, 1)).ToList();
    }

    public static List<int> Reduce(string word, List<int> wordCounts)
    {
        if (wordCounts == null) return null;

        var result = new List<int> { 0 };
        foreach (var value in wordC