patterncsharpModerate
Multiple indexes over an in memory collection for faster search
Viewed 0 times
searchcollectionindexesfasterformemorymultipleover
Problem
I have a big in-memory collection of following simplified class:
I need to search for products based on different properties like UserName or CategoryId.
One way of searching would be using linq to objects like:
This takes some processing when collection is too big and in my case it would be called hundreds of time each second.
What I came up with was to introduce a new collection that supports indexing over different properties:
```
public class FastCollection : IEnumerable
{
private IList _items;
private IList>> _lookups;
private Dictionary> _indexes;
public FastCollection(IList data)
{
_items = data;
_lookups = new List>>();
_indexes = new Dictionary>();
}
public void AddIndex(Expression> property)
{
_lookups.Add(property);
_indexes.Add(property.ToString(), _items.ToLookup(property.Compile()));
}
public void Add(T item)
{
_items.Add(item);
RebuildIndexes();
}
public void Remove(T item)
{
_items.Remove(item);
RebuildIndexes();
}
public void RebuildIndexes()
{
if (_lookups.Count > 0)
{
_indexes = new Dictionary>();
foreach (var lookup in _lookups)
{
_indexes.Add(lookup.ToString(), _items.ToLookup(lookup.Compile()));
}
}
}
public IEnumerable FindValue(Expression> property, TProperty value)
{
var key = property.ToString();
if(_indexes.ContainsKey(key))
{
return _indexes[key][value];
}
else
{
var c = property.Compile();
return _items.Where(x
public class Product
{
public int Id { get; set; }
public string UserName { get; set; }
public int CategoryId { get; set; }
public string Title { get; set; }
public string Description { get; set; }
}I need to search for products based on different properties like UserName or CategoryId.
One way of searching would be using linq to objects like:
var userProducts = products.Where(x => x.UserName == "SomeValue")This takes some processing when collection is too big and in my case it would be called hundreds of time each second.
What I came up with was to introduce a new collection that supports indexing over different properties:
```
public class FastCollection : IEnumerable
{
private IList _items;
private IList>> _lookups;
private Dictionary> _indexes;
public FastCollection(IList data)
{
_items = data;
_lookups = new List>>();
_indexes = new Dictionary>();
}
public void AddIndex(Expression> property)
{
_lookups.Add(property);
_indexes.Add(property.ToString(), _items.ToLookup(property.Compile()));
}
public void Add(T item)
{
_items.Add(item);
RebuildIndexes();
}
public void Remove(T item)
{
_items.Remove(item);
RebuildIndexes();
}
public void RebuildIndexes()
{
if (_lookups.Count > 0)
{
_indexes = new Dictionary>();
foreach (var lookup in _lookups)
{
_indexes.Add(lookup.ToString(), _items.ToLookup(lookup.Compile()));
}
}
}
public IEnumerable FindValue(Expression> property, TProperty value)
{
var key = property.ToString();
if(_indexes.ContainsKey(key))
{
return _indexes[key][value];
}
else
{
var c = property.Compile();
return _items.Where(x
Solution
Using
Instead you should compare
It seems wasteful to me to rebuild all indexes after each change. If you change the collection often, consider changing only the relevant part of each index.
Fields that are set in the constructor and then never modified should be
If you're on .Net 4.5, you could use
This check is pretty much useless. It saves you from unnecessarily creating an empty dictionary, but doing that is very cheap, so I think shorter code should take the priority here.
You could replace the whole
This won't work correctly when
ToString() to compare expressions for equality might work in simple cases, but:- It requires you to always use the same parameter name, for example, it would consider
x => x.Idandproduct => product.Idto be different expressions.
- Expressions with different meaning can produce the same string, for example
(int i) => (float)iand(int i) => (double)iboth producei => Convert(i). Because of this, it might make sense to ensure that the used expressions contain only property accesses and nothing else.
Instead you should compare
Expressions properly.It seems wasteful to me to rebuild all indexes after each change. If you change the collection often, consider changing only the relevant part of each index.
Fields that are set in the constructor and then never modified should be
readonly.IList dataIf you're on .Net 4.5, you could use
IReadOnlyList here.if (_lookups.Count > 0)This check is pretty much useless. It saves you from unnecessarily creating an empty dictionary, but doing that is very cheap, so I think shorter code should take the priority here.
You could replace the whole
RebuildIndexes() method with a single ToDictionary():_indexes = _lookups.ToDictionary(
lookup => lookup.ToString(), lookup => _items.ToLookup(lookup.Compile()));c(x).Equals(value)This won't work correctly when
c(x) returns null. You should probably use object.Equals(c(x), value) instead.Code Snippets
IList<T> dataif (_lookups.Count > 0)_indexes = _lookups.ToDictionary(
lookup => lookup.ToString(), lookup => _items.ToLookup(lookup.Compile()));c(x).Equals(value)Context
StackExchange Code Review Q#40811, answer score: 18
Revisions (0)
No revisions yet.