Cache for JSON API
Problem
I am working on a small project that uses a 3rd-party API. Many of the queries to the API can take 5-10 seconds, so to keep the front end moving a bit faster I decided to cache the responses from these calls. I ended up building my own cache implementation. It consists of three classes, described below:
`Cache`, `CacheBin`, `CacheEntry`. All classes should be thread-safe. I consider myself a novice at concurrency, so it's very possible I have something not quite right on that end.

- `Cache` is simply a wrapper around a `Map`; each API call is stored separately in its own `CacheBin`.
- `CacheBin` is the basic foundation of a cache, with one little tweak: instead of using a generic type or any old string as a key, it computes an MD5 hash to use as the key. The reason for the MD5 is that many of the queries I send to the API are upwards of 1k characters. I am also using `SoftReference<CacheEntry>` so that garbage collection can pick these up if memory is running low.
- `CacheEntry` is an actual entry of data in the cache; it is immutable and contains an expiration time.
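To make the description concrete, here is a minimal sketch of what an immutable entry with an expiration time could look like. This is a hypothetical reconstruction, not the asker's actual `CacheEntry`; the field names and the millisecond time-to-live are assumptions based on the test snippet later in the question.

```java
import java.time.Instant;

// Hypothetical sketch of an immutable expiring entry; the real
// CacheEntry from the question may differ.
final class CacheEntry {
    private final Object data;
    private final Instant expiresAt;

    CacheEntry(Object data, long ttlMillis) {
        this.data = data;
        this.expiresAt = Instant.now().plusMillis(ttlMillis);
    }

    Object getData() { return data; }

    // Immutable state plus a read-only check: safe to share across threads.
    boolean isExpired() { return Instant.now().isAfter(expiresAt); }
}

public class CacheEntryDemo {
    public static void main(String[] args) {
        CacheEntry entry = new CacheEntry("world", 300);
        System.out.println(entry.isExpired()); // freshly created, so false
        System.out.println(entry.getData());
    }
}
```

Because all fields are final and set in the constructor, the entry needs no locking of its own; thread safety concerns are pushed entirely into the containing bin.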
Also, at the bottom, I have included some JUnit tests I made in case you want to run these.
Questions:
- Is my synchronization in `Cache` and `CacheBin` correct, and/or is there a cleaner way to do it?
- Am I using `SoftReference` inside `CacheBin` on the `CacheEntry` correctly? Will a `SoftReference` on the `CacheEntry` be enough, or should I put the `SoftReference` on the `data` property of `CacheEntry`? On my JVM they are being cleaned up; however, some reading on the subject says that JVMs treat `SoftReference`s differently, so a different approach might be necessary.
- Does the MD5 sum of a JSON request body make sense to use as the key in a `HashMap`-based implementation?
A very simple demonstration of a `CacheBin`:

```
long cacheExpiration = 300; // 300 ms for testing
CacheBin bin = new CacheBin(cacheExpiration);
bin.store("hello", "world");
// cache is good
Assert.assertEquals("world", bin.get("hello"));
```
Solution
I think you should never implement your own cache for any serious project. Caching has been implemented many times before, and rolling your own is not only a waste of time but also quite error-prone. That said, these are my (subjective) answers to your questions:
Is my synchronization in the Cache and CacheBin correct and/or is
there a cleaner way to do it?
How is the correctness of cache synchronization defined? I mean, what is the worst-case scenario of incorrect cache synchronization? That you compute an entry twice? That's not so bad. In fact, there are (quite common) scenarios where no synchronization gives you faster results (less synchronization overhead), e.g. when using a ConcurrentHashMap.

That said, I think your Cache synchronization is correct, but far from optimal. For example, your methods start with an (implicit) synchronized block, which is IMHO not optimal; a double check should be preferred, for the sake of throughput. Imagine the key everybody uses is already cached, and you have 1000 parallel users (threads) wanting to retrieve this key from your cache. They would have to go through your synchronized methods one by one, instead of all at once as they could if you checked the cache first rather than entering the synchronized block immediately. Furthermore, you are synchronizing the whole cache even when 100 users are working with entirely different keys. That is one more reason to use an established project, like Guava or EhCache, which can do per-key locking.
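The "check the cache first" idea can be sketched with the JDK's own `ConcurrentHashMap`. This is a minimal illustration, not the asker's `Cache` class; the class and method names are invented for the example. Reads of a present key take no monitor at all, and `computeIfAbsent` locks only the internal bin of the missing key rather than the whole map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of a lock-light cache: hot keys are served without any global
// lock, and a missing value is computed at most once per key.
public class ConcurrentCacheSketch {
    private final ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();

    public String get(String key, Function<String, String> loader) {
        // Fast path: no synchronization when the key is already cached,
        // so 1000 readers of the same key do not queue behind a monitor.
        String cached = map.get(key);
        if (cached != null) {
            return cached;
        }
        // Slow path: computeIfAbsent locks only this key's bucket and
        // guarantees the loader runs at most once for concurrent misses.
        return map.computeIfAbsent(key, loader);
    }

    public static void main(String[] args) {
        ConcurrentCacheSketch cache = new ConcurrentCacheSketch();
        String v = cache.get("query", k -> "result-for-" + k);
        System.out.println(v); // computed once, then served from the map
    }
}
```

The explicit `map.get` before `computeIfAbsent` is the double-check pattern mentioned above: the common cache-hit case pays nothing for synchronization.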
Am I using SoftReference inside CacheBin on the CacheEntry correctly?
Will a SoftReference on the CacheEntry be enough, or should I put the
SoftReference on the data property of CacheEntry? On my JVM they are
being cleaned up; however, some reading on the subject says that JVMs
treat SoftReferences differently, so a different approach might be
necessary.
My guess is you are using it correctly. That said, Guava and other cache implementations can do this as well, and I bet they do it even more correctly ;)
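For reference, the pattern the question describes looks roughly like the sketch below, assuming the `SoftReference` wraps the whole entry. The class name is hypothetical; the point is that a `get()` must treat a reference cleared by the garbage collector exactly like a cache miss.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Sketch of soft-valued storage: the GC may clear values under memory
// pressure, so a cleared reference is handled as a miss and evicted.
public class SoftValueBin {
    private final Map<String, SoftReference<String>> entries = new HashMap<>();

    public synchronized void store(String key, String value) {
        entries.put(key, new SoftReference<>(value));
    }

    public synchronized String get(String key) {
        SoftReference<String> ref = entries.get(key);
        if (ref == null) {
            return null;         // never stored
        }
        String value = ref.get();
        if (value == null) {
            entries.remove(key); // cleared by the GC: drop the stale reference
        }
        return value;
    }

    public static void main(String[] args) {
        SoftValueBin bin = new SoftValueBin();
        bin.store("hello", "world");
        System.out.println(bin.get("hello"));
    }
}
```

Note that the JVM only guarantees soft references are cleared before an `OutOfMemoryError`; how eagerly they are cleared before that point is implementation-specific, which is exactly the behavioral difference between JVMs that the question mentions.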
Does the MD5 sum of a JSON request body make sense to use for a
HashMap implementation?
I don't see any good reason for the MD5 sum, because of how HashMap and String's hashCode() work. Normally, if you use Strings as keys of a HashMap, hashCode() is invoked, which goes through the whole String once. Your implementation does that same full pass when computing the MD5 sum, but then additionally makes the HashMap hash the MD5 sum String, so you only add overhead.
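The overhead argument can be made concrete with a small illustration (the class and variable names are invented for the example): the raw JSON body works directly as a `HashMap` key, while the MD5 route first walks the whole string to digest it and then hashes the resulting hex string anyway.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Illustrates that MD5-ing a String key adds a full extra pass over the
// data without changing HashMap lookup semantics.
public class Md5KeyDemo {
    static String md5Hex(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is required on every JVM
        }
    }

    public static void main(String[] args) {
        String jsonBody = "{\"query\":\"some long request body\"}";

        Map<String, String> plain = new HashMap<>();
        plain.put(jsonBody, "response");          // one hashCode() pass

        Map<String, String> hashed = new HashMap<>();
        hashed.put(md5Hex(jsonBody), "response"); // MD5 pass + hashCode() pass

        System.out.println(plain.get(jsonBody));
        System.out.println(hashed.get(md5Hex(jsonBody)));
    }
}
```

One thing MD5 does buy is a fixed-size key, which could matter if the 1k-character queries were being stored or logged elsewhere; for an in-memory `HashMap` lookup, though, it is pure extra work.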
Context
StackExchange Code Review Q#82209, answer score: 2