Implementation of lru_cache for daily "top categories" query

Submitted by: @import:stackexchange-codereview

Problem

On my website, I want to show the top categories by products' view or sell counts.

Since this does not change every hour, I want to fetch the real results from the database only once a day (maybe once a week). Here is what I have done:

import datetime
from functools import lru_cache

def global_context(request):
    ## ... some other queries

    # the day of the month is the cache key, so a new day forces a new query
    today = datetime.datetime.now().day
    top_cats = top_categories(today)  # hit once a day
    return {'top_categories': top_cats}

@lru_cache(maxsize=2)
def top_categories(day):
    # fetch first 50 top-selling products
    top_products = Product.objects.order_by('-sell_count')[:50]

    # get first 10 unique categories; dict.fromkeys keeps the best-seller
    # order, whereas set() would return them in arbitrary order
    categories = list(dict.fromkeys(product.categori for product in top_products))[:10]

    return categories


This code works (I tested it with a per-minute version), but I wonder:

- Is this the correct way to achieve my goal?
- When currsize reached maxsize, nothing happened and the code kept working. Why?
- Should I manually reset or clear the cache with cache_clear?

Solution

From what I understand, lru_cache is not the right tool for caching in a production web application. Depending on how the application is deployed, you usually end up with multiple web server processes serving it, which means a separate LRU cache is created in the memory space of every process. This is the same limitation the Django documentation describes for local-memory caching:


Note that each process will have its own private cache instance, which
means no cross-process caching is possible. This obviously also means
the local memory cache isn’t particularly memory-efficient, so it’s
probably not a good choice for production environments. It’s nice for
development.
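The in-process nature of lru_cache also explains the behaviour observed in the question. The sketch below uses a stand-in function (it only records calls instead of querying Product, so the `calls` list and the returned strings are illustrative assumptions) to show that reaching maxsize raises no error: the least recently used entry is dropped silently and recomputed on the next request. Every worker process would hold its own independent copy of this state.

```python
from functools import lru_cache

calls = []  # records every real computation (stand-in for a database hit)

@lru_cache(maxsize=2)
def top_categories(day):
    calls.append(day)
    return [f"category-for-day-{day}"]

top_categories(1)  # miss: computed and stored
top_categories(1)  # hit: served from this process's memory
top_categories(2)  # miss: cache is now full (currsize == maxsize)
top_categories(3)  # miss: the entry for day 1 is evicted without any error
top_categories(1)  # miss again: day 1 has to be recomputed

info = top_categories.cache_info()
print(info)   # hits, misses, maxsize and currsize for this process only
print(calls)  # the days for which the function body actually ran
```

With a daily key, the old entry simply falls out as new days arrive, so calling cache_clear by hand is not needed for this pattern.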

In short, do not cache in Python process memory; use Django's built-in caching framework with external storage such as a database table, Memcached, or Redis. In a simple case, to avoid external dependencies, you can use a cache table in your database:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
        'LOCATION': 'my_cache_table',
    }
}


Then you can cache the result of top_categories in this table and let Django handle expiration of the cache record. The table itself has to be created once with python manage.py createcachetable.
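One way this could look is sketched below. It relies on the get_or_set(key, default, timeout) method of Django's cache API, which calls the default only on a miss. So that the sketch runs without a Django project, a small in-memory stand-in (FakeCache, hypothetical) mimics that one method; the helper name cached_top_categories, the cache key, and the sample category names are also assumptions. In a real project you would pass django.core.cache.cache and a function that runs the Product query.

```python
import time

ONE_DAY = 60 * 60 * 24  # cache lifetime in seconds

def cached_top_categories(cache, compute):
    """Return the cached categories, calling compute() only on a cache miss.

    `cache` is expected to offer get_or_set(key, default, timeout),
    as django.core.cache.cache does.
    """
    return cache.get_or_set("top_categories", compute, ONE_DAY)

class FakeCache:
    """Hypothetical in-memory stand-in for django.core.cache.cache."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp)

    def get_or_set(self, key, default, timeout):
        entry = self._data.get(key)
        if entry is None or time.monotonic() >= entry[1]:
            # Missing or expired: compute the value and store it with an expiry
            value = default() if callable(default) else default
            entry = (value, time.monotonic() + timeout)
            self._data[key] = entry
        return entry[0]

query_runs = []  # counts how often the expensive query would run

def compute():
    query_runs.append(1)
    return ["books", "toys"]  # placeholder for the real Product query

fake = FakeCache()
first = cached_top_categories(fake, compute)
second = cached_top_categories(fake, compute)  # served from the cache
```

Because the expiry is stored with the record in shared storage, all worker processes see the same cached value, and no day-based cache key is needed.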

Context

StackExchange Code Review Q#159728, answer score: 7
