HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Using Solr/Lucene for searching non-text tables?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
tablessearchingnontextsolrluceneforusing

Problem

I am creating a web application to retrieve subsets of one large (4m rows) table. The 4m rows only change once a year. The table has 200+ columns of types boolean and numeric. It has no text columns.

Users will query subsets of this table for download.

I am rather familiar with the PostgreSQL 9.1 database, and my plan is:

  • The webapp will help the user to create queries like "WHERE a=3 AND b=true AND c>300"



  • The DB will have proper indexes for the most-used columns.



NOW.. I read here:
https://stackoverflow.com/questions/10053050/why-is-solr-so-much-faster-than-postgres:


I recently switched from Postgres to Solr and saw a ~50x speed up in our queries. The queries we run involve multiple ranges, and our data is vehicle listings. For example: "Find all vehicles with mileage < 50,000, $5,000 < price < $10,000, make=Mazda..."

So now I wonder: Will/can Solr, Lucene, ElasticSearch, Amazon Cloud Search searches be faster than PostgreSQL even if no full-text search is involved?

Solution

Will/can Solr/Lucene searches be faster than PostgreSQL even if no
full-text search is involved?

Yes. As per your quoted example, it can be many times faster than a relational database for certain use cases. Not surprising really.

Solr is a search engine. PostgreSQL is a relational database engine.

Solr is built from the ground up to do one thing well, search. It cannot do the work of a transactional database system. Solr cannot guarantee ACID compliance, it does not efficiently manage concurrency, it is not a viable candidate for OLTP workloads.

PostgreSQL is built from the ground up to do one thing well, transactional processing. It can also do a reasonable job of providing search type facilities but that is not its primary role. It does guarantee ACID compliance, it does manage concurrency, it is not a search engine.


The 4m rows only change once a year.

In which case you probably don't need the key features of a relational database engine.

You need a search engine. Could be Solr, Lucene, ElasticSearch, Amazon Cloud Search or one of a dozens others but first and foremost it will be a search platform, not a relational database.

Context

StackExchange Database Administrators Q#34014, answer score: 9

Revisions (0)

No revisions yet.