HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

SQL: Search for a keyword in several columns of a table

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
searchcolumnssqlkeywordseveralfortable

Problem

I want to perform a search in several columns of a table. I use the following query:

select *
from Tabela t
inner join "TabelaPai" tp on tp."ID" = t."RefTabelaPai" and tp."RefProject" = 'projectid'
where not t."Deleted" 
 and (t.Col1 ~ '.*__param1__.*' or t.Col2 ~ '.*__param1__.*' or t.Col3 ~ '.*__param1__.*'
   or t.Col4 ~ '.*__param1__.*' or t.Col5 ~ '.*__param1__.*' or t.Col6 ~ '.*__param1__.*' 
   or t.Col7 ~ '.*__param1__.*' or t.Col8 ~ '.*__param1__.*' or t.Col9 ~ '.*__param1__.*');


This will search for the keyword __param1__ in any of the columns and it's working fine.

But I don't like the way the query looks like. Any suggestion on how to refactor the query so it can look 'prettier' (without those ~ '.__param1__.' repetitions, for example)?

Edit: A little of context about the query:

What leads to this usage is that I can parameterize the data in the table. For example, I have a column in a table where scripts are saved. My application allows the users to parametrize the script using something like __param1__. If the user wants to rename the parameter I'll have to search for the usage of the parameter in every column that is parameterizable, and this is the query that finds where the parameter is used.

Solution

Something "prettier"? And so many repetitions call for generalization?

SELECT *
FROM   tabela AS t
JOIN   "TabelaPai" tp ON tp."ID" = t."RefTabelaPai"
WHERE  NOT t."Deleted" 
AND    tp."RefProject" = 'projectid'
AND    t::text LIKE '%\_\_param1\_\_%';


Or cleaner:

...
AND    t.*::text LIKE '%\_\_param1\_\_%';


A nested column of the same name (t in this case) would take precedence. The more verbose syntax t.* makes it an unambiguous reference to the table row.

You can reference the composite type of any relation in the SELECT list. The manual:

Whenever you create a table, a composite type is also automatically
created, with the same name as the table, to represent the table's row type.

You can cast the whole row (the composite type) to its text representation in one fell swoop, which is a very convenient syntactical shorthand. The resulting filter in my query is guaranteed to find every occurrence in the whole row.

The LIKE operator is generally faster than regular expression pattern matching (~). Regular expressions are far more powerful, but whenever LIKE can do the job, use it. Its syntax is simpler, too.

Underscores (_) have a special meaning for the LIKE operator, so you need to escape literal _. Default escape character is \.

Since Postgres 9.1, the setting standard_conforming_strings is on by default. Else, or with E'' syntax to declare Posix escape strings explicitly, escape \ like:

t::text LIKE E'%\\_\\_param1\\_\\_%'


If the separator in text representation (, by default) or double quotes (which can enclose strings) can be part of the search pattern, there can be false positives (across columns). So this is corner-case "dirty".

Asides:

It is cleaner to write AND tp."RefProject" = 'projectid' as WHERE clause, as it has no connection to tabela. Else you could put NOT t."Deleted" into the JOIN condition as well. Either way, same result.

Avoid CaMeL-case spelling of identifiers in Postgres like Tabela or RefTabelaPai. Without double-quotes, all identifiers are folded to lower case automatically.

My standing advice is to use legal. lower case identifiers exclusively in PostgreSQL and avoid avoid double-quoting and possible confusion.

Code Snippets

SELECT *
FROM   tabela AS t
JOIN   "TabelaPai" tp ON tp."ID" = t."RefTabelaPai"
WHERE  NOT t."Deleted" 
AND    tp."RefProject" = 'projectid'
AND    t::text LIKE '%\_\_param1\_\_%';
...
AND    t.*::text LIKE '%\_\_param1\_\_%';
t::text LIKE E'%\\_\\_param1\\_\\_%'

Context

StackExchange Code Review Q#2364, answer score: 10

Revisions (0)

No revisions yet.