patternModerate
Alternatives to concatenating strings or going procedural to prevent SQL query code repetition?
Viewed 0 times
preventgoingsqlconcatenatingqueryalternativesproceduralrepetitioncodestrings
Problem
Disclaimer: Please bear with me as someone who only uses databases a tiny fraction of his work time. (Most of the time I do C++ programming in my job, but every odd month I need to search/fix/add something in an Oracle database.)
I have repeatedly needed to write complex SQL queries, both for ad-hoc queries and for queries built into applications, where large parts of the queries where just repeated "code".
Writing such abominations in a traditional programming language would get you in deep trouble, yet I (I) have yet been unable to find any decent technique to prevent SQL query code repetition.
Edit: 1st, I want to thank the answerers who provided excellent improvements to my original example. However, this question is not about my example. It's about repetitiveness in SQL queries. As such, the answers (JackP, Leigh) so far do a great job of showing that you can reduce repetitiveness by writing better queries. However even then you face some repetitiveness that apparently cannot be removed: This always nagged me with SQL. In "traditional" programming languages I can refactor quite a lot to minimize repetitiveness in the code, but with SQL it seems that there are no(?) tools that allow for this, except for writing a less repetitive statement to begin with.
Note that I have removed the Oracle tag again, as I would be genuinely interested whether there's no database or scripting language that allows for something more.
Here's one such gem that I cobbled together today. It basically reports the difference in a set of columns of a single table. Please skim through the following code, esp. the large query at the end. I'll continue below.
```
--
-- Create Table to test queries
--
CREATE TABLE TEST_ATTRIBS (
id NUMBER PRIMARY KEY,
name VARCHAR2(300) UNIQUE,
attr1 VARCHAR2(2000),
attr2 VARCHAR2(2000),
attr3 INTEGER,
attr4 NUMBER,
attr5 VARCHAR2(2000)
);
--
-- insert some test data
--
insert into TEST_ATTRIBS values ( 1, 'Alfred', 'a', 'Foobar', 33, 44, 'e');
inse
I have repeatedly needed to write complex SQL queries, both for ad-hoc queries and for queries built into applications, where large parts of the queries where just repeated "code".
Writing such abominations in a traditional programming language would get you in deep trouble, yet I (I) have yet been unable to find any decent technique to prevent SQL query code repetition.
Edit: 1st, I want to thank the answerers who provided excellent improvements to my original example. However, this question is not about my example. It's about repetitiveness in SQL queries. As such, the answers (JackP, Leigh) so far do a great job of showing that you can reduce repetitiveness by writing better queries. However even then you face some repetitiveness that apparently cannot be removed: This always nagged me with SQL. In "traditional" programming languages I can refactor quite a lot to minimize repetitiveness in the code, but with SQL it seems that there are no(?) tools that allow for this, except for writing a less repetitive statement to begin with.
Note that I have removed the Oracle tag again, as I would be genuinely interested whether there's no database or scripting language that allows for something more.
Here's one such gem that I cobbled together today. It basically reports the difference in a set of columns of a single table. Please skim through the following code, esp. the large query at the end. I'll continue below.
```
--
-- Create Table to test queries
--
CREATE TABLE TEST_ATTRIBS (
id NUMBER PRIMARY KEY,
name VARCHAR2(300) UNIQUE,
attr1 VARCHAR2(2000),
attr2 VARCHAR2(2000),
attr3 INTEGER,
attr4 NUMBER,
attr5 VARCHAR2(2000)
);
--
-- insert some test data
--
insert into TEST_ATTRIBS values ( 1, 'Alfred', 'a', 'Foobar', 33, 44, 'e');
inse
Solution
You are too modest - your SQL is well and concisely written given the task you are undertaking. A few pointers:
-
if you are willing for the numeric comparisons to happen after casting to varchar, the following might be worth considering:
the second view is a kind of
--EDIT--
To answer the more general side of the question, there are techniques to reduce repetition in SQL, including:
But you can't bring OO ideas into the SQL world directly - in many cases repetition is fine if the query is readable and well-written, and it would be unwise to resort to dynamic SQL (for example) just to avoid repetition.
The final query including Leigh's suggested change and a CTE instead of a view could look something like this:
t1.name <> t2.nameis always true ift1.name = REPLACE(t2.name, 'DUP_', '')- you can drop the former
- usually you want
union all.unionmeansunion allthen drop duplicates. It might make no difference in this case but always usingunion allis a good habit unless you explicitly want to drop any duplicates.
-
if you are willing for the numeric comparisons to happen after casting to varchar, the following might be worth considering:
create view test_attribs_cast as
select id, name, attr1, attr2, cast(attr3 as varchar(2000)) as attr3,
cast(attr4 as varchar(2000)) as attr4, attr5
from test_attribs;
create view test_attribs_unpivot as
select id, name, 1 as attr#, attr1 as attr from test_attribs_cast union all
select id, name, 2, attr2 from test_attribs_cast union all
select id, name, 3, attr3 from test_attribs_cast union all
select id, name, 4, attr4 from test_attribs_cast union all
select id, name, 5, attr5 from test_attribs_cast;
select 'attr'||t1.attr# as different, t1.id as id_1, t2.id as id_2, t1.name,
t2.name as name_dup, t1.attr as val1, t2.attr as val2
from test_attribs_unpivot t1 join test_attribs_unpivot t2 on(
t1.id<>t2.id and
t1.name = replace(t2.name, 'DUP_', '') and
t1.attr#=t2.attr# )
where t1.attr<>t2.attr or (t1.attr is null and t2.attr is not null)
or (t1.attr is not null and t2.attr is null);the second view is a kind of
unpivot operation - if you are on at least 11g you can do this more concisely with the unpivotclause - see here for an example- I'm say don't go down the procedural route if you can do it in SQL, but...
- Dynamic SQL is probably worth considering despite the problems you mention with testing and maintenance
--EDIT--
To answer the more general side of the question, there are techniques to reduce repetition in SQL, including:
- Views - you know about that one :)
- Common Table Expressions (see here for example)
- individual features of the database such as
decode(see Leigh's answer for how this can reduce repetition), window functions and hierarchical/recursive queries to name but a few
But you can't bring OO ideas into the SQL world directly - in many cases repetition is fine if the query is readable and well-written, and it would be unwise to resort to dynamic SQL (for example) just to avoid repetition.
The final query including Leigh's suggested change and a CTE instead of a view could look something like this:
with t as ( select id, name, attr#,
decode(attr#,1,attr1,2,attr2,3,attr3,4,attr4,attr5) attr
from test_attribs
cross join (select rownum attr# from dual connect by rownumt2.id and
t1.name = replace(t2.name, 'DUP_', '') and
t1.attr#=t2.attr# )
where t1.attr<>t2.attr or (t1.attr is null and t2.attr is not null)
or (t1.attr is not null and t2.attr is null);Code Snippets
create view test_attribs_cast as
select id, name, attr1, attr2, cast(attr3 as varchar(2000)) as attr3,
cast(attr4 as varchar(2000)) as attr4, attr5
from test_attribs;
create view test_attribs_unpivot as
select id, name, 1 as attr#, attr1 as attr from test_attribs_cast union all
select id, name, 2, attr2 from test_attribs_cast union all
select id, name, 3, attr3 from test_attribs_cast union all
select id, name, 4, attr4 from test_attribs_cast union all
select id, name, 5, attr5 from test_attribs_cast;
select 'attr'||t1.attr# as different, t1.id as id_1, t2.id as id_2, t1.name,
t2.name as name_dup, t1.attr as val1, t2.attr as val2
from test_attribs_unpivot t1 join test_attribs_unpivot t2 on(
t1.id<>t2.id and
t1.name = replace(t2.name, 'DUP_', '') and
t1.attr#=t2.attr# )
where t1.attr<>t2.attr or (t1.attr is null and t2.attr is not null)
or (t1.attr is not null and t2.attr is null);with t as ( select id, name, attr#,
decode(attr#,1,attr1,2,attr2,3,attr3,4,attr4,attr5) attr
from test_attribs
cross join (select rownum attr# from dual connect by rownum<=5))
select 'attr'||t1.attr# as different, t1.id as id_1, t2.id as id_2, t1.name,
t2.name as name_dup, t1.attr as val1, t2.attr as val2
from t t1 join test_attribs_unpivot t2
on( t1.id<>t2.id and
t1.name = replace(t2.name, 'DUP_', '') and
t1.attr#=t2.attr# )
where t1.attr<>t2.attr or (t1.attr is null and t2.attr is not null)
or (t1.attr is not null and t2.attr is null);Context
StackExchange Database Administrators Q#2710, answer score: 13
Revisions (0)
No revisions yet.