HiveBrain v1.2.0
Get Started
← Back to all entries
patternModerate

What happens to a query that is performed on a view?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
performedwhatqueryviewthathappens

Problem

What optimizations and/or changes occur to a query that is performed on a view? (It would be great to also know if each of the major DBs - Oracle, SQL Server, Postgres, MySQL/MariaDB - handle this differently).

For example, let's say that I have the following view:

CREATE VIEW my_view AS
SELECT
id,
val0,
val1,
val2
FROM my_table
WHERE val0 = 42;


...and I want to perform the following query on that view:

SELECT
id,
val1
FROM my_view
WHERE val2 = 'Fish fingers and custard';


...does the database optimize the query or are two selects essentially performed? I.E. is it essentially the same as performing potential query 1 or 2 below or neither?

Potential Query 1:

SELECT
id,
val1
FROM (
SELECT
id,
val0,
val1,
val2
FROM my_table
WHERE val0 = 42
)
WHERE val2 = 'Fish fingers and custard';


Potential Query 2:

SELECT
id,
val1
FROM my_table
WHERE val0 = 42
AND val2 = 'Fish fingers and custard';

Solution

My answer will focus almost exclusively on SQL Server just because I'm going to give a fairly detailed answer and I don't have the same level of expertise in other platforms.

First it's important to realize that the query optimizer doesn't directly work against the SQL you write. It gets transformed into an internal format before optimization. What you have listed for potential query 1 and query 2 is pretty much the same thing, except for a view subtle differences. A similar question about the difference between query 1 and query 2 was asked and answered here. If you'd like to learn more about the internal format SQL Server uses, you can go through an excellent series of blog posts by Paul White. However, most of the time it's sufficient to just compare the query plans of two queries that you suspect might be optimized in the same way.

There are a few ways that using a view might improve performance:

-
It's possible to define views which are implemented as physical structures on the database. In SQL Server these are called indexed views. In Oracle these are called materialized views. In Oracle it's possible for a query not written against a materialized view to still use the materialized view. Further discussion is outside the scope of this answer.

-
Sometimes the same SQL query needs to run on multiple RDBMS platforms. With a view we can use syntax that is unique to each platform but the same query is sent to the databases. Without a view we might have to use user defined functions which can be bad for performance.

-
Sometimes people put really clever code in views. If it's better than what you would have written you can improve performance by using the view.

In general, a query written against a view will be as efficient or less efficient than a query written directly against the base tables. This is because a view definition often contains extra columns and joins that might not be needed for the specific question a query against the view is asking. To get good performance against a complex view we're hoping that three things happen:

-
Column elimination. If a column is presented in a view but not mentioned in the query against the view then the value shouldn't be calculated.

-
Join elimination. If a table is used in a view that could be safely eliminated without changing the results then it should be eliminated. Sometimes this could happen if the optimizer had more information. For example a foreign key might not be declared in the database. Other times the rule to implement join elimination might not cover a certain scenario. For example, in Oracle join elimination cannot happen for a multi-column join but it can in SQL Server.

-
Predicate pushdown. If I add a filter to the view and there's an index on the underlying column then I should be able to use that index. I believe that this is what your example is hinting at. Even without an index I still want filters to be pushed down as far as possible into the plan to avoid unnecessary work.

In my experience these rules are implemented pretty well by the query optimizer which is of course a good thing, but it can be bad for SE demos. However, if we write sneaky code we can end up with examples that show all of the above optimizations failing. This is because the rules that implement the optimizations are not designed to cover every possible scenario.

First I'll create some simple sample data. The data itself isn't that important, but the table definitions are.

DROP TABLE IF EXISTS dbo.BASE_TABLE;
CREATE TABLE dbo.BASE_TABLE (
ID INT NOT NULL,
ID2 INT NOT NULL,
FILLER VARCHAR(50), 
CONSTRAINT BASE_TABLE_ID CHECK (ID > 0),
PRIMARY KEY (ID)
);

INSERT INTO dbo.BASE_TABLE WITH (TABLOCK)
SELECT TOP (1000000) 
  ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 50)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

DROP TABLE IF EXISTS dbo.EXTRA_TABLE;
CREATE TABLE dbo.EXTRA_TABLE (
ID INT NOT NULL,
ID2 INT NOT NULL,
FILLER VARCHAR(50),
PRIMARY KEY (ID, ID2)
);

INSERT INTO dbo.EXTRA_TABLE WITH (TABLOCK)
SELECT TOP (1000000) 
  ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 50)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

DROP TABLE IF EXISTS dbo.EMPTY_TABLE;
CREATE TABLE dbo.EMPTY_TABLE(
ID INT NOT NULL,
PRIMARY KEY (ID)
);

DROP TABLE IF EXISTS dbo.EMPTY_CCI;
CREATE TABLE dbo.EMPTY_CCI (
ID INT NOT NULL
, INDEX CCI_EMPTY_CCI CLUSTERED COLUMNSTORE
);

GO

CREATE FUNCTION dbo.THE_BEST_FUNCTION () RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
RETURN NULL;
END;

GO


Here is my sneaky view definition:

```
CREATE VIEW dbo.SNEAKY_VIEW
AS
SELECT
ABS(t.ID) ID
, COUNT(*) OVER (PARTITION BY ABS(t.ID)) CNT
, dbo.THE_BEST_FUNCTION() FUNCTION_VALUE
FROM dbo.BASE_TABLE t
LEFT OUTER JOIN dbo.EXTRA_TABLE e ON CAST(t.ID AS VARCHAR(10)) = CAST(e.ID AS VARCHAR(10)) + '|' + CAST(e.ID2 AS VARCHAR(10))
LEFT OUTER JOIN dbo

Code Snippets

DROP TABLE IF EXISTS dbo.BASE_TABLE;
CREATE TABLE dbo.BASE_TABLE (
ID INT NOT NULL,
ID2 INT NOT NULL,
FILLER VARCHAR(50), 
CONSTRAINT BASE_TABLE_ID CHECK (ID > 0),
PRIMARY KEY (ID)
);

INSERT INTO dbo.BASE_TABLE WITH (TABLOCK)
SELECT TOP (1000000) 
  ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 50)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

DROP TABLE IF EXISTS dbo.EXTRA_TABLE;
CREATE TABLE dbo.EXTRA_TABLE (
ID INT NOT NULL,
ID2 INT NOT NULL,
FILLER VARCHAR(50),
PRIMARY KEY (ID, ID2)
);

INSERT INTO dbo.EXTRA_TABLE WITH (TABLOCK)
SELECT TOP (1000000) 
  ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 50)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

DROP TABLE IF EXISTS dbo.EMPTY_TABLE;
CREATE TABLE dbo.EMPTY_TABLE(
ID INT NOT NULL,
PRIMARY KEY (ID)
);

DROP TABLE IF EXISTS dbo.EMPTY_CCI;
CREATE TABLE dbo.EMPTY_CCI (
ID INT NOT NULL
, INDEX CCI_EMPTY_CCI CLUSTERED COLUMNSTORE
);

GO

CREATE FUNCTION dbo.THE_BEST_FUNCTION () RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
RETURN NULL;
END;

GO
CREATE VIEW dbo.SNEAKY_VIEW
AS
SELECT 
  ABS(t.ID) ID
, COUNT(*) OVER (PARTITION BY ABS(t.ID)) CNT
, dbo.THE_BEST_FUNCTION() FUNCTION_VALUE
FROM dbo.BASE_TABLE t
LEFT OUTER JOIN dbo.EXTRA_TABLE e ON CAST(t.ID AS VARCHAR(10)) = CAST(e.ID AS VARCHAR(10)) + '|' + CAST(e.ID2 AS VARCHAR(10))
LEFT OUTER JOIN dbo.EMPTY_CCI ON 1 = 0
LEFT OUTER JOIN dbo.EMPTY_TABLE e2 ON t.ID = e2.ID;

GO
SELECT 
  ID
, 1 CNT
, NULL
FROM dbo.BASE_TABLE t;
SELECT 
  ID
, 1 CNT
, NULL
FROM dbo.BASE_TABLE t;
SELECT *
FROM (
    SELECT 
      ABS(t.ID) ID
    , COUNT(*) OVER (PARTITION BY ABS(t.ID)) CNT
    FROM dbo.BASE_TABLE t
    LEFT OUTER JOIN dbo.EXTRA_TABLE e ON CAST(t.ID AS VARCHAR(10)) = CAST(e.ID AS VARCHAR(10)) + '|' + CAST(e.ID2 AS VARCHAR(10))
    LEFT OUTER JOIN dbo.EMPTY_CCI ON 1 = 0
    LEFT OUTER JOIN dbo.EMPTY_TABLE e2 ON t.ID = e2.ID
) derived_table;

Context

StackExchange Database Administrators Q#172208, answer score: 14

Revisions (0)

No revisions yet.