HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

PostgreSQL Common Table Expression: Ambiguous Column

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlexpressioncolumnambiguouscommontable

Problem

I have a Common Table Expression which joins two tables with some column names in common:

CREATE TABLE first(
    id serial PRIMARY KEY,
    data VARCHAR,
    date DATE
);
CREATE TABLE second(
    id serial PRIMARY KEY,
    key INT REFERENCES first(id),
    data VARCHAR,
    date DATE
);

WITH cte AS
    (SELECT * FROM first JOIN second ON first.id=second.key)
SELECT data FROM cte;


This, of course, results in an error because field data is ambiguous.

I know that it would be possible to extract the ambiguous columns with the CTE and give them unique names.

The question is, is it possible resolve this ambiguity from the main SELECT statement? Alternatively, is there some notation I can use within the CTE to make this impossible without actually creating new names?

Edit

A number of comments suggest using the wild card is not appropriate. I only use it here to simplify the query — never in a serious query. Here is another version:

WITH cte AS
    (SELECT f.id as fid, s.id as sid, f.data, s.data
        FROM first f JOIN second s ON f.id=s.key
    )
SELECT data FROM cte;


I know that I can resolve the issue by aliasing in the CTE:

WITH cte AS
    (SELECT f.id as fid, s.id as sid, f.data as fdata, s.data as sdata
        FROM first f JOIN second s ON f.id=s.key
    )
SELECT fdata FROM cte;

Solution

While there is no real solution that matches your requirements, there are a few things to consider about the requirements.

First, let's see the hint that an error raised about a misnamed column produces:

WITH ts AS (
    SELECT * FROM t1 JOIN t2 USING (t_id)
) SELECT col2 FROM ts;

ERROR:  column "col2" does not exist
LINE 1: ...AS (SELECT * FROM t1 JOIN t2 USING (t_id)) SELECT col2 FROM ...
                                                             ^
HINT:  Perhaps you meant to reference the column "ts.col1" or the column "ts.col1".


Even the database suggests to choose from two indentically named columns - which, as you already guessed, impossible. (Interestingly, you cannot do such a thing in a definition of a view, not to mention tables.)

So, about the requirements:

It is usually wise not to use * in the first place. Apart from being faster to type, it does not bring too much. When writing the main SELECT, we usually have an idea which columns we need - just add those inside the CTE. This might be in some cases even beneficial for the performance of the query (rendering the size of the data set smaller).

When you want to make it simpler to type, you could try something like

SELECT f.*, s.data AS s_data 
FROM first AS f JOIN second AS s ON f.id = s.key


In my practice, it is even better to name columns with a prefix derived from the table name:

CREATE TABLE first(
    f_id serial PRIMARY KEY,
    f_data VARCHAR,
    f_date DATE
);

CREATE TABLE second(
    s_id serial PRIMARY KEY,
    s_first_id INT REFERENCES first(id), -- an exception, for making more clear 
                                         -- what the column refers to
    s_data VARCHAR,
    s_date DATE
);


This way, even in big databases, it is very rare to have columns with the same name. And this completely removes the problem you are fighting with.

Code Snippets

WITH ts AS (
    SELECT * FROM t1 JOIN t2 USING (t_id)
) SELECT col2 FROM ts;

ERROR:  column "col2" does not exist
LINE 1: ...AS (SELECT * FROM t1 JOIN t2 USING (t_id)) SELECT col2 FROM ...
                                                             ^
HINT:  Perhaps you meant to reference the column "ts.col1" or the column "ts.col1".
SELECT f.*, s.data AS s_data 
FROM first AS f JOIN second AS s ON f.id = s.key
CREATE TABLE first(
    f_id serial PRIMARY KEY,
    f_data VARCHAR,
    f_date DATE
);

CREATE TABLE second(
    s_id serial PRIMARY KEY,
    s_first_id INT REFERENCES first(id), -- an exception, for making more clear 
                                         -- what the column refers to
    s_data VARCHAR,
    s_date DATE
);

Context

StackExchange Database Administrators Q#153146, answer score: 3

Revisions (0)

No revisions yet.