patternsqlModerate
What does the output of a JOIN statement look like?
Viewed 0 times
thewhatstatementlookoutputjoinlikedoes
Problem
I've been wanting to use joins for a while, but I'm having trouble visualizing the output so I know how to put it to use.
Let's say I have 2 tables:
If my application is to run an SQL query to get a user's profile data, how would I use a join to get the city associated with a user's record, and how would the outputted record appear?
Let's say I have 2 tables:
CREATE TABLE Cities (
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
city tinyblob
);
CREATE TABLE Users (
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
username TINYBLOB,
city INT UNSIGNED,
FOREIGN KEY (city) REFERENCES Cities (id)
);If my application is to run an SQL query to get a user's profile data, how would I use a join to get the city associated with a user's record, and how would the outputted record appear?
Solution
Every result from a SQL operation is functionally a new table, whether or not it is stored on disk or in memory
The function of a join is to "Join" two tables together into a synthetic third table that (usually) only exists in memory during the time it is output to the application.
The reason to use a join is to reduce Data Anomalies, by insuring that data appears in one and only one place in the database.
Consider the following:
Here, whenever the government changes zip-codes (which happens entirely too frequently for database developers' tastes * the persons table will need to be updated such that all old zipcodes correctly belong to the correct city and new zip-code.
In this instance, the same zip code is in multiple tuples of the table, and if the table is editable manually, will be subject to severe update anomalies as new city/zip-code combinations are entered without the database as a whole being updated.
By having three tables instead, we get the same data, but with increased flexibility and reliability (at the cost of some performance, though questions of "appropriate denormalization" are a topic for another series of answers.)
SELECT Fname, Lname, AddressLine1, City, State, ZipCode
FROM Person
INNER JOIN Address USING (PersonID)
INNER JOIN City USING (CityID)
* More than never, especially in the case of moving the boundaries of zipcodes because of gerrymandering
The function of a join is to "Join" two tables together into a synthetic third table that (usually) only exists in memory during the time it is output to the application.
The reason to use a join is to reduce Data Anomalies, by insuring that data appears in one and only one place in the database.
Consider the following:
Person ( ID Integer Primary Key, FName String, LName String, Address String, City String, ZipCode String)Here, whenever the government changes zip-codes (which happens entirely too frequently for database developers' tastes * the persons table will need to be updated such that all old zipcodes correctly belong to the correct city and new zip-code.
In this instance, the same zip code is in multiple tuples of the table, and if the table is editable manually, will be subject to severe update anomalies as new city/zip-code combinations are entered without the database as a whole being updated.
By having three tables instead, we get the same data, but with increased flexibility and reliability (at the cost of some performance, though questions of "appropriate denormalization" are a topic for another series of answers.)
Person ( PersonID Integer Primary Key, FName String, LName String)
City ( CityID Integer Primary Key, City String, State String, ZipCode String )
Address ( AddressID Integer Primary Key, AddressLine1 String, CityID Integer References City, PersonID Integer References Person)
And we can get a single "table" out of the above through the following query:
SELECT Fname, Lname, AddressLine1, City, State, ZipCode
FROM Person
INNER JOIN Address USING (PersonID)
INNER JOIN City USING (CityID)
This allows us to update the individual tables whenever something specific about them changes, without corrupting the rest of the associated data.
An INNER JOIN is the default type of join. Roughly speaking it notes: "combine tuples where this record from table a matches this record from table b" The other types of joins are OUTER JOINS and what's known as a CARTESIAN PRODUCT`. An outer join is "Take everything from the left or right table (as specified) and where the other table matches, join it. Where it doesn't match, fill it with nulls" And a cartesian product (without getting into the relational algebra is: Take every possible combination of tuples in both tables and output them into one table, without matching anything.* More than never, especially in the case of moving the boundaries of zipcodes because of gerrymandering
Context
StackExchange Database Administrators Q#93, answer score: 12
Revisions (0)
No revisions yet.