Tuesday, October 26, 2004

Database Joins

I've been trying to get the following styled like my original picture for a week and I've been very unhappy with the css so I'm posting in a slightly different view.

The discussion below may open a little can of worms amongst the database programmers that have been doing dba work long enough to develop their own style and habits. As with so much in the computing world, there are always multiple ways to accomplish the same goal.

Databases are basically repositories of information. Much like you can group all the women in your household into one room and all the men into another room you can hold information in database in rooms called "tables". These tables can then be related through some commonality. For instance, although all the women are in one room and all the men are in another room they are still related by being in the same family. If we had some guests arrive at the house we could still sort the women into the one room and the men into the other room. Now we can begin to make queries on the data. For instance, we could ask all women to come into the dining room "SELECT women from room1" But what if we wanted the family to help prepare for the dinner table for the guests. We would instead ask all women and men that were related to the family to come into the dining room. This would leave our guests in their respective rooms. The query would look like "SELECT ra.women, rb.men FROM room1 ra, room2 rb WHERE relation = 'family'"

Below is an example of an inner join and a left join and some different syntax.







If you want all data from A that has cooresponding data in B you are discussing a join. For instance if A had the columns (userid, name, and age) and B had the columns (userid, favoritecolor) you could get all a users favorite colors with a join such as:


SELECT A.name, B.favoritecolor
FROM A
INNER JOIN B
ON A.userid = B.userid
WHERE A.userid = 36 (if you wanted to limit it to user 36)




The inner join as shown above would be all data where the userid existed in both A and B. The notation shown above is the long form notation. This same query can be written with shorthand as:


SELECT A.name, B.favoritecolor
FROM A,B
WHERE A.userid = 36


OR
A longer form but still shorthand:


SELECT A.name, B.favoritecolor
FROM A
JOIN B
ON A.userid = B.userid
WHERE A.userid = 36




A left outer join combines ALL data from A with ONLY the data from B where there is a match as how in green to the left. It is written as:


SELECT A.name, B.favoritecolor
FROM A
LEFT OUTER JOIN B
ON A.userid = B.userid



So you would get a result set of EVERY user's name in A regardless of whether or not there was corresponding data in B. Where there is not corresponding data, favoritecolor would be populated with NULL.