Tuesday, October 21, 2008

Oracle Outer Joins, non-commutative join conditions

In this post I am going to show how Outer Join Conditions do not behave as expected and in other words are not commutative. Check the following 2 sample tables:

Script:


CREATE TABLE TABLE1
(
ID NUMBER(10),
NAME VARCHAR2(50 BYTE)
)

CREATE UNIQUE INDEX PK_TABLE1 ON TABLE1(ID);
ALTER TABLE TABLE1 ADD (CONSTRAINT PK_TABLE1 PRIMARY KEY (ID));

-----------------
CREATE TABLE TABLE2
(
ID NUMBER(10),
TABLE1_ID NUMBER(10),
NAME VARCHAR2(50 BYTE),
CREATE_DATE DATE
)

CREATE UNIQUE INDEX PK_TABLE2 ON TABLE2(ID);
ALTER TABLE TABLE2 ADD (CONSTRAINT PK_TABLE2 PRIMARY KEY (ID));

Here is some data to insert:


INSERT INTO TABLE1 ( ID, NAME ) VALUES ( 1, 'first');
INSERT INTO TABLE1 ( ID, NAME ) VALUES ( 2, 'second');
INSERT INTO TABLE1 ( ID, NAME ) VALUES ( 3, 'third');
--------------------------------
INSERT INTO TABLE2 ( ID, TABLE1_ID, NAME, CREATE_DATE ) VALUES (
1, 1, 'kaakaa', TO_Date( '10/21/2008', 'MM/DD/YYYY'));
INSERT INTO TABLE2 ( ID, TABLE1_ID, NAME, CREATE_DATE ) VALUES (
2, 1, 'kookoo', TO_Date( '10/16/2008', 'MM/DD/YYYY'));
INSERT INTO TABLE2 ( ID, TABLE1_ID, NAME, CREATE_DATE ) VALUES (
3, 2, 'keekee', TO_Date( '10/30/2008', 'MM/DD/YYYY'));

Now for the interesting part. Here are two select statements:

Select *
From table1 Left Join
table2 On
(table2.Table1_ID = table1.ID and SysDate <= Create_Date)


Select *
From table1 Left Join
table2 On
(table2.Table1_ID = table1.ID and Create_Date >= SysDate)

The following can be noticed from the above 2 select statements:
  1. They are basically the same with the only difference of replacing
    SysDate <= Create_Date

    with
    Create_Date >= SysDate

  2. Left Join is used instead of a usual Inner Join. This would be used when we need to get all the records on the left table regardless of whether the Join Condition succeeds or fails, in case it fails then the values from the right table are null.
  3. The Join Condition (On) is a composite condition
Now the expected behavior is for both statements to return the same result set, yet that does not happen.
Results:

ID NAME    ID_1 TABLE_1_ID NAME_1 CREATE_DATE
-- ------- ---- ---------- ------ -----------
 1 first      1          1 kaakaa 10/21/2008
 1 first      2          1 kookoo 10/16/2008
 2 second     3          2 keekee 10/30/2008
 3 third
 4 rows selected

ID NAME    ID_1 TABLE_1_ID NAME_1 CREATE_DATE
-- ------- ---- ---------- ------ -----------
 2 second     3          2 keekee 10/30/2008
 1 first
 3 third
 3 rows selected
Now we can see how Outer Join Conditions (Left Join is implicitly an Outer Join) are not commutative in the sense that "a op b" is not the same as "b op' a" (where op' is the inverse operator of op). The expected behavior happens when "a" (Left Hand Side) is the column name. Also notice that using between for dates for example poses a similar problem where a condition like "sysdate between date1 and date2" is not the same as "date1 <= SysDate and date2 >= SysDate". To get the expected behavior we have to use the second condition.

kick it on DotNetKicks.com

0 comments:

Post a Comment