Joins are the foundation in SQL. The JOIN operation is essentially used to combine related tuples from two or more relations into a single type. In simple words, it is used to combine data or rows from two or more tables based on the known common attribute that has similar values.
You might be aware of the Cartesian product that returns a set of all records from two or more tables. JOIN performs the same functionality except for the fact that a Cartesian product returns all the possible combinations of relations whereas a join returns only those combinations that satisfy the specified conditions. So, Join first produces a Cartesian product followed by selection based on the mentioned filter condition.
You can continue reading to learn in-depth about all the types of join in SQL.
Types of JOINS are
1.CROSS Join
2. INNER Join
Inner join is further classified into
- Theta Join
- Equi Join
- Non-Equi Join
- Natural Join
3. OUTER Join
Outer join is further classified into
- Left Outer Join
- Right Outer Join
- Full Outer Join
4. SELF Join
5. SEMI Join
6. ANTI Join
Note:
Since Joins is one of the important topics in SQL, it requires you to have hands-on practice on the various examples that we have provided. For Joins, we used two tables (students and games). Scroll down to the end of the page to find queries to create these tables and to insert the data. You can simply execute those set of statements in your database to get started.
I. CROSS or CARTESIAN Join
Cross Join is essentially used to generate all possible combinations of the rows from two different (mostly unrelated) tables. In other words, it joins every row of first table with every row of second table thereby producing a Cartesian product. However, note that in practice it is not widely used.
Note: Cartesian product: A multiplication operation of two sets that produces a result consisting of all the possible combinations. Say, there are two sets A and B whose elements are {1, 2, 3} and {4, 5} respectively. A Cartesian product on the above mentioned sets would produce result as below. AxB ={(1,4), (1,5), (2,4), (2,5), (3,4), (3,5)}
Basic Syntax
SELECT table1.column1, table2.column2, table2.column1 ..
FROM
table1 CROSSJOIN table2;
Alternatively, you can also use the below query.
SELECT table1.column1, table2.column2, table2.column1 ..
FROM
table1, table2;
Here,
table1 is the first table
table2 is the second table
Query Example
Consider a scenario of students participating in different games. Look at the small section of tables – students (with 3 records) and games (with 2 records) depicted below. The CROSS join has to produce six (3*2) records as a result.
SELECT * FROM students;
SELECT * FROM games;
You may execute any one of the queries below
SELECT students.studentName, students.playerId, games.gameId, games.nameOfGame
FROM students, games;
OR
SELECT students.studentName, students.playerId, games.gameId, games.nameOfGame
FROM students
CROSS JOIN games;
Query Output
Look at the result below, it has combined every row of students table with every row of games table.
II. INNER Join
Inner Join is the most widely used join which returns the rows from both the tables provided they have matched values (value of a column in table 1 is equal to the value of a column in table 2).
It is a default type of join and hence INNER keyword is optional. The result set will have all those rows from both the tables which satisfy the mentioned condition, i.e. the common field’s value has to be equal.
Have a quick look at the Venn diagram of the Inner join shown below. This will already give you an idea of which rows could be returned as a result of a JOIN. It is similar to an intersection that returns the rows common to both the tables.
Further, they are divided into
a) Theta Join
Theta join combines or merges rows from two tables depending upon the theta condition specified. Theta condition can be pretty much made up of any of the comparison operators (<, >, <=, >=, =, <>). It simply allows all arbitrary comparison operations.
It can be represented by A ⋈θ B.
Basic Syntax
SELECT table1.column1, table2.column2, table2.column1, ..
FROM table1
INNER JOIN table2
ON table1.column [comparison operator] table2.column;
Here,
table1 is the first table
table2 is the second table
comparison operator can by any of the operators such as >, <, >=, <=, etc.
Query example
Look at the small section of tables to understand the Theta join – students (with 3 records) and games (with 2 records) shown below.
SELECT * FROM students;
SELECT * FROM games;
Let us execute the query below
SELECT students.studentName, students.playerId, games.gameId, games.nameOfGame
FROM students INNER JOIN games
ON students.playerId < games.gameId;
Query Output
Combinations are generated such that playerId of students is always less than that of gameId of games.
An Equi join is a Theta join which uses an equality operator (=). Whenever an equivalence condition is found in a join, it is said to be an Equi join.
c) Non-Equi Join:
As the name itself signifies, non-equi join is opposite of the equijoin, i.e. when you use any other operator except equality operator.
Both equi join and non-equi join are a subset of Theta join. However, there are some differences. The first difference between the two is obviously regarding the equality operator. The second difference is that in the case of non equi join, databases are not optimized whereas for equi join databases are optimized.
d) Natural Join:
A natural join is an Equi join based upon the common column /attribute of the same type in both the tables. It removes all the duplicates from the result set, unlike all other joins. It is unlike the Cartesian product, we can implement natural join if and only if there is at least one common attribute with the same domain in both the relations.
Basic Syntax
SELECT table1.column1, table2.column2, table2.column1, ..
FROM table1
INNER JOIN table2
ON table1.common_column = table2.common_column;
Here,
table1 is first table
table2 is second table
common_column is the attribute common to both the tables
Query Example
Time for hand on! Refer to the sample tables below.
SELECT * FROM students;
Have a quick look at the above tables where you can verify playerId of students table and gameId of games table are having similar values and datatype. Therefore, we equate them signifying the common field.
Execute the query below.
SELECT students.studentName, students.playerId, games.gameId, games.nameOfGame
FROM students
INNER JOIN games
ON students.playerId = games.gameId;
Query Output
Check out the below result set produced. The rows of the students table with playerId 1, 2 are missing in the result set due to the fact that there are no rows with gameId 1, 2 in the games table. Likewise, the rows of games table with gameId 9, 10 are also not retained in the result set because we don’t have any rows with playerId 9, 10 in the students tables. Hence, the takeaway message when it comes to inner join is that only those rows are returned that have the matching values in both the tables.
As we learned previously that Inner join can only produce the result set consisting of matched rows from both the tables. But what if we desire to have all the records in the result despite the fact that there are no matching rows? And that is where Outer Join comes to our rescue. Outer Join does not really require the two tables to have a matching row. Even though, if there is no matching record found in one or both tables, it does not matter, we can still retain all the rows.
For a better understanding of OUTER joins, we have given query examples and their corresponding results below. For outer join (left, right, outer) examples, refer the sample tables – students and games.
SELECT * FROM students;
Further, they are classified into
a) Left Outer Join
Left Join returns all the rows of the table on the left side of the join and the subset of rows of the table on the right side of join which have the matched values. Null gets inserted for all those rows of the left table for which there is no matching row in the right table.
Have a quick look at the Venn diagram of the Left outer join shown below.
SELECT table1.column1, table2.column2, table2.column1, ..
FROM table1
LEFT OUTER JOIN table2
ON table1.common_column = table2.common_column;
Here,
table1 is first table
table2 is second table
common_column is the common field to both the tables.
Query Example
SELECT students.studentName, students.playerId, games.nameOfGame
FROM students
LEFT OUTER JOIN
games ON students.playerId = games.gameId;
Notice that the playerId of students table has the same type of values as that of gameId of games table. Hence we choose these columns as common columns that can be equated as shown above.
Query Output
As you can see below, the output consists of all the records from the left table (students) and only matched rows from the right table (games). Take into consideration that the games such as FOUR SQUARE and CHARADES are not included in the output. The reason being they have an ID of 9 and 10 respectively for which no match is found in students table.
Right Join returns all the rows of the table on the right side of the join and the subset of rows of the table on the left side of join which have the matched values. Null gets inserted for all those rows of the right table for which there is no matching row in the left table.
Have a quick look at the Venn diagram of the Right outer join shown below.
Basic Syntax
SELECT table1.column1, table2.column2, table2.column1, ..
FROM table1
RIGHT OUTER JOIN table2
ON table1.common_column = table2.common_column;
Here,
table1 is first table
table2 is second table
common_column is the common field to both the tables.
Query Example
SELECT students.studentName, students.playerId, games.nameOfGame
FROM students
RIGHT OUTER JOIN
games ON students.playerId = games.gameId;
Query Output
Notice that all the rows from the right table (games) are returned and correspondingly null values are inserted into rows of students table for which no match is found. The rows with playerId 1 and 2 are missing in the result since we do not have any gameId with value 1 or 2 in the games table.
Full Outer Join simply returns all the rows of the table on the left side of the join and all rows of the table on the right side of join. It is indeed a combination of left and right outer join. Null values get inserted for all those rows of the left /right table where no matching rows are found on the other side.
Have a quick look at the Venn diagram of the Full outer join shown below.
SELECT table1.column1, table2.column2, table2.column1, ..
FROM table1
FULL OUTER JOIN table2
ON table1.common_column = table2.common_column;
Here,
table1 is first table
table2 is second table
common_column is the common field to both the tables.
Query Example
SELECT students.studentName, students.playerId, games.nameOfGame
FROM students
FULL OUTER JOIN
games ON students.playerId = games.gameId;
Query Output
Have a look at the result set below. It includes all the rows from both students and games. However, if no match is found in the other table, null values are inserted.
SELF Join is simply defined as a join where a table is joined to itself. It can also be said that it is a join of two copies of tables. However, we may define aliases for the two same tables to make a differentiation, every row of the table is joined with itself and all other rows depending upon the additional conditions that can be defined in the where clause.
Basic Syntax
SELECT a.column1, b.column2, a.column2
FROM table1 a, table1 b
WHERE condition;
Here,
table1 is just one table
a and b are the aliases to differentiate two copies of table1
condition can be any filter condition in the where clause.
Query Example
SELECT a.gameId, b.typeOfGame, b.nameOfGame
FROM games a, games b
WHERE a.gameId < b.gameId;
Query Output
Here we have come to an end of all basic joins. However, there are other two joins namely Semi and Anti join. Let us see examples for the same. Note that the below queries were executed in the Oracle database.
V. SEMI Join
A Semi join is a join between two tables where the records from the first table are returned based on the presence of the matching rows in the second table. The main difference to note between inner join and semi-join is that a semi-join only returns each row in the first table only once irrespective of how many matches are present in the second table. The operator EXISTS or IN (=ANY can also be used since it is equivalent to IN) can be used when implementing a semi-join.
You may refer to the tables below for both semi and anti join.
SELECT * FROM students;
Semi join using ‘IN’:
SELECT students.studentName, students.playerId
FROM students WHERE playerId IN (SELECT games.gameId FROM games);
Semi join using ‘EXISTS’
SELECT students.studentName, students.playerId
FROM students WHERE EXISTS (SELECT null FROM games WHERE students.playerId = games.gameId);
Query Output
At first, the subquery is executed, checks if those rows are in the first table (in our case students). All those rows from students are returned if they are present in the second table – games.
The query output for both the queries using IN and EXISTS operator will be same as shown below. As you can see, the rows with playerId 1 and 2 are missing in the result set because they are not present in the games table.
An anti join can be simply defined as a join between two tables where the records from the first table are returned based on the absence of the matching rows in the second table. It works just opposite to semi-join. The operator NOT EXISTS or NOT IN can be used to implement anti join.
You may refer to the same sample tables – students and games shown previously in semi join.
Query Example
Anti join using ‘NOT IN’:
SELECT students.studentName, students.playerId
FROM students WHERE playerId NOT IN (SELECT games.gameId FROM games);
Anti join using ‘NOT EXISTS’
SELECT students.studentName, students.playerId
FROM students WHERE NOT EXISTS (SELECT null FROM games where students.playerId = games.gameId);
Query Output
At first, the subquery is executed, checks if those rows are not in the first table (in our case students). All those rows from students are returned if they are absent in the second table – games.
The query output for both the queries using NOT IN and NOT EXISTS operator will be same as shown below.
Notice that the rows of students table with playerId 1 and 2 are the only rows returned in the result. All other rows having playerId such as 3, 4, 5, 6, 7, 8 from students table are not returned because these IDs are present as gameId in games table and thus are discarded.
Queries to create the test data for both ‘students’ and ‘games’ table
CREATE TABLE students
(
studentId NUMBER,
studentName VARCHAR2(20),
playerId NUMBER,
phoneNumber NUMBER,
email VARCHAR2(20)
);
CREATE TABLE games
(
gameId NUMBER,
typeOfGame VARCHAR2(10),
nameOfGame VARCHAR2(20)
);
INSERT INTO students (studentId, studentName, playerId, phoneNumber, email) VALUES (100, ‘ADAM SMITH’, 1, 7748972878, ‘asmith@yahoo.com’);
INSERT INTO students (studentId, studentName, playerId, phoneNumber, email) VALUES (101, ‘JOSEPH STALIN’, 2, 8974873738, ‘stalin@hotmail.com’);
INSERT INTO students (studentId, studentName, playerId, phoneNumber, email) VALUES (102, ‘NINA DOBREV’, 3, 9734892748, ‘ndobr@yahoo.com’);
INSERT INTO students (studentId, studentName, playerId, phoneNumber, email) VALUES (103, ‘ANTHONY’, 4, 8374893787, ‘antho@gmail.com’);
INSERT INTO students (studentId, studentName, playerId, phoneNumber, email) VALUES (104, ‘GRACE FORBES’, 5, 9734873434, ‘gforbes@yahoo.com’);
INSERT INTO students (studentId, studentName, playerId, phoneNumber, email) VALUES (105, ‘ROSS BUTLER’, 6, 7384288274, ‘rossbutler@gmail.com’);
INSERT INTO students (studentId, studentName, playerId, phoneNumber, email) VALUES (106, ‘JOHN GILLBERT’, 7, 9748427482, ‘jogillbert@yahoo.com’);
INSERT INTO students (studentId, studentName, playerId, phoneNumber, email) VALUES (107, ‘NINA DOBREV’, 8, 9734892748, ‘ndobr@yahoo.com’);
INSERT INTO games (gameId, typeOfGame, nameOfGame) VALUES (3, ‘INDOOR’, ‘CHESS’);
INSERT INTO games (gameId, typeOfGame, nameOfGame) VALUES (4, ‘OUTDOOR’, ‘CRICKET’);
INSERT INTO games (gameId, typeOfGame, nameOfGame) VALUES (5, ‘OUTDOOR’, ‘BADMINTON’);
INSERT INTO games (gameId, typeOfGame, nameOfGame) VALUES (6, ‘OUTDOOR’, ‘BASKETBALL’);
INSERT INTO games (gameId, typeOfGame, nameOfGame) VALUES (7, ‘OUTDOOR’, ‘FOOTBALL’);
INSERT INTO games (gameId, typeOfGame, nameOfGame) VALUES (8, ‘INDOOR’, ‘CARROM’);
INSERT INTO games (gameId, typeOfGame, nameOfGame) VALUES (9, ‘OUTDOOR’, ‘FOUR SQUARE’);
INSERT INTO games (gameId, typeOfGame, nameOfGame) VALUES (10, ‘INDOOR’, ‘CHARADES’);
this is a right website and this is a perfect lines
Yes I’m also like it.
Yes this is right website for joins
Very nice explanation