What We Are Learn:
ToggleSELECT department,
COUNT(*) AS dept_count
FROM employee_table
WHERE department IN ('Testing', 'Development')
GROUP BY department
ORDER BY department;
SELECT salary
FROM (
SELECT DISTINCT salary,
DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
FROM employee
) t
WHERE rnk = 4;
✅ Handles duplicate salaries correctly
✅ Very strong interview answer
SELECT MAX(salary) FROM employee where salary < (SELECT MAX(salary) FROM employee);
SELECT salary
FROM (
SELECT salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS rn
FROM employee
) t
WHERE rn = 2;
SELECT salary
FROM (
SELECT salary,DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
FROM employees
) t
WHERE rnk = 3;
Here’s a clear, interview‑style explanation, step by step, of this query:
SELECT salary
FROM (
SELECT salary,
DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
FROM employees
) t
WHERE rnk = 3;
✅ What this query does (High level)
👉 It finds the 3rd highest salary from the employees table, treating duplicate salaries as one rank.
✅ Step‑by‑Step Explanation
🔹 Step 1: Inner Query (Subquery)
SELECT salary,
DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
FROM employees;
This part does the main work.
✅ ORDER BY salary DESC
- Sorts salaries from highest to lowest
Example salaries:
100000 90000 90000 80000 70000
✅ DENSE_RANK()
- Assigns a rank to each distinct salary
- No gaps in ranking, even if salaries repeat
Result of the inner query:
| salary | rnk |
|---|---|
| 100000 | 1 |
| 90000 | 2 |
| 90000 | 2 |
| 80000 | 3 |
| 70000 | 4 |
📌 Important:
- Both
90000values get rank 2 - Next salary
80000gets rank 3, not 4
This is why DENSE_RANK() is used instead of RANK().
🔹 Step 2: Outer Query
SELECT salary
FROM ( … ) t
WHERE rnk = 3;
- Filters only rows where
rnk = 3 - That corresponds to the 3rd highest distinct salary
✅ Output:
80000
✅ Why DENSE_RANK() is Important Here
| Function | Behavior with duplicates |
|---|---|
| ROW_NUMBER() | Treats every row uniquely |
| RANK() | Skips rank numbers |
| ✅ DENSE_RANK() | No gaps, best for Nth highest |
Example difference:
Using RANK():
100000 → 1 90000 → 2 90000 → 2 80000 → 4 ❌ (rank 3 skipped)
Using DENSE_RANK():
80000 → 3 ✅ correct
SELECT ename,
job,
salary / 2 AS half_term_salary
FROM emp;
SELECT deptno,
COUNT(*) AS employee_count
FROM emp
WHERE job <> 'PRESIDENT'
GROUP BY deptno;
SELECT COUNT(*) AS employee_count FROM emp WHERE job = 'QA';
SELECT e.ename,
e.deptno,
e.salary
FROM emp e
JOIN (
SELECT deptno, MAX(salary) AS max_salary
FROM emp
GROUP BY deptno
) m
ON e.deptno = m.deptno
AND e.salary = m.max_salary;
SELECT ename,
salary,
deptno
FROM (
SELECT ename,
salary,
deptno,
DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
FROM emp
) t
WHERE rnk <= 3;
SELECT COUNT(*) AS employee_count
FROM emp
WHERE job IN ('CLERK', 'MANAGER');
SELECT ename,
hiredate
FROM emp
WHERE hiredate < DATE '1990-06-30'
OR hiredate > DATE '1990-12-31';
Both RANK() and DENSE_RANK() are SQL window functions used to assign a ranking to rows based on a specified order (commonly salary).
The key difference is how they handle duplicate values.
✅ RANK()
- Assigns the same rank to rows with equal values
- Skips the next rank number after a tie
Example
Employee salaries (DESC order):
| Salary | RANK |
|---|---|
| 5000 | 1 |
| 4000 | 2 |
| 4000 | 2 |
| 3000 | 4 |
➡ Rank 3 is skipped because of the tie at rank 2.
Query example:
SELECT ename, salary,
RANK() OVER (ORDER BY salary DESC) AS rnk
FROM emp;
DENSE_RANK()
- Assigns the same rank to rows with equal values
- Does NOT skip the next rank number
- Ranking is continuous, no gaps.
Example:
| Salary | DENSE_RANK |
|---|---|
| 5000 | 1 |
| 4000 | 2 |
| 4000 | 2 |
| 3000 | 3 |
SELECT ename, salary,
DENSE_RANK() OVER (ORDER BY salary DESC) AS drnk
FROM emp;
- Use
DENSE_RANK()when the requirement is Top N salaries - Use
RANK()when gaps in ranking are acceptable or expected
SQL constraints are rules that you apply to table columns to make sure the data stored in the database is valid, accurate, and consistent. They help prevent incorrect or meaningless data from being inserted into tables.
The NOT NULL constraint ensures that a column cannot have an empty (NULL) value. If a column has this constraint, every record must contain a value for that column. It is commonly used for important fields like employee name or ID.
CREATE TABLE emp ( empno INT NOT NULL, ename VARCHAR(50) );
The UNIQUE constraint ensures that all values in a column are different. No two rows can have the same value in a column with a UNIQUE constraint. It is often used for fields like email ID, phone number, or username.
CREATE TABLE users ( email VARCHAR(100) UNIQUE );
The PRIMARY KEY constraint uniquely identifies each record in a table. A primary key cannot contain NULL values and must be unique. Each table can have only one primary key, and it is commonly used to identify rows, such as employee ID or account number.
CREATE TABLE emp ( empno INT PRIMARY KEY, ename VARCHAR(50) );
The FOREIGN KEY constraint is used to create a relationship between two tables. It ensures that the value in one table exists in another related table. This helps maintain referential integrity between tables, such as linking employees to departments.
CREATE TABLE emp ( empno INT PRIMARY KEY, deptno INT, FOREIGN KEY (deptno) REFERENCES dept(deptno) );
The CHECK constraint is used to limit the values that can be stored in a column based on a condition. For example, it can ensure that salary is greater than zero or age is above 18. If the condition is not satisfied, the data is rejected.
CREATE TABLE emp ( salary INT CHECK (salary > 0) );
The DEFAULT constraint provides a default value to a column when no value is specified during insertion. For example, if no status is given, it can automatically be set to “ACTIVE”.
CREATE TABLE emp ( status VARCHAR(20) DEFAULT 'ACTIVE' );
In simple terms, SQL constraints are used to protect the database from invalid data by enforcing rules like uniqueness, mandatory values, valid ranges, and proper relationships between tables.
A PRIMARY KEY is a constraint in SQL that is used to uniquely identify each record (row) in a table.
It ensures that no two rows have the same value in the primary key column and that the column cannot contain NULL values. Because of this, a primary key guarantees that every record in a table can be uniquely referenced.
A table can have only one primary key, but that primary key can consist of one column or multiple columns (called a composite primary key).
The primary key is usually defined when creating a table.
Example:
CREATE TABLE emp ( empno INT PRIMARY KEY, ename VARCHAR(50), salary INT );
In this example:
- empno is the primary key
- Each employee number must be unique
- empno cannot be NULL
A UNIQUE key (UNIQUE constraint) in SQL is used to ensure that all values in a column (or a set of columns) are different across the table.
It prevents duplicate values, but allows NULL values (the number of NULLs allowed depends on the database, but most databases allow one or more NULLs).
The UNIQUE key is generally used for columns like email, phone number, username, etc., where values must be distinct but not necessarily the primary identifier.
CREATE TABLE users ( user_id INT, email VARCHAR(100) UNIQUE );
In this example:
- email must be unique
- Two users cannot have the same email
- email can be NULL (unless NOT NULL is also specified)
A composite key is used when a single column is not enough to uniquely identify a row. Instead of one column, the combination of multiple columns acts as the unique identifier.
In some tables, duplicate values are allowed in individual columns, but the combination of those columns must be unique. In such cases, we use a composite key.
For example:
- One employee can work on multiple projects
- One project can have multiple employees
Here, neither empno nor project_id alone can uniquely identify a record, but together they can.
CREATE TABLE emp_project ( empno INT, project_id INT, PRIMARY KEY (empno, project_id) );
In this example:
- empno alone is not unique
- project_id alone is not unique
- The combination of empno + project_id must be unique
- Neither column can contain NULL values
A FOREIGN KEY is a constraint in SQL that is used to create a relationship between two tables and maintain referential integrity.
A foreign key is a column (or a group of columns) in one table that refers to the PRIMARY KEY (or UNIQUE key) of another table. The table that contains the foreign key is called the child table, and the table being referenced is called the parent table. A foreign key ensures that the value entered in the child table must already exist in the parent table, preventing invalid or orphan records.
The foreign key is usually defined when creating the child table.
Example:
CREATE TABLE dept ( deptno INT PRIMARY KEY, dname VARCHAR(50) );
CREATE TABLE emp ( empno INT PRIMARY KEY, ename VARCHAR(50), deptno INT, FOREIGN KEY (deptno) REFERENCES dept(deptno) );
In this example:
- dept is the parent table
- emp is the child table
- deptno in emp must match an existing deptno in the dept table
- You cannot insert an employee with a department number that does not exist in dept
A surrogate key is an artificially generated key that is used to uniquely identify a record in a table, instead of using a natural/business key.
It has no business meaning and is usually created by the system (for example, an auto‑increment number or sequence).
In many real‑world scenarios, business columns such as email, employee code, order number, or combination of columns may look unique, but over time they can:
- Change (email ID updated)
- Become duplicated
- Be reused
- Be very large or complex
- Contain business meaning that should not be used as a primary key
To avoid these problems, we introduce a surrogate key, which is a stable, numeric, system‑generated identifier that never changes.
CREATE TABLE customer ( customer_key INT PRIMARY KEY, customer_email VARCHAR(100), customer_name VARCHAR(50) );
Here:
customer_keyis the surrogate key- It uniquely identifies each row
customer_emailis a natural/business key, not used as the primary key
In most databases, the surrogate key is generated automatically:
customer_key INT GENERATED ALWAYS AS IDENTITY
When I have used a surrogate key (interview‑style answer)
I have used surrogate keys mainly in data warehouse and ETL projects.
For example:
- In a customer dimension table, customer email or customer ID coming from source systems was not reliable because it could change or be duplicated across systems.
- We generated a surrogate key for each customer record to uniquely identify customers internally.
- This surrogate key was then used as a foreign key in fact tables.
This helped in:
- Handling slowly changing dimensions (SCD)
- Improving join performance
- Avoiding dependency on changing business data
- Maintaining consistency across multiple source systems
Key characteristics (interview‑ready)
- System‑generated (no business meaning)
- Usually numeric
- Never changes
- Used as PRIMARY KEY
- Common in data warehousing
- Improves performance and stability
Surrogate key vs Natural key
- Natural key → comes from business data (email, SSN, emp code)
- Surrogate key → generated by system (1, 2, 3, 4…)
A surrogate key is a system‑generated unique identifier used as a primary key when business keys are unstable, complex, or unreliable, commonly used in data warehouse projects.
SQL JOINs are used to combine data from two or more tables based on a related column (usually a primary key and foreign key).
EMP table
| empno | ename | deptno |
|---|---|---|
| 1 | Arun | 10 |
| 2 | Banu | 20 |
| 3 | Chitra | 30 |
| 4 | David | NULL |
DEPT table
| deptno | dname |
|---|---|
| 10 | HR |
| 20 | IT |
| 40 | FINANCE |
1. INNER JOIN
An INNER JOIN returns only the records that have matching values in both tables. If there is no match, the row is not included in the result.
Example explanation:
It shows employees only if they belong to a valid department.
SQL command:
SELECT e.ename, d.dname FROM emp e INNER JOIN dept d ON e.deptno = d.deptno;
Result:
| ename | dname |
|---|---|
| Arun | HR |
| Banu | IT |
✅ Employees without a department and departments without employees are excluded.
2. LEFT JOIN (LEFT OUTER JOIN)
A LEFT JOIN returns all records from the left table and only the matching records from the right table. If there is no match, the result will contain NULL values for the right table columns.
Example explanation:
It shows all employees, even if they are not assigned to any department.
SQL command:
SELECT e.ename, d.dname FROM emp e LEFT JOIN dept d ON e.deptno = d.deptno;
Result:
| ename | dname |
|---|---|
| Arun | HR |
| Banu | IT |
| Chitra | NULL |
| David | NULL |
✅ All employees are shown, even those without a department.
3. RIGHT JOIN (RIGHT OUTER JOIN)
A RIGHT JOIN returns all records from the right table and only the matching records from the left table. If there is no match, NULL values are shown for the left table columns.
Example explanation:
It shows all departments, even if no employees are working in them.
SQL command:
SELECT e.ename, d.dname FROM emp e RIGHT JOIN dept d ON e.deptno = d.deptno;
Result:
| ename | dname |
|---|---|
| Arun | HR |
| Banu | IT |
| NULL | FINANCE |
✅ All departments are shown, even if no employees work in them.
4. FULL JOIN (FULL OUTER JOIN)
A FULL JOIN returns all records from both tables. If there is no match, NULL values appear for the missing side.
Example explanation:
It shows all employees and all departments, whether they are related or not.
SQL command:
SELECT e.ename, d.dname FROM emp e FULL OUTER JOIN dept d ON e.deptno = d.deptno;
Result:
| ename | dname |
|---|---|
| Arun | HR |
| Banu | IT |
| Chitra | NULL |
| David | NULL |
| NULL | FINANCE |
✅ Shows all employees and all departments.
5. CROSS JOIN
A CROSS JOIN returns the Cartesian product of both tables, meaning each row of the first table is combined with every row of the second table.
Example explanation:
If there are 5 employees and 3 departments, the result will have 15 rows.
SQL command:
SELECT e.ename, d.dname FROM emp e CROSS JOIN dept d;
Result:
4 employees × 3 departments = 12 rows
✅ Rarely used in real projects.
6. SELF JOIN
A SELF JOIN is a join where a table is joined with itself. It is commonly used to compare rows within the same table.
Example explanation:
Used to find employees and their managers from the same table.
EMP table (with manager):
| empno | ename | mgr |
|---|---|---|
| 1 | Arun | NULL |
| 2 | Banu | 1 |
| 3 | Chitra | 1 |
SQL command:
SELECT e.ename AS employee, m.ename AS manager FROM emp e JOIN emp m ON e.mgr = m.empno;
Result:
| employee | manager |
|---|---|
| Banu | Arun |
| Chitra | Arun |
✅ Interview Summary (One‑liner)
- INNER JOIN returns only matching rows,
- LEFT JOIN returns all left table rows,
- RIGHT JOIN returns all right table rows,
- FULL JOIN returns all rows from both tables,
- CROSS JOIN returns all combinations,
- and SELF JOIN is used to join a table with itself.