12 February 2017 - Upkar Lidder Open Issue

FROM to ORDER and everything in between


SELECT DISTINCT column, AGG_FUNC(column_or_expression), …
FROM mytable
JOIN another_table
  ON mytable.column = another_table.column
WHERE constraint_expression
GROUP BY column
HAVING constraint_expression
ORDER BY column ASC/DESC
LIMIT count OFFSET COUNT;

Order of execution

The order in which the above sql is executed is as follows:

  1. FROM (JOIN)
  2. WHERE
  3. GROUPBY
  4. HAVING
  5. SELECT
  6. DISTINCT
  7. ORDERBY
  8. LIMIT / OFFSET

  1. Tell sql where the data is coming from.
  2. Filter the table created in step 1 above based on criteria in the WHERE clause.
  3. Group the remaining columns using the columns specified in the GROUPBY clause. Useful with aggregate functions.
  4. Further filter down the grouped data using the HAVING filter.
  5. Select columns after all the filtering is finished.
  6. Remove duplicates if necessary.
  7. Sort if necessary.
  8. Get limited dataset with optional offsetting.

Example

Table:


ID |NAME      |NUMBER_GRADE |FRACTION_COMPLETED |
---|----------|-------------|-------------------|
1  |Winston   |90           |0.8050000071525574 |
2  |Winston   |90           |0.8050000071525574 |
3  |Winnefer  |95           |0.9010000228881836 |
4  |Winsteen  |85           |0.906000018119812  |
5  |Wincifer  |66           |0.7053999900817871 |
6  |Winster   |76           |0.5012999773025513 |
7  |Winstonia |82           |0.9045000076293945 |

Get number of students who have grades above 80


SELECT COUNT(*) AS above_80 FROM APP.STUDENT_GRADES WHERE NUMBER_GRADE > 80;


ABOVE_80 |
---------|
5        |

Good ! What if we want to assign letter grades to these number grades? We are not storing letter grades in the database. So the first thing that comes to mind is using CASE statements to figure out letter grades from the number grades.

Assign letters A, B, C and F to number grades


SELECT S.name,
	   S.number_grade,
	   CASE
			WHEN S.number_grade>90 THEN 'A'
			WHEN S.number_grade>80 THEN 'B'
			WHEN S.number_grade>70 THEN 'C'
			ELSE 'F'
		END AS letter_grade
FROM APP.STUDENT_GRADES S


NAME      |NUMBER_GRADE |LETTER_GRADE |
----------|-------------|-------------|
Winston   |90           |B            |
Winston   |90           |B            |
Winnefer  |95           |A            |
Winsteen  |85           |B            |
Wincifer  |66           |F            |
Winster   |76           |C            |
Winstonia |82           |B            |

Great ! Lastly, count the number of students in each letter grade. We already have the letter_grade, why not just group by letter_grade and count(*).

Count number of students in each letter grade


SELECT CASE
	WHEN S.NUMBER_GRADE>90 THEN 'A'
	WHEN S.NUMBER_GRADE>80 THEN 'B'
	WHEN S.NUMBER_GRADE>70 THEN 'C'
	ELSE 'F'
END AS letter_grade, COUNT(*) AS num
FROM APP.STUDENT_GRADES AS S
GROUP BY letter_grade;

Problem

SQL Error [30000] [42X04]: Column 'LETTER_GRADE' is either not in any table in the FROM list or appears within a join specification and is outside the scope of the join specification or appears in a HAVING clause and is not in the GROUP BY list. If this is a CREATE or ALTER TABLE statement then 'LETTER_GRADE' is not a column in the target table.

What is that about? The reason is our ordering. Let's go through the steps one by one again
1. FROM (JOIN) - we are telling sql the data is coming from the table APP.STUDENT_GRADES. It will be called S from now on.
2. WHERE - we are not filtering any data and so this is ignored.
3. GROUPBY - we are telling sql to group the data by letter_grade.

Oh wait ! We have not yet told sql what letter_grade is. This happens in the SELECT clause. But remember SELECT has not happened yet !

Solution

So how do we fix this? Well, the simplest way is to somehow tell sql about letter_grade before the GROUP BY is called. One way to do this is nesting. Check this out!



SELECT ns.letter_grade, COUNT(*)
FROM
(SELECT CASE
WHEN S.NUMBER_GRADE>90 THEN 'A'
WHEN S.NUMBER_GRADE>80 THEN 'B'
WHEN S.NUMBER_GRADE>70 THEN 'C'
ELSE 'F'
END AS letter_grade
FROM APP.STUDENT_GRADES S) ns
GROUP BY ns.letter_grade;

Here is our result:


LETTER_GRADE |NUM_STUDENTS |
-------------|-------------|
A            |1            |
B            |4            |
C            |1            |
F            |1            |

So why does this work you ask? The order is still the same
  1. FROM (JOIN) - we are telling sql the data is coming from the table
    APP.STUDENT_GRADES. It will be called S from now on. But there is more to it this time. We are also saying the data will come from ns "Table". You can think of it as a temporary table. It let's us define the letter_grade column.
  2. WHERE - we are not filtering any data and so this is ignored.
  3. GROUPBY - we are telling sql to group the data by ns.letter_grade, which was already defined in step 1.

Open Issue

Back home