What is SQL ?


SQL - structured query language - is a classical tool for data analysis. Together with Excel and Python it builds modern must have package for successful Data Science carrier.

SQL aka SEQUEL is a programming language designed for managing data in a relational database, it helps to build complex models and perform analyses quickly and offers a great ability for data manipulation.

SQL DS cheat sheet.


Compared to Python using SQL to transform data inside the warehouse is usually much faster for many basic queries and aggregations as it moves code to data rather than data to code.



Simple calculations with SQL:


Making a single calculation.


SELECT 13+15  from dual;

Making multiple calculations.


SELECT 1+17, 11-3, 7*7, 18/6 from dual;

Making calculations with multiple numbers.


SELECT 5*2*3, 7+2+7 from dual;

Naming results.


SELECT 4*3 AS res1, 5+2+3 AS res2 from dua;

Selecting tables, columns, and rows:


Simple select - the whole table.


SELECT * FROM  table1;

Select columns from a table.


SELECT column1, column2 FROM table2;

Selecting the first 10 rows on a table.


SELECT * from table3
fetch first 10 rows only;
-- LIMIT 10; --otherwise

Selecting UNIQUE VALUES.


SELECT DISTINCT(column1) FROM table_name;

HAVING - returns only those results where the counted value is greater than certain value.


SELECT column1, column2 FROM table_name 
GROUP BY column1 
HAVING COUNT(column2) > 100; 

Adding comments to SQL code:


Single-line comments.


-- Useful comment 
SELECT column1, column2, column3 -- Useful comment 2
FROM table4; -- Useful comment 3 

Block comments.


/* Multiple 
lines 
comment */
SELECT column1, column2, column3 
FROM table5; 

SQL FUNCTIONS FOR AGGREGATION:


Counting the number of rows.


SELECT COUNT(*) FROM table_name; 

Calculating the average (mean) of the values in column1.


SELECT AVG(column1) FROM table_name; 

Calculating the sum of the values in column1.


SELECT SUM(column1) FROM table_name;  

Calculating the MIN of the values in column1.


SELECT MIN(column1) FROM table_name;  

Calculating the MAXof the values in column1.


SELECT MAX(column1) FROM table_name;  

Calculating the MEDIAN of the values in column1.


SELECT MEDIAN(column1) FROM table_name;  

Using Joins & Complex Queries:


Joining tables with INNER JOIN.


SELECT * FROM table1 
INNER JOIN table2 ON table1.column1 = table2.column1;   

Joining tables using a LEFT JOIN.


SELECT * FROM t1 LEFT JOIN t2 ON t1.id = t2.id;   

Joining tables using a RIGHT JOIN.


SELECT f.col1, c.col2
FROM t1 c  RIGHT JOIN t2 f ON f.id = c.id;  

Joining tables using a FULL OUTER JOIN.


SELECT f.col1, c.col2
FROM t1 c   FULL OUTER JOIN   t2 f ON f.id = c.id;  

Sorting a column without specifying a column name.


SELECT col1, col2
FROM table
ORDER BY 2 desc; -- descending order by 2 column  

Using a join within a subquery.


SELECT c.name, f.*
FROM table f INNER JOIN 
( SELECT * FROM table2 WHERE name = 1 ) c ON c.id= f.id;  

Common SQL operations:


Concatinating columns and text into a single column.


SELECT id, name, 
"Original id is " || column1, 
"Details are : " || column2 || column3
FROM table  

Matching part of a string.


SELECT * FROM table 
WHERE name LIKE "%Andrew%";  

Applying if / then logic with CASE.


CASE 
WHEN age < 18 THEN 'teenager' 
WHEN  age > 18  THEN 'adult'
ELSE 'unknown' 
END AS column_name  

Using the WITH clause.


WITH stuff AS 

( SELECT col1,  col2 , col3
FROM table10) 

SELECT * FROM stuff 
WHERE  col1 = 15;    

Creating a table from select.


CREATE TABLE mytable2
AS SELECT  col1,  col2  FROM fulltable;   

Creating a table from scratch.


create table mytable (
start_date date , code varchar2 (200),
salary number
);   

Inserting data into a table.


INSERT INTO mytable  (start_date, code, salary )
VALUES ('10.10.2010', '12345', 12000);  

Dropping a table.


DROP TABLE mytable;  

Cleaning a table.


truncate table  mytable2;  

Creating a view.


CREATE VIEW mytable3
AS SELECT  col1,  col2 , col3  FROM fulltable;   

Results from 2 different select statesments.


select * from a
 UNION ALL
select * from b;    

Results that occur in both SELECT statements.


select * from a
INTERSECT
select * from b;    

Results that occur in the first SELECT statement but not the second SELECT statement.


select * from a
EXCEPT
select * from b;    

CORRECT KEYWORDS ORDER:


SQL is extremely sensitive to keyword order. So make sure you keep it right:

1. SELECT
2. FROM
3. JOIN (ON)
4. WHERE
5. GROUP BY
6. HAVING
7. ORDER BY
8. LIMIT or fetch first x rows only



See also related topics: