SQL : Basic queries

Before starting, lets have an employee table with the following data :

Employee_full

The Building Blocks of Data Retrieval: SELECT and FROM . As your experience with SQL grows, you will notice that you are typing the words SELECT and FROM more than any other words in the SQL vocabulary.


SELECT * FROM employee; // will show all records in ‘employee’ table .

We can use ‘WHERE’ clause that gives the DB something specific to search for. You can also select specifc columns.


select idEmployee,Name,Salary from employee where PrimarySkill='Java';

will retun the following:

Employee_Where

Queries with Distinction – If you look at the original table, CHECKS, you see that some of the data repeats in the PrimarySkill column.


select distinct PrimarySkill from employee; // will give all distinct values of PrimarySkill column.

ALL is a keyword that is implied in the basic SELECT statement. You almost never see ALL because SELECT <Table> and SELECT ALL <Table> have the same result.

Comparison operators compare expressions and return one of three values: TRUE, FALSE, or Unknown. To understand how you could get an Unknown, you need to know a little about the concept of NULL. In database terms NULL is the absence of data in a field. It does not mean a column has a zero or a blank in it. A zero or a blank is a value. NULL means nothing is in that field. If you make a comparison like Field = 9 and the only value for Field is NULL, the comparison will come back Unknown. Because Unknown is an uncomfortable condition, most flavors of SQL change Unknown to FALSE and provide a special operator, IS NULL, to test for a NULL condition.

Notice in the previous ‘employee’ table we had a null salary.


select * from employee where salary is null;

Result:

Employee_isNull

However :


select * from employee where salary=null;

will not fetch any rows. You didn’t find anything because the comparison salary = NULL returned a FALSE – the result was unknown.

You can use character operators to manipulate the way character strings are represented, both in the output of data and in the process of placing conditions on data to be retrieved. Lets see the LIKE operator.


select * from employee where PrimarySkill like 'Jav%';

will give following result:

Employee_like

You can see the use of the percent sign (%) in the statement after LIKE. When used inside a LIKE expression, % is a multi-characater wildcard.

The underscore (_) is the single-character wildcard.


select * from employee where PrimarySkill like 'Jav_';

result :

Employee_like_

We can use ‘and’ to add further filters to our result :


select * from employee where salary > 75000 and salary<85000;

Employee_AND

Similary for ‘OR’.

If the condition it applies to evaluates to TRUE, NOT make it FALSE. If the condition after the NOT is FALSE, it becomes TRUE.

The two operators IN and BETWEEN provide a shorthand in some cases. Like


select * from employee where PrimarySkill = 'Java' or PrimarySkill = 'SQL';

can be written as


select * from employee where PrimarySkill in('Java','SQL');

Similary :


select * from employee where salary > 75000 and salary<85000;

can be written as :


select * from employee where salary between 75000 and 85000;

Aggregate functions –COUNT, MAX, MIN, SUM, AVG etc

The function COUNT returns the number of rows that satisfy the condition in the WHERE clause.


select count(*) from employee where PrimarySkill in('Java','Python');//will return 6

select count(salary) from employee where PrimarySkill in('Java','Python');//will return 5

If you want to find the largest value in a column, use MAX.


select max(salary) from employee; //return 95000

If you try the following query :


select name from employee where salary=MAX(salary);

will throw error like ‘Invalid use of group function’. The error message is a reminder that this group function (Remember that aggregate functions are also called group functions) does not work in the WHERE clause.

From time to time you will want to present the results of your query in some kind of order. As you know, however, SELECT FROM gives you a listing, and unless you have defined a primary key, your query comes out in the order the rows were entered.


select * from employee order by salary desc;

Employee_OrderBy

You can also use ORDER BY on more than one field.

Following query will tell the total amount being spent on java developers:


select PrimarySkill, SUM(salary) from employee where PrimarySkill='Java';

Employee_SUM

To get the combine result for many spent, skill wise and ordering it in descending order of the sum  we can use group By :


select PrimarySkill, SUM(salary) from employee group by PrimarySkill order by 2 desc;

Employee_groupBY

 

To get the count of developer similary :


select PrimarySkill, count(*) as DevelopersCount from employee group by PrimarySkill;

Notice the ‘as’ keyword that changes the name of the column. ‘as’ is however optional

Employee_groupByCount

Suppose you want those skills whose SUM(salary) is less than 175000,


select PrimarySkill, SUM(salary) as GroupSalaryTotal from employee

where GroupSalaryTotal<175000 group by PrimarySkill ;

above will throw an error saying GroupSalaryTotal is an unknown column.

Using ‘having‘ clause :


select PrimarySkill, SUM(salary) as GroupSalaryTotal from employee group by PrimarySkill

having GroupSalaryTotal<175000;

Employee_having1

If you try :


select PrimarySkill, SUM(salary) as GroupSalaryTotal from employee group by PrimarySkill

having salary<175000;

you will get an error : ‘salary’ as an unknown column.

But If You try:


select PrimarySkill, SUM(salary) as GroupSalaryTotal from employee group by PrimarySkill

having PrimarySkill='SQL';

will give result

Employee_having2

So that means while ‘where’ only works on the table columns, ‘having’ works on the table produced as a result of ‘group by’ clause.

Note : This is the result in MySQL DB. Implementation of having/other clause might change in other DB.

The SQL TOP clause is used to fetch a TOP N number or X percent records from a table.

Note: All the databases do not support TOP clause. For example MySQL supports LIMIT clause to fetch limited number of records and Oracle uses ROWNUM to fetch limited number of records.

Let us see an example to fetch the 3rd highest salary from out employee table . Since we are using MySQL, so we would be using LIMIT.


select salary from (SELECT distinct salary FROM Employee ORDER BY salary DESC LIMIT 3)as emp_salary order by salary limit 1;

The select in brackets first returned the salaries in the descending order with 3 as limit, that is 3 rows. It is wrapped again with a select where we re-ordered the rows and make the limit as 1, so that we finally get the 3rd highest salary.This can be used as a subquery to fetch more details like :


SELECT name,salary from employee where salary=(select salary from (SELECT distinct salary FROM Employee ORDER BY salary DESC LIMIT 3)as emp_salary order by salary limit 1);

JOINS

SQL joins are used to combine rows from two or more tables. Lets have a home_address table with following data:

home_address

If we run following query:


select * from employee,home_address;

join_cross

40 rows are returned (10(in employee)*4(in home_address)) this type of join is called a cross-join. A cross-join is not normally as useful as the other joins covered today, but this join does illustrate the basic combining property of all joins: Joins bring tables together.

INNER JOIN 

The most frequently used clause is INNER JOIN. This produces a set of records which match in both the ‘employee’  and ‘home_address’ tables, i.e. all employees have home_address mapped:


select Name,PrimarySkill,City from employee e inner join home_address ha on e.idEmployee=ha.employeeId;

select Name,PrimarySkill,City from employee e inner join home_address ha where e.idEmployee=ha.employeeId;

select Name,PrimarySkill,City from employee e, home_address ha where e.idEmployee=ha.employeeId;

All three queries above will produce the same result:

join_inner

LEFT JOIN

The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table. This means that if the ON clause matches 0 (zero) records in right table, the join will still return a row in the result, but with NULL in each column from right table.


select Name,PrimarySkill,City from employee e left join home_address ha on e.idEmployee=ha.employeeId;

Note: you cannot use ‘where’ in this case in MySQL, joining ‘on’ has to be there.

join_left

RIGHT JOIN

The SQL RIGHT JOIN returns all rows from the right table, even if there are no matches in the left table. This means that if the ON clause matches 0 (zero) records in left table, the join will still return a row in the result, but with NULL in each column from left table.


select Name,PrimarySkill,City from employee e right join home_address ha on e.idEmployee=ha.employeeId;

join_right

FULL JOIN

The SQL FULL JOIN combines the results of both left and right outer joins. The joined table will contain all records from both tables, and fill in NULLs for missing matches on either side. OUTER JOIN is less useful than INNER, LEFT or RIGHT and it’s not implemented in MySQL. However, you can work around this restriction using the UNION of a LEFT and RIGHT JOIN, e.g.


select Name,PrimarySkill,City from employee e left join home_address ha on e.idEmployee=ha.employeeId

union

select Name,PrimarySkill,City from employee e right join home_address ha on e.idEmployee=ha.employeeId;

join_outer

SELF JOIN

The SQL SELF JOIN is used to join a table to itself as if the table were two tables, temporarily renaming at least one table in the SQL statement.

Suppose there is a Relationship table having father and son columns like below:

join_self

Now you have a requirement to find out all the grandfathers and grandson , you can use self join to achieve this:


select gradparent.Father GrandFather, child.Son Grandson

from relationship child inner join relationship parent

on child.father=parent.Son

inner join relationship gradparent

on gradparent.Father=parent.Father;

join_self_fatherSon

A similar senario like an employee and reporting to anoher employee can also use this self join.

The SQL UPDATE Query is used to modify the existing records in a table.

You can use WHERE clause with UPDATE query to update selected rows otherwise all the rows would be affected. Syntax

<p class="prettyprint prettyprinted">UPDATE table_name
SET column1 = value1, column2 = value2...., columnN = valueN
WHERE [condition];</p>

The SQL DELETE Query is used to delete the existing records from a table.

You can use WHERE clause with DELETE query to delete selected rows, otherwise all the records would be deleted.


DELETE FROM table_name WHERE [condition];

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: