SQL
Terminology
structured query language is a standard Database language which is used to create, maintain and retrieve the relational database.
Relational database: data is in the form of relations.
Attribute: Attributes are the properties that define a relation.
Tuple: Each row in the relation is known as tuple.
Degree: The number of attributes in the relation
Cardinality: The number of tuples in a relation
Column: Column represents the set of values for a particular attribute.
========================
SELECT [Distinct] attri
FROM R1, R2
WHERE ...
GROUP BY ... [HAVING]
ORDER BY .... [DESC]
Aggregation functions
Used to perform mathematical operations on data values of a relation.
- COUNT : count the number of rows in a relation.
SUM: sum up values
GROUP BY: Group by is used to group the tuples of a relation based on an attribute or group of attribute. It is always combined with aggregation function which is computed on group.
SELECT ADDRESS, SUM(AGE) FROM STUDENT
GROUP BY (ADDRESS);
SUM(AGE) will be computed for each address.
Group function
Group functions work on the set of rows and returns one result per group. Some of the commonly used group functions are: AVG, COUNT, MAX, MIN, SUM, VARIANCE.
Different subsets of SQL?
- DDL (Data Definition Language) – It allows you to perform various operations on the database such as CREATE, ALTER and DELETE objects.
- DML ( Data Manipulation Language) – It allows you to access and manipulate data. It helps you to insert, update, delete and retrieve data from the database.
- DCL ( Data Control Language) – It allows you to control access to the database. Example – Grant, Revoke access permissions.
JOIN
Used to combine rows from two or more tables, based on a related column between them.
- Inner Join: intersection basically. It is used to return all the rows from multiple tables where the join condition is satisfied.
- Right Join: return all the rows from the right table but only the matching rows from the left table where the join condition is fulfilled.
- Left Join: return all the rows from the left table but only the matching rows from the right table where the join condition is fulfilled.
- Full Join: it returns all the rows from the left-hand side table and all the rows from the right-hand side table.
Natural join: Matches the attributes that have the same name.
For example, if we have two tables with two column names in common (the two columns exists with the same name in the two tables), then the natural join will join the two tables by comparing the values of the both columns and not just from one column.
In INNER JOIN, you have to specify a join condition which the inner join uses to join the two tables. Whereas in the natural join, you don't write a join condition.
Cross Join: A CROSS JOIN gives the Cartesian product for the selected columns of the two joined tables, by matching all the values from the first table with all the values from the second table.
Primary key
is a column (or collection of columns) or a set of columns that uniquely identifies each row in the table.
- Uniquely identifies a single row in the table
- Null values not allowed
Constraints
Constraints are used set he limit on the data type of the table. It can be specified while creating or altering the table statement. The sample of constraints are:
- NOT NULL
- CHECK
- DEFAULT
- UNIQUE
- PRIMARY KEY
- FOREIGN KEY
Foreign key
Foreign key maintains referential integrity by enforcing a link between the data in two tables.
The foreign key in the child table references the primary key in the parent table.
The foreign key constraint prevents actions that would destroy links between the child and parent tables.
Transaction
is a single unit of work. A set of operations
ACID
Atomicity: all or nothing transaction
Atomicity refers to the transactions that are completely done or failed where transaction refers to a single logical operation of a data. It means if one part of any transaction fails, the entire transaction fails and the database state is left unchanged.
Consistency: guarantee committed transaction state
Consistency ensures that the data must meet all the validation rules.
Isolation: transactions are independent
Durability: committed data never lost
Normalization
- remove redundant data and maintain data integrity, break down the large table into smaller ones
- save space but slow down query
- better for write-intensive application
3 different types of normalization:
_First Normal Form (1NF): _No repeating groups within rows
_Second Normal Form (2NF): _Every non-key (supporting) column value is dependent on the whole primary key.
_Third Normal Form (3NF): _Dependent solely on the primary key and no other non-key (supporting) column value.
Denormalization
add some redundant data or group data -> improve read performance but losing some write performance
View
A virtual table that is used to fetch data from tables.
stored procedure
A stored procedure is a set of SQL statements with an assigned name, which are stored in a relational database management system as a group, so it can be reused and shared by multiple programs.
Trigger
a special type of stored procedures. execute automatically when or after data modification