## Overview of the relational algebra

- It provides a series of operations including Union, Intersection, Difference, Cartesian Product, Project, Select, Join and Division based on sets.
- The opertors take one or more relations as inputs and give a new relation as a result.
Procedural language

For example : $\prod_{Sno,Cname}(\sigma_{Cno = “002”}(Student\bowtie SC))$An abstract language, and is the foundation to learn SQL.

### Traditional Relational Algebra Operations [only involve row]

Operation | Relation R | Relation S | Notation |
---|---|---|---|

UNION | $R$ | $S$ | $R \cup S$ |

INTERSECTION | $R$ | $S$ | $R \cap S$ |

DIFFFERENCE | $R$ | $S$ | $R-S$ |

Cartesian PRODUCT | $R$ | $S$ | $R \times S$ |

### Special Relational Algebra Operations [involve row and column]

Operation | Relation R | Relation S | Notation |
---|---|---|---|

PROJECT | $R$ | $\pi_{A}(R)$ | |

SELECT | $R$ | $\sigma_{Con}(R)$ | |

JOIN | $R$ | $S$ | $R \mathop{\Join}\limits_{A \ \theta \ B} S$ |

Cartesian PRODUCT | $R$ | $S$ | $R \div S$ |

## Compatibility for the relational algebra

- The relations involved in Union, Intersection and Difference must be compatible(相容的) to ensure above relational algebra operations to be valid
Relation $R$ and relation $S$ are compatible when

$[1]$ $R$,$S$ must have the same arity(同元的, same number of attributes)

$[2]$ The attribute domains must be compatible (e.g., the $2^{nd}$ column of $S$.)**For relation $R(A_1,A_2,…,A_n)$ and relation $S(B_1,B_2,…,B_m)$,**

IF $R$ and $S$ are compatible,

Then $n = m$ and Domain$(A_i)$ = Domain$(B_i)$, $i=1,2,…,n$

### Relational algebra (1) : Union

- Notation: $R \cup S$
- Defined as: $R \cap S$ = $\lbrace t\ |\ t \in R \vee t \in S \rbrace$
- For $R \cup S$ to be valid, $R$ and $S$ should be compatible
- $R \cup S $ = $S \cup R$

### Relational algebra (2) : Difference

- Notation: $R - S$
- Defined as: $R-S = \lbrace t \ | \ t \in R \land t \notin S \rbrace$
- For $R-S$ to be valid, R and S should be compatible
- $R - S \ne S - R$

### Relational algebra (3) : Intersection

- Notation: $R \cap S$
- Defined as: $R \cap S$ = $\lbrace t \ | \ t \in R \land t \in S \rbrace$
- For $R \cap S$ to be valid, R and S should be compatible
- $R \cap S$ = $S \cap R$
- $R \cap S$ = $R - (R-S)$ = $S - (S - R)$

### Relational algebra (4) : Cartesian Product

- Notation: $R \times S$
- Defined as: $R \times S$ = $\lbrace t,q \ |\ t \in R \land q \in S \rbrace$
- $R \times S$ = $S \times R$
- If the defree of $R$ is $n$, and the degree of $S$ is $m$, then the degree of $R \times S$ is $n + m$
- If the cardinality of $R$ is $n$, and the cardinality of $S$ is $m$, then the cardinality of $R \times S$ is $n \times m$.

### Relational algebra (5) : Select

- Notation: $\sigma_p(R)$
- $p$ is called the selection predicate (选择谓词)
- Defined as: $\sigma_p(R)$ = $\lbrace t \ | \ t \in R \land p(t) \rbrace$

where $p$ is a formula in propositional calculus(命题演算) consisting of terms connected by: $\land (and), \lor (or),\lnot (not)$

Each term is one of: <$attribute$> $op$ <$attribute$> or <$constant$>

where $op$ is obne of: $=,\ne,\gt,\ge,\lt,\le$

Usually, there are many operators in the selection predicate $p$, and the priority sequence is as following:

- () [Parentheses]
- $=,\ne,\gt,\ge,\lt,\le$
- $\lnot$
- $\land$
- $\lor$

### Relational algebra (6) : Project

- Notation : $\prod _{A_1,A_2,…,A_k}(R)$

where $A_1$,$A_2$,…,$A_k$ are attribute names and $R$ is a relation name. - The result is defined as the relation of $k$ columns obtained by erasing the columns that are not listed
- Duplicate rows removed from result, since relations are sets

### Ralational algebra (7) : Join

#### $\theta$-Join

- Notation: $R \mathop{\Join}\limits_{A \ \theta \ B} S$
- Defined as: $R\mathop{\Join}\limits_{A \ \theta \ B} S = \sigma _{t[A] \ \theta \ s[B]}(R \times S)$
- $R(A_1,A_2,…,A_n)$, $A \in \lbrace A_1,A_2,…A_n \rbrace$
- $S(B_1,B_2,…,B_m)$, $B \in \lbrace B_1,B_2,…B_m \rbrace$
- $t \in R$, $s \in S$
- $A$ and $B$ are compatible
- $\theta \in \lbrace \gt,\ge,\lt,\le,=,<> \rbrace$

- $\theta$-Join usually used with Select and Project together.

#### Rename

- Notation: $\rho$
- Rename a relation to another with a different name
- Duplicate a relation and give a new name
- $R_1 \rightarrow R_2$, and only the relation names are different for R_1 and R_2

Query: Select all course No.s which both “2015030101” and “2015040101” from relation SC have taken.

Answer: $\pi_{SC.Cno = “2015030101” \lor SC1.Sno = “2015040101”}(SC \mathop{\Join}\limits_{SC.Cno = SC1.Cno} \rho_{SC1}(SC))$

#### Equal-Join

- Notation: $R\mathop{\Join}\limits_{A=B} S$
- Defined as: $R\mathop{\Join}\limits_{A=B}S = \sigma_{t[A]=sB}$
- $R(A_1,A_2,…,A_n),A\in \lbrace A_1,A_2,…,A_n \rbrace$
- $S(B_1,B_2,…,B_m),B\in \lbrace B_1,B_2,…,B_m \rbrace$
- $t \in R, s \in S$
- $A$ and $B$ are compatible

- Equal-Join is a special case of $\theta$-Join

#### Natural-Join

- Notation: $R \Join S$
- Defined as: $R \Join S = \sigma_{t[B]=s[b]}(R\times S)$
- $R$ and $S$ have one same attribute or a group of same attributes
- Duplicated columns should be deleted in the result relation

- Equal-Join is a special case if $\theta $-Join

#### Outer-Join

- An etension of the join operation that avoids loss of information
- Computes the join and then adds tuples from one relation that does not match tuples in the other relation to the result of the join
- User null values: null signifies that the value is unknown or does not exist
- Notation:
- Left-Join: ⟕
- Right-Join: ⟖
- Full-Join: ⟗

### Ralational algebra (8) : Divisiojn

- Notation: $R \div S$
- $R = (A_1,…,A_m,B_1,…,B_n)$
- $S = (B_1,…,B_n)$
- $S \subseteq R$ Each attribute of schema $S$ is also in schema $R$
- The result of $R \div S$ is a relation schema, and containing all attributes of $R$ that are not in $S$.

$R - S = (A_1,…,A_m)$ - Suited to queries that include the phrase “for all”
- $R \div S = \lbrace t \ |\ t \in \prod _{R-S}(R) \land \forall u \in S(tu \in R)\rbrace$