Relation Algebra

Overview of the relational algebra

  • It provides a series of operations including Union, Intersection, Difference, Cartesian Product, Project, Select, Join and Division based on sets.
  • The opertors take one or more relations as inputs and give a new relation as a result.
  • Procedural language
    For example : $\prod_{Sno,Cname}(\sigma_{Cno = “002”}(Student\bowtie SC))$

  • An abstract language, and is the foundation to learn SQL.

Traditional Relational Algebra Operations [only involve row]

Operation Relation R Relation S Notation
UNION $R$ $S$ $R \cup S$
INTERSECTION $R$ $S$ $R \cap S$
DIFFFERENCE $R$ $S$ $R-S$
Cartesian PRODUCT $R$ $S$ $R \times S$

Special Relational Algebra Operations [involve row and column]

Operation Relation R Relation S Notation
PROJECT $R$ $\pi_{A}(R)$
SELECT $R$ $\sigma_{Con}(R)$
JOIN $R$ $S$ $R \mathop{\Join}\limits_{A \ \theta \ B} S$
Cartesian PRODUCT $R$ $S$ $R \div S$

Compatibility for the relational algebra

  • The relations involved in Union, Intersection and Difference must be compatible(相容的) to ensure above relational algebra operations to be valid
  • Relation $R$ and relation $S$ are compatible when
    $[1]$ $R$,$S$ must have the same arity(同元的, same number of attributes)
    $[2]$ The attribute domains must be compatible (e.g., the $2^{nd}$ column of $S$.)

    For relation $R(A_1,A_2,…,A_n)$ and relation $S(B_1,B_2,…,B_m)$,
    IF $R$ and $S$ are compatible,
    Then $n = m$ and Domain$(A_i)$ = Domain$(B_i)$, $i=1,2,…,n$

Relational algebra (1) : Union

  • Notation: $R \cup S$
  • Defined as: $R \cap S$ = $\lbrace t\ |\ t \in R \vee t \in S \rbrace$
  • For $R \cup S$ to be valid, $R$ and $S$ should be compatible
  • $R \cup S $ = $S \cup R$

Relational algebra (2) : Difference

  • Notation: $R - S$
  • Defined as: $R-S = \lbrace t \ | \ t \in R \land t \notin S \rbrace$
  • For $R-S$ to be valid, R and S should be compatible
  • $R - S \ne S - R$

Relational algebra (3) : Intersection

  • Notation: $R \cap S$
  • Defined as: $R \cap S$ = $\lbrace t \ | \ t \in R \land t \in S \rbrace$
  • For $R \cap S$ to be valid, R and S should be compatible
  • $R \cap S$ = $S \cap R$
  • $R \cap S$ = $R - (R-S)$ = $S - (S - R)$

Relational algebra (4) : Cartesian Product

  • Notation: $R \times S$
  • Defined as: $R \times S$ = $\lbrace t,q \ |\ t \in R \land q \in S \rbrace$
  • $R \times S$ = $S \times R$
  • If the defree of $R$ is $n$, and the degree of $S$ is $m$, then the degree of $R \times S$ is $n + m$
  • If the cardinality of $R$ is $n$, and the cardinality of $S$ is $m$, then the cardinality of $R \times S$ is $n \times m$.

Relational algebra (5) : Select

  • Notation: $\sigma_p(R)$
  • $p$ is called the selection predicate (选择谓词)
  • Defined as: $\sigma_p(R)$ = $\lbrace t \ | \ t \in R \land p(t) \rbrace$
    where $p$ is a formula in propositional calculus(命题演算) consisting of terms connected by: $\land (and), \lor (or),\lnot (not)$
    Each term is one of: <$attribute$> $op$ <$attribute$> or <$constant$>
    where $op$ is obne of: $=,\ne,\gt,\ge,\lt,\le$

Usually, there are many operators in the selection predicate $p$, and the priority sequence is as following:

  • () [Parentheses]
  • $=,\ne,\gt,\ge,\lt,\le$
  • $\lnot$
  • $\land$
  • $\lor$

Relational algebra (6) : Project

  • Notation : $\prod _{A_1,A_2,…,A_k}(R)$
    where $A_1$,$A_2$,…,$A_k$ are attribute names and $R$ is a relation name.
  • The result is defined as the relation of $k$ columns obtained by erasing the columns that are not listed
  • Duplicate rows removed from result, since relations are sets

Ralational algebra (7) : Join

$\theta$-Join

  • Notation: $R \mathop{\Join}\limits_{A \ \theta \ B} S$
  • Defined as: $R\mathop{\Join}\limits_{A \ \theta \ B} S = \sigma _{t[A] \ \theta \ s[B]}(R \times S)$
    • $R(A_1,A_2,…,A_n)$, $A \in \lbrace A_1,A_2,…A_n \rbrace$
    • $S(B_1,B_2,…,B_m)$, $B \in \lbrace B_1,B_2,…B_m \rbrace$
    • $t \in R$, $s \in S$
    • $A$ and $B$ are compatible
    • $\theta \in \lbrace \gt,\ge,\lt,\le,=,<> \rbrace$
  • $\theta$-Join usually used with Select and Project together.

Rename

  • Notation: $\rho$
  • Rename a relation to another with a different name
  • Duplicate a relation and give a new name
  • $R_1 \rightarrow R_2$, and only the relation names are different for R_1 and R_2

Query: Select all course No.s which both “2015030101” and “2015040101” from relation SC have taken.

Answer: $\pi_{SC.Cno = “2015030101” \lor SC1.Sno = “2015040101”}(SC \mathop{\Join}\limits_{SC.Cno = SC1.Cno} \rho_{SC1}(SC))$

Equal-Join

  • Notation: $R\mathop{\Join}\limits_{A=B} S$
  • Defined as: $R\mathop{\Join}\limits_{A=B}S = \sigma_{t[A]=sB}$
    • $R(A_1,A_2,…,A_n),A\in \lbrace A_1,A_2,…,A_n \rbrace$
    • $S(B_1,B_2,…,B_m),B\in \lbrace B_1,B_2,…,B_m \rbrace$
    • $t \in R, s \in S$
    • $A$ and $B$ are compatible
  • Equal-Join is a special case of $\theta$-Join

Natural-Join

  • Notation: $R \Join S$
  • Defined as: $R \Join S = \sigma_{t[B]=s[b]}(R\times S)$
    • $R$ and $S$ have one same attribute or a group of same attributes
    • Duplicated columns should be deleted in the result relation
  • Equal-Join is a special case if $\theta $-Join

Outer-Join

  • An etension of the join operation that avoids loss of information
  • Computes the join and then adds tuples from one relation that does not match tuples in the other relation to the result of the join
  • User null values: null signifies that the value is unknown or does not exist
  • Notation:
    • Left-Join: ⟕
    • Right-Join: ⟖
    • Full-Join: ⟗

Ralational algebra (8) : Divisiojn

  • Notation: $R \div S$
  • $R = (A_1,…,A_m,B_1,…,B_n)$
  • $S = (B_1,…,B_n)$
  • $S \subseteq R$ Each attribute of schema $S$ is also in schema $R$
  • The result of $R \div S$ is a relation schema, and containing all attributes of $R$ that are not in $S$.
    $R - S = (A_1,…,A_m)$
  • Suited to queries that include the phrase “for all”
  • $R \div S = \lbrace t \ |\ t \in \prod _{R-S}(R) \land \forall u \in S(tu \in R)\rbrace$