Rank and State Algorithm

This page clearly illustrates the development of the SQL for the rank and state algorithm.

The starting point is the two-dimensional timing diagram for the life story of the selected product with id 1002 on the page Generate Bitemporal Intervals:

two-dimensional timing diagram
The registered time runs on the vertical axis reg.
The valid time runs on the horizontal axis val.
Fig. 1

9999-12-31

reg 100 $ 103 $ 111 $

2019-04-01

103 $

2019-03-02

100 $ 105 $ 102 $

2019-02-12

105 $ 103 $

2019-02-10

105 $

2019-02-09

100 $

2019-01-01

2019-01-05

2019-01-10

2019-02-08

2019-02-15

2019-02-17

2019-04-02

9999-12-31

val

A valid rank is assigned to the registered_from data in ascending order. Independently of this, a registered rank is assigned to the valid_from data in ascending order. A new state is created with each registered date.

A state number is assigned to each rectangle in Fig. 1. The state number has the meaning of a version number. It is equal to 1 when the product is created and is incremented with each change. This means that the status number is increased by 1 with each new registration date.

Ranks and States
The registered time runs on the vertical axis reg.
The valid time runs on the horizontal axis val.
Fig. 2

reg state 1 state 5 state 6

rank 6

state 5

rank 5

state 1 state 2 state 4

rank 4

state 2 state 3

rank 3

state 2

rank 2

state 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val

A state number is assigned to each rectangle in rank Fig. 2. The state number has the meaning of a version number. It is equal to 1 when the product is created and is incremented with each change. This means that the status number is increased by 1 with each new registration date.

The rectangles are distinguished in two categories and the category is made recognizable with a yellow or grey background in Fig. 3.
The grey rectangles are created by registering a new state that is initially valid indefinitely with valid_date = ‘9999-12-31’. For the yellow rectangles, an end of validity is known on the registered date with valid_date < ‘999-12-31’.
This gives us the following diagram:

Differentiation between rectangles with finite validity (yellow)
and infinite validity – e.g. to valid date = 9999-12-31 (gray)
Fig. 3

reg 1 5 5 5 5 6

rank 6

1 5 5 5 5 5

rank 5

1 1 2 4 4 4

rank 4

1 1 2 2 3 3

rank 3

1 1 2 2 2 2

rank 2

1 1 1 1 1 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val
Tab. 1
number of registration 1 2 3 4 5 6
date of registration 2019-01-01 2019-02-09 2019-02-10 2019-02-12 2019-03-02 2019-04-01
validity start date 2019-01-05 2019-02-08 2019-02-17 2019-02-15 2019-01-10 2019-04-02
rank of registration 1 2 3 4 5 6
rank of validity 1 3 5 4 2 6

The rectangles in Fig. 2 result in time areas as rows of a bitemporal table.
For state № 2, which was recorded on 09.02.2019, these would be the rows in Tab. 2.

Tab. 2
product_id registered_from registered_to valid_from valid_to price
1002 2019-02-09 2019-02-10 2019-02-08 9999-12-31 105
1002 2019-02-10 2019-03-02 2019-02-08 2019-02-15 105
1002 2019-02-10 2019-02-12 2019-02-15 2019-02-17 105
First row in Tab 2 as time area Fig. 4

reg 1 5 5 5 5 6

rank 6

1 5 5 5 5 5

rank 5

1 1 2 4 4 4

rank 4

1 1 2 2 3 3

rank 3

1 1 2 2 2 2

rank 2

1 1 1 1 1 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val

At its core the algorithm’s task is to determine the correct state number for each square in Fig. 3.

Squares or rectangles with the same registration period [registered_from, registered_to) and the same state can be combined horizontally to form a larger rectangle.
Squares or rectangles with the same validity period [valid_from, valid_to) and the same state can be combined vertically to form a larger rectangle.

There are two different types of rectangles (Fig. 4). The gray ones have no end date of validity, which is indicated by the date registered_to = 9999-12-31. The yellow ones, on the other hand, have a real end date of validity.
For each state number there is exactly one gray rectangle. For gray rectangles, the time areas can be formed very easily.

Ranks are now added to Table 2.

Tab. 3
product_id registered_from registered_to valid_from valid_to price
1002 2019-02-09 2019-02-10 2019-02-08 9999-12-31 105
rank registered 2 3
rank valid 3 null
1002 2019-02-10 2019-03-02 2019-02-08 2019-02-15 105
rank registered 3 5
rank valid 3 4
1002 2019-02-10 2019-02-12 2019-02-15 2019-02-17 105
rank registered 3 4
rank valid 3 5

The first row in Tab. 3 corresponds to the gray colored time areas of state 2.
The other two rows correspond to the yellow colored time areas of state 2.

For the gray colored time areas applies:
Registered_on date corresponds to the date the state was first registered.
Valid_from corresponds to the date the validity began at that registration.
The rank of registered_to is always +1 greater than the rank of registered_on.
The end of the validity time areas is always 9999-12-31.

SQL 1

Code to determine the ranks according to Tab. 3 and the rows to Tab. 2.

with product as
select *,
    row_number() over (partition by product_id  
order by registered_from) as rankReg, row_number() over (partition by product_id
order by valid_from, registered_from) as rankVal from product_1

SQL 1

RankReg is the rank of registration (registered_from) and rankVal is the rank of validity (valid_from) – see Tab 3.

If registered_from is not unique to the product, <registered_date, registered_time> can be used in SQL instead of <registered_from> in SQL 1.
To determine rankVal, not only the order of <valid_from>, but of <valid_from, registered from> is used so that the rank is unambiguous.

                                 

SQL 2

Extension by state “0”.

For purely practical reasons to determine the view tail in SQL 5, the view product in SQL 1 is extended by one row with the state “0” for each product_id:

with productExt as
( 
    select product_id, rankReg, rankVal from product 
         union 
    select product_id, 0, 0 from product where rankReg = 1 
) 

SQL 2

                                 

SQL 3

state of yellow squares Fig. 5

reg 1 5 5 5 5 6

rank 6

1 5 5 5 5 5

rank 5

1 1 2 4 4 4

rank 4

1 1 2 2 3 3

rank 3

1 1 2 2 2 2

rank 2

1 1 1 1 1 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val

The state numbers of the yellow squares results from the closest gray square lying underneath. E.g. state 2 of the blue bordered squares results from state 2 of the red bordered squares in Fig. 5. For the algorithm used, it is important to know with which rank of registration a yellow rectangle begins (Fig. 6).

registration rank of yellow rectangles Fig. 6

reg 1 5 5 5 5 6

rank 6

1 5 5 5 5 5

rank 5

1 1 2 4 4 4

rank 4

1 1 2 2 3 3

rank 3

1 1 2 2 2 2

rank 2

1 1 1 1 1 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val

The rectangles outlined in blue begin with registration rank 3. The registration rank of a rectangle always corresponds to the state number of the gray rectangle right in the row in which the rectangle begins.

begin of yellow rectangles Fig. 7

reg 1 5 5 5 5 6

rank 6

1 5 5 5 5 5

rank 5

1 1 2 4 4 4

rank 4

1 1 2 2 3 3

rank 3

1 1 2 2 2 2

rank 2

1 1 1 1 1 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val

All green lines in Fig. 7 mark the beginning of all yellow rectangles with the same state.
The yellow squares directly above a green line have the same state and the same validity rank as the gray squares directly lying underneath. These yellow squares have a smaller rank of validity than the gray rectangles to the right of them.

The area of ​​all squares above a green line has gray rectangles as neighbors on the right and below. This means that all clusters of the same state that are delimited from the right and from below by gray rectangles are the beginning of yellow rectangles.

The registration rank of the right gray neighbors is +1 greater than the registration rank of the lower gray neighbors. The view tbl joins the right gray time area with the corresponding lower gray time area in SQL 3.

with tbl as 
( 
    select b.product_id, 
b.rankReg, b.rankVal, a.rankVal as rankValPre from productExt a join productExt b on a.product_id = b.product_id and a.rankReg + 1 = b.rankReg )

SQL 3

Two gray rectangles adjoin yellow squares right side and below Fig. 8

reg 1 5 5 5 5 6

rank 6

1 5 5 5 5 5

rank 5

1 1 2 4 4 4

rank 4

1 1 2 2 3 3

rank 3

1 1 2 2 2 2

rank 2

1 1 1 1 1 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val

The squares outlined in green in state 2 in Fig. 8 are the start of vertical yellow and blue outlined rectangles in Fig. 6. Their validity rank corresponds to the validity rank of the underlying gray square with state 2 and is smaller than the validity rank of the gray rectangle to the right with state 3. RankValPre in SQL 3 is the validity rank of the previously registered gray time area, here the gray time area with state 2 and validity rank = 3. RankVal in SQL 3 is the validity rank of the registered gray time area, here the gray time area with state 3 and validity rank = 5. The validity ranks rankValPre and rankVal in tbl determine the validity rank range in which yellow rectangles with status 2 are located. The validity rank of the yellow rectangles is in the range of rankValPre and rankVal minus 1.

SQL 4

Starting points of the vertical rectangles

The squares how bordered with green in state 2 in Fig. 8 are the start of vertical yellow and blue outlined rectangles in Fig. 6. Their validity rank is in the range of rankValPre and rankVal minus 1 as explained below Fig. 8.

All vertical yellow rectangles are represented by the view head:

with head as 
( 
    select a.product_id, a.rankReg, b.rankVal, 
a.rankReg - 1 as state, row_number() over(partition by a.product_id, b.rankVal
order by a.rankReg) as rankRegrank from tbl a join productExt b on a.product_id = b.product_id and b.rankVal between a.rankValPre and a.rankVal - 1 )

SQL 4

SQL 4 generates the beginning of the yellow rectangles according to Fig. 6. For state 2 it is the rectangles with registration rank rankReg = 3 and validity ranks rankVal = 3 resp. 4. The rankRegrank says for a whole vertical from Fig 6, the ordinal number of the beginning of yellow rectangles according to all beginnings in this vertical. So rankRegrank = 1 for the yellow rectangle with state 2 and rankRegrank = 2 for the yellow rectangle with state 5 in column 3 in Fig. 7.

The content of the view head for product_id = 1002:

contents of the view head for all state &gt 0 Tab. 4
product_id rankReg rankVal rankRegrank state
1002 2 1 1 1
1002 2 2 1 1
1002 3 3 1 2
1002 3 4 1 2
1002 6 2 2 5
1002 6 3 2 5
1002 6 4 2 5
1002 6 5 1 5

SQL 5

Ending points of the vertical rectangles
Two gray rectangles adjoin yellow squares right side and above Fig. 9

reg 1 5 5 5 5 6

rank 6

1 5 5 5 5 5

rank 5

1 1 2 4 4 4

rank 4

1 1 2 2 3 3

rank 3

1 1 2 2 2 2

rank 2

1 1 1 1 1 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val

In Fig. 9, the yellow square with the validity rank 3 and the registered rank 4 was bordered with green. It is the end of a yellow rectangle with state 2 and is enclosed on the right and above by gray rectangles bordered with red.

The registration rank for endings of yellow rectangles is equal to the registration rank and equal to the state of the gray rectangle above. The validity ranks of such yellow rectangles is equal to the validity ranks of the gray squares above, but these must be less than the validity rank of the gray rectangle on the right.

This results in the view definition tail for the set of end squares of the yellow rectangles:

with tail as 
(
    select a.product_id, a.rankReg as rankRegTo, b.rankVal,
        row_number() over(partition by a.product_id, b.rankVal 
                     order by a.rankReg) as rankRegTorank
    from tbl a join productExt b on a.product_id = b.product_id
        and b.rankVal between a.rankVal and a.rankValPre - 1
)

SQL 5

RankRegTo corresponds to the registered_to of a yellow rectangle, rankVal to its validity start. RankValPre is equal to rankVal from the gray rectangle with the previous state.
The rankRegTorank says for a whole vertical from Fig 6, the ordinal number of the ending of yellow rectangles according to all endings in this vertical.

The logic for calculating the endings is completely symmetrical to the logic for calculating the beginnings, if one imagines another state “inf” for infinity, which is created last and becomes valid at 9999-12-31.

an additional state infinite Fig. 10

inf inf inf inf inf inf

reg 1 5 5 5 5 6

rank 6

1 5 5 5 5 5

rank 5

1 1 2 4 4 4

rank 4

1 1 2 2 3 3

rank 3

1 1 2 2 2 2

rank 2

1 1 1 1 1 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val

For performance reasons, not each yellow rectangle is considered individually, but always a whole column.

The column with validity rank 3 has two yellow rectangles with state 2 and 5 respectively in Fig. 11. Each rectangle has a head marked in green and a tail marked in red. With the addition of state inf, each column has an equal number of heads and tails to the yellow rectangles. The rectangle with state 2 starts with registration rank 3 and ends with registration rank 5. The rectangle with state 5 starts with registration 6 and ends at 9999-12-31 (registration rank of tail is null).

heads and tails of yellow rectangles Fig. 11

inf inf inf inf inf inf

reg 1 5 5 5 5 6

rank 6

1 5 5 5 5 5

rank 5

1 1 2 4 4 4

rank 4

1 1 2 2 3 3

rank 3

1 1 2 2 2 2

rank 2

1 1 1 1 1 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val

It has to be considered in the algorithm which heads and tails belong together in each case. Therefore, in the algorithm an ordinal number of the heads rankRegrank and one of the tails rankRegTorank is calculated to a validity. Since heads and tails are ranks of the registration times, the view columns rankRegrank and rankRegTorank are thus ranks of ranks.

In SQL 4 the heads of the yellow rectangles are determined, in SQL 5 the tails. The calculations of the two behave symmetrically to each other. A state is not needed in the view tail, because it is available in the view head and the pairing of the tails to the heads can be established via rankRegrank = rankRegTorank.

In the algorithm no pseudo state inf is introduced, but head is joined with tail instead of inner join by left join. Thus, despite the symmetry, there may be fewer ends than beginnings.

The view tail has the following content:

contents of the view tail for all state &gt 0 Tab. 5
product_id rankReg rankVal rankRegTorank state
1002 5 2 1 only in head: 1
1002 5 3 1 only in head: 2
1002 4 4 1 only in head: 2

If the state inf (Fig. 11) actually existed, table Tab. 5 would have the following contents:

contents of the view tail for all state greater 0 and state equal inf Tab. 6
product_id rankReg rankVal rankRegTorank state
1002 is null 1 1 only in head: 1
1002 5 2 1 only in head: 1
1002 5 3 1 only in head: 2
1002 4 4 1 only in head: 2
1002 is null 2 2 only in head: 5
1002 is null 3 2 only in head: 5
1002 is null 4 2 only in head: 5
1002 is null 5 1 only in head: 5

SQL 6

Joining head and tail

To determine the yellow time areas, the tails to the heads are searched and the results are provided in another view tbl2 in SQL 6. For each yellow rectangle the ranks of the ranks are identical, i.e. rankRegrank = rankRegTorank.

with tbl2 as 
( 
    select a.product_id, a.state, a.rankReg, 
        case 
            when b.rankRegTo is null then 0 
            else b.rankRegTo 
        end as rankRegTo, 
        a.rankVal, a.rankVal + 1 as rankValTo 
    from head a left join tail b on a.product_id = b.product_id 
        and a.state > 0 
        and a.rankVal     = b.rankVal 
        and a.rankRegrank = b.rankRegTorank    
)

SQL 6

SQL 7

Temporal Normalization

Now a temporal normalization takes place.

temporal normalization Fig. 12

reg 1 5 5 5 5 6

rank 6

1 5 5 5 5 5

rank 5

1 1 2 4 4 4

rank 4

1 1 2 2 3 3

rank 3

1 1 2 2 2 2

rank 2

1 1 1 1 1 1

rank 1

rank 1

rank 2

rank 3

rank 4

rank 5

rank 6

val

Here, time areas with the same start of the registration period and the same end of the registration period as well as the same state are combined into one time area. The four yellow time areas with state 5 in Fig. 3 would result in a single time area with state 5 how blue bordered in Fig. 12.

The temporal aggregation is done via group by, so the view tbl2 is now defined as:

with tbl2 as 
( 
    select a.product_id, a.state, a.rankReg, 
        case 
            when b.rankRegTo is null then 0 
            else b.rankRegTo 
        end as rankRegTo, 
        min(a.rankVal) as rankVal, 
        max(a.rankVal) + 1 as rankValTo 
    from head a left join tail b on a.product_id = b.product_id 
        and a.state > 0 
        and a.rankVal     = b.rankVal 
        and a.rankRegrank = b.rankRegTorank 
    group by a.product_id, a.state, a.rankReg, b.rankRegTo     
)

SQL 7

SQL 8

Union of gray and yellow time areas

Since not only the yellow but also the gray time areas are to be included in the target view, the SQL code for tbl2 is extended with union:

with tbl2 as 
( 
    -- yellow time areas
    select a.product_id, a.state, a.rankReg, 
        case 
            when b.rankRegTo is null then 0 
            else b.rankRegTo 
        end as rankRegTo, 
        min(a.rankVal) as rankVal,
        max(a.rankVal) + 1 as rankValTo 
    from head a left join tail b on a.product_id = b.product_id 
        and a.state > 0 
        and a.rankVal     = b.rankVal 
        and a.rankRegrank = b.rankRegTorank 
    group by a.product_id, a.state, a.rankReg, b.rankRegTo 
    union 
    -- black time areas
    select product_id, 
        rankReg as state, 
        rankReg, rankReg + 1 as rankRegTo, rankVal, 0 as rankValTo 
    from product
)

SQL 8

SQL 9

From ranks and states back to dates and attributes

The practitioner does not want to work with the abstract ranks and states, but intervals with start and end dates and the attribute values to the product. Therefore a reverse translation of these abstract variables into the “real world” is made in addition.

create view product_2 as 
( 
    select a.product_id, b.registered_from,
    case when c.registered_from is null then '9999-12-31'
         else c.registered_from
    end as registered_to,
    d.valid_from,
    case when e.valid_from is null then '9999-12-31' 
         else e.valid_from 
    end as valid_to,
    f.price
from tbl2 a 
    inner join product b on a.product_id = b.product_id 
                         and a.rankReg   = b.rankReg
    left  join product c on a.product_id = c.product_id 
                         and a.rankRegTo = c.rankReg
    inner join product d on a.product_id = d.product_id 
                         and a.rankVal   = d.rankVal
    left  join product e on a.product_id = e.product_id 
                         and a.rankValTo = e.rankVal
    inner join product f on a.product_id = f.product_id 
                         and a.state     = f.rankReg;
)

SQL 9

Merging the SQL parts

The complete view definition results from the merged code parts SQL 1SQL 9:
SQL Code to Generate Two-Dimensional Time Intervals

<span>%d</span> bloggers like this: