April 2, 2019

Flattening the 7-Layer Recursive Hierarchy Salad

Post by: Bob Charapata

Sometimes organizations must model a hierarchy with data, but they don’t know how deep it will be. Developers often create recursive hierarchy tables for transaction processing systems to solve this problem. Those tables have one column on the table that refers to the table’s identity column to create a parent-child relationship. (Typical situations for this may be organization reporting structures or sales locations.) A recursive hierarchy table will look something like this (using table declaration for ease of use in this article):

DECLARE @RecursiveHierarchy TABLE
(
    [ID]            int,
    [HierarchyName] varchar(50),
    [SortID]        int,
    [ParentID]      int
);

In addition to the ID and the column that refers to the Parent ID, there’s a SortID column that allows ordering of the hierarchy. There’s also a HierarchyName which will serve as the pivot column later.

Having a recursive hierarchy with an unknown depth works well for transaction processing. Yet it does not lend itself well to browsing the hierarchy in analytical models. Many analytical tools allow only the selection of named columns. This means that hierarchy levels are better displayed as columns for analytical purposes.

The data transformation should perform well, so use two stacked recursive Common Table Expressions (CTE) and pivot on the resulting data to flatten the hierarchy.

Using the table structure declared earlier, let’s assume the following sample data. The top level parent of the hierarchy has a ParentID of \ to identify that row as the root of the hierarchy.

-- R = Root
-- L = Layer
-- I = Item

INSERT @RecursiveHierarchy
(   [ID],
    [HierarchyName],
    [SortID],
    [ParentID]  )
VALUES
    (1, 'R1', 1, NULL),
    (2, 'R2', 2, NULL),
    (3, 'R3', 3, NULL),
    (4, 'R1 L1 I2', 2, 1),
    (5, 'R1 L1 I1', 1, 1),
    (6, 'R2 L1 I2', 2, 2),
    (7, 'R2 L1 I1', 1, 2),
    (8, 'R1 L2 I2', 2, 5),
    (9, 'R1 L2 I1', 1, 5),
    (10, 'R2 L2 I1', 1, 7),
    (11, 'R1 L3 I2', 2, 9),
    (12, 'R1 L3 I1', 1, 9),
    (13, 'R2 L3 I1', 1, 10),
    (14, 'R1 L4 I2', 2, 12),
    (15, 'R1 L4 I1', 1, 12),
    (16, 'R1 L5 I2', 2, 15),
    (17, 'R1 L5 I1', 1, 15),
    (18, 'R1 L6 I1', 1, 17),
    (19, 'R1 L7 I2', 2, 18),
    (20, 'R1 L7 I1', 1, 18);

First, determine the greatest depth of the sample data’s hierarchy using the first recursive CTE. This CTE adds the depth of each member of the hierarchy. It sets the root member(s) of the hierarchy to depth 0 and adding 1 to every level on recursion.

WITH [HierarchyDepth]([ID],
                      [ParentID],
                      [Depth])
     AS (SELECT [rh].[ID],
                [rh].[ParentID],
                0
         FROM @RecursiveHierarchy AS [rh]
         WHERE [rh].[ParentID] IS NULL
         UNION ALL
         SELECT [rh].[ID],
                [rh].[ParentID],
                [dh].[Depth] + 1
         FROM @RecursiveHierarchy AS [rh]
         JOIN [HierarchyDepth] AS [dh]
         ON [dh].[ID] = [rh].[ParentID])
     SELECT MAX([hd].[Depth])
     FROM [HierarchyDepth] AS [hd];

For the sample data, the greatest depth is 7. This depth will be used later for the number of columns pivoted.

Once current depth of the hierarchy is determined, another recursive CTE is added below the original CTE to create a mapping for every record to all ancestors of that record. For example, record ID 20 will have 8 records because it has parent ID 18, which has parent ID 17, which has parent ID 15, and so on up to single ancestor ID 1. (One record for itself, and 7 records to each ancestor.) Also, this second CTE subtracts 1 to assign the correct depth for each record-ancestor mapping.

WITH [HierarchyDepth]([ID],
                      [ParentID],
                      [Depth])
     AS (SELECT [rh].[ID],
                [rh].[ParentID],
                0
         FROM @RecursiveHierarchy AS [rh]
         WHERE [rh].[ParentID] IS NULL
         UNION ALL
         SELECT [rh].[ID],
                [rh].[ParentID],
                [dh].[Depth] + 1
         FROM @RecursiveHierarchy AS [rh]
         JOIN [HierarchyDepth] AS [dh]
         ON [dh].[ID] = [rh].[ParentID]),
     [Ancestor]([ID],
                   [ParentID],
                   [Depth])
     AS (SELECT [dh].[ID],
                [dh].[ParentID],
                [dh].[Depth]
         FROM [HierarchyDepth] AS [dh]
         UNION ALL
         SELECT [a].[ID],
                [rh].[ParentID],
                [a].[Depth] - 1
         FROM [Ancestor] AS [a]
         JOIN @RecursiveHierarchy AS [rh]
         ON [rh].[ID] = [a].[ParentID])
     SELECT *
     FROM [Ancestor];

Finally, the HierarchyName column is added back to the two CTEs so there is a column to pivot on. The final select with a pivot to flatten the 7 layers of the hierarchy is joined to the original table to return the ID of each record, the hierarchy name, and the flattened hierarchy.

WITH [HierarchyDepth]([ID],
                      [ParentID],
                      [HierarchyName],
                      [Depth])
     AS (SELECT [rh].[ID],
                [rh].[ParentID],
                [rh].[HierarchyName],
                0
         FROM @RecursiveHierarchy AS [rh]
         WHERE [rh].[ParentID] IS NULL
         UNION ALL
         SELECT [rh].[ID],
                [rh].[ParentID],
                [rh].[HierarchyName],
                [dh].[Depth] + 1
         FROM @RecursiveHierarchy AS [rh]
         JOIN [HierarchyDepth] AS [dh]
         ON [dh].[ID] = [rh].[ParentID]),
     [Ancestor]([ID],
                   [HierarchyName],
                   [ParentID],
                   [Depth])
     AS (SELECT [dh].[ID],
                [dh].[HierarchyName],
                [dh].[ParentID],
                [dh].[Depth]
         FROM [HierarchyDepth] AS [dh]
         UNION ALL
         SELECT [a].[ID],
                [rh].[HierarchyName],
                [rh].[ParentID],
                [a].[Depth] - 1
         FROM [Ancestor] AS [a]
         JOIN @RecursiveHierarchy AS [rh]
         ON [rh].[ID] = [a].[ParentID])
     SELECT [pn].[ID],
            REPLICATE('- ', [hd].[Depth]) + [hd].[HierarchyName] AS [HierarchyName],
            [hd].[Depth],
            [pn].[0] AS [Root],
            [pn].[1] AS [Level1],
            [pn].[2] AS [Level2],
            [pn].[3] AS [Level3],
            [pn].[4] AS [Level4],
            [pn].[5] AS [Level5],
            [pn].[6] AS [Level6],
            [pn].[7] AS [Level7]
     FROM
     (
         SELECT [a].[ID],
                [a].[HierarchyName],
                [a].[Depth]
         FROM [Ancestor] AS [a]
     ) AS [hn] PIVOT(MAX([hn].[HierarchyName]) FOR [hn].[Depth] IN([0],
                                                                   [1],
                                                                   [2],
                                                                   [3],
                                                                   [4],
                                                                   [5],
                                                                   [6],
                                                                   [7])) AS [pn]
     JOIN [HierarchyDepth] AS [hd]
     ON [pn].[ID] = [hd].[ID]

Here is the final result set:

ID    HierarchyName             Depth  Root  Level1       Level2       Level3       Level4       Level5       Level6       Level7
----- ------------------------- ------ ----- ------------ ------------ ------------ ------------ ------------ ------------ ------------
1     R1                        0      R1    NULL         NULL         NULL         NULL         NULL         NULL         NULL
2     R2                        0      R2    NULL         NULL         NULL         NULL         NULL         NULL         NULL
3     R3                        0      R3    NULL         NULL         NULL         NULL         NULL         NULL         NULL
6     - R2 L1 I2                1      R2    R2 L1 I2     NULL         NULL         NULL         NULL         NULL         NULL
7     - R2 L1 I1                1      R2    R2 L1 I1     NULL         NULL         NULL         NULL         NULL         NULL
10    - - R2 L2 I1              2      R2    R2 L1 I1     R2 L2 I1     NULL         NULL         NULL         NULL         NULL
13    - - - R2 L3 I1            3      R2    R2 L1 I1     R2 L2 I1     R2 L3 I1     NULL         NULL         NULL         NULL
4     - R1 L1 I2                1      R1    R1 L1 I2     NULL         NULL         NULL         NULL         NULL         NULL
5     - R1 L1 I1                1      R1    R1 L1 I1     NULL         NULL         NULL         NULL         NULL         NULL
8     - - R1 L2 I2              2      R1    R1 L1 I1     R1 L2 I2     NULL         NULL         NULL         NULL         NULL
9     - - R1 L2 I1              2      R1    R1 L1 I1     R1 L2 I1     NULL         NULL         NULL         NULL         NULL
11    - - - R1 L3 I2            3      R1    R1 L1 I1     R1 L2 I1     R1 L3 I2     NULL         NULL         NULL         NULL
12    - - - R1 L3 I1            3      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     NULL         NULL         NULL         NULL
14    - - - - R1 L4 I2          4      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I2     NULL         NULL         NULL
15    - - - - R1 L4 I1          4      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     NULL         NULL         NULL
16    - - - - - R1 L5 I2        5      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I2     NULL         NULL
17    - - - - - R1 L5 I1        5      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     NULL         NULL
18    - - - - - - R1 L6 I1      6      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     R1 L6 I1     NULL
19    - - - - - - - R1 L7 I2    7      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     R1 L6 I1     R1 L7 I2
20    - - - - - - - R1 L7 I1    7      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     R1 L6 I1     R1 L7 I1

The dashes are added in the final statement to show depth of each level of the hierarchy. Typically, these wouldn’t be included when extracting data for consumption by an analytical system. Notice in the above SQL statement that the data is pivoted on all 7 levels of the hierarchy. Also, the 7 newly pivoted columns are returned in the final select statement from the CTE. This can be adjusted based off the analyzed depth of the recursive hierarchy.

Finally, notice that the results are not displayed in proper hierarchical order. This is not a concern for extracted data that will be consumed by an analytical system. However, this may be useful for other purposes.

I have shown how to use recursive CTE’s and a pivot structure to transform a recursive hierarchy into one that displays the hierarchy across columns. The query structure performs well because it does not loop through the data to transform the structure. The next time you find yourself extracting recursive hierarchy to analysis models, consider using this method.

Relevant Insights

New Site Showcases Our Growth Into a Full-Service IT Consultancy

Over the past year, Core BTS has evolved. Simply put, we’ve amassed greater scale and expertise that, in combination with...

5 Steps to Reduce Your Ransomware Risk

As the recent ransomware attack on the U.S.’s second-largest meat producer, JBS, made clear, cyberattacks on critical infrastructure can cause...

How to Unlock the Organizational Value of Digital Transformation

As organizations look to stay competitive in today's dynamic and unpredictable marketplace, a trend has re-emerged that is ushering us...
X