If you’re involved in eliciting, modeling, analyzing, or consuming requirements for Business Intelligence (BI) projects, then this post is for you.
What is a Data Dictionary?
The technical definition of a Data Dictionary from the International Institute of Business Analysis (IIBA®) is: “A data dictionary is used to standardize a definition of data elements and enable a common interpretation of data elements.” – BABOK® v3.0
Data Dictionaries are often confused with another business analysis technique: Glossaries. These two techniques usually overlap. The terms in a Glossary may be included in a Data Dictionary, or vice versa.
However, the audience for a Glossary is quite different from that of a Data Dictionary. A Glossary is meant for business users; its purpose is to establish a common vocabulary within an organization, giving a clear definition of each term along with any known aliases.
A Data Dictionary, on the other hand, is meant for a technical audience. Business Intelligence Analysts on a BI project may use it to understand the attributes and source of each data element, as well as its destination and any transformations applied to the data.
Data Dictionaries contain a detailed list of:
- Data elements
- Their characteristics
- Possible values
NOTE: Data Dictionaries may also be referred to as “metadata repositories”.
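To make the three ingredients concrete, here is a minimal Python sketch of a data dictionary held as a plain mapping, with a helper that checks a record against it. The element names (`order_status`, `customer_email`), the characteristics chosen (`data_type`, `max_length`) and the allowed values are all hypothetical, picked only for illustration; a real dictionary would carry whatever characteristics matter to the project.

```python
# Hypothetical data dictionary: every name and value here is illustrative.
DATA_DICTIONARY = {
    "order_status": {
        "description": "Current state of a customer order",
        "data_type": "string",
        "max_length": 10,
        "allowed_values": {"NEW", "SHIPPED", "CANCELLED"},
    },
    "customer_email": {
        "description": "Primary contact email for the customer",
        "data_type": "string",
        "max_length": 254,
        "allowed_values": None,  # None means any value is acceptable
    },
}


def find_violations(record: dict, dictionary: dict) -> list[str]:
    """Return a message for every field that breaks its dictionary entry."""
    problems = []
    for name, value in record.items():
        entry = dictionary.get(name)
        if entry is None:
            problems.append(f"{name}: not defined in the data dictionary")
            continue
        if len(str(value)) > entry["max_length"]:
            problems.append(f"{name}: longer than {entry['max_length']} characters")
        allowed = entry["allowed_values"]
        if allowed is not None and value not in allowed:
            problems.append(f"{name}: {value!r} is not an allowed value")
    return problems


if __name__ == "__main__":
    print(find_violations({"order_status": "LOST", "region": "EU"}, DATA_DICTIONARY))
```

Keeping the entries as data rather than as code means the same definitions can be reused by a report, a load script, or a validation step.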
What Are the Elements of a Data Dictionary?
Data Dictionaries contain both primitive (singular) and composite (combined) elements:
Primitive (singular) data elements:
- Name: a unique name (which may be referenced by the composite data elements)
- Aliases: alternate names for the data element used by various stakeholders
- Values/Meanings: a list of acceptable values for the data element
- Description: the definition of the data element in the context of the solution
- Others: any other attribute that is important about the data (see sample below)
Composite (combined) data elements:
- Sequences: required ordering of primitive data elements within the composite structure. For example, the three primitive elements of a person’s name (First Name, Middle Name & Last Name) can be put together to create a new piece of data called the Customer Name: Customer Name = First Name + Middle Name + Last Name.
- Repetitions: whether one or more data elements may be repeated multiple times (not common)
- Optional items: may or may not occur in a particular instance of the composite element. In our example above, there may or may not be a Prefix (Miss, Dr., etc.) in front of the name.
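The Customer Name example above can be sketched in Python as follows. This is only one possible way to model it: the class names (`PrimitiveElement`, `Part`, `CompositeElement`), the `compose` method, the descriptions, and the choice to join the parts with a space are assumptions made for illustration, not part of BABOK® or any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimitiveElement:
    """A single data element with a unique name and a description."""
    name: str
    description: str
    aliases: tuple[str, ...] = ()


@dataclass(frozen=True)
class Part:
    """One slot in a composite: the primitive used and whether it may be omitted."""
    element: PrimitiveElement
    optional: bool = False


@dataclass(frozen=True)
class CompositeElement:
    """A composite element: an ordered sequence of parts."""
    name: str
    sequence: tuple[Part, ...]  # the order of this tuple is the required order

    def compose(self, values: dict[str, str]) -> str:
        pieces = []
        for part in self.sequence:
            value = values.get(part.element.name)
            if value is None:
                if part.optional:
                    continue  # optional items may simply be absent
                raise ValueError(f"missing required element: {part.element.name}")
            pieces.append(value)
        return " ".join(pieces)


prefix = PrimitiveElement("Prefix", "Courtesy title such as Miss or Dr.")
first = PrimitiveElement("First Name", "Given name", aliases=("Forename",))
middle = PrimitiveElement("Middle Name", "Middle name")
last = PrimitiveElement("Last Name", "Family name", aliases=("Surname",))

customer_name = CompositeElement(
    "Customer Name",
    (Part(prefix, optional=True), Part(first), Part(middle), Part(last)),
)

if __name__ == "__main__":
    print(customer_name.compose(
        {"First Name": "Jane", "Middle Name": "Q.", "Last Name": "Doe"}
    ))
```

Because Prefix is marked optional, the same composite definition accepts names with or without a title, while a missing First, Middle, or Last Name raises an error.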
Data Dictionaries are used to:
- Manage data within the context of a solution
- Standardize usage and meaning of data elements
- Complement other data models
What Does a Data Dictionary Look Like?
Here is a sample from the Business Analysis Practice Guide, published by the Project Management Institute (PMI®):
Example from: “Business Analysis for Practitioners, a Practice Guide”, published by PMI®
Data Dictionaries are extremely useful when working on any type of project that includes data. They ensure that each piece of data is clearly identified and that its attributes are accounted for. They provide a single source of reference when re-using the same data elements in different locations (such as on multiple reports). They also make clear the valid uses and values of the data.
Many data management systems have the ability to export a data dictionary – the functionality is essentially built into the tool, and this makes it much simpler to manage.
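As one illustration of this kind of built-in support, SQLite exposes column metadata through its `PRAGMA table_info` statement, and a few lines of Python can turn that into a basic data dictionary. This is a sketch under assumptions: the `customer` table, its columns, and the helper name `export_data_dictionary` are hypothetical, and other database products expose their metadata in different ways.

```python
import sqlite3


def export_data_dictionary(conn: sqlite3.Connection, table: str) -> list[dict]:
    """Build one dictionary entry per column of the given table.

    PRAGMA table_info returns rows of (cid, name, type, notnull, dflt_value, pk).
    The table name is interpolated into the statement, so it must come from a
    trusted source, never from user input.
    """
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [
        {
            "name": name,
            "data_type": col_type,
            "required": bool(notnull),
            "default": default,
            "primary_key": bool(pk),
        }
        for _cid, name, col_type, notnull, default, pk in rows
    ]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical table used only for this demonstration.
    conn.execute(
        "CREATE TABLE customer ("
        "id INTEGER PRIMARY KEY, first_name TEXT NOT NULL, prefix TEXT)"
    )
    for entry in export_data_dictionary(conn, "customer"):
        print(entry)
```

An export like this captures only the technical characteristics. Descriptions, aliases, and business meanings still have to be added by the people who know the data.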
Data Dictionaries are often used in combination with other data models: the details of the data are kept out of the models themselves, which reference the Data Dictionary instead. This keeps the models clean and provides a single source for the data definitions.
If the data dictionary is maintained manually rather than generated by a tool, it can be difficult to keep up to date. To be useful, it also needs to be accessible in a shared location to everyone working with the same set of data.
Data dictionaries are great tools for any data project, but they must be maintained to remain useful. They can also be misused and treated as a Glossary, which is not their intended purpose.