2025-02-13 Business glossary best practices
Brian Parish (IData CEO/Founder, Chief Product Officer)
Data Intelligence
Data Governance Framework
review, approval, consensus, engaging experts, steward, human curation, workflows
Data Catalog and Data Intelligence Content
content capture, inventory documentation, tech catalog, automation, and/or rapid entry
“Data dictionary” is often used in a functional-forward sense, which really describes a business glossary. The glossary should contain the technical definitions (how to interpret the data model).
Why do you need a business glossary?
Use cases
Agreeing on what full-time employee means
What is an at-risk client and how do we calculate it?
Should the current mailing address be pulled from the CRM, the data warehouse, or both?
Report listing customer satisfaction for licensed customers by product and by number of years licensed.
How do we get others to know about and use the same rules and calculations in the model for calculating at-risk clients?
How many undergraduate English faculty members are at your institution?
Writing business glossary definitions is very difficult.
Value of engaging with your business glossary
Consumption of info/data
Where is the data coming from
Answer questions on the meaning/source or calculation of data items
Requesting info and data deliverables
develop a common language and reference for communicating specific data needs with your analysts/developers
Creation/dev of data deliverables
using common language with requestor
reference existing definitions to reduce rework
adding new definitions
Data doc and curation - stewards and SMEs
authoring/reviewing/approving definition as needed for curation or creation of data deliverables
The business glossary should be cited as a reference.
Glossary contents
glossary name
functional definition
technical definitions (related data systems)
Ownership - define related domains and functional areas - who approves/reviews/authors definitions
synonyms/common names
synonym: another name for a definition, e.g. FTE and full-time employment
common names - e.g. connecting student to full-time student
Policy attributes - security classification, privacy status, access rules - associate with glossary terms
Quality attributes - valid values - is it required? is there a range for this?
Reference data - reference data lists maintained separately
Source - where did this definition come from? state, fed, vendor, accreditation
version/status - this version is the currently approved version. What constitutes a new version?
History/comments
Usage - related content
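As a rough illustration of the glossary contents listed above, a single entry could be modeled as a small record. This is only a sketch: the class name, field names, defaults, and the `matches` helper are assumptions made for illustration, not something prescribed by the talk or by any particular catalog tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GlossaryEntry:
    """One business glossary definition (all field names are hypothetical)."""
    name: str
    functional_definition: str
    # Technical definitions keyed by the related data system.
    technical_definitions: Dict[str, str] = field(default_factory=dict)
    owner_domain: str = ""
    synonyms: List[str] = field(default_factory=list)
    common_names: List[str] = field(default_factory=list)
    security_classification: str = ""  # policy attribute
    valid_values: List[str] = field(default_factory=list)  # quality attribute
    required: bool = False  # quality attribute
    source: str = ""  # e.g. state, fed, vendor, accreditation
    version: int = 1
    status: str = "draft"
    comments: List[str] = field(default_factory=list)

    def matches(self, query: str) -> bool:
        """True if the query equals the name, a synonym, or a common name."""
        q = query.strip().lower()
        candidates = [self.name, *self.synonyms, *self.common_names]
        return any(q == c.lower() for c in candidates)


# Placeholder entry; the definition text is intentionally not filled in.
entry = GlossaryEntry(
    name="Full-Time Employment",
    functional_definition="(to be authored by the steward)",
    synonyms=["FTE"],
)
```

Keeping synonyms and common names on the entry itself lets a search for "FTE" land on the one approved definition instead of creating a second one.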
Best practices
Names should be as specific as possible
Start Date vs Employee Original Hire Date
No data system or db specific names - system agnostic
Favor common names where applicable
easy to find by most people, use synonyms to help with search
The name must be unique across all glossary names
Include relevant context/source in name
salary vs US federal income
Specificity is critical
functional def should fully define the term avoiding any hidden assumptions
note any and all criteria, restrictions, scoping, and exceptions
functional definitions should provide enough specificity to inform a complete technical definition
functional definition should be agnostic to the data systems
Glossary definitions are connected/related
use links/references to basic terms for more specific terms
anticipated adverse event is a more specific context for an adverse event
use links or pointers
related glossary definitions become valuable emergent content - may spawn more definitions
Define context
Scoping
Agency/domain
Time context
Example 1
Address
Address street (scoping)
current address street (time)
current billing address (domain)
Example 2
Enrollment
Study enrollment
Current study enrollment
current study enrollment status
current study enrollment status for FDA approval
current study enrollment status for internal research
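The two examples above can be read as layering context onto a base term. The sketch below composes a name from those layers; the function name, parameter names, and the word order (time, domain, base, scope, then purpose) are assumptions chosen to reproduce the example names, not a naming standard.

```python
def qualified_name(base, time="", domain="", scope="", purpose=""):
    """Build a glossary name by layering context onto a base term.

    The word order is an illustrative assumption:
    time + domain + base + scope, followed by "for <purpose>".
    """
    words = [time, domain, base, scope]
    name = " ".join(w for w in words if w)
    if purpose:
        name += f" for {purpose}"
    return name


# Example 1: Address
street = qualified_name("Address", scope="Street")
current_street = qualified_name("Address", time="Current", scope="Street")
current_billing = qualified_name("Address", time="Current", domain="Billing")

# Example 2: Enrollment
fda_status = qualified_name(
    "Enrollment", time="Current", domain="Study",
    scope="Status", purpose="FDA Approval",
)
```

Making each layer an explicit argument makes it harder to leave time or domain context as a hidden assumption.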
Managing collisions
“we have 7 definitions for active student” vs “we have 7 different things named active student”
Free yourself from collisions by saying it’s OK: we can create 7 different definitions as long as it’s clear what they are.
3 types:
Data system naming used as definition name
Invoice ID, Customer Status Code
Common name used with different valid contexts
FTE, Department
Legitimate disagreement about definitions
active prospect, customer engagement
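One way to surface collisions like these is to group candidate definitions by name and flag any name used more than once, then give each its own context-qualified name. This is a minimal sketch; the function names, the parenthesized naming style, and the sample contexts ("Registrar", "Financial Aid", "HR") are invented for illustration.

```python
from collections import defaultdict


def find_collisions(entries):
    """Return {normalized name: [contexts]} for names used more than once.

    `entries` is an iterable of (name, context) pairs.
    """
    by_name = defaultdict(list)
    for name, context in entries:
        by_name[name.strip().lower()].append(context)
    return {n: ctxs for n, ctxs in by_name.items() if len(ctxs) > 1}


def disambiguate(name, context):
    """Make a colliding name unique by putting its context in the name."""
    return f"{name} ({context})"


# Hypothetical sample data.
candidates = [
    ("Active Student", "Registrar"),
    ("Active Student", "Financial Aid"),
    ("Department", "HR"),
]
collisions = find_collisions(candidates)
resolved = [disambiguate(n, c) for n, c in candidates[:2]]
```

This treats the seven "active student" definitions as seven different things that each need a distinct, clearly scoped name, rather than as a conflict that must end with one winner.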
People and roles involved in glossary
Types of involvement
authoring
contributing/collaborating
reviewing - approval, edits, feedback
assigning policy attributes - custom attributes
gaining understanding
researching - search and discovery, referencing, requesting
owning (accountability role)
Engagement
Technical definitions
Purpose
if the element can be retrieved or calculated in multiple data systems, you can specify a technical definition for each of them
guidance can be narrative, code, or visual (mapping)
specify the relationship to time
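To make the points above concrete, the sketch below attaches one technical definition per data system to a single glossary term, records whether the guidance is narrative, code, or visual, and states the relationship to time. The class, field names, and sample values (including the time descriptions) are assumptions for illustration only.

```python
from dataclasses import dataclass

GUIDANCE_KINDS = {"narrative", "code", "visual"}


@dataclass
class TechnicalDefinition:
    """How to retrieve or calculate a glossary term in one data system."""
    data_system: str
    guidance_kind: str      # one of GUIDANCE_KINDS
    guidance: str           # narrative text, code, or a pointer to a mapping
    time_relationship: str  # e.g. live value vs. as of the last load

    def __post_init__(self):
        if self.guidance_kind not in GUIDANCE_KINDS:
            raise ValueError(f"unknown guidance kind: {self.guidance_kind}")


def systems_for(definitions):
    """List the data systems that have a technical definition."""
    return sorted(d.data_system for d in definitions)


# Hypothetical technical definitions for "Current Mailing Address".
mailing_address_defs = [
    TechnicalDefinition(
        "Data Warehouse", "narrative",
        "(placeholder: describe the table and filter used)",
        "as of the last warehouse load",
    ),
    TechnicalDefinition(
        "CRM", "narrative",
        "(placeholder: describe the CRM field used)",
        "live value at query time",
    ),
]
```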
Glossary creation/review/approval workflow
How to start a business glossary
Gather any existing glossary content
Definitions from relevant standards organizations (be sure to acknowledge the source)
Identify 10-20 critical reports and document the reports to create the associated data definitions (emergent content)
Promote the key data model items to glossary entries
Data warehouse items are a good start
Create definitions as needed for new data requests
Hypothetical definitions (defining for defining's sake) are not a good use of time.
Other options, but expect burnout
Work from a list in meetings
Continue to document old reports