Data Modeling—Just Don’t Show Me Any Crows’ Feet!

Featured
37501 Views
0 Comments
12 Likes

Data Modeling Overview

This is the second in our series of requirements modeling. In the previous article we defined what models are and what they are used for. We explored the close relationship between requirements elicitation and modeling. We also introduced the concept of Concurrent Requirements Modeling and gave a very brief explanation of each.

Now we delve into data modeling, one of the core model types. We choose to start here because data requirements are an important foundation for most information technology projects. If you are a business analyst and not doing data modeling today, a) we feel you should be able to at least read them to validate requirements against what a data modeler has created and b) our bias is that business analysts can and should be doing “functional” data modeling (as distinguished from technical or physical modeling).

Terms, Terms, and More Terms

What is a data model? A graphic representation of the data requirements of a business/business area. It gathers various assets such as business rules, requirements, and definitions for a business area into a cohesive whole. It can optionally include goals and strategies relating to information to guide a development team to ensure the data will support business needs. A data model, such as an Entity/Relationship Diagram (ERD), answers these questions:

  1. What information does the business care about?
  2. What are the facts about that information?
  3. Are those facts required?
  4. What are the business rules associated with that information?

 

Entities are people, places, things, processes or events that occur in a business.

Entities. Information that the business cares about are called usually called entities,[i] which include people (e.g. customer, subscriber, member, employee, passenger, student), but usually not job titles or roles. They also include things (order, shipment, location, insurance policy claim, course, class, reservation, flight segment).Another term called “concept” has been used to describe an entity, but we prefer the more traditional and more concrete term over the more abstract “concept.”

Figure 1 shows a few examples of entities along with some naming guidelines.

 
Naming Guidelines:
  • Use singular nouns
  • Business-oriented names best
  • Use synonyms if need be until final choice is made
  • Spell out names

 

Figure 1: Examples of Entities 

Attributes are facts about an entity and represent the detailed data requirements.

Attributes. Entities have facts about them called attributes. We also need to know whether or not these facts are required and possibly other limitations on those facts. During elicitation, for example, we might ask whether the end-user has to fill in this field before leaving the screen. In the examples below, do our stakeholders need to: Fill in passenger middle name? Flight date for a reservation? Services provided on a plane? These are business rules that stakeholders need to identify.

 
Figure 2: Examples of Attributes

Occurrences. Entities have instances called occurrences, meaning there must be more than one to be an entity.

Occurrences. Entities have occurrences, sometimes called instances. Occurrences mean that there are more than one, usually many, to be considered an entity. Not all business objects have occurrences meaning they are not entities. In the entity Passenger in Figure 2, there are numerous passengers. John Jones is a passenger, as are Mary Johnson and Laura Peterson. If an airline did not have lots of passengers, they likely would not be in business long.


Figure 3: Examples of Occurrences

Many students in our classes ask a common question: can’t we just call an entity a table, call attributes columns, and occurrences rows? Yes, of course we can. But should we? We could certainly go ask our stakeholders for names of tables and columns, but as business analysts, it is probably a good idea to talk their language rather than forcing them to understand our techie talk.

A relationship is a business rules showing how the data in one entity is associated with another entity.

Relationships. In a nutshell, relationships enforce the business rules surrounding entities. Business rules dictate how organizations want to operate, including their policies. Because they are business rules, they need to be articulated and confirmed by the business stakeholders. Over the years we’ve been on too many projects where the technical staff thought they “owned” the business’ data, so they routinely made assumptions about operational policies and procedures to their subsequent regret. Once the project has been implemented it is costly to make changes to these data business rules. Business rules usually apply across projects, so changing the business rule can means extensive maintenance headaches.

To elicit these business rules, we need to ask about the cardinality and optionality (multiplicity) of the relationship. However, if you want to scare your business stakeholders, try asking them about their rules using those terms. Very few would be able to articulate their rules using these terms. So what do these terms mean? Let’s have a look.

Cardinality and optionality. Each set of entities has two relationship business rules. Let’s look at the two entities below—building and office to explain these concepts. Cardinality has to do with how many of one entity relates to one other entity, namely is there only one or can there be many occurrences. Optionality describes whether the relationship is required.


Figure 4: Example Relationship

 

A Note on Notation (Data Modeling)Figure 4 shows an example of buildings and offices that has two cardinality rules, which we read as:

  • Each office contains any number of offices (Rule #1)

  • Each office is housed in a single building (Rule #2)

Or we might word the rules as:

  • Each office contains zero, one, or many offices but

  • Each office is housed in one and only one building

When we elicit cardinality rules, we usually like to ask for minimums and maximums. What is the maximum number of offices this building can contain? The minimum? What is the maximum number of buildings that can house each office? Minimum?

However, by asking about minimums, we are asking about optionality. It is virtually impossible to separate optionality and cardinality. When the answer to the question what is the minimum comes back as zero, it means that the relationship is not required, that it is optional.

Here are some example elicitation questions and answers to get at optionality. We tend to get the best results from the third example.

  1. Q: What is the optionality between building and office?

    1. : (Blank stares.)
  2. Q: What is the minimum number of offices contained in a building?

    1. :There has to be at least one office in the building.
  3. Q: In your system, when you’re setting up buildings, do you have to set up offices as well? Can you create a building without any offices?

    1. : We have to be able to set up buildings without any offices. We first create buildings in the system long before we have any information on offices. Each initial tenant can choose how much space they need, but we don’t start leasing space before we have an office and the information about an office.

The most common relationship is one to many. One-to-one relationships occur with type entities (fourth normal form), which we’ll discuss in a later article.

When there is a many-to-many relationship, it is usually resolved by creating another entity, commonly called an association entity. There are rules for creating this type of entity. Because it can be confusing to beginners, we tell students new to data modeling to remember that the crows’ feet always point to the new entity. Here is an example.

 

Figure 5: Many-to-Many Relationship

Association entities resolve Many-to-Many relationships by creating One-to-Many relationships to a new entity that associates the original entities.

The above Many-to-Many relationship is a problem in a data model and in a subsequent relational database because associations are hard to manage and queries are more difficult to navigate. They cause “repeating groups” of data (a basic normalization “no-no” in data modeling) and is beyond the scope of this article to explain. The usual resolution is to remove the M-M relationship and replace it with two 1-M relationships as shown in Figure 6 below.

 
An Association Entity

Figure 6: Association Entity to Resolve a M-M Relationship

To summarize, in this article we have:

  • Defined the terms entity, attribute, relationship, cardinality, optionality, and association entity and provided examples.

  • Provided examples of data business rules.

  • Provided examples of questions to ask to elicit data requirements.

Our next article will cover data normalization—what it is, why we need it, and the importance of going beyond third normal form.


Authors: Elizabeth Larson and Richard Larson, Watermark Learning

Elizabeth Larson, CBAP, PMP, CSM and Richard Larson, CBAP, PMP are Co-Principals of Watermark Learning, a globally recognized business analysis and project management training company. With over 30 years of industry experience each, they have used their expertise to help thousands of BA and PM practitioners develop new skills. Their speaking history includes repeat appearances at IIBA and PMI Global Congresses, chapter meetings, and professional development days, as well as BA World conferences.

They have co-written the acclaimed CBAP Certification Study Guide and The Practitioners’ Guide to Requirements Management. They have been quoted in PM Network and CIO magazine. They were lead contributors to the BABOK Guide® Version 2.0, as well as the PMBOK Guide® – Fourth edition.


[i] UML class diagrams also contain business information.

 



 




Copyright 2006-2024 by Modern Analyst Media LLC