Friday, November 2, 2018

10 Most Critical Database Mistakes

Top 10 Database Design Issues

After well over 30 years in the industry, I am amazed how some basic problems still exist.  It is like the industry has not matured at all.  You can look through the Internet and see similar problems described in the past.  The list is not in order of significance but some of the biggest issues are the last three.
Is a Data Model Just a Pretty Picture?

Summary of Top 10 Design Issues

  1. Assigning Atrocious Element Names
  2. Missing Data Documentation
  3. Disregarding Normalization
  4. Misusing SQL Functions and Procedures
  5. Ignoring Data Integrity
  6. Dismissing Interoperability & Reuse
  7. Promoting Poor Performance
  8. Dismissing Data Modelling & Design
  9. Ignoring Object-Oriented Design
  10. Bypassing Data Management
Each post will discuss one of these issues.  If you can think of some examples from your experience, please share them as comments.

1. Assigning Atrocious Element Names

Should names of entities and attributes be a developer’s personal choice?  Do we design IT solutions for the benefit of the developer or the Business?  If you agree it’s the Business, then why would you use names that do not clearly explain the purpose of the data element?  For example, what does the column name CODE really mean?  Yes I have seen that column name!
In fact, when management says they want a data dictionary, this is really a reflection on the fact they do not understand the meaning of the names that were used.  Attribute/column names must be clear in meaning.  Future users and developers should not have to search endlessly for documentation – if it can be found – that explains their meaning.
One space should be used between words because that is how human languages work and that is what should be shown to Business users in conceptual/logical models and data dictionaries.  Spaces improve readability.  Moreover, the Physical Data Model automatically changes the space to an underscore with no effort on the part of the modeller.

Abbreviations

Badly Named Table
Badly Named Table
If you agree that a database/system should be designed for the Business, then why would you use unintelligible, highly-abbreviated names without vowels?  For instance, what does X20 mean?
Yes, there is a limit of 30 characters for names in Oracle.  But one should be able to name an element with 30 characters.  In fact, developers drop vowels and use totally non-standard abbreviations even when the fully-spelled name would have been less than 30 characters.  This has to stop.
Badly Named Table2
To improve this situation on my projects, I developed a routine to convert all physical names to improved conceptual business names.  The abbreviations were expanded to business language and very few exceeded the 30-character limit.
If the designer has to use abbreviations, it should be ones that are used throughout the world, generally accepted in the industry, found in the dictionary or the www.abbreviations.com web site.  For example, the standard abbreviation for elevation is ELEV not ELVTN.
How do you manage and standardize acceptable abbreviations?  You make a “small” list that developers are allowed to use and vet all names against that list.  There is no point abbreviated Date to DT to save a mere two characters.  Also consider clarity.  If you abbreviate Street to ST how does the reader know if ST means street, saint or standard time?  Clearly, one should focus on abbreviating longer words.  Here is a example list:
Table 1  Standard Abbreviations Sorted by Meaning
Abbreviation Meaning
Abbrev Abbreviation
Accom Accommodation
Admin Administration
Desc Description

Name Prefixes and Suffixes

Prefixes and suffixes are popular in application languages but they should not be used in data models and databases.
  • Names should not contain data type prefixes or suffixes like txtGender.
That is developer-speak.  You look in the database or the data dictionary IN the data model to know data types and other metadata.  A similar example is the superfluous use of “text” when describing a message or description, or “number” when describing a count column; e.g. RETURN_MESSAGE_TEXT or PERSON_COUNT_NBR. 
  • Names should not contain system or table prefixes like ERS_GENDER_TYPE, AMS_ADDRESS or CRM_ERROR_MSG. 
It takes up too many characters when name is limited to length 30.  Moreover, a data modelling tool will migrate the key of a parent table to the child table AS IS.  It will not change Country Code to Purchase Order Country Code, Requisition Country Code…  Similarly, a good data modelling or database tool will automatically join tables using the named data elements that are common to both tables.  Another way to look at it is element reuse…

Element Reuse

  • Table and Data Elements are meant to be reused.
In fact, if you believe in object-oriented design then data elements that mean the same thing should have exactly the same name.  Normally database elements χ1 and χ2 are physically separate entities even if they have the same name.  But it tells the users they are the same.  It also means that if I search for element χ then all instances of element χ will be found. 
  • Some Data Modelling tools enable Table and Data Element reuse.
In fact, the best data modelling tools allow model element reuse and inheritance.  This means that if I update element χ then all instances of element χ in the model will be modified.  This facilitates changes to the description and other metadata in the base object as you know they will be automatically inherited by all other instances – a real improvement for data dictionaries and metadata repositories. 

If your data modelling tool does not allow inheritance, then use the DOMAIN concept.  A DOMAIN applies a set of metadata properties to all members of the Domain.  The data elements do not have to have the same name but will have the same data types, defaults, maximum values and other constraint rule properties.  DOMAIN is an actual SQL construct that dates back to the origin of Relational Databases.  

1 comment:

  1. Once again you are posting my material without any attribution to Terra Encounters or the source URL https://terraencounters.wordpress.com/2018/10/10/10-most-critical-database-mistakes-part-1/. This is plagiarism. You need to remediate this ASAP. change this post to provide my credentials OR delete the post. You did NOT write this material.

    ReplyDelete