Friday, November 2, 2018

Three main categories to SQL and database performance

There are three main categories to SQL and database performance.
  1. SQL Coding Techniques
  2. Database Indexing
  3. Database Design
SQL CODING TECHNIQUES
SQL is a language and you can control how well it is being used. Most of these are things you should not do:
  • Avoid an equation on left side of a WHERE clause, such as WHERE Amount + 3000 > Past_Amount;
  • Never use function calls on the left side of a WHERE clause such as WHERE UPPER (Last_Name) LIKE β;
  • Do not concatenate columns in the WHERE clause as it cannot use the indexes; e.g. Last_Name || Given_Names;
  • Avoid UNION of datasets: UNION requires a sort operation to eliminate any rows that are duplicated across the two data sets;
  • Avoid SQL statements that improperly handle null values;
  • Code the most important selection criteria first in the WHERE clause.

DATABASE INDEXING
This is one of most important things that affects performance. But only a DBA can add or change indexes. So a developer has no control over this. You may not even have the authority to look at the database to see how it’s defined. What are the things that need to be looked at? The candidate action items are:
  • Ensure you always define Primary Keys (PK) and Foreign Keys (FK)
If these are missing then your database was not designed by a data modeller using a data model. A data model is not just a pretty picture. It automatically creates FK constraints and indexes when you join two tables. Moreover, the parent tables — usually your reference codes or dimensions — are the most important selection criteria in your BI queries and reports.
See What Makes Good Primary Key? – no, PKs do not need to be integers!
If there is no data model it probably means your database was not designed at all and probably is a victim of the “Just Build It” approach performed by developers. In other words, it’s a slum not an architecture. I have seen databases where only half the tables had a PK, which means you cannot have an FK, which means you cannot have Referential Integrity. RI is a far bigger and more important problem than performance.
  • Use Clustered Indexes appropriately
  • Ensure that the most Important Selection Criteria are Indexed
Selection Criteria are not only in your own SQL scripts but also implicitly in stored procedures and the SQL run by BI and other analytical tools. The candidates are:
  • Index Columns used frequently in a WHERE clause
  • Index Columns used in JOIN, GROUP BY, ORDER BY and TOP clauses
  • Index Columns used in Aggregation Clauses
  • Avoid Indexes with a Long Length
It is not true that integer keys are faster than character-based keys. That was true eons ago but not in todays databases. What is true is that a two-character key will be faster than a set of big integer fields or a > 30 variable character key.
  • Avoid Creating Tables with Less Than 6 Rows
If you use an index based on a table with 3 or 4 values (e.g. Gender F, M, T, and unassigned) you will be performing a full table scan — go for coffee. There will be probably be a Gender value of type F on every page of the table so the index is non-performant. Do not even do it.
P.S.: use a value list (aka enumeration) constraint instead.
DATABASE DESIGN

If the DB is poorly designed it may be near impossible or very inefficient to create the query your users want. For instance, one database created a separate table for each request form or document. Now management wants to get a cross-tab of the number of requests received, reviewed, approved or rejected and the average number of days spent in each approval stage by document type. Because the data is split between different tables, you would have to do multiple queries and then use the very inefficient UNION clause. That assumes that each table uses the same columns, names, definitions and business rules. Alternatively, the users extract the data to Excel and spend significant manual effort (days) massaging the data. The answer should really be possible in one query taking seconds.

1 comment:

  1. Another article from https://terraencounters.wordpress.com/2018/01/22/sql-skills-missing-in-developers/ without saying that it was written by Terra Encounters. All text and images are COPYRIGHTED to Terra Encounters. You have no right to base you web site on my writing. That is four posts already.

    ReplyDelete