Perhaps you are wondering why I have written a post on ‘Database or Spreadsheet?’ for a site focused on Business Resilience / Business Continuity (BRBC). There are many critical reasons, which I will discuss in this post.
Tools I mention in this post include BCM (recommendations) and Spreadsheets
I have known a lot of great BRBC people and unless they came from IT they can sometimes be at a disadvantage:
- Understanding which is a better solution for different use cases – a database or a spreadsheet
- Understanding the true power of a well-designed database solution and how to implement it properly
Using the wrong tool or implementing the wrong vendor solution can result in a nightmare for you and your organization. It can become:
- Life threatening to your employees
- Expensive for your organization
- A frustrating drain on your limited resources and very difficult to reverse for you and your team
In this post, we are going to stay away from complex tech jargon. We do not need to go there. We will keep it at a high level. Our goal is to get you comfortable with the basic concepts. When you consider building a solution in-house you will have a good idea which tool to use. When you get on a call or webinar with a vendor you will have a head start. Database concepts will not be completely foreign to you. Many people are not comfortable with database concepts and I can completely understand it. Hopefully, this short post will start you in the right direction and you can further research on your own with a simple database book or video. There are some very good ones available. If you prefer you can contact me for some suggestions. As you know my door is open to you.
In my experience people throughout corporations, education and the government, not just BRBC, often use the tool they are most familiar with rather than the one best suited for the job. The result can be like using a screw driver to hammer in a nail. This has become a pet peeve of mine. I think organizations are doing themselves and their customers a disservice when they think spreadsheet first and only. I am not a spreadsheet hater. I use spreadsheets. Spreadsheets can have great value. Sometimes a spreadsheet is the appropriate tool and sometimes it is not.
In Business Resilience, we work with oceans of data and it is increasing every day. Often we have to analyze how data ‘elements’ relate to each other. For example, creating a dashboard of the important metrics derived from hundreds or thousands of plans and determining the relationship of processes upstream and downstream. This type of data-analysis plays to the strength of databases. You need to be able to spin these large data-sets into insight, knowledge and decisions using the best possible tool. Picking the wrong solution will undermine this and it will waste your time.
Making the Decision – Database or Spreadsheet:
To be 100% transparent, I have written eight books on database development that have been used by professional programmers globally. Don’t worry, this post is not meant to be a database programming course and I am pretty sure most of you would not want it to be. I promise I am not going to get too techie here, as it would be counter-productive. I will pose some ideas and suggestions that will help you decide if a database or spreadsheet is the proper tool to use.
There are occasions where developing a simple database can be faster and much more valuable than building complex macros in a spreadsheet to make it ‘almost’ like a database. I will also point out use-cases where a spreadsheet is preferable to a database.
I will then conclude the post with some common database concepts and terms that will help level the playing field for you when speaking with your IT folks or vendors trying to sell you their expensive software solutions. When you ask them questions about fields, records, properties, data normalization, data models and API’s, they will see you on a different level. You will be in the top 1% of business users in terms of understanding the power of a database. The questions will help you select the best solution for your needs. It is critical to ask the right questions before you buy.
If fact, ask 10 of your non-IT business friends what data normalization is or what a data model is and I will bet close to zero will not have any clue what you are talking about. They may roll their eyes.
Most importantly, the information you will learn will allow you to get a feel for how robust a vendor’s solution really is, not strictly what they are telling/selling you. I mean, we trust them to a certain degree but they may be slightly biased in favor of their product, ya think? Poor database design can indicate a poor product and trouble ahead. Great database design can provide incredible value and can be a foundation to your program.
When you use one of the concepts such as an API or normalization in a question to a technology salesperson and he or she says, ‘hmmm I was never asked that before. Great question, I will speak with our tech people’…, let me know. You will make my day. I hear it from salespeople often and I hope you do too!
Spreadsheets in BRBC:
I am guessing you are pretty good with a spreadsheet. The clear majority of people I have met are more familiar with using spreadsheets than developing database solutions.
You probably know a spreadsheet is a paper based or software program that captures data in rows and columns. The data can be in one worksheet or in multiple worksheets in a workbook container. Excel and Google Sheets are examples of spreadsheets. I started with Lotus 1-2-3, back in the day. I have seen spreadsheets used for almost every application from word processing to a home spun FTP (file transfer protocol) program using VBA (Visual BASIC for Applications). In fact, I have built some very complex spreadsheets using VBA. In my opinion it is a fun and easy to learn language.
In a spreadsheet individual data elements such as first name, last name, city, state and zip code are captured in cells that intersect rows and columns in the worksheet. A1, B12, D5… sounds like a bingo card, right?
I commonly use spreadsheets for simple contact lists, vendor features/benefits analysis and sometimes for reports from my business continuity management tool or mass notification tool in which I need to apply a further level of numeric analysis beyond what I can do in the core vendor tool. I also use spreadsheets for checklists and simple project plans. I find each of these use cases shine in a spreadsheet.
Spreadsheets can get us into trouble when many people need to contribute or access information.
You have probably seen spreadsheets emailed to 10, 20 or 50+ people for input. You have multiple people updating a BIA or updating their contact information. Each person enters information and returns it to the sender. It then becomes cumbersome and error prone to integrate and present the results. Keeping the spreadsheet up to date can be a time sapping nightmare.
At that point, if you are good with spreadsheets, you may start designing complex macros, programming in VBA or developing cross-worksheet formulas and pivot tables to produce the metrics and summary results you and management need to do your jobs. Then, when you finally complete that effort, the need to produce additional metrics triggers another larger than necessary effort. This type of use case is ‘easy peasy’ using a more flexible database solution.
Spreadsheets also have the propensity to breed faster than rabbits. Silo’ed departments often have their own ‘secret’ version of ‘the truth’ or the ‘not so gold copy’ stored in spreadsheets out on their share drive, either on an official corporate drive or perhaps on their personal drive in the cloud, which is a huge security risk. This can get ugly real quickly when there are many versions of ‘the truth’. Which one do you believe? I have witnessed very expensive organization-wide data cleanup projects dedicated to trying to consolidate and ‘normalize’ this data, after the fact.
Overview of Databases in BRBC :
A database is a central repository that can manage vast amounts of data. If the database is designed properly gigabytes, terabytes, exabytes and even zettabytes of data can be sliced and diced any way you need it in fractions of a second. For example, a database with 1,000,000 employee records, 100,000 vendors and thousands of systems and organizational locations is trivial. A database comprised of 2,000 business continuity plans is ‘easy peasy’ for modern database systems.
A database is an excellent tool when you need input from many people AND when you want to present the data in different ways to a variety of users with varying permissions. You can easily control what they can input and view. You can easily hide data they do not have permission to see.
The information in a database is most often stored in tables (relational database) or in nodes (graph database). As of this writing relational databases are much more popular than graph databases, primarily because they have been around far longer. Most, if not all, HR systems, business continuity management systems and mass notification systems use relational databases.
Although graph database systems make up only a small portion of the database universe as of 2017 that will change drastically in the future, as there are many interesting uses for graph databases. Graph databases allow us flexible and easy analysis of millions or billions of data nodes. Advanced situational awareness and real-time gap analysis are two of many uses. Employee skill and certification information can be a graph database solution as well. Facebook manages their billions of users in a graph database. Graph databases can help us discover critical well hidden inter-relationships between entities in our organizations or any other facet of life. For example, graph databases are key to mapping DNA and discovering the causes of disease.
In future posts we will dig deeper into how we can leverage graph databases. In the meanwhile I will keep you up to date on graph database solutions in the free Ultimate Business Continuity Tips, Techniques and Tools Newsletter.
Basics of a relational database:
A relational database stores information (data) in one or more tables. Each table stores a common set of data in rows and columns. Employees, equipment, vendors, customers and applications should be stored in their own dedicated table, rather than one huge table.
A well designed database solution separates the presentation layer (what the user sees) and the data layer (the back-end data tables). This allows you extraordinary flexibility and control in presenting and accessing data to users. Make sure the layers are separated.
Tables are made up of the following:
Each table will be made up of ‘records’ and ‘fields’. Each field will store a particular data element which is a single piece of information. For example, in the employee table ‘first name’, ‘last name’ and ’employee ID’ would be stored in individual fields. Employee ID, if unique, can be a special field called a primary key. Each table requires a primary key that defines the uniqueness of each record in that table. In some instances, a primary key can be the concatenation (combination) of more than one field.
Primary keys and uniqueness may be a discussion point during your vendor calls and definitely when importing data to a BCM or mass notification system. I speak a little more about primary keys below, as it is an important concept for you to be comfortable with.
Records: are comprised of groups of fields. That would include the three fields I just mentioned and more – city, state… in total for one person which would constitute one record. For a Vendor record fields might include: Vendor ID, Vendor Company Name and Vendor Representative.
A simple analogy with a spreadsheet is that database fields are columns, and records are rows.
Tables: are comprised of multiple records. As I mentioned you will have an Employees Table, Vendor Table, Applications Table…
The amazing thing is, if the database is properly designed, data about a person, a process or any other asset might be spread across multiple tables but it will be easy to bring everything together in reports and display information at a detail or summary level.
Proper database design is an important concept. If the relational database is treated as a ‘flat file’, sort of like building one big spreadsheet in the database, it will cause trouble down the road. It can easily become an expensive re-design and it will cause you lots of nights thinking about how to fix the mess. Your database MUST use the power of proper powerful relational design.
Here are a few database terms you might hear on vendor calls or in meetings with your IT liaison. You do not have to memorize them. Just get a feel for them. You can also further research any that especially interest you:
Primary Key – is a special relational database table field (or combination of fields) designated to uniquely identify all records in each table. A primary key’s main features are: It must contain a unique value for each row (record) of data and the primary key cannot contain null values. An example of a primary key might be employee ID’s. Even if 10 people have the same first and last name they can be uniquely identified by the primary key. Primary key definition will be an important part of any BCM and mass notification tool discussion and implementation. When you update the records in the database you will most likely cut a file from your HR system and upload it to the BCM or mass notification tool to delete, update or add records. The match will be accomplished using the primary key.
Foreign Key – is a field (or collection of fields) in one table that uniquely identifies a row of another table or the same table. In other words, the foreign key is used in a second ‘related’ table, but it refers to the primary key in the first table. This is how the link between the tables is established. Often it is used in a one-to-many relationship, which we discuss below.
For example, your Accounting ‘process’ may use many types of software which need to be linked in their Business Continuity Plan. In that case ‘process’ would be the master table and software would be the child table. A ‘join’ would be created between the tables using each tables key field. The system will use these linked fields to produce reports in a variety of formats. Joins are really important.
Normalization – is the process of organizing the columns and tables to reduce data redundancy and improve data integrity. Normalization occurs by breaking out repeating data into related tables. In my experience, there are instances in database design where partial normalization makes more sense from a response perspective.
The three levels of normalization are:
First Normal Form
- Eliminate repeating groups in individual tables
- Create a separate table for each set of related data
- Identify each set of related data with a primary key
Second Normal Form
- Create separate tables for sets of values that apply to multiple records
- Relate these tables with a foreign key
Third Normal Form
- Eliminate fields that do not depend on the key
Don’t worry if the three levels of normalization do not make sense. The practical use of normalization is more important. Here it is… if you are implementing a BCM tool you should be very concerned with normalization and quality of data. You will most likely set up master tables for employees, equipment, applications and other dependencies. If the data is not normalized, what can happen is the same application can be called by multiple names – Microsoft Office, Office, Office 365… When it comes time to produce reports on a particular piece of software, you will have a mess on your hands.
The reporting demands will quickly turn into ‘kludgy’ (tech term for messy) solutions in an attempt to produce quality output. Worst case is, you can wind up with a very expensive cleanup and database re-design on your hands. Trust me, you DO NOT want to go there. The best solution is for the database to be designed properly upfront and the data should be automatically imported or manually entered into the system abiding by strict rules for consistency. Users can select dependencies to link to their plans, but your team controls the dependency data in the master tables.
Repeating data in tables can also turn into a maintenance nightmare if you need to change a piece of data throughout your system. You will have to find and change every occurrence if it throughout the database. By having it in one master table and linked to other forms when you need to use it, you will only have to change the information in one place and the change will automatically be reflected (ripple) wherever that piece of data is used in relationships. Change it once and the change is reflected everywhere in the system.
If you are fortunate enough to have corporate central ‘gold sources’ of data within your company with unique identifiers (primary keys) you may want to upload that data to your BCM systems master tables. You can even set up automated SFTP uploads to add, modify or delete data as often as required. The unique identifiers (primary keys) must not change, as that is what the BCM tool will use to match the records!
One-to-many relationship – this is the most common type of relationship you will encounter in whichever database system you use. A parent record in a table can refer to zero to many child records in another table. The child record can only have one parent record. For example, a department can link to many dependent applications, vendors or any other dependency or asset.
Data Model – this is a diagram of how the tables and the underlying data is connected and stored in a database system. To the trained database professional analyzing a data model can be very valuable in determining the normalization degree of a database and in the ability to create complex reports. In my experience some vendors are hesitant to release their data model. Having it can save you time. If you do not have it you can figure it out, if you know databases.
Structured Query Language (SQL) – a programming language designed for managing data stored in a relational database. I know many languages and I rate SQL very easy to use for basic updates and reports. It can get pretty complex depending on the type of table joins and actions you need to perform. You will most often be using inner joins. There are also outer, left, right, cross and self joins that can prove useful.
* In the ‘Golden Opportunities to Generate Beyond BC Value Streams’ post I discuss ‘Data Referential Integrity’. This is an important consideration when designing databases. It is also low hanging fruit when analyzing data problems in organizations. As data moves horizontally from department to department bad data can become very costly. You can provide extraordinary value identifying data issues and bringing them to the attention of management and IT. Please be sure to read the Golden Opportunities post.
The bottom line is databases might sound complicated but they are not. Just take it slow and build your skills at your own pace. It really is fun and it is a very valuable skill to know. As a BRBC professional no one expects you to develop complex databases. Knowing when to use a database as opposed to using a spreadsheet or another tool is very important to your success. Contact me with any questions, concerns or ideas you may have.