Have you ever been in a meeting where all the reports show you outstanding numbers and only green lights? You wonder how that could be possible when your experience on the frontline of the business reflects a different reality. There are two potential answers:
- People are lying about the reports and corrupting data voluntarily.
- The data is wrong and does not reflect the reality.
It is my hope that you work with honest people who apply best practices when processing and presenting data. In most cases, however, it’s the latter issue that is most likely to be causing the discord, and that’s the one I’m going to address. In this blog, I’d like to present you with a few tips on how to set up a proper database for your organisation––in particular, if you use multiple sources for your reporting.
Do a quick troubleshoot
Start small; there's no need to change the world in a single day. Compare the data shown in the report with your experience, it’s likely that your colleagues had the same reservations about the results, and you’re all probably right! Ask yourself the following questions to start a troubleshoot:
- Is my data up to date?
- Do my reports contain multiple data sources?
- How are the data attributes set up? (Data Configuration, UPDATE, DELETE, CREATE)
- How is my data collected?
- How are my metrics calculated?
- Is my report chart the correct one to showcase the learnings?
Is my data up to date?
It might sound silly, but if your data uses figures from a different desired data range, the report won’t be realistic. Ensure that your database administrator gave you the latest information or the relevant data set you need for your report. It's also your responsibility to ask for the exact data set you need. Check in with the administrator to verify how frequently the data is refreshed.
Imagine that you have two systems, A and B. System A sends raw data to System B only once a week. You rely on your System B for that set of data and do your reporting on Wednesday. However, the interface (the connector between System A and System B) operates only on Thursday. This means you will always be one week late with your reporting.
Do my reports contain multiple data sources?
If your data is all coming from a single source, you might skip this part. However, if you collect data from multiple sources you need to ensure that your data is consistent and accurate. In my experience, multiple sources of data often corrupt your report organically if you try to measure KPIs which are actually shared between systems. What does that mean?
Here is an example:
You are collecting data from Facebook Business Manager (the Facebook Advertising Tool) to measure cost performance and you also use Google Analytics to compare conversion performance; you might have discrepancies. Why? Facebook and Google have different attribution models. Facebook and Google don’t align in their methodology of calculating conversions. As a result, you will have discrepancies in your report if you simply combine data from two different sources sharing common metrics.
How are data attributes set up?
How did you or your system administrator set up the attributes in your database? Do they fit your business requirements? What I mean by that is easier to explain with a real-life situation. Let's pick the example of a CRM system where you collect information about your clients such as occupation, age, and type of subscriptions they have with your company (imagine a SaaS environment) or products that your company sells. Let's do a small exercise here and try to picture how these three customers’ attributes need to be configured in your organisation. [My recommendations can be found at the bottom of the article]
Which fields are required? If data is collected from multiple sources for the same field, do you ensure the consistency of your data? If a report contains empty fields, you might revise if this is actually what you expected. This might also be the expected behaviour as empty values also have a significance.
There are numerous types of fields which I will not explain in an exhaustive manner but only mention the most frequent ones:
- numeric value
- boolean (true or false/0 or 1)
The way you structure your data will massively influence how it is connected between these fields and the ultimate outcomes of your report. Think about your business needs to structure the data. I've seen companies collecting a customer's occupation attribute with a text field. This can result in you having 10'000 different values in your database with plenty of errors. Other companies collect data for this field with a dropdown menu (pre-selected values that you can choose from – only one choice is applicable), ending up with the majority of their clients being "occupation = other".
This is a rather technical part which you may not be familiar with, but it's important to understand the mechanisms. How data are created might depend on the methods of your interface (or the connector between System A and System B). I will simply mention here two basic functionalities:
- Create/Insert: This is the most basic method when creating a new data record – the value will simply be parsed into your attribute field.
- Update/Overwrite: This is a very important feature where you should decide if your field should be updated or not. Do you want the customer to update his address? Yes, probably. Do you want your customer to update his password? Yes. Do you want your customer to update his date of birth? Maybe not! This information should not vary unless your customer made a mistake upon registration. Think about your services where a special package is only marketed to people below 18 (e.g. 50% off on annual phone plan). If you give the customer the ability to change his or her date of birth, you may encounter someone who has eternally remained in their teens.
Thank you for reading! The second part (a bit more technical and specific) will be published in a few weeks, as I thought it would be too long, in particular for an introduction to data troubleshooting. If you're interested to work on a project with us, just click here:
Please find below my recommendations for the exercise about the setup of 3 attributes for your organisation.
Recommendations for the exercise
How is this information relevant for your organisation? Keep in mind that some people might have two occupations! People also change jobs –– how do you keep that information up to date? Do you have any client portals so they can amend this information or do you collect only when they first register? Is this field also required in your organisation?
Do you want to collect the age of your customers or rather their birth date? As the famous motto states "age is just a number", but it's also a dynamic value which evolves from the date of birth. So you might want to create a date field to collect the customer birthdate and ask your system administrator/data scientist to create a formula field/calculated field to return the correct value into age (basically today's date - date of birth).
Most probably you already know your products/services so you don't want to leave this field open for the client but rather create a dropdown. The concern here is to understand the way your products/services are connected to your client. Can they subscribe to multiple services at the same time? If your client can have only one option, you might set up a dropdown field for this attribute; however, if a customer can have multiple services at the same time, you should opt for a multi-select dropdown field.