Your data is not enough

Updated: Nov 25, 2020

Trying to solve a problem with only data is like trying to understand a book by reading only a count of its keywords. You may get the gist of the topic, but you’ll never know the story - or whether you are even reading the right book.

Reading words and data is different from understanding them.

A pervasive thought in many organizations is that if we could only get all of the data together, then we could really figure out how to solve a problem, make a better decision, or generate exciting new revenue streams. This is a completely reasonable thought, since having all of the data could lead to all of those things. But blindly moving forward with collecting all of the data will lead to a lot of frustration:

  • Why aren’t we getting any insights from this data?

  • Why is it taking so long to put everything together?

  • We just invested a ton of money in this project. Why aren’t my analysts able to answer my questions?

The short answer is that getting the data together is only one of many steps toward solving your problem. Think of data as just words in a book. Certainly, each word has some meaning by itself, but the real value comes when all of the words are put together to convey an idea. There is a reason why libraries consist of books and not just the words that make up the books. Having a strategy to collect all of the data is equivalent to collecting all of the words ever written. In the end, you’ll have a dictionary. A dictionary has some uses, but no one ever goes to a dictionary to figure out how to solve a problem. Most data strategies today are only building dictionaries, and they are getting exactly the kind of boring results you’d expect from a dictionary.

Now, dictionaries still have their place in the world, so we shouldn’t completely throw out the concept of collecting data. The key is figuring out when to collect the data and, more importantly, what data we should be collecting. It may go without saying, but you should only start collecting your data once you know what data you want to collect. Similarly, most people wouldn’t go to a library before they knew what they were looking for. This requires that you define a clear problem, strategy, or goal, and I generally wouldn’t recommend setting a goal of collecting all of the books ever written just because you could. Defining exactly what data is interesting should be the hardest and most rewarding part of any analytics project. By the time you are done, you should be able to define the problem so well that you may not even need to collect the data at all. The answer or insight may be obvious. If the answer isn’t obvious, then it is only a matter of time and technology mechanics to collect and analyze the data to get an answer to your problem. Skipping this first step will most often end with you realizing you’ve been reading a summary of words from a poorly written children’s book on cats instead of the insights from the work of Socrates.

For the skeptics, consider the following word cloud of this blog post. Can you tell the core ideas of this post from the word frequencies alone? From a data and analytics perspective, it is fairly straightforward to extract the data from this blog post, clean it up, and put it into a graphical form, but the result doesn’t tell you the full story. The data from this document is not enough to convey its ideas, and your data alone is not enough to solve your problem.
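For readers who want to try this themselves, here is a minimal sketch in Python of the "extract and clean up" step, assuming a simple lowercase-and-split approach. The stop-word list and the function name `word_frequencies` are hypothetical choices made for this illustration, not the method used to produce the word cloud in this post. The two short sentences at the end are made-up examples showing how word counts can be identical while the stories differ.

```python
import re
from collections import Counter

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "that", "it"}


def word_frequencies(text):
    """Lowercase the text, split it into words, drop stop words, and count."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


# Two made-up sentences with the same words but very different stories.
first = word_frequencies("The dog bites the man.")
second = word_frequencies("The man bites the dog.")

print(first.most_common())
print(second.most_common())

# The counts match exactly, even though the sentences mean different things.
print(first == second)
```

The frequency tables for the two sentences are equal, which is the point of the post in miniature: the counts are easy to produce, but they cannot tell you who bit whom.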

Word cloud of this blog post

Interested in learning more? Contact us at


© 2020 by Heath Analytics, LLC