Anatomy of Data and Data Analytics
A Beginner's Guide for New Non-technical Product Managers
This post appeared in my substack newsletter on data, business and product analytics.
The words “data” and “analytics” probably did not mean much to product companies 100 years ago. I guess there was little or no data with which to analyze products, and data did not drive decisions at that time. Also, collecting data was probably a time-consuming and expensive task. In short, the product world was simple without data.
Now, things are different. Products and product-driven companies have grown mammoth and complex in their customer base, business, and the solutions they offer. The only way to demystify them is through the use of data and data analytics. The revolution in computing that began in the late 1960s has fueled the growth of data and data analytics. It also unlocked the power of mathematics and its application in everything we do today. Now, the mantra is:
Without big data analytics, companies are blind and deaf, wandering out onto the Web like deer on a freeway. — Geoffrey Moore, Author of Crossing the Chasm
If you google the phrase “data analytics,” you will see a flood of information and various tools to collect, analyze, and summarize data. There is a huge online community called “Kaggle” where passionate data experts discuss all sorts of data and products. I recommend you check it out.
So, how do you get into the world of data and data analytics if you are new to product management? What’s the starting point? Before you start using technology-based tools, I strongly recommend you get to know the anatomy of data and data analytics. Without this, it is going to be hard for you to prepare even simple metrics. Let’s get into this topic from the perspective of how data is described, collected, and analyzed.
Definition of Data
Data is simply the factual information you collect from various sources, in either physical or digital form. The Merriam-Webster dictionary defines data as factual information (such as measurements or statistics) used as a basis for reasoning, discussion, or calculation. In a more technical sense, data is a collection of factual information in qualitative or quantitative form. What I mean by that is that data can be descriptive or quantifiable. Let’s get into those in detail below.
Qualitative data is all about qualities and characteristics. It is non-numerical data that is generally collected by observing, labelling, listening, and describing the qualities of the object. Here is a list of methods to generate qualitative data.
- Image processing
- Symbol translations
- Listening to audio tapes
- Watching video recordings
- Observational notes
- Written notes and documents
And some examples of qualitative data you might see:
- Likes vs. dislikes
- Shapes and sizes
The use of qualitative data is very common in product management. In my product career, I have made quite a few decisions by looking at qualitative data. When I want to build something new, I look at really cool websites and read what their users say about the product in their user forums. I take notes from several websites and try to implement the same techniques. As a product manager, you can also generate qualitative data by attending tradeshows and conferences, talking to customers, collecting notes from internal sales and tech support teams, and holding one-on-one (1:1) meetings with customers.
Quantitative data is all about “how much” or “how many,” and it is statistical, measurable, and conclusive in nature. From a product perspective, it is about quantifying the product’s performance, the user experience, or the product’s success from a business point of view. Understanding and working with quantitative data is an essential skill for a product manager. It provides a clear picture of the “what” and “how many” aspects of a problem you are trying to solve for your customers. A few examples of how to collect quantitative data are listed below:
- Market research
Using statistical procedures, it is much easier to analyze and summarize quantitative data compared to qualitative data. Most products these days have built-in or third-party metrics tools to collect quantitative data. A cool thing about these tools is that they run continuously and can generate reports at almost any time.
Another thing to note about quantitative data is that it comes in two different types: discrete data and continuous data. “Discrete data” consists of separate, countable values and cannot be broken down into an infinite number of pieces. However, you can split the data into smaller groups. For example, user login counts are discrete data. On the other hand, “continuous data” can take any value within a range and can be broken down into an infinite number of pieces. A good example of this is the time spent by a user per session.
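To make the distinction concrete, here is a minimal Python sketch. The login counts and session durations are made-up sample values, not real product data, and the variable names are only for illustration.

```python
from statistics import mean

# Hypothetical sample data for five users (illustrative values only).
login_counts = [3, 7, 0, 12, 5]                   # discrete: whole-number counts
session_minutes = [4.25, 12.8, 0.5, 31.02, 7.6]   # continuous: any value in a range

# Discrete values here are whole numbers; nobody logs in 2.5 times.
all_whole = all(isinstance(n, int) for n in login_counts)

# Continuous values can always be expressed more finely (minutes -> seconds).
session_seconds = [round(m * 60, 1) for m in session_minutes]

# Both kinds can be summarized; the average of discrete data may be fractional.
avg_logins = mean(login_counts)
avg_minutes = mean(session_minutes)

print(all_whole, avg_logins, round(avg_minutes, 2))
```

Notice that the average of a discrete measure does not have to be a whole number, even though each individual value is.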
Now that we know how data is classified from a descriptive standpoint, let us look at how it is classified from a collection standpoint.
Structured Data, as the name implies, is organized data in a standardized format that is easily readable and searchable. The data is mostly kept in tables or spreadsheets, with each record in a row and each attribute in a column. It is often categorized as “quantitative data.”
Some benefits of Structured Data:
- It can be easily placed in a dataset to create data models.
- It works well with relational databases and can be queried using SQL.
- Datasets can be easily joined, mapped, created, and updated.
Examples of Structured Data:
- Names and dates
- ERP system data
- Product user data
- Identification numbers
- Bank statements
For data analysis, it is better to have Structured Data as it is much easier to work with compared to Unstructured Data.
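As a rough illustration of why Structured Data is easy to work with, the sketch below stores two small tables in an in-memory SQLite database and joins them with SQL. The table names, column names, and rows are all hypothetical.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, signup_date TEXT)"
)
conn.execute("CREATE TABLE logins (user_id INTEGER, login_date TEXT)")

# Each row is one record; each column is one attribute.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Asha", "2021-01-05"), (2, "Ben", "2021-01-09"), (3, "Chen", "2021-02-11")],
)
conn.executemany(
    "INSERT INTO logins VALUES (?, ?)",
    [(1, "2021-02-01"), (1, "2021-02-03"), (2, "2021-02-02")],
)

# Join the two tables and count logins per user, including users with none.
rows = conn.execute(
    """
    SELECT u.name, COUNT(l.user_id) AS login_count
    FROM users AS u
    LEFT JOIN logins AS l ON l.user_id = u.user_id
    GROUP BY u.user_id, u.name
    ORDER BY login_count DESC, u.name
    """
).fetchall()

for name, login_count in rows:
    print(name, login_count)

conn.close()
```

Because both tables share a `user_id` column, one query is enough to combine them. That is the kind of joining and mapping that is hard to do with free-form text or images.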
As opposed to Structured Data, Unstructured Data is not organized and is not presented in a way that makes it easy to read and search for key information. In other words, it is everything else that is presented in a non-standardized format. When I say “non-standardized,” I mean that the data is not presented in a tabular or otherwise recognizable format. Unstructured Data does have an internal structure, but that structure is not defined in a tabular format. This type of data does not fit neatly into a predefined data model or schema. Therefore, it is much more difficult to search, manage, and analyze.
An interesting finding posted by Bernard Marr on Forbes is that 90% of data is unstructured and that it is growing at 55–65% each year. It comes in various formats such as emails, SMS, social media posts, images, text, audio and video files, graphs, and so on, and is often categorized as qualitative data. What’s driving this rapid growth in Unstructured Data? Since the genesis of the internet, the amount of Unstructured Data has grown drastically because of digital content creation and sharing. It is one of the reasons why data science powered by Big Data, Artificial Intelligence, and Machine Learning has gained huge traction lately and continues to evolve.
Unstructured Data is generally stored using modern methods like NoSQL (non-relational) databases or Data Lake technologies. One good example of a data format used with NoSQL databases is JSON, a text format that represents data as key-value pairs.
In sum, Unstructured Data is not easy to slice and dice to get insights. Nowadays, businesses that deal with this type of data are finding more innovative ways to get predictive and proactive insights.
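For a feel of what JSON looks like, here is a small sketch that parses a made-up support-ticket record and pulls a few fields into a flat, table-friendly row. The record and its field names are hypothetical.

```python
import json

# A hypothetical support-ticket record; field names are illustrative only.
raw = """
{
  "ticket_id": 1042,
  "customer": {"name": "Asha", "plan": "pro"},
  "tags": ["billing", "refund"],
  "message": "I was charged twice this month."
}
"""

record = json.loads(raw)  # parse JSON text into Python dicts and lists

# Pull a few fields into a flat row that would fit a table or spreadsheet.
flat_row = {
    "ticket_id": record["ticket_id"],
    "customer_name": record["customer"]["name"],
    "plan": record["customer"]["plan"],
    "tag_count": len(record["tags"]),
}

print(flat_row)
```

The nested keys and the list of tags can be flattened fairly easily, but the free-text `message` field still needs a human reader or text analysis to turn it into something measurable.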
To present these two data types visually, I found a great infographic created by Three Graces Legal. The creator of this graphic clearly demonstrated the difference between Structured and Unstructured Data in the word cloud of examples. Here is the graphic.
Now, before I conclude this article, let us understand a little bit about data analytics.
What is Data Analytics?
The word analytics comes from the Greek word “analytika,” which means “science of analysis.” It generally refers to a group of analyses you perform to meet a common objective using established statistical techniques and number crunching. Today, however, data analytics means the use of processes and technology, typically software applications, to extract valuable insight from datasets.
As Geoffrey Moore puts it in his quote (provided at the beginning of this article), big data analytics has become an essential part of modern businesses. Advanced developments in data analytics are helping businesses become more data-driven. That means key decisions are guided through the analysis of data. It also helps businesses predict problems before encountering them and figure out appropriate solutions in a timely manner.
There are four types of data analytics commonly performed by data-driven companies.
- Descriptive Analytics
- Diagnostic Analytics
- Predictive Analytics
- Prescriptive Analytics
Descriptive Analytics is the most basic and foundational form of data analytics. It provides answers to questions such as How? When? What happened? Since it is a situation-based or event-based analysis, it uses historical measurements or data and renders its findings after the fact. A good example is showing the trend in the number of user sign-ups since a product launched.
Typical questions Descriptive Analytics answers:
- What is the average customer lifetime value?
- What post is getting the most engagement in North America?
Descriptive analytics is a very common type of analytics used by data analysts.
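The two typical questions listed above can be answered with a few lines of Python. This is only a sketch: the customer and post records are made-up, and real data would normally come from a metrics tool or database.

```python
from statistics import mean

# Hypothetical records; every value here is made up for illustration.
customers = [
    {"id": 1, "lifetime_value": 120.0},
    {"id": 2, "lifetime_value": 340.0},
    {"id": 3, "lifetime_value": 80.0},
]
posts = [
    {"title": "Launch notes", "region": "North America", "engagements": 410},
    {"title": "Pricing FAQ", "region": "North America", "engagements": 655},
    {"title": "Roadmap", "region": "Europe", "engagements": 900},
]

# What is the average customer lifetime value?
avg_clv = mean(c["lifetime_value"] for c in customers)

# What post is getting the most engagement in North America?
na_posts = [p for p in posts if p["region"] == "North America"]
top_post = max(na_posts, key=lambda p: p["engagements"])

print(f"Average CLV: {avg_clv:.2f}")
print(f"Top post in North America: {top_post['title']}")
```

Both answers only summarize what has already happened. They say nothing yet about why it happened.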
Diagnostic Analytics is the first layer beyond descriptive analytics, the juicy part that everyone wants to taste. It answers the question, “Why did something happen?” When I prepared a monthly user retention chart and presented it to my manager, I presented only descriptive analysis. The very first thing I heard from my manager was, “Why did user retention go down?” To answer this question, I needed to do Diagnostic Analytics. I hope you get the picture.
Diagnostic Analytics is not completely different from Descriptive Analytics. It starts from the same observations, but it explores the mechanics behind them. It looks for patterns in the data and attempts to identify relationships.
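One simple way to look for such patterns is to break a single number down by segment. The sketch below splits retention by acquisition channel. The channel names and user records are hypothetical, and segment breakdown is just one possible approach.

```python
from collections import defaultdict

# Hypothetical users: acquisition channel and whether they returned this month.
users = [
    {"channel": "organic", "retained": True},
    {"channel": "organic", "retained": True},
    {"channel": "organic", "retained": False},
    {"channel": "paid_ads", "retained": False},
    {"channel": "paid_ads", "retained": False},
    {"channel": "paid_ads", "retained": True},
    {"channel": "paid_ads", "retained": False},
]

# Count retained users and total users per channel.
totals = defaultdict(int)
retained = defaultdict(int)
for u in users:
    totals[u["channel"]] += 1
    retained[u["channel"]] += int(u["retained"])

# Retention rate per segment; a large gap suggests where to dig further.
retention_by_channel = {ch: retained[ch] / totals[ch] for ch in totals}

for ch, rate in sorted(retention_by_channel.items()):
    print(f"{ch}: {rate:.0%}")
```

A gap between segments is a pattern worth investigating, not proof of a cause. It tells you where to look next.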
A few of the common techniques are:
As the name implies, Predictive Analytics will answer the question, “What will happen?” Using historical data, it attempts to forecast what is likely to happen in the future. This is where things get interesting. Understanding the future trend will help you make better decisions. So, going back to the monthly retention chart I presented to my manager, the next question I get is, “How many more users will we lose next month?” To get the answer, I do Predictive Analytics.
This type of analytics is highly technical. Experienced data scientists or data engineers generally perform it. They use methods like:
Predictive Analytics is used in many industries. For example, banks detect possible fraud, manufacturers detect possible maintenance requirements, and retailers look for up-sell opportunities using Predictive Analytics.
Well, you guessed it right. Prescriptive Analytics is like a doctor prescribing a specific medicine that she is confident will treat an illness. It analyzes extremely complex data to come up with a specific recommendation. Very few organizations perform this type of analytics, which relies on advanced analytical techniques such as artificial intelligence, machine learning, and neural networks. When the analysis is done properly, business leaders and product managers can translate the prescriptions into goal-oriented product or business strategies.
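To show the idea of forecasting from historical data in its simplest form, the sketch below fits a straight line to made-up monthly churn counts and extends it by one month. A straight-line trend is an assumption chosen for simplicity here. Real predictive models are usually far more sophisticated.

```python
# Hypothetical monthly counts of churned users (made-up numbers).
months = [1, 2, 3, 4, 5, 6]
churned = [40, 44, 47, 53, 56, 60]

n = len(months)
mean_x = sum(months) / n
mean_y = sum(churned) / n

# Ordinary least squares for a straight line: y = intercept + slope * x.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, churned)) / sum(
    (x - mean_x) ** 2 for x in months
)
intercept = mean_y - slope * mean_x

# Extrapolate the fitted line one month ahead.
next_month = months[-1] + 1
forecast = intercept + slope * next_month

print(f"Forecast churn for month {next_month}: {forecast:.0f}")
```

The forecast is only as good as the assumption that next month will follow the same trend as the past six.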
Some examples of techniques used in Prescriptive Analytics are:
- Bayes classifier (a machine-learning algorithm that uses conditional probabilities to estimate how likely an event is)
- ID3 (another machine-learning algorithm, which builds a decision tree of possible outcomes from a dataset)
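To give a flavor of the conditional probability behind a Bayes classifier, here is a minimal sketch that applies Bayes' theorem to a single signal. It is not a full classifier. The counts are made-up, and the "inactive for 30 days" signal is a hypothetical feature chosen for illustration.

```python
# Hypothetical counts from past data (made-up numbers).
total_users = 1000
churned_users = 200            # users who cancelled
inactive_and_churned = 150     # churned users who were inactive for 30 days
inactive_and_stayed = 100      # retained users who were inactive for 30 days

# Prior and likelihoods estimated from the counts.
p_churn = churned_users / total_users
p_inactive_given_churn = inactive_and_churned / churned_users
p_inactive_given_stay = inactive_and_stayed / (total_users - churned_users)

# Bayes' theorem: P(churn | inactive) = P(inactive | churn) * P(churn) / P(inactive)
p_inactive = p_inactive_given_churn * p_churn + p_inactive_given_stay * (1 - p_churn)
p_churn_given_inactive = p_inactive_given_churn * p_churn / p_inactive

# Classify by picking the more probable outcome.
prediction = "churn" if p_churn_given_inactive > 0.5 else "stay"
print(f"P(churn | inactive) = {p_churn_given_inactive:.2f} -> {prediction}")
```

Turning a probability like this into a recommendation, such as which users should receive a win-back offer, is the additional step that makes the analysis prescriptive.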
Yet another long post. I hope it was not a boring read. I aimed to give you a complete picture of the anatomy of the data and data analytics here. I hope I have covered it all. The key takeaway from this post is that data and data analytics play an instrumental role in product management. Understanding the above concepts will surely put you in a better place if you are stepping into product management with minimal knowledge of data and data analytics.
Thanks for reading!
There is a sea of information on this topic on the internet. What is covered here is curated from that sea specifically for product managers as a beginner guide. For more information, I highly recommend you check these sites.