In previous posts, I’ve written about the growing need for business professionals with strong data literacy skills. In this post I want to start unpacking the various aspects of data literacy and provide guidance on what you, as a non-analyst, can do to really up your game and stand out in your data analytic culture. Specifically, we’re going to dive into the concept of data analysis comprehension and the levels of comprehension needed to be effective with data. One of these levels matters most, and it is the one non-analysts should focus on.

## Data Analysis Comprehension as a Concept

Google the term “*How to understand data analysis*” and you’ll get results that largely capture the general data analytic process: defining a research question, defining measures of inputs and outcomes, collecting your data, analyzing your data, and interpreting the results. While these results make good sense and probably answer the original intent of the searcher, they overlook the varying levels of comprehension that exist for any given analysis and how you can “understand” an analysis at each level in that structure.

The three levels of data analysis comprehension are presented in Figure 1 below.

From most general to most specific, they are:

- Foundational
- Mathematical
- Programmatic

## The Foundational Level

The foundational level of comprehension is the most basic: a *general sense of* **“what does this analysis do?”** For example, let’s say you know that the average age in the U.S. population is 38 years old. You capture data from 30 customers in your database and want to know whether your customer base is older or younger than the U.S. population. The average age in your customer sample is 39.5, which makes them slightly older; however, this could have happened because you randomly had a few older individuals who purchased from you. With more data, the average customer age might come down some. The one-sample t-test helps account for that kind of potential random difference.

So, if you understand that a one-sample t-test is used to determine whether there is a difference between an average calculated from a small sample of data and an estimate of the average in the population, then you have foundational comprehension. An additional piece of foundational comprehension is knowing that a statistically significant result indicates that there is likely to be a difference between the average age of your customer base and the average age in the U.S. population. If the result is not significant, then the data do not show a reliable difference, and you would treat your customer base as statistically similar to the U.S. population in terms of average age.

At this level, the specific mathematical formulas and proofs are not important. Rather, your interest is in understanding the end goal of the analysis, and the specific criteria that differentiate this analysis from other types of analyses. You are also interested in understanding how to interpret the results with respect to your original question (i.e., *is my sample average meaningfully different from the population average?*). It is important to note, however, that while foundational comprehension does not require mathematical or programmatic comprehension, the reverse is not true. The more specific levels of comprehension require understanding of more general levels of comprehension.
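To make the "random difference" idea from the customer-age example concrete, here is a minimal Python simulation sketch. It assumes, purely for illustration, that ages are roughly normally distributed with a standard deviation of 22 years (a hypothetical figure, not taken from any data in this post). It then asks how often a random sample of 30 people, drawn from a population averaging 38, would show an average of 39.5 or higher by chance alone.

```python
import random
import statistics

POP_MEAN = 38.0       # U.S. average age used in the example above
POP_SD = 22.0         # ASSUMPTION: hypothetical spread of ages, for illustration only
SAMPLE_SIZE = 30      # number of customers sampled
OBSERVED_MEAN = 39.5  # average age found in the customer sample
TRIALS = 10_000       # number of simulated samples

random.seed(42)  # fixed seed so the sketch is repeatable


def simulated_sample_mean():
    """Average age of one random sample drawn from the assumed population."""
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    return statistics.mean(sample)


# Count how many simulated samples look at least as "old" as the real one.
at_least_as_old = sum(
    simulated_sample_mean() >= OBSERVED_MEAN for _ in range(TRIALS)
)
share_by_chance = at_least_as_old / TRIALS

print(f"Share of random samples averaging {OBSERVED_MEAN}+: {share_by_chance:.1%}")
```

Under these assumptions, a sizable share of purely random samples (on the order of a third) come out at least this old, which is why a 1.5-year gap on its own says little. The one-sample t-test formalizes this kind of check using the spread observed in your own sample.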

## The Mathematical Level

The mathematical level of comprehension is more specific and requires an understanding of how a specific analysis works. As the name implies, comprehension of data analysis at the mathematical level indicates that **you understand the calculations required to go from raw data to a final result**. For example, you have mathematical comprehension of the one-sample t-test mentioned above if you know that the general form of the test statistic is:

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

where $\bar{x}$ is the sample average, $\mu$ is the population average, $s$ is the sample standard deviation, and $n$ is the number of observations in the sample.

Most of us encounter this level of data analysis comprehension for the first time when we calculate averages and proportions in our junior high school math classes. We encounter it again at the collegiate level when we take courses in probability and statistics. An important distinction about the mathematical level of comprehension is that the formulas remain the same regardless of what programming language the analysis is completed in.
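As a sketch of what mathematical comprehension looks like in practice, the snippet below walks through the standard one-sample t statistic step by step: the sample average minus the population average, divided by the sample standard deviation over the square root of the sample size. The ten customer ages are made up for illustration; only the population average of 38 comes from the example above.

```python
import math
import statistics

# ASSUMPTION: ten hypothetical customer ages, invented for this illustration.
ages = [34, 41, 29, 52, 38, 45, 40, 36, 47, 43]
pop_mean = 38  # U.S. average age from the running example

n = len(ages)
sample_mean = statistics.mean(ages)        # average of the sample
sample_sd = statistics.stdev(ages)         # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / math.sqrt(n)  # how much the sample mean varies by chance

t_stat = (sample_mean - pop_mean) / standard_error

print(f"sample mean = {sample_mean}")
print(f"t statistic = {t_stat:.3f}")
```

The same arithmetic applies whether it is done by hand, in a spreadsheet, or in code, which is the point of the mathematical level.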

## The Programmatic Level

The programmatic level of comprehension is the most specific for any analysis. At this level, comprehension requires **understanding how to implement or program the analysis in a particular software package**. For example, in Microsoft Excel a t-test can be implemented with an array of data, let’s say cells A1 to A10, using the following formula:

=(AVERAGE(A1:A10) - μ) / (STDEV(A1:A10) / SQRT(COUNT(A1:A10)))

All you need is a number to plug in for μ, and you’ll get back the value of the t-statistic. With another formula in a different cell, you can determine whether this difference is statistically significant as well.

In Python, the same test can be performed with a single line of code from the SciPy package, such as:

scipy.stats.ttest_1samp(A, popmean)

Here, *A* represents the object containing the array of data (cells A1 to A10 in the Excel example above), and *popmean* is our estimate of the population mean (μ in the Excel example).
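For a fuller picture of the SciPy call, here is a minimal runnable sketch. The ages are hypothetical, and the hand-computed statistic is included only as a cross-check on what `ttest_1samp` returns. The SciPy part is skipped if the package is not installed.

```python
import math
import statistics

# ASSUMPTION: hypothetical customer ages standing in for cells A1 to A10.
A = [34, 41, 29, 52, 38, 45, 40, 36, 47, 43]
popmean = 38  # estimate of the population mean

# Hand-computed t statistic, used as a cross-check.
manual_t = (statistics.mean(A) - popmean) / (statistics.stdev(A) / math.sqrt(len(A)))

try:
    from scipy import stats
except ImportError:
    stats = None  # SciPy is not installed; only the manual value is available

if stats is not None:
    result = stats.ttest_1samp(A, popmean)
    print(f"t statistic = {result.statistic:.3f}, p-value = {result.pvalue:.3f}")
    # By a common convention, p < 0.05 is treated as statistically significant.
    print("significant at 0.05" if result.pvalue < 0.05 else "not significant at 0.05")
else:
    result = None
    print(f"t statistic (manual) = {manual_t:.3f}")
```

The returned object carries both the t-statistic and the p-value, so the significance check that needed a second cell in Excel comes back from the same call.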

The key issue at the programmatic level of comprehension is how to implement a specific analysis in your analytic package of choice. In the example above, the one-sample t-test remains the same at the foundational and mathematical level, but the code used to implement the test differs between Excel and Python.

## The ONE Thing Non-Analysts Need to Learn

As a non-analyst working to strengthen your skill set and understanding of data analysis, **the one thing you must learn is the foundational level of data analysis comprehension**. You want to understand what various analyses are used for, how those analyses differ from each other, and how to interpret the results in terms of the original question.

To stay with the one-sample t-test example above, we use the t-test when our sample size is small (i.e., typically fewer than 50 observations) and when our goal is to compare an average from one sample to the average of a population. If we had a larger sample of observations (i.e., typically 50 or more), then we would use a z-test instead of a t-test to perform the one-sample test. So, whether you have a large or small sample makes a difference in which test you choose. Knowing that this is the case represents foundational comprehension. Knowing *WHY* the sample size impacts the choice of test gets into mathematical comprehension.
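For comparison, a one-sample z-test can be sketched with only Python's standard library. The summary numbers below (60 customers, a sample standard deviation of 12 years) are hypothetical, and using the sample standard deviation in place of a known population value is a simplifying assumption often made with larger samples.

```python
import math
from statistics import NormalDist

# ASSUMPTION: hypothetical summary statistics for a larger customer sample.
n = 60              # number of customers
sample_mean = 39.5  # average customer age
sample_sd = 12.0    # sample standard deviation of customer ages
pop_mean = 38.0     # U.S. average age from the running example

# z statistic: distance between the averages, in standard-error units.
z_stat = (sample_mean - pop_mean) / (sample_sd / math.sqrt(n))

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"z = {z_stat:.3f}, p-value = {p_value:.3f}")
```

The calculation has the same shape as the t-test. What changes is the reference distribution used to judge the result, and that is where the mathematical level takes over.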

Another way to think about foundational comprehension is to consider learning a foreign language at the conversational level. You might be perfectly capable of having everyday conversation with someone in a particular language with some minor mistakes, but still are not fluent enough to write in that language. The foundational level of comprehension in data analysis is exactly the same. You can be conversant in the language of data analysis, even if you are not fluent enough to do the analysis yourself.

To be fair, it is really helpful if you are also able to acquire comprehension at the mathematical level, since this will flesh out the foundational ideas and help you understand why these techniques work. But mathematical comprehension is not strictly required, and it’s okay if you start at the foundational level of comprehension. After all, you have analysts who actually do the work. They are the ones who need mathematical and programmatic comprehension. You only need to understand what they are doing and be able to judge whether the choices they are making are appropriate for the problem you are tackling.

## What Can You Do Right Now?

If you’re a non-analyst business professional who works with data analysts or analytic results, you want to make sure that your foundational comprehension is strong. Focus on learning what various analyses are used for, how those analyses differ from others, and how to interpret the results in terms of your original question.

A few well-structured Google searches can go a long way in helping you find resources to better understand specific analyses. In contrast, if you are trying to decide which analysis to select from all of the possible analyses, then you may need to seek out answers from individuals who have mathematical comprehension in data analysis, as well as strong research design skills.

**I recommend joining the Data Analytics for Non-Analysts Group on Facebook or LinkedIn**. These groups, sponsored by F1 Analytics, were created to build a community of non-analysts looking to strengthen their data analytic skills. The group discussions focus on explaining analytics at different levels of comprehension so that members are able to focus on the material most relevant to them.

I invite each of you to join the group and let me know what your data analytic questions are so that we can strengthen your skills together.

Looking forward to seeing you there!