Data literacy is one of those concepts we all understand on some level. We’ve probably all sat through a meeting where one of our colleagues just didn’t seem to get the point that one of the data people was presenting in a slide deck. Whether the viewer failed to understand what was presented or the data person did a poor job communicating the analysis or results, the underlying problem is one of data literacy.
To provide a clear definition of the concept, data literacy is the ability to read, understand, create, use, and question data to accomplish various purposes. Just as your reading literacy level indicates how well you comprehend complex writing, your data literacy level indicates how well you can work with and comprehend data.
Forbes recently listed data literacy as one of the most in-demand skills for employers over the next 10 years. A global study of 2,000 executives, decision-makers, and individual contributors found that nearly 70 percent of employees are expected to use data heavily in their roles by 2025. Yet fewer than 40 percent of organizations make data training available to all employees.
If you are one of those employees without access to solid data training, and you have questions about your experience working with data, here are seven essential elements that you will want to learn more about to improve your data literacy:
- Data Exploration and Visualization
- Data Management and Data Wrangling
- Data Analysis and Visualization
- Continuous Improvement
- The Data Ecosystem
- Data Governance
- Continuous Learning
Data Exploration and Visualization
At the heart of data exploration is the process of identifying the questions you want to answer and determining whether your data can answer them. Identifying those questions includes developing specific details to clarify what will be analyzed, why you are interested in the analyses (i.e., the justification), and what form your answers will take. You will want to include relevant stakeholders in this discussion to ensure that the results will be useful to the intended audiences.
Data exploration also involves the systematic review of your data to determine whether they can provide an answer to your questions. The review will typically include assessments of the quality of the data (e.g., the extent of missing or inaccurate data), the substantive information captured by the data, the timeliness of the data, and basic descriptive statistics and visualizations to identify preliminary areas of interest.
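As a rough illustration of the review described above, the following Python sketch computes the share of missing values for each field and a few descriptive statistics for one numeric field. The records, the field names (`region`, `monthly_sales`), and the helper functions are hypothetical and only meant to show the idea. A real review would run against your own data and would likely use your organization's analytic software.

```python
from statistics import mean, median

# Hypothetical sample records; None marks a missing value.
records = [
    {"region": "North", "monthly_sales": 1200.0},
    {"region": "South", "monthly_sales": None},
    {"region": "East", "monthly_sales": 950.0},
    {"region": None, "monthly_sales": 1430.0},
    {"region": "West", "monthly_sales": 1010.0},
]


def missing_rate(rows, field):
    """Return the share of rows where `field` is missing (None)."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(field) is None)
    return missing / len(rows)


def describe(rows, field):
    """Return basic descriptive statistics for a numeric field, skipping missing values."""
    values = [row[field] for row in rows if row.get(field) is not None]
    if not values:
        return {"count": 0}
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


# Data quality: how much of each field is missing?
quality_report = {
    field: missing_rate(records, field) for field in ("region", "monthly_sales")
}
# Descriptive statistics for the numeric field of interest.
sales_summary = describe(records, "monthly_sales")

print(quality_report)
print(sales_summary)
```

A high missing rate in a field that your question depends on is an early signal that the data may not be able to answer that question without additional collection or cleanup.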
Data Management and Data Wrangling
The processes of data management and data wrangling span the collection of data to be used for data analysis, as well as the methods used to prepare those data for the analysis. Data management includes identifying a source for the data, procedures for the data to be collected and/or submitted, procedures to ensure the data meet quality standards, infrastructure for the storage of data, and policy decisions about who will have access to the data. Having a strong data management plan in place ensures that when data are needed, they are available to the correct staff, they meet high quality standards, and they provide the greatest potential to answer your questions.
Data wrangling is the process of extracting data from your storage system and performing standardized procedures to ensure the data are ready for your analysis. Wrangling includes tasks such as merging data sets together and calculating new fields from the existing data for the analysis. Wrangling also includes identifying the extent of missing or potentially inaccurate data and deciding how to impute (i.e., fill in) the missing values or potentially correct data inaccuracies. Finally, the data wrangling process requires generating a final data set with the structure required to implement the chosen analytic strategy and obtain results.
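The wrangling tasks described above (merging data sets, filling in missing values, and calculating new fields) can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the two small tables, the `customer_id` key, the choice of the median as the imputation rule, and the `revenue` field. Your own analysis may call for a different join or a different imputation strategy.

```python
from statistics import median

# Hypothetical extracts from two storage tables that share a customer_id key.
customers = [
    {"customer_id": 1, "region": "North"},
    {"customer_id": 2, "region": "South"},
    {"customer_id": 3, "region": "East"},
]
orders = [
    {"customer_id": 1, "units": 10, "unit_price": 2.5},
    {"customer_id": 2, "units": None, "unit_price": 4.0},  # missing value
    {"customer_id": 3, "units": 6, "unit_price": 3.0},
]


def merge_on_key(left, right, key):
    """Inner-join two lists of dicts on a shared key (assumes unique keys in `right`)."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


def impute_median(rows, field):
    """Fill missing (None) values in `field` with the median of the observed values."""
    observed = [row[field] for row in rows if row[field] is not None]
    fill_value = median(observed)
    return [
        {**row, field: fill_value if row[field] is None else row[field]}
        for row in rows
    ]


def add_revenue(rows):
    """Calculate a new field from existing ones: revenue = units * unit_price."""
    return [{**row, "revenue": row["units"] * row["unit_price"]} for row in rows]


merged = merge_on_key(customers, orders, "customer_id")
imputed = impute_median(merged, "units")
analysis_ready = add_revenue(imputed)

for row in analysis_ready:
    print(row)
```

Each step returns a new list instead of changing the original records, which makes it easier to trace how the final data set was produced.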
Data Analysis and Visualization
When most people think about analytics, they are thinking about the actual process of analyzing the data, or performing the statistical analyses that will generate results and provide answers to their questions. The data analysis process begins to take shape during the data exploration phase because the types of questions being asked of the data, and the types of data available to provide answers, often have a strong influence on the specific choice of analytic methods used. The large variety of analytic methods available today exists in large part because of the wide variety of data types and questions that analysts are interested in. Aligning the correct methods with the data on hand and the question of interest is critical to obtaining valid and reliable answers.
Data visualization may also play an important role in presenting the results of the data analysis in a clear and effective manner. While some analytic techniques can easily be summarized with one or two numbers, other techniques are easier to summarize using graphs and other images. Understanding how to effectively present and interpret analytic results, and knowing which results work best in a table versus a graph, are critically important skills for communicating actionable results to stakeholders.
Continuous Improvement
As the needs of your organization and stakeholders evolve over time, the types of questions you need to answer and the types of data you collect and analyze are likely to change. Additionally, maintaining high quality data systems requires continuous monitoring to ensure that the appropriate data is being collected efficiently using the correct procedures, is stored in the correct locations, and is available to the necessary data analysts in a timely manner.
Problems in the data system are often first identified when the data wrangling or analysis process produces results that do not make sense. When this happens, you must identify the extent of any problems in the data system and work with stakeholders to develop strategies to improve the system. Even when no problems are identified, as the data analytic focus changes over time you will need to identify new questions, data sources, and methods to continue using data-driven decision-making approaches.
Data Ecosystems and Infrastructure
Beyond the steps in the data analytic process described above, there are three additional components to ensure comprehensive data literacy. Understanding the data ecosystem requires knowledge of how and where your organizational data is being collected, stored, and analyzed. This includes the physical hardware, such as computer servers, hard drives, and cloud storage space, as well as external data sources, software used to perform analyses, and programming code created for specific analyses.
Developing an understanding of the data ecosystem in your organization allows you to better identify potential strengths as well as limitations in the data that is being collected and analyzed. The hardware configuration for data collection and storage may be a limiting factor in determining how long it takes to complete certain analyses, although this is becoming less and less of an issue with greater use of extremely fast cloud computing solutions. Similarly, differences in programming languages and analytic software applications allow different types of analyses to be performed. While there is significant overlap in the capabilities of most major analytic packages, there are still some specialized analyses that require specific types of programs or add-on applications to perform.
Data Governance
Data governance refers to the policies and procedures put in place by the organization to ensure its data assets are managed properly. Data governance typically includes the creation of a data policy outlining how the organization will keep its data complete and accurate and protect data against unauthorized access and use.
Data governance is often divided into four key elements. Data quality focuses on how the organization will maintain complete and accurate data for use in its analyses. This may include multiple ways to capture or fill in specific data elements if the fields are not always filled in during the initial capture. Quality also pertains to the methods used to validate the data or verify that it is complete and accurate when the data is added to the storage system.
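To make the idea of validating data as it enters the storage system more concrete, here is a minimal Python sketch. The rules, field names (`member_id`, `zip_code`, `age`), and sample records are all hypothetical. An actual data policy would define its own required fields and acceptable ranges, and the check assumes that range-checked fields hold numbers.

```python
# Hypothetical quality rules; a real data policy would define its own.
REQUIRED_FIELDS = ("member_id", "zip_code")
VALID_RANGES = {"age": (0, 120)}


def validate_record(record, required_fields, valid_ranges):
    """Return a list of data quality problems found in one record."""
    problems = []
    # Completeness: required fields must be present and not blank.
    for field in required_fields:
        if record.get(field) in (None, ""):
            problems.append(f"missing value for '{field}'")
    # Accuracy: numeric fields must fall within a plausible range.
    for field, (low, high) in valid_ranges.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            problems.append(f"'{field}' value {value} is outside {low} to {high}")
    return problems


# Hypothetical incoming records waiting to be added to storage.
incoming = [
    {"member_id": "A100", "zip_code": "30301", "age": 42},
    {"member_id": "", "zip_code": "30302", "age": 35},
    {"member_id": "A102", "zip_code": "30303", "age": 430},
]

accepted = []
rejected = []
for record in incoming:
    problems = validate_record(record, REQUIRED_FIELDS, VALID_RANGES)
    if problems:
        rejected.append((record, problems))
    else:
        accepted.append(record)

print(f"accepted: {len(accepted)}, rejected: {len(rejected)}")
for record, problems in rejected:
    print(record, problems)
```

Rejected records are kept together with the reasons they failed, so that they can be corrected or filled in later instead of being silently dropped.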
Data privacy focuses on how the organization will protect the sensitive information kept in its data ecosystem. This may include client or employee data such as addresses, credit card numbers, health insurance member IDs, or other sensitive information.
Data security focuses on how the organization will secure its data from unauthorized access and use. Security can take the form of hardware controls such as limiting physical access to specific computers and servers or could consist of software controls used to limit which employees can access specific drives and folders on a network.
Finally, data stewardship focuses on the processes an organization will use to make sure that its data governance policies and procedures are followed. Organizations may refer to those people responsible for specific data sources and databases as “data owners”. Similarly, database managers will be responsible for the stewardship of the data storage platform, while Information Technology has stewardship over the hardware and software used throughout the organization. Finally, data scientists and analysts will hold stewardship over the programming code and processes used to analyze the data and present results to stakeholders.
Understanding the data governance policy of an organization provides you with a larger and more holistic view of where your data is coming from, what happens to it inside the organization, and who is responsible for it at each stage of the data collection, storage, and analysis process.
Continuous Learning
As the data analytics landscape continues to evolve, new techniques and technologies will be introduced and integrated into our work lives. The pace of this evolution is expected to accelerate in the future, and we are already seeing the potentially massive impact that artificial intelligence is having across many different facets of our lives.
To maintain your level of data literacy, it is important to stay informed of new tools and techniques as they are introduced and institutionalized in the organization. Notice here that I said “institutionalized” in the organization. You won’t need to learn every new technique that is tested out by your analytic team; some of those methods will be tested and then dropped because they are not useful or quickly become outdated. You will, however, want to learn about the ones that the team finds useful and can incorporate into your data analytics processes moving forward.
Through an ongoing continuous learning process, you will build your data literacy and become a stronger data user. Given the number of positions expected to rely heavily on data in the next several years, learning more about these seven essential elements of data literacy and how they are applied in your organization will position you for greater success down the road.
You’ve got this, and we’re here to help you with the journey!
Marr, B. (2022, August 22). The top 10 most in-demand skills for the next 10 years. Forbes. https://www.forbes.com/sites/bernardmarr/2022/08/22/the-top-10-most-in-demand-skills-for-the-next-10-years/?sh=6013e0a017be. Accessed August 22, 2023.
Forrester Consulting. (2022, March 15). Building Data Literacy: The Key to Better Decisions, Greater Productivity, and Data-Driven Organizations. https://www.tableau.com/sites/default/files/2022-03/Forrester_Building_Data_Literacy_Tableau_Mar2022.pdf. Accessed August 22, 2023.