Selecting Data Sources
Research continually reaffirms the importance of analyzing data to make sound business decisions, but knowing where to find data and which data to use can be intimidating at first glance. This article, which is part 3 in our series on getting started with data analytics, covers practical sources available to many organizations and the characteristics that determine the usefulness of the data.
Common Sources of Data
You have probably heard some version of the common statement, “Data is all around us,” which is true because computer systems are so deeply involved in daily business practices. The problem most people have is not a lack of data but not knowing where to look for data they can apply to their needs. The following are several sources most businesses can easily use to begin a data analysis project.
Accounting systems: If you have ever looked at an income statement or profit and loss (P&L) statement, you have already used an accounting system to do data analytics! The system stored data in the general ledger and summarized it for the selected period to make it more useful. Deepening the data analytics from accounting systems often involves exporting the data for further analysis in other software.
One of the most widely used accounting systems is QuickBooks. While this platform offers ease of use and standardization, some pitfalls make its data difficult to use. QuickBooks files that have been used for many years (a long history of data improves analytics) often have a chart of accounts that has ballooned with duplicate, ambiguous, and inactive accounts. A large and unorganized account structure can make a standard P&L nearly unreadable. Categorization techniques like account groupings can alleviate these issues.
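As a minimal sketch of account grouping, assume the ledger has been exported as simple rows of account name and amount. The account names, amounts, and the ACCOUNT_GROUPS mapping below are hypothetical and are not taken from any real QuickBooks file. The idea is to roll duplicate and ambiguous accounts up into a few readable categories and to flag anything that has not been mapped yet:

```python
from collections import defaultdict

# Hypothetical mapping from messy chart-of-accounts names to clean groups.
ACCOUNT_GROUPS = {
    "Office Supplies": "Office Expenses",
    "Office Supplies 2": "Office Expenses",
    "Supplies - Misc": "Office Expenses",
    "Sales": "Revenue",
    "Sales Income": "Revenue",
}


def summarize_by_group(rows, groups, default="Ungrouped"):
    """Total the amounts per group; unmapped accounts land in `default`."""
    totals = defaultdict(float)
    for account, amount in rows:
        totals[groups.get(account, default)] += amount
    return dict(totals)


# Hypothetical exported ledger lines: (account, amount).
ledger = [
    ("Sales", 1200.0),
    ("Sales Income", 300.0),
    ("Office Supplies", -45.5),
    ("Office Supplies 2", -20.0),
    ("Ask Bob", -10.0),  # ambiguous account with no mapping yet
]

summary = summarize_by_group(ledger, ACCOUNT_GROUPS)
for group, total in sorted(summary.items()):
    print(f"{group}: {total:.2f}")
```

Keeping the mapping separate from the ledger means the original accounts stay untouched, and the "Ungrouped" total shows how much cleanup remains.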
Inventory Management and Point of Sale Systems: Accounting systems provide data useful for solving many general business problems, but more specialized systems like inventory management systems and point of sale systems provide the additional detail needed for solving specific problems. Many of these systems are automated and already storing valuable data waiting to be turned into useful information.
With these systems, you can discover the most popular products and how their popularity changes in relation to outside factors, understand the impact of advertising by examining how sales react, and plan inventory by discerning how seasonality and sales factors affect inventory levels. In many cases, the data to solve these problems already exists.
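To illustrate the first of these questions, here is a small sketch that finds the top-selling product in each month. It assumes a point-of-sale export with a date, a product name, and units sold. The products and figures are made up for illustration, and a real export will have its own column layout:

```python
from collections import Counter, defaultdict

# Hypothetical point-of-sale export: (ISO date, product, units sold).
sales = [
    ("2023-06-03", "Iced Tea", 40),
    ("2023-06-10", "Hot Cocoa", 5),
    ("2023-06-17", "Iced Tea", 35),
    ("2023-12-02", "Hot Cocoa", 50),
    ("2023-12-09", "Iced Tea", 8),
    ("2023-12-16", "Hot Cocoa", 45),
]


def units_by_month(rows):
    """Return a mapping of "YYYY-MM" to a Counter of units per product."""
    monthly = defaultdict(Counter)
    for date, product, units in rows:
        monthly[date[:7]][product] += units  # first 7 characters are "YYYY-MM"
    return monthly


monthly = units_by_month(sales)

# The best seller in each month is the product with the highest unit count.
top_sellers = {
    month: counts.most_common(1)[0][0] for month, counts in monthly.items()
}

for month in sorted(top_sellers):
    print(month, top_sellers[month], monthly[month][top_sellers[month]])
```

The same monthly totals could be lined up against advertising dates or inventory counts to explore the other questions.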
Customer Surveys: The ability to understand customers drives success. Sales and accounting systems populate information about customers, but you may want information about how customers feel to understand their actions reflected in these systems. Creating new data with a customer survey can shed light on these actions and give a fuller picture of what customers want.
Accounting systems, point-of-sale systems, donor databases, inventory management, e-commerce platforms, and social media platforms have likely already compiled mountains of data on your business. But as we noted in the first two articles, this data only benefits you if you can evaluate it appropriately for the problem your business faces.
Another cliché that applies to data is, “You are what you eat.” A poor data diet can spoil healthy analytics goals. Data should have the following attributes: complete, detailed, extensive, and user-friendly.
Complete: Just like your diet, data needs to be complete. When data has missing pieces, no observation drawn from it can be trusted for certain, because the missing data could disprove it. Conversely, missing data can hide trends and insights that would only be apparent if all of the information were considered.
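A quick completeness check can be sketched as follows, assuming records arrive as one dictionary per row with a "month" field in YYYY-MM form. The field names and sample values are hypothetical. The sketch looks for two kinds of gaps: whole periods with no records, and individual blank fields:

```python
def missing_months(months_present, start, end):
    """List "YYYY-MM" periods from start to end (inclusive) that have no data."""
    year, month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    present = set(months_present)
    gaps = []
    while (year, month) <= (end_year, end_month):
        period = f"{year:04d}-{month:02d}"
        if period not in present:
            gaps.append(period)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return gaps


def count_blank_fields(records):
    """Count empty or missing values per field across a list of dicts."""
    blanks = {}
    for record in records:
        for field, value in record.items():
            if value is None or str(value).strip() == "":
                blanks[field] = blanks.get(field, 0) + 1
    return blanks


# Hypothetical records with one skipped month and two blank fields.
records = [
    {"month": "2023-01", "customer": "A", "amount": "100"},
    {"month": "2023-02", "customer": "", "amount": "80"},
    {"month": "2023-04", "customer": "C", "amount": None},
]

gaps = missing_months([r["month"] for r in records], "2023-01", "2023-04")
blanks = count_blank_fields(records)
print("Months with no data:", gaps)
print("Blank values per field:", blanks)
```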
Detailed: Detail can be intimidating or off-putting, which is why the P&L is easier to read than the general ledger. When setting up data analytics, however, high levels of detail on the back end provide more options in the analytics. The underlying data alone does not have meaning or provide insight, but high levels of detail allow more flexible categorization and summarization to get the most information out of the data.
Extensive: An extensive history of data brings higher confidence in the outcome, greater accuracy, and a higher probability of meaningful analysis. Data over a longer period of time increases the likelihood that identified trends come from a true connection as opposed to random chance. Accuracy increases when seasonality can be recognized and adjusted for, a process that becomes more exact as more data is available to fine-tune the model. Finally, an extensive history provides more chances for trial and error, allowing the valuable information to become apparent.
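One simple way to recognize and adjust for seasonality is a ratio-to-average seasonal index: each calendar month's average sales divided by the overall average. This is only one of several possible approaches, sketched here with made-up figures for two months across two years. With more years of history, each month's average rests on more observations, which is why a longer history makes the adjustment more exact:

```python
from collections import defaultdict
from statistics import mean


def seasonal_indices(monthly_sales):
    """Map each calendar month to its average sales divided by the overall average."""
    by_month = defaultdict(list)
    for (_year, month), value in monthly_sales.items():
        by_month[month].append(value)
    overall = mean(monthly_sales.values())
    return {month: mean(values) / overall for month, values in by_month.items()}


# Hypothetical sales keyed by (year, month); only June and December for brevity.
monthly_sales = {
    (2022, 6): 80.0,
    (2022, 12): 120.0,
    (2023, 6): 100.0,
    (2023, 12): 140.0,
}

indices = seasonal_indices(monthly_sales)

# Dividing by the index removes the typical seasonal swing from each figure.
adjusted = {
    period: value / indices[period[1]] for period, value in monthly_sales.items()
}

for period in sorted(adjusted):
    print(period, round(adjusted[period], 1))
```

An index above 1 marks a month that usually runs above average. After adjustment, the remaining differences between periods are easier to read as an underlying trend rather than a seasonal effect.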
User-Friendly: On a practical level, we want a square, tabular data format. This means that each row contains all the fields for one record, and each column has a header and holds only one type of data. No blank rows or skipped columns should be present. Users can employ many file types, each with its own strengths, but we recommend files that Excel can open (.XLSX, .CSV). Plain text files (.TXT, .CSV) also work well for large datasets because they carry no formatting complexity. Not all of these formats are easy for humans to read, but they are perfect for computers to take in and process.
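The structural rules above can be checked automatically. The sketch below, which uses only Python's built-in csv module, looks for missing headers, blank rows, and rows with the wrong number of fields. The function name check_tabular and the sample data are hypothetical, and the check does not verify that each column holds only one type of data:

```python
import csv
import io


def check_tabular(text):
    """Return a list of problems found in CSV text; an empty list means none found."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header = rows[0]
    problems = []
    for index, name in enumerate(header, start=1):
        if not name.strip():
            problems.append(f"column {index} has no header")
    for row_number, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            problems.append(f"row {row_number} is blank")
        elif len(row) != len(header):
            problems.append(
                f"row {row_number} has {len(row)} fields, expected {len(header)}"
            )
    return problems


# Hypothetical exports: one clean, one with a missing header, a blank row,
# and a short row.
good = "date,product,units\n2023-06-03,Iced Tea,40\n2023-06-10,Hot Cocoa,5\n"
bad = "date,,units\n2023-06-03,Iced Tea,40\n\n2023-06-10,Hot Cocoa\n"

print(check_tabular(good))
for problem in check_tabular(bad):
    print(problem)
```

Running a check like this before analysis catches layout problems early, when they are still cheap to fix at the source.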
We hope this article got you thinking about the data that you already have available to you. Whether it is in a point-of-sale system, accounting software, or a self-created customer satisfaction survey, business insights through data analytics may be closer than you expect.
Now that you know where to find data and which attributes make it useful, you can prepare to analyze the data to determine a solution and take action, which we will explain how to do in part 4 of this series. If you have any questions about your particular circumstances or would like to begin a project with one of our data analytics experts, please contact us.
Articles in the series:
Part 1: The Value of Data
Part 2: Starting with the End in Mind
Part 3: Selecting Data Sources