Where to start with data analytics in internal audit: The data analytics triangle
Compliance | December 18, 2024


This article will cover the following topics:

  • Where to begin with data analytics in internal audit: An overview
  • Benefits and challenges of data analytics in internal audit
  • What is the difference between data analysis and data analytics?
  • The data analytics triangle: Evaluating costs and benefits
  • Examples of how to apply the data analytics triangle
  • Taking the next steps in data analytics for internal audit success

Where to begin with data analytics in internal audit: An overview

Data analytics in internal audit continues to be top of mind for many audit and compliance shops. First-line management teams rapidly gather information to understand customers, revenues, expenses, opportunities, and risks. That information is also valuable to the second and third lines of defense, where internal audit can help management distill the data into both big-picture and more granular “what could go wrong” risks for the business unit.

Data analytics, like other emerging topics such as Artificial Intelligence (AI), can sound overwhelming, or even intimidating, if you don’t know where to start. If you’re just beginning your data analytics journey, I recommend reviewing this article from KPMG, which describes the “KPMG Internal Audit Data Analytics Maturity Model”. The best advice is to carve out meaningful chunks of time in your annual plan for “hands to keyboard” practice and explore what can be done in this space.

Speaking from recent experience with newer technology, I personally dove headfirst into ChatGPT, one of the available generative AI tools, to develop risk and control matrices and audit programs that I use as examples during working sessions with TeamMate+ customers. Similarly, with data analytics, it took some trial and error to develop a repeatable decision-making framework that I eventually felt comfortable with.

As a result of those efforts, I developed a useful framework to weigh the costs and benefits of leveraging analytics in the planning and fieldwork stages of an audit. In this article, I will explain my journey and what is involved with this framework.

Benefits and challenges of data analytics in internal audit

The use of data analytics in a particular audit engagement often comes down to weighing the costs of gathering, cleansing, and organizing the data against the benefit of being able to derive meaningful insights about risk and control activities for our organizations. As described in a recent American Institute of Certified Public Accountants (AICPA) journal article, the inability to access usable data remains a challenge when attempting to perform data analytics in the audit process.

From an assurance standpoint, business units at our organizations generate the data needed to run the day-to-day aspects of their business. Therefore, the key benefit for internal audit is the ability to apply audit tests against the entire population of activity for a particular control, instead of pulling a statistically relevant sample for testing. Further, if we can build repeatable analytic processes, such as programming a bot to perform data extraction and analysis, we can provide a continuous feedback loop to our governance stakeholders.
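
As a sketch of what such a repeatable process could look like (not a description of any particular tool), the following Python example assumes the ticketing system can produce a CSV extract; the file name, column names, and 10-day aging threshold are all hypothetical. Scheduled with a utility such as cron, a script like this becomes a simple “bot” that feeds exceptions back to stakeholders on a recurring cadence.

```python
# Hypothetical sketch: re-run a full-population exception test on a schedule.
# File name, column names, and the 10-day threshold are illustrative only.
import pandas as pd

def run_exception_check(export_path: str, out_path: str) -> None:
    tickets = pd.read_csv(export_path, parse_dates=["requested_at", "closed_at"])

    # Flag tickets that are still open more than 10 days after the request.
    age_days = (pd.Timestamp.now() - tickets["requested_at"]).dt.days
    stale = tickets[tickets["closed_at"].isna() & (age_days > 10)]

    # Write the exceptions out for review by the engagement team.
    stale.to_csv(out_path, index=False)

if __name__ == "__main__":
    run_exception_check("access_tickets.csv", "open_ticket_exceptions.csv")
```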

From a personnel standpoint, designing and implementing a data analytics audit program can be a creative endeavor. While it requires some degree of data management knowledge, visualization skills, and a basic understanding of statistics, the process works best when resources are given latitude to develop hypotheses about the business process and to engage with our business partners on methods of data extraction. If your organization has in-house resources to help aggregate data, an analytics program is an easy way for internal audit to collaborate with them.

However, while organizations generate lots of information, it is often in an “unstructured” format, such as text, images, video, and social media posts. As a result, there are two potential roadblocks: gaining access to the information, and the time and resources required to make the data usable. To address both, it is beneficial to collaborate with resources throughout the organization. On the access side, business contacts can explain how they use and analyze unstructured information to make decisions, and the Information Technology and/or Information Security functions can advise on how best to gain access to the data. On the usability side, the best approach is to seek out in-house data experts for advice on cleaning and reformatting the data.

What is the difference between data analysis and data analytics?

Data analysis is a form of data analytics and is the springboard to more advanced analytic techniques. Data analysis also provides a path to understanding and analyzing an entire population set.

Let’s begin by comparing analysis and analytics. Some of the similarities between data analysis and data analytics include:

  • Both use data to extract insights.
  • They rely on collecting, organizing, and analyzing data to uncover patterns, trends, and relationships.
  • In an internal audit context, they both require an understanding of the business process, risk, and controls, to derive meaningful insights.

It’s also important to point out and detail the differences between data analysis and data analytics.

Data analysis is descriptive and focuses on what has already happened. It is often performed on structured data, such as spreadsheets and databases, and can employ basic statistical modeling and visualization tools, such as Power BI or Tableau.

Data analytics, on the other hand, is descriptive, predictive, and prescriptive: it examines what has already happened while also modeling what is likely to happen next. It is often performed on unstructured data, such as text documents, social media posts, images, and videos, which typically requires additional data hygiene checks to make it usable. Lastly, it can employ advanced statistical modeling, machine learning, and programming to turn that unstructured data into usable insights.
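
To make that hygiene step concrete, here is a minimal sketch in Python of turning free text into a structured, testable field before any analysis begins; the ticket notes and the ID pattern are invented for illustration.

```python
# Hypothetical sketch: extract a structured field from unstructured text.
import pandas as pd

notes = pd.Series([
    "Granted access to SAP-PRD per ticket REQ-10482 ",
    "revoked access sap-prd, see req-10533",
    "Access granted (no ticket referenced)",
])

cleaned = notes.str.strip().str.lower()            # normalize case and whitespace
ticket_ids = cleaned.str.extract(r"(req-\d+)")[0]  # pull the ticket ID, if any
print(ticket_ids.tolist())                         # ['req-10482', 'req-10533', nan]
```

Rows where no ID can be recovered surface immediately as candidates for manual follow-up, rather than silently skewing the results.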

The data analytics triangle: Evaluating costs and benefits

On a personal note, I prefer using triangles as a visual aid to guide decision-making. When I started with data analysis and descriptive data analytics, I kept coming back to three key decision points. I wrote those three conditions down as a triangle because it gave me an effective visual for weighing costs and benefits as I worked through the decision of whether to apply data analytics to a particular audit step. If all three of the criteria below are met, the benefits likely outweigh the costs, and I can proceed with leveraging data analytics for the task. However, if one side of the triangle is weak (for example, if the data is highly unstructured, making it usable would cost additional weeks, and reporting deadlines are at risk), I might forgo the benefit of studying the data set for insights.
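
Purely as an illustration of the framework, the three sides of the triangle can even be written down as a simple checklist in code; the names below are my own shorthand, not an established standard.

```python
# Hypothetical sketch: the triangle as a go/no-go checklist for one audit step.
from dataclasses import dataclass

@dataclass
class AnalyticsTriangle:
    understands_process_and_risks: bool  # Part 1: process, risks, management's analytics
    has_bright_line_rules: bool          # Part 2: binary "yes/no" attributes exist
    data_available_and_usable: bool      # Part 3: accessible and cheap enough to clean

    def proceed(self) -> bool:
        # Benefits likely outweigh costs only when all three sides hold.
        return (self.understands_process_and_risks
                and self.has_bright_line_rules
                and self.data_available_and_usable)

# A weak third side (unusable data) tips the decision to "pass".
print(AnalyticsTriangle(True, True, False).proceed())  # False
```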

Part 1: Understanding the business process, risks, and management's use of analytics

This remains a fundamental skill in planning and performing an audit, but it is also the first stage gate in the cost/benefit triangle. If I do not understand the business process and its associated risks, then even if I can gather the data, there is still a significant risk of identifying false positives in the audit. Further, as part of routine test-of-design walkthroughs, we can inquire of management to understand whether they already perform some degree of data analysis or analytics in managing the day-to-day operations of the business unit. Collaborating with management on this topic is a great way to strengthen the relationship with the business unit. During the audit, we can also evaluate the strengths of their data analytics model and help make the process even better.

Part 2: Identifying "bright-line" rules in the business process

Reflecting on my undergraduate and graduate accounting classes, several accounting textbooks referenced “bright-line” rules from authoritative guidance that shape financial accounting and tax concepts. I use the same phrase when deciding whether to apply data or data analysis techniques. The closer we can get to a binary “yes/no” condition for a particular attribute, the easier it will be to identify anomalies in the population. The risk is that the more ambiguity involved, the more time we may spend studying false positives. If this side of the triangle is weak, consider passing on data analytics for the task, especially if you are just starting out with data analytics in your audit workflow; it is best to avoid that “climbing a mountain” feeling. Financial, transactional information is a good starting point for a potential data analytic test because there are typically bright-line rules that must be followed before a transaction is recorded in the general ledger.
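
As a hypothetical sketch of how binary a bright-line test can be, the example below flags transactions that violate a simple receipt rule; the column names and the $25 threshold are assumptions for illustration.

```python
# Hypothetical sketch: a binary "yes/no" bright-line test over a population.
import pandas as pd

txns = pd.DataFrame({
    "txn_id": [1, 2, 3],
    "amount": [18.00, 240.00, 95.50],
    "receipt_attached": [False, True, False],
})

# Bright-line rule: any amount over $25.00 requires a receipt.
exceptions = txns[(txns["amount"] > 25.00) & ~txns["receipt_attached"]]
print(exceptions)  # only txn 3 violates the rule; no judgment calls required
```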

Part 3: Assessing data availability and usability

This is key. More likely than not, the data is going to be available. As described above, there are two potential challenges. The first roadblock may be accessing the data. The second is the format, and weighing how much time must be spent transforming the data into a workable state. The risk is that, without good data cleansing techniques, we may end up spending too much time on false positives throughout the population.
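
For illustration, a minimal cleansing pass might look like the sketch below, with a hypothetical export and column names; the goal is to coerce types, remove duplicates, and standardize key fields so that later tests produce fewer false positives.

```python
# Hypothetical sketch: basic cleansing before testing a population.
import pandas as pd

raw = pd.read_csv("ticket_export.csv", dtype=str)  # read everything as text first

clean = (
    raw.drop_duplicates()  # repeated rows from a re-run export
       .assign(
           amount=lambda d: pd.to_numeric(d["amount"], errors="coerce"),
           closed_at=lambda d: pd.to_datetime(d["closed_at"], errors="coerce"),
           system=lambda d: d["system"].str.strip().str.upper(),  # "sap " -> "SAP"
       )
)

# Isolate rows with unusable key fields for follow-up instead of testing them.
unusable = clean[clean["amount"].isna() | clean["closed_at"].isna()]
```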

Examples of how to apply the data analytics triangle

Using the decision-making framework outlined above, the following two examples highlight the factors I considered when deciding whether a particular audit engagement was suited to leveraging large data sets for analysis and descriptive analytics.

Example 1: Logical access audit

Part 1: Understanding the business process, risks, and management's use of analytics.

The logical access team was responsible for managing access to IT resources by granting or removing access in accordance with an approved request. The primary risks inherent in the process were unauthorized access to IT resources, access that exceeded or did not match job responsibilities, and access not being revoked in a timely manner.

Given my understanding of the business process, I developed several hypotheses that I wanted to evaluate against an entire population of tens of thousands of tickets. For example, I wanted to use the data set to understand the systems with the greatest number of logical access requests for the audit period; which systems had “bunches” of logical access requests in short periods of time (likely the result of a new system, or an indication of something else); and the elapsed time between the initial request, approval, and ticket closure. We also inquired of management to understand what metrics and data techniques they were using to help manage the day-to-day, which led to productive, collaborative conversations about “a day in the life” from management’s standpoint.
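
As a sketch only, those three hypotheses could be expressed against the full ticket population along the following lines; the extract and column names are hypothetical.

```python
# Hypothetical sketch: three hypotheses tested against every ticket.
import pandas as pd

tickets = pd.read_csv("access_tickets.csv",
                      parse_dates=["requested_at", "approved_at", "closed_at"])

# 1. Systems with the greatest number of access requests in the audit period.
top_systems = tickets["system"].value_counts().head(10)

# 2. "Bunches" of requests: weekly request counts per system, flagging spikes.
weekly = (tickets.set_index("requested_at")
                 .groupby("system")
                 .resample("W")
                 .size())
spikes = weekly[weekly > weekly.mean() + 2 * weekly.std()]  # crude spike flag

# 3. Elapsed time between initial request, approval, and ticket closure.
tickets["days_to_approve"] = (tickets["approved_at"] - tickets["requested_at"]).dt.days
tickets["days_to_close"]   = (tickets["closed_at"] - tickets["requested_at"]).dt.days
```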

Part 2: Identifying "bright-line" rules in the business process.

This IT team provisioned or deprovisioned access after the request was approved by the environment owner. The access granted needed to match the access requested, and in the case of deprovisioning, that step was time-bound. Therefore, there were clear bright-line rules embedded in the business process.
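
A brief, hypothetical sketch of those two bright-line checks, assuming invented column names and a three-day standard for deprovisioning:

```python
# Hypothetical sketch: granted access must match the request, and
# deprovisioning must complete within an assumed three-day window.
import pandas as pd

tickets = pd.read_csv("access_tickets.csv",
                      parse_dates=["approved_at", "completed_at"])

# Rule 1: the access granted must match the access requested.
mismatched = tickets[tickets["access_requested"] != tickets["access_granted"]]

# Rule 2: deprovisioning is time-bound (three days assumed for illustration).
deprov = tickets[tickets["action"] == "deprovision"]
late = deprov[(deprov["completed_at"] - deprov["approved_at"]).dt.days > 3]
```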

Part 3: Assessing data availability and usability.

The logical access team granted the internal audit team read-only access to the ticketing system. The data was structured, did not require much cleansing, and was exportable to Excel, where I could filter, apply formulas, and create charts and graphs. Other control attributes that required additional data sources, such as access listings, were also available in a structured format from a database query or Excel export. Therefore, it made sense to invest time in applying some degree of data analysis and descriptive analytics for this engagement.

How did I apply data analysis and descriptive analytic techniques?

Understanding that all activity had already taken place, I would classify my approach as a series of data analysis steps with limited descriptive analytics.

The analysis was developed in Excel, using common techniques such as filters, formulas, pivot tables, and charts. By evaluating the entire population of tickets, anomalies stood out, such as date ranges with spikes in requested access, tickets that exceeded the average closure time, the requester who filed the greatest number of requests, the employee who approved the greatest number of requests, and the total number of requests for access to systems classified as the most restrictive for the organization.
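
The same views are straightforward to reproduce outside of Excel; as a hypothetical sketch, the pandas equivalents of those pivot-table and filter steps might look like this.

```python
# Hypothetical sketch: pandas equivalents of the Excel pivots and filters.
import pandas as pd

tickets = pd.read_csv("access_tickets.csv",
                      parse_dates=["requested_at", "closed_at"])
tickets["closure_days"] = (tickets["closed_at"] - tickets["requested_at"]).dt.days

# Tickets that exceeded the population's average closure time.
slow = tickets[tickets["closure_days"] > tickets["closure_days"].mean()]

# Who requested and who approved the greatest number of tickets.
top_requesters = tickets["requested_by"].value_counts().head(5)
top_approvers  = tickets["approved_by"].value_counts().head(5)

# Volume of requests against the most restrictive systems.
restricted = tickets[tickets["classification"] == "most_restrictive"]
print(len(restricted), "requests for access to most-restrictive systems")
```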

This analysis was done during the planning stage of the engagement. The results provided key metrics, such as the total number of requests processed by the team, that set the context for the planning memo and for the audit report. Further, it gave our engagement team an opportunity to reflect and organize our level of effort for specific systems to apply more granular audit procedures.


Example 2: Travel and Entertainment (T&E) audit

Travel and entertainment is a good business process in which to consider the use of data analytics in internal audit. There are many financial and time-based metrics based on “bright-line” rules embedded in the business process. Further, this business process typically generates an intuitive, clean data set with straightforward data formatting and exporting options.

Part 1: Understanding the business process, risks, and management's use of analytics.

Employees were issued a corporate credit card, managed by a contracted bank, for T&E transactions. Transaction data was fed from the card issuer into a separate, commercial off-the-shelf T&E application, where employees itemized transactions and uploaded receipts. The transactions were then approved by the employee’s next-level manager and reviewed by Accounts Payable in accordance with the corporate-wide T&E policy, and then fed into the core accounting application to record the expenses and pay the card issuer. Further, the centralized accounting function reported internally on T&E spend each month and used some degree of data analysis and analytics to facilitate that reporting. Therefore, with a clear understanding of the business process, along with guidance from what the process owner was doing with the data, it was appropriate to move to the next step.

Part 2: Identifying "bright-line" rules in the business process.

The enterprise developed a corporate-wide policy with several rules. These included approval of T&E expenses by the employee’s next-level manager, receipts for expenses over a specified dollar amount, itemizations coded to the correct general ledger account number, receipt amounts that must match the amount of the charge, and a list of prohibited expenses. Therefore, with several “bright-line” rules identified, it made sense to move to the next step.
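
As an illustration of how directly such rules translate into tests, the sketch below applies several of them across a full expense population; the column names, the $75 receipt threshold, and the merchant category codes are assumptions, not the actual policy.

```python
# Hypothetical sketch: policy bright-line rules as vectorized tests.
import pandas as pd

expenses = pd.read_csv("te_expenses.csv")
PROHIBITED_MCC = {"7995", "6051"}  # illustrative merchant category codes

flags = pd.DataFrame({
    "missing_receipt": (expenses["amount"] > 75) & ~expenses["receipt_attached"],
    "amount_mismatch": expenses["receipt_amount"] != expenses["charge_amount"],
    "self_approved":   expenses["approver_id"] == expenses["employee_id"],
    "prohibited_mcc":  expenses["mcc"].astype(str).isin(PROHIBITED_MCC),
})

# Any expense violating at least one rule lands on the exception report.
exceptions = expenses[flags.any(axis=1)]
```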

Part 3: Assessing data availability and usability.

Internal audit was granted read-only access to the commercial card portal and the off-the-shelf T&E application. Both applications contained several canned options to report on merchant category and T&E spend, and the data did not require much data cleansing to start working with. Therefore, it made sense to invest the time and resources into leveraging data analysis and descriptive analytic techniques into the audit program.

How did I apply data analysis and descriptive analytic techniques?

As in the logical access audit described above, I would classify my approach as a series of data analysis steps with some descriptive analytics, since all activity had already taken place.

The analysis was done in Excel, using common techniques such as filters, formulas, pivot tables, and charts. While TeamMate Analytics (TMA) was not available to me at the time, the TMA add-on to Excel would be useful for a T&E audit because of the comprehensive suite of data analytic tests that come out of the box.

With the entire population of activity from the commercial card portal and the T&E portal, several hypotheses emerged. For the population of all transactions logged in the commercial card application, several canned reports were available: the top 10 merchants by transaction volume, the top 10 merchants by total dollar spend, the top 10 merchant category codes used, and card transactions authorized against the unauthorized merchant category code listing. For the population of T&E expenses approved and paid, tests included identifying the vendor with the most spend under the receipt threshold (for example, gift cards), counting instances of spend ending in .00, and performing a high-level analysis of T&E spend by department to see whether total spend by general ledger code was reasonable, given the objectives of the business unit. This analysis gave us the information needed to follow up with management with specific questions.
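
A hypothetical sketch of a few of those tests, reusing the invented expense extract from above:

```python
# Hypothetical sketch: exploratory T&E tests over the full population.
import pandas as pd

expenses = pd.read_csv("te_expenses.csv")

# Top 10 merchants by transaction volume and by total dollar spend.
by_volume = expenses["merchant"].value_counts().head(10)
by_spend  = expenses.groupby("merchant")["amount"].sum().nlargest(10)

# Round-dollar amounts (spend ending in .00) can merit follow-up.
round_dollar = expenses[expenses["amount"].round(2) % 1 == 0]

# Vendor with the most spend just under an assumed $75 receipt threshold.
near_threshold = expenses[expenses["amount"].between(70, 74.99)]
top_under = near_threshold.groupby("merchant")["amount"].sum().nlargest(1)
```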

Taking the next steps in data analytics for internal audit success

It takes an investment of time to get started with data analytics in internal audit. Hopefully, the above decision-making framework for judging whether a situation is favorable for applying data analytics to a particular audit gives you some traction in this space. A good next step is to further review your organization’s data sets, alongside your understanding of the business process, to see what types of hypotheses can be developed. Lastly, make the leap! Carve out some dedicated time each week, brainstorm with colleagues, and begin applying basic data analysis techniques to structured data sets to see what happens. From there, you can transition to working with unstructured data, working through data quality considerations, and applying more advanced statistical techniques.


Peter Zimmerman
Senior Consultant, Wolters Kluwer TeamMate
Pete Zimmerman, CPA, CISA, is a Senior Consultant in the TeamMate Professional Services practice. 