Matching addresses from different data sources and multiple locations
If you want to determine if any employees are operating as vendors, you will need to match the employee’s address to the vendor’s address. If there are a Ship To and Bill To location, you will need to match against both.
The best way to match addresses is to learn from the credit card industry and use something similar to their AVS (Address Verification System). Using the AVS methodology, you strip out the address number from the address street, combine it with the zip / postal code and use that as the matching value. The logic/process to accomplish this:
- Identify the number of street address fields in your vendor file and in your employee file.
- Extract the records with street addresses in either of the street address fields, separating them from entries with PO Boxes. Do this for both the vendor file and the employee file.
- Extract the Street Number from the Street Address for all address fields and name the new field accordingly.
- Combine the Street Number, from one or both address fields, with the Zip / Postal Code. Do this for the vendor and employee files.
- Match the combined value from each of the sources in the vendor file to one or both combined values from the employee file.
- Conduct further research on all matching entries.
Matching addresses in the same file
Vendor fraud sometimes occurs when fake companies are set up. To test your vendor file for duplicate addresses for different vendors, you would follow a similar process, but only on the one dataset:
- Identify the number of street address fields in your vendor file.
- Extract the records with street addresses in either of the street address fields, separating them from entries with PO Boxes.
- Extract the Street Number from the Street Address for all address fields and name the new field accordingly.
- Combine the Street Number, from one or both address fields, with the Zip / Postal Code.
- Match the combined value from one or both address fields to other entries in the same address field, or the second address field if present.
- Conduct further research on all matching entries.
The idea between both of these tests is to identify the most likely candidates for review. You may still have some false positives to test against, but the number will be much less than using other methodologies.
The plan
The concepts from the testing discussed above are relatively easy to understand. The difficulty is in developing a tool or process to conduct this test and deploying it to all of the auditing team when it is needed.
- Identify when this testing should be conducted and on what datasets.
- Identify the tool you are going to use to conduct this test. The default for many is Excel – you would use this to split and then combine the fields as noted.
- For audits that are regularly completed, repeated, or similar build the process and document it for future use.
- Provide additional recommendation guidance to business units on data entry standards and best practices.
Conclusion
Identifying matching addresses can help in detecting and minimizing the risk of loss due to fraud, but it must be done in a methodical repeatable fashion based on the data available.