Saturday, September 5, 2015

Behind the Scenes at a Business Credit Bureau



You may have a mental image of how credit reports are created: an orderly exchange of clean, tidy data that flows seamlessly through some standard process, untouched by human hands, and lands straight in your web browser.  But that would be wrong.  As with any well-executed professional endeavor, it only looks that effortless.



I entered the world of credit data in the late 1990s as part of the team that delivered one of the first business credit reporting platforms on the internet.  Back then, I thought a lot like you.  I assumed that credit data came from nimble-fingered accountants and would therefore be inherently orderly.  I assumed that the bigger the company sharing trade experiences with us, the more exacting and precise the data was likely to be.





At least now I no longer have a 9-track computer tape reader in my office; all of our data is transmitted digitally and we don’t have to wait for FedEx to deliver physical media.  Other than that, not much has changed.



Consider the lowly data entry clerk who manually inputs much of the information into these systems.  They often aren’t paid enough to think or care deeply about what they are doing, and in many cases, they don’t have the time to notice, much less correct, misspellings like COMAPNY.  So let’s imagine three different creditors sharing their trade experiences about a customer I will call Walt’s Cigar Rentals.  We might get the name and address information in the following three ways:



WALT’S CIGAR RENTAL
123 WALT AVE
WALTVILLE, VT 01342

WALTS CIGAR RENTALS COPMANY
123 WALT
WALTVILLE VT

WALT’S CIGARRNTL CO
123 WALT STRET
MALTVILLE, VERMONT 1342



And this is a simple example.  Don’t get me started about non-US addresses, often entered into US-made software that doesn’t understand, for instance, that many countries have four- or six-digit postal codes rather than five-digit codes like ours.  Canada’s are not even all numeric.  And while we’re on the subject of Canada: many addresses there could be expressed in either English or French, and we still have to recognize them as the same!



Now comes a user searching for this company, and they might enter the name in yet different ways.  If they search for WALT’S with an apostrophe, they won’t find the second version.  If they narrow the search to WALTVILLE, then the version with the misspelled city, MALTVILLE, will be omitted.  When you consider that we can get trade experiences on any one company from hundreds of creditors, and that the search terms entered by thousands of users may vary in their own right, you’ve encountered the fundamental challenge of processing all this data:  How does a computer, which is famously simple-minded and literal about matching, recognize hundreds of variants as the very same company?
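To make the literal-matching problem concrete, here is a minimal Python sketch.  It is an illustration only, not the system described in this post: it compares the three sample names against one search term, first with exact equality and then with a similarity ratio from the standard library's difflib module.  The search term and the 0.7 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()


# The three creditor-supplied names from the example above.
variants = [
    "WALT'S CIGAR RENTAL",
    "WALTS CIGAR RENTALS COPMANY",
    "WALT'S CIGARRNTL CO",
]

# Hypothetical search term typed by a user.
query = "WALT'S CIGAR RENTALS"

# Literal matching: a variant counts only if it equals the query exactly.
exact_hits = [v for v in variants if v == query]

# Fuzzy matching: keep variants whose similarity clears the threshold.
THRESHOLD = 0.7  # assumption for this sketch; a real system would tune this
fuzzy_hits = [v for v in variants if similarity(v, query) >= THRESHOLD]

print("exact:", exact_hits)
print("fuzzy:", fuzzy_hits)
```

Under these assumptions the exact comparison finds nothing, while the similarity test keeps all three variants.  A threshold this loose would also produce false matches on a large file, which is one reason similarity scoring alone is not enough.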



Even if the computer recognizes all the different variations, who wants their business credit report cluttered with all of them?  The debtor should appear once, with the name and address correctly rendered, so that users can make quick credit decisions with confidence.  Nothing inspires doubt like obvious mistakes in the representation of company names and addresses.



This is where data hygiene enters the picture.  If that evokes images of data elves scrubbing data with (industrial-strength) soap and water, that is not, metaphorically speaking, far from the truth!  But to digest data in real time and avoid falling behind, this process must be automated.  We must patiently teach computers to correct random errors introduced by humans.  It is this requirement that keeps people like me awake at night.



Thankfully, we’ve had nearly two decades to fine-tune the process, and it’s getting better all the time.  We use a combination of postal standardization software, tools that let us capture intelligence over time about common misspellings and odd abbreviations, and an extensive layer of proprietary software.  With these, our system would recognize the three sample addresses above and report them in a standardized way:



WALTS CIGAR RENTALS CO
123 WALT ST
WALTVILLE VT 01342
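The name half of that cleanup can be sketched in a few lines of Python.  This is a simplified illustration under stated assumptions, not the proprietary software mentioned above: TOKEN_FIXES stands in for a table of misspellings learned over time, CANONICAL_NAMES stands in for a master list of known debtors, and the 0.8 cutoff is arbitrary.  Address standardization is left out because it depends on postal reference data.

```python
import re
from difflib import get_close_matches
from typing import Optional

# Hypothetical table of misspellings and abbreviations learned over time.
TOKEN_FIXES = {
    "COPMANY": "CO",
    "COMAPNY": "CO",
    "COMPANY": "CO",
    "CIGARRNTL": "CIGAR RENTALS",
}

# Hypothetical master list of canonical debtor names.
CANONICAL_NAMES = ["WALTS CIGAR RENTALS CO", "ACME PLUMBING SUPPLY INC"]


def clean_name(raw: str) -> str:
    """Uppercase, drop punctuation, and apply known token corrections."""
    text = re.sub(r"[^A-Z0-9 ]", "", raw.upper())
    tokens = [TOKEN_FIXES.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


def resolve(raw: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the closest canonical name, or None if nothing is close enough."""
    matches = get_close_matches(clean_name(raw), CANONICAL_NAMES, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for sample in ["WALT'S CIGAR RENTAL",
               "WALTS CIGAR RENTALS COPMANY",
               "WALT'S CIGARRNTL CO"]:
    print(sample, "->", resolve(sample))
```

The two steps are deliberately separate: the lookup table fixes errors that have been seen before, and the closest-match step catches names that are still slightly off, such as the missing S and CO in the first sample.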



Of course there’s far more to it.  I haven’t discussed other interesting issues, such as how accounts receivable can be aged differently by different companies, or how the methods of expressing data points like high credit and days-to-pay can vary or just be plain incorrect.  Nor is the export and transmission of trade experience data always 100-percent automated by the creditor, which results in constant small changes in the data layout and even the file format.  In spite of this, our system automatically handles roughly 80 percent of the data sent to us, and human intervention for the rest is often a matter of minutes.  It takes a long time and a lot of experience to achieve such levels of automation with such unruly data.



Nor have I mentioned the need to remove non-objective comments stashed in data fields not meant for such things.  It would not do for our business credit reports to include some clerk’s notation appended to a company name that THIS CUSTOMER IS A PAIN!  In addition, we deal with cryptic notations or acronyms that have meaning only within the collection department of a company, or within an industry.  So we have to recognize when MACYS EAST COAST ACCOUNTS should really just be MACYS INC, or that MACYS INC EDI just means that the bills are paid by “electronic data interchange” and so the EDI can be removed as superfluous for credit reporting purposes.
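A rule-based pass for this kind of scrubbing might look like the following Python sketch.  Every table in it is hypothetical and seeded only with the examples from the paragraph above; a real rule set would be far larger and built up from experience.

```python
import re

# Hypothetical patterns for clerk commentary that must never reach a report.
COMMENT_PATTERNS = [re.compile(r"\bTHIS CUSTOMER IS\b.*$")]

# Hypothetical trailing notations with no meaning for credit reporting.
SUPERFLUOUS_SUFFIXES = ("EDI",)

# Hypothetical map from internal account labels to the reportable name.
ALIASES = {"MACYS EAST COAST ACCOUNTS": "MACYS INC"}


def scrub_name(raw: str) -> str:
    """Remove commentary and internal notations from a company name field."""
    name = raw.upper().strip()
    # Cut any clerk commentary from where it starts to the end of the field.
    for pattern in COMMENT_PATTERNS:
        name = pattern.sub("", name)
    # Collapse runs of whitespace and trim leftover separator characters.
    name = re.sub(r"\s+", " ", name).strip(" -*")
    # Drop trailing notations such as EDI.
    for suffix in SUPERFLUOUS_SUFFIXES:
        if name.endswith(" " + suffix):
            name = name[: -len(suffix)].rstrip()
    # Finally, translate internal account labels to the reportable name.
    return ALIASES.get(name, name)


print(scrub_name("MACYS INC EDI"))
print(scrub_name("MACYS EAST COAST ACCOUNTS"))
print(scrub_name("WALTS CIGAR RENTALS CO - THIS CUSTOMER IS A PAIN!"))
```

The suffix check requires a preceding space, so a name that merely ends in the letters EDI is left alone.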



We have a large stable of internal “sanity checks,” too.  They help to ensure that, for instance, a creditor didn’t accidentally send us the same file they sent last month, or the same month last year.  Or that the total portfolio balance doesn’t vary by a suspicious amount month-to-month, indicating a possible malfunction in the creditor’s data export.  We have staff to contact creditors and verify suspicious changes or request corrected replacement files.
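Two of those checks are simple enough to sketch in Python.  The function names and the 50 percent tolerance below are assumptions made for illustration; they are not taken from the actual system.  The first check fingerprints each incoming file so that a resend of an earlier file is caught, and the second flags a month-to-month swing in the total portfolio balance.

```python
import hashlib
from typing import Set


def file_fingerprint(data: bytes) -> str:
    """Return a SHA-256 digest that identifies a file by its exact contents."""
    return hashlib.sha256(data).hexdigest()


def is_duplicate_file(data: bytes, seen_fingerprints: Set[str]) -> bool:
    """True if a byte-for-byte identical file has been received before."""
    return file_fingerprint(data) in seen_fingerprints


def balance_shift_is_suspicious(previous_total: float,
                                current_total: float,
                                max_change: float = 0.5) -> bool:
    """True if the portfolio total moved by more than max_change (a fraction)."""
    if previous_total == 0:
        # No baseline to compare against; any nonzero total deserves a look.
        return current_total != 0
    change = abs(current_total - previous_total) / abs(previous_total)
    return change > max_change


# Fingerprints of files from earlier months (hypothetical contents).
seen = {file_fingerprint(b"ACCT001,1500.00\nACCT002,320.50\n")}

print(is_duplicate_file(b"ACCT001,1500.00\nACCT002,320.50\n", seen))
print(balance_shift_is_suspicious(1_000_000, 1_100_000))
print(balance_shift_is_suspicious(1_000_000, 100_000))
```

A content hash only catches exact resends.  A file that was re-exported with a new date stamp but the same balances would need a comparison of the records themselves.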



Finally, we have mechanisms to guard against credit fraud.  It’s rare, but not unheard of, for someone to set up one or more fake companies that share contrived trade experiences just to inflate the credit scores of certain slow-paying or non-paying debtors.  Surprisingly, there are telltale signs in such data that we look for regularly.


