By Gil Irizarry
Customs and border security have never been more important. Around the world, professionals from a broad variety of federal, state, and local governments strive to prevent the passage of criminals, halt the import of counterfeit goods, and manage immigration — all while facilitating the passage of legitimate travelers and merchandise.
“Pushing the border out” is one way to accomplish this. The term refers to implementing technology that can empower customs and border officials to assess risk before people or goods arrive at a nation’s land crossings, airports, or maritime borders. Making decisions in advance can improve national safety while easing entry as appropriate. It can also facilitate information sharing among customs and border agencies.
Unfortunately, hard-to-replace legacy name matching systems often hinder these efforts. They miss too many matches; alert to too many mismatches; provide inadequate tuning capabilities; struggle to match corporate names; and provide opaque match scores.
The good news is that application programming interface (API)-based name matching technologies now on the market can help customs and border agencies overcome these challenges. Better still, these solutions can be deployed with minimal disruption.
Using these API-based name matching systems can help you:
1. Find more matches
Too many customs and border security agents rely on outdated, full-text search platforms to match names in structured text, such as when comparing names of incoming travelers against watchlists. The name matching capabilities of these search engines lie somewhere between binary match/no match determinations and fuzzy matching — a computing approach that improves upon binary processes by considering degrees of truth. Returning only exact or near-exact matches, full-text search platforms are fuzzy enough for general searches, but not expansive or fast enough for optimized name matching.
In addition, many of these platforms accommodate only a limited number of languages, making it difficult to match translated names, transliterated names, and names rendered in non-Latin scripts. They also fail to spot aliases, nicknames, misspellings, honorifics, or out-of-order names.
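The gap between exact and fuzzy matching can be illustrated with a minimal sketch. It uses only Python's standard library; the function names and the sample names are assumptions made for illustration, and a production name matcher would rely on far richer linguistic resources than a character-level similarity ratio.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and sort tokens so word order is ignored."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(sorted(cleaned.lower().split()))


def exact_match(a: str, b: str) -> bool:
    """Binary match/no match: any difference at all is a miss."""
    return a == b


def fuzzy_score(a: str, b: str) -> float:
    """Similarity in [0, 1] computed on the normalized forms of both names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# A misspelled, out-of-order traveler name compared with a watchlist entry.
traveler = "Smith, Jon Aaron"
watchlist_entry = "John Aaron Smith"
print(exact_match(traveler, watchlist_entry))
print(round(fuzzy_score(traveler, watchlist_entry), 2))
```

The exact comparison rejects the pair outright, while the fuzzy score stays high because normalization removes the ordering and punctuation differences and the remaining misspelling costs only one character.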
2. Minimize mismatches and associated manual checks
Missed matches often go hand in hand with excess alerts. How can this happen?
Full-text search platforms often lack appropriate disambiguation capabilities. They don’t always apply additional identifiers (ages, addresses, and places of birth, for example) to each name studied. This is vital when examining people with common names. Imagine that a tourist named John A. Smith wants to visit the United States. His name is similar to one appearing on a watchlist: John Aaron Smith. If a name matching system does not apply additional identifiers, it cannot distinguish between the John A. Smith who wants to visit the Grand Canyon with his family and John Aaron Smith, known human trafficker. It therefore alerts on both, creating additional work for the agents who must investigate these alerts.
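The role of additional identifiers can be sketched as follows. The record fields, the threshold, and the sample values are all hypothetical; the point is only that a similar name triggers an alert unless a known identifier contradicts the watchlist record.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass(frozen=True)
class Person:
    name: str
    birth_year: Optional[int] = None   # hypothetical identifier fields
    birth_place: Optional[str] = None


def name_score(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for a real name matcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def identifiers_conflict(a: Person, b: Person) -> bool:
    """True when both records carry the same identifier and the values disagree."""
    pairs = [(a.birth_year, b.birth_year), (a.birth_place, b.birth_place)]
    return any(x is not None and y is not None and x != y for x, y in pairs)


def should_alert(traveler: Person, listed: Person, threshold: float = 0.75) -> bool:
    """Alert on a similar name only if no known identifier contradicts the listed record."""
    if name_score(traveler.name, listed.name) < threshold:
        return False
    return not identifiers_conflict(traveler, listed)


listed = Person("John Aaron Smith", birth_year=1971, birth_place="Nevada")
tourist = Person("John A. Smith", birth_year=1985, birth_place="Ohio")
print(should_alert(tourist, listed))
```

When a traveler record carries no identifiers at all, this sketch still raises the alert, since nothing is available to rule the match out.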
3. Adjust matching parameters to further reduce the need for manual investigations
Too many legacy systems have inflexible match parameters. What does this mean to customs and border security? Agents and others cannot fine-tune match parameters to suit their risk profiles.
Let’s return to the case of John Aaron Smith, human trafficker. Knowing that he often translates his name into Spanish or German (Juan Aarón Herrero or Johann Aron Schmidt), U.S. Department of State agents charged with examining visa applicants may want to set their match parameters to give extra weight to those versions of Smith’s name. If they’re matching lists written in the United States with those compiled in the United Kingdom, they may want to adjust for the British spelling of Smith’s last name: Smythe. If they have reason to know that John Aaron Smith always uses his middle name or middle initial, they may want to give more weight to middle names and initials, while rejecting instances of John Smith without the “Aaron” and instances of persons with different middle names or initials, such as John Michael Smith or John M. Smith. But legacy name matching systems typically do not enable these types of adjustments, again leading to unnecessary manual work for State Department officials charged with identifying all the John Smiths who wish to enter the United States each year.
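Adjustable match parameters of this kind might look like the following sketch. The variant table, the weights, and the `require_middle` switch are hypothetical illustrations and do not describe any particular product's configuration.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical variant table mapping translated or respelled forms to one canonical form.
VARIANTS: Dict[str, str] = {
    "juan": "john", "johann": "john",
    "aron": "aaron",
    "herrero": "smith", "schmidt": "smith", "smythe": "smith",
}


@dataclass
class MatchConfig:
    weights: Dict[str, float] = field(
        default_factory=lambda: {"given": 0.3, "middle": 0.2, "surname": 0.5}
    )
    initial_credit: float = 0.8   # partial credit when an initial matches a full name
    require_middle: bool = False  # reject candidates whose middle name is missing or different


def strip_accents(text: str) -> str:
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def split_name(full: str) -> Tuple[str, str, str]:
    """Split a non-empty 'Given [Middle ...] Surname' string into canonical parts."""
    raw = strip_accents(full).lower().replace(".", "").split()
    tokens = [VARIANTS.get(t, t) for t in raw]
    return tokens[0], " ".join(tokens[1:-1]), tokens[-1]


def part_score(a: str, b: str, initial_credit: float) -> float:
    if a == b:
        return 1.0
    short, long_ = sorted((a, b), key=len)
    if len(short) == 1 and long_.startswith(short):
        return initial_credit
    return 0.0


def score(query: str, candidate: str, config: MatchConfig) -> float:
    """Weighted average of per-part scores, normalized to the range 0 to 1."""
    q, c = split_name(query), split_name(candidate)
    keys = ("given", "middle", "surname")
    parts = {k: part_score(x, y, config.initial_credit) for k, x, y in zip(keys, q, c)}
    if config.require_middle and parts["middle"] == 0.0:
        return 0.0
    total = sum(config.weights.values())
    return sum(config.weights[k] * parts[k] for k in keys) / total


strict = MatchConfig(require_middle=True)
for candidate in ("Juan Aarón Herrero", "John A. Smith", "John Smith", "John M. Smith"):
    print(candidate, round(score("John Aaron Smith", candidate, strict), 2))
```

With the strict configuration, the translated forms and the matching middle initial keep high scores, while candidates with no middle name or a different one are rejected outright.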
4. Improve matching of corporate names
Full-text search engines face particular challenges when trying to match corporate names.
Subsidiaries abound in the corporate arena. Typical search engines cannot match parent company “Lookin’ Lovely, Inc.” to its subsidiary, “Fine Fashions LLC,” and to Fine Fashions’ subsidiary, “Haute Handbags.” Yet the behavior of parent and subsidiary companies is inextricably linked. If customs and border security officials know that Lookin’ Lovely exports counterfeit goods, they would no doubt suspect Haute Handbags of selling knock-off purses. Without knowledge of the parent company, these officials would be more likely to allow passage of Haute Handbags merchandise.
Nicknames and initialisms are also common, and too often stump existing name matching systems. You may eat at your favorite burger joint, JBB, or pick up your prescriptions at a VFP chain drugstore, without knowing that, on the dotted line, these companies are called “John’s Best Burgers” and “Very Fine Pharmaceuticals.” Nicknames and initialisms can obscure the true identities of companies that wish to do business internationally.
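Both corporate problems, linking subsidiaries to parents and resolving initialisms, can be sketched in a few lines. The ownership table, the suffix list, and the function names below are hypothetical and reuse the fictitious companies from the examples above; real corporate data would come from registry or commercial sources.

```python
import re
from typing import Dict, List, Set

LEGAL_SUFFIXES: Set[str] = {"inc", "llc", "ltd", "corp", "co"}

# Hypothetical ownership records (subsidiary -> parent) in canonical form.
PARENT_OF: Dict[str, str] = {
    "haute handbags": "fine fashions",
    "fine fashions": "lookin lovely",
}


def canonical(name: str) -> str:
    """Lowercase, drop punctuation, and remove legal suffixes such as 'Inc.' or 'LLC'."""
    tokens = re.sub(r"[^\w\s]", "", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def initialism(name: str) -> str:
    """First letter of each word in the canonical name."""
    return "".join(token[0] for token in canonical(name).split())


def matches_initialism(short: str, full: str) -> bool:
    return canonical(short).replace(" ", "") == initialism(full)


def ownership_chain(name: str) -> List[str]:
    """Walk subsidiary -> parent links up to the ultimate parent, guarding against cycles."""
    chain = [canonical(name)]
    while chain[-1] in PARENT_OF and PARENT_OF[chain[-1]] not in chain:
        chain.append(PARENT_OF[chain[-1]])
    return chain


def linked_to_flagged(name: str, flagged: Set[str]) -> bool:
    """True if the company or any parent above it appears on the flagged list."""
    flagged_canonical = {canonical(f) for f in flagged}
    return any(company in flagged_canonical for company in ownership_chain(name))


print(matches_initialism("JBB", "John’s Best Burgers"))
print(ownership_chain("Haute Handbags"))
print(linked_to_flagged("Haute Handbags", {"Lookin’ Lovely, Inc."}))
```

Canonicalizing names before comparison is what lets “Fine Fashions LLC” and “Fine Fashions” resolve to the same node in the ownership chain.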
5. Trust your match scores
How much do you trust — or even understand — your match scores?
Many search engines use complicated ranking functions to estimate a name’s match to a given query. Their results are often based on factors such as how often, or how infrequently, a name appears in a document or set of documents. The match scores these search engines deliver are ratios based on those frequencies. This scoring system is suboptimal for name matching because, while it gives users an idea of how often search terms appear, it does not clearly indicate how closely search terms match.
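The difference between a frequency-based relevance score and a closeness score can be shown with a small sketch. The frequency function below is a simplified TF-IDF-style formula written for illustration, not the ranking function of any specific search engine, and the three sample records are invented.

```python
import math
from difflib import SequenceMatcher
from typing import List


def frequency_score(query: str, document: str, corpus: List[str]) -> float:
    """Simplified TF-IDF: rewards query terms frequent in this document and rare in the corpus."""
    doc_tokens = document.lower().split()
    score = 0.0
    for term in query.lower().split():
        tf = doc_tokens.count(term) / len(doc_tokens)
        containing = sum(1 for d in corpus if term in d.lower().split())
        idf = math.log((1 + len(corpus)) / (1 + containing)) + 1.0
        score += tf * idf
    return score


def similarity_score(query: str, name: str) -> float:
    """Character-level closeness in [0, 1], independent of how often terms appear."""
    return SequenceMatcher(None, query.lower(), name.lower()).ratio()


corpus = ["john smith", "jon smyth", "smith smith smith and sons"]
query = "john smith"
for record in corpus:
    print(record,
          round(frequency_score(query, record, corpus), 2),
          round(similarity_score(query, record), 2))
```

In this sketch the frequency score gives the near-identical “jon smyth” a zero because no token matches exactly, yet rewards the unrelated record that merely repeats “smith”. The similarity score ranks the two the other way around, which is the behavior name matching needs.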
A better way
Technologically superior, API-based name matching solutions now on the market can alleviate these problems, improving customs and border security.
AI-powered name matching systems match names across a broad variety of languages and scripts, detecting aliases, nicknames, and misspellings. To disambiguate names (to find the right “John Aaron Smith”), they apply additional identifiers to each record. These identifiers help to differentiate between the real estate agent John Aaron Smith hoping to see Times Square and the John Aaron Smith who sells human beings. Similar capabilities help link corporate names to their nicknames and to names of their subsidiaries. In doing so, these systems both improve matching capabilities and dramatically reduce instances of false matches.
The best name matching technologies also provide clear scores, helping you to understand why two names have been deemed a “match” or a “mismatch,” thereby giving you confidence in the match obtained. Unhappy with the decisions your system has made? Good systems empower you to track the “matches” that you would consider a mismatch, track the “mismatches” you would consider a match, then adjust scoring parameters accordingly, giving more or less weight to parameters including disordered name components; translations or transliterations; initialisms; nicknames and aliases; and homonyms and gender conflicts (John Aaron Smith vs. Jan Erin Smith). Finally, by improving match processes, these systems reduce the investigative time that must be spent responding to alerts.
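The review-and-adjust loop can be sketched as a simple threshold search over decisions that analysts have already reviewed. The scores, labels, and cost ratio below are invented for illustration; a real system would tune many parameters, not only a single alert threshold.

```python
from typing import List, Tuple

# (match score, reviewer judged it a true match); illustrative values only.
Reviewed = Tuple[float, bool]


def count_errors(reviewed: List[Reviewed], threshold: float) -> Tuple[int, int]:
    """Return (false alerts, missed matches) when alerts fire at score >= threshold."""
    false_alerts = sum(1 for s, is_match in reviewed if s >= threshold and not is_match)
    missed = sum(1 for s, is_match in reviewed if s < threshold and is_match)
    return false_alerts, missed


def best_threshold(reviewed: List[Reviewed], miss_cost: float = 5.0) -> float:
    """Pick the threshold with the lowest weighted error; ties go to the lower threshold."""
    # Candidates: every observed score, plus one value above 1 meaning "never alert".
    candidates = sorted({s for s, _ in reviewed} | {1.01})

    def cost(threshold: float) -> float:
        false_alerts, missed = count_errors(reviewed, threshold)
        return false_alerts + miss_cost * missed

    return min(candidates, key=cost)


reviewed = [(0.95, True), (0.88, True), (0.82, False),
            (0.79, True), (0.70, False), (0.64, False)]
print(best_threshold(reviewed))                  # a missed match costs 5x a false alert
print(best_threshold(reviewed, miss_cost=0.1))   # false alerts dominate the cost
```

Weighting a missed match more heavily than a false alert pushes the threshold down, which reflects a cautious risk profile; lowering `miss_cost` trades some recall for fewer manual investigations.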
About Rosette by Babel Street
Rosette by Babel Street is a scalable API-based software solution that enables increased automation of personal and organization name-matching and disambiguation across large volumes of structured and unstructured text. It employs AI-powered fuzzy matching capabilities to recognize names in all their varieties. Rosette’s language capability transliterates names from dozens of languages, including those written in non-Latin scripts such as Arabic, Chinese, Hebrew, Japanese, Korean, and Russian.
Rosette can help you build confidence in your name-matching processes. Rather than just providing a “match/no match” report, Rosette provides normalized, actionable scores indicating its confidence in each match. Scores run from 0 to 1: for example, 0.5 indicates a 50% degree of confidence in the match, and 0.75 indicates a 75% degree of confidence. Enabling users to easily fine-tune more than 20 match parameters, and to see how matches are being scored and the decisions behind the scores, provides increased confidence in the validity of your matches while reducing the number of mismatches that must be examined manually.
With its highly flexible API architecture, Rosette is easily integrated into existing systems. Uniquely transparent and explainable, it is trusted by governments worldwide.
All names, companies, and incidents portrayed in this document are fictitious. No identification with actual persons (living or deceased), places, companies, or products is intended or should be inferred.