By Tina Lieu
Entity recognition helps link persons and organizations to the agency knowledge base
The unknown “unknowns” keep law enforcement and intelligence professionals awake at night. They worry about unknown key actors and the connections between seemingly unrelated cases. If known, the connections would reveal a larger view and possibly a faster, better way to tackle the case. To meet this challenge, named entity recognition (also called entity extraction) links people, organizations, and places found in data to entries in a common investigative knowledge base. This is how AI can reveal the unknowns.
Consider this hypothetical scenario. Law enforcement officer Abel is tracking a suspected drug dealer, Grimp Arana, who is an Empire State University sophomore majoring in chemistry and affiliated with a rock climbing club. Meanwhile, intelligence analyst Bert is tracking a foreign intelligence officer in New York City, code name Arachnid, who appears only at night but has been seen taking photos of “homeland-security sensitive areas” and hanging around the municipal water department.
Abel and Bert share a cross-agency knowledge base where they store information about Grimp Arana and Arachnid: Which places do they frequent? Where and when are they traveling? Whom do they meet?
What if officer Abel and analyst Bert realized their investigations were connected when the knowledge base recognized that the profiles and movements of Grimp Arana and Arachnid were very similar, and they were most likely the same person?
This scenario is not wishful thinking. Finding links between disparate investigations is what we at Babel Street mean when we say “connecting the dots.” These connections are possible through shared knowledge bases that leverage named entity recognition.
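As a rough illustration of how a shared knowledge base might flag two profiles as likely belonging to the same person, the sketch below scores attribute overlap with Jaccard similarity. It is not how Rosette works internally. The profile attributes, the `likely_same` helper, and the 0.4 threshold are all assumptions made up for this example.

```python
def jaccard(a: set, b: set) -> float:
    """Share of attributes two profiles have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical profile attributes, invented for illustration only.
PROFILES = {
    "Grimp Arana": {
        "new york city", "empire state university", "chemistry",
        "rock climbing club", "municipal water department",
    },
    "Arachnid": {
        "new york city", "empire state university", "night activity",
        "photographs sensitive sites", "municipal water department",
    },
    "Unrelated Person": {"boston", "accounting"},
}


def likely_same(profiles, threshold=0.4):
    """Return (name, name, score) for every pair at or above the threshold."""
    names = sorted(profiles)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = jaccard(profiles[first], profiles[second])
            if score >= threshold:
                pairs.append((first, second, round(score, 2)))
    return pairs


matches = likely_same(PROFILES)
```

A production system would weigh attributes differently (a shared city means far less than a shared late-night sighting at the same facility), but the principle of comparing context across cases is the same.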
The latest release of Rosette® by Babel Street refreshed its Wikidata data, which serves as the default knowledge base for named entity recognition and linking and enables Rosette to tell apart entities with the same name. (See sidebar for details.)
How named entity recognition distinguishes between entities with the same name
When Rosette extracts the names of people, places, and organizations from text data streams – such as message traffic, news, and social media – it uses a knowledge base (such as Wikidata) to link names to entries in the knowledge base. In this way, Rosette can distinguish between entities sharing the same name. Rosette compares the document context of each name with possible matching knowledge base entries. For a person, the context might be date of birth, place of residence, education, and career-related terms.
For example, the name “Michael Collins” can refer to an astronaut on the Apollo 11 Moon landing mission or a leader in the early-20th-century struggle for Irish independence.
Entity disambiguation through entity linking
Suppose an article mentions “Michael Collins” alongside Rome and the United States Military Academy. Because both are associated with the astronaut in the knowledge base, Rosette named entity recognition correctly links the mention to the astronaut, not the Irish leader.
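The context comparison described here can be sketched as a simple term-overlap score. This is a minimal illustration under stated assumptions, not Rosette's actual algorithm or API: the `KNOWLEDGE_BASE` entries, their IDs, the context terms, and the `link_entity` function are all hypothetical.

```python
import re

# Toy knowledge base. The IDs and context terms are illustrative assumptions,
# not real Wikidata records or Rosette output.
KNOWLEDGE_BASE = [
    {
        "id": "kb-astronaut",
        "name": "Michael Collins",
        "context": {"astronaut", "apollo", "nasa", "rome",
                    "united states military academy"},
    },
    {
        "id": "kb-irish-leader",
        "name": "Michael Collins",
        "context": {"irish", "ireland", "independence", "dublin", "cork"},
    },
]


def link_entity(name, document, knowledge_base=KNOWLEDGE_BASE):
    """Return the ID of the same-named entry whose context best matches the document.

    Returns None when no candidate shares any context term with the document.
    """
    text = document.lower()
    best_id, best_score = None, 0
    for entry in knowledge_base:
        if entry["name"].lower() != name.lower():
            continue
        # Count how many of the entry's context terms appear as whole words.
        score = sum(
            1 for term in entry["context"]
            if re.search(r"\b" + re.escape(term) + r"\b", text)
        )
        if score > best_score:
            best_id, best_score = entry["id"], score
    return best_id


article = ("Michael Collins, born in Rome, graduated from the "
           "United States Military Academy.")
linked = link_entity("Michael Collins", article)
```

With the sample article, `linked` is `"kb-astronaut"`, because two of that entry's context terms appear in the text and none of the Irish leader's do. Real linkers use richer signals than exact term matches, but the idea of scoring each candidate against the surrounding document carries over.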
Entity linking preserves institutional knowledge, increases analyst efficiency
Wikidata works well as a catch-all of people and organizations that might show up in the news, but the real value of Rosette is in extracting names of entities from data streams — such as message traffic, case files, finished products, and news articles — and linking them to an agency’s own knowledge base, like that of officer Abel and analyst Bert in our example.
Suppose that officer Abel is a rookie. For six months he trained with retiring 30-year-veteran Candace, who has been diligently updating the knowledge base with what she knows about people and organizations of interest. Abel and his colleagues can draw upon that knowledge and accelerate their work.
How can named entity recognition speed up investigations? Finding the connection between different people (Grimp Arana and Arachnid) and organizations that share similar context is just one way. Another is pushing relevant information to officers and analysts as they work.
Imagine that analyst Bert is reviewing incoming message traffic. What if every document he looked at highlighted which entities were in the agency knowledge base? What if hovering over a name popped up that entity’s card from the knowledge base?
This information push significantly boosts Bert’s productivity because:
- It saves Bert the time of stopping to search the knowledge base for every entity mentioned.
- Bert avoids the context switch that may break his train of thought.
- Bert will quickly know whether the entity is a high-priority, known entity of interest.
Furthermore, knowing that an entity is of interest would immediately make anyone associated with it an interesting avenue to pursue.
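To make the information push concrete, here is a minimal sketch of how a document viewer might mark known entities and collect their knowledge base cards. It assumes exact string matching against a small in-memory dictionary. A real deployment would rely on an entity extraction and linking engine instead, and the `KNOWN_ENTITIES` records, the `annotate` function, and the `[[name|priority]]` markup are all hypothetical.

```python
import re

# Hypothetical agency knowledge base, keyed by entity name.
KNOWN_ENTITIES = {
    "Grimp Arana": {"priority": "high", "note": "suspected drug dealer"},
    "Arachnid": {"priority": "high",
                 "note": "suspected foreign intelligence officer"},
}


def annotate(text, known=KNOWN_ENTITIES):
    """Wrap each known name as [[name|priority]] and return the matching cards."""
    if not known:
        return text, []
    # Longest names first, so a longer name wins over a shorter one it contains.
    names = sorted(known, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b"
    )
    found = []

    def mark(match):
        name = match.group(1)
        if name not in found:
            found.append(name)
        return f"[[{name}|{known[name]['priority']}]]"

    marked = pattern.sub(mark, text)
    # One card per distinct entity, in order of first appearance.
    cards = [{"name": n, **known[n]} for n in found]
    return marked, cards


message = "Source reports Arachnid was seen near the water department."
marked, cards = annotate(message)
```

A user interface could render the marked spans as highlights and show the corresponding card on hover, so the analyst never has to leave the document to check whether a name is already known.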
Last words on entity recognition
By leveraging the tireless power of named entity recognition and linking to user-specified knowledge bases, analysts and investigators can spend less time sifting through data and more time collecting information and pursuing new avenues of investigation. The AI-assisted realization that two investigations are connected by a common entity – seen through different investigative views – is incredibly powerful. The amount of data generated by investigations, social media, and the 24/7 news cycle is too much for a human mind to grasp. AI doesn’t yet have the instinct of an experienced investigator to decide which trail will be most fruitful to follow, but it can show humans the map.