By Nikki Medinger

Entity extraction helps you answer who it is quickly, accurately and concisely

Rosette Entity Extractor is a component within Rosette by Babel Street that automatically identifies and extracts entities from multilingual text. It supports a wide range of entity types (19+), including people, organizations, locations, dates, and nationalities, as well as custom entity types.

Entity Extractor uses machine learning algorithms, entity lists (gazetteers), and regular expressions (pattern matching) to analyze text data and extract entity information. It identifies entities in text and categorizes them by type, delivering concise, accurate results and filtering out irrelevant information.
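To make the gazetteer and pattern-matching ideas concrete, here is a minimal Python sketch. It is not Rosette's implementation or API: the gazetteer entries, the date pattern, the sample sentence, and the function name extract_entities are all made up for illustration, and the machine learning component is left out entirely.

```python
import re

# Hypothetical gazetteer: surface form -> entity type (illustrative entries only).
GAZETTEER = {
    "Acme Corp": "ORGANIZATION",
    "Springfield": "LOCATION",
}

# Hypothetical pattern for ISO-style dates such as 2023-05-17.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_entities(text):
    """Return (start, end, mention, type) tuples sorted by position in the text."""
    found = []
    # Gazetteer lookup: match each known name as a whole phrase.
    for name, entity_type in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((match.start(), match.end(), match.group(), entity_type))
    # Pattern matching: anything shaped like a date is tagged DATE.
    for match in DATE_PATTERN.finditer(text):
        found.append((match.start(), match.end(), match.group(), "DATE"))
    return sorted(found)


# Made-up sample sentence.
sample = "Acme Corp opened an office in Springfield on 2023-05-17."
entities = extract_entities(sample)
for start, end, mention, entity_type in entities:
    print(f"{entity_type:<12} {mention} [{start}:{end}]")
```

Lists and patterns like these are precise but only find what they were told to look for, which is why a production extractor pairs them with statistical models that generalize to unseen names.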

Rosette can then link each entity and recognize “who” it is by disambiguating entities that share a name, using its knowledge base or your own data.

Entity extraction helps you answer “who it is” quickly and accurately. Organizations in the government and private sectors use Rosette Entity Extractor for military intelligence, social media monitoring, adverse media screening, metadata extraction, customer service, and market research.

Event extraction helps you answer where and when, quickly, accurately, and concisely

Rosette Event Extraction is an add-on component within the Rosette platform that lets end users annotate data to create a model that identifies and extracts specific event information from text. It uses machine learning algorithms and natural language processing (NLP) techniques to analyze text data and extract event information. It can identify events in text and categorize them based on their type.

Rosette Event Extraction supports a wide range of event types, including news, business, sports, military, and political events, among others. For example, it helps governments with all-source intelligence and alerting. Keyword search sends too many irrelevant alerts and misses too many important events. Event extraction understands context and extracts the key who/what/when/where information that analysts need to see. When monitoring for unrest in a region of the world, event extraction can find it whether it is called a shooting, attack, bombing, violent incident, or assassination.

Event extraction is extremely specific to a domain, event type, and use case. It has become feasible for production use because its model-building process can produce a viable model with very little training data.

Ready to use: pre-trained with knowledge bases and trainable with your data

Both Rosette Entity Extractor and Rosette Event Extraction come ready to use with entity linking knowledge bases built on Wikidata, DBpedia, and PermID. Rosette uses entity linking to distinguish between similarly named entities by looking at the context of the entity in the target document compared to the entity in the knowledge base.
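The idea of comparing a mention's context with knowledge base entries can be sketched in a few lines of Python. This is an illustrative toy, not how Rosette performs linking: the two-entry knowledge base, the kb-001 style identifiers, and the use of Jaccard word overlap as the similarity measure are all assumptions chosen to keep the example small.

```python
import re

# Hypothetical knowledge base with two entries that share a name.
KNOWLEDGE_BASE = [
    {
        "id": "kb-001",
        "name": "Mercury",
        "description": "smallest planet in the solar system, orbiting closest to the sun",
    },
    {
        "id": "kb-002",
        "name": "Mercury",
        "description": "chemical element, a heavy silvery metal that is liquid at room temperature",
    },
]


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_entity(mention, context):
    """Return the same-named entry whose description best overlaps the context."""
    candidates = [e for e in KNOWLEDGE_BASE if e["name"].lower() == mention.lower()]
    if not candidates:
        return None
    context_tokens = tokens(context)
    return max(candidates, key=lambda e: jaccard(context_tokens, tokens(e["description"])))


doc = "The probe will orbit Mercury, the planet closest to the sun."
best = link_entity("Mercury", doc)
print(best["id"])
```

Word overlap is a deliberately crude stand-in for the contextual comparison described above, but it shows the shape of the task: gather candidates that share the name, score each against the surrounding text, and keep the best match.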

Beyond the pre-trained models, you can train both extractors and customize each to link to your own database.

Rosette Model Training Suite is an add-on component within the Rosette platform that enables you to train custom models to suit your specific needs. With Rosette Model Training Suite, you can annotate your data, train models using that annotated data, and evaluate the performance of the models you have trained. It provides a user-friendly interface and supports multiple languages. This custom training improves accuracy and performance and gives you the ability to train models for specific use cases and domains.

Find out who it is, what they are saying, and understand where and when it is taking place

With Rosette Entity Extractor and Rosette Event Extraction, government and private sectors can find out who it is, what they are saying, and understand where and when it’s taking place. This is especially helpful in:

News analysis: Event training can be used to analyze news articles and identify key events and their attributes, such as the actors involved, the location, and the time of the event. This can be useful for tracking important events and trends, such as political developments, natural disasters, or financial market movements.
Social media monitoring: Event training can also be used to analyze social media posts and identify events of interest, such as product launches or public events. This can be useful for monitoring brand reputation, tracking public sentiment, or identifying emerging trends.
Business intelligence: Event training can be used to extract information from business documents, such as earnings reports, product reviews, or customer feedback. This can be useful for identifying business opportunities, tracking market trends, or monitoring competitor activity.
Legal document analysis: Event training can also be used in legal document analysis to identify key events and their attributes, such as the parties involved, the nature of the dispute, and the outcome of the case. This can be useful for legal research, litigation support, or regulatory compliance.

What’s new in this release? Team collaboration and simple adjudication

Model Training Suite 1.06 adds multi-user annotation and simple adjudication for event models.

“This is a logical next step as we enable our teams to further collaborate on model training. Now multiple users can annotate, and those annotations can be expertly evaluated/adjudicated so the most correct event mention can be determined. Multi-user annotation, combined with adjudication, provides a way for all ideas to be brought forward, creating a more collaborative approach. The end result is a more accurate and customized training process leading to improved event extraction,” said Vivian Shih, Sr. Product Manager, Content Intelligence, Rosette.

What follows is a brief summary of the new capabilities:

Multi-user annotation: This capability enables multiple annotators to label (annotate) training data, providing diverse perspectives and reducing potential bias in the training process. Multi-user annotation allows for a more comprehensive and robust training dataset, leading to improved model performance.
Simple adjudication: Adjudication is the process of resolving conflicts or discrepancies in the annotations made by multiple annotators. The adjudication capability helps reconcile differences in annotations, ensuring that the data used to train the model is accurate and reliable.

Together, these capabilities enhance the accuracy, reliability, and customization of event extraction, making it a powerful tool for identifying and analyzing events in text data.
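As a rough illustration of what reconciling multiple annotators involves, the Python sketch below takes a majority vote per event mention and flags ties for a human adjudicator. It does not describe how Model Training Suite adjudicates: the annotator names, event labels, data layout, and the adjudicate function are all hypothetical.

```python
from collections import Counter


def adjudicate(labels_by_annotator):
    """Majority vote over one mention's labels.

    Returns (label, needs_review). When the top labels tie, no label is
    chosen and the mention is flagged for a human adjudicator.
    """
    if not labels_by_annotator:
        raise ValueError("at least one annotation is required")
    ranked = Counter(labels_by_annotator.values()).most_common()
    top_label, top_count = ranked[0]
    tie = len(ranked) > 1 and ranked[1][1] == top_count
    if tie:
        return None, True
    return top_label, False


# Hypothetical annotations: mention id -> {annotator: event label}.
annotations = {
    "mention-1": {"ann_a": "ATTACK", "ann_b": "ATTACK", "ann_c": "PROTEST"},
    "mention-2": {"ann_a": "ATTACK", "ann_b": "PROTEST"},
}

results = {mention: adjudicate(labels) for mention, labels in annotations.items()}
for mention, (label, needs_review) in results.items():
    status = "needs review" if needs_review else label
    print(f"{mention}: {status}")
```

A vote like this only settles the easy cases. The mentions it flags are the ones where an expert's judgment matters most.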

Rosette Server 1.25.1 expands languages, improves speed and accuracy

Below are a few additional highlights from the Rosette Server release notes.

Rosette Name Indexer/Rosette Name Translator (RNI/RNT) updates increase language options and accuracy, plus our studio tool is easier to use.

New! Turkish support added
Improved person name matching to detect given names and surnames in Latin script when the name is of English origin, and added a new match phenomenon that increases accuracy scores in Arabic
Improved user interface for the studio tool: connectivity, security, and ease of use

Rosette Base Linguistics (RBL) updates increase language response speed.