How DeepDive uses Entity Resolution to eliminate false positives

After DeepDive's NLP digests and structures information from hundreds of sources, a critical challenge remains: ensuring all this data actually pertains to the correct person. This is where our sophisticated entity resolution systems come into play—filtering out false positives and building an accurate picture of your investigation subject.

Compliance analysts and MLROs know all too well the task of disambiguating entities that share a name, particularly when PEP and sanctions screening yields false positives. DeepDive eliminates the painstaking swivel-chair, multi-screen work of cross-referencing between multiple sources.

The challenges of manual entity resolution

Here's what makes manual disambiguation so challenging for compliance teams and investigators:

Name variation and commonality: Many names are shared by hundreds or thousands of individuals worldwide.
Limited context: Individual sources often provide insufficient details to disambiguate with certainty.
Cross-language confusion: Name variations across different languages and alphabets complicate entity identification.
Shifting identifiers: People change roles, locations, and affiliations over time.
Resource constraints: Thorough verification across multiple sources is time-intensive.

These challenges often result in either false positives (including information about the wrong person) or excessive caution (excluding potentially relevant information due to uncertainty).

DeepDive's intelligent entity resolution system addresses these challenges through five key capabilities:

1. Graph-based clustering. DeepDive's proprietary entity resolution performs network link analysis to group related mentions:

Pattern recognition: Identifies consistent patterns that indicate the same individual across sources
Attribute matching: Compares names alongside locations, dates, affiliations, and other identifiers
Network analysis: Maps relationships between mentions to determine likely matches
Outlier detection: Flags mentions that significantly deviate from established patterns

This sophisticated approach goes far beyond simple name matching, using the rich context established by our NLP system to make intelligent connections.
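To make the idea concrete, here is a minimal sketch of graph-style clustering in Python. It is an illustration only, not DeepDive's proprietary implementation: the mention records, the `cluster_mentions` function, and the rule of linking two mentions when they share at least one context attribute are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical mentions: a name plus whatever context each source provided.
mentions = [
    {"id": "m1", "name": "John Smith", "city": "London", "employer": "Acme Ltd"},
    {"id": "m2", "name": "John Smith", "city": "London", "employer": "Acme Ltd"},
    {"id": "m3", "name": "John Smith", "city": "Sydney", "employer": "Harbour Bank"},
    {"id": "m4", "name": "J. Smith", "city": None, "employer": "Acme Ltd"},
]


def cluster_mentions(mentions, keys=("city", "employer"), min_shared=1):
    """Group mentions into connected components of a 'shares context' graph."""
    # Union-find: every mention starts as the root of its own cluster.
    parent = {m["id"]: m["id"] for m in mentions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    # Add an edge (merge clusters) when two mentions share enough known attributes.
    for a, b in combinations(mentions, 2):
        shared = sum(
            1 for k in keys if a.get(k) is not None and a.get(k) == b.get(k)
        )
        if shared >= min_shared:
            parent[find(a["id"])] = find(b["id"])

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m["id"]), []).append(m["id"])
    return sorted(sorted(ids) for ids in clusters.values())


clusters = cluster_mentions(mentions)
```

In this toy data the three Acme-linked mentions fall into one cluster, while the Sydney mention sits in a cluster of its own, which is the kind of outlier an analyst would want flagged.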

2. Multi-factor entity comparison. DeepDive uses multiple attributes to compare and match entities across different sources:

Biographical details: Birth dates, education history, and career milestones.
Geographical connections: Residential locations, business addresses, and travel patterns.
Organisational affiliations: Company roles, institutional connections, and professional memberships.
Relationship networks: Family members, business associates, and other consistent connections.
Temporal consistency: Chronological alignment of life events and activities.

By requiring multiple matching factors, the system dramatically reduces false positives while maintaining the widest possible pool of sources.
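A simplified sketch of multi-factor matching might look like the following. The `Profile` fields, the choice of three factors, and the rule of "at least two agreements and no conflicts" are hypothetical, chosen only to show why a name match alone is not enough.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    name: str
    birth_year: Optional[int] = None
    country: Optional[str] = None
    employer: Optional[str] = None


# Illustrative factors only; a real system would compare many more.
FACTORS = ("birth_year", "country", "employer")


def compare(a, b):
    """Return (agreements, conflicts) over factors that both profiles report."""
    agreements, conflicts = [], []
    for factor in FACTORS:
        va, vb = getattr(a, factor), getattr(b, factor)
        if va is None or vb is None:
            continue  # missing data is neither evidence for nor against
        (agreements if va == vb else conflicts).append(factor)
    return agreements, conflicts


def is_same_entity(a, b, min_agreements=2):
    """Require a name match plus several agreeing factors and no conflicts."""
    if a.name.casefold() != b.name.casefold():
        return False
    agreements, conflicts = compare(a, b)
    return len(agreements) >= min_agreements and not conflicts


subject = Profile("Maria Garcia", 1975, "ES", "Banco Example")
same_person = Profile("maria garcia", 1975, "ES")
wrong_birth_year = Profile("Maria Garcia", 1982, "ES", "Banco Example")
too_little_context = Profile("Maria Garcia", country="ES")
```

Only `same_person` passes `is_same_entity(subject, ...)`: the second candidate conflicts on birth year, and the third offers a single agreeing factor, which is below the threshold.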

3. Adversarial AI verification. Even sophisticated algorithms make mistakes, which is why DeepDive employs a multi-layered verification approach:

Semantic analysis: Large Language Models evaluate whether content conceptually relates to the search subject
Coherence assessment: The system verifies that the assembled profile presents a logically consistent picture
Dual-system validation: Independent AI systems cross-check each other's entity resolution decisions
Edge case detection: Special attention is given to borderline cases that might confuse standard algorithms

This verification layer acts as a crucial quality control mechanism, catching potential errors before they impact the investigation.
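The cross-checking pattern can be sketched as below. The two verifiers here are deliberately trivial string checks standing in for independent AI systems; they, the `cross_check` function, and the three outcome labels are assumptions for illustration, not a description of the models DeepDive runs.

```python
from typing import Callable, Dict

# A verifier takes the subject's known details and a passage of text,
# and returns True if it believes the passage is about the subject.
Verifier = Callable[[Dict[str, str], str], bool]


def employer_check(subject, passage):
    # Hypothetical stand-in for system A: looks for the subject's employer.
    return subject["employer"].casefold() in passage.casefold()


def city_check(subject, passage):
    # Hypothetical stand-in for system B: looks for the subject's city.
    return subject["city"].casefold() in passage.casefold()


def cross_check(subject, passage, verifier_a: Verifier, verifier_b: Verifier) -> str:
    """Accept or reject only when both verifiers agree; otherwise escalate."""
    a = verifier_a(subject, passage)
    b = verifier_b(subject, passage)
    if a and b:
        return "accept"
    if not a and not b:
        return "reject"
    return "review"  # disagreement marks a borderline case


subject = {"name": "John Smith", "employer": "Acme Ltd", "city": "London"}
```

The useful property is that a single system's mistake does not silently pass through: any disagreement between the two surfaces as a borderline case rather than a confident answer.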

4. Confidence scoring. DeepDive assigns confidence levels to each entity resolution decision:

Match strength: Quantifies the number and quality of matching attributes
Source reliability: Factors in the credibility of sources providing identifying information
Corroboration level: Weighs how many independent sources support the same conclusion
Distinguishing factors: Evaluates the presence of unique identifiers that differentiate similar individuals

These confidence scores provide transparency to analysts, allowing them to focus on high-confidence information.
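One simple way to combine the four signals is a weighted average, sketched below. The weights and the assumption that each signal has already been normalised to a value between 0 and 1 are hypothetical; they are not DeepDive's actual scoring formula.

```python
# Hypothetical weights for the four signals; they sum to 1.
WEIGHTS = {
    "match_strength": 0.4,
    "source_reliability": 0.2,
    "corroboration": 0.2,
    "distinguishing": 0.2,
}


def confidence_score(signals):
    """Combine four signals, each in [0, 1], into one weighted score in [0, 1]."""
    score = 0.0
    for name, weight in WEIGHTS.items():
        value = signals[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
        score += weight * value
    return score


# Many matching attributes, reliable and well-corroborated sources.
strong = {
    "match_strength": 0.9,
    "source_reliability": 0.8,
    "corroboration": 1.0,
    "distinguishing": 0.5,
}

# A name-only match from a single middling source.
weak = {
    "match_strength": 0.3,
    "source_reliability": 0.5,
    "corroboration": 0.0,
    "distinguishing": 0.0,
}
```

With these inputs, `confidence_score(strong)` comes out at roughly 0.82 and `confidence_score(weak)` at roughly 0.22, so the analyst can see at a glance which resolution decision rests on firmer ground.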

5. False positive removal. The final step is decisive filtering to ensure only relevant information remains:

Strict exclusion: Content not firmly linked to the correct individual is removed from the analysis
Cluster separation: Clear boundaries are established between the subject and similar individuals
Manual review options: Borderline cases can be flagged for human review when appropriate
Continuous learning: The system refines its approach based on feedback and verified outcomes

This disciplined approach ensures the resulting Body of Knowledge focuses exclusively on the correct individual, eliminating the noise and confusion of false matches.
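As a rough sketch of this filtering step, scored items can be split into three buckets: kept, sent for manual review, and excluded. The `partition` function and both threshold values are assumptions chosen for the example.

```python
def partition(scored_items, keep_at=0.8, review_at=0.5):
    """Split (item_id, confidence) pairs into kept, review, and excluded lists."""
    kept, review, excluded = [], [], []
    for item_id, score in scored_items:
        if score >= keep_at:
            kept.append(item_id)      # firmly linked to the subject
        elif score >= review_at:
            review.append(item_id)    # borderline: flag for a human
        else:
            excluded.append(item_id)  # strict exclusion
    return kept, review, excluded


# Hypothetical scored content items.
scored = [("article-1", 0.91), ("filing-2", 0.62), ("post-3", 0.20)]
kept, review, excluded = partition(scored)
```

Only the items in `kept` would feed the next stage; raising `keep_at` makes exclusion stricter at the cost of sending more items to review.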

Beyond entity resolution...

DeepDive's entity resolution creates a filtered, verified foundation for the next stages:

Body of Knowledge creation through LLM-powered statement extraction
Confidence scoring of extracted statements against source reliability
Report generation with full source citations and structured sections
Interactive chatbot interrogation of the knowledge base

By transforming one of the most challenging aspects of investigations into a reliable, systematic process, DeepDive enables compliance teams and investigators to proceed with confidence that they're analysing the right person.

The result? Investigations that avoid costly mistakes stemming from identity confusion, deliver more accurate risk assessments, and save countless hours previously spent on discounting false positives.

Want to read more? The Body of Knowledge: From resolved entity to insight
