How to score and prioritize leads with AI in Python

This notebook shows how to use everyrow's rank() utility to score B2B leads by how likely they are to have data fragmentation problems.

Use Case: A data integration SaaS company wants to prioritize leads. Companies operating across multiple locations, entities, or point solutions are more likely to need data integration tools.

Why everyrow? Traditional enrichment tools provide data fields but can't interpret them. Manual review of 1,000 leads is prohibitively slow. everyrow's rank() analyzes each company's operational complexity semantically.

In [1]:
import asyncio
from dotenv import load_dotenv
# Load environment variables from a local .env file before importing everyrow
load_dotenv()

import pandas as pd
from everyrow import create_session
from everyrow.ops import rank

Load Company Data

In [2]:
companies_df = pd.read_csv("../data/b2b_companies.csv")

print(f"Loaded {len(companies_df)} companies")
companies_df.head(10)
Loaded 20 companies
Out[2]:
    company_name                     industry       employees  description
0   Midwest Healthcare Network       Healthcare         12000  Regional hospital system with 15 facilities ac...
1   TechFlow Solutions               Software              85  B2B SaaS startup. Single product, cloud-native...
2   Continental Manufacturing Group  Manufacturing       8500  Industrial equipment manufacturer with 22 plan...
3   QuickServe Restaurants           Food Service       45000  Fast food franchise with 2,000+ locations. Eac...
4   DataPure Analytics               Software             120  Analytics platform company. Unified tech stack...
5   First National Bancorp           Banking             6000  Regional bank formed from 5 acquisitions. Stil...
6   GreenEnergy Utilities            Utilities           3500  Power company serving 3 states. SCADA systems,...
7   SimpleRetail Co                  Retail               200  DTC e-commerce brand. Shopify store, all opera...
8   Global Logistics Partners        Logistics          15000  Freight forwarding across 40 countries. Differ...
9   Boutique Law LLP                 Legal                 50  Small law firm. Single office, uses Clio for e...

Define Ranking Task

In [3]:
RANKING_TASK = """
Score each company from 0-100 on their likelihood of suffering from DATA FRAGMENTATION challenges.

Data fragmentation risk is HIGH (70-100) when a company has:
- Multiple locations, facilities, or entities
- M&A history (acquired companies often mean duplicate systems)
- Multiple disconnected software systems mentioned
- Operations across different regions or countries
- Franchise or distributed business models
- Legacy systems mixed with modern ones

Data fragmentation risk is LOW (0-30) when a company has:
- Single location or unified operations
- Modern, cloud-native, integrated tech stack
- Small team with simple operations
- Explicitly mentions unified or integrated systems

MEDIUM (30-70) for companies with some complexity but not severe fragmentation.

Focus on operational complexity and system diversity, not just company size.
"""
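
The task above lists 30 and 70 in two bands each, so any downstream code that turns a numeric score back into a band has to pick a convention. A minimal sketch, assuming 30 counts as LOW and 70 counts as HIGH; `fragmentation_band` is a hypothetical helper, not part of everyrow, and the example scores are illustrative:

```python
def fragmentation_band(score: float) -> str:
    """Map a 0-100 score to the band names used in the ranking task.

    Assumption: boundary scores resolve as 30 -> LOW and 70 -> HIGH,
    because the task text lists those two values in overlapping bands.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 70:
        return "HIGH"
    if score <= 30:
        return "LOW"
    return "MEDIUM"


# Illustrative scores only; real values come from the rank() result.
example_scores = {"Company A": 95, "Company B": 50, "Company C": 12}
bands = {name: fragmentation_band(s) for name, s in example_scores.items()}
print(bands)
```

Keeping the boundary rule in one function means the convention can be changed in a single place if the task wording is later tightened.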

Run the Ranking

In [4]:
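async def run_ranking():
    # Open a named session; its URL links to this run on everyrow.io
    async with create_session(name="Data Fragmentation Lead Scoring") as session:
        print(f"Session URL: {session.get_url()}")

        # Score every row against RANKING_TASK; the score lands in the "score" column
        result = await rank(
            session=session,
            task=RANKING_TASK,
            input=companies_df,
            field_name="score",
        )

        return result.data

# Top-level await works in Jupyter; in a plain script, use asyncio.run(run_ranking()) instead
results_df = await run_ranking()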
Session URL: https://everyrow.io/sessions/d1dc8ed0-70d2-4377-9b1c-81f0aba6abd3

Analyze Results

In [5]:
# Sort by score descending
results_df = results_df.sort_values("score", ascending=False)

print(f"\n{'='*60}")
print("TOP 10 DATA FRAGMENTATION RISK (Best Leads)")
print(f"{'='*60}\n")

for i, (_, row) in enumerate(results_df.head(10).iterrows(), 1):
    print(f"{i:2}. {row['company_name'][:35]:35} | Score: {row['score']:3} | {row['industry']}")
    print(f"    {row['description'][:70]}...")
    print()
============================================================
TOP 10 DATA FRAGMENTATION RISK (Best Leads)
============================================================

 1. Global Logistics Partners           | Score:  95 | Logistics
    Freight forwarding across 40 countries. Different TMS in each region, ...

 2. QuickServe Restaurants              | Score:  95 | Food Service
    Fast food franchise with 2,000+ locations. Each franchise uses differe...

 3. TransGlobal Shipping                | Score:  92 | Logistics
    Container shipping line. Vessel systems, port operations, and customer...

 4. Heritage Hotels International       | Score:  92 | Hospitality
    Hotel chain with 150 properties. Mix of Opera, Cloudbeds, and independ...

 5. Midwest Healthcare Network          | Score:  92 | Healthcare
    Regional hospital system with 15 facilities across 4 states. Uses Epic...

 6. GreenEnergy Utilities               | Score:  90 | Utilities
    Power company serving 3 states. SCADA systems, customer billing, and f...

 7. Regional Auto Dealers               | Score:  90 | Automotive
    Auto dealer group with 25 dealerships. Each uses different DMS systems...

 8. United School Districts             | Score:  90 | Education
    Consortium of 8 school districts sharing services. Each district has i...

 9. MultiState Insurance Group          | Score:  90 | Insurance
    Property & casualty insurer in 12 states. Each state has different reg...

10. Continental Manufacturing Group     | Score:  90 | Manufacturing
    Industrial equipment manufacturer with 22 plants globally. Mix of SAP,...

In [6]:
print(f"\n{'='*60}")
print("BOTTOM 5 (Lowest Priority)")
print(f"{'='*60}\n")

for _, row in results_df.tail(5).iterrows():
    print(f"  {row['company_name'][:35]:35} | Score: {row['score']:3} | {row['industry']}")
============================================================
BOTTOM 5 (Lowest Priority)
============================================================

  DataPure Analytics                  | Score:  12 | Software
  TechFlow Solutions                  | Score:  10 | Software
  Boutique Law LLP                    | Score:  10 | Legal
  SimpleRetail Co                     | Score:  10 | Retail
  CloudFirst Startup                  | Score:   5 | Software
In [7]:
# Score distribution by industry
print("\nAVERAGE SCORE BY INDUSTRY:")
print(results_df.groupby("industry")["score"].mean().sort_values(ascending=False).to_string())
AVERAGE SCORE BY INDUSTRY:
industry
Food Service     95.0
Logistics        93.5
Hospitality      92.0
Automotive       90.0
Education        90.0
Healthcare       90.0
Insurance        90.0
Utilities        90.0
Banking          88.0
Manufacturing    52.5
Biotech          15.0
Software         10.5
Legal            10.0
Retail           10.0
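
Several industries in this dataset contain a single company, so a mean alone can hide how little evidence is behind it. As a sketch, the same groupby can report the count and range next to the mean. The sample frame below is a small hand-copied subset of the scores above, not the full `results_df`:

```python
import pandas as pd

# A few (industry, score) pairs copied from the results in this notebook.
sample = pd.DataFrame(
    {
        "industry": ["Logistics", "Logistics", "Manufacturing", "Manufacturing", "Legal"],
        "score": [95, 92, 90, 15, 10],
    }
)

# Report mean, range, and group size together so outliers stay visible.
summary = (
    sample.groupby("industry")["score"]
    .agg(["mean", "min", "max", "count"])
    .sort_values("mean", ascending=False)
)
print(summary.to_string())
```

Manufacturing is the case this helps with: its mean of 52.5 comes from one score of 90 and one of 15, which the min and max columns make obvious.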
In [8]:
# Full results table
results_df[["company_name", "industry", "employees", "score", "research"]]
Out[8]:
    company_name                     industry       employees  score  research
19  Global Logistics Partners        Logistics          15000     95  {'score': 'The score is based on the company d...
18  QuickServe Restaurants           Food Service       45000     95  {'score': 'The score is based on the company's...
17  TransGlobal Shipping             Logistics          22000     92  {'score': 'The score is based on the company's...
16  Heritage Hotels International    Hospitality         9000     92  {'score': 'Based on the provided description, ...
15  Midwest Healthcare Network       Healthcare         12000     92  {'score': 'The score is based on the company d...
11  GreenEnergy Utilities            Utilities           3500     90  {'score': 'The score is based on the provided ...
14  Regional Auto Dealers            Automotive          1200     90  {'score': 'The company has 25 dealerships (mul...
13  United School Districts          Education           7500     90  {'score': 'The company is a consortium of 8 se...
12  MultiState Insurance Group       Insurance           4200     90  {'score': 'The company operates across 12 stat...
10  Continental Manufacturing Group  Manufacturing       8500     90  {'score': 'The score is based on the provided ...
9   CityMed Physicians Group         Healthcare           800     88  {'score': 'The score is based on the provided ...
8   First National Bancorp           Banking             6000     88  {'score': 'The company is a regional bank form...
7   Unified Software Corp            Software             350     15  {'score': 'The company is described as having ...
6   NanoTech Labs                    Biotech               95     15  {'score': 'The score is based on the provided ...
5   Precision Machining Inc          Manufacturing        180     15  {'score': 'The score is based on the company d...
4   DataPure Analytics               Software             120     12  {'score': 'The score is based on the company d...
1   TechFlow Solutions               Software              85     10  {'score': 'The company is a small SaaS startup...
3   Boutique Law LLP                 Legal                 50     10  {'score': 'The company is a small firm with 50...
2   SimpleRetail Co                  Retail               200     10  {'score': 'The company is a small-scale DTC br...
0   CloudFirst Startup               Software              25      5  {'score': 'As a seed-stage startup with only 2...
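
The `research` column appears to hold one dict per row, keyed by the field name (`score`), with the model's explanation as the value. A minimal sketch for pulling that explanation into its own column, assuming each value is either a dict or the string representation of one; `extract_rationale` is a hypothetical helper and the sample rows are invented for illustration:

```python
import ast

import pandas as pd


def extract_rationale(research, field="score"):
    """Return the explanation stored under `field`, or None if unavailable."""
    if isinstance(research, str):
        # Assumption: a string value is a Python-literal dict, e.g. after a CSV round trip.
        try:
            research = ast.literal_eval(research)
        except (ValueError, SyntaxError):
            return None
    if isinstance(research, dict):
        return research.get(field)
    return None


# Invented rows covering the three cases: dict, stringified dict, missing.
sample = pd.DataFrame(
    {
        "company_name": ["Example Co A", "Example Co B", "Example Co C"],
        "research": [
            {"score": "Hypothetical rationale: many sites, many systems."},
            "{'score': 'Hypothetical rationale: single office.'}",
            None,
        ],
    }
)
sample["rationale"] = sample["research"].apply(extract_rationale)
print(sample[["company_name", "rationale"]].to_string())
```

With the rationale in a plain text column, it can be reviewed next to the score or exported alongside the ranked leads.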