How to score and prioritize leads with AI in Python¶
This notebook demonstrates how to use everyrow's rank() utility to score B2B leads by their likelihood of suffering from data fragmentation challenges.
Use Case: A data integration SaaS company wants to prioritize leads. Companies operating across multiple locations, entities, or point solutions are more likely to need data integration tools.
Why everyrow? Traditional enrichment tools provide data fields but can't interpret them. Manual review of 1,000 leads is prohibitively slow. everyrow's rank() analyzes each company's operational complexity semantically.
In [1]:
import asyncio
from dotenv import load_dotenv
load_dotenv()
import pandas as pd
from everyrow import create_session
from everyrow.ops import rank
Load Company Data¶
In [2]:
companies_df = pd.read_csv("../data/b2b_companies.csv")
print(f"Loaded {len(companies_df)} companies")
companies_df.head(10)
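If the sample CSV isn't available locally, a minimal stand-in DataFrame keeps the rest of the notebook runnable. This is a sketch: the column names (`company_name`, `industry`, `employees`, `description`) are assumed from the fields used later in the notebook, and the rows are invented examples.

```python
import pandas as pd

# Hypothetical fallback rows mirroring the assumed CSV schema.
sample_rows = [
    {"company_name": "Acme Logistics", "industry": "Logistics",
     "employees": 850,
     "description": "Operates 40 warehouses across three countries after two acquisitions."},
    {"company_name": "Nimbus Labs", "industry": "SaaS",
     "employees": 25,
     "description": "Single-office startup on a unified cloud-native stack."},
]
companies_df = pd.DataFrame(sample_rows)
print(f"Loaded {len(companies_df)} companies")
```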
Out[2]:
Define Ranking Task¶
In [3]:
RANKING_TASK = """
Score each company from 0-100 on their likelihood of suffering from DATA FRAGMENTATION challenges.
Data fragmentation risk is HIGH (70-100) when a company has:
- Multiple locations, facilities, or entities
- M&A history (acquired companies often mean duplicate systems)
- Multiple disconnected software systems mentioned
- Operations across different regions or countries
- Franchise or distributed business models
- Legacy systems mixed with modern ones
Data fragmentation risk is LOW (0-30) when a company has:
- Single location or unified operations
- Modern, cloud-native, integrated tech stack
- Small team with simple operations
- Explicitly mentions unified or integrated systems
MEDIUM (30-70) for companies with some complexity but not severe fragmentation.
Focus on operational complexity and system diversity, not just company size.
"""
Run the Ranking¶
In [4]:
async def run_ranking():
    async with create_session(name="Data Fragmentation Lead Scoring") as session:
        print(f"Session URL: {session.get_url()}")
        result = await rank(
            session=session,
            task=RANKING_TASK,
            input=companies_df,
            field_name="score",
        )
        return result.data
results_df = await run_ranking()
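Model-produced scores can occasionally come back as strings or land outside the 0-100 range, so a defensive coercion step before sorting is cheap insurance. A sketch, assuming only that the result has a `score` column:

```python
import pandas as pd

df = pd.DataFrame({"score": ["85", 120, None, 42.5]})  # illustrative messy values
# Coerce to numeric, drop unscored rows, and clip to the 0-100 range.
df["score"] = pd.to_numeric(df["score"], errors="coerce")
df = df.dropna(subset=["score"])
df["score"] = df["score"].clip(0, 100)
print(df["score"].tolist())  # [85.0, 100.0, 42.5]
```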
Analyze Results¶
In [5]:
# Sort by score descending
results_df = results_df.sort_values("score", ascending=False)
print(f"\n{'='*60}")
print("TOP 10 DATA FRAGMENTATION RISK (Best Leads)")
print(f"{'='*60}\n")
for i, (_, row) in enumerate(results_df.head(10).iterrows(), 1):
    print(f"{i:2}. {row['company_name'][:35]:35} | Score: {row['score']:3} | {row['industry']}")
    print(f"    {row['description'][:70]}...")
    print()
In [6]:
print(f"\n{'='*60}")
print("BOTTOM 5 (Lowest Priority)")
print(f"{'='*60}\n")
for _, row in results_df.tail(5).iterrows():
    print(f"  {row['company_name'][:35]:35} | Score: {row['score']:3} | {row['industry']}")
In [7]:
# Score distribution by industry
print("\nAVERAGE SCORE BY INDUSTRY:")
print(results_df.groupby("industry")["score"].mean().sort_values(ascending=False).to_string())
In [8]:
# Full results table
results_df[["company_name", "industry", "employees", "score", "research"]]
Out[8]: