Score and rank leads without a CRM in Python
This notebook demonstrates how to use everyrow's rank() utility to score investment firms by their likelihood to purchase research tools, without needing CRM data or prior interactions.
Use Case: A research tools company wants to rank investment firms by product fit. Traditional approaches either require expensive CRM integrations or burn credits on enrichment tools that provide data without interpretation.
Why everyrow? The rank() function can analyze public information (website descriptions, investment focus, team characteristics) and return both a score and the reasoning behind it.
In [1]:
import asyncio
from dotenv import load_dotenv
load_dotenv()
import pandas as pd
from everyrow import create_session
from everyrow.ops import rank
Load Investment Firm Data
In [2]:
firms_df = pd.read_csv("../data/investment_firms.csv")
print(f"Loaded {len(firms_df)} investment firms")
firms_df
Out[2]:
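The later cells read the columns `firm_name`, `strategy`, and `aum_billions`, so it can help to confirm they are present before spending a ranking run. A minimal sketch, assuming those three names are everything the notebook relies on; the helper `missing_columns` is hypothetical and not part of everyrow or pandas:

```python
# Columns the later cells of this notebook read from the input data.
REQUIRED_COLUMNS = ("firm_name", "strategy", "aum_billions")


def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent, in order."""
    present = set(columns)
    return [name for name in required if name not in present]


# Stand-in for list(firms_df.columns); the real CSV may hold more columns.
example_columns = ["firm_name", "strategy", "website"]
print(missing_columns(example_columns))  # prints ['aum_billions']
```

In the notebook itself this would be called as `missing_columns(firms_df.columns)`, and an empty list means the data is ready to rank.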
Define Ranking Task
In [3]:
RANKING_TASK = """
Score each investment firm from 0-100 on their likelihood to PURCHASE THIRD-PARTY RESEARCH TOOLS.
HIGH likelihood (70-100) firms:
- Fundamental/discretionary strategies that require company analysis
- Activist investors who need deep research for campaigns
- Smaller teams that can't build everything in-house
- Known for research-intensive processes
- Short-sellers who need forensic analysis tools
- Firms that mention reading documents, research, or analysis
LOW likelihood (0-30) firms:
- Passive index funds (no stock picking = no research needs)
- Pure quantitative/systematic funds that build in-house
- Very large funds with unlimited internal resources
- Funds that explicitly mention building everything proprietary
MEDIUM (30-70) for firms with mixed signals.
Consider: Would this firm benefit from better research tools? Do they have the budget?
Would they buy vs. build?
"""
Run the Ranking
In [4]:
async def run_ranking():
    async with create_session(name="Research Tool Adoption Scoring") as session:
        print(f"Session URL: {session.get_url()}")
        result = await rank(
            session=session,
            task=RANKING_TASK,
            input=firms_df,
            field_name="score",
        )
        return result.data

results_df = await run_ranking()
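Top-level `await` works here because Jupyter already runs an event loop. In a plain Python script the same coroutine would need to be driven by `asyncio.run`. A minimal sketch, using a hypothetical stand-in coroutine `fake_ranking` with a made-up row in place of `run_ranking()`, so that it runs without an everyrow session:

```python
import asyncio


async def fake_ranking():
    """Hypothetical stand-in for run_ranking(); returns one made-up row."""
    await asyncio.sleep(0)  # yield to the event loop, as a real call would
    return [{"firm_name": "Example Capital", "score": 82}]


def main():
    # asyncio.run creates an event loop, runs the coroutine, then closes the loop.
    return asyncio.run(fake_ranking())


if __name__ == "__main__":
    print(main())
```

In a script, replacing `fake_ranking()` with `run_ranking()` would give the same `results_df` that the notebook cell produces.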
Analyze Results
In [5]:
# Sort by score
results_df = results_df.sort_values("score", ascending=False)
print(f"\n{'='*60}")
print("RESEARCH TOOL ADOPTION LIKELIHOOD RANKING")
print(f"{'='*60}\n")
for i, (_, row) in enumerate(results_df.iterrows(), 1):
    print(f"{i:2}. {row['firm_name'][:30]:30} | Score: {row['score']:3} | {row['strategy']}")
    if 'research' in row and pd.notna(row['research']):
        print(f" Reasoning: {str(row['research'])[:70]}...")
    print()
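The loop above appends `...` to every reasoning string, including ones shorter than 70 characters. One alternative, offered as a sketch rather than the notebook's own method, is the standard library's `textwrap.shorten`, which adds the placeholder only when it actually truncates and cuts at a word boundary. The helper name `preview` is hypothetical:

```python
import textwrap


def preview(text, width=70):
    """Collapse whitespace and truncate at a word boundary if too long."""
    return textwrap.shorten(str(text), width=width, placeholder="...")


# A short string is returned unchanged; a long one is cut and marked.
print(preview("Activist fund with a small team."))
print(preview("word " * 40))
```

Note that `textwrap.shorten` also collapses runs of whitespace, so multi-line reasoning text comes back on a single line.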
In [6]:
# Segment by tier
high_priority = results_df[results_df["score"] >= 70]
medium_priority = results_df[(results_df["score"] >= 40) & (results_df["score"] < 70)]
low_priority = results_df[results_df["score"] < 40]
print("\nSEGMENTATION:")
print(f" High priority (70+): {len(high_priority)} firms")
print(f" Medium priority (40-69): {len(medium_priority)} firms")
print(f" Low priority (<40): {len(low_priority)} firms")
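The rubric in `RANKING_TASK` lists bands that share their endpoints (30 and 70 each appear in two bands), and the segmentation cell uses 40 rather than 30 as the lower cut-off. Whichever thresholds are intended, it helps to keep them in one place. A minimal sketch, assuming half-open bands where a score equal to a cut-off belongs to the higher tier; the function name `tier_for_score` is an illustrative choice:

```python
def tier_for_score(score, medium_from, high_from):
    """Map a 0-100 score to a tier label using half-open bands.

    low:    score < medium_from
    medium: medium_from <= score < high_from
    high:   score >= high_from
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= high_from:
        return "high"
    if score >= medium_from:
        return "medium"
    return "low"


# Cut-offs used by the segmentation cell (40 and 70).
print([tier_for_score(s, 40, 70) for s in (35, 40, 69, 70)])
# Cut-offs taken from the rubric wording instead (30 and 70).
print([tier_for_score(s, 30, 70) for s in (35, 40, 69, 70)])
```

With pandas, the same function could label every row via `results_df["score"].apply(lambda s: tier_for_score(s, 40, 70))`.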
In [7]:
# Average score by strategy type
print("\nAVERAGE SCORE BY STRATEGY:")
print(results_df.groupby("strategy")["score"].mean().sort_values(ascending=False).to_string())
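A mean on its own can mislead when a strategy has only one or two firms. A small sketch that reports the group size next to the mean, using a made-up three-row frame in place of `results_df` (the values are illustrative, not output from the notebook):

```python
import pandas as pd

# Illustrative stand-in for results_df; values are made up.
demo_df = pd.DataFrame(
    {
        "strategy": ["Activist", "Activist", "Passive Index"],
        "score": [90, 80, 10],
    }
)

# Mean score and number of firms per strategy, highest mean first.
summary = (
    demo_df.groupby("strategy")["score"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)
print(summary.to_string())
```

Swapping `demo_df` for `results_df` would produce the same table for the ranked firms.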
In [8]:
# Full results
results_df[["firm_name", "strategy", "aum_billions", "score", "research"]]
Out[8]: