everyrow by futuresearch

Chaining Operations

Operations can be chained together to build complete workflows. Each step refines your data further.
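
The chaining pattern itself can be sketched without calling the service. The snippet below is a toy illustration, not everyrow's API: `StepResult`, `keep_magical`, `sort_by_urgency`, and the `urgency` field are hypothetical stand-ins. The only thing it borrows from the examples on this page is that each awaited step returns a result whose `.data` becomes the next step's input.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StepResult:
    """Hypothetical stand-in for an operation result; only `.data` is modelled."""
    data: list[dict]


async def keep_magical(rows: list[dict]) -> StepResult:
    # Stand-in for a screening step: keep rows already flagged as magical.
    return StepResult([r for r in rows if r["is_magical"]])


async def sort_by_urgency(rows: list[dict]) -> StepResult:
    # Stand-in for a ranking step: most urgent first.
    return StepResult(sorted(rows, key=lambda r: r["urgency"], reverse=True))


async def pipeline(rows: list[dict]) -> StepResult:
    screened = await keep_magical(rows)
    # The output of one step is the input of the next.
    return await sort_by_urgency(screened.data)


sightings = [
    {"location": "Millbrook", "is_magical": True, "urgency": 2},
    {"location": "Thornwall", "is_magical": False, "urgency": 1},
    {"location": "Lakemere", "is_magical": True, "urgency": 5},
]
result = asyncio.run(pipeline(sightings))
print([r["location"] for r in result.data])  # ['Lakemere', 'Millbrook']
```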

Example scenario: A wizard rescuing magical creatures for a sanctuary.

Screen

Start with: Sighting reports from villages—some real, some tall tales.

Goal: Keep only confirmed magical creatures (not "my neighbour's cat acts weird").

from pydantic import BaseModel, Field

class CreatureSighting(BaseModel):
    is_magical: bool = Field(description="True if genuinely magical")

screened = await screen(
    session=session,
    task="Keep only confirmed magical creatures, not mundane animals or tall tales",
    input=sightings,
    response_model=CreatureSighting,
)
Location    Report
Millbrook   Glowing deer in forest
Thornwall   Cat acts strange at night
Lakemere    Serpent spotted in lake
Oldwick     Bird talked to farmer
...46 more reports

SCREEN

Location    Report
Millbrook   Glowing deer in forest
Lakemere    Serpent spotted in lake
...18 confirmed magical

Common Workflows

Here are practical examples of chaining two or three operations for everyday analyst tasks.

Consolidate messy lead lists

Problem: You have leads from a trade show, a purchased list, and your CRM export. Same companies appear under different names.

Workflow: Concatenate → Dedupe → Rank

import pandas as pd

# Combine sources into one table with a fresh row index
all_leads = pd.concat([trade_show, purchased_list, crm_export], ignore_index=True)

# Dedupe across sources
deduped = await dedupe(
    input=all_leads,
    equivalence_relation="Same company, accounting for Inc/LLC variations, abbreviations, and parent/subsidiary relationships",
)

# Prioritize for outreach
ranked = await rank(
    task="Score by likelihood to need our data integration product",
    input=deduped.data,
    field_name="fit_score",
)
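
Before concatenating, it can help to record where each row came from, so the origin of a consolidated record stays traceable after deduplication. A minimal pandas sketch, assuming toy data: the `company` and `source` columns and all values are hypothetical, and nothing here is required by `dedupe`.

```python
import pandas as pd

# Toy stand-ins for the three lead sources (hypothetical columns and values).
trade_show = pd.DataFrame({"company": ["Acme Inc", "Globex"]})
purchased_list = pd.DataFrame({"company": ["ACME Incorporated"]})
crm_export = pd.DataFrame({"company": ["Globex LLC", "Initech"]})

sources = {
    "trade_show": trade_show,
    "purchased_list": purchased_list,
    "crm": crm_export,
}

# Tag each row with its origin, then stack the frames with a fresh 0..n-1 index.
all_leads = pd.concat(
    [df.assign(source=name) for name, df in sources.items()],
    ignore_index=True,
)

print(all_leads)
```

`ignore_index=True` avoids repeated index labels, which would otherwise carry over from each source frame.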

Vet vendors before RFP

Problem: You have 200 potential vendors. You need to shortlist the ones that meet basic requirements, then rank them by fit.

Workflow: Screen → Rank

# Filter to qualified vendors (VendorQualification is a Pydantic model, like CreatureSighting above)
screened = await screen(
    task="Must have SOC2 certification, 50+ employees, and enterprise references",
    input=vendor_list,
    response_model=VendorQualification,
)

# Rank survivors by fit
ranked = await rank(
    task="Score by alignment with our technical requirements and budget constraints",
    input=screened.data,
    field_name="fit_score",
)
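
The snippet above passes `VendorQualification` as `response_model` without defining it. One possible definition, modelled on the `CreatureSighting` class from the Screen example, is sketched below. The field name `is_qualified` and its description are assumptions made for illustration, not part of the library.

```python
from pydantic import BaseModel, Field


class VendorQualification(BaseModel):
    # Hypothetical schema: a single boolean verdict, mirroring CreatureSighting.
    is_qualified: bool = Field(
        description="True if the vendor has SOC2 certification, 50+ employees, "
        "and enterprise references"
    )


# Quick local check that the model accepts a verdict.
example = VendorQualification(is_qualified=True)
print(example.is_qualified)  # True
```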

Match products to approved suppliers

Problem: Procurement has a list of software products in use. Each needs to be checked against the approved vendor list, but product names don't match company names (Photoshop vs. Adobe).

Workflow: Research (agent_map) → Merge

# First, enrich products with parent company info
enriched = await agent_map(
    task="Find the parent company that makes this software product",
    input=software_products,
)

# Now merge with approved vendors
matched = await merge(
    task="Match products to approved vendors by parent company",
    left_table=enriched.data,
    right_table=approved_vendors,
    merge_on_left="parent_company",
    merge_on_right="vendor_name",
)

# Flag unapproved software for review
unapproved = matched.data[matched.data["vendor_id"].isna()]
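
The final `isna()` filter relies on unmatched products keeping their row while the vendor columns stay empty. That behaviour can be tried locally with an ordinary exact-match left join in pandas, used here only as a stand-in for the LLM-powered `merge`. The toy rows are illustrative; the column names follow the snippet above.

```python
import pandas as pd

# Toy data; parent_company, vendor_name, and vendor_id mirror the snippet above.
enriched = pd.DataFrame({
    "product": ["Photoshop", "Slack", "Notion"],
    "parent_company": ["Adobe", "Salesforce", "Notion Labs"],
})
approved_vendors = pd.DataFrame({
    "vendor_name": ["Adobe", "Salesforce"],
    "vendor_id": [101, 102],
})

# Exact-match left join: every product row is kept, matched or not.
matched = enriched.merge(
    approved_vendors,
    how="left",
    left_on="parent_company",
    right_on="vendor_name",
)

# Rows without a match carry NaN in the right-hand columns.
unapproved = matched[matched["vendor_id"].isna()]
print(unapproved["product"].tolist())  # ['Notion']
```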

Enrich accounts before territory planning

Problem: Your CRM has company names but is missing firmographic data. You need employee count and industry before assigning accounts to sales territories.

Workflow: Research (agent_map) → Rank

# Enrich with firmographics
enriched = await agent_map(
    task="Find employee count, industry, and headquarters location",
    input=accounts,
)

# Score for territory assignment
ranked = await rank(
    task="Score by revenue potential based on company size and industry fit",
    input=enriched.data,
    field_name="territory_priority",
)
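
Because the ranking step depends on what the research step found, it may be worth checking the enriched table for gaps before passing it on. The sketch below is a plain pandas check under assumed column names (`employee_count` and `industry` are hypothetical, as is the `missing_firmographics` helper); the actual output columns of `agent_map` may differ.

```python
import pandas as pd

REQUIRED = ["employee_count", "industry"]  # hypothetical column names


def missing_firmographics(df: pd.DataFrame) -> pd.DataFrame:
    """Return the rows where any required firmographic field is absent."""
    absent = [c for c in REQUIRED if c not in df.columns]
    if absent:
        raise KeyError(f"missing columns: {absent}")
    return df[df[REQUIRED].isna().any(axis=1)]


# Toy stand-in for enriched.data.
enriched_data = pd.DataFrame({
    "company": ["Acme", "Globex", "Initech"],
    "employee_count": [250, None, 40],
    "industry": ["Manufacturing", "Energy", None],
})

gaps = missing_firmographics(enriched_data)
print(gaps["company"].tolist())  # ['Globex', 'Initech']
```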

Dedupe research from multiple sources

Problem: You scraped company data from LinkedIn, Crunchbase, and news articles. Same companies appear with different descriptions.

Workflow: Concatenate → Dedupe

# Combine all research into one table with a fresh row index
all_research = pd.concat([linkedin_data, crunchbase_data, news_mentions], ignore_index=True)

# Consolidate into canonical records
deduped = await dedupe(
    input=all_research,
    equivalence_relation="""
        Same company if names match accounting for legal suffixes,
        or if they share the same website domain or headquarters address
    """,
)
# Result includes best available data from all sources
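
Since the equivalence relation mentions website domains, one optional pre-processing idea is to normalize URLs to a bare domain before deduplicating, so that `https://www.example.com/about` and `example.com` look alike. The helper below is a standard-library sketch. `website_domain` is a hypothetical name, and whether this step improves `dedupe` results is an assumption, not something the library documents here.

```python
from urllib.parse import urlparse


def website_domain(url: str) -> str:
    """Return a lower-cased host without a leading 'www.' for a URL or bare host."""
    # urlparse only fills in the host when a scheme or '//' prefix is present.
    parsed = urlparse(url if "//" in url else f"//{url}")
    host = (parsed.hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


print(website_domain("https://www.Example.com/about"))  # example.com
print(website_domain("example.com"))                    # example.com
```

Applied to a column, for example `all_research["website"].map(website_domain)` with a hypothetical `website` column, this gives the deduplication step one consistent value to compare.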