Getting Started
Everyrow lets you perform qualitative data transformations on noisy real-world data, at quantitative scale. Define your fuzzy logic concisely in natural language, and everyrow handles the complexity of orchestrating the execution.
Prerequisites
- Python 3.12+
- API key from everyrow.io/api-key
Installation
pip install everyrow
export EVERYROW_API_KEY=your_key_here
See the docs homepage for other options (MCP servers, coding agent plugins).
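Before running the examples, it can help to confirm that both prerequisites are in place. The following is a minimal sketch, not part of everyrow: `check_environment` is a hypothetical helper that only inspects the interpreter version and the `EVERYROW_API_KEY` environment variable named above. It does not verify that the key is valid.

```python
import os
import sys


def check_environment(env=None, version_info=None):
    """Return a list of setup problems; an empty list means ready to go.

    Hypothetical helper for illustration; it is not provided by everyrow.
    """
    env = os.environ if env is None else env
    version_info = sys.version_info if version_info is None else version_info
    problems = []
    # The prerequisites above ask for Python 3.12 or newer.
    if tuple(version_info[:2]) < (3, 12):
        problems.append(
            f"Python 3.12+ is required, found {version_info[0]}.{version_info[1]}"
        )
    # Only checks that the variable is present and non-blank.
    if not env.get("EVERYROW_API_KEY", "").strip():
        problems.append("EVERYROW_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(f"Setup problem: {problem}")
```

The `env` and `version_info` parameters exist only so the check can be exercised without changing the real environment.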
Basic Example
Screen a small set of job postings to shortlist the companies whose posts meet all three criteria.
import asyncio

import pandas as pd
from everyrow.ops import screen
from pydantic import BaseModel, Field

jobs = pd.DataFrame([
    {"company": "Airtable", "post": "Async-first team, 8+ yrs exp, $185-220K base"},
    {"company": "Vercel", "post": "Lead our NYC team. Competitive comp, DOE"},
    {"company": "Notion", "post": "In-office SF. Staff eng, $200K + equity"},
    {"company": "Linear", "post": "Bootcamp grads welcome! $85K, remote-friendly"},
    {"company": "Descript", "post": "Work from anywhere. Principal architect, $250K"},
])


class JobScreenResult(BaseModel):
    qualifies: bool = Field(description="True if meets ALL criteria")


async def main():
    result = await screen(
        task="""
        Qualifies if ALL THREE are met:
        1. Remote-friendly
        2. Senior-level (5+ yrs exp OR Senior/Staff/Principal in title)
        3. Salary disclosed (specific numbers, not "competitive" or "DOE")
        """,
        input=jobs,
        response_model=JobScreenResult,
    )
    print(result.data)


asyncio.run(main())
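To see why these criteria are phrased in natural language rather than as string rules, compare a naive keyword baseline on the same postings. This is an illustrative sketch only: `keyword_screen` and its regular expressions are assumptions made for this comparison and say nothing about how everyrow evaluates the task.

```python
import re

# The same postings as the example above, as plain dicts.
JOBS = [
    {"company": "Airtable", "post": "Async-first team, 8+ yrs exp, $185-220K base"},
    {"company": "Vercel", "post": "Lead our NYC team. Competitive comp, DOE"},
    {"company": "Notion", "post": "In-office SF. Staff eng, $200K + equity"},
    {"company": "Linear", "post": "Bootcamp grads welcome! $85K, remote-friendly"},
    {"company": "Descript", "post": "Work from anywhere. Principal architect, $250K"},
]


def keyword_screen(post: str) -> bool:
    """Naive rule-based stand-in for the three criteria (hypothetical helper)."""
    text = post.lower()
    # Criterion 1: only the literal word "remote" counts.
    remote = "remote" in text
    # Criterion 2: "N+ yrs" with N >= 5, or a seniority word in the text.
    years = re.search(r"(\d+)\+ yrs", text)
    senior = (years is not None and int(years.group(1)) >= 5) or bool(
        re.search(r"\b(senior|staff|principal)\b", text)
    )
    # Criterion 3: a dollar sign followed by a digit.
    salary = re.search(r"\$\d", text) is not None
    return remote and senior and salary


matches = [job["company"] for job in JOBS if keyword_screen(job["post"])]
print(matches)  # → []
```

The keyword rule finds no matches because it misses phrasings such as "Async-first" and "Work from anywhere", which a human reader would plausibly count as remote-friendly. Judgment calls like these are the reason the task is written as a natural-language description.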
Sessions
Every operation runs within a session. Sessions group related operations together and appear in your everyrow.io session list.
When you call an operation without an explicit session, one is created automatically. For multiple related operations, create an explicit session:
from everyrow import create_session
from everyrow.ops import screen, rank

# Run this inside an async function, as in the basic example.
# `leads` and `ScreenResult` are placeholders for your own input data and pydantic model.
async with create_session(name="Lead Qualification") as session:
    # Get the URL to view this session in the dashboard
    print(f"View at: {session.get_url()}")

    # All operations share this session
    screened = await screen(
        session=session,
        task="Has a company email domain (not gmail, yahoo, etc.)",
        input=leads,
        response_model=ScreenResult,
    )

    # Rank only the rows returned by the screen
    ranked = await rank(
        session=session,
        task="Score by likelihood to convert",
        input=screened.data,
        field_name="conversion_score",
    )
The session URL lets you monitor progress and inspect results in the web UI while your script runs.
Async Operations
For long-running jobs, use the _async variants to submit work and continue without blocking:
from everyrow import create_session
from everyrow.ops import rank_async

# Run this inside an async function; `large_dataframe` is a placeholder for your own input data.
async with create_session(name="Background Ranking") as session:
    task = await rank_async(
        session=session,
        task="Score by revenue potential",
        input=large_dataframe,
        field_name="score",
    )

    # Task is now running server-side
    print(f"Task ID: {task.task_id}")

    # Do other work...

    # Wait for result when ready
    result = await task.await_result()
Print or log the task ID so that, if your script crashes, you can recover the result later:
from everyrow import fetch_task_data
df = await fetch_task_data("12345678-1234-1234-1234-123456789abc")
Operations
| Operation | Description |
|---|---|
| Screen | Filter rows by criteria requiring judgment |
| Rank | Score rows by qualitative factors |
| Dedupe | Deduplicate when fuzzy matching fails |
| Merge | Join tables when keys don't match exactly |
| Research | Run web agents to research each row |
See Also
- Guides: step-by-step tutorials
- Case Studies: worked examples
- Skills vs MCP: integration options