How to Add a Column to a DataFrame Using Web Lookup
pandas.apply() runs a local function on each row. But it can't use LLM judgment or do web research to find new values. And doing this by hand can be very slow or expensive. EveryRow provides a one-line utility to do this cheaply and at scale.
This guide shows how to add a column for price, for 246 common software products, in a single method call on your pandas dataframe.
| Metric | Value |
|---|---|
| Rows | 246 |
| Cost | $6.68 |
| Time | 15.7 minutes |
| Success rate | 99.6% (1 failed) |
| Session | view |
pip install everyrow
export EVERYROW_API_KEY=your_key_here # Get one at everyrow.io/api-key
The dataset is a list of 246 SaaS and developer tools like Slack, Notion, Asana. Download saas_products.csv to follow along. We find the annual price of each product's lowest paid tier, which isn't available through any structured API; it requires visiting pricing pages that change frequently and present information in different formats.
import asyncio
import pandas as pd
from pydantic import BaseModel, Field
from everyrow import create_session
from everyrow.ops import agent_map
class PricingInfo(BaseModel):
lowest_paid_tier_annual_price: float = Field(
description="Annual price in USD for the lowest paid tier. "
"Use monthly price * 12 if only monthly shown. "
"0 if no paid tier exists."
)
tier_name: str = Field(
description="Name of the lowest paid tier (e.g. 'Pro', 'Starter', 'Basic')"
)
async def main():
df = pd.read_csv("saas_products.csv") # Single column: product
async with create_session(name="SaaS pricing lookup") as session:
result = await agent_map(
session=session,
task="""
Find the pricing for this SaaS product's lowest paid tier.
Visit the product's pricing page to find this information.
Look for the cheapest paid plan (not free tier). Report:
- The annual price in USD (if monthly, multiply by 12)
- The name of that tier
If the product has no paid tier or pricing isn't public, use 0.
""",
input=df,
response_model=PricingInfo,
)
print(result.data)
asyncio.run(main())
product tier_name lowest_paid_tier_annual_price
0 Notion Plus 96.00
1 Slack Pro 87.00
2 Asana Starter 131.88
3 Monday.com Basic 108.00
4 Trello Standard 60.00
5 Jira Standard 94.92
6 Linear Basic 120.00
7 ClickUp Unlimited 84.00
...
Each result includes a research column showing how the agent found the answer, with citations linking back to sources. For example, Slack's entry shows: "The Pro plan costs $7.25 USD per active user per month when billed annually (from slack.com/pricing/pro). Annual price calculation: $7.25 × 12 months = $87 per user per year."
The key to doing this cheaply is in the orchestration of the web research agents, using the right batching, parallelism, LLMs, search tools, and page reading tools. Web research agents have degrees of freedom on how to solve problems, and EveryRow optimizes them for cost and accuracy, all in a single method on your pandas dataframe.
By using LLM web agents, this works for any new column, any enrichment, that you need on your table, as long as the information can be found on the web.