How to filter job postings with LLM Agents

This notebook demonstrates using everyrow's screen() utility to filter job postings by semantic criteria that traditional regex/keyword matching struggles with.

Use Case: Filter job postings from a "Who's Hiring" thread to find only those that meet ALL of:

1. Remote-friendly (explicitly allows remote, hybrid, or distributed work)
2. Senior-level (a Senior/Staff/Lead/Principal title, or requirements stating 5+ years of experience)
3. Salary disclosed (specific compensation figures, not "competitive" or "DOE")

Why everyrow? Traditional keyword matching achieves ~68% precision on this task. Semantic screening with everyrow achieves >90% precision by understanding context and intent.
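
To make the keyword-matching failure mode concrete, here is a minimal sketch of a regex baseline. The patterns, the helper name `keyword_screen`, and the two sample postings are made up for illustration and come from neither the dataset nor everyrow. The point is only that a keyword hit says nothing about negation or context.

```python
import re

# Hypothetical keyword patterns, one per criterion (illustrative only).
REMOTE_RE = re.compile(r"\b(remote|hybrid|distributed)\b", re.IGNORECASE)
SENIOR_RE = re.compile(r"\b(senior|staff|lead|principal)\b", re.IGNORECASE)
SALARY_RE = re.compile(r"\$\s?\d")  # a dollar sign followed by a digit


def keyword_screen(text: str) -> bool:
    """Return True when all three keyword patterns match somewhere in the text."""
    return all(p.search(text) for p in (REMOTE_RE, SENIOR_RE, SALARY_RE))


# Made-up postings: the first genuinely qualifies, the second does not.
good = "Senior Backend Engineer. Fully remote. Salary $180k-$220k."
bad = (
    "Engineer reporting to our senior leadership. No remote work, on-site only. "
    "We raised $50M last year. Compensation is competitive."
)

print(keyword_screen(good))  # True
print(keyword_screen(bad))   # True as well: a false positive
```

The second posting matches "remote" inside "No remote work", "senior" inside "senior leadership", and a dollar figure that is funding rather than salary, so the baseline accepts a posting that fails all three criteria.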

In [1]:

import asyncio
from dotenv import load_dotenv
load_dotenv()

import pandas as pd
from pydantic import BaseModel, Field
from everyrow import create_session
from everyrow.ops import screen

Load Job Posting Data

In [2]:

job_postings = pd.read_csv("../data/job_postings.csv")

print(f"Loaded {len(job_postings)} job postings")
job_postings.head()

Loaded 15 job postings

Out[2]:

	company	title	location	description
0	TechCorp	Senior Backend Engineer	Remote (US)	We're looking for a senior backend engineer wi...
1	StartupXYZ	Full Stack Developer	San Francisco, CA	Join our fast-growing team! 2+ years experienc...
2	DataDriven Inc	Staff Data Scientist	Hybrid (NYC)	Staff-level data scientist needed. 8+ years ML...
3	CloudFirst	Junior DevOps Engineer	Remote	Entry level DevOps role. 0-2 years experience....
4	Enterprise Solutions	Principal Architect	On-site Boston	Principal architect for our platform team. 15+...

Define Screening Schema

We use a Pydantic model to structure the screening output.

In [3]:

class JobScreeningResult(BaseModel):
    """Schema for job posting screening results."""
    passes: bool = Field(
        description="Whether the job posting meets ALL three criteria"
    )
    is_remote_friendly: bool = Field(
        description="Whether the posting explicitly allows remote/hybrid/distributed work"
    )
    is_senior_level: bool = Field(
        description="Whether the role is senior-level (5+ years or Senior/Staff/Lead/Principal title)"
    )
    has_salary_disclosed: bool = Field(
        description="Whether specific salary figures are provided (not 'competitive' or 'DOE')"
    )
    reasoning: str = Field(
        description="Brief explanation of the screening decision"
    )
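
Because `passes` is reported alongside the three per-criterion flags, the two can disagree if the model slips. A minimal sketch of a consistency check follows. It works on plain dicts that use the schema's field names; the helper name `flags_consistent` and the sample records are hypothetical.

```python
CRITERIA = ("is_remote_friendly", "is_senior_level", "has_salary_disclosed")


def flags_consistent(record: dict) -> bool:
    """True when `passes` equals the logical AND of the three criterion flags."""
    return record["passes"] == all(record[name] for name in CRITERIA)


# Made-up records with the same field names as JobScreeningResult.
ok = {
    "passes": True,
    "is_remote_friendly": True,
    "is_senior_level": True,
    "has_salary_disclosed": True,
    "reasoning": "meets all three",
}
contradictory = {
    "passes": True,
    "is_remote_friendly": True,
    "is_senior_level": True,
    "has_salary_disclosed": False,
    "reasoning": "salary is listed as DOE",
}

print(flags_consistent(ok))             # True
print(flags_consistent(contradictory))  # False
```

For a `JobScreeningResult` instance, the same check could be applied as `flags_consistent(result.model_dump())`, assuming Pydantic v2.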

Define Screening Task

In [4]:

SCREENING_TASK = """
Screen job postings to find roles that meet ALL THREE of the following criteria:

1. **Remote-friendly**: The posting explicitly allows remote, hybrid, distributed, or 
   work-from-anywhere arrangements. "On-site only" or no mention of remote = fail.

2. **Senior-level**: The role is for experienced professionals. This means EITHER:
   - Title includes Senior, Staff, Lead, Principal, Director, or Architect
   - Requirements explicitly state 5+ years of experience
   Junior roles or roles requiring <5 years = fail.

3. **Salary disclosed**: The posting includes specific compensation figures (dollar amounts,
   salary ranges, or equivalent). Vague terms like "competitive", "DOE", "top of market",
   "TBD", or "equity only" = fail.

A posting only PASSES if it meets ALL THREE criteria.
"""

Run the Screening

In [5]:

async def run_screening():
    async with create_session(name="Job Posting Screening") as session:
        print(f"Session URL: {session.get_url()}")
        
        result = await screen(
            session=session,
            task=SCREENING_TASK,
            input=job_postings,
            response_model=JobScreeningResult,
        )
        
        return result.data

results_df = await run_screening()

Session URL: https://everyrow.io/sessions/3ec69130-f011-49b8-abb8-3779dcfaa204
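
The `await run_screening()` call relies on the notebook's support for top-level `await`. In a plain Python script, a coroutine is usually driven with `asyncio.run` instead. The sketch below uses a stand-in coroutine (`run_screening_stub`, hypothetical) in place of the real everyrow call, so it runs without credentials or network access.

```python
import asyncio


async def run_screening_stub():
    """Stand-in for run_screening(); returns a made-up result instead of calling everyrow."""
    await asyncio.sleep(0)  # yield to the event loop once, as a real network call would
    return [{"company": "ExampleCo", "passes": True}]


def main():
    # asyncio.run creates an event loop, runs the coroutine to completion, and closes the loop.
    return asyncio.run(run_screening_stub())


results = main()
print(results)
```

Inside Jupyter, an event loop is already running, so `asyncio.run` raises a `RuntimeError` there; the notebook form with a bare `await` is the right one in that setting.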

Analyze Results

In [6]:

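# Filter to passing jobs. In this run results_df already appears to hold only
# the passing rows (7 of the 15 loaded), so the filter acts as a safeguard.
passing_jobs = results_df[results_df["passes"]]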

print(f"\n{'='*60}")
print(f"RESULTS: {len(passing_jobs)} of {len(results_df)} jobs passed all criteria")
print(f"{'='*60}\n")

print("QUALIFIED POSTINGS:")
print("-" * 40)
for _, row in passing_jobs.iterrows():
    print(f"  {row['company']:20} | {row['title']}")
    print(f"  {row['location']}")
    print()

============================================================
RESULTS: 7 of 7 jobs passed all criteria
============================================================

QUALIFIED POSTINGS:
----------------------------------------
  TechCorp             | Senior Backend Engineer
  Remote (US)

  DataDriven Inc       | Staff Data Scientist
  Hybrid (NYC)

  RemoteFirst Co       | Lead Frontend Engineer
  100% Remote, Anywhere

  FinTech Pro          | Senior Security Engineer
  Remote (EU timezone)

  HealthTech           | Senior Product Manager
  Distributed team

  MegaCorp             | Staff SRE
  Hybrid (Seattle)

  EdTech Plus          | Senior iOS Developer
  Remote first

In [7]:

# Show breakdown (counts cover the rows in results_df, not all 15 loaded postings)
print("\nSCREENING SUMMARY:")
print(f"  Total postings:  {len(results_df)}")
print(f"  Passed:          {results_df['passes'].sum()}")
print(f"  Failed:          {(~results_df['passes']).sum()}")

SCREENING SUMMARY:
  Total postings:  7
  Passed:          7
  Failed:          0

In [8]:

# Show full results
results_df[["company", "title", "location", "passes"]]

Out[8]:

	company	title	location	passes
0	TechCorp	Senior Backend Engineer	Remote (US)	True
1	DataDriven Inc	Staff Data Scientist	Hybrid (NYC)	True
2	RemoteFirst Co	Lead Frontend Engineer	100% Remote, Anywhere	True
3	FinTech Pro	Senior Security Engineer	Remote (EU timezone)	True
4	HealthTech	Senior Product Manager	Distributed team	True
5	MegaCorp	Staff SRE	Hybrid (Seattle)	True
6	EdTech Plus	Senior iOS Developer	Remote first	True
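
The loaded file has 15 postings but the result table has 7 rows, all passing, which suggests the returned frame holds only the postings that passed. Under that assumption, the rejected postings can be recovered by comparing keys against the input. The sketch below uses plain tuples copied from the rows shown in this notebook (only the five input rows visible in `head()`); the helper name `rejected` is hypothetical, and it assumes that company plus title identifies a posting uniquely.

```python
# (company, title) pairs from the five input rows shown by job_postings.head().
loaded = [
    ("TechCorp", "Senior Backend Engineer"),
    ("StartupXYZ", "Full Stack Developer"),
    ("DataDriven Inc", "Staff Data Scientist"),
    ("CloudFirst", "Junior DevOps Engineer"),
    ("Enterprise Solutions", "Principal Architect"),
]

# (company, title) pairs from the seven passing rows in the result table.
passed = {
    ("TechCorp", "Senior Backend Engineer"),
    ("DataDriven Inc", "Staff Data Scientist"),
    ("RemoteFirst Co", "Lead Frontend Engineer"),
    ("FinTech Pro", "Senior Security Engineer"),
    ("HealthTech", "Senior Product Manager"),
    ("MegaCorp", "Staff SRE"),
    ("EdTech Plus", "Senior iOS Developer"),
}


def rejected(all_rows, passing_keys):
    """Return the input rows whose key is absent from the passing set, in input order."""
    return [row for row in all_rows if row not in passing_keys]


for company, title in rejected(loaded, passed):
    print(f"{company:20} | {title}")
```

With the real DataFrames, the same anti-join could be written as `job_postings.merge(results_df[["company", "title"]], on=["company", "title"], how="left", indicator=True)`, keeping the rows where `_merge == "left_only"`.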
