Ecommerce: Enrich 10,000 Product Rows With Scraped Vendor Data
Agent pattern for bulk product feed enrichment: scrape each vendor page, rewrite with Claude 4.5 Haiku, cap daily spend via wallet rules.
Pattern type: Bulk CSV enrichment pipeline Tools used:
firecrawl/scrape,claude-4-5-haiku,/api/wallet/rules,/api/batchEstimated cost per 1,000 products: ~$29
What this pattern does
A team running a product catalog takes a CSV of 10,000 SKUs, each with a vendor URL and thin metadata. The agent scrapes each vendor product page, then uses Claude 4.5 Haiku (the cheap, high-throughput tier) to rewrite the description, extract structured attributes (color, material, weight), and tag the category. Output is a new CSV ready for upload to Shopify, BigCommerce, or a PIM.
Haiku is intentional here: for "rewrite this scraped page as a clean product description", Opus is overkill and 5x more expensive.
Workflow
1. Load products.csv (10,000 rows)
2. Chunk into batches of 200 rows
3. /api/batch -> firecrawl/scrape for each vendor_url
4. For each scraped page:
/v1/run model=claude-4-5-haiku
-> { description, color, material, weight, category }
5. Append to enriched.csv
6. /api/wallet/rules caps agent at $50/day, hard-stops if exceeded
Runnable implementation
import fs from "fs";
import { parse } from "csv-parse/sync";
import { stringify } from "csv-stringify/sync";
const SB = "https://api.heybossai.com/v1";
const KEY = process.env.SKILLBOSS_API_KEY!;
const rows = parse(fs.readFileSync("products.csv"), { columns: true });
const out: any[] = [];
for (const chunk of chunks(rows, 200)) {
const scraped = await fetch("https://www.skillboss.co/api/batch", {
method: "POST",
headers: { Authorization: `Bearer ${KEY}`, "Content-Type": "application/json" },
body: JSON.stringify({
requests: chunk.map((r: any) => ({
model: "firecrawl/scrape",
inputs: { url: r.vendor_url },
})),
idempotency_key: `feed-${chunk[0].sku}`,
}),
}).then((r) => r.json());
for (let i = 0; i < chunk.length; i++) {
const md = scraped.results[i]?.markdown?.slice(0, 8000);
if (!md) continue;
const enriched = await fetch(`${SB}/run`, {
method: "POST",
headers: {
Authorization: `Bearer ${KEY}`,
"Content-Type": "application/json",
"X-Max-Cost-Usd": "0.01",
},
body: JSON.stringify({
model: "claude-4-5-haiku",
messages: [
{
role: "system",
content:
"Return strict JSON: { description, color, material, weight, category }",
},
{ role: "user", content: md },
],
}),
}).then((r) => r.json());
out.push({ ...chunk[i], ...JSON.parse(enriched.choices[0].message.content) });
}
}
fs.writeFileSync("enriched.csv", stringify(out, { header: true }));
function* chunks<T>(arr: T[], n: number) {
for (let i = 0; i < arr.length; i += n) yield arr.slice(i, i + n);
}
Cost breakdown (per 1,000 products)
Assumes ~6k input tokens + 300 output tokens per product (scraped page is long, output is structured JSON).
| Call | Skill | Unit cost | Units | Subtotal |
|---|---|---|---|---|
| Scrape vendor page | firecrawl/scrape | $0.0253/request | 1,000 | $25.30 |
| Enrich (input) | claude-4-5-haiku | $4/1M tokens | 6M in | $24.00 |
| Enrich (output) | claude-4-5-haiku | $20/1M tokens | 0.3M out | $6.00 |
Total per 1,000 products: ~$55 Total for full 10,000-row feed: ~$550 one-shot If you skip the scrape step (you already have vendor HTML): ~$30 per 1,000.
Guardrails for this pattern
- Per-call cap:
X-Max-Cost-Usd: 0.01keeps a single weird page from eating $0.50. - Daily limit: set
max_daily_spend_usd: 50via /api/wallet/rules so an infinite loop cannot drain the wallet overnight. - Sub-wallet: put the enrichment agent on its own wallet via /api/wallet/sub-wallets; if something breaks you know exactly who spent what.
- Idempotency: use
idempotency_key: "feed-<sku>"so retries after a crash do not double-charge. - Audit: every enrichment returns a signed receipt you can attach to the row.
Going further
- Add a second pass with
claude-4-6-sonnetonly for rows where Haiku returned low-confidence categories. - Use /api/estimate before kicking off the full 10k-row run so finance can sign off.
- Cache scrape results keyed by URL for 7 days — re-runs become free.
Try this pattern yourself with a free $0.50 trial:
curl -X POST https://www.skillboss.co/api/try/anonymous-wallet \
-H "Content-Type: application/json" -d '{}'
See also: /use/firecrawl-scrape | /use/claude-4-5-haiku
SkillBoss is an independent multi-provider gateway, not affiliated with any vendor listed. Trademarks belong to their owners. See our IP policy.