url.clean
Fetch any URL and return its article content with the clutter stripped — nav, ads, sidebars, footers, scripts, styles, comments removed via heuristic extraction (<article> / <main> / role=main / densest block). Choose the output with `format`: markdown (default), text, both (JSON envelope), html (a self-contained readable reader-view page, raw text/html), or pdf (a clean typeset reading document, raw application/pdf). html/pdf are built from the same cleaned content, so they carry no live page, no third-party assets, no trackers. SSRF-guarded, 512KB body cap, 8s timeout, 5 redirects max. JSON formats return { url, finalUrl, title, markdown?, text?, wordCount, sourceBytes }. This uses a raw HTTP fetch (no JavaScript) — for client-rendered / SPA pages whose content only appears after JS runs, use /api/url/render (same formats, headless-rendered). For a pixel-perfect image of the live page use /api/ai/screenshot; to enumerate a page or sitemap into its links use /api/url/map.
/api/url/cleanPAYMENT-SIGNATURE.Parameters
| Name | Type | Description |
|---|---|---|
urlrequired | string | max 2048 chars · uri |
format | string | one of: markdown | text | both | html | pdf |
Code samples
# 1. Probe with no auth → 402 envelope with PaymentRequirements curl -sS 'https://2s.io/api/url/clean?url=https%3A%2F%2Fexample.com&format=markdown' # 2. Sign + retry with PAYMENT-SIGNATURE: curl -sS 'https://2s.io/api/url/clean?url=https%3A%2F%2Fexample.com&format=markdown' \ -H 'PAYMENT-SIGNATURE: <base64-json-payload>' # Or use the canonical runner (handles probe → sign → retry): # EVM_PRIVATE_KEY=0x... node --env-file=.env.local \ # --experimental-strip-types scripts/x402-pay.ts \ # 'https://2s.io/api/url/clean?url=https%3A%2F%2Fexample.com&format=markdown'
import { TwoS } from '@2sio/sdk'
const client = new TwoS({
privateKey: process.env.EVM_PRIVATE_KEY as `0x${string}`,
})
const result = await client.url.clean({
"url": "https://example.com",
"format": "markdown"
})
console.log('endpoint:', result.endpoint)
console.log('cost:', result.costUsd, 'USDC')
console.log('tx:', result.settlement?.txHash)
console.log('data:', result.data)import os
from twosio import TwoS
client = TwoS(private_key=os.environ["EVM_PRIVATE_KEY"])
result = client.url.clean(url="https://example.com", format="markdown")
print("endpoint:", result.endpoint)
print("cost:", result.cost_usd, "USDC")
print("tx:", (result.settlement or {}).get("tx_hash"))
print("data:", result.data)// 1. Add @2sio/mcp to your MCP host config (Claude Desktop example below).
// EVM_PRIVATE_KEY funds x402 payments per call.
// claude_desktop_config.json
{
"mcpServers": {
"2sio": {
"command": "npx",
"args": ["-y", "@2sio/mcp"],
"env": { "EVM_PRIVATE_KEY": "0x..." }
}
}
}
// 2. Once the server is running, agents call this tool via standard MCP:
{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "url.clean",
"arguments": {
"url": "https://example.com",
"format": "markdown"
}
}
}Discovery
- /api/directory — full catalog of every endpoint
- /openapi.json — OpenAPI 3.1 spec (per-op x-payment-info, x402Payment security)
- /.well-known/x402 — machine-readable service descriptor for x402-aware crawlers
- /.well-known/mcp/server-card.json — MCP SEP-1649 server card
- /llms.txt — plain-text manifest for LLM ingestion