loading…
Search for a command to run...
loading…
Web scraping MCP server for AI agents. 6 tools: clean content extraction, structured scraping with CSS selectors, full-page screenshots via Playwright, link ext
Web scraping MCP server for AI agents. 6 tools: clean content extraction, structured scraping with CSS selectors, full-page screenshots via Playwright, link extraction, metadata extraction (OG/Twitter cards), and Google search. Free tier with x402 micropayments.
The #1 most requested utility for AI agents — professional web scraping, screenshots, and content extraction via MCP + REST API.
🌐 Clean Content Extraction — Extract readable text/markdown from any webpage (like Readability)
🎯 Structured Scraping — Extract specific data using CSS selectors
📸 Screenshots — Capture full-page or viewport screenshots with Playwright
🔗 Link Extraction — Get all links from a page with optional regex filtering
📋 Metadata Extraction — Extract title, description, Open Graph tags, favicon, etc
🔍 Google Search — Search Google and get results programmatically
Add to your MCP settings file (cline_mcp_settings.json or similar):
{
"mcpServers": {
"agent-scraper": {
"url": "https://agent-scraper-mcp.onrender.com/mcp"
}
}
}
Base URL: https://agent-scraper-mcp.onrender.com
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/scrape_url \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com/article",
"format": "markdown"
}'
Response:
{
"success": true,
"url": "https://example.com/article",
"title": "Article Title",
"content": "# Article Title\n\nClean markdown content...",
"format": "markdown"
}
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/scrape_structured \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com/product",
"selectors": {
"title": "h1.product-title",
"price": ".price",
"reviews": ".review-text"
}
}'
Response:
{
"success": true,
"url": "https://example.com/product",
"data": {
"title": "Product Name",
"price": "$29.99",
"reviews": ["Great product!", "Worth the money"]
}
}
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/screenshot_url \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"width": 1280,
"height": 720,
"full_page": false
}'
Response:
{
"success": true,
"url": "https://example.com",
"image": "iVBORw0KGgoAAAANSUhEUgAA...",
"width": 1280,
"height": 720,
"full_page": false
}
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/extract_links \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"filter": "https://example.com/blog/.*"
}'
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/extract_meta \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
curl -X POST https://agent-scraper-mcp.onrender.com/api/v1/search_google \
-H "Content-Type: application/json" \
-d '{
"query": "python web scraping",
"num_results": 10
}'
After free tier exhausted:
Payment via HTTP 402 with crypto wallet:
0x8E844a7De89d7CfBFe9B4453E65935A22F146aBBX-Payment header with payment proofscrape_urlExtract clean, readable content from any webpage (like Readability).
Parameters:
url (string, required): URL to scrapeformat (string, optional): Output format — text, markdown, or html (default: markdown)Returns: {success, url, title, content, format}
scrape_structuredExtract specific data using CSS selectors.
Parameters:
url (string, required): URL to scrapeselectors (object, required): Dict of name → CSS selectorReturns: {success, url, data}
Example selectors:
{
"title": "h1.post-title",
"author": ".author-name",
"price": "span.price",
"images": "img.product-image"
}
screenshot_urlCapture a screenshot of any webpage.
Parameters:
url (string, required): URL to screenshotwidth (int, optional): Viewport width (default: 1280)height (int, optional): Viewport height (default: 720)full_page (bool, optional): Capture full scrollable page (default: false)Returns: {success, url, image, width, height, full_page}
Image is base64-encoded PNG.
extract_linksExtract all links from a webpage.
Parameters:
url (string, required): URL to scrapefilter (string, optional): Regex pattern to filter URLsReturns: {success, url, links, count}
Links array contains {text, href} objects.
extract_metaExtract metadata from a webpage.
Parameters:
url (string, required): URL to scrapeReturns: {success, url, meta}
Meta object includes:
title: Page titledescription: Meta descriptioncanonical: Canonical URLfavicon: Favicon URLog: Open Graph tagstwitter: Twitter Card tagssearch_googleSearch Google and get results.
Parameters:
query (string, required): Search querynum_results (int, optional): Number of results (default: 10)Returns: {success, query, results, count}
Results array contains {title, url, snippet} objects.
# Clone repo
git clone https://github.com/aparajithn/agent-scraper-mcp.git
cd agent-scraper-mcp
# Install dependencies
pip install -e ".[dev]"
# Install Playwright browsers
playwright install chromium --with-deps
# Run server
uvicorn src.main:app --reload --port 8080
# Initialize
curl -X POST http://localhost:8080/mcp -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0.0"}}}'
# List tools
curl -X POST http://localhost:8080/mcp -d '{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}'
# Call scrape_url
curl -X POST http://localhost:8080/mcp -d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"scrape_url","arguments":{"url":"https://example.com","format":"markdown"}}}'
# Build
docker build -t agent-scraper-mcp .
# Run
docker run -p 8080:8080 -e PUBLIC_HOST=localhost agent-scraper-mcp
Deployed on Render (free tier):
agent-scraper-mcpaparajithn/agent-scraper-mcpEnvironment variables:
PUBLIC_HOST: agent-scraper-mcp.onrender.comX402_WALLET_ADDRESS: 0x8E844a7De89d7CfBFe9B4453E65935A22F146aBBMIT License — see LICENSE for details.
Built for AI agents by AI engineers 🤖
Run in your terminal:
claude mcp add aparajithn-agent-scraper-mcp -- npx Yes, aparajithn/agent-scraper-mcp MCP is free — one-click install via Unyly at no cost.
No, aparajithn/agent-scraper-mcp runs without API keys or environment variables.
Self-hosted: the server runs locally on your machine via the install command above.
Open aparajithn/agent-scraper-mcp on unyly.org, pick your client tab (Claude Desktop, Claude Code, Cursor) and press Install — the config is generated automatically, no JSON editing.
Browser automation, scraping, screenshots
by MicrosoftBrowser automation and web scraping.
by modelcontextprotocolPlugin-based MCP server + Chrome extension that gives AI agents access to web applications through the user's authenticated browser session. 100+ plugins with a
by opentabs-dev1,500+ developer infrastructure deals, free tiers, and startup programs across 54 categories. Search deals, compare vendors, plan stacks, and track pricing chan
by robhunterNot sure what to pick?
Find your stack in 60 seconds
Author?
Embed badge for your README
Browse similar
All browse MCPs