kreuzcrawl, an open source Rust crawling engine with 11 language bindings
Posted by Eastern-Surround7763@reddit | LocalLLaMA | 2 comments
kreuzcrawl is a high-performance web crawling engine designed to reliably extract structured data. It runs natively across multiple languages without enforcing a specific runtime. See here: https://github.com/kreuzberg-dev/kreuzcrawl
The MCP server is integrated from the start, enabling web-crawling AI agents as a primary use case. Streaming crawl events allow real-time progress tracking. Batch operations handle hundreds of URLs concurrently and tolerate partial failures. Browser rendering supports JavaScript-heavy SPAs and includes WAF detection.
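To make the batch behavior concrete, here is a minimal, self-contained Rust sketch of the pattern described above: crawl several URLs concurrently and collect a per-URL `Result` so one failed fetch doesn't abort the batch. This is an illustrative sketch only, not kreuzcrawl's actual API; the `fetch` stub and all names are invented for the example, and a real implementation would do actual HTTP I/O.

```rust
use std::thread;

// Stand-in for a real HTTP fetch (hypothetical, not kreuzcrawl's API).
// URLs containing "bad" simulate a failed request.
fn fetch(url: &str) -> Result<String, String> {
    if url.contains("bad") {
        Err(format!("failed: {url}"))
    } else {
        Ok(format!("body of {url}"))
    }
}

// Crawl a batch of URLs concurrently, keeping one Result per URL
// so partial failures are reported instead of aborting the batch.
fn crawl_batch(urls: Vec<String>) -> Vec<(String, Result<String, String>)> {
    let handles: Vec<_> = urls
        .into_iter()
        .map(|url| thread::spawn(move || (url.clone(), fetch(&url))))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let urls = vec![
        "https://example.com/a".to_string(),
        "https://example.com/bad".to_string(),
        "https://example.com/c".to_string(),
    ];
    let results = crawl_batch(urls);
    let ok = results.iter().filter(|(_, r)| r.is_ok()).count();
    let err = results.iter().filter(|(_, r)| r.is_err()).count();
    // The failed URL is reported alongside the successes.
    println!("ok={ok} err={err}");
}
```

The point of the pattern is the return type: a `Vec` of per-URL `Result`s rather than a single `Result` for the whole batch, which is what "tolerate partial failures" implies.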
Supported language interfaces are Rust, Python, TypeScript/Node.js, Go, Ruby, Java, C#, PHP, Elixir, WASM, and C FFI, and each binding connects directly to the core engine.
Kreuzcrawl is part of the Kreuzberg org: https://kreuzberg.dev/
Feedback and contributions are welcome :)
Icy_Host_1975@reddit
http transport is in the mcp spec already — most servers support sse or streamable-http as a drop-in transport swap. the stdio limitation is usually just an implementation choice, not an architectural one. vibe browser mcp runs over http in your real signed-in browser session if you want a reference impl for how the transport binding looks. vibebrowser.app/mcp
srigi@reddit
interesting project. What stops me from using it is "MCP uses strictly stdio transport". There are use cases where stdio isn't possible and HTTP transport is needed, for example the llama-server web UI. Any plans for HTTP transport for MCP?