Discussion about this post

JP:

The ~10-20 tool optimal range tracks with what Cursor found. They enforce a hard cap of 40 because quality tanks beyond that. I recently compared the same web scraping API offered as both an MCP server and a CLI+Skills package; the token efficiency gap was huge. Full repo dissection of both approaches: https://reading.sh/firecrawl-for-ai-agents-skills-vs-mcp-servers-for-web-scraping-051b701b28f9

Pawel Jozefiak:

The scalability limit you identify (~10-20 tools per agent) matches exactly what I hit when rebuilding my autonomous agent (Wiz). Beyond that threshold, context management breaks down and the agent starts making poor tool selection decisions.

Your three patterns - monolithic single agents, agentic workflows, LLM skills - map directly to the evolution I went through. Started monolithic (everything in one agent), moved to workflows (spawn subagents for specific tasks), now experimenting with skills (specialized capabilities the agent can invoke).

What I'm still figuring out is when to use which pattern. Subagents work well for parallelizable research tasks. Skills make sense for domain-specific operations (Notion, email, Discord). But the orchestration layer - deciding which pattern to use when - remains more art than science.

I documented this architectural evolution when I rebuilt Wiz from scratch: https://thoughts.jock.pl/p/ai-agent-self-extending-self-fixing-wiz-rebuild-technical-deep-dive-2026 - curious if the hybrid approaches you mention have solved this orchestration problem.

