AI Agent Evaluation Framework From Apple
Apple recently introduced ToolSandbox, a framework for stateful, conversational, interactive evaluation of LLM tool use capabilities.