Emergence of Large Action Models (LAMs) and Their Impact on AI Agents
While LLMs are great for understanding and producing unstructured content, LAMs are designed to bridge the gap by turning language into structured, executable actions.
Introduction
As I have mentioned in the past, Autonomous AI Agents powered by large language models (LLMs) have recently emerged as a key focus of research, driving the development of concepts like agentic applications, agentic retrieval-augmented generation (RAG), and agentic discovery.
However, according to Salesforce AI Research, the open-source community continues to face significant challenges in building specialised models tailored for these tasks.
A major hurdle is the scarcity of high-quality, agent-specific datasets, coupled with the absence of standardised protocols, which complicates the development process.
To bridge this gap, researchers at Salesforce have introduced xLAM, a series of Large Action Models specifically designed for AI agent tasks.
The xLAM series comprises five models, featuring architectures that range from dense to mixture-of-experts, with parameter sizes from 1 billion upwards.
These models aim to advance the capabilities of autonomous agents by providing purpose-built solutions tailored to the complex demands of agentic tasks.
Function Calling
Function calling has become a crucial element in the context of AI agents, particularly from a model capability standpoint, because it significantly extends the functionality of large language models (LLMs) beyond static text generation.
Hence the advent of Large Action Models, one of whose defining traits is the ability to excel at function calling.
AI agents often need to perform actions based on user input, such as retrieving information, scheduling tasks, or performing computations.
Function calling allows the model to generate parameters for these tasks, enabling the agent to trigger external processes like database queries or API calls.
This makes the agent not just reactive, but action-oriented, turning passive responses into dynamic interactions.
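To make this concrete, below is a minimal sketch of the structured output a function-calling model produces. The shape follows the legacy OpenAI function-calling format used in the worked example later in this article; the get_weather function and its arguments are hypothetical, invented purely for illustration.

import json

# A sketch of the message a function-calling model emits. The model executes
# nothing itself; it only names a function and supplies JSON-encoded arguments.
# The function name and fields here are hypothetical.
assistant_message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "get_weather",
        "arguments": '{"city": "London", "unit": "celsius"}'
    }
}

# The application parses the arguments and triggers the external process
# itself, e.g. a database query or an API call.
args = json.loads(assistant_message["function_call"]["arguments"])
print(args["city"])  # -> London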
Interoperability with External Systems
For AI agents, sub-tasks involve interacting with various tools, which are in turn linked to external systems (CRM systems, financial databases, weather APIs, etc.).
Through function calling, LAMs can serve as a broker, providing the necessary data or actions for those systems without needing the model itself to have direct access. This allows for seamless integration with other software environments and tools.
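In practice, this broker role can be as simple as a dispatch table that maps a model-chosen tool name onto a call against the relevant external system. The sketch below assumes hypothetical endpoints; the two function names and both URLs are invented for illustration.

import requests

def fetch_crm_record(customer_id):
    # Stand-in for a CRM lookup; a real integration would need
    # authentication, error handling, and so on.
    return requests.get(f"https://crm.example.com/customers/{customer_id}").json()

def get_weather(city):
    # Stand-in for a weather API call.
    return requests.get(f"https://weather.example.com/v1/{city}").json()

# The dispatch table is the "broker": the model only ever sees the tool
# names, never the systems behind them.
TOOLS = {
    "fetch_crm_record": fetch_crm_record,
    "get_weather": get_weather,
}

def dispatch(tool_name, **kwargs):
    if tool_name not in TOOLS:
        raise ValueError(f"Unknown tool: {tool_name}")
    return TOOLS[tool_name](**kwargs)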
By moving from an LLM to a LAM, the utility of the model is also expanded; LAMs can thus be seen as purpose-built to act as the centrepiece of an agentic implementation.
Large Language Models (LLMs) are designed to handle unstructured input and output, excelling at tasks like generating human-like text, summarising content, and answering open-ended questions.
LLMs are highly flexible, allowing them to process diverse forms of natural language without needing predefined formats.
However, their outputs can be ambiguous or loosely structured, which can limit their effectiveness for specific task execution. That said, using an LLM for an agentic implementation is not wrong, and it can serve the purpose quite well.
Large Action Models (LAMs), by contrast, can be considered purpose-built: they focus on structuring outputs by generating precise parameters or instructions for specific actions, making them suitable for tasks that require clear and actionable results, such as function calling or API interactions.
Overall, in the context of AI agents, function calling enables more robust, capable, and practical applications by allowing LLMs to serve as a bridge between natural language understanding and actionable tasks within digital systems.
Function calling is valuable across a wide range of use cases, including:
Data retrieval for assistants: An AI assistant can fetch up-to-date customer information from an internal system when a user asks, “When is my delivery date?” before providing a response (a schema sketch for this follows the list).
Actionable tasks for assistants: AI assistants can perform tasks like scheduling meetings by aligning user preferences with calendar availability.
Performing computations: A math tutor assistant can execute real-time computations to solve math problems for users. In the example code below, I show this.
Building complex workflows: In data processing, a pipeline can retrieve raw text, convert it into structured data, and store it in a database.
Modifying application UIs: Function calls can dynamically update the user interface, such as placing a pin on a map based on user input.
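To illustrate the first use case above, the function definition handed to the model might look like the following. It uses the same schema format as the worked example later in this article; the get_delivery_date name and order_id parameter are hypothetical.

# A hypothetical function schema for the delivery-date use case. The model
# never queries the order system itself; it only learns, from this schema,
# how to request the lookup with well-formed arguments.
delivery_function = {
    "name": "get_delivery_date",
    "description": "Look up the expected delivery date for a customer's order",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",
                "description": "The customer's order identifier"
            }
        },
        "required": ["order_id"]
    }
}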
Something to realise in terms of function calling is that, when using the OpenAI API with function calling, the model does not execute functions directly.
Instead, in step 3, the model generates the parameters needed to call your function, leaving it to your code to decide how to handle them — typically by invoking the specified function. Your application retains full control throughout the process.
Below is an image from OpenAI regarding function calling…
OpenAI Function Calling Example
The model gpt-4-0613 is set up with function calling capability. Two very simple tools are created (add_numbers and subtract_numbers), which are designed to add and subtract numbers, respectively.
The model listens for requests and decides when to call one of the functions, then the appropriate tool is invoked based on the function name and arguments.
This gives a clear demonstration of how function calling with GPT models works…
pip install openai==0.28
import openai
import json

# Prompt user to input API key
api_key = input("Please enter your OpenAI API key: ")
openai.api_key = api_key

# Define the tools: an addition function and a subtraction function
def add_numbers(a, b):
    return {"result": a + b}

def subtract_numbers(a, b):
    return {"result": a - b}

# Define the function schema for OpenAI function calling
functions = [
    {
        "name": "add_numbers",
        "description": "Add two numbers together",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {
                    "type": "number",
                    "description": "The first number to add"
                },
                "b": {
                    "type": "number",
                    "description": "The second number to add"
                }
            },
            "required": ["a", "b"]
        }
    },
    {
        "name": "subtract_numbers",
        "description": "Subtract one number from another",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {
                    "type": "number",
                    "description": "The number to subtract from"
                },
                "b": {
                    "type": "number",
                    "description": "The number to subtract"
                }
            },
            "required": ["a", "b"]
        }
    }
]

# Route a model-requested function call to the matching local tool
def handle_function_call(function_name, arguments):
    if function_name == "add_numbers":
        return add_numbers(arguments['a'], arguments['b'])
    elif function_name == "subtract_numbers":
        return subtract_numbers(arguments['a'], arguments['b'])
    else:
        raise ValueError(f"Unknown function: {function_name}")

# Prompting the model with function calling
def call_gpt(prompt):
    response = openai.ChatCompletion.create(
        model="gpt-4-0613",  # gpt-4-0613 supports function calling
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ],
        functions=functions,
        function_call="auto"  # Let the model decide which function to call
    )

    # Check if the model wants to call a function
    message = response["choices"][0]["message"]
    if "function_call" in message:
        function_name = message["function_call"]["name"]
        arguments = json.loads(message["function_call"]["arguments"])
        result = handle_function_call(function_name, arguments)
        return f"Function called: {function_name}, Result: {result['result']}"
    else:
        return message["content"]

# Test the app
while True:
    user_input = input("Enter a math problem (addition or subtraction) or 'exit' to quit: ")
    if user_input.lower() == "exit":
        break
    response = call_gpt(user_input)
    print(response)
And in the interaction with the notebook below, notice how the model decides when each function should be called, and how the result of each call is returned.
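One optional extension that the example above omits: in OpenAI's function-calling flow, the function's result can be sent back to the model as a message with role "function", so the model can phrase a final natural-language answer. A minimal sketch, assuming the same openai==0.28 client, functions schema and handle_function_call helper defined above:

# A sketch of the optional final step: return the tool result to the model
# so it can compose a natural-language reply. Reuses the legacy 0.28 client,
# `functions` schema, and `handle_function_call` helper from the example above.
def call_gpt_with_followup(prompt):
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ]
    response = openai.ChatCompletion.create(
        model="gpt-4-0613", messages=messages,
        functions=functions, function_call="auto"
    )
    message = response["choices"][0]["message"]
    if "function_call" not in message:
        return message["content"]
    name = message["function_call"]["name"]
    arguments = json.loads(message["function_call"]["arguments"])
    result = handle_function_call(name, arguments)
    # Feed the result back as a "function" role message for a final answer
    messages.append(message)
    messages.append({"role": "function", "name": name, "content": json.dumps(result)})
    final = openai.ChatCompletion.create(model="gpt-4-0613", messages=messages)
    return final["choices"][0]["message"]["content"]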