Agentic AI to Support Conversations

Now that we have understood the different layers in a conversation, let us think about what this means for Agentic AI as an enabler.

Key Concept: If you can create steps using deterministic technologies (e.g., workflow engines like PEGA) mapped to actions, then you do not need the complexity of agents. This means you have a finite domain of jobs that need to be done. For example: a pizza chain has a few well-defined ‘jobs to be done’ and therefore does not need the complexity of an agentic system. This is probably why the NLP chatbot examples from 2016 were all about ordering pizza, and they worked really well!
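
To make this concrete, here is a minimal Python sketch of the deterministic alternative: a finite intent-to-action lookup. The function names and intents are illustrative, standing in for a real workflow engine.

```python
# A minimal sketch of the deterministic alternative: when the domain of
# 'jobs to be done' is finite, a simple intent-to-action mapping is enough.
# All names here are illustrative, not from any specific workflow engine.

def order_pizza(size: str) -> str:
    return f"Ordered a {size} pizza."

def track_order(order_id: str) -> str:
    return f"Order {order_id} is out for delivery."

# Finite, well-defined jobs: no agent needed, just a lookup.
INTENT_TO_ACTION = {
    "order_pizza": order_pizza,
    "track_order": track_order,
}

def handle(intent: str, **kwargs) -> str:
    action = INTENT_TO_ACTION.get(intent)
    if action is None:
        return "Sorry, I can't help with that."  # closed-world fallback
    return action(**kwargs)

print(handle("order_pizza", size="large"))
```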

Agents start to become a real option when, as you step down the first three layers (Utterance, Intent, Action), you find a growing level of complexity that requires some degree of decomposition and cannot be handled by a one-size-fits-all process automation.

Decomposition

This decomposition is where the Agent pattern shines. For example, at the Action level you can have a product-agnostic Agent whose goal is to collect information from a customer and onboard them onto the organisation’s systems, while another Agent is tasked with fraud detection.

Parallelism

Agents can go about their work happily in parallel, working off the same input. They have well-defined methods of interacting with each other and can request support from other Agents as needed (e.g., an onboarding Agent for vulnerable customers).
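
As a rough illustration, here is a minimal asyncio sketch of two agents working in parallel off the same input. The agent functions are placeholders for real LLM or framework calls.

```python
import asyncio

# A minimal sketch of two agents working in parallel off the same input.
# The agent functions are placeholders; in practice each would wrap an
# LLM call or an agent-framework invocation.

async def onboarding_agent(customer: dict) -> str:
    await asyncio.sleep(0.1)  # stands in for real work (LLM calls, I/O)
    return f"Onboarded {customer['name']}"

async def fraud_agent(customer: dict) -> str:
    await asyncio.sleep(0.1)
    return f"Fraud check passed for {customer['name']}"

async def main() -> None:
    customer = {"name": "Asha"}
    # Both agents receive the same input and run concurrently.
    results = await asyncio.gather(
        onboarding_agent(customer), fraud_agent(customer)
    )
    print(results)

asyncio.run(main())
```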

Composition

Agents can also work in a hierarchy where they become increasingly specialised (e.g., agents that implement specific Steps within an Action). This ensures we do not end up with monolithic high-level agents, and lets us compose Step-specific agents across different journeys. An agent that checks for specific types of fraud, or one that is good at collecting and organising semi-structured information (e.g., a customer’s expenses), can be used across a number of different journeys, as long as it can be assigned a job by the agents dealing with the conversation.
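
A minimal sketch of this kind of composition, with illustrative class names (not any specific framework): small, reusable Step-level agents that a higher-level journey agent composes.

```python
# A minimal sketch of composition: small Step-level agents that can be
# reused across different journeys by a higher-level agent. All class
# names and the placeholder logic are illustrative assumptions.

class ExpenseCollectorAgent:
    """Specialised at collecting and organising semi-structured data."""
    def run(self, raw_text: str) -> dict:
        # In practice this would be an LLM extracting structured fields.
        return {"expenses": [e.strip() for e in raw_text.split(",")]}

class FraudCheckAgent:
    """Specialised at one specific type of fraud check."""
    def run(self, record: dict) -> bool:
        return len(record.get("expenses", [])) > 0  # placeholder rule

class MortgageJourneyAgent:
    """High-level journey agent composing reusable Step agents."""
    def __init__(self) -> None:
        self.collector = ExpenseCollectorAgent()  # reusable Step agent
        self.fraud = FraudCheckAgent()            # reusable Step agent

    def run(self, raw_text: str) -> dict:
        record = self.collector.run(raw_text)
        record["fraud_ok"] = self.fraud.run(record)
        return record

print(MortgageJourneyAgent().run("rent, groceries, travel"))
```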

Handoff

We can clearly see two specific types of handoffs:

Conversational Handoff – here an agent hands over the interaction to another agent. For example: a particular customer is identified as vulnerable by the primary contact agent and then transferred to an agent that has been specially created for that task. The speciality of the agent can stem from custom prompts and governance, a custom fine-tuned LLM, or a combination of the two. There may also be specific process changes in that scenario, or an early escalation to a human agent.

The receiving agent has the option of not accepting the handoff, so the sending agent must be prepared to deal with this scenario.

Once the receiving agent accepts the handoff, the sending agent has no further role to play.
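
Here is a minimal sketch of a conversational handoff, including the refusal path. The agent classes, the vulnerability flag, and the fallback behaviour are all illustrative assumptions.

```python
# A minimal sketch of a conversational handoff, including the case where
# the receiving agent declines. The classes are illustrative assumptions.

class VulnerableCustomerAgent:
    def accept_handoff(self, context: dict) -> bool:
        # The receiving agent may refuse (e.g., over capacity, out of scope).
        return context.get("flag") == "vulnerable"

    def converse(self, context: dict) -> str:
        return "Specialist agent now handling the conversation."

class PrimaryContactAgent:
    def __init__(self, specialist: VulnerableCustomerAgent) -> None:
        self.specialist = specialist

    def handle(self, context: dict) -> str:
        if context.get("flag") == "vulnerable":
            if self.specialist.accept_handoff(context):
                # Handoff accepted: the sending agent steps away entirely.
                return self.specialist.converse(context)
            # Handoff refused: the sender needs a fallback path,
            # e.g. early escalation to a human agent.
            return "Escalating to a human agent."
        return "Primary agent continues the conversation."

agent = PrimaryContactAgent(VulnerableCustomerAgent())
print(agent.handle({"flag": "vulnerable"}))
```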

Task Handoff – in this case we compose a particular piece of functionality through task decomposition and handoffs. For example, at the Step level we may have each Step implemented by a different Agent.

Taking the example from the previous post:

  1. Collect basic information to register a new customer.
  2. [Do Fraud checks]
  3. {Create customer’s record.}
  4. {Create customer’s application within the customer’s account.}
  5. Collect personal information.
  6. Collect expense information.
  7. Collect employment information.
  8. Seek permission for a credit check.
  9. [Do credit check or stop application.]

The driver agent carries out the unmarked steps (1 and 5–8) itself. It then decomposes the tasks at the next level of detail, splitting them between the fraud and credit-check steps (square brackets) and the customer-record creation steps (curly brackets). These could be given to two different agents.

In this case the driver agent will decide how to divide the tasks and which agent to hand each sub-task to. The driver will also be responsible for handling any errors, unexpected responses, and the final response received from each of the support agents.

The support agents can refuse to accept a sub-task, depending on the specific scenario.
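
A minimal sketch of this task handoff, assuming two illustrative support agents for the bracketed step groups; the driver keeps error handling for itself.

```python
# A minimal sketch of task handoff: a driver agent keeps the unmarked
# steps for itself and hands the bracketed step groups to support agents.
# The agent names and the simple error handling are illustrative.

def fraud_agent(task: str) -> str:       # handles the [square bracket] steps
    return f"done: {task}"

def records_agent(task: str) -> str:     # handles the {curly brace} steps
    return f"done: {task}"

SUB_TASKS = {
    "fraud and credit checks": fraud_agent,
    "create customer record and application": records_agent,
}

def driver_agent() -> list[str]:
    results = []
    for task, support in SUB_TASKS.items():
        try:
            # The driver decides which agent gets which sub-task.
            results.append(support(task))
        except Exception as err:
            # The driver owns errors and unexpected responses.
            results.append(f"failed: {task} ({err})")
    return results

print(driver_agent())
```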

An Important Agent Design Decision

Now we come to a critical decision in designing our agents: the trade-off between sending the task to the data, fetching the data for the task, or centralising both. Let us dig a bit deeper.

Sending the Task to the Data:

This is where the Orchestrating Agent drives the process by sending the task to a Serving Agent that sits closer to the data. The Serving Agent processes the data as per the task requirements and returns only the results to the Orchestrating Agent. This is required in many situations, such as:

  1. Data is sensitive and cannot be accessed directly.
  2. Data processing requires extensive context and knowledge.
  3. Data processing is time consuming.
  4. Associated data has specific usage conditions attached to it.
  5. Results need to be ‘post-processed’ before being returned – e.g., checking for PII.

This is what happens when we seek expert advice or professional help. For example, if we want to apply for a mortgage we provide the task (e.g., the amount to be borrowed) to an expert (a mortgage advisor), who then looks at all the data and provides suitable options (results) for us to evaluate.
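
A minimal sketch of the task-to-data pattern, with illustrative rates standing in for the sensitive data: only the task crosses the boundary, and only results come back.

```python
# A minimal sketch of 'sending the task to the data': the Serving Agent
# holds (or sits next to) the data, and the raw data never leaves its
# boundary. All data values and names here are illustrative assumptions.

class ServingAgent:
    """Runs close to the data; returns only results, never raw data."""
    def __init__(self) -> None:
        self._rates = {"fixed_5yr": 4.1, "tracker": 4.6}  # private data

    def execute(self, task: dict) -> dict:
        amount = task["amount"]
        # Process locally as per the task, then return only the results.
        options = {
            name: round(amount * rate / 100 / 12, 2)
            for name, rate in self._rates.items()
        }
        return {"monthly_interest_options": options}

class OrchestratingAgent:
    def __init__(self, server: ServingAgent) -> None:
        self.server = server

    def request(self, amount: float) -> dict:
        # Only the task travels; the rates stay with the Serving Agent.
        return self.server.execute({"amount": amount})

print(OrchestratingAgent(ServingAgent()).request(250_000))
```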

We can see this type of Agentic interaction emerging in a ‘Compare the Market’ scenario, where our Apple/Google/OpenAI agent becomes the Orchestrating Agent for a large number of Serving Agents operated by different lenders/providers.

Currently, Google’s A2A protocol attempts to provide this kind of ‘task transfer’ across organisational boundaries. Such a transfer requires many layers of security, tracking, negotiation, and authorisation, and given the current state of A2A there are still gaps.

Security and Authorisation: the security posture and authorisation need to work in both directions. The agent operating on the data (the Serving Agent) may require access to additional data that the Orchestrating Agent cannot see, for example interest rates and discounts. Conversely, the Orchestrating Agent may need to authorise the Serving Agent to access data owned by the requester, for example the requester’s credit history.

Tracking and Negotiation: tracking of tasks and negotiation before and during task execution is critical. For example, when going through a complex transaction like a mortgage application there is constant tracking and negotiation between the requester and the mortgage advisor.

Fetching the Data for the Task:

Now let us reverse the above example: we fetch the data required for an Agent to complete its task. This is done through Tools, using framework-based tooling or MCP (for inter-organisational tool use).

There are many scenarios where this pattern is required, the common theme being that the task is not easily transferable due to extensive knowledge requirements, cost, context, or regulation. For example, a personal finance advisor works in this way: the advisor does not forward the task to another agent because it is a regulatory requirement that the person dealing with the application has specific training and certifications.
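
A minimal sketch of this data-fetching pattern, with placeholder tool functions standing in for real framework tooling or MCP servers; the names, values, and eligibility rule are all illustrative assumptions.

```python
# A minimal sketch of 'fetching the data for the task': the agent stays
# put and pulls data in through tools. The tool functions stand in for
# framework tooling or MCP servers; their names are assumptions.

def fetch_credit_history(customer_id: str) -> dict:
    return {"score": 720}            # placeholder for a real tool call

def fetch_income(customer_id: str) -> dict:
    return {"annual_income": 48_000}  # placeholder for a real tool call

# Distinguish data that is required from data that is merely good to have.
REQUIRED_TOOLS = [fetch_credit_history]
OPTIONAL_TOOLS = [fetch_income]

def advisor_agent(customer_id: str) -> dict:
    data: dict = {}
    for tool in REQUIRED_TOOLS + OPTIONAL_TOOLS:
        data.update(tool(customer_id))  # each fetch adds sensitivity/risk
    # The task itself is never forwarded; the certified advisor decides.
    return {"eligible": data["score"] > 650, "inputs": data}

print(advisor_agent("cust-123"))
```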

Here the key questions are: what data is required versus merely good to have, how the data is to be gathered, the time intervals for gathering it, and the relative sensitivity of the data (and therefore the risk that holding it brings). There is also an ethical dilemma: what information should be disregarded or not gathered at all?

I will bring out the ethical dilemma, as the other issues are well understood. Imagine you are talking with an AI Insurance Agent, looking to buy travel insurance for your upcoming trip to a city famous for its water-sports. Let us say you mention in passing, ‘I love deep sea diving’. Now the Agent asks you if you plan on participating in any water-sports, and you reply, ‘No, I am just going to relax there’. The ethical dilemma is whether the AI should take your response at face value and forget about your love of deep sea diving, or treat it as relevant anyway. The choice will impact the perceived risk and therefore the premium. The agent may collect more data to improve its assessment, and also provide a clear disclaimer to the requester that they will not be covered for any water-sports related claims.

There are various mechanisms available to solve all of the above problems except the ethical dilemma. That is why we need the next style.

Centralising Data and Task:

In this case we send the data and the task (independently or as part of a deterministic process) to a third agent, which processes them and responds.

This style is particularly important when we want a single way of doing something that applies across a wide variety of tasks and data. Think of a judge in a court: cases pertaining to different laws come in, and the same judge processes them all.

The classic example of this is ‘LLM-as-a-Judge’, where we provide the task and the data (including the LLM response) to a different LLM to evaluate the response against some pre-defined criteria. These are usually implemented using a deterministic orchestration flow.
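
A minimal sketch of LLM-as-a-Judge inside a deterministic orchestration flow; `call_llm` is a placeholder for any real chat-completion client, and the criteria and prompts are illustrative.

```python
# A minimal sketch of LLM-as-a-Judge using a deterministic orchestration
# flow. `call_llm` is a placeholder assumption for any chat-completion
# client; the criteria and prompt wording are illustrative.

def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM client call (e.g., an SDK chat completion).
    return "score: 4 - response is accurate and avoids mis-selling"

def judge(task: str, data: str, candidate_response: str) -> str:
    prompt = (
        "You are an impartial judge. Evaluate the response below against "
        "these criteria: accuracy, completeness, no mis-selling.\n"
        f"Task: {task}\nData: {data}\nResponse: {candidate_response}\n"
        "Return a score from 1-5 with a one-line justification."
    )
    # The judge LLM receives both the task and the data, centrally.
    return call_llm(prompt)

print(judge(
    task="Recommend travel insurance",
    data="Customer mentioned loving deep sea diving",
    candidate_response="Standard policy; water-sports excluded.",
))
```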

In our water-sports insurance journey, we would send the final conversation and data (about the customer and the eligible products) to a validator LLM to ensure the best possible customer outcome, including sending communications to correct any mis-selling.

This can be risky in its own right, especially if the task and different parts of the data come from different sources. Even one slight issue can lead to sub-optimal outcomes.
