The Modern Java Stack for Agentic AI: Quarkus and Langchain4j

The way technology changes often follows familiar patterns. The jump from static, information-only HTML pages in the early 1990s to dynamic, interactive web applications was a huge shift. It turned the web from a passive library into an active platform for business and communication. Today, Artificial Intelligence is at a similar turning point. The first wave of Large Language Models (LLMs) gave us powerful “answer engines,” which are a lot like those early static web pages. They can find and generate information with impressive fluency, but their role is mostly passive.

The next step is the move from these answer engines to “action engines.” This is the world of agentic AI. An agentic system turns the LLM from a static source of content into an active partner that works towards a goal. It can plan, manage complex workflows, and connect with external systems to do things on a user’s behalf. This isn’t just a small improvement; it’s a fundamental change in how we design and build intelligent applications.

For enterprise Java developers, this moment is a big opportunity. The skills you’ve built over the years, like creating robust, scalable, and secure systems, are exactly what’s needed to build this new generation of agentic applications. The challenge isn’t building the LLM itself, but creating the powerful, reliable systems that use an LLM as their core brain. This guide will walk you through this new area. It breaks down the agentic model, gives a practical guide to building your first agent with a modern Java stack, and shows how to handle real-world challenges like performance, security, and monitoring.

Deconstructing the AI Agent: More Than Just a Smarter API Call

To build effective agents, you first need to understand that an agent is an architectural pattern, not just a smarter API call. It represents a shift from a simple request-response model to a coordinated, goal-oriented workflow where the AI is an active participant.

Defining the Agentic Paradigm

An AI agent is an autonomous system that uses an LLM as its main reasoning engine to achieve a goal. Unlike a standard LLM chat, which is reactive, an agent is proactive. It works by breaking down a high-level goal into a series of smaller, manageable steps, a process called task decomposition. For example, given the goal “resolve this customer’s billing complaint,” an agent might look up the account, pull recent invoices, identify the disputed charge, and then issue a refund or escalate. This completely changes the interaction. Instead of giving a step-by-step prompt, the developer or user gives a high-level goal, and the agent figures out and executes a plan to achieve it on its own.

This ability moves AI from a tool that helps with simple, routine tasks to a system that can automate complex, multi-step business processes. The LLM changes from being the final stop for a query to being the central coordinator of a workflow; it becomes an “action engine” instead of just an “answer engine.”

The Core Components of an Agent

An agent’s ability to act on its own comes from a set of connected components that work with the core LLM. These parts turn the model from a passive text generator into a system that can understand, reason, and act in its environment.

  • Planning & Reasoning Engine: This is the brain of the agent. It’s responsible for creating a plan to reach the user’s goal. This includes analyzing the goal, figuring out the necessary steps, weighing different options, and changing the plan as new information comes in. Frameworks often use well-known reasoning patterns to structure this process. Two common examples are:
    • ReAct (Reason + Act): This pattern uses a loop where the agent first “thinks” about the problem, decides on an action, performs it, observes the result, and then uses that result to plan its next thought and action. This continuous feedback makes it well suited to tasks where conditions keep changing (see the minimal loop sketch after this list).
    • ReWOO (Reasoning WithOut Observation): In this workflow, the agent first creates a complete plan up front, then calls all the tools it needs to gather information (the “worker” phase), and finally uses the collected results to compose the answer. By cutting down on the back-and-forth with the LLM, this can be faster and cheaper than a reactive loop.
  • Memory: An agent can’t be effective if it has no memory. Memory is what allows an agent to keep track of the conversation, learn from past interactions, and follow its progress towards a goal. Memory usually comes in two types:
    • Short-Term Memory: This holds the context for the current task. It includes the user’s original request and the history of what’s happened so far, like previous tool calls and their results. This is essential for any multi-step plan.
    • Long-Term Memory: This gives the agent a permanent knowledge base, letting it recall information from past conversations or tasks. This allows for personalization and the ability to apply old solutions to new problems.
  • Tool Use: Tools are the agent’s connection to the outside world. They are what allow the agent to perform actions and get information beyond what it was trained on. In a business setting, a tool is just a function the agent can call. This could be a REST API, a database query, a web search, or even another AI agent. The ability to use tools is what makes agents useful in the real world, letting them connect to existing company systems and data.
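To make the ReAct loop concrete, here is a minimal, deliberately simplified sketch in plain Java. The Llm and Tools interfaces and the FINAL:/OBSERVATION: string conventions are invented for illustration; in practice, a framework like Langchain4j runs this loop for you.

import java.util.ArrayList;
import java.util.List;

public class ReActLoop {

    // Hypothetical hooks: a model that proposes the next step, and a tool executor
    interface Llm { String nextStep(List<String> history); }
    interface Tools { String execute(String action); }

    static String run(Llm llm, Tools tools, String goal, int maxSteps) {
        List<String> history = new ArrayList<>(List.of("GOAL: " + goal));
        for (int i = 0; i < maxSteps; i++) {
            String step = llm.nextStep(history);          // Reason: decide the next action
            if (step.startsWith("FINAL:")) {              // The model signals it has an answer
                return step.substring("FINAL:".length()).trim();
            }
            String observation = tools.execute(step);     // Act: run the chosen tool
            history.add(step);                            // Observe: record the step...
            history.add("OBSERVATION: " + observation);   // ...and its result for the next turn
        }
        throw new IllegalStateException("No answer after " + maxSteps + " steps");
    }
}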

To make these differences clear, the following table provides an analogy, comparing a standard LLM call to a junior assistant and an agentic workflow to a seasoned, autonomous team member.

Feature | Standard LLM Call (The Junior Assistant) | Agentic Workflow (The Seasoned Team Member)
Initiation | Responds to a direct, specific prompt. | Takes a high-level, complex goal.
Process | Single turn; generates a response based on its training data. | Multi-step; creates a plan, executes actions, and reflects on results.
Interaction | Stateless; has no memory of past interactions without manual context. | Stateful; uses short-term and long-term memory to track progress.
Capabilities | Limited to knowledge within the model. | Can interact with external systems (APIs, DBs) via tools.
Output | A text-based answer to the prompt. | A completed task or a synthesized answer derived from multiple tool outputs.

Your First Java Agent: A Practical Walkthrough with Langchain4j and Quarkus

Let’s move from theory to code. This section provides a guide to building a functional AI agent using a modern, high-performance Java stack. The focus is on the developer experience and the power of well-designed tools that make complex interactions easy to manage.

The “Golden Stack” for Enterprise AI

The combination of Java, Quarkus, and Langchain4j provides a great stack for building enterprise-grade AI applications. Each part plays a key role:

  • Java: As the foundation of enterprise software, Java offers a robust, mature, and secure platform. The language is always evolving, with recent improvements in performance and handling many tasks at once, making it a strong choice for demanding AI applications.
  • Quarkus: A very fast and lightweight Java framework designed for modern, cloud-native applications. Its quick startup, low memory usage, and ability to compile to a native executable make it perfect for running efficient and scalable AI services.
  • Langchain4j: A Java library for building LLM-powered applications. It provides high-level building blocks for key patterns like Agents, Tools, and Memory, letting developers work with familiar ideas instead of low-level API details. The quarkus-langchain4j extension smoothly integrates these into the Quarkus ecosystem, providing a simple and productive developer experience.

Step 1: Defining the Agent’s Interface with @RegisterAiService

The starting point for creating an agent in this stack is a simple Java interface with the @RegisterAiService annotation. This annotation is the core of the programming model provided by Quarkus LangChain4j. It tells Quarkus to automatically create the code for that interface when you build the application. This generated code handles all the complex parts of talking to the LLM, like creating prompts, sending requests, and understanding the responses.

This approach separates what the service does (defined by the interface) from how it does it (the complex implementation details). This separation makes it much easier to get started, allowing developers to focus on their application’s logic using familiar Java code.

For example, here is an agent designed to handle customer support questions about orders:

import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

@RegisterAiService
public interface CustomerSupportAgent {

    String chat(@UserMessage String message);
}
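Because the generated implementation is a regular CDI bean, the interface can be injected anywhere in the application. As a minimal sketch, here is a hypothetical JAX-RS resource (the /support path and SupportResource name are illustrative) that exposes the agent over HTTP:

import jakarta.inject.Inject;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;

@Path("/support")
public class SupportResource {

    @Inject
    CustomerSupportAgent agent; // Quarkus injects the generated implementation

    @POST
    public String chat(String question) {
        return agent.chat(question);
    }
}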

Step 2: Giving the Agent Tools with @Tool

An agent without tools is stuck with only its internal knowledge. To make it useful, you have to give it the ability to connect with external systems. In Langchain4j, a tool is just a method in a CDI bean that you expose to the agent using the @Tool annotation.

Continuing the customer support example, an OrderService could have methods to get an order’s status and shipping information:

import dev.langchain4j.agent.tool.Tool;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class OrderService {

    // Simple carrier/tracking-number pair returned to the agent
    public record ShippingInfo(String carrier, String trackingNumber) {}

    @Tool("gets the current status of an order given its unique order ID")
    public String getOrderStatus(int orderId) {
        // Implementation to query a database or call an order management API
        return "SHIPPED";
    }

    @Tool("gets the detailed shipping information, including carrier and tracking number, for a given order ID")
    public ShippingInfo getShippingInfo(int orderId) {
        // Implementation to call a shipping service API
        return new ShippingInfo("CarrierX", "CX987654321");
    }
}

A very important part of this code is the plain-language description inside the @Tool annotation. This description isn’t a comment for the developer; it’s the main contract for the LLM. The agent uses these descriptions to figure out which tool is right for a task and what information it needs. A clear and precise tool description is essential for the agent to work correctly and reliably.

Step 3: Wiring It All Together

With the agent interface and tools defined, the final step is to connect them. You do this by updating the @RegisterAiService annotation to tell it which tools the agent can use.

A key point is that any agent using tools must also have a ChatMemoryProvider configured. This is because using tools is always a multi-step process. The agent needs to remember the conversation history, including its own decisions to call tools and the results of those calls, to make a plan and give a final, clear response.

import io.quarkiverse.langchain4j.RegisterAiService;
import dev.langchain4j.service.UserMessage;

// A CDI bean implementing ChatMemoryProvider is discovered and used automatically
@RegisterAiService(tools = OrderService.class)
public interface CustomerSupportAgent {

    String chat(@UserMessage String message);
}
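For completeness, a minimal ChatMemoryProvider bean could look like the sketch below, using Langchain4j’s MessageWindowChatMemory to keep a sliding window of recent messages per conversation (the window size of 10 is an arbitrary choice for illustration):

import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.ChatMemoryProvider;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class MyMemoryProvider implements ChatMemoryProvider {

    @Override
    public ChatMemory get(Object memoryId) {
        // Keep only the most recent messages for each conversation ID
        return MessageWindowChatMemory.builder()
                .id(memoryId)
                .maxMessages(10)
                .build();
    }
}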

Tracing an Agentic Workflow

To understand the value of the framework, consider what happens behind the scenes when a user asks the agent: "What's the status of my order #12345, and where is it shipping to?"

The framework manages a complex conversation with the LLM, which is completely hidden from the developer:

  1. Initial Request: Quarkus Langchain4j sends the user’s message to the LLM, along with the descriptions of the available tools (getOrderStatus and getShippingInfo).
  2. First LLM Response (Tool Call): The LLM analyzes the request and the tool descriptions. It decides it first needs to know the order’s status. It doesn’t generate a text answer. Instead, it responds with a structured request to run a tool: getOrderStatus(12345).
  3. Framework Action: The framework sees this response, understands the tool request, and calls the actual OrderService.getOrderStatus(int) Java method. The method returns the string “SHIPPED”.
  4. Second Request: The framework now creates a new request for the LLM. This request contains the entire conversation history: the user’s original message, the LLM’s first response asking for the tool call, and a new message with the output from that tool (“SHIPPED”).
  5. Second LLM Response (Tool Call): The LLM processes this new context. It now knows the order has shipped and figures out the next logical step is to answer the second part of the user’s question. It responds with another tool request: getShippingInfo(12345).
  6. Framework Action: The framework again calls the corresponding Java method, which returns a ShippingInfo object.
  7. Final Request: The framework sends one last request to the LLM, again including the full conversation history, now updated with the result of the second tool call.
  8. Final LLM Response (Synthesis): The LLM now has all the information it needs to completely answer the user’s original question. It combines the information from the tools into a final, natural language response, like: “Your order #12345 has been shipped. It is being sent via CarrierX with tracking number CX987654321.”

This entire multi-step process of reasoning, action, and observation is managed automatically by the framework, showing its power in simplifying the development of complex, agentic systems.

Enterprise-Grade Agents: Concurrency, Context, and the Power of Modern Java

Building a simple demo is one thing. Deploying a robust, scalable, and secure agent in a demanding enterprise environment is another. This final section covers key real-world challenges and shows how the modern Java ecosystem, especially features in Java and the deep integration in Quarkus, provides powerful and elegant solutions.

Challenge 1: Scalability Under Load with Blocking Tools

The Problem: An agent is often useful because it can talk to existing company systems. These systems can sometimes be slow, legacy services with blocking APIs (like an old SOAP web service or a complex database query). In a traditional system, if multiple users talk to the agent at once, each request that calls a slow tool will tie up a valuable system thread. This can quickly use up all available threads, hurting the application’s performance and ability to scale.

The Modern Java Solution: Java 21 finalized virtual threads (from Project Loom), a lightweight thread implementation managed by the JVM. Virtual threads are designed for tasks that spend most of their time waiting on I/O. They allow a small number of platform threads to handle thousands, or even millions, of concurrent operations: when a virtual thread hits a blocking operation, it unmounts from its platform thread, freeing it to work on other tasks. This greatly improves scalability without requiring complex asynchronous code.

The Quarkus Langchain4j Implementation: The powerful combination of the Java platform and a modern framework like Quarkus is clear here. The framework provides a simple way to use this powerful JDK feature. By adding the @RunOnVirtualThread annotation to a tool method, a developer tells Quarkus to automatically run that tool on a virtual thread, making sure the main request-handling thread isn’t blocked.

import dev.langchain4j.agent.tool.Tool;
import io.smallrye.common.annotation.RunOnVirtualThread;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class LegacyCustomerService {

    // CustomerDetails and legacyApiClient stand in for your own legacy types
    @Tool("calls a slow legacy system to get customer details")
    @RunOnVirtualThread
    public CustomerDetails getLegacyDetails(int customerId) {
        // This slow, blocking call will not tie up a platform thread
        return legacyApiClient.fetchDetails(customerId);
    }
}

This seamless integration shows how improvements in the core Java platform directly benefit application developers using a modern framework. It allows teams to solve complex performance problems with a simple annotation, which shows the power of a well-integrated system.

Challenge 2: Secure and Reliable Context Propagation

The Problem: Enterprise applications are rarely simple. A request needs to carry context, like a security identity for authorization, a trace ID for monitoring, or a tenant ID in a multi-tenant system. Passing this context reliably through the entire request is critical. This becomes very difficult in an agentic workflow, where a single user request can trigger a series of unpredictable, asynchronous tool calls on different threads.

The Modern Java Solution: Recent Java releases introduce Scoped Values, a modern, safe, and performant alternative to ThreadLocal. Scoped Values are designed to share immutable data reliably down a call stack, even across thread boundaries. Built for the era of structured concurrency and virtual threads, they provide a robust way to propagate request context.
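As a minimal sketch of the JDK primitive (Scoped Values were previewed across several JDK releases before being finalized; RequestContext and TENANT_ID are hypothetical names for illustration):

public class RequestContext {

    // One immutable binding per request, readable anywhere down the call stack
    public static final ScopedValue<String> TENANT_ID = ScopedValue.newInstance();

    public static void handle(String tenantId, Runnable agentWorkflow) {
        // Everything executed inside run(), including tasks forked onto
        // virtual threads via structured concurrency, sees this binding
        ScopedValue.where(TENANT_ID, tenantId).run(agentWorkflow);
    }
}

// Inside a tool, deep in the workflow:
// String tenant = RequestContext.TENANT_ID.get();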

Challenge 3: Observability in a Black Box

The Problem: An agentic workflow, with its dynamic, multi-step nature, can feel like a black box. When a user’s request fails or is slow, how can a developer figure out what went wrong? Is the problem in the initial prompt, the LLM’s reasoning, a specific tool, or the final response? Without deep visibility, monitoring performance, debugging failures, and tracking costs become nearly impossible.

The Quarkus Solution: Recognizing this challenge, Quarkus provides a first-class, integrated observability solution that turns the black box into a glass box.

  • Logging: By setting simple configuration properties (quarkus.langchain4j.log-requests=true, quarkus.langchain4j.log-responses=true), developers can get a detailed, human-readable log of the entire conversation between the application and the LLM. This is often the first and most valuable tool for debugging an agent’s behavior (a ready-to-paste snippet follows this list).
  • Metrics: The quarkus-micrometer extension automatically captures and exposes a rich set of metrics related to LLM interactions. These can provide insights into performance, like response times, and even allow for tracking token usage, which is critical for managing the operational expense of AI services.
  • Tracing: Through its integration with OpenTelemetry, Quarkus can generate distributed traces that visualize the entire end-to-end flow of an agentic request. A developer can see the initial REST call, each reasoning step performed by the agent, the execution time of every tool call, and the final response generation, all in a single, unified view. This is essential for identifying and resolving performance bottlenecks.
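As a concrete starting point, the logging switches from the first bullet above go straight into application.properties; the metrics and tracing capabilities come from adding the corresponding Quarkus extensions to the build:

# Log the full request/response exchange with the LLM
quarkus.langchain4j.log-requests=true
quarkus.langchain4j.log-responses=true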

This comprehensive, out-of-the-box observability shows that with the right stack, production readiness is not an afterthought but a core part of the development experience.

Conclusion

The rise of agentic AI is a key moment in software development, creating an opportunity to build a new class of proactive, intelligent applications. This guide has shown that the modern Java ecosystem, with the performance of the JDK, the efficiency of the Quarkus framework, and the powerful tools of Langchain4j, is a great fit for this new world. By providing a simple development model, seamless integration with modern platform features like virtual threads, and a built-in, enterprise-grade approach to things like security and observability, this stack empowers Java developers to move beyond simple chatbots and start building the sophisticated, autonomous agents that will power the next generation of enterprise software.

Looking beyond this specific stack, the future of agentic AI is being shaped by the critical need for different systems to work together. As companies deploy networks of specialized agents, the ability for these agents to share tools, data, and context becomes very important. This is where new open standards like the Model Context Protocol (MCP) come in. MCP defines a common language between agents and the tools and data sources they use, allowing a system built with Quarkus and Langchain4j to consume the same MCP servers as agents built on completely different stacks. This move towards a “mesh” of agents and capabilities that can be mixed and matched from different providers represents the next frontier, ensuring that the intelligent systems built today are not isolated silos but key parts of a larger, interconnected enterprise intelligence.
