The release notes for Quarkus 3.37.0.CR1 read like a busy week. A wave of entries adds an “AI skill” to specific extensions, including quarkus-quartz, quarkus-hibernate-reactive, quarkus-redis-cache, and quarkus-security-jpa. One line introduces the ability to get response metadata in a streamed response. One enables Jackson reflection-free serializers by default. Two of these lines, read slowly, change something structural about how a Quarkus codebase gets built and operated.
The AI skill moved inside the extension
The interesting word in “Add AI skill for quarkus-quartz” is “for”. The skill ships attached to a single extension rather than as a global description of Quarkus living somewhere central.
That placement is the whole argument. A coding agent working in your project does not need Quarkus in the abstract; it needs the version of quarkus-quartz that is actually on your classpath, with the configuration keys and patterns that version supports. When the skill travels with the extension, the agent reads patterns matched to what you installed, instead of patterns averaged over every version ever published to the open web.
The alternative most teams settle for is a single, generic body of “how do I do X in Quarkus” knowledge that an agent queries, one description that knows the framework in the abstract and quickly drifts out of sync. The 3.37.0.CR1 approach inverts the arrangement. Each extension carries its own agent-facing knowledge as a first-class, version-matched artifact, delivered to the agent through the Quarkus Agent MCP server rather than bolted onto the whole framework as one description.
Notice the kinds of extension getting skills in this batch: a scheduler, a reactive ORM, a cache, a security integration. These are the parts where the “right” usage is full of small, version-specific decisions that an agent routinely gets subtly wrong. That is the correct place to start.
Where hallucinated configs actually come from
Most wrong Quarkus configuration an agent emits started as real configuration: for a different version, or a property renamed two releases ago, or an extension that was later split. The agent learned it from public text that carries no version stamp, so it cannot tell that the snippet it is confidently reproducing stopped being correct eighteen months ago.
A skill that ships with the extension carries that stamp implicitly. The knowledge an agent reads is the knowledge that matches the artifact resolved into your build. The failure mode persists, but its main fuel source, version-blind text scraped from everywhere, stops being the default input. For a senior reviewer, that is the difference between catching one plausible-but-stale config in review and catching five.
This is the part of the release that compounds. Skills attached to extensions are something every extension can grow over time, and every one that does makes the agent a little less dependent on the public internet for your stack.
The line that touches your production numbers
“Introduce ability to get response metadata in streamed response” is the entry nobody will quote. The capability it points at, reading what surrounds a streamed response rather than just the body, is the kind of thing that quietly changes a dashboard.
If you stream LLM output in Quarkus with the LangChain4j extension, you push tokens to the client as they arrive. The text is the easy part. The awkward part has always been everything around it: how many tokens the call consumed, and why the model stopped. That metadata is what you bill against, alert on, and debug with. Capturing token usage and finish reason without buffering the whole response is something the LangChain4j extension already supports through the completed ChatResponse.
What this changelog line actually adds lives on the Quarkus REST Client side: a way to read the HTTP status code and headers of a streamed response, rather than the body alone. Before it, reading those meant falling back to the raw Vert.x HTTP client.
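To make "status code and headers of a streamed response" concrete, here is a minimal sketch using only the JDK's java.net.http client. It is not the Quarkus REST Client API that the changelog line adds; it only illustrates the idea of reading the metadata first and then consuming the body line by line. The class name, the /stream path, the X-Request-Id header, and the throwaway local server are all assumptions made up for the illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class: plain JDK, no Quarkus involved.
final class StreamedStatusDemo {

    /** Status and one header (the metadata), plus the lines of the streamed body. */
    record Result(int status, String requestId, List<String> lines) {}

    static Result fetch(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        // With ofLines() the response object can be returned before the body
        // has been fully received, so status and headers are readable up front.
        HttpResponse<Stream<String>> response =
                client.send(request, HttpResponse.BodyHandlers.ofLines());
        int status = response.statusCode();
        String requestId = response.headers().firstValue("X-Request-Id").orElse("n/a");
        List<String> lines = new ArrayList<>();
        try (Stream<String> body = response.body()) {
            body.forEach(lines::add); // consume the body as it streams in
        }
        return new Result(status, requestId, lines);
    }

    /** Throwaway local server that streams three lines using chunked encoding. */
    static HttpServer startDemoServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/stream", exchange -> {
            exchange.getResponseHeaders().add("X-Request-Id", "demo-123");
            exchange.sendResponseHeaders(200, 0); // length 0 selects chunked transfer
            try (OutputStream out = exchange.getResponseBody()) {
                for (String part : List.of("once", "upon", "a time")) {
                    out.write((part + "\n").getBytes(StandardCharsets.UTF_8));
                    out.flush();
                }
            }
        });
        server.start();
        return server;
    }

    /** Starts the server, performs one streamed call, and shuts the server down. */
    static Result demo() {
        try {
            HttpServer server = startDemoServer();
            try {
                int port = server.getAddress().getPort();
                return fetch(URI.create("http://127.0.0.1:" + port + "/stream"));
            } finally {
                server.stop(0);
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

That covers the transport side. The model-side metadata, token usage and finish reason, comes from LangChain4j rather than from HTTP headers.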
Here is the shape of capturing the model-side metadata, token usage and finish reason, in a real project. The dependencies first:
<dependencies>
    <dependency>
        <groupId>io.quarkus</groupId>
        <artifactId>quarkus-rest-jackson</artifactId>
    </dependency>
    <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-openai</artifactId>
    </dependency>
</dependencies>
There is no separate AI service interface here. Quarkus LangChain4j exposes the configured model as a CDI bean, so the metadata work lives in one service: you inject the default StreamingChatModel and drive it with a StreamingChatResponseHandler. Partial responses flow straight out to the caller as they arrive. When the stream completes, you are handed a ChatResponse whose metadata() carries the finish reason and the token usage. You record it at that point. The text was never buffered:
// src/main/java/com/example/story/StoryService.java
package com.example.story;

import java.util.List;

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.chat.response.ChatResponseMetadata;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import io.smallrye.mutiny.Multi;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import org.jboss.logging.Logger;

@ApplicationScoped
public class StoryService {

    private static final Logger LOG = Logger.getLogger(StoryService.class);

    // The default streaming model, exposed as a CDI bean by Quarkus LangChain4j.
    @Inject
    StreamingChatModel model;

    public Multi<String> write(String topic) {
        ChatRequest request = ChatRequest.builder()
                .messages(List.of(UserMessage.from(
                        "Write a short story about " + topic + ". Keep it under 200 words.")))
                .build();

        // Bridge the callback-style handler into a Multi the resource can return.
        return Multi.createFrom().<String>emitter(emitter ->
                model.chat(request, new StreamingChatResponseHandler() {

                    @Override
                    public void onPartialResponse(String partialResponse) {
                        // Forward each chunk immediately; nothing is accumulated here.
                        emitter.emit(partialResponse);
                    }

                    @Override
                    public void onCompleteResponse(ChatResponse response) {
                        // Metadata is only available once the model has finished.
                        ChatResponseMetadata metadata = response.metadata();
                        LOG.infof("stream complete: finishReason=%s, tokenUsage=%s",
                                metadata.finishReason(), metadata.tokenUsage());
                        emitter.complete();
                    }

                    @Override
                    public void onError(Throwable error) {
                        // Propagate model or transport failures to the subscriber.
                        emitter.fail(error);
                    }
                }));
    }
}
The REST resource stays trivial. It takes input, delegates, and streams the result back as Server-Sent Events. It does not know how the stream is produced:
// src/main/java/com/example/story/StoryResource.java
package com.example.story;

import io.smallrye.mutiny.Multi;
import jakarta.inject.Inject;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.QueryParam;
import jakarta.ws.rs.core.MediaType;
import org.jboss.resteasy.reactive.RestStreamElementType;

@Path("/stories")
public class StoryResource {

    @Inject
    StoryService service;

    @GET
    @Produces(MediaType.SERVER_SENT_EVENTS)
    @RestStreamElementType(MediaType.TEXT_PLAIN)
    public Multi<String> stream(@QueryParam("topic") String topic) {
        return service.write(topic);
    }
}
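Set an OpenAI API key first (the extension reads it from quarkus.langchain4j.openai.api-key), then run the app with ./mvnw quarkus:dev and consume the stream with curl. The -N flag turns off curl's output buffering, so each event prints as it arrives instead of when the open-ended response finally closes: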
curl -N "http://localhost:8080/stories?topic=a%20lighthouse%20keeper"
The token usage and finish reason land in your logs the moment the stream closes, on the same call that streamed the text. The completed ChatResponse from the LangChain4j extension is where that payoff comes from: streaming and observability stop being a trade-off you make per endpoint. The placement of the metadata capture matters as much as the feature. It sits in the service, so the resource stays a thin transport and the orchestration has one home.
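Logging is the smallest useful thing to do with that metadata. If you want numbers you can bill or alert on, one option is to aggregate them in the same place. The sketch below is a hypothetical, plain-Java ledger that sums tokens per finish reason; TokenLedger and its methods are invented for this illustration and are not part of Quarkus or LangChain4j. In the service above, the natural call site would be onCompleteResponse, feeding it the finish reason and the input and output counts read from metadata.tokenUsage(), which can be absent if the provider does not report usage.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper: accumulates token counts keyed by finish reason.
final class TokenLedger {

    // LongAdder keeps concurrent increments cheap when many streams finish at once.
    private final Map<String, LongAdder> tokensByFinishReason = new ConcurrentHashMap<>();

    /** Records one completed call. */
    void record(String finishReason, int inputTokens, int outputTokens) {
        tokensByFinishReason
                .computeIfAbsent(finishReason, key -> new LongAdder())
                .add((long) inputTokens + outputTokens);
    }

    /** Total tokens seen so far for the given finish reason, or 0 if none. */
    long total(String finishReason) {
        LongAdder adder = tokensByFinishReason.get(finishReason);
        return adder == null ? 0L : adder.sum();
    }

    public static void main(String[] args) {
        TokenLedger ledger = new TokenLedger();
        ledger.record("STOP", 12, 180);
        ledger.record("LENGTH", 5, 200);
        System.out.println("STOP=" + ledger.total("STOP") + ", LENGTH=" + ledger.total("LENGTH"));
    }
}
```

Keeping the ledger next to the metadata capture preserves the split described above: the resource remains a thin transport, and everything about usage lives in the service layer.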
Reflection-free serializers, now the default
“Enable Jackson reflection-free serializers by default” is pure plumbing, and it ships switched on rather than waiting behind a flag. Reflection is the part of JSON serialization that GraalVM Native Image has always had to be told about, registration by registration. A reflection-free path is friendlier to native compilation and trims the reflective work the application does to move objects in and out of JSON.
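The difference between the two paths is easier to see side by side. The following is a sketch in plain Java, not the code Quarkus or Jackson actually generates: reflective() discovers the object's shape at run time, which is the part a native image needs reflection metadata for, while generated() stands in for a serializer whose shape was fixed at build time. The class, record, and method names are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.RecordComponent;
import java.util.StringJoiner;

// Hypothetical illustration of reflective vs. reflection-free serialization.
final class SerializerStyles {

    record Story(String title, int words) {}

    /** Reflective path: inspects the record's components at run time. */
    static String reflective(Object value) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (RecordComponent component : value.getClass().getRecordComponents()) {
            try {
                Method accessor = component.getAccessor();
                accessor.setAccessible(true);
                Object v = accessor.invoke(value);
                String rendered = v instanceof CharSequence ? quote(v.toString()) : String.valueOf(v);
                json.add(quote(component.getName()) + ":" + rendered);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return json.toString();
    }

    /** Reflection-free path: direct accessor calls, nothing to look up at run time. */
    static String generated(Story story) {
        return "{" + quote("title") + ":" + quote(story.title())
                + "," + quote("words") + ":" + story.words() + "}";
    }

    // Minimal escaping, enough for this sketch only.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Story story = new Story("Lighthouse", 180);
        System.out.println(reflective(story));
        System.out.println(generated(story));
    }
}
```

Both methods produce the same JSON; only the second can be compiled without telling the native image builder anything about Story. In earlier Quarkus releases the feature was opt-in through the quarkus.rest.jackson.optimization.enable-reflection-free-serializers property. Assuming that name is unchanged in your version, setting it to false should be the way to opt back out, but check the configuration reference for the release you run.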
The part that matters is the default. A capability you opt into helps the teams that already knew to look for it. A default helps the apps that never tuned serialization at all, which is most of them, and it applies on the next upgrade with no code change. That is the quiet kind of improvement that shows up as a slightly leaner build and a slightly cheaper request path across a whole fleet of services nobody is actively optimizing.
Reading a changelog like a senior
Most of the noteworthy lines here are about coding agents, so it is tempting to file the whole release under “AI” and move on. The more useful read sorts the lines by what they actually touch.
The AI skills line is architectural: it reshapes how your codebase and your agent relate over the long run, and compounds as more extensions ship skills of their own. The streaming-metadata line is about observability: it changes what you can see in production today, on the calls you are already making. The reflection-free serializer line is about runtime cost: it touches what your app does on every request, by default.
A candidate release rewards a careful read for exactly this reason. The line that gets the attention is the one that demos well. The lines worth acting on are usually the ones that change your numbers without asking for a press release.