Java & containers: what I wish I knew before I used it

They told you that using Java with containers was great, but they never mentioned that it wouldn’t be that easy, right?

You have all the advantages of containers like isolation, scalability, ease of deployment, and version management, but what about the pitfalls when using them with Java?

Memory management, image size, initialization time … sometimes they can be tricky.

But there’s a way out. There are some best practices that will rescue your application from failing.

From the Dockerfile to the Java updates (from 9 to 14 and beyond), you can have the best of both worlds right in your hands.

When I first heard about containers, the picture that I had in my mind was something like this:

Everything:

  • Organized
  • Orchestrated
  • Just working
  • Beautiful

But the first time that I really tried to do something meaningful with it, especially with Java, the real picture was something like this:

Everything:

  • Not very well organized
  • Not very well orchestrated
  • Not really working
  • Could be *so much* better

Over the course of time (I mean… years!) I realized that there are some common issues when combining Java and containers:

  1. Long build time
  2. Huge image size
  3. Hard maintainability
  4. Resource allocation

And here are my tips that can save you time, money, and headaches when dealing with these issues (or even help you avoid them).

How to avoid long build time

If you have ever tried to run even a “hello world” using containers, you have noticed that the build time for image creation is something to consider. If this is true for a “hello world”, imagine a huge and complex application.

Your lifesaver here is caching. And here are 3 simple strategies that will help you… a lot!

1. Mind order for caching

The golden rule here is:

What rarely changes goes first. What changes most often goes last.

This happens because your container engine caches the result of each instruction in your Dockerfile.

So when you build a new version of your application and then build a container image from it, the cache for the instruction that copies your application has to be recreated.

And whatever is after that will also lose its cache.
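
A minimal sketch of that ordering could look like the Dockerfile below. The base image `eclipse-temurin:11-jre` and the paths `config/application.properties` and `target/app.jar` are assumptions made for illustration; use whatever your project really has.

```dockerfile
# Base image: changes rarely, so it goes first.
# (Assumed image; any OpenJDK-based image works the same way.)
FROM eclipse-temurin:11-jre

# Working directory: practically never changes.
WORKDIR /app

# Configuration file (hypothetical path): changes once in a while.
COPY config/application.properties /app/config/application.properties

# Application jar (hypothetical path): changes on every build, so it goes last.
COPY target/app.jar /app/app.jar

ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```

With this order, a rebuild after a code change only redoes the last COPY and what follows it; everything above comes from the cache.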

2. Be specific for caching

If you just say “copy whatever is in the folder to the container”, guess what? The folder’s content can change for whatever reason and… it will break the cache.

Do not copy folders, copy files.

It greatly decreases the chance of breaking the cache for no reason (example: someone copied a file to that folder accidentally).
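
As a sketch of the difference, assuming the same illustrative base image (`eclipse-temurin:11-jre`) and a hypothetical jar at `target/app.jar`:

```dockerfile
FROM eclipse-temurin:11-jre

# Too broad: any file that changes (or lands by accident) in the build
# context invalidates this layer.
# COPY . /app/

# Specific: only a change to this one file invalidates the layer.
COPY target/app.jar /app/app.jar
```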

3. Group units for caching

If the container engine creates a cache entry for each instruction in your Dockerfile… what would happen if you grouped instructions (when possible)?

Yes, you answered correctly: it helps the cache management of your container image, because each group is cached as a single layer.

Whenever possible, group commands/lines that can be grouped.
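
Here is a sketch of what grouping can look like. The user and group name `app` and the `/app/logs` folder are hypothetical, and the base image is assumed to be a Debian/Ubuntu-style one where `groupadd` and `useradd` are available.

```dockerfile
FROM eclipse-temurin:11-jre

# Ungrouped, each instruction becomes its own cached layer:
# RUN groupadd --system app
# RUN useradd --system --gid app app
# RUN mkdir -p /app/logs
# RUN chown -R app:app /app

# Grouped, the same steps become a single layer:
RUN groupadd --system app \
    && useradd --system --gid app app \
    && mkdir -p /app/logs \
    && chown -R app:app /app

# Run the application as the user created above.
USER app
```

The steps grouped here belong together (they all prepare the same user and folder), so it makes sense for them to be cached and invalidated as one unit.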

How to get rid of huge image size

To reduce image size you need to reduce what goes inside the image. It looks obvious, but the 3 strategies that I’ll show you right now will prove the opposite.

1. Mind unnecessary dependencies

There are two things worth doing here:

  • Removing “ssh” and “vim” from the container. Why would you need them inside a container? If it’s for debugging, you can always install them when needed
  • Using the “--no-install-recommends” flag. This is an option for the APT package manager that prevents it from automatically installing recommended packages. It can save a lot of storage space.

If you don’t need it, don’t install it in your container.
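
A sketch of those two points, assuming an APT-based image and `curl` as a stand-in for a package your application really needs:

```dockerfile
FROM eclipse-temurin:11-jre

# Before: tools that the application itself never uses.
# RUN apt-get update && apt-get install -y curl ssh vim

# After: only what the application needs, without recommended extras.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl
```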

2. Eliminate package manager cache

Your laptop needs the package manager’s cache. Your server needs it. Your PC needs it. Your container doesn’t.

Remove from your container any cache that is not related to it.
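
For an APT-based image, a sketch of that cleanup could be the following (again, `curl` is only a placeholder package and the base image is an assumption):

```dockerfile
FROM eclipse-temurin:11-jre

# Install and clean up in the same RUN instruction: files deleted in a later
# instruction would still be stored in the earlier layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```

On Alpine-based images, `apk add --no-cache` plays a similar role.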

3. Use optimized tools and frameworks

If you are using something that will make your application big by default, guess what? It will make your container image bigger. There’s no free lunch…

I have two recommendations on this matter:

  • Build your Java applications using Quarkus. It will make your package smaller. By far;
  • Depending on your use case, you can consider using native images created with GraalVM. It will package only the dependencies that are needed into a standalone executable file.

Do not use something that will make your application bigger for no reason.

How to stay away from hard maintainability

When you create a Dockerfile for your container image, it is no different from any other code you write: you’ll need to maintain it.

So, for the same reasons, if there’s some way to ease it… why not take it?

1. Use official images

What if you could base your image on a pre-built image that was built following these and many other best practices?

Well, you can. Most of the biggest projects that publish container images for broad usage follow the best practices for containers.

I cannot speak for every technology, but I can say it for sure for OpenJDK, for example.

So instead of building a container image for Java from scratch, build yours based on the OpenJDK public images.

2. Be specific with tags

When you don’t specify a tag, you are using “latest”. And what’s the problem with it?

Well… whenever “latest” is updated, your image will also be updated the next time you build it. It can be harmless, it can change the application’s behavior, or it can even break the application.

And, come on, this should be the same approach that you use with your applications in general, right?

Don’t ever, ever, use the latest tag. Be specific.
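
In Dockerfile terms, the difference is a single line. The image name below is an assumption for illustration; the point is the explicit tag.

```dockerfile
# Avoid: with no tag, the engine pulls "latest".
# FROM eclipse-temurin

# Prefer: an explicit tag (Java version and flavor).
FROM eclipse-temurin:11-jre
```

Keep in mind that a tag like this can still receive patch updates from the publisher. If you need it even stricter, a full version tag or an image digest pins the base image more tightly.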

3. Choose minimal size images

Compare the sizes of the different tags of the same image.

If you just use the “8” tag, your container starts at 510 MB. Without your application!

The proper approach is to start with “8-jre-alpine” and, only if you need something that is available only in the “…-slim”, pick that one instead.

Always start with the smallest base image possible.

How to manage resource allocation

By far, the biggest issues I ever faced when using Java with containers were related to resource allocation. Mainly memory and CPU.

But luckily this is something that has been tackled release after release, and it’s worth covering this path here (hope I’m able to cover the version you are using).

Containers, in the way that we use them today, originated from something called cgroups. According to an article in the Linux Journal (URL at the end of this post):

Control groups (cgroups) is a kernel feature that limits, accounts for and isolates the CPU, memory, disk I/O and network’s usage of one or more processes.

Alright, so what’s the matter?

The matter is that when Docker containers came to the IT industry, Java was already almost 20 years old, and the JVM just wasn’t aware of this thing called cgroups.

So the JVM, in this scenario, just goes ahead and sizes its memory and threads based on the host resources, not the container resources. Now you do the math!

Now let’s follow the Java history related to containers and how to deal with resource allocation.

Java 8u121 and before

If you are using Java 8u121 or earlier, I would say: don’t use it with containers.

Ok, maybe you just have to… so I would say: use the “hack” created by the Fabric8 folks.

It is a script that helps your JVM recognize the container resources, not the host resources. The URL for the full project is at the end of this post.

Java 8u131 and Java 9

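These flags helped the JVM fit into a container (only UseCGroupMemoryLimitForHeap was new, and experimental; the other two already existed):

  • ParallelGCThreads: sets the number of GC threads, which helps to limit the CPU usage of a container
  • UseCGroupMemoryLimitForHeap: JVM uses the cgroups limits to calculate memory defaults
  • MaxRAMFraction: divisor applied to the available RAM to get the maximum heap size (2 means half of the RAM, 4 means a quarter)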
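
A sketch of how these flags could be passed in a Dockerfile. `BASE_IMAGE` is a hypothetical build argument (supply an image that ships one of the Java versions covered in this section), and the jar path and the values 2 and 2 are only illustrative.

```dockerfile
# BASE_IMAGE is a hypothetical build argument: pass an image that ships
# one of the Java versions covered in this section.
ARG BASE_IMAGE
FROM ${BASE_IMAGE}

# Hypothetical jar path.
COPY target/app.jar /app/app.jar

# UseCGroupMemoryLimitForHeap is experimental in these versions, so it has to be unlocked first.
# MaxRAMFraction=2 caps the heap at 1/2 of the container memory limit.
# ParallelGCThreads=2 assumes the container is given about 2 CPUs.
ENTRYPOINT ["java", \
    "-XX:+UnlockExperimentalVMOptions", \
    "-XX:+UseCGroupMemoryLimitForHeap", \
    "-XX:MaxRAMFraction=2", \
    "-XX:ParallelGCThreads=2", \
    "-jar", "/app/app.jar"]
```

The image name is passed at build time with `docker build --build-arg BASE_IMAGE=<your-image> .`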

Java 8u191 and Java 10

The memory flags above (UseCGroupMemoryLimitForHeap and MaxRAMFraction) were deprecated and these new flags were introduced:

  • InitialRAMPercentage: initial heap size, as a percentage of the available memory
  • MaxRAMPercentage: maximum heap size, as a percentage of the available memory
  • MinRAMPercentage: despite the name, the maximum heap size (as a percentage) used when the available memory is small

And with JDK-8196595, the number of CPUs is calculated from the container allocation by default.
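
A sketch of the percentage flags in a Dockerfile, assuming an illustrative base image and jar path. The 50 and 75 values are examples, not recommendations.

```dockerfile
FROM eclipse-temurin:11-jre

# Hypothetical jar path.
COPY target/app.jar /app/app.jar

# Initial heap: 50% of the container memory limit (value chosen for illustration).
# Maximum heap: 75% of the container memory limit, leaving room for
# non-heap memory such as metaspace and thread stacks.
ENTRYPOINT ["java", \
    "-XX:InitialRAMPercentage=50.0", \
    "-XX:MaxRAMPercentage=75.0", \
    "-jar", "/app/app.jar"]
```

If this image runs with a memory limit of 1 GB (for example `docker run --memory=1g`), the maximum heap ends up at roughly 768 MB.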

Java 11

I would say: if you are just starting with Java and containers, start with Java 11.

Some improvements in this version:

  • -XshowSettings:system (Container Metrics): displays the system or container configuration (Linux only)
  • JDK-8197867: improves CPU calculations for containers in the HotSpot JVM (see PreferContainerQuotaForCPUCount)
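
A small sketch to see what the JVM detects inside a container, assuming an illustrative Java 11 base image:

```dockerfile
FROM eclipse-temurin:11-jre

# Print the system/container settings the JVM detected, then exit.
CMD ["java", "-XshowSettings:system", "-version"]
```

Build it, then run it with different limits (for example `docker run --rm --memory=512m --cpus=1 <image>`) and compare the reported values with the limits you passed.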

Java 12 and 13

  • jhsdb can now be attached to Java processes running in containers (JDK-8205992)
  • Container support improved for Java Flight Recorder (JDK-8203359)
  • Improve systemd slice memory limit support (JDK-8217338)

Java 14

  • JFR Event Streaming: expose JDK Flight Recorder data for continuous monitoring (easier for observability in clusters)
  • Packaging Tool: tool for packaging self-contained Java applications (incubator)

Wrapping up

Straight to the point:

  • Yes, Java and containers can get along!
  • Be intentional when building your Dockerfiles
  • Better start with Java 11+
  • If you *really* need 8 (why?), be extra cautious

I hope this is useful to you! And I would love to read your stories and comments about this “Java & Containers” thing. Go ahead and write it down below.

References

3 thoughts on “Java & containers: what I wish I knew before I used it”

  1. Yoshiro Ozawa says:

    Hi Elder, I am facing an issue with native image GC inside GKE Kubernetes pod. I have not set any pod specific memory constraints yet but after reading your article, I think I should. I see the memory keeps going up to 2 GB and comes crashing down even if the application is not serving any requests. I need to set “UseCGroupMemoryLimitForHeap” parameter but not sure if it should an argument to the native image itself inside the docker or should it be passed while building the image? Any example would be awesome.
    Thanks,
    Yoshiro

  2. Larry Cable says:

    Look at using jlink in containers even if your application is not modular you can jlink your jvm and reduce your footprint vs jdk and run your classes against that

    1. Elder Moraes says:

      You are right! Maybe someday I’ll update the article to add a jlink section.
