Nicolas Phung

Senior Scala/Java Data Engineer

Memory Leak on our API

2018-01-10 nsphungheap

Today, we will talk about how we detected a memory leak on our Houston API. I work for Oui Technologie, a French company that is a subsidiary of SNCF (the French national railway company).

How do we know there’s a memory leak?

PAO Log

You can see that, at the end of this graph, memory is stuck around 50 GB: the Java garbage collector can no longer reclaim it. This is a memory leak. When you see something like this, you can start guessing and comb through your code for clues.
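As a rough complement to the graph, the same symptom can be checked from inside the JVM by sampling the heap still in use right after a garbage collection. The snippet below is only a minimal sketch using the standard java.lang.management API; the class name HeapSampler and the sampling loop are made up for illustration and are not part of our API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapSampler {

    /** Returns the heap currently in use, in bytes, after requesting a GC. */
    static long usedHeapAfterGc() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // equivalent to System.gc(): a request, not a guarantee
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical sampling loop: if the value left after each GC
        // keeps rising over time, some objects are never being released.
        for (int i = 0; i < 3; i++) {
            long usedMb = usedHeapAfterGc() / (1024 * 1024);
            System.out.printf("used heap after GC: %d MB%n", usedMb);
            Thread.sleep(1000);
        }
    }
}
```

A floor that keeps climbing only says that a leak is likely; it does not say where the memory is retained.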

But the most reliable way is to use a Java profiling tool. Here’s an example with Java Mission Control:

Java Mission Control Leak

Using this tool, we managed to identify that the leak came from the serialization with Kafka. We use Kryo to serialize our Kafka messages.

What is Kryo?

Kryo is a fast and efficient object graph serialization framework for Java. The goals of the project are speed, efficiency, and an easy to use API. The project is useful any time objects need to be persisted, whether to a file, database, or over the network.

Kryo kryo = new Kryo();
// ...
Output output = new Output(new FileOutputStream("file.bin"));
SomeClass someObject = ...
kryo.writeObject(output, someObject);
output.close();
// ...
Input input = new Input(new FileInputStream("file.bin"));
SomeClass someObject = kryo.readObject(input, SomeClass.class);
input.close();

What’s the issue?

The issue is that Output and Input weren’t closed properly after use. This means that each time we serialized or deserialized an object for Kafka, we lost a little more memory. This is an explanation of the right way to close a stream. In short, here is why unclosed streams kept consuming more and more resources:

An InputStream ties up a tiny kernel resource, a low level file handle. In addition, the file will be locked to some extent (from delete, renaming), as long as you have it open for read. Lets imagine you didn’t care about the locked file. Eventually, if you need to read another file, and open it with a new InputStream, the kernel sequentially allocates a new descriptor (file stream) for you. This will add up.

Source: Why is it good to close an inputstream
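One common way to make sure a stream is always closed, even when an exception is thrown, is Java's try-with-resources statement. The snippet below is a minimal sketch and not our production code: it uses DataOutputStream and DataInputStream from the standard library as stand-ins, and the class name CloseStreams is made up for illustration. Kryo's Output and Input extend OutputStream and InputStream, so the same pattern should apply to them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class CloseStreams {

    /** Serializes a string; the stream is closed even if writeUTF throws. */
    static byte[] serialize(String value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream output = new DataOutputStream(bytes)) {
            output.writeUTF(value);
        } // output.close() runs here and flushes any buffered data
        return bytes.toByteArray();
    }

    /** Deserializes a string; the input stream is closed on every exit path. */
    static String deserialize(byte[] data) throws IOException {
        try (DataInputStream input =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            return input.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        // Round trip: what goes in should come back out unchanged.
        System.out.println(deserialize(serialize("hello")));
    }
}
```

Compared with calling close() by hand at the end of the method, this form leaves no code path on which the stream stays open.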

Java Mission Control

For more details on how to use Java Mission Control, you can watch this short video, which gives an overview in under 10 minutes.