Archive for the ‘Featured’ Category
ClassLoaderLocal: How to avoid ClassLoader leaks on application redeploy
Written by Jevgeni Kabanov on June 15, 2009 – 10:51 am
“OutOfMemoryError: PermGen” is a very common message to see after a few redeploys. It is so common because it is amazingly easy to leak a class loader: holding a single outside reference to an object instantiated from a class loaded by that class loader is enough to prevent the class loader from being GC-d.
In this post I’ll review how we solved this problem in JavaRebel, and share the solution with you. It’s not a magical solution, but it will help alleviate some of the problems introduced in both libraries and applications in Java EE.
The most common way to leak is to register some kind of a callback object and never deregister it. E.g. look at the following code:
Core.addListener(new MyListener());
If Core is a part of the framework/platform/container, then it will hold on to MyListener long after the application has been redeployed, and the old class loader will be left hanging.
Let’s see if we can do anything to solve this. The Core implementation looks something like this:
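A minimal sketch of what such a Core could look like, assuming a plain ArrayList of listeners and a hypothetical one-method Listener interface (the onEvent and listenerCount names are assumptions made for illustration, not taken from any framework):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface, assumed for illustration.
interface Listener {
    void onEvent();
}

// Sketch of a container-side class that keeps strong references.
class Core {
    // Every registered listener stays strongly reachable from Core,
    // and through it so do its class and that class's class loader.
    private final List<Listener> listeners = new ArrayList<Listener>();

    void addListener(Listener l) {
        listeners.add(l);
    }

    void fireListeners() {
        for (Listener l : listeners) {
            l.onEvent();
        }
    }

    // Helper added for this sketch only.
    int listenerCount() {
        return listeners.size();
    }
}
```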
The problem is that listeners provides a strong reference to the Listener object. What if we replace it by a weak one?
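public class Core {
    private List<WeakReference<Listener>> listeners =
        new ArrayList<WeakReference<Listener>>();

    void addListener(Listener l) {
        // Only a weak reference is kept, so Core no longer pins the listener
        listeners.add(new WeakReference<Listener>(l));
    }

    void fireListeners() {
        // Exercise for the reader!
    }
}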
Unfortunately, although this does solve the problem of GC-ing the class loader, it doesn’t really work. The Listener behind the weak reference will be GC-d at the first opportunity, and after that it will no longer receive any callbacks. To illustrate why that is a problem: the code above is basically equivalent to throwing the reference away altogether.
Replacing weak reference with a soft one doesn’t improve the situation, just delays the inevitable a bit further. Both are useful for caches, where objects can be recreated at will, but not in this case where we have an externally created object.
So what do we do? What we’d like to do is make the Listener reference depend on the class loader somehow. Unfortunately, to the best of my knowledge, there isn’t a ready-made solution for that, and there’s no way to achieve it with any combination of weak references without causing problems.
What we’d like to have is an ability to add a strong reference to the class loader: in other words have it carry a custom property:
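void addListener(Listener l) {
    ClassLoader cl = l.getClass().getClassLoader();
    // Wishful API: ClassLoader has no getProperty()/putProperty()
    List<Listener> lls = (List<Listener>) cl.getProperty("CoreListeners");
    if (lls == null) {
        lls = new ArrayList<Listener>();
        cl.putProperty("CoreListeners", lls);
    }
    lls.add(l);
}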
That would work, wouldn’t it? Well, not quite. We also need to save a reference to the class loaders, so that we could later go over all of them. Here the WeakHashMap is useful:
Map<ClassLoader, Boolean> classLoaders =
    new WeakHashMap<ClassLoader, Boolean>();

void addListener(Listener l) {
    //…
}
There’s no WeakHashSet in Java, so we’re just using a boolean flag as the value.
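That said, since Java 6 the JDK can build such a set for you: Collections.newSetFromMap wraps a map as a Set, and backing it with a WeakHashMap gives weak-key semantics. A small sketch (the LoaderRegistry class and its method names are made up for this example):

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;

// Hypothetical helper: remembers class loaders without pinning them in memory.
class LoaderRegistry {
    // A Set view over a WeakHashMap; an entry can vanish once its loader
    // is no longer strongly reachable from anywhere else.
    private final Set<ClassLoader> loaders =
        Collections.newSetFromMap(new WeakHashMap<ClassLoader, Boolean>());

    void register(ClassLoader cl) {
        loaders.add(cl);
    }

    boolean isRegistered(ClassLoader cl) {
        return loaders.contains(cl);
    }
}
```

Like WeakHashMap itself, this set is not thread-safe, so it would need external synchronization if Core is used from several threads.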
So this would probably work, but unfortunately class loaders don’t have a getProperty()/putProperty() API. However, it turns out that with a bit of a hack we can simulate it, by generating a unique class per class loader to hold the properties for us. Let’s see how it’s done!
We start with a little boilerplate:
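class ClassLoaderLocalMap {
    private static Method defineMethod;
    private static Method findLoadedClass;

    static {
        try {
            // ClassLoader.defineClass(String, byte[], int, int) is protected
            defineMethod = ClassLoader.class.getDeclaredMethod(
                "defineClass",
                new Class[] {
                    String.class,
                    byte[].class,
                    int.class,
                    int.class });
            defineMethod.setAccessible(true);

            // ClassLoader.findLoadedClass(String) is protected as well
            findLoadedClass =
                ClassLoader.class.getDeclaredMethod(
                    "findLoadedClass",
                    new Class[] { String.class });
            findLoadedClass.setAccessible(true);
        }
        catch (NoSuchMethodException e) {
            throw new RuntimeException(e);
        }
    }
}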
This will give us access to the protected ClassLoader methods defineClass() and findLoadedClass() later on. Now let’s set up the basic API:
public static void put(
        ClassLoader cl,
        Object key,
        Object value) {
    // Synchronizing over ClassLoader is safest
    synchronized (cl) {
        getLocalMap(cl).put(key, value);
    }
}

public static Object get(
        ClassLoader cl,
        Object key) {
    // Synchronizing over ClassLoader is safest
    synchronized (cl) {
        return getLocalMap(cl).get(key);
    }
}
The getLocalMap() method should return a map of entries associated with the class loader. How should that work?
Next we introduce a map from class loaders to unique holder class names. We also introduce a nextHolderName() method that generates unique names:
private static final Map<ClassLoader, String> classLoaderToHolderClassName =
    new WeakHashMap<ClassLoader, String>();
private static volatile int counter = 1;

private static String nextHolderName() {
    return "ClassLoaderLocalMapHolder$$GEN$$" + counter++;
}
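One caveat: counter++ on a volatile field is a read-modify-write and is not atomic, and put() and get() only synchronize on the individual class loader. Two threads working with different loaders could therefore, in principle, draw the same number. That should be harmless here, because a class name only has to be unique within its own class loader, but if globally unique names are wanted, an AtomicInteger is one way to get them. A sketch (HolderNames is a hypothetical class name used only for this example):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of the name generator with an atomic counter.
class HolderNames {
    private static final AtomicInteger counter = new AtomicInteger(1);

    // getAndIncrement() is atomic, so no two callers receive the same suffix.
    static String nextHolderName() {
        return "ClassLoaderLocalMapHolder$$GEN$$" + counter.getAndIncrement();
    }
}
```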
Finally we can implement the getLocalMap() method (to save space I removed all exception handling):
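private static Map getLocalMap(ClassLoader cl) {
    String holderClassName =
        classLoaderToHolderClassName.get(cl);
    if (holderClassName == null) {
        holderClassName = nextHolderName();
        classLoaderToHolderClassName.put(
            cl, holderClassName);
    }

    // Has the holder already been defined in this class loader?
    Class holderClass =
        (Class) findLoadedClass.invoke(
            cl,
            holderClassName);

    if (holderClass == null) {
        byte[] classBytes =
            buildHolderByteCode(holderClassName);

        // Define the generated holder class inside the target class loader
        holderClass = (Class) defineMethod.invoke(cl,
            holderClassName,
            classBytes,
            0,
            classBytes.length);
    }

    return (Map) holderClass
        .getDeclaredField("localMap").get(null);
}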
The last method to implement is buildHolderByteCode. It’s quite trivial and builds the following class renamed to the unique name:
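Judging by the getDeclaredField(“localMap”).get(null) lookup in getLocalMap(), the holder needs nothing more than a static map field. A minimal sketch of such a template class, assuming a plain HashMap as the map implementation (the original may well use something else) and ClassLoaderLocalMapHolder as the template name:

```java
import java.util.HashMap;
import java.util.Map;

// Template for the generated holder. Each class loader gets its own renamed
// copy of this class, and therefore its own independent static map.
class ClassLoaderLocalMapHolder {
    public static final Map<Object, Object> localMap = new HashMap<Object, Object>();
}
```

In a generated version the class itself would presumably be declared public, so that the reflective field read works from code in another package.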
The code can be derived using ASMifier with just a little customization; you can look it up in the full source code.
Although we could now easily implement the original example, it makes sense to make a little extra effort and introduce a ClassLoaderLocal, with behavior similar to ThreadLocal:
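public class ClassLoaderLocal {
    // Assumed: each ClassLoaderLocal instance uses its own unique key object
    private final Object key = new Object();

    public Object get(ClassLoader cl) {
        if (!ClassLoaderProperties.containsKey(cl, key))
            return null;
        return ClassLoaderProperties.get(cl, key);
    }

    public void set(ClassLoader cl, Object value) {
        ClassLoaderProperties.put(cl, key, value);
    }
}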
So the original example now becomes:
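ClassLoaderLocal cll = new ClassLoaderLocal();
Map<ClassLoader, Boolean> classLoaders =
    new WeakHashMap<ClassLoader, Boolean>();

void addListener(Listener l) {
    ClassLoader cl = l.getClass().getClassLoader();
    List<Listener> lls = (List<Listener>) cll.get(cl);
    if (lls == null) {
        lls = new ArrayList<Listener>();
        cll.set(cl, lls);
    }
    lls.add(l);

    // Remember the class loader (weakly) so we can iterate over all of them later
    classLoaders.put(cl, Boolean.TRUE);
}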
In this code, once the class loader a listener came from has been freed, its entries will be GC-d from both Core.classLoaders and ClassLoaderProperties.classLoaderToHolderClassName, as both are WeakHashMaps and there are no strong references to the class loaders. The generated ClassLoaderLocalMapHolder$$GEN$$X class will also be GC-d along with the class loader, so we have effectively eliminated a class loader leak without explicit cleanup calls from the user.
I hope this code will be useful for someone. I cannot give any guarantees whether it will work or not, and it’s clearly a hack (though a solid hack). Please use it only if you actually understand what is happening.
If you see a bug in the code or have a good suggestion, please be sure to comment. There could be a free JavaRebel license in it for you :)
Full source code: ClassLoaderLocalMap.java, ClassLoaderLocal.java.
Posted in Featured, creative | 31 Comments »
The Ultimate Java Puzzler
Written by Jevgeni Kabanov on February 16, 2009 – 6:01 pm
Why is this particular one the ultimate? Two reasons:
- It’s at the very core of the Java language, not some obscure piece of API.
- It melted my brain when I hit it.
UPDATE 2: If you want to test yourself before reading the post take this test. Results are not saved (it’s a paid feature apparently and I just don’t care enough), but you can post them in the comments.
Let’s start by setting up the puzzler environment. We’ll have three classes in two packages. Classes C1 and C2 will be in package p1:
package p1;

public class C1 {
    public int m() {return 1;}
}

public class C2 extends C1 {
    public int m() {return 2;}
}
Class C3 will be in a separate package p2:
package p2;

public class C3 extends p1.C2 {
    public int m() {return 3;}
}
We will also have the test class p1.Main with the following main method:
public static void main(String[] args) {
    C1 c = new p2.C3();
    System.out.println(c.m());
}
Note that we’re calling the method of C1 on an instance of C3. The output for this example is “3”, as you’d expect. Now let’s change the m() visibility in all three classes to default:
public class C1 {
    /*default*/ int m() {return 1;}
}

public class C2 extends C1 {
    /*default*/ int m() {return 2;}
}

public class C3 extends p1.C2 {
    /*default*/ int m() {return 3;}
}
The output will now be “2”!
Why is that? The Main class that invokes the method does not see the m() method in the C3 class, it being in a separate package. As far as it cares the chain ends with C2. But as C2 is in the same package it overrides the m() method in C1. This does not seem too intuitive, but that’s the way it is.
Now let’s try something different, let’s change the modifier of C3.m() back to public. What will that do?
public class C1 {
    /*default*/ int m() {return 1;}
}

public class C2 extends C1 {
    /*default*/ int m() {return 2;}
}

public class C3 extends p1.C2 {
    public int m() {return 3;}
}
Now Main can clearly see the C3.m() method. But amazingly enough, the output is still “2”!
Apparently C3.m() is not considered to override C2.m() at all. One way to think about it is that overriding methods should have access to the super methods (via super.m()). However, in this case C3.m() wouldn’t have access to its super method, as it is not visible to it, being in another package. Therefore C3 is considered to be in a completely different invocation chain from C1 and C2. Were we to call C3.m() directly from Main, the output would actually be “3”.
Now let’s look at one last example. Protected is an interesting visibility. It behaves like default for members in the same package and like public for subclasses. What will happen if we change all of the visibilities to protected?
public class C1 {
    protected int m() {return 1;}
}

public class C2 extends C1 {
    protected int m() {return 2;}
}

public class C3 extends p1.C2 {
    protected int m() {return 3;}
}
My reasoning goes like this: as Main is not a subclass of any of these classes, protected should behave like default in this case and the output should be “2”. However, that is not the case. The crucial thing is that C3.m() has access to super.m(), and thus the actual output will be “3”.
Personally, when I first encountered this accessibility issue I got thoroughly confused and couldn’t get it until I had worked through all of these examples. The intuition I got from this is that if and only if you can access super.m() the subclass is a part of the invocation chain.
UPDATE: Apparently, even though the whole thing is obvious to anyone, the intuition I came up with was wrong. A mysterious commenter known only as “C” has provided the following example:
public class C1 {
    /*default*/ int m() {return 1;}
}

public class C2 extends C1 {
    /*default*/ int m() {return 2;}
}

public class C3 extends p1.C2 {
    /*default*/ int m() {return 3;}
}

public class C4 extends p2.C3 {
    /*default*/ int m() {return 4;}
}
Note that C4 is in the package p1. If we now change the Main code as follows:
Then it will output “4”. However, super.m() is not accessible from C4, and putting @Override on the C4.m() method will stop the code from compiling. At the same time if we change the main method to:
The output will be “3”. This means that C4.m() overrides C2.m() and C1.m(), but not C3.m(). This also makes the issue even more confusing, and the amended intuition is that a method in a subclass overrides a method in a superclass if and only if the method in the superclass is accessible from the subclass. Here superclass can be any ancestor, not necessarily the direct parent, and the relation has to be transitive.
For the kicker, try reading all of this out of the part of the JVM specification that describes how the method to be invoked is selected:
Let C be the class of objectref. The actual method to be invoked is selected by the following lookup procedure:
- If C contains a declaration for an instance method with the same name and descriptor as the resolved method, and the resolved method is accessible from C, then this is the method to be invoked, and the lookup procedure terminates.
- Otherwise, if C has a superclass, this same lookup procedure is performed recursively using the direct superclass of C; the method to be invoked is the result of the recursive invocation of this lookup procedure.
- Otherwise, an AbstractMethodError is raised.
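The quoted procedure is short enough to model directly. The sketch below is a toy, not how a real JVM is structured: the Klass class, its fields and the simplified accessibility rules are all assumptions made for this example, and every modelled class is assumed to declare its own m(), so the “C contains a declaration” condition is always true.

```java
// Toy model of the quoted lookup procedure (illustration only).
class Klass {
    enum Vis { PUBLIC, PROTECTED, DEFAULT }

    final String name;
    final String pkg;
    final Klass parent;   // direct superclass, or null
    final Vis vis;        // visibility of this class's own m()

    Klass(String name, String pkg, Klass parent, Vis vis) {
        this.name = name;
        this.pkg = pkg;
        this.parent = parent;
        this.vis = vis;
    }

    boolean isSubclassOf(Klass other) {
        for (Klass k = parent; k != null; k = k.parent) {
            if (k == other) return true;
        }
        return false;
    }

    // Simplified rule: is the m() declared in 'declarer' accessible from 'from'?
    static boolean accessible(Klass declarer, Klass from) {
        if (declarer == from) return true;
        switch (declarer.vis) {
            case PUBLIC:
                return true;
            case PROTECTED:
                return declarer.pkg.equals(from.pkg) || from.isSubclassOf(declarer);
            default:
                return declarer.pkg.equals(from.pkg);
        }
    }

    // Start at the runtime class c and walk up until we reach a class
    // from which the resolved method is accessible.
    static Klass select(Klass c, Klass resolved) {
        if (accessible(resolved, c)) return c;
        if (c.parent != null) return select(c.parent, resolved);
        throw new AbstractMethodError("m()");
    }
}

class LookupDemo {
    public static void main(String[] args) {
        Klass c1 = new Klass("C1", "p1", null, Klass.Vis.DEFAULT);
        Klass c2 = new Klass("C2", "p1", c1, Klass.Vis.DEFAULT);
        Klass c3 = new Klass("C3", "p2", c2, Klass.Vis.DEFAULT);
        // Static type C1, runtime class C3, everything package-private
        System.out.println(Klass.select(c3, c1).name); // prints C2
    }
}
```

With all three methods package-private the model stops at C2, because C1.m() is not accessible from C3 in package p2; this matches the “2” seen earlier. Switching the visibilities to PROTECTED makes it stop at C3, and adding a C4 in p1 reproduces the commenter’s example.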
Posted in Featured, meme | 54 Comments »
Correcting the Billion Dollar Mistake
Written by Jevgeni Kabanov on February 1, 2009 – 11:04 pm
Last week I visited Stockholm to speak at JFokus 2009. The event was quite spectacular, but for me the most interesting part occurred on the evening before the conference. I was sitting at the speaker’s dinner with Rickard Öberg, Kirk Pepperdine, Simon Ritter and a couple of others. For some reason or other I started talking to Simon about the problem that’s recently been on my mind. Perhaps it’s been eating me after legendary Tony Hoare said this:
I call it my billion-dollar mistake. It was the invention of the null reference in 1965. [...] This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
Now since I have a background in functional languages, I know that null references are not necessary. Null value is a member of all types, but some types don’t have or need a natural notion of absence. In fact in Haskell there is no such thing as an “absent” value generic to all types — types either have to declare it explicitly or wrap the underlying value into a Maybe type that has a dedicated Nothing value.
So e.g. the List type has a notion of an empty list encoded into the type:
-- Roughly: a list is an empty list
-- or a pair of a value and a list
data List a = Nil | Cons a (List a)
While the only way to encode an absent structure is to use Maybe:
-- To build a car you need at most one engine
-- and zero or more wheels :)
buildCar :: Maybe Engine -> [Wheel] -> Car
So while I was discussing that with Simon, Kirk banged me on the head to get my attention (just joking, banging me on the head is nowhere near enough to get my attention). Apparently they were discussing more or less the same topic with Rickard, and he had a similar solution for Java. I assumed it was the widely discussed @NotNull annotation, but it was way cooler than that.
Basically, in the beginning Rickard started by adding @NotNull support to Qi4J (BTW an amazing piece of engineering, definitely check it out), so that they could automatically insert runtime assertions for the parameters having the @NotNull annotation, something like this:
class Test {
    void transfer(
            @NotNull Account from,
            Account to,
            double amount) {
        //…
    }

    void test() {
        // Succeeds
        transfer(
            new Account("Bugs Bunny"), null, 1000000.0);
        // Throws NullPointerException
        transfer(
            null, new Account("Bugs Bunny"), 1000000.0);
    }
}
However, soon after that they discovered that they were inserting @NotNull-s everywhere. So they reversed the notion and decided that they would generate the not-null assertions for all parameters except the ones marked as @Optional. And this was the point when I thought to myself “Jevgeni, this is exactly what you were looking for!” So I turn to Rickard and I say “Rickard, this is exactly what I was looking for! This is amazing!” In fact this truly is amazing, as it exactly captures the semantics of the Maybe type that I liked so much in the functional languages.
After that we go into a discussion of whether or not it is possible to validate this assertion at compile time. Opinions were mixed on this one, though I personally am convinced that it shouldn’t be too easy (your opinion on the topic is welcome, also it will make a great master’s thesis topic). Eventually I get an idea and I say “Rickard, I bet I could implement a JavaRebel plugin that would check this at runtime in about half an hour in 50 lines of code”. And of course Rickard goes “No way!”, so I’m challenged. Next day I sit down for half an hour, then for another half an hour (there was no internet!) and voila: I have a working JavaRebel plugin (in less than 50 lines of code) that will make your methods throw an exception if you try to pass a null reference to a parameter not marked as @Optional (the plugin code deserves another post altogether). Of course I have to show this to Rickard (and he goes “No way!”) and we agree to somehow join forces to promote this sane approach (starting with having a single namespace for the @Optional annotation).
So what do we get? The previous example now looks like this:
@OptionalCheck
class Test {
    void transfer(
            Account from,
            @Optional Account to,
            double amount) {
        //…
    }

    void test() {
        // Succeeds
        transfer(
            new Account("Bugs Bunny"), null, 1000000.0);
        // Throws NullPointerException
        transfer(
            null, new Account("Bugs Bunny"), 1000000.0);
    }
}
As you can imagine, if we made all code without the @Optional annotation throw a NullPointerException for nulls, we would break a lot of existing code. Therefore, at the moment you also have to annotate each class where you want to enable these semantics with @OptionalCheck.
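The plugin code itself is left for another post. To make the semantics concrete, here is a minimal reflection-based sketch of the same rule: reject a null argument unless its parameter is marked @Optional. Everything in it (the annotation definition, NullChecks, checkArgs, Bank) is an assumption made for illustration; it is not the plugin’s code, and a real implementation would weave the check into the method body rather than have it called by hand.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical stand-in for the @Optional annotation discussed above.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
@interface Optional {}

class NullChecks {
    // Throws NullPointerException if an argument is null
    // and the matching parameter is not annotated with @Optional.
    static void checkArgs(Method method, Object... args) {
        Annotation[][] annotations = method.getParameterAnnotations();
        for (int i = 0; i < args.length; i++) {
            if (args[i] == null && !isOptional(annotations[i])) {
                throw new NullPointerException(
                    "Parameter " + i + " of " + method.getName() + " must not be null");
            }
        }
    }

    private static boolean isOptional(Annotation[] parameterAnnotations) {
        for (Annotation a : parameterAnnotations) {
            if (a instanceof Optional) return true;
        }
        return false;
    }
}

// Hypothetical class with one optional parameter, mirroring transfer() above.
class Bank {
    void transfer(Object from, @Optional Object to, double amount) {
        // ...
    }
}
```

Calling checkArgs with the transfer method and a null second argument passes, while a null first argument raises a NullPointerException, which is the behavior shown in the example.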
That’s pretty much it: you can download the plugin right away and just drop it in the classpath when JavaRebel is enabled (you’ll need a 2.0 milestone, as 1.x was missing the necessary APIs). At the moment both annotations are in the org.optionalalliance package, but when that changes all you have to do is Organize Imports, so I won’t sweat the naming too much. Please do let us know what you think of the approach and feel free to advocate it further :)
Cheers,
Jevgeni Kabanov
P.S. The plugin along with the source is also available in our Maven repository:
- URL: http://repos.zeroturnaround.com/maven2
- Group id: org.zeroturnaround
- Artifact id: javarebel-optional-check-plugin
Posted in Featured, creative | 31 Comments »
Announcing Squill: Not Another ORM
Written by Jevgeni Kabanov on December 9, 2008 – 4:24 pm
Remember that post about Typesafe DSLs that had a part one and no follow up? Well, meanwhile Juhan Aasaru and yours truly were joined by Michael Hunger of jexp.de and JEQUEL and together we have created the Squill project that came right out of the ideas in the paper we wrote with Rein Raudjärv. The announcement follows, enjoy!
It is with great pleasure that we announce the first release of Squill. Download it now or check out the quickstart guide, the step-by-step tutorial and the Devoxx presentation.
Squill is a slick internal DSL for writing SQL queries in pure Java. It uses the database metadata and generics to catch as many errors as possible during compilation and is almost completely typesafe.
At the same time it is designed to allow everything SQL allows you to do, exactly the way SQL is meant to do it. This means that you’re encouraged to select only the data you need and no hidden queries are generated for you, leaving you in full control of the query performance. Squill supports database-specific extensions, allowing you to both use advanced features and fully tweak your queries.
Squill also has special support for CRUD operations and table relations, adding some sugar over vanilla SQL. A typical Squill query looks like this:
ComplaintTable c = new ComplaintTable();

for (Tuple2<String, Integer> tuple2 :
        squill
            .from(c, c.customer)
            .where(
                gt(c.customer.isActive, 0),
                notNull(c.percentSolved),
                notNull(c.refoundSum))
            .orderBy(desc(c.customer.id))
            .selectList(
                c.customer.lastName,
                c.percentSolved)) {
    System.out.println(
        "Customer " + tuple2.v1 + " has a complaint solved " + tuple2.v2 + "%");
}
Squill is a very young project and you can follow (and help) its development by joining the user or developer mailing lists.
Tags: dsl, java, sql
Posted in Featured, creative | 2 Comments »
The Performance Cutoff
Written by Jevgeni Kabanov on November 10, 2008 – 12:36 pm
A few days ago I had a small epiphany on a simple yet important issue. I was trying to squeeze those last few percent of performance out of JavaRebel and it came to the point where I started optimizing individual hotspots method-by-method.
One of the most common ways to improve execution time of a specific method is the cutoff. For many complicated enough methods there are some inputs for which you can return immediately and cut off the main execution path. To explain let me use the following example:
Output doSomething(Input input) {
    // Do something for 200 ms
    return output;
}
This method could do anything as long as the complexity is constant; we only care how much time it takes on average. I could also have taken some algorithm (e.g. I started with matrix multiplication), but then we’d have to bring in complexity estimates and I’d like to keep it simple for now.
Let’s assume that for some inputs we could calculate the output in 50ms instead of 200ms. We introduce a check and the code is now:
Output doSomething(Input input) {
    if (isSimpleInput(input)) {
        // Do something for 50 ms
        return output;
    }

    // Do something for 200 ms
    return output;
}
This is a common way to optimize the method execution time. The question, however, is whether this has actually optimized anything. It seems like a stupid question, but let’s estimate the new average execution time.
To do that we need to know two things: the time t it takes to do the isSimpleInput() check and the proportion p of method calls that are “simple”. Let’s assume that t is 20 ms and 10% of the calls are simple. The average execution time can then be calculated using the expected value formula (EV = p * v1 + (1 - p) * v2):
EV = 0.1 * (20 + 50) + (1 - 0.1) * (20 + 200)
   = 20 + 0.1 * 50 + 0.9 * 200
   = 20 + 5 + 180 = 205
This calculation shows that with these values we have actually increased the average method execution time by 5 ms. Note that we are still winning every time the cutoff occurs (20 + 50 << 200), but we are losing on average.
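The same arithmetic gives a break-even rule: t + p * fast + (1 - p) * slow is below slow only when t < p * (slow - fast), that is, when p > t / (slow - fast). A small sketch that reproduces the numbers above (the class and method names are made up for this example):

```java
// Back-of-the-envelope helper for judging a cutoff (illustration only).
class CutoffMath {
    // Average time with a cutoff: the check is always paid,
    // then either the fast or the slow path runs.
    static double expectedTime(double p, double check, double fast, double slow) {
        return check + p * fast + (1 - p) * slow;
    }

    // Share of "simple" calls at which the cutoff exactly breaks even:
    // check + p*fast + (1-p)*slow = slow  =>  p = check / (slow - fast)
    static double breakEvenShare(double check, double fast, double slow) {
        return check / (slow - fast);
    }

    public static void main(String[] args) {
        System.out.println(expectedTime(0.1, 20, 50, 200));
        System.out.println(breakEvenShare(20, 50, 200));
    }
}
```

With a 20 ms check, the cutoff only starts to pay off once a bit more than 13% of the calls are simple; at 10% it costs 5 ms per call on average.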
The same equation can also be applied to measuring several checks or measuring execution time that is dependent on the input; however, it gets increasingly complicated. For me the main value of this is realizing that every cutoff has an inherent cost, which I will now keep in mind.
Before you think this is all just math, the epiphany that came to me was not to add a check, it was to remove one. I realized that a particular check I’m making is rare and expensive enough to directly influence the method execution time and lo and behold, removing it gave a 30% improvement for the benchmark I was using.
Tags: java, performance
Posted in Featured, creative | No Comments »