Optimizing for Speed in J2ME :: Extreme Tips for Lightning-Fast MIDlets
2007-03-31
Scope of Article
Although the tips in this article are not specific only to J2ME, they are essential for maintaining the performance of MIDlets in spite of the limitations of mobile devices.
In most cases, optimizing for speed comes at the cost of increased space usage. Developing for MIDlets is a careful balance of these 2 optimization concerns. Early optimization can render code difficult to read and maintain and the most of the tips found here should be taken towards the end of the development phase and not from the beginning.
1. Avoid synchronization when possible
It is said that code within synchronized contexts is roughly 4 times slower than ordinary code. Regardless of Java VM implementation a lock must be inspected and acquired upon an object each time the synchronized context is entered and must be unlocked upon exit. Threads which cannot acquire a lock must wait for the lock to be released. These factors should be compelling reasons to avoid synchronization where possible.
2. Using pre-computing
If your MIDlet uses 3D/2.5D graphics you will no doubt be using sines, cosines and possibly tangents along with angles. These operations are processing intensive so why not create pre-computed lookup arrays in your code? Your CPU will thank you for it.
There are cases besides 3D graphics that benefit from pre-computing but all take the form of lookup tables/arrays where values known at design-time are spared computation at runtime.
3. Array-spreading
Accessing arrays comes at a performance penalty and multi-dimension arrays are even worse. They can be optimized by making them uni-dimensional.
Example:
// Before
int[][] table; // a 4x4 table
// After
int[] table; // a 1x16 table
As a bonus uni-dimensional arrays consume less heap memory!
4. For-Loop Unrolling
Loops are a convenient feature but the bytecode they generate is wasteful. Iterating 15 times over a block of logic means also executing a 15 comparisons and 15 increments.
Example:
void printMsg() {
for (int loop = 0; loop < 15; loop++) {
System.out.println(msg);
}
}
Is functionally the same as…
void printMsg() {
int loop = 0;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
System.out.println(msg);
if (loop >= 15) {
return;
}
loop++;
}
I apologize for making you scroll so far, but it was necessary to impress upon you how loops make our lives easier but also abstract the inner workings so well that many programmers don’t realize their costs.
If, at design time, you know the number of iterations it can be beneficial to unroll your loop like such…
void printMsg() {
for (int loop = 0; loop < 15; loop += 5) {
System.out.println(msg);
System.out.println(msg);
System.out.println(msg);
System.out.println(msg);
System.out.println(msg);
}
}
Do you see what has improved? This for-loop will only iterate 3 times. The message will still be printed out 15 times as it was before but there will be only 3 comparisons and 3 additions upon the ‘loop’ variable. The output is the same but it takes less instructions to accomplish because there was less loop management.
One caveat about loop-unrolling
You might be tempted to apply this trick on all your for-loops but you must keep one thing in mind before doing so. Loop-unrolling generates larger code and can cause the body of the loop to be so big that it cannot fit entirely within the CPU’s cache. When instructions cannot entirely fit within the cache, the cache contents have to swapped in and out and this will render your MIDlet’s performance probably worse than had you not unrolled it at all. Refrain from going overboard with loop-unrolling.
5. For-loop tightening
The goal of for-loop tightening is to remove logic that does not require repeated execution from appearing within the loop body.
Example:
// Bad
for (int loop = 0; loop < 10; loop++) {
int a = 5;
int b = 10;
int c = a * b + loop;
}
// Good
int a = 5;
int b = 10;
for (int loop = 0; loop < 10; loop++) {
int c = a * b + loop;
}
In the good example a and b are assigned once only. That’s a total of 20 assignments that were avoided.
A lesser known tightening technique
Unless absolutely necessary, never call a method in the conditional portion of a loop. It will be called with each iteration. A common example and an easy optimization is given below.
//Bad
for (int loop = 0; loop < msgs.size(); loop++) {
System.out.println(msgs.get(loop));
}
// Good
int msgCount = msgs.size();
for (int loop = 0; loop < msgCount; loop++) {
System.out.println(msgs.get(loop));
}
In the bad example msgs.size() is called with each iteration. In the good example it is called once. This is of course assuming msgs doesn’t grow or shrink during the execution of our loop.
6. Avoiding interface method calls
In Java bytecode there are 4 types of method invocations. Below they are listed in fastest-to-slowest order.
Static method invocation does not use an instance and therefore does not need to abide by the rules for polymorphism and this means an instance type lookup is not required when calling it. The reference type is used instead and this is known at compile time.
Calls a normal method.
Special invocations are calls to constructors, private methods and super class methods.
Requires a search for an appropriate implementation of the interface method prior to invocation and is therefore the slowest.
What types of methods you call can have an impact on your application design so it is a good idea to keep these facts in mind during the development cycle.
Calling static methods offers the best performance as there is no lookup at runtime for the method that will be invoked. In contrast, calling interfaces requires two lookups.
7. Avoiding unnecessary array lookups
Indexing an array isn’t the fastest of actions. If there’s a value you can reuse, place it in a plain variable and refer to that instead.
8. Avoiding argument usage
When you call a non-static method you are implicitly passing the “this” reference to it along with any other parameters. Since the Java VM is a stack-based one these arguments need to be pushed and then popped from that stack upon each method invocation. In some cases you can place these parameters in instance variables and avoid stack operations altogether.
Example:
// Bad
return multiply(int a, int b) {
return a * b;
}
// Good
int result;
int a;
int b;
void multiply() {
result = a * b;
}
From a high-level perspective there is very little difference between these two styles but the Good one is actually faster. It gets even better if all the variables and the method are declared “static”. This way the “this” reference isn’t sent to the multiply method.
9. Avoid local variables
Local variables are instantiated on and removed from the stack with each method call. Where possible use (and reuse) instance variables which are allocated from the heap upon object instantiation and garbage collected at a later time.
10. Avoiding getters/setters
For reasons mentioned above calling getter/setter methods is wasteful. I don’t usually recommend breaking from Java Beans calling conventions but towards the end of my development cycles I usually trim these methods from my code. This is a rare case where optimizing for speed also optimizes for space.
11. Strength Reduction
Performance-wise, not all mathematical operations are equal. While additions are subtractions are fast, multiplications, divisions, and modulus are slow. Sometimes bitwise operators can make operations even faster as we’ll see.
// Bad
int a = 5 * 2;
// Good
int a = 5 + 5;
// Better
int a = 5 << 1; // Shift 5 (0000 0101) 1-bit to the left to become 10 (0000 1010).
// Best
int a = 10; // Trick question, if you know the result at design-time don't compute it!
Strength reduction techniques have been in practice since at least the DOS game programming days and are perhaps one of the only optimizations that one might care to make early on as it can impact the overall design of the MIDlet.
In closing
This is just a sample of optimizations that can be easily applied to just about any MIDlet with minimal effort. While it might be difficult to guage their effectiveness (especially across various mobile devices), properly used they cannot have a negative impact on performance. The value in these tips comes mostly from the gained understanding of the bytecode that results from our programming styles and how it is up to us to keep it optimal.
Entry Filed under: optimization, programming (advanced). .
1. Gibby’s Blog » Blog Archive » 1 size does not fits all | 2007-03-31 at 6:36 pm
[...] are other types of optimization techniques and some have trades off where you can increase the speed, but use up more heap memory etc etc. So [...]