Monday, June 29, 2009

Writing Code Is Much Like Writing Prose

Writing code and writing prose have much in common. Because of this, we should be able to take the lessons learned from one and apply them to the other.

The Absolutes

In writing code and in writing prose, there are a few things that are either absolute or approach very nearly the status of absolute. This is especially true for programming languages where syntactic and semantic rules must be followed for the code to be compiled and/or interpreted correctly. Even when a programming language supports certain generally frowned-upon features, some of these features are avoided to such a large degree that they almost appear absolute. For example, direct use of "goto" is generally frowned upon and is rarely seen in most code bases. However, less obvious versions of this (such as break and continue in Java) do seem to be less strictly avoided.

Although the rules are less strictly enforced in prose, there is still significant pressure to conform to certain absolutes. For example, it is generally assumed that most professional prose will include sentences that begin with capital letters and end with periods. Similarly, proper names are almost always capitalized, and correct spelling and reasonable grammar are expected. The degree of enforcement in prose often depends on the medium. Professional papers and articles typically are held to the strictest standards, with the author and professional editors investing significant effort into polishing the prose. On the opposite end of the spectrum are e-mail messages, blogs, and Twitter messages, for which these absolutes seem to be less strictly enforced.


The Frowned-Upon

Although there are a small number of absolutes or near-absolutes as just discussed, code development and prose writing seem to have many more things that are not absolutely avoided but are strongly discouraged. These things tend to creep in despite their negative reputations because they do offer some advantages. Usually the advantage is ease for the writer, bought at the expense of a later reader or maintainer who must deal with more difficult prose or code.

Many developers recognize the problems associated with using so-called "magic numbers." However, they still seem to crop up. Often they are put in place "temporarily" and then forgotten, or limited schedules prevent replacing them with constants. These "magic numbers" are quick and easy to use when doing initial development, but can lead to a maintenance nightmare. Global variables offer a similar trade-off of easy early development at the expense of maintainability, robustness, and scalability.
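As a small illustration of the trade-off, here is a minimal sketch of replacing magic numbers with named constants. The class name, the constants, and the retry scenario are all hypothetical and chosen only for this example.

```java
public class RetryPolicy
{
   // Named constants document intent that a bare 3 or 500 in the code would hide.
   // Both values are hypothetical and exist only for this illustration.
   private static final int MAX_ATTEMPTS = 3;
   private static final long BASE_DELAY_MILLIS = 500L;

   /**
    * Returns the delay in milliseconds before the provided 1-based attempt,
    * doubling the base delay for each subsequent attempt.
    */
   public static long delayBeforeAttempt(final int attempt)
   {
      if (attempt < 1 || attempt > MAX_ATTEMPTS)
      {
         throw new IllegalArgumentException(
            "Attempt must be between 1 and " + MAX_ATTEMPTS);
      }
      return BASE_DELAY_MILLIS << (attempt - 1);
   }

   public static void main(final String[] arguments)
   {
      for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++)
      {
         System.out.println(
            "Attempt " + attempt + " waits " + delayBeforeAttempt(attempt) + " ms");
      }
   }
}
```

If the retry limit later changes, only the constant's definition needs to change, and a maintainer reading the loop does not need to guess what the number means.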

Prose authoring has similar frowned-upon, but still often used, features. For example, it is often said that sentences should not end with prepositions. Similarly, it is often said that strong, active voice should be used. The tense of the writing should also be consistent. These are all things that are recommended because they do offer recognized benefits, but they are also easy to ignore or cheat on a bit when it is not deemed worth the time or effort to satisfy all of them all of the time.


The Standards

Since nearly the beginning of software development, developers have seemed to want to create and adhere to coding standards and conventions. Of course, we also seem to have been resistant to other people’s ideas of standards and conventions for nearly as long. The reason most of us are willing to give up some "creative freedom" and adhere to standards and conventions is that we have learned that code is more readable and maintainable (especially by others) when we adhere to a minimum set of conventions.

Prose can reap the same benefits of standardization and convention. There are books such as The Elements of Style and The Chicago Manual of Style devoted to prose style. Most of the arguments in favor of these prose style conventions are the same arguments used in favor of coding conventions: the result is easier to read and maintain, and it is consistent across different readers and authors/developers.

One style issue is very similar between prose writing and code development. The subject of spaces can be surprisingly controversial in both arenas. In software development, most developers seem to agree that the optimal number of spaces for indentation is between 2 and 4 spaces. However, trying to narrow down which of these is best (2 spaces or 3 spaces or 4 spaces) is significantly more difficult.

There has been a controversy in the prose writing world regarding how many spaces should follow a period that ends one sentence before the first letter of the next sentence. I grew up thinking that two spaces was the expected number of spaces between the end of one sentence and the beginning of the next sentence. However, I was recently informed by a prose reviewer and editor that a single space is now preferred. I wondered if this was a web browser-inspired shift, but there seems to be evidence that this shift started even before the widespread adoption of HTML.


Conciseness

There seem to be differences of opinion on whether prose should be concise or verbose. To some extent, this depends on the subject of the prose. I prefer technical prose to be as concise as possible while still remaining thorough. This is especially true of technical references. However, with novels, extra verbosity can sometimes be nice to explain the story and character development. This can even go too far, for my taste, as evidenced by Moby Dick.

I have found that even in software development there is a wide diversity of opinion about conciseness of code. The longer I work in the industry, the more I value conciseness. However, I know many Java developers who don’t like the same degree of conciseness that I like. An example of this is the Java ternary operator. This operator has really grown on me, but I still know many Java developers who do not care for it at all. Although many developers are migrating to programming languages that emphasize and value conciseness, there are limits to how concise we want to be. After all, none of us are probably too excited to write and maintain production code that can be sneakily squeezed into a single line.
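To make the ternary operator discussion concrete, here is a minimal sketch that writes the same logic both ways. The class and method names are hypothetical; which form reads better is exactly the matter of taste described above.

```java
public class TernaryExample
{
   /** Verbose form: an if/else block selects one of two values. */
   public static String describeWithIfElse(final int count)
   {
      String label;
      if (count == 1)
      {
         label = "item";
      }
      else
      {
         label = "items";
      }
      return count + " " + label;
   }

   /** Concise form: the ternary (conditional) operator selects the value inline. */
   public static String describeWithTernary(final int count)
   {
      return count + " " + (count == 1 ? "item" : "items");
   }

   public static void main(final String[] arguments)
   {
      System.out.println(describeWithIfElse(1));
      System.out.println(describeWithTernary(5));
   }
}
```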


Refactoring

"Refactoring" is a popular term in software development, but it does have its equivalent in prose writing. In my case, I find that when I revisit my own articles, I continually "refactor" the text to make it flow better, to reduce unnecessary repetition, and to make it generally more concise. The editors of formal articles often do this to an even greater degree. In fact, the editors’ reviews often remind me of how eager developers are to change other people’s code to match their own preferences. Some of the "refactorings" I see in both development and in article editing have marginal value. However, I think most of us can agree that some "refactoring" or editing is useful and recommended for writing code or for writing prose.

When the editors and reviewers at Oracle Technology Network recommended cutting my original draft of the Basic JPA Best Practices article to less than half its original draft size, it took some effort to "refactor" that article to that point. Although some minor details and some explanatory text were removed in the process, most of the substance was retained even though the final article had half as many words as my original draft. It was not trivial trimming that draft down without losing too much substantial content, but the effort the reviewers, editors and I invested is reflected in the improvements. That article still weighs in around 11 pages, but it is leaner, tighter, and more optimized than my original draft. That sounds awfully similar to the benefits of code refactoring. The process really was like refactoring because it was more than just removing words; it involved changing words and changing sentence structure and paragraph structure.

I don’t spend much time reviewing my blog posts at the time of their writing. I think this is common among blogs, though some are exceptions. Because of this, most of us expect blog posts to be rougher than formal articles. There are definitely different expectations for different forms of writing. Similarly in code, prototypes and demonstration code can often be a little "rougher around the edges" than highly reviewed and refactored production code.


More Knowledge Means More Expressiveness

When writing prose, one of the most useful techniques for writing concise but thorough prose is to know and carefully use the appropriate words and phrases. Words have different nuances and these nuances can be used to provide more expressiveness with the same number of words. In code, we see the same thing. There is often more than one way to get the job done, but thorough understanding of the language’s features and provided class libraries allows us to select the most appropriate language feature or class that provides the exact nuanced solution appropriate to the problem at hand.

When writing prose, it is common to use well-known idioms and phrases to imply much more than the few words would normally imply. For example, "a picture is worth a thousand words" consists of only seven words but implies much more than what we might say with only seven words. Some assumed knowledge is required (readers must be familiar with the idiom) to make this work. In code, we often use design patterns and other common phrases to succinctly describe much larger ideas that otherwise would require much more description.


The Value of Review

I have found that both code and prose that I write benefit when reviewed by someone else. When I write my own code or prose, I know what I am trying to say and it all makes sense. Reviewers of articles and reviewers of code can ask questions about what is intended and provide feedback that makes the code or article more generally appealing. Sometimes we’re too close to the product for our own good and the reviewer can help us to see things that we don’t see.


"Readable" is in the Eye of the Beholder

To some degree, what is "readable" depends on the person doing the reading. This applies to both prose and code. Readers (whether reading prose or someone’s code) have their own preferences. Just as we all like different prose authors’ writing, it is not surprising that we each find different styles of code easier or more difficult to read. For example, I have an easier time reading code written by people who have similar tastes and preferences to mine. For this reason, I don’t think we’ll ever see a single programming language or framework that everyone uses. There is just too wide a spectrum of opinion for any one language or framework to appeal to everyone. This is also important to keep in mind when writing code or prose. You can try to appeal to the widest audience possible, but no matter what you do there will probably be at least a small group of people who don’t like it.


Conclusion

Writing prose and writing code have much in common. Many of the same techniques that make better prose also make better code. In both cases, knowing what one has to work with (vocabulary and common phrases for prose and language features and class libraries for code) can make it easier to write particularly effective prose or code. Both types of writing also benefit tremendously from review. Many of the same controversies surround both types of writing.

Saturday, June 27, 2009

Java Enums Are Inherently Serializable

More than once, I have seen code such as the following (without the comments I have added to point out flaws), in which a well-intentioned Java developer has ensured that their favorite Enum explicitly declares that it is Serializable and has even provided a serialVersionUID for it.


import java.io.Serializable;

/**
 * Enum example with unnecessary and ignored serialization specification
 * details. The Enum is already Serializable and attempts to control its
 * serialization behavior are ignored. See Section 1.12 ("Serialization of Enum
 * Constants") of the "Java Object Serialization Specification Version 6.0".
 */
public enum StateEnum implements Serializable
{
   ALABAMA("Alabama", "AL"),
   CALIFORNIA("California", "CA"),
   COLORADO("Colorado", "CO"),
   IDAHO("Idaho", "ID"),
   UTAH("Utah", "UT"),
   WYOMING("Wyoming", "WY");

   // Don't do this: Don't specify serialVersionUID for enums and don't use
   // an arbitrary constant such as 42L for all versions; use serialver on Sun JDK
   private static final long serialVersionUID = 42L;

   private String stateName;
   private String stateAbbreviation;

   StateEnum(final String newStateName, final String newStateAbbreviation)
   {
      this.stateName = newStateName;
      this.stateAbbreviation = newStateAbbreviation;
   }
}


Because enums are automatically Serializable (see Javadoc API documentation for Enum), there is no need to explicitly add the "implements Serializable" clause following the enum declaration. Once this is removed, the import statement for the java.io.Serializable interface can also be removed. If you have any doubts about Enum being Serializable, run the HotSpot-provided serialver tool against your favorite enum that does not declare itself Serializable. The tool will return 0L for all enums. When a class is not Serializable, this tool returns the message "Class --yourClassNameHere-- is not Serializable." An example of this is shown in the next screen snapshot.



The fact that serialver returns 0L for the enum’s serialVersionUID indicates that the enum is indeed Serializable. The Javadoc also indicates this. A third way to prove this to yourself is to use the instanceof operator as shown in the next code sample.


import java.io.Serializable;

public class UsesStateEnum
{
   private StateEnum state;

   public UsesStateEnum(final StateEnum newState)
   {
      this.state = newState;
   }

   public StateEnum getState()
   {
      return this.state;
   }

   public void verifyEnumIsSerializable()
   {
      System.out.print("StateEnum instance of Serializable? ");
      System.out.println(this.state instanceof Serializable ? "yes" : "no");
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Verify Enum is Serializable");
      final UsesStateEnum me = new UsesStateEnum(StateEnum.COLORADO);
      me.verifyEnumIsSerializable();
   }
}


As mentioned above, all Enums have a serialVersionUID of 0L. Therefore, it is not necessary to specify one as is shown in the code above. In fact, when one is specified, it is ignored anyway. The example above intentionally used the hard-coded 42L used in Joshua Bloch’s Effective Java example of how not to create a serialVersionUID. As the screen snapshot below indicates, this explicitly specified value is ignored anyway:



The above screen snapshot also demonstrates an advantage of running serialver against a class to generate the serialVersionUID rather than making up an arbitrary long value such as 42L. By using the tool, we get the 0L result for all enums and improve our chances of remembering that all enums have 0L for this value and do not need it explicitly specified.

Although it does not hurt anything to unnecessarily specify that an enum implements Serializable or to even provide an ignored serialVersionUID, I prefer not to include these. One might argue that at least adding "implements Serializable" communicates the intent to have an enum be Serializable, but my feeling is that this is a fundamental part of the language since J2SE 5 and such communication should be unnecessary. When building a class that needs to be Serializable, using enum constituent pieces can be treated just the same as using Strings and primitives and the reference types corresponding to primitives.

All of the details I demonstrated and explained in this blog posting related to Enums being inherently Serializable are concisely described in two paragraphs of Section 1.12 ("Serialization of Enum Constants") of the Java Object Serialization Specification.
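As a complement to serialver and instanceof, the same behavior can be observed programmatically. The following is a minimal sketch, assuming Java 7 or later for try-with-resources; the class name and the nested State enum are hypothetical stand-ins rather than the StateEnum shown earlier. It serializes an enum constant that declares neither "implements Serializable" nor a serialVersionUID, reads it back, and asks ObjectStreamClass for the serialVersionUID that the serialization runtime reports.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

public class EnumSerializationDemo
{
   /** Hypothetical enum with no Serializable clause and no serialVersionUID. */
   public enum State { COLORADO, UTAH, WYOMING }

   /** Serializes the provided object to a byte array and reads it back. */
   public static Object roundTrip(final Object original)
   {
      try
      {
         final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
         try (ObjectOutputStream out = new ObjectOutputStream(bytes))
         {
            out.writeObject(original);
         }
         try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray())))
         {
            return in.readObject();
         }
      }
      catch (IOException | ClassNotFoundException serializationEx)
      {
         throw new IllegalStateException("Round trip failed", serializationEx);
      }
   }

   /** Returns the serialVersionUID the serialization runtime reports for a type. */
   public static long reportedSerialVersionUid(final Class<?> type)
   {
      return ObjectStreamClass.lookup(type).getSerialVersionUID();
   }

   public static void main(final String[] arguments)
   {
      final Object copy = roundTrip(State.COLORADO);
      System.out.println("Same constant after round trip? " + (copy == State.COLORADO));
      System.out.println(
         "Reported serialVersionUID: " + reportedSerialVersionUid(State.class));
   }
}
```

Because enum constants are serialized by name, the deserialized value is the very same constant, so an identity comparison (==) is appropriate here.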


Additional Resources

Java Object Serialization Specification

Serialization of Enum Constants

Object Serialization: Frequently Asked Questions

Into the Mist of Serialization Myths

Flatten Your Objects: Discover the Secrets of the Java Serialization API

Java Serialization Algorithm Revealed

Thursday, June 25, 2009

Viewing Names Bound to RMI Registry

When working with Java Remote Method Invocation (RMI), there are times when it is helpful to know which names are currently bound to a particular rmiregistry on a particular host/port combination. This is especially true when debugging an RMI client that cannot look up a server object because the requested name is not bound in the registry (NotBoundException), or an RMI server that cannot bind its name because that name is already bound in the registry (AlreadyBoundException).

A simple Java application can be written that provides all named bindings for an RMI registry on a particular host and port. The simple application demonstrated in this posting takes advantage of standard Java classes such as LocateRegistry, Registry, and other classes and exceptions in the java.rmi and java.rmi.registry packages. The code for this application is shown next.


RmiPortNamesDisplay.java

package dustin.examples.rmi;

import java.rmi.ConnectException;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;

/**
 * Display names bound to RMI registry on provided host and port.
 */
public class RmiPortNamesDisplay
{
   private static final String NEW_LINE = System.getProperty("line.separator");

   /**
    * Main executable function for printing out RMI registry names on provided
    * host and port.
    *
    * @param arguments Command-line arguments; Two expected: first is a String
    *    representing a host name ('localhost' works) and the second is an
    *    integer representing the port.
    */
   public static void main(final String[] arguments)
   {
      if (arguments.length < 2)
      {
         System.err.println(
            "A host name (String) and a port (Integer) must be provided.");
         System.err.println(
            "\tExample: java dustin.examples.rmi.RmiPortNamesDisplay localhost 1099");
         System.exit(-2);
      }

      final String host = arguments[0];
      // Default RMI registry port; still used if the provided port cannot be parsed.
      int port = 1099;
      try
      {
         port = Integer.parseInt(arguments[1]);
      }
      catch (NumberFormatException numericFormatEx)
      {
         System.err.println(
              "The provided port value [" + arguments[1] + "] is not an integer."
            + NEW_LINE + numericFormatEx.toString());
      }

      try
      {
         final Registry registry = LocateRegistry.getRegistry(host, port);
         final String[] boundNames = registry.list();
         System.out.println(
            "Names bound to RMI registry at host " + host + " and port " + port + ":");
         for (final String name : boundNames)
         {
            System.out.println("\t" + name);
         }
      }
      catch (ConnectException connectEx)
      {
         System.err.println(
              "ConnectException - Are you certain an RMI registry is available at port "
            + port + "?" + NEW_LINE + connectEx.toString());
      }
      catch (RemoteException remoteEx)
      {
         System.err.println("RemoteException encountered: " + remoteEx.toString());
      }
   }
}



To test out the above application, I can start any service exposing an RMI interface. For this example, I have started a GlassFish domain as shown in the next screen snapshot.



The port on which GlassFish exposes its JMX RMI interface for management and monitoring is highlighted in the screen snapshot and is 8686. When I run the simple RMI port names display application shown above on the same host on which I ran GlassFish, I can use "localhost" as the host. When I run the above Java application, I see two bound names on the RMI registry on localhost at port 8686. This is shown in the next screen snapshot.



From the results shown in the above image, we see that GlassFish exposes two named services on port 8686: jmxrmi and management/rmi-jmx-connector.

The simple application shown above uses standard Java libraries and classes, but also has a "script" feel. It seems like what would really work well here is a script language that uses Java classes. In other words, a scripting language that runs on the JVM such as JRuby or Groovy seems like the perfect fit. With that in mind, the next code listing shows a Groovy implementation of the application written above in traditional Java.

rmiPortNamesDisplay.groovy

import java.rmi.ConnectException
import java.rmi.RemoteException
import java.rmi.registry.LocateRegistry
import java.rmi.registry.Registry

if (args.length < 2)
{
   println "A host name (String) and a port (Integer) must be provided."
   println "\tExample: groovy rmiPortNamesDisplay localhost 1099"
   System.exit(-2)
}

host = args[0]
port = 1099
try
{
   port = Integer.valueOf(args[1])
}
catch (NumberFormatException numericFormatEx)
{
   println "The provided port value '${args[1]}' is not an integer."
   System.exit(-1)
}

registry = LocateRegistry.getRegistry(host, port)
boundNames = registry.list()
println "Names bound to RMI registry at host ${host} and port ${port}:"
boundNames.each{println "\t${it}"}


The above Groovy script, like the Java application from which it was adapted, can be run on the command line. This is shown in the next screen snapshot.



Besides the obvious syntactic differences, one advantage of writing something like this in Groovy is that it does not require an explicit compilation step. The Java application above had to be compiled into a Java .class file first and then executed. With Groovy, this is all done implicitly so that to the user it just feels like running a text-based script. I am especially fond of using Groovy in cases like this where I wish to combine scripting features with accessibility to the JVM and standard Java libraries.


Conclusion

When working with RMI, there are times when it is important to know which named services are already bound to an RMI registry at a given host and port. This blog posting has demonstrated use of the LocateRegistry, Registry, and other relevant classes to do this in a fairly easy manner with traditional Java and with Groovy. It is a straightforward process to extend the Java application and Groovy script shown in this blog posting to cover multiple hosts and ports to, in effect, search for RMI registered names.
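One possible shape for the multiple-port extension mentioned above is sketched here. This is only an illustration under stated assumptions: the class name RmiRegistryScanner and its methods are hypothetical, and treating every RemoteException (including ConnectException, which is a subclass) as "no names found" is a simplification that hides the reason a port did not answer.

```java
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RmiRegistryScanner
{
   /** Returns names bound at host and port, or an empty list if no registry answers. */
   public static List<String> namesBoundAt(final String host, final int port)
   {
      try
      {
         final Registry registry = LocateRegistry.getRegistry(host, port);
         return Arrays.asList(registry.list());
      }
      catch (RemoteException remoteEx)
      {
         // Simplification: any remote failure is reported as "nothing bound here."
         return Collections.emptyList();
      }
   }

   /** Queries each provided port on the host, preserving the order of the ports. */
   public static Map<Integer, List<String>> scan(final String host, final int... ports)
   {
      final Map<Integer, List<String>> namesByPort =
         new LinkedHashMap<Integer, List<String>>();
      for (final int port : ports)
      {
         namesByPort.put(port, namesBoundAt(host, port));
      }
      return namesByPort;
   }

   public static void main(final String[] arguments)
   {
      if (arguments.length < 2)
      {
         System.err.println("Usage: java RmiRegistryScanner <host> <port> [<port> ...]");
         return;
      }
      final int[] ports = new int[arguments.length - 1];
      for (int index = 1; index < arguments.length; index++)
      {
         ports[index - 1] = Integer.parseInt(arguments[index]);
      }
      for (final Map.Entry<Integer, List<String>> entry
              : scan(arguments[0], ports).entrySet())
      {
         System.out.println(arguments[0] + ":" + entry.getKey() + " -> " + entry.getValue());
      }
   }
}
```

Running it as "java RmiRegistryScanner localhost 1099 8686" would print one line per port. Covering multiple hosts is a matter of wrapping the scan call in another loop.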

Wednesday, June 24, 2009

Thread Analysis with VisualVM

Although jstack (Java Stack Trace) is a useful tool for learning more about how a Java thread is behaving, VisualVM is an even easier method for obtaining the same type of information.

It is easy to run jstack as demonstrated in the next screen snapshot:



Only the top portion of the generated stack trace information is shown above. The output of an entire jstack run looks similar to that which follows:


2009-06-24 23:25:26
Full thread dump Java HotSpot(TM) Client VM (14.0-b16 mixed mode, sharing):

"RMI Scheduler(0)" daemon prio=6 tid=0x04bb4c00 nid=0x126c waiting on condition [0x048ef000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x2492dac8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
at java.util.concurrent.DelayQueue.take(DelayQueue.java:160)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:583)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:576)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)

Locked ownable synchronizers:
- None

"RMI TCP Accept-0" daemon prio=6 tid=0x023bfc00 nid=0xa74 runnable [0x0483f000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
- locked <0x2492dc48> (a java.net.SocksSocketImpl)
at java.net.ServerSocket.implAccept(ServerSocket.java:453)
at java.net.ServerSocket.accept(ServerSocket.java:421)
at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:34)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
at java.lang.Thread.run(Thread.java:619)

Locked ownable synchronizers:
- None

"Low Memory Detector" daemon prio=6 tid=0x02348000 nid=0xec4 runnable [0x00000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:
- None

"CompilerThread0" daemon prio=10 tid=0x02343000 nid=0x108 waiting on condition [0x00000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:
- None

"Attach Listener" daemon prio=10 tid=0x02342800 nid=0xf6c waiting on condition [0x00000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:
- None

"Signal Dispatcher" daemon prio=10 tid=0x02339c00 nid=0xac runnable [0x00000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:
- None

"Finalizer" daemon prio=8 tid=0x022f4000 nid=0xd14 in Object.wait() [0x044cf000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x248ee9a0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x248ee9a0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

Locked ownable synchronizers:
- None

"Reference Handler" daemon prio=10 tid=0x022f2c00 nid=0x1198 in Object.wait() [0x0240f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x248eea28> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked <0x248eea28> (a java.lang.ref.Reference$Lock)

Locked ownable synchronizers:
- None

"main" prio=6 tid=0x00209000 nid=0x1134 runnable [0x002af000]
java.lang.Thread.State: RUNNABLE
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
at java.lang.StringBuilder.append(StringBuilder.java:119)
at dustin.examples.tools.AnalyzableImpl.loopProvidedNumberOfTimes(AnalyzableImpl.java:48)
at dustin.examples.tools.AnalyzableImpl.main(AnalyzableImpl.java:80)

Locked ownable synchronizers:
- None

"VM Thread" prio=10 tid=0x022f1400 nid=0x10c runnable

"VM Periodic Task Thread" prio=10 tid=0x02348c00 nid=0x278 waiting on condition

JNI global references: 872



VisualVM makes it easy to monitor application threads. VisualVM offers the capability to generate and view the jstack-generated stack trace. The first way to do this is to right-click on the appropriate Java process and select the option "Thread Dump." This will generate a thread dump file whose name appears under the selected Java process as shown in the following screen snapshot.



A second way to get the thread dump generated in VisualVM is to use the "Threads" tab and click on the button "Thread Dump." This button is demonstrated in the next screen snapshot.



Whether the right-click option is used or the "Thread Dump" button is pressed, VisualVM generates a jstack thread dump output file and displays it as shown in the next screen snapshot.



As discussed and demonstrated, VisualVM allows for easy generation of a stack trace dump with jstack. However, VisualVM provides much more than that for thread analysis.

The screen snapshot above that showed the "Thread Dump" button also shows VisualVM's Timeline tab, which displays the live threads in real time. The colored horizontal bars represent individual threads. Their display is enabled because the "Threads visualization" checkbox is checked.

Another useful view under the "Threads" tab is the "Table" view, which provides a textual overview of the threads. This is demonstrated in the next screen snapshot.



Any individual thread can be clicked on to see the detailed view for that thread, which is also found under the "Threads" tab in VisualVM. When "Threads visualization" is checked, a pie chart is included that graphically indicates what each thread is doing. This is demonstrated in the next screen snapshot.




Conclusion

The jstack tool is a useful command-line tool, but VisualVM provides the same capabilities along with a significantly improved presentation and details.
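For cases where neither tool is at hand, similar thread information can be gathered from inside the running application. The following is a minimal sketch using the standard ThreadMXBean. It is a different technique from jstack and VisualVM, its output format is only loosely modeled on the jstack output shown earlier, and the class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch
{
   /** Builds a simple textual dump of all live threads in the current JVM. */
   public static String dumpThreads()
   {
      final ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
      final StringBuilder dump = new StringBuilder();
      // false, false: skip locked monitor and ownable synchronizer details.
      for (final ThreadInfo info : threadBean.dumpAllThreads(false, false))
      {
         dump.append('"').append(info.getThreadName()).append("\" ")
             .append(info.getThreadState()).append('\n');
         for (final StackTraceElement frame : info.getStackTrace())
         {
            dump.append("\tat ").append(frame).append('\n');
         }
         dump.append('\n');
      }
      return dump.toString();
   }

   public static void main(final String[] arguments)
   {
      System.out.print(dumpThreads());
   }
}
```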

Heap Dump and Analysis with VisualVM

In previous blog posts, I have covered using VisualVM to acquire HotSpot JVM runtime information in a manner similar to jinfo and how to use VisualVM in conjunction with JMX and MBeans in a manner similar to JConsole. This blog posting looks at how VisualVM can be used to generate and analyze a heap dump in a manner similar to that done with command-line tools jmap and jhat.

The jmap (Java Memory Map) tool is one of several ways that a Java heap dump can be generated. The Java Heap Analysis Tool (jhat) TechNotes/man page lists four methods for generating a heap dump that can be analyzed by jhat: jmap, JConsole (Java Monitoring and Management Console), HPROF, and the dump written when an OutOfMemoryError occurs and the -XX:+HeapDumpOnOutOfMemoryError VM option has been specified. A fifth approach that is not listed, but is easy to use, is Java VisualVM. (By the way, another method is use of the MXBean called HotSpotDiagnosticMXBean and its dumpHeap(String, boolean) method.)

The jmap tool is simple to use from the command line to produce a heap dump. It can be used against a running Java process whose process ID (pid) is known (available via jps) or against a core file. In this post, I'll focus on using jmap with a running process's ID.
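The HotSpotDiagnosticMXBean approach mentioned in passing can be sketched as follows. This is a minimal sketch, assuming a HotSpot-based JVM and Java 7 or later (for ManagementFactory.getPlatformMXBean); the class name HeapDumper and its method are hypothetical. The output file must not already exist, and newer JDKs also require the file name to end in ".hprof".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

public class HeapDumper
{
   /**
    * Dumps the heap of the current JVM to the provided path.
    *
    * @param outputPath Path of the dump file to create; must not already exist.
    * @param liveObjectsOnly If true, only reachable objects are dumped.
    * @return The file to which the heap dump was written.
    */
   public static File dumpHeapTo(final String outputPath, final boolean liveObjectsOnly)
   {
      final HotSpotDiagnosticMXBean diagnosticBean =
         ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
      try
      {
         diagnosticBean.dumpHeap(outputPath, liveObjectsOnly);
      }
      catch (IOException ioEx)
      {
         throw new IllegalStateException("Unable to write heap dump to " + outputPath, ioEx);
      }
      return new File(outputPath);
   }

   public static void main(final String[] arguments)
   {
      final String path = arguments.length > 0 ? arguments[0] : "sketch-heap.hprof";
      final File dump = dumpHeapTo(path, true);
      System.out.println("Heap dump written to " + dump.getAbsolutePath());
   }
}
```

The resulting file can then be opened with the same analysis tools as a dump produced by jmap.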

The jmap page states that jmap is an experimental tool with relatively limited capabilities on Windows that may not be available with future versions of the JDK. This page also lists options available to specify how jmap should generate a heap dump.

The following screen snapshot shows how jmap can be used to dump a heap.



The generated dump file, dustin.bin in this case, is binary as shown in the next screen snapshot.



The binary heap dump can be read with the jhat tool. The jhat implementation included with Sun's Java SE 6 replaces HAT, which was formerly available as a separate download. It is almost trivial to run jhat. One need only invoke jhat on the heap dump file generated with jmap (or an alternative dump generation technique) as shown in the next screen snapshot.



With the heap dump generated (jmap) and the jhat tool invoked, the dump can be analyzed with a web browser. The output on the console tells us that the dump is available on port 7000 (this default port can be overridden with the -port option). When I run the browser on the same machine on which I ran jhat, I can use localhost for the host portion of the URL. The starting page using localhost and port 7000 is shown in the next screen snapshot.



Arbitrary Object Query Language (OQL) statements can be written to find necessary details in the heap dump. The jhat-started web server includes OQL help at the URL http://localhost:7000/oqlhelp/. See also Querying Java Heap with OQL for more details on how to use OQL. However, one can often find what one needs simply using the already provided information and moving between pieces of information using the provided hyperlinks.

The following screen snapshot demonstrates one of the more useful pages available thanks to jhat's web server-based output of the heap dump. This page shows the number of instances of various Java objects, including platform objects.



A significant aid in understanding what these web pages generated by jhat mean is the VM Specification on Class File Format. In Section 4.3.2 ("Field Descriptors") of this document, there is a table that shows the mapping of field descriptor characters to the data types we use. According to this table, "B" indicates a byte, "C" indicates a char, "D" indicates a double, "F" indicates a float, "I" indicates an int, "J" indicates a long, "L<someClassName>;" indicates a reference (instance of a class), "S" indicates a short, "Z" indicates a boolean, and "[" indicates an array.

So far, I have looked at using jmap and jhat from the command line to generate a heap dump and provide a web browser-based method for analyzing the generated heap dump. Although these tools are relatively easy to use, VisualVM provides similar functionality in an even easier approach.
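These descriptor characters can be seen without opening the specification, because Class.getName() uses the same encoding for array types. The following is a minimal sketch with a hypothetical class name. Note that getName() separates package names with dots, whereas descriptors in class files use slashes.

```java
public class DescriptorDemo
{
   /** Returns the JVM's encoded name for the provided array type. */
   public static String encodedName(final Class<?> arrayType)
   {
      return arrayType.getName();
   }

   public static void main(final String[] arguments)
   {
      System.out.println(encodedName(byte[].class));     // [B
      System.out.println(encodedName(char[].class));     // [C
      System.out.println(encodedName(int[].class));      // [I
      System.out.println(encodedName(long[].class));     // [J
      System.out.println(encodedName(boolean[].class));  // [Z
      System.out.println(encodedName(String[].class));   // [Ljava.lang.String;
      System.out.println(encodedName(int[][].class));    // [[I
   }
}
```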

One method for generating a heap dump in VisualVM is to simply right-click on the desired process and select "Heap Dump". This method is shown in the next screen snapshot.



This generates the heap dump as indicated by its name underneath the Java process.



A second approach for generating a heap dump with VisualVM is to click on the Java process of interest so that relevant tabs ("Overview", "Monitor", "Threads", and "Profiler") come up in VisualVM. Selecting the "Monitor" tab provides the "Heap Dump" button as shown in the next screen snapshot.



Clicking on the "Heap Dump" button leads to a heap dump being generated just as it was with the right click option described above. This is shown in the next screen snapshot, which happens in this case to show the "Summary" tab of the analyzed heap dump.



In addition to the "Summary" tab of the heap dump analysis, other interesting details from the heap dump are presented in the "Classes" tab. This tab includes horizontal bar charts that graphically indicate the percentage of total instances that are associated with each class. An example is shown in the next screen snapshot.



The displayed classes are spelled out rather than using symbols like those described above for jhat-based heap dump analysis. One can right-click on any class in the "Classes" tab and select "Show in Instances View" to see details on each individual instance of the selected class. This is shown in the next screen snapshot.




Conclusion

VisualVM provides several advantages when creating and analyzing heap dumps. First, everything from creation to analysis is in one place. Second, the data is provided in what may be considered a more presentable format with graphical support. Finally, other tools can also be used in VisualVM in conjunction with the heap dump analysis. VisualVM provides one-stop shopping for many of the development, debugging, and performance analysis needs of the Java developer.


Additional References

Troubleshooting Java SE

Troubleshooting Guide for Java SE 6 with HotSpot JVM (PDF)

Java SE 6 Performance White Paper

What's in My Java Heap?

Analyzing Java Heaps with jmap and jhat

Java Memory Profiling with jmap and jhat

Tuesday, June 23, 2009

JMX 2 Postponed Until Java SE 8

It was disappointing, but not altogether surprising, to learn that JMX 2 will not be part of Java SE 7. Anyone who saw my Colorado Software Summit 2008 presentation JMX Circa 2008 is aware of how excited I was for some of the new features that were tentatively planned for Java SE 7. Java Management Extensions (JMX) is already a highly useful technology, but the advancements in JMX hoped for in Java SE 7 are convenient and welcome. With the long period of time between Java SE 6 and Java SE 7, it likely means quite a wait for JMX 2 features that are now expected with Java SE 8.

Sunday, June 14, 2009

Java Developers' Thoughts on 2009 JavaOne and the Future of JavaOne

I have found the results of the current and previous survey on Java.net to be interesting. These two surveys, and indeed three of the last four Java.net surveys, have been related to JavaOne.

As always, there are several disclaimers when analyzing the results of these surveys: the survey is not scientific, the sample size is relatively small compared to the population of Java developers, the sample demographics may not be indicative of the feelings of the wider and more general Java development community, and the survey could be stuffed or otherwise cheated if someone actually saw an advantage to doing so. Despite these disclaimers, I generally have found these survey results to be typical of the things I see, read, and hear from Java developers in other forums.

The current Java.net survey question asks, "Will there be a JavaOne conference in 2010?" So far, with over 120 votes, the overwhelming consensus (over 2/3 of the total votes) is "Yes" (there will be a 2010 JavaOne Conference). This indicates to me that Java developers are generally confident that Oracle will continue shepherding the Java community in a fashion similar to that used by Sun. Two-thirds of the developers responding so far are confident that 2009 JavaOne was not the last.

The previous Java.net question seems to fit into the idea that Java developers feel Oracle will continue running Java similarly to how Sun has. That question asked, "What was the most significant event about JavaOne 2009?" One-third of the (currently 180) responses indicate that "Larry Ellison's appearance" at 2009 JavaOne was the most significant event of this year's JavaOne. As I wrote in the blog posting Questions Answered: First Day of 2009 JavaOne, many attendees at JavaOne did seem relieved at what Larry Ellison had to say in his brief appearance during the opening day keynote. I think most people liked what they heard. I think some of the quotes captured in Justin Kestelyn's blog posting Larry Ellison on the Future of Java were particularly well received.

The somewhat surprising (to me) second most popular choice for "most significant" happening at 2009 JavaOne (with nearly 1/4 of the vote at this point) was the core Java SDK presentations. I am happy to see that; it indicates to me that there are many others out there besides myself who are generally happy with the basics of the Java language and the Java platform and want to continue learning about it and seeing innovation in that area.

It is easy to read some overly enthusiastic blog postings (and more so the anonymous responses to blog posts) and think that Java is dead and that anyone interested in Java is just behind the times. However, events like JavaOne and the follow-up to this massive event are reminders that Java is still among the most widely used languages out there and that many of us are still able to provide customers with outstanding results using the Java language and the Java platform.

Monday, June 8, 2009

Acquiring JVM Runtime Information

I have found the tools and utilities that come with the Sun-provided Java SDK (especially the monitoring and management tools) to be very useful in daily Java development. I have previously blogged on the use of several of these handy tools such as jps, JConsole, and VisualVM. Although my previous blog posts have generally focused on using a single particular tool to accomplish something, I am using this posting to look at how to acquire JVM runtime information using several different tools.

There are many reasons one might desire JVM runtime information. These include the ability to figure out why an application runs differently in two different environments, the ability to use other tools and utilities based on runtime information (such as Java process ID), and the ability to identify Java processes that are not closing even when politely asked.

jps

The jps command (Java Virtual Machine Process Status Tool) is one of the tools I use most often in my Java development. I have covered it more thoroughly before, but I quickly describe it here for convenience. The jps command options I generally use are "l" and "m." The overall command, jps -lm, provides the process IDs of all running Java processes on that machine, and the options provide enough detail to know which IDs go with which process descriptions. There are some permissions issues that can potentially reduce the usefulness of jps, and the jps man page warns that "this utility is unsupported and may not be available in future versions of the JDK," but I have found this simple command-line tool to be enormously helpful.
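When the process of interest is the very application being written, it can report its own identity for comparison with the jps output. The following is a minimal sketch; the class name WhoAmI is made up for this illustration. The Javadoc for RuntimeMXBean.getName() allows the returned name to be an arbitrary string, so the "pid@hostname" form commonly seen on Sun's HotSpot JVM should be treated as an implementation detail rather than a guarantee.

```java
import java.lang.management.ManagementFactory;

/**
 * Report the name that the running JVM assigns to itself.
 * The class name is hypothetical, chosen for this sketch.
 */
public class WhoAmI
{
   /**
    * Provide the name of the currently running JVM.
    *
    * @return Name of the running JVM; often, but not necessarily,
    *    of the form "pid@hostname".
    */
   public static String runtimeName()
   {
      return ManagementFactory.getRuntimeMXBean().getName();
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Running JVM's name: " + runtimeName());
   }
}
```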

jinfo

One of the ways to run the jinfo (Java Configuration Info) tool is to run it against a Java process. The jps tool just mentioned can be used to retrieve the appropriate Java process ID.

The -help flag can be used to see how to use jinfo. There are fewer options available on Windows than on Linux. The options available to jinfo on Windows are shown in the next screen snapshot.



Even with the Windows jinfo, I can view individual Virtual Machine (VM) options. This is demonstrated in the next screen snapshot.



How do you know which VM options are available? In Windows, jinfo does NOT support the -flags option, which is supposed to provide a list of VM options. However, the HotSpot VM options are available at http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp and Troubleshooting Guide for Java SE 6 with HotSpot VM also contains several of them. This limited version of jinfo for Windows has only been available since Java SE 6.

On Linux, the -flags option is supported and can be used to see all flags associated with the given Java process. The Linux version of jinfo also supports the -sysprops option to view the system properties associated with the particular application, and it can be run with no option at all to see both system properties and command-line flags. The Linux version of jinfo is obviously more powerful than the Windows version.


JConsole

Command-line tools like jps and jinfo have several advantages such as reduced overhead, potentially faster performance, and availability to be used in non-interactive scripts. However, there are also situations when a graphical interface is preferred for ease of use or for improved presentation. JConsole supports visual representation of virtual machine information and has been available as a standard part of the Sun-provided JDK since J2SE 5 (and was improved for Java SE 6).

JConsole is particularly easy to use with Java SE 6 because the Attach API enables JConsole to automatically look up Java processes running on the same host and under the same user without any special properties being specified for the Java application being monitored (can be done with J2SE 5 if appropriate system properties are first specified). This automatic detection of local processes in Java SE 6 is shown in the next screen snapshot.



As the screen snapshot above indicates, it is easy to identify Java process IDs in JConsole as they are explicitly shown ("PID" column) in the first screen. When one clicks on one of these processes, the "Connect" button is enabled. When that button is clicked on, JConsole connects to the selected Java process.
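The same Attach API can be used directly to list local Java processes, much as jps and JConsole's connection dialog do. The following is a minimal sketch; the class name LocalJvmLister is made up for this illustration. It assumes a full JDK (not just a JRE) is in use, and with Java SE 6 it further assumes that the JDK's tools.jar is on the classpath because that is where the com.sun.tools.attach package resides.

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.ArrayList;
import java.util.List;

/**
 * List the local Java processes visible through the Attach API.
 * The class name is hypothetical, chosen for this sketch.
 */
public class LocalJvmLister
{
   /**
    * Provide one description per local Java process that was found.
    *
    * @return Descriptions consisting of process ID and display name.
    */
   public static List<String> describeLocalJvms()
   {
      final List<String> descriptions = new ArrayList<String>();
      for (final VirtualMachineDescriptor descriptor : VirtualMachine.list())
      {
         // id() is typically the process ID; displayName() is typically
         // the main class or JAR along with its arguments.
         descriptions.add(descriptor.id() + " " + descriptor.displayName());
      }
      return descriptions;
   }

   public static void main(final String[] arguments)
   {
      for (final String description : describeLocalJvms())
      {
         System.out.println(description);
      }
   }
}
```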

Once JConsole connects to a selected Java process, six tabs are present by default (and JConsole allows creation of additional custom tabs). The "VM Summary" tab is useful for getting a highly descriptive summary of the characteristics and settings of the JVM. An example of this is shown in the next screen snapshot.



There is significantly more that can be gleaned from JConsole, but for now I'll focus on how to obtain the type of information provided by jinfo. The VM options can be obtained via the MXBean provided with the Sun implementation of Java called com.sun.management.HotSpotDiagnostic. The getVMOption operation on this MXBean allows the name of the desired VM option to be specified in a text field and that option's value is returned when the "getVMOption" button next to the text field is clicked. The next screen snapshot demonstrates this (note that this all takes place in the "MBeans" tab).
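The same getVMOption operation can also be invoked from code running inside the JVM of interest. The following is a minimal sketch that assumes a Sun HotSpot JVM (the com.sun.management classes are not part of the standard Java SE API); the class name VmOptionViewer and the default option name used in main are choices made for this illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.io.IOException;
import java.lang.management.ManagementFactory;

/**
 * Look up the value of a single HotSpot VM option via JMX.
 * The class name is hypothetical, chosen for this sketch.
 */
public class VmOptionViewer
{
   /**
    * Provide the current value of the named VM option.
    *
    * @param optionName Name of VM option, such as "HeapDumpOnOutOfMemoryError".
    * @return Value of the VM option as a String.
    * @throws IllegalArgumentException Thrown if there is no such VM option.
    */
   public static String vmOptionValue(final String optionName)
   {
      try
      {
         final HotSpotDiagnosticMXBean diagnosticBean =
            ManagementFactory.newPlatformMXBeanProxy(
               ManagementFactory.getPlatformMBeanServer(),
               "com.sun.management:type=HotSpotDiagnostic",
               HotSpotDiagnosticMXBean.class);
         final VMOption option = diagnosticBean.getVMOption(optionName);
         return option.getValue();
      }
      catch (IOException ioEx)
      {
         throw new IllegalStateException("Unable to reach HotSpotDiagnostic MXBean", ioEx);
      }
   }

   public static void main(final String[] arguments)
   {
      final String optionName =
         arguments.length > 0 ? arguments[0] : "HeapDumpOnOutOfMemoryError";
      System.out.println(optionName + " = " + vmOptionValue(optionName));
   }
}
```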



JConsole allows system properties to be accessed via its "MBeans" tab as well. They are accessed via the MXBean java.lang.Runtime. The next screen snapshot demonstrates how this appears when this MXBean is selected.



The only problem at this point is that the value javax.management.openmbean.TabularDataSupport is not exactly the type of thing we're looking for when we want to see system properties. Fortunately, one can simply click on this bold string and the entire area expands as shown in the next screen snapshot.



As the last screen snapshot demonstrates, one system property is shown at a time. The "Tabular Navigation" buttons with < and > icons can be used to move through the various properties. In this case, the displayed java.endorsed.dirs property is the 9th of 53 properties.
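The java.lang.Runtime MXBean's system properties can also be read programmatically, which avoids stepping through them one at a time. The following is a minimal sketch; the class name SystemPropertiesLister is made up for this illustration, and sorting the properties by name is simply a presentation choice.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;

/**
 * List the running JVM's system properties as exposed by the Runtime MXBean.
 * The class name is hypothetical, chosen for this sketch.
 */
public class SystemPropertiesLister
{
   /**
    * Provide system properties sorted by property name.
    *
    * @return System properties of the running JVM sorted by name.
    */
   public static Map<String, String> sortedSystemProperties()
   {
      return new TreeMap<String, String>(
         ManagementFactory.getRuntimeMXBean().getSystemProperties());
   }

   public static void main(final String[] arguments)
   {
      for (final Map.Entry<String, String> property : sortedSystemProperties().entrySet())
      {
         System.out.println(property.getKey() + " = " + property.getValue());
      }
   }
}
```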

Besides being more aesthetically pleasing, JConsole offers the advantage of working well and fully on Windows as well as on Linux. The fact that JConsole is constructed to generically handle exposed JMX MBeans makes its use virtually limitless. Not only can it be used to monitor and manage standard properties provided with the JVM, but it can even be used to manage and monitor custom JMX-enabled applications.


VisualVM

VisualVM has been available as a distinct open source product for some time, but has been included with Sun's JDK since Java SE 6 Update 7. It provides advantages of JConsole such as visualization of memory and performance data and generic access to JMX MBeans.

Like the jconsole command used to run JConsole, the jvisualvm command for running VisualVM is located in the same directory as the other Sun-provided Java tools such as java, javac, jps, and jinfo. Although I have discussed jinfo and VisualVM together before, I look at them together again here for convenience.

The next screen snapshot shows the VisualVM startup screen. As in JConsole, local Java processes running under the same user are automatically displayed with the process ID in parentheses.



Clicking on the desired process leads to several tabs being shown. The "Overview" tab is shown in the next screen snapshot.



In the bottom right corner, there is a pair of tabs nested within this "Overview" tab. In the screen snapshot above, the "System Properties" tab is selected. As shown, all the properties are shown at once and a scroll bar on the right can be used to scroll down the list. The other nested tab can be used to see any VM arguments.

To see VM options, one can use the JMX-exposed MXBean used with JConsole. To access MBeans from VisualVM, the "MBeans" plug-in should be installed as I have blogged about previously. Once the plug-in is installed, the "MBeans" tab can be accessed and used in a manner very similar to how the "MBeans" tab is used in JConsole. The next screen snapshot shows viewing of one of eight VM options this way.



As with JConsole, there is much more to VisualVM than is shown here. However, as these simple examples have shown, a wide variety of information about the JVM is available via VisualVM, and JMX-exposed applications can be accessed generically from this tool.


Other Tools

Because the Sun JVM is widely instrumented to support monitoring and management via JMX, any JMX client (generic or custom) can be used to monitor and manage the JVM. However, I see very little reason to use other tools for JMX-based JVM monitoring and management with the ready availability of JConsole and VisualVM. For even more sophisticated non-Java client access of JVM information, compatible web services clients (those that implement WS-Management, such as winrm) can manage and monitor JVMs via the JMX Web Services connector.


Conclusion

There are many alternatives available for acquiring information related to the Sun JVM. These options get better and more prevalent with each new major Java release. This blog posting has attempted to demonstrate how the command-line tools jinfo and jps and the graphical tools jconsole and jvisualvm are readily available and easily provide information on the JVM.

Learning Java via Simple Tests

On forums dedicated to answering questions for people who are new to Java programming (such as the Sun forum called New to Java), a common frustration vented by many of the "regulars" is when people posting questions have not even bothered to search for something that has already been frequently answered or is easily answered with Google, Bing, or other search engines. This is often considered even more egregious if the very question has been answered in that very forum already. Indeed, as I've written about before, web searches and forums can be invaluable tools for the software developer.

Although the combination of today's powerful online search engines with countless blogs, articles, and forums makes it easier than ever to learn how and why to perform just about any software development task, there are still advantages to simply trying some things out for oneself. When a developer gets used to writing simple tests to find out how something works, the developer can sometimes create and run these tests nearly as quickly as the answer could be found online. Even better, I have found that I often learn more and remember better if I have tested it myself. Another advantage of the simple test approach is that it can help us determine differences between versions of Java or between JVM implementations in cases where implementation-specific details are allowed. Finally, just the act of writing simple tests and running them simply (often without an IDE) helps keep foundational skills sharp.

Production development can be very different from writing simple tests and demonstrations to learn how to use something. In this blog post, I intend to discuss some small things that many experienced developers do all the time to learn via simple testing and demonstrations. Note that I am not talking about tests such as unit tests, functional tests, and integration tests. Rather, I am talking about tests that answer questions that often start with, "What does Java do if..."


The Common Text Editor

For these simple tests to learn how something works, the overhead of starting an IDE, creating an IDE project, and other steps associated with using an IDE are often unnecessary. Although I will almost always use an IDE for production development (other than for quick fixes in which I don't need any of the IDE features), I often find myself using a simple text editor such as JEdit, vim, emacs, or even WordPad.


Template Java Application Class

For most of my simple tests/demonstrations, I need a simple Java class with a main function. I have typed in the minimal skeletal class for this thousands of times, but I know that many developers like to keep a "template" class around for such cases. It will often look like that shown in the next code listing.

ClassName.java - Template Java Class
// TODO - Add package declaration here to avoid unnamed package scope

// TODO - Replace 'ClassName' with appropriate class's name in all locations
//        (search and replace suggested).

/**
 * TODO - Add class description here.
 */
public class ClassName
{
   /**
    * Main executable function for this class.
    *
    * @param arguments Command-line arguments for this application.
    */
   public static void main(final String[] arguments)
   {
      final ClassName me = new ClassName();
   }

   /**
    * Provide String representation of me.
    *
    * @return String representation of me.
    */
   public String toString()
   {
      return "String representation not yet provided for " + this.getClass().getName(); 
   }
}


A simple template could be even more minimal than that shown above. For example, the toString() implementation is often unnecessary when writing simple tests.


Building the Simple Test

Thanks to the prevalence of IDEs and highly useful products such as Ant, I have seen that some experienced Java developers have become unfamiliar with using the javac Java compiler and the java Java application launcher. However, when running simple tests to learn a new API or to answer one's own question about "what does Java do when ...", these can be quick to use. I typically use javac when I only need to compile a small number of .java files in the same directory. If it gets more complicated than that, I move on to a simple Ant build.xml file. I especially like to move from command-line building directly with javac to using Ant when there are multiple dependencies that need to be explicitly declared on the classpath (though scripting the javac call to use a lengthy classpath can also be a useful tactic).

To make Ant even easier and quicker to use, I often use a template build.xml file similar to the template .java file shown earlier. The next code listing demonstrates what this template Ant build file might look like.

build-template.xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Look for XXXXX and replace with appropriate String. -->
<project name="XXXXX" default="all" basedir=".">
<description>XXXXX</description>

<property environment="env"/>

<property name="javac.debug" value="true" />
<property name="src.dir" value="src" />
<property name="dist.dir" value="dist" />
<property name="classes.dir" value="classes" />
<property name="javadoc.dir" value="${dist.dir}/javadoc" />

<property name="jar.name" value="XXXXX.jar" />
<property name="jar.filesonly" value="true" />

<path id="java.example.classpath" />

<target name="-init">
   <mkdir dir="${classes.dir}" />
   <mkdir dir="${dist.dir}" />
</target>

<target name="compile"
        description="Compile the Java code."
        depends="-init">
   <javac srcdir="${src.dir}"
          destdir="${classes.dir}"
          classpathref="java.example.classpath"
          debug="${javac.debug}"
          includeantruntime="false" />
</target>

<target name="jar"
        description="Package compiled classes into JAR file"
        depends="compile">
   <jar destfile="${dist.dir}/${jar.name}"
        basedir="${classes.dir}"
        filesonly="${jar.filesonly}">
   </jar>
</target>

<target name="all"
         description="Compile Java source, assemble JAR, and generate documentation"
         depends="jar, javadoc" />

<target name="javadoc" description="Generate Javadoc-based documentation">
   <mkdir dir="${javadoc.dir}" />
   <javadoc doctitle="XXXXX"
            destdir="${javadoc.dir}"
            sourcepath="${src.dir}"
            classpathref="java.example.classpath"
            private="true"
            author="Dustin" />
</target>

<target name="clean" description="Remove generated artifacts.">
   <delete dir="${classes.dir}" />
   <delete dir="${dist.dir}" />
</target>

</project>


This template file can be renamed build.xml and the XXXXX tokens can be replaced with text specific to the particular test being executed. Often, this is all I need for the simple tests and I can expand the filesets for Java files in multiple directories. Of course, as things become complicated past a certain point, it often indicates to me that it is time to return to the IDE anyway.

UPDATE (1 October 2010): With Ant 1.8 and above, I also like to set the "includeantruntime" attribute of the "javac" task to "false" to NOT include Ant's runtime classes in my compile-time classpath and to avoid the associated warning.


Handling Dynamic Input

Many of the simple "what happens if I..." tests that I write are perfectly useful with static configuration. In these environments, I shamelessly hard-code values to demonstrate the desired function's behavior. However, there are times when my tests need values to be provided at runtime. Java offers a plethora of approaches for supplying data dynamically. These include loading properties, reading environment variables, using command-line arguments, reading from files, and reading from standard input with mechanisms such as System.in and Console.

For the simplistic demonstrative tests that I'm talking about in this blog posting, I generally prefer command-line arguments or command-line input/output. The Console class has made it much simpler to read in values from a console since Java SE 6, but it does not work when redirection is used (see section below on output) because there is no applicable console (System.console() returns null). Command-line arguments work particularly well when I do want to use redirection or want to run my demonstrative tests as part of scripts.
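One way to combine these two preferences is to try the command-line argument first and to prompt via Console only when a console is actually available. The following is a minimal sketch; the class name DynamicInputDemo, the method resolveName, and the prompt text are all made up for this illustration.

```java
import java.io.Console;

/**
 * Obtain a value from a command-line argument if one was provided, from
 * the console if one is available, or from a default as a last resort.
 * The class and method names are hypothetical, chosen for this sketch.
 */
public class DynamicInputDemo
{
   /**
    * Determine the name to be used.
    *
    * @param arguments Command-line arguments; first argument wins if present.
    * @param defaultName Name to use if no argument and no console input.
    * @return Name from argument, from console, or the default.
    */
   public static String resolveName(final String[] arguments, final String defaultName)
   {
      if (arguments.length > 0)
      {
         return arguments[0];
      }
      // System.console() returns null when there is no interactive console
      // (such as when output is redirected).
      final Console console = System.console();
      if (console != null)
      {
         final String line = console.readLine("Enter a name: ");
         if (line != null && !line.isEmpty())
         {
            return line;
         }
      }
      return defaultName;
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Hello, " + resolveName(arguments, "World"));
   }
}
```

With this approach, the same test can be run interactively or with an argument from within a script without any code changes.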

One could also write special clients in Swing or Flex or other technology, but that is almost certainly overkill for the simple types of tests I am talking about unless one of the purposes of the test is to understand interaction between clients such as these.


Generating Output of Simple Test

When writing these simple tests to learn or prove how something works in Java, one of the most important steps is to output results. A key point here is that the output should include the particular thing being tested and the result for that particular thing.

The java.util.logging package is very easy to use, especially when one only wants to use its default settings. However, most of the advantages of using a logging framework are not really that important in this environment of simple tests. Therefore, I find myself using the old-fashioned but still highly useful System.out and System.err. I also have come to really like using Console (via System.console()) that was introduced with Java SE 6, especially for getting interactive user input.


Capturing Output

If I have a lot of output in my simple test or if I want to provide the output in an e-mail, blog, on a Wiki page, or in any other form at a later time, it is useful to capture the text output from my simple Java tests. I could adjust the test code itself to write to a file, but it is often easier to simply use the operating system's redirection support to redirect the output from the console to the specified file.

Both Windows and Linux/Unix support output redirection. In fact, both use the greater-than operator (>) to redirect standard output. In both operating systems, the > operator overwrites any file already at the target location (assuming permissions allow this). In Linux and in Windows, one can append to an existing file with the double greater-than operator (>>).

If the test class outputs to standard error rather than or in addition to standard output, then standard error may need to be redirected to a file for later use as well. In Linux, this redirection of standard output and standard error to the same file can be accomplished with the syntax 2>&1 (placed after the filename that the > or >> operator redirects to). For Windows, redirection of both standard output and standard error is also accomplished with the 2>&1 notation placed after the name of the file to which output is redirected via the > or >> operators.
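A tiny class that writes one line to each stream makes it easy to see these redirection operators in action. The following is a minimal sketch; the class name OutputStreamsDemo and the file name results.txt mentioned afterward are made up for this illustration.

```java
/**
 * Write one line to standard output and one line to standard error so
 * that the effects of redirection operators can be observed.
 * The class name is hypothetical, chosen for this sketch.
 */
public class OutputStreamsDemo
{
   public static void main(final String[] arguments)
   {
      System.out.println("OUT: written to standard output");
      System.err.println("ERR: written to standard error");
   }
}
```

Running "java OutputStreamsDemo > results.txt" places only the "OUT" line in the file and leaves the "ERR" line on the console, while running "java OutputStreamsDemo > results.txt 2>&1" places both lines in the file.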

For more details on redirection in the major operating systems, see the following additional resources:
* Unix Power Tools: Using Standard Input and Output
* The Linux Cookbook: Redirecting Input and Output
* Batch Files - Redirection
* Windows: Using Command Operations Redirection

As mentioned above, output redirection precludes the use of java.io.Console because there is no applicable Console in such situations.


An Example

I conclude this blog post with an example. The two questions to be answered in this example are "What is the default initial capacity of a StringBuffer and is that different than the default initial capacity of a StringBuilder?" The answers to these questions are undoubtedly available online. However, these are perfect examples of where we can enjoy the advantages of knowing the exact answers to these questions quickly and confidently by writing simple tests that answer the questions.

The following code listing for InitialCapacities.java demonstrates how easy it is to answer these questions. Using the template Java class makes this really easy to generate quickly.

InitialCapacities.java
/**
 * Determine initial capacity of StringBuffer and StringBuilder.
 */
public class InitialCapacities
{
   /**
    * Main executable function for this class.
    *
    * @param arguments Command-line arguments for this application.
    */
   public static void main(final String[] arguments)
   {
      final StringBuffer buffer = new StringBuffer();
      System.out.println("Initial StringBUFFER Capacity: " + buffer.capacity());

      final StringBuilder builder = new StringBuilder();
      System.out.println("Initial StringBUILDER Capacity: " + builder.capacity());
   }

   /**
    * Provide String representation of me.
    *
    * @return String representation of me.
    */
   public String toString()
   {
      return "Test to indicate initial capacities of StringBuffer and StringBuilder";
   }
}


This example is only a single class without any third-party class dependencies, so I'll just use trusty old javac directly here:

javac InitialCapacities.java


This generates an InitialCapacities.class file in the same directory. I can easily run that with the Java application launcher:

java InitialCapacities


The output shows that StringBuffer and StringBuilder have the same capacity of 16. The output is shown in the next screen snapshot.



With this demonstrative test written, I can use it later on different JVMs, different operating systems, and in generally different environments to prove the value is always the same or determine when it is not the same. Likewise, the next time someone asks me about this particular behavior, I could simply pass along this test to them as well.
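When the intent is to compare results across environments, it can help to have the test label its own output with the environment in which it ran. The following is a minimal sketch of that idea; the class name CapacityReport and the format of the output line are made up for this illustration.

```java
/**
 * Report the default initial capacities of StringBuffer and StringBuilder
 * along with details of the JVM and operating system that produced them.
 * The class name and output format are hypothetical, chosen for this sketch.
 */
public class CapacityReport
{
   /**
    * Provide a single line describing the environment and the capacities.
    *
    * @return Line with JVM vendor, JVM version, operating system name, and
    *    default initial capacities of StringBuffer and StringBuilder.
    */
   public static String report()
   {
      return System.getProperty("java.vendor") + " " + System.getProperty("java.version")
         + " on " + System.getProperty("os.name")
         + ": StringBuffer=" + new StringBuffer().capacity()
         + ", StringBuilder=" + new StringBuilder().capacity();
   }

   public static void main(final String[] arguments)
   {
      System.out.println(report());
   }
}
```

Lines captured this way from different machines can be pasted side by side without losing track of which environment produced which result.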



Conclusion

Simple demonstrative tests can be a useful learning tool in the Java developer's toolbox. These simple tests can supplement information available online and in other resources and offer unique advantages not always as easily obtained with other means. This blog post has attempted to show the few simple steps a new Java developer can take to start using these small demonstrative tests when appropriate.

2009 JavaOne Sessions Available Online

Slides for many of the 2009 JavaOne technical sessions are now available for download via the Content Catalog. The username and password needed to download the slides are provided on this page as well.

Slides for many of the presentations are also available directly from many of the presenters. Examples of this include Monitoring and Troubleshooting Java Platform Applications with JDK Software, Project Coin (small Java SE 7 changes), and A RESTful Approach to Identity-based Web Services.