Incremental Java
Compiling

You write programs in Java. However, computers run a language called machine language. Machine language is just a bunch of 0's and 1's, and deals with the internals of a CPU. Even when translated into a more readable form called assembly language, it is far from Java.

Any program that runs must eventually be expressed in machine code. Rather than make you write machine code, we use a program called a compiler. A compiler translates code written in one language into code written in another language.

Thus, a compiler converts Java programs to machine code, which can then be run on the computer.

Well, almost. The Java compiler converts a Java program to something called bytecode, which resembles assembly language but is not tied to any particular CPU. For this reason, bytecode is said to be machine-independent.

When you run a program, you are really having the JVM run the bytecode. JVM is short for Java Virtual Machine. This isn't a machine at all. It is yet another program, and its job is to run bytecode.

Originally, JVMs were quite slow. Translating and running anything, even bytecode, is much slower than running machine code (i.e., running code natively).

So why doesn't a Java compiler convert Java code directly to machine code? It could do that, but the result would not be portable. The original idea for Java was to distribute code over the Web. You would download the code from the Web to your computer, then run it.

If this code had been written in machine language, then it would only work on CPUs that could run that machine language. For example, if it were written in IA32 then only x86 machines (such as Intel and AMD chips) could run the code. Macintoshes (which use PowerPC chips) would be unable to run the machine language of IA32.

Each kind of computer has its own JVM. PCs, Macs, and Suns all have JVMs that are built specifically for the CPUs in those machines. When you download bytecode, you don't need to modify the bytecode. You run it on the local JVM.
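To make the idea of a "local JVM" more concrete, here is a small sketch of a program that asks the JVM which kind of machine it is running on. The class name PlatformInfo and the method describe are made up for this illustration. The same compiled bytecode could be copied to a different kind of computer and would report that computer's details instead, without being recompiled.

```java
// PlatformInfo.java: the same bytecode reports different details
// depending on which machine's JVM runs it.
class PlatformInfo {
    // Hypothetical helper: combines two standard system properties
    // that the local JVM fills in for the machine it runs on.
    static String describe() {
        String os = System.getProperty("os.name");   // operating system name
        String cpu = System.getProperty("os.arch");  // CPU architecture
        return os + " / " + cpu;
    }

    public static void main(String[] args) {
        System.out.println("Running on " + describe());
    }
}
```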

Compiling is pretty easy. Usually you type a command like:

% javac Foo.java

javac is the name of the compiler. This command expects a Java source file, whose name ends in .java. Assume you've written a Java program in which you have defined a class called Foo, and that the file's name is Foo.java.

The compiler attempts to translate Foo.java to bytecode. If it is successful, the compiler generates a file called Foo.class. (The original file Foo.java is left unchanged.)
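As a concrete case, a minimal Foo.java might look like the sketch below. The class name Foo comes from the text above. The greeting method and its message are made up for illustration.

```java
// Foo.java: a minimal program that javac can translate to Foo.class.
class Foo {
    // Hypothetical helper, not part of any library.
    static String greeting() {
        return "Hello from Foo";
    }

    public static void main(String[] args) {
        System.out.println(greeting()); // prints "Hello from Foo"
    }
}
```

Once javac Foo.java succeeds, the bytecode in Foo.class can be run on the JVM with the command java Foo.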

If it is unsuccessful, this means you've made a syntax error. This is an error in the way you wrote your program. Essentially, you made a grammatical error.
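To see what such a grammatical rule looks like, consider the sketch below. The class Greeter and its greet method are invented for this example. The program is correct as written, and the comment points out one small change that would turn it into a program the compiler rejects.

```java
// Greeter.java: compiles as written.
class Greeter {
    static String greet(String name) {
        // Every statement must end with a semicolon. Deleting the
        // semicolon at the end of the next line would be a syntax
        // error, and javac would not produce Greeter.class until
        // the semicolon was put back.
        String message = "Hello, " + name;
        return message;
    }

    public static void main(String[] args) {
        System.out.println(greet("world")); // prints "Hello, world"
    }
}
```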

The Java compiler can usually tell you on which line it found the error, but the error messages aren't usually very helpful. You have to learn what the error messages mean. It usually takes a few months of programming, with some help from someone who knows what most of the error messages mean and how to correct them. This doesn't mean a suitably intelligent and persistent person couldn't work out the error messages on their own.

If there is an error, then you need to go fix it. The compiler tells you which line to look at. Most text editors indicate which line the cursor is on, so hopefully you can find the line number easily.

Sometimes the Java compiler spits out up to 30 errors. This can be scary to a first-time programmer. You may wonder why your program has so many errors, and what you should do. The easiest way to manage the errors is to fix the first error, then fix the second error. As with any set of tasks, you solve them one at a time.

As you fix these errors, you will discover one of two things. Sometimes fixing one error fixes several others. The compiler can get confused, and think one error is really two or three errors. This happens more than you might expect, which is why seeing 30 errors doesn't always mean there are 30 errors.
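Here is a sketch of how a single mistake can show up as several errors. The class Counter and the method countVowels are made up for this example. The code compiles as written. The comments describe what would likely happen if the declaration of count were misspelled.

```java
// Counter.java: compiles as written.
class Counter {
    static int countVowels(String text) {
        // Suppose this line were mistyped as "int cuont = 0;".
        // The compiler would no longer know any variable named
        // "count", so each later use of it would be reported as a
        // separate error, even though only one line needs fixing.
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = Character.toLowerCase(text.charAt(i));
            if ("aeiou".indexOf(c) >= 0) {
                count++;       // would be reported as an error
            }
        }
        return count;          // would be reported as another error
    }

    public static void main(String[] args) {
        System.out.println(countVowels("compiler")); // prints 3
    }
}
```

Correcting the one misspelled declaration would make both reported errors disappear at once.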

Occasionally, fixing one error creates more errors. This is because the compiler can get confused, and not recognize an error because there's a much more glaring error that it's noticed first.

A beginning programmer spends a lot of time with these syntax errors, and it takes a few months (or perhaps weeks) to learn how to correct these errors. We won't spend a great deal of time on it though.

Getting a Compiler

Here's another technology issue. How do you get a Java compiler? Sun, a computer company, develops Java. You can go to their site, java.sun.com, to download a version of the JDK, which is short for Java Development Kit. There is a standard version of this called the SDK.

If you're lucky, it's already been installed on your machine, and all you have to do is use it.

IBM has a compiler called jikes which compiles very quickly. It's also supposed to produce good error messages when compiling.

If in doubt, just get the JDK from Sun. In general, you want to get the latest non-beta version. For example, if you go to Sun's website, and click on Downloads on the left, it takes you to a webpage, where on the right side, it lists popular downloads. The latest version, as of this writing, of the standard edition is called J2SE 1.4.2 SDK.

You download the version that's most appropriate to the machine you're working on. This may require special privileges (called admin privileges), or it may not.

Again, this is an example of dealing with technology. Getting set up to program is often harder than programming. What's worse, the setup seems to vary from one programming language to the next, and from one company to the next. If you haven't had lots of experience downloading and installing software, you may wish to find a friend or expert who's willing to teach you how to do it. Take good notes, because you may have to do it again the next time a new version comes out.

Summary

A compiler has two purposes. First, if the program has no errors, it creates bytecode, which can then be run on a JVM. Recall that a JVM is just another program which runs your program.

Second, and perhaps more importantly, it tells you about errors that you've made in your program. These errors are essentially "grammar" errors in Java. Java has very strict rules about how to write valid code. If you break the rules, the compiler tells you quickly.

Such errors include misspelled words, unmatched braces, some problems with types, and so forth.

The fact that a program compiles doesn't guarantee that it actually works. It would be fantastic if that were true, but it isn't. The closest analogy I can think of is a word processor that says your essay has no spelling errors. Just because it has no spelling errors doesn't mean it's a good essay. Even if the word processor is more powerful and can check for grammatical errors, a lack of grammatical errors does not mean you've written a good essay.

Similarly, the compiler can tell you about many errors, but it has no idea what your program is supposed to do. Only you know that (or maybe your teacher or colleagues do too). Compilers are "smart" but not that smart.
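The sketch below shows a program that compiles without complaint yet does not do what its author intended. The class Average and both of its methods are made up for this example. The first method is meant to compute the average of two whole numbers, but it divides two integers, so the fractional part is thrown away before the result is returned.

```java
// Average.java: compiles cleanly, but one method gives the wrong answer.
class Average {
    // Intended to return the average, but (a + b) / 2 is integer
    // division, so any fraction is discarded. The compiler accepts it.
    static double buggyAverage(int a, int b) {
        return (a + b) / 2;
    }

    // Dividing by 2.0 forces floating-point division and keeps the fraction.
    static double average(int a, int b) {
        return (a + b) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(buggyAverage(1, 2)); // prints 1.0
        System.out.println(average(1, 2));      // prints 1.5
    }
}
```

The compiler cannot flag the first method, because it has no way of knowing that 1.5 was the answer you wanted.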