Java-Scala Interoperability

To many people, Java is programming. It’s the only programming language most of my non-technical friends could name off-hand, and the first one that many programmers learn. Since its initial appearance in 1995, Java’s usage has steadily grown.

There are many reasons for this. The design of the Java platform, in which Java code compiles to bytecode that runs on the Java Virtual Machine (JVM), allows Java to be a “write once, run anywhere” language. That is, Java code that works on my computer will work exactly the same on your computer, or a server in the cloud (in theory, at least). Its language features, especially static typing and object-orientation, lend it to robust, modular, rock-solid code. Due to years of JVM and Java development, the platform is blazingly fast and stable. In addition, the many libraries available for Java allow anybody to leverage years of development and experience to perform complicated tasks simply and quickly. The Java community is also strong: on top of the official documentation, there are years of blog posts (an unscientific Google search for “java blog” yields 268,000,000 results) and close to 280,000 questions tagged “Java” on StackExchange. That’s a lot of collective knowledge and expertise.

At Knewton our systems are Java-based, for a few of the reasons mentioned above. But Java is far from perfect. Its syntax includes much boilerplate and support for concepts like concurrency, parallelism, and functional programming is unidiomatic. There are a lot of small tweaks or features that could make Java a lot more pleasant to work with. These imperfections aren’t significant enough to outweigh the benefits of using Java, but they do make daily work a little more challenging.

Enter Scala. Scala is what would happen if there were a Genie that eavesdropped on every Java developer as they cursed when their build broke due to a missing semicolon, then inferred a wish from their complaints. It offers what Java lacks: a much cleaner syntax (semicolons optional, types inferred), pattern matching and case classes, mix-ins (a more powerful inheritance/interface system), higher-order/first-class/anonymous functions, built-in immutable data structures, a built-in read-evaluate-print-loop (REPL) and actor-based concurrency. Even more importantly than all of this, though, is one huge advantage: it runs on the JVM. This means that, unlike other languages, Scala doesn’t require the reimplementation of large banks of code; programs can use existing Java libraries with minimal effort. For instance, the Java standard libraries are available as soon as your Scala REPL starts. To translate

ArrayList<String> fruits = new ArrayList<String>();
fruits.add("apple");
fruits.add("pear");
fruits.add("banana");
for (String fruit : fruits) {
    System.out.println(fruit);
}

into Scala, we simply write the following (JavaConversions is a Scala library that makes using Java collections like ArrayList much more idiomatic):

import java.util.ArrayList
import scala.collection.JavaConversions._
val fruits = new ArrayList[String]
fruits.add("apple")
fruits.add("pear")
fruits.add("banana")
for (fruit &lt;- fruits) {
    println(fruit)
}
// More idiomatically
fruits.foreach(println)

If you’re using libraries that aren’t packaged with Java or your own classes, just ensure that they lie in the classpath. In addition, the syntactic similarities between Java and Scala mean that the information from nearly any Java resource — a blog post, StackExchange question, Javadoc, or whiteboard code snippet — will be equally valuable in Scala, allowing Scala to tap into over 15 years of experience. While Scala may be a relatively new language, it has access to all of the network benefits of a behemoth like Java.

Portability works the other way, too, but with a few caveats. Consider the following Scala class, in the file “Hamster.scala”:

class Hamster()

The Java file “HamsterTest.java” might contain the following:

public class HamsterTest {
    public static void main(String[] args) {
        Hamster hamster = new Hamster();
    }
}

To compile and run, execute

$ scalac Hamster.scala
$ javac -cp path/to/scala-library.jar:. HamsterTest.java
$ java -cp path/to/scala-library.jar:. HamsterTest

in the shell, and hopefully find that everything works as expected. However, if we want to add a weight field to our Hamster class (“class Hamster(var weight: Int)”), things start to break.

Hamster hamster = new Hamster(42);
// This works (note: "weight()", not "weight"):
System.out.println("The hamster weighs " + hamster.weight());
// But if our hamster then goes on a diet, we're out of luck:
hamster.setA(10); // doesn't work
hamster.a(10); // doesn't work

We can solve this problem with the handy scala.reflect.BeanProperty annotation, which adds getWeight and setWeight methods (we could also add them explicitly):

class Hamster(@BeanProperty var weight: Int)

Then, from Java:

Hamster hamster = new Hamster(42);
hamster.setWeight(hamster.getWeight() - 32);

What about using Scala traits within Java? Since traits aren’t quite interfaces, but they aren’t quite abstract classes, they’re a little tricky. A basic Scala trait, such as

trait Mammal {
    def beHairy
}

can be used just as an interface. But if we add another trait, Rodent,

trait Rodent extends Mammal {
    def beHairy {
        println("I have fur!")
    }
    def gnaw
}

and attempt to extend it

public class Rat extends Rodent {
    public void gnaw() {
        System.out.println("gnawsome");
    }
}

we are met with the compile-time error “no interface expected here”, even though Rodent implements beHairy(). To see why we get this message, whip out javap, the Java disassembler bundled with every JDK (javap is an essential tool for deciphering strange Scala behavior).

$ javap Rodent
Compiled from "Rodent.scala"
public interface Rodent extends Mammal,scala.ScalaObject{
    public abstract void beHairy();
    public abstract void gnaw();
}

Thus, even though Rodent contains method implementation, it still compiles to an interface. Some compile-time magic allows Scala classes to use these traits. The easiest fix is to add something like

abstract class RodentJava extends Rodent

for each trait that implements methods, and have Java classes extend the abstract class, rather than the trait. This has the advantage of being dead simple, but the disadvantage of requiring the maintenance of a separate Java API and sacrificing the ability to mix-in multiple traits as in Scala (this is, however, impossible in Java, since it lacks multiple inheritance); a single abstract Scala class for each potentially used set of mix-ins is the way to go here, but this method risks a combinatorial explosion).

Traits are just one example of a Scala feature that lacks a direct Java equivalent. First-class functions, anonymous functions, Scala’s greater number of access level modifiers, pattern matching, tuples, implicits, and operator overloading are all either non-trivial or impossible to use in Java. In general, to be useful to Java programmers, a library that uses any Scala-specific features must offer a Java-oriented API (of greater complexity than the examples seen here, but involving similar workarounds). Both Apache Kafka [3] and Twitter’s Finagle [4], which I’ve worked with in my summer at Knewton, take this approach; while it’s possible to write Scala code that can be used seamlessly from Java, to do so is to give up the expressiveness of idiomatic Scala.

Whether, as some suggest, Scala will be a successor to Java, or merely a fad, its compatibility with Java is one of its greatest strengths. Combining Java and Scala is nowhere near the undertaking that combining two unrelated languages would be, so the cost of experimenting with Scala as a part of an existing Java architecture is minimal. Given the potential benefits and low risk, it makes sense to consider a total or partial move to Scala as part of a JVM-based ecosystem.