Why Maven?

Thursday 10 March 2016 at 08:00 GMT

Every time I mention Maven, people ask me why I hate myself so much. It's a (scarily) valid question to ask after seeing a POM file, so I thought I'd go into it a little.

Here's a POM file. I hope your scrolling finger has been working out.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <artifactId>rekord-core</artifactId>
    <packaging>jar</packaging>

    <name>Rekord Core</name>
    <url>https://github.com/SamirTalwar/Rekord</url>

    <parent>
        <groupId>com.noodlesandwich</groupId>
        <artifactId>rekord-parent</artifactId>
        <version>0.4-SNAPSHOT</version>
    </parent>

    <dependencies>
        <dependency>
            <groupId>org.pcollections</groupId>
            <artifactId>pcollections</artifactId>
            <version>${dependencies.pcollections.version}</version>
        </dependency>

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>${dependencies.junit.version}</version>
            <scope>test</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.hamcrest</groupId>
                    <artifactId>hamcrest-core</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.hamcrest</groupId>
            <artifactId>java-hamcrest</artifactId>
            <version>${dependencies.hamcrest.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.xmlmatchers</groupId>
            <artifactId>xml-matchers</artifactId>
            <version>${dependencies.xml-matchers.version}</version>
            <scope>test</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.hamcrest</groupId>
                    <artifactId>hamcrest-core</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>${dependencies.guava.version}</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>build-helper-maven-plugin</artifactId>
                <version>1.8</version>
                <executions>
                    <execution>
                        <phase>generate-test-sources</phase>
                        <goals>
                            <goal>add-test-source</goal>
                        </goals>
                        <configuration>
                            <sources>
                                <source>src/testsupport/java</source>
                            </sources>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

It's… long.

That is the POM file describing the build and dependencies of one of the subprojects of Rekord. There's a lot more in the parent project that this one depends upon, as well as lots of siblings. I count seven in that one repository. It's intimidating, even for me, so I get why others shy away from Maven in favour of Gradle, SBT or Leiningen.

For contrast, take a look at the equivalent Gradle build file. I've written this basically from memory, so forgive me if I'm a bit off. I've omitted the maven-publish plugin and associated configuration as that's the sort of thing you'd put in the root project file and inherit in the subprojects.

apply plugin: 'java'

// the project name ('rekord-core') is set in settings.gradle; it's read-only in build.gradle

sourceSets {
    test {
        java {
            srcDirs = ['src/testsupport/java', 'src/test/java']
        }
    }
}

dependencies {
    compile group: 'org.pcollections', name: 'pcollections', version: versions.pcollections

    testCompile(group: 'junit', name: 'junit', version: versions.junit) {
        exclude module: 'hamcrest-core'
    }
    testCompile group: 'org.hamcrest', name: 'java-hamcrest', version: versions.hamcrest
    testCompile(group: 'org.xmlmatchers', name: 'xml-matchers', version: versions.xmlMatchers) {
        exclude module: 'hamcrest-core'
    }
    testCompile group: 'com.google.guava', name: 'guava', version: versions.guava
}

Isn't that way nicer? So, Samir, what's wrong with that?

XML is dead. Long live XML!

Here's my first issue. With all three of the alternative JVM build tools, the configuration is code. Gradle uses Groovy. SBT uses a subset of Scala. Leiningen uses Clojure. The common thread is that they're all JVM languages, and therefore just parsing the files requires evaluating custom code. Code that's not always compiled, can't be easily cached, uses a huge amount of syntactic sugar (which is generally slower to parse) and takes a large amount of memory. While Gradle, especially, puts a huge amount of effort into caching a compiled version of the build.gradle file, this is frequently negated by other factors, and the speed increase is often negligible.[^thanks to @cedricchampeau]

Maven, on the other hand, uses XML. It's verbose, but it's simple. It follows a schema, and can be validated incredibly quickly as it's read. Reading an XML file and constructing the appropriate data structures takes just a few milliseconds. Compare the performance of an IDE such as IntelliJ IDEA when re-importing a Maven project with re-importing a project built by any of the others. The Maven import takes under a second, and I don't even notice it; I even let IntelliJ detect changes and re-import automatically. On the other hand, if you ask IntelliJ to import your SBT or Gradle files, you might as well go make a cup of coffee. Your computer will be unusable for some time.
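
To give a feel for how little work "reading a POM" involves, here's a minimal sketch using the XML parser that ships with the JDK. It is not how Maven itself loads POMs: the class name `PomDependencies` and its methods are made up for illustration, it ignores property interpolation and parent inheritance, and it picks up every `<dependency>` element in the file, including any under `<dependencyManagement>` or a plugin.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Maven: lists the dependencies declared in a POM.
class PomDependencies {
    // Returns "groupId:artifactId" for every <dependency> element in the given POM text.
    static List<String> list(String pomXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pomXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the POM", e);
        }
        NodeList dependencies = doc.getElementsByTagName("dependency");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < dependencies.getLength(); i++) {
            Element dependency = (Element) dependencies.item(i);
            result.add(childText(dependency, "groupId") + ":" + childText(dependency, "artifactId"));
        }
        return result;
    }

    // Reads a direct child only, so the groupId of a nested <exclusion> is never picked up.
    private static String childText(Element parent, String tag) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(tag)) {
                return n.getTextContent().trim();
            }
        }
        return "";
    }
}
```

There's no interpreter to start and no script to compile; the file is data, and the whole job is a tree walk.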

I really hope Maven ditches XML soon, but in favour of YAML, edn or another data structure syntax. Code as configuration is incredibly powerful, but I'd rather my build tool was fast.

[^thanks to @cedricchampeau]: Many thanks to Cédric Champeau for convincing me to correct the record here. I had originally left it ambiguous as to whether any compilation or caching actually happens at all.

Speed is King

On the topic of speed, here's another problem I have with Gradle and SBT (though less so with Leiningen). Both of those tools use Ivy under the hood. Like Maven, Ivy is an XML-based JVM dependency manager managed by the Apache Software Foundation… because you can't have too many XML-based JVM dependency managers, I guess.

Unlike Maven, Ivy takes forever to download dependencies.

Y'see, in Maven 3, downloads were parallelised and far more aggressively cached. Whereas Ivy downloads one file at a time, Maven grabs as many as it can, because it knows that the network may not be the bottleneck. In addition, a Maven repository is guaranteed to never modify an artifact once it's been uploaded. Ivy is far more flexible, and can work with any repository structure, which means it can't assume those same guarantees and has to double-check every time. I hate waiting for my build tool to download the world. If Java and Scala developers insist on depending on every single library known to man, I'd at least like them to hurry up and download to my machine so I can get on with my work.
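
The two ideas in that paragraph, fetching in parallel and trusting that a published artifact never changes, can be sketched in a few lines of Java. This is an illustration of the principle only, not Maven's resolver: `ArtifactCache`, the `download` function and the pool size of four are all assumptions made up for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: a cache that assumes artifacts are immutable once published.
class ArtifactCache {
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final Function<String, byte[]> download; // stand-in for a real network fetch

    ArtifactCache(Function<String, byte[]> download) {
        this.download = download;
    }

    // Resolves every coordinate, fetching the missing ones concurrently.
    // Anything already cached is returned as-is, with no re-validation against the server.
    List<byte[]> resolveAll(List<String> coordinates) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // assumed pool size
        try {
            List<CompletableFuture<byte[]>> pending = coordinates.stream()
                    .map(c -> CompletableFuture.supplyAsync(
                            () -> cache.computeIfAbsent(c, download), pool))
                    .collect(Collectors.toList());
            return pending.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }
}
```

A resolver that can't assume immutability has to add a "has this changed?" round trip before every cache hit, which is exactly the double-checking described above.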

Consistency keeps my brain spinning

On the client, there are lots of options, but on the server, Maven is ubiquitous. Maven Central and jCenter, which are both Maven repositories, are the only two places developers look for libraries and tools, and even when we're pointed to a third party such as Twitter's Maven repository, which I mentioned yesterday, it's still Maven. This means that the directory structure is defined, everything is signed, and everything needs a POM file. If I'm writing Gradle scripts to generate POMs, they're almost always going to be wrong; Gradle even provides a way of modifying the XML data structure as part of the generation process because of this. I'd much rather write POMs, read POMs and deal with POMs throughout. I've been bitten by incompatibilities enough times that I feel life is just easier sticking to one format.
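
"The directory structure is defined" means that any tool can work out where an artifact lives from its coordinates alone. Here's a small sketch of the standard Maven repository layout; the class `RepositoryLayout` and its method names are invented for this example, the version number in the usage note is purely illustrative, and classifiers and snapshot timestamps are left out.

```java
// Hypothetical helper: maps Maven coordinates to paths in the standard repository layout.
class RepositoryLayout {
    // The dots in the group ID become directories, followed by the artifact ID and version.
    static String directory(String groupId, String artifactId, String version) {
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version;
    }

    // Path of the main JAR, relative to the repository root.
    static String jarPath(String groupId, String artifactId, String version) {
        return directory(groupId, artifactId, version) + "/" + artifactId + "-" + version + ".jar";
    }

    // Path of the POM that sits alongside the JAR.
    static String pomPath(String groupId, String artifactId, String version) {
        return directory(groupId, artifactId, version) + "/" + artifactId + "-" + version + ".pom";
    }
}
```

For example, `RepositoryLayout.jarPath("org.pcollections", "pcollections", "2.1.2")` gives `org/pcollections/pcollections/2.1.2/pcollections-2.1.2.jar`, and the POM sits next to it with a `.pom` extension.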

The Lesser of Many Evils

Maven is hardly perfect. The XML hurts my eyes, as does the command-line output whenever I run something. Extending it often requires another 30 or 40 lines of XML to configure the appropriate plugin, or in the worst case, writing a new plugin in Java with interfaces that should have stayed in the stone age. None of these build tools really take my fancy. But I have to pick, and so I pick speed and simplicity over power and prettiness.



This article is licensed under the Creative Commons Attribution 4.0 International Public License (CC-BY-4.0).