Wednesday, April 2, 2014

Tutorial: The mechanisms behind Gradle

Gradle is a build system made to rival maven. Gradle has declarative scripts, plugins, dependency management, artifact publishing, multiproject builds, and all under a sleek Groovy DSL. For random tasks I still would often use make instead, but for structured projects I highly recommend you ditch maven and move to gradle.

You either like maven or you don't

This isn't a maven-bashing article (though I would love to write one) but I will simply say that for my latest project at work, migrating from maven to gradle cut our build times down by 30%-70%, ~600 lines of xml became ~150 lines of groovy, and weeks of various people troubleshooting various issues with maven became three days of trying to grok groovy's workings.

Assuming you are moving from maven to gradle, like I did, you may wish for a better tutorial, like I did. Gradle's documentation is great for standard workflows, but gradle is more flexible than that. It was figuring these things out that cost me the most time, so I figured I'd publish what I'd learned to further gradle as a great tool.

What this tutorial covers

I'm going to walk you through what I went through. We're going to begin with gradle basics, move to multi-project setup, and then end on some advanced notes that explain how it all works.

This tutorial does not cover every way things could be accomplished, or every feature you may like to know. It is intended to get you on firm ground, from which point the documentation can take over. This tutorial is also not intended for starting projects from the ground up. There are plugins that automatically do much of what I'm about to describe, and often they do everything you need.

If plugins don't do everything you need, here is how to cleanly write build scripts that integrate with them.

Basic tasks

To start off, let's create a project where gradle build prints "hello." It's extremely straightforward:

build.gradle
task build << {
  println "hello";
}

By using the << operator, you can specify code to be run when a task is triggered. But don't worry, you don't have to write redundant code for things like running commands or managing files. Here we define some basic tasks leveraging the "task type" paradigm.

build.gradle
task myDeletionTask(type: Delete) {
  // fileTree needs a base directory; the pattern goes in "include"
  delete fileTree(dir: '.', include: '**/*.o')
}

task myCopyTask(type: Copy) {
  from 'path/to/source/file'
  into 'path/to/destination/directory'
}

task myCommandTask(type: Exec) {
  commandLine "echo", "hello", "world"
}

Gradle will even capture the output of "myCommandTask" and print it with gradle formatting automatically.

This does bring us to an important distinction. When a task is declared with <<, its code is run when the task is executed. However, if the task is declared without <<, its code will be run when the task is configured (i.e., immediately, on every build). Thus, our first "hello" task needed << to delay printing until the build task actually runs, while our copy, delete, and exec tasks cannot have <<, or their configuration would be delayed until it is too late to matter.
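To see the two phases side by side, here is a minimal sketch. The task name "phases" is made up for illustration, and doLast is the method form of the << operator.

```groovy
// build.gradle (hypothetical example task)
task phases {
  // Configuration phase: runs every time gradle configures this project,
  // even when you ask for a completely different task.
  println "configuring phases"

  // Execution phase: doLast is the method form of <<, so this only runs
  // when the task itself executes, e.g. via `gradle phases`.
  doLast {
    println "executing phases"
  }
}
```

Running any other task in the same project should still print the first message, which is a handy way to spot code that landed in the wrong phase.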

Task dependencies

You can add dependencies between tasks in a couple ways, but these seem to be the most common:

build.gradle
task myTask {
  dependsOn build
}

myOtherTask.dependsOn myTask

// JavaCompile is the current name of the java compile task type
tasks.withType(JavaCompile) {
  task -> task.dependsOn myTask
}

And voilà, with this simple example, myTask will run before all compilation tasks, and also before myOtherTask.

If you are curious, the last of the three is functional programming in groovy. It means: for each compile task, invoke this function. The task is passed in as the argument named task (as per task ->), and on it we call the tried and true method dependsOn.
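The same closure syntax works in plain groovy, outside of any build. A small sketch, with a made-up list of names standing in for the tasks:

```groovy
// Plain groovy, no gradle required. The list contents are made up.
def taskNames = ["compileJava", "compileTestJava"]

// Explicit parameter name, the same shape as `task -> task.dependsOn myTask`.
taskNames.each { name -> println "would configure " + name }

// With no parameter list, groovy provides the single argument as `it`.
taskNames.each { println "would configure " + it }
```

Both loops do the same thing; naming the parameter just reads better once closures start to nest.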

Breaking up your build into projects

When you run a `gradle` command, it will search up the directory tree looking for a settings.gradle file which defines a project structure. You can declare a project setup like this:

settings.gradle
include "ProjectA", "ProjectB"

and if your project names don't directly correspond to their paths, you can fix that as well:

settings.gradle
include "ProjectA", "ProjectB"

project(":ProjectA").projectDir = file("path/to/project/a")
project(":ProjectB").projectDir = file("path/to/project/b")

You can now have build.gradle files inside ProjectA's root directory and ProjectB's directory that configure the projects just as you would expect, but you can also create relationships between these projects and cut redundant configurations into a master build file. Note that you do not need build scripts in these directories at all. Often the master build script can take care of everything these subproject build files might need to do.

Multi-project configurations


Next to this settings.gradle file, you can create a root build.gradle file which will serve as a master build script. Here we just add empty tasks to the root project, but we'll talk about configuration injection in a bit.

build.gradle
task build
task clean

Quite boring. What is this master build script for? Well, you can configure subprojects directly, either by name or to each one in general.

build.gradle
project(":ProjectA") {
  task projectAOnlyTask << { println "Project A!"; }
}

subprojects { subProject ->
  task allProjectsTask << { println "Any project!"; }
}

Project A now has a task "projectAOnlyTask," and every subproject has a task "allProjectsTask." This is simple but incredibly useful code consolidation. It essentially lets you define your own convention-over-configuration in your root build.gradle. Notice that configuring all subprojects uses the same functional groovy syntax as tasks.withType: a function with one argument named "subProject" that's executed for each subproject.

Waiting for subproject evaluation


One gotcha here is that this root project script is evaluated before each subproject script. That may sound inconsequential, but it means any tasks defined in a subproject build.gradle file cannot be used here, for instance, in a dependency relationship. Gradle solves this with some more functional sugarcoating.

build.gradle
subprojects { subProject ->
  afterEvaluate {
    taskDefinedInSubProject.dependsOn taskDefinedInRootProject
  }
}
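
If the ordering is hard to picture, a few print statements in the root script make it visible. This is only a tracing sketch; the messages are arbitrary.

```groovy
// root build.gradle
println "evaluating the root script"

subprojects { subproject ->
  // Runs while the root script is evaluated, before the subproject's
  // own build.gradle has been read.
  println "injecting configuration into ${subproject.name}"

  subproject.afterEvaluate {
    // Runs once the subproject's build.gradle has been evaluated, so
    // tasks declared there can be referenced from here.
    println "${subproject.name} has been evaluated"
  }
}
```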

The extension API


So lastly, what if you find you have 20 lines of code duplicated between two different projects? It's simple: using the extension properties API, you can make flags/properties in subproject build scripts that the root build script looks for.

projectA/build.gradle
ext {
  runsDocumenter = true
}

build.gradle
subprojects { subproject ->
  ext {
    runsDocumenter = false
  }
  afterEvaluate {
    if (subproject.ext.runsDocumenter == true) {
      task document(type: Exec) {
        commandLine "document.sh", subproject.projectDir.path
      }
    }
  }
}

You have all the power of Groovy to make this API get your job done.
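
One variation, if you would rather not seed a default in every subproject: the ext object has a has() method, so the root script can check whether a flag was set at all. A sketch, reusing the runsDocumenter flag and document.sh script from the example above:

```groovy
// root build.gradle
subprojects { subproject ->
  subproject.afterEvaluate {
    // has() is true only if the subproject's script defined the property,
    // so projects that never mention runsDocumenter are skipped.
    if (subproject.ext.has("runsDocumenter") && subproject.ext.runsDocumenter) {
      subproject.task("document", type: Exec) {
        commandLine "document.sh", subproject.projectDir.path
      }
    }
  }
}
```

Whether a default or a has() check reads better is a matter of taste; the default keeps the flag discoverable in one place.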

Dependencies between projects

If you split up a project into multiple smaller projects, you probably have a relationship between these two source trees. In this circumstance you almost certainly have a compilation relationship as well. In gradle, you can declare dependencies from project to project like this.

projectA/build.gradle
dependencies {
  compile project(":ProjectB")
}

projectB/build.gradle
dependencies {
  compile project(":ProjectC")
}

Gradle will now build the projects you depend on first, walking back up the project dependency tree incrementally, pass the resulting jar files around, and more. It really is too good to be true, yet somehow it is.

Advanced configuration

What's discussed so far, plus some plugins and supplemental API reading, should get you an advanced build script for an advanced project. However, there is still much magic yet in the way these multi-project builds work. To finish the article, I will break down a simple, but mysterious declaration from the code sample above.

projectA/build.gradle
dependencies {
  compile project(":ProjectB")
}

What does this do? What does it mean?

I recently created protobuf generation as a subproject of a sprawling android project. Since there is no protobuf plugin, I found strange issues when trying to create dependencies on that project in this manner. Stumbling through it greatly improved my sense of understanding of gradle.

We'll take two projects that run simple commands, have no plugins, and then create a dependency between them.

dependency/build.gradle
task build(type: Exec) {
  commandLine "touch", "myfile"
}

dependent/build.gradle
dependencies {
  compile project(":dependency")
}
task build << {
  println "hello"
}

The command `gradle build` now yields a vague error:

A problem occurred evaluating project ':dependent'.
> Could not find method compile() for argument [project ':dependency'] on project ':dependent'

It makes me glad to say that the "compile" part of the dependency declaration is not an inherent part of gradle's DSL. What this means is that in our current project, something is missing. What is missing? A project configuration named "compile". Configurations are added automatically by most plugins, but in our case we have to understand how they work and add one ourselves. Hint: it's easy.

dependent/build.gradle
configurations {
  compile
}

Running this now nets us a message "hello" and an empty file in dependency/myfile. However, in more complex scenarios you'll find that there are problems. Firstly, dependency:build will not build incrementally. Secondly, this will not, surprisingly, guarantee that dependency:build will run before dependent:build.

To solve the first problem you can use the dsl quite neatly:

dependency/build.gradle
task build(type: Exec) {
  commandLine "touch", "myfile"
  outputs.file file("myfile")
}

By declaring outputs (and by the way, there is an "inputs" property too), we can prevent gradle from needlessly regenerating this file. But what's wrong with our order?
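
For completeness, here is a sketch that declares both sides. The task name and the two file names are hypothetical; the point is only that gradle compares the declared inputs and outputs against the previous run.

```groovy
// build.gradle (hypothetical task and file names)
task generate(type: Exec) {
  // Rerun when schema.txt changes...
  inputs.file "schema.txt"
  // ...or when generated.txt is missing or has been modified.
  outputs.file "generated.txt"
  commandLine "cp", "schema.txt", "generated.txt"
}
```

Running `gradle generate` twice in a row should report the task as UP-TO-DATE the second time, as long as neither file was touched in between.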

The problem is that gradle doesn't know which tasks from :dependency generate the artifacts needed by :dependent, and gradle doesn't know which tasks in :dependent need the artifacts generated by whichever task generates them in :dependency. While declaring inputs and outputs should solve this problem in theory, it isn't scalable, and crossing projects like that is against gradle's One True Way.

Telling gradle that :dependent:build should use the artifacts in the compile configuration is easy, so we'll fix that first.

dependent/build.gradle
task build << {
  println "hello"
}

// dependsOn must be set at configuration time, not inside the << action
build.dependsOn configurations.compile

It's strange how simple changes can clear away so much fog. Unfortunately, there isn't an easy fix for the next problem.

What gradle expects to see is something like this. Depending on a project means you declare a relationship with its artifacts. Artifacts are matched up to configurations and to the tasks that produce them. While you can specify exact artifacts from a project by having multiple configurations, the standard DSL format expects a configuration named "default".

dependency/build.gradle
configurations {
  default
}

artifacts {
  default build
}  

Here we give our dependency a default configuration that's pulled in by our dependent project. Then by use of artifacts, we can match that configuration to a specific task. However, since default is a keyword in groovy, we actually have to do this instead.

dependency/build.gradle
configurations {
  delegate.default
}

artifacts {
  delegate.default build
}  

Be aware, we're in the belly of the beast. Our problems don't end here. The next problem you'll see if you try to run this is that "build" is not an archive task (a task of type "AbstractArchiveTask"). To truly get this code working, you would need to create a plugin which defines a new task which implements these methods. However, you don't want to do that, because it's too much work, and that's why you ditched maven in the first place. Fear not, there is an easy hack to solve your woes.

dependency/build.gradle
configurations {
  delegate.default
}

task jarHack(type: Jar)

artifacts {
  delegate.default jarHack
}

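// dependents pull in jarHack through the default configuration,
// so jarHack has to depend on build for build to run first
jarHack.dependsOn build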

And now we have made gradle happy, by appeasing its focus on java. The important part is not the big picture, but the details. You should now have a better grip on gradle than most, and there should not be much in a build process that escapes you.

By doing it the real way and not the groovy way you can share the artifacts with your other build processes. However, I am not qualified to describe that process, and so I will leave it in your hands to do so and maybe blog about it. Or maybe I'll get around to it eventually.

In summary

Hopefully you can now look at buildscripts and see through the DSL and into the core. If you have any questions, feel free to leave a comment and I'll try to clarify any subjects/topics that I missed.

Happy gradling!