Learn You Rake
Who needs a Makefile anyway?
Rake is a task runner and, in my opinion, a worthy replacement for Make. It is written in Ruby and, as a result, has the immense benefit of being concise, pleasant to the eyes, and fun to write.
Task runners, such as Rake or Make, provide many advantages over manual task management. You can program complex rules or conditional triggers to automate your tasks. They also come with a variety of in-built functions to help you write DRY code and reduce bugs. Finally, they help you write consistent and reproducible code to run on different systems.
For example, Rake can help you organize and manage analyses in a bioinformatics project. With a few simple commands, you can rerun all of your analyses in the desired order without worrying about mistyping file paths or forgetting important flags.
“Very cool,” you may say, “but how do I use it?” Fair enough. In this post, I will cover the elements of Rake that I find useful on a daily basis and that cover almost all of my use cases.
So, in the words of Gandalf, let’s:
Fly, You Fools!
Usage
macOS users should have rake preinstalled. If not, use the following command to install rake:
gem install rake
Once installed, using rake is as simple as the following:
$ cd my_dir
$ tree -L 1
.
├── Rakefile
├── file1.py
├── file2.py
├── data/
└── src/
$ rake
(in /home/user/my_dir)
...running...
When invoked on the command line without any options, rake searches for a file named Rakefile in the current directory and executes the default task within that file. We will see later how to choose which task gets executed and how.
The Rakefile is Rake's equivalent of a Makefile.
Write you a Rakefile
First of all, there is no special format for a Rakefile. A Rakefile contains executable Ruby code, and anything legal in a Ruby script is allowed in a Rakefile. However, there are conventions that we must follow. For a crash course in Ruby, consider the Learn X in Y minutes tutorial for Ruby.
Hello Rake
One of the primary building blocks of a Rakefile is a task. A task is an action you wish to perform, which consumes an input and produces an output. Naturally, if you make the input of one task dependent on the output of another, you create a dependency between the tasks, and create what is called a pipeline or workflow. More on that later. For now, let us write a simple task that prints a greeting to the terminal.
# Rakefile
task :default => :hello_rake
# cool ruby block syntax
task :hello_rake do
# any valid Ruby code goes here
puts "Hello, Rake!"
end
Save the snippet in a Rakefile inside a directory and run rake.
$ rake
Hello, Rake!
By default, Rake will build the :default task within the Rakefile. If you don’t have a :default task, it will stop with an error, and you will need to name a task explicitly. In the case above, you could also get the same output by executing rake hello_rake.
So, to summarize, here’s what we did so far:
- We wrote a “hello rake” task that prints “Hello, Rake!” to the screen,
- And because it is not the default task, we specified this task as a pre-requisite to the default task in the first line using the => syntax.
What is a Rake Task?
Based on what we see above, here’s what an empty or pseudo-code style task looks like:
task <task_name> => <pre-requisites> do |t|
  # actions (may reference t or omit |t| from previous line)
end
You can also use strings for task names and prerequisites; Rake doesn’t care. For example, task 'name' => %w[prereq1 prereq2]
As you may have noticed, multiple pre-requisites are specified simply by putting them into an array.
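As a minimal sketch of multiple prerequisites, the script below assumes three hypothetical tasks (fetch, clean_data, and report) and records the order in which they run in an array named RUN_ORDER, which is also just an illustrative name. It suggests that Rake runs each prerequisite once, before the task that needs it, even when a prerequisite is shared.

```ruby
require 'rake'

RUN_ORDER = [] # hypothetical array used only to record execution order

task :fetch do
  RUN_ORDER << :fetch
end

# :clean_data needs :fetch to have run first
task :clean_data => :fetch do
  RUN_ORDER << :clean_data
end

# Several prerequisites go in an array; :fetch is shared but still runs once
task :report => [:fetch, :clean_data] do
  RUN_ORDER << :report
end
```

Saved as a Rakefile, this would be run with rake report.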
Augmenting Tasks
The skeleton above, although simple, is quite effective. We can, however, add features to it that make life easier in certain cases. For instance, say your task expects an argument and acts on it, such as running code that assumes a certain version of a program. Rake provides a way to achieve this as follows:
# Rakefile
task :default => :hello_world
desc "Run Hello World with your name"
task :hello_world, [:name] do |t, args|
# any valid Ruby code goes here
puts "Hello, #{args.name}!"
end
Now run it with a twist this time:
$ rake hello_world[Vivek]
Hello, Vivek!
$ rake -T
rake hello_world[name] # Run Hello World with your name
The arguments specified on the command line are passed to the task through args, which can then be used inside the block to perform specific functions. As with prerequisites, you can pass multiple arguments at once.
You may also notice that we added a desc statement with a string describing what the task does. The benefit is that the description shows up as help when a user runs rake -T, as shown in the example.
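To illustrate passing more than one argument, here is a small sketch with a hypothetical greet task that takes two named parameters. The GREETINGS array is an assumption added only so the result can be inspected; a real task would more likely print or act on the values.

```ruby
require 'rake'

GREETINGS = [] # hypothetical array that collects results for inspection

desc "Greet someone with a custom greeting"
task :greet, [:greeting, :name] do |t, args|
  # args responds to each declared parameter name
  GREETINGS << "#{args.greeting}, #{args.name}!"
end
```

On the command line the arguments are separated by commas without spaces, and quoting the whole thing keeps the shell from interpreting the brackets: rake "greet[Hello,Vivek]".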
RULE
Document your tasks.
What if no arguments are supplied? In such cases, you can use the with_defaults method in the task body to assume specific defaults.
task :name, [:first_name, :last_name] do |t, args|
args.with_defaults(:first_name => "John", :last_name => "Dough")
puts "First name is #{args.first_name}"
puts "Last name is #{args.last_name}"
end
What if the number of arguments is unknown or variable? In that case, use the extras method of the args variable. This allows for tasks that can loop over a variable number of values, and it's compatible with named parameters as well:
task :email, [:message] do |t, args|
mail = Mail.new(args.message)
recipients = args.extras
recipients.each do |target|
mail.send_to(target)
end
end
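The Mail class above is not part of Rake, so that snippet will not run on its own. A self-contained sketch of the same idea, assuming a hypothetical notify task and a SENT array standing in for a real mail library, might look like this:

```ruby
require 'rake'

SENT = [] # hypothetical stand-in for a real mail library

task :notify, [:message] do |t, args|
  # args.extras holds every argument beyond the named ones
  args.extras.each do |recipient|
    SENT << "#{recipient}: #{args.message}"
  end
end
```

It would be invoked as rake "notify[Build done,alice,bob]", where the first value fills message and the rest land in extras.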
Skeleton of a Task
task <target>, [:arg1, :arg2] => [:pre_req1, :pre_req2] do |t, args|
# actions, may reference t, args here and use methods on them
end
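A concrete instance of this skeleton, sketched with hypothetical setup and deploy tasks, combines a named argument, a default value, and a prerequisite. The STEPS array is an assumption used only to make the behavior visible.

```ruby
require 'rake'

STEPS = [] # hypothetical log of what ran

task :setup do
  STEPS << "setup"
end

# Argument names come first, then => and the prerequisites
task :deploy, [:env] => [:setup] do |t, args|
  args.with_defaults(:env => "staging")
  STEPS << "#{t.name} to #{args.env}"
end
```

Running rake deploy would fall back to the default environment, while rake "deploy[production]" would override it.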
Types of Tasks
While task is a generic way of doing things using Rake, there are special types of tasks as well, depending on what kind of output is expected.
- file tasks
- phony tasks
- directory tasks
- clean or clobber tasks
File tasks
As the name suggests, file tasks are expected to create a file from one or more input files. A file task is skipped if its target file already exists and is newer than all of its prerequisites.
- File tasks are declared using the file method instead of the task method.
- File tasks are usually named with a string rather than a symbol.
file "read_distribution.pdf" => ["read_counts.csv"] do |t|
sh "Rscript plot_distribution #{t.prerequisites[0]} #{t.name}"
end
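To see the skipping behavior in isolation, here is a sketch that works in a temporary directory with hypothetical file names (input.txt and output.txt). The BUILDS array is an assumption that counts how often the action actually runs.

```ruby
require 'rake'
require 'tmpdir'

BUILDS = [] # hypothetical log of how many times the action ran

Dir.chdir(Dir.mktmpdir) # work in a scratch directory
File.write("input.txt", "raw data")

# output.txt is rebuilt only when it is missing or older than input.txt
file "output.txt" => ["input.txt"] do |t|
  File.write(t.name, File.read(t.prerequisites[0]).upcase)
  BUILDS << t.name
end
```

Asking for output.txt a second time, without touching input.txt, should leave BUILDS unchanged because the target is already up to date.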
The great thing about file tasks is that Rake provides useful file handling functions such as cp, mv, and rm_r to perform common file operations. For convenience, these are named after their equivalent command line programs. These functions are included in the RakeFileUtils module, an extended version of the standard Ruby fileutils module, and can be explored using the ri FileUtils command.
task :remove_all do
rm_r("./build")
end
Phony Tasks
The phony task lets a file task use non-file-based tasks as prerequisites without forcing the file to be rebuilt. In a Makefile, the comparable idea is declared using .PHONY.
Use require 'rake/phony' to add the phony task.
require 'rake'
require 'rake/phony'
# rake/phony defines a task named :phony whose timestamp is always in the past,
# so listing it as a prerequisite never forces the file to be rebuilt.
# report.txt is a hypothetical file name used for illustration.
file 'report.txt' => :phony do |t|
  touch t.name
end
# A plain task can then depend on the file task as usual
task :publish => 'report.txt' do
  puts "Publishing report.txt..."
end
Directory tasks
It is common to need to create directories on demand. The directory convenience method is shorthand for creating a FileTask that creates the directory. The directory method does not accept prerequisites or actions, but both can be added later.
directory "testdata"
file "testdata" => ["otherdata"]
file "testdata" do
cp Dir["standard_data/*.data"], "testdata"
end
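A common way to use a directory task is to list it as a prerequisite of a file that lives inside it. The sketch below assumes hypothetical paths (results/plots and summary.txt) and runs in a temporary directory so that nothing real is touched.

```ruby
require 'rake'
require 'tmpdir'

Dir.chdir(Dir.mktmpdir) # scratch directory so nothing real is touched

# Declares a task that creates the directory, including missing parents
directory "results/plots"

# Listing the directory as a prerequisite creates it on demand
file "results/plots/summary.txt" => "results/plots" do |t|
  File.write(t.name, "summary")
end
```

Requesting the file is enough; the directory is created first because it is a prerequisite.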
Clean and Clobber Tasks
Through require 'rake/clean', Rake provides clean and clobber tasks:
- clean: Clean up the project by deleting scratch files and backup files. Add files to the CLEAN FileList to have the clean target handle them.
- clobber: Clobber all generated and non-source files in a project. The task depends on clean, so all the CLEAN files will be deleted as well as files in the CLOBBER FileList. The intent of this task is to return a project to its pristine, just unpacked state.
You can add file names or glob patterns to both the CLEAN and CLOBBER lists.
RULE
Include rules to clean up temporary files.
require 'rake/clean'
# Define some files to be cleaned
CLEAN.include('*.o', '*.obj')
# Define some files to be clobbered
CLOBBER.include('*.exe', '*.dll')
# Define a task that generates some object files
task :compile do
# code to compile source files into object files
end
# Define a task that links the object files into an executable
task :link => :compile do
# code to link object files into an executable
end
# Define a task that depends on the link task
task :build => :link do
  puts "Build complete"
end
# rake/clean already defines clean; this adds an extra action that runs after the files are removed
task :clean do
  puts "Object files cleaned"
end
# rake/clean already defines clobber; this adds an extra action in the same way
task :clobber do
  puts "All generated files clobbered"
end
Useful powerups
FileLists
FileList allows you to write tasks that process a lot of files. It is essentially an array of files with special methods available.
Creating a file list is easy. Just give it the list of file names:
fl = FileList['file1.rb', 'file2.rb']
Or give it a glob pattern:
fl = FileList['*.rb']
# more fun
FileList['*.rb'].each do |src|
  # add tasks that process files here
  target = src.ext('.out') # hypothetical output name derived from the source
  file target => src do
    # actions
  end
end
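A few of those special methods, sketched on a list built from hypothetical file names so the result does not depend on what is on disk. The constants SOURCES, LOGS, and NAMES are illustrative names only.

```ruby
require 'rake'

# Built from explicit names, so no files need to exist
SOURCES = FileList['src/align.rb', 'src/plot.rb', 'src/scratch.rb']

# exclude drops entries matching a pattern (it modifies the list in place)
SOURCES.exclude(/scratch/)

# ext swaps the file extension on every entry
LOGS = SOURCES.ext('.log')

# pathmap rewrites each path; %n is the file name without its extension
NAMES = SOURCES.pathmap('%n')
```

Methods like ext and pathmap are handy for deriving target names from source names when generating file tasks in a loop.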
Parallel execution
Rake allows parallel execution of prerequisites using the following syntax:
multitask copy_files: %w[copy_src copy_doc copy_bin] do
puts "All Copies Complete"
end
In this example, copy_files is a normal rake task. Its actions are executed once all of its prerequisites are done. The big difference is that the prerequisites (copy_src, copy_bin and copy_doc) are executed in parallel. Each prerequisite is run in its own Ruby thread, possibly allowing faster overall runtime.
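The example above leaves the three copy tasks undefined. A runnable sketch, assuming placeholder work (a short sleep) in each prerequisite and a thread-safe Queue named FINISHED to record completion, could look like this:

```ruby
require 'rake'

FINISHED = Queue.new # thread-safe queue, used here only to record results

%w[copy_src copy_doc copy_bin].each do |name|
  task name do
    sleep 0.1 # stand-in for real copying work
    FINISHED << name
  end
end

# The prerequisites run in separate threads; the action runs after all finish
multitask :copy_files => %w[copy_src copy_doc copy_bin] do
  FINISHED << "copy_files"
end
```

Because the prerequisites run concurrently, their relative order is not guaranteed, so anything they share should be safe to use from multiple threads.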
Rules
Rule tasks, also known as synthesized tasks, have the same characteristics as all other kinds of tasks: they have a name, they can have zero or more actions, they can have prerequisites, and if Rake determines the task needs to be run it will only be run once.
What makes rule tasks different is that you don’t actually give them a name (I know, I just said that rule tasks have names, just bear with me). Instead, when you declare the task you give it a pattern in place of a name.
Regular expression based matching:
rule /foo/ do |task|
puts 'called task named: %s' % task.name
end
Matching by file extension, which is syntactic sugar for a regular expression:
rule '.txt' do |task|
puts 'creating file: %s' % task.name
touch task.name
end
Specifying dependencies:
rule '.dependency' do |task|
puts 'called task: %s' % task.name
end
rule '.task' => '.dependency' do |task|
puts 'called task: %s' % task.name
end
Rules for Files
rule '.txt' => '.template' do |task|
cp task.source, task.name
end
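To try the file rule end to end, here is a sketch that creates a hypothetical notes.template file in a temporary directory and lets the rule synthesize a task for notes.txt. It uses plain File calls in place of cp so the block stays quiet, which is a choice made for this example only.

```ruby
require 'rake'
require 'tmpdir'

Dir.chdir(Dir.mktmpdir) # scratch directory
File.write("notes.template", "Dear reader")

# Any requested *.txt file is built from the matching *.template file
rule '.txt' => '.template' do |t|
  # t.source is the first prerequisite, here notes.template
  File.write(t.name, File.read(t.source))
end
```

No task named notes.txt is declared anywhere; asking for it with rake notes.txt is what triggers the rule.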
Advanced Rakefile
Accessing Other Tasks
You can directly manipulate the input, output, and actions of one task from another. For instance,
task :doit do
puts "DONE"
end
task :dont do
Rake::Task[:doit].clear
end
Running this example:
$ rake doit
(in /Users/jim/working/git/rake/x)
DONE
$ rake dont doit
(in /Users/jim/working/git/rake/x)
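Clearing is not the only thing you can do with a looked-up task. As a sketch, the script below uses the enhance method to add a prerequisite and an extra action to an existing task; the package and check tasks and the TRACE array are hypothetical names used for illustration.

```ruby
require 'rake'

TRACE = [] # hypothetical log of what ran

task :package do
  TRACE << "package"
end

task :check do
  TRACE << "check"
end

# Look the task up by name, then add a prerequisite and an extra action
Rake::Task[:package].enhance(["check"]) do
  TRACE << "package finished"
end
```

This can be useful when a task is defined by a library and you want to extend it without redefining it.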
Namespaces
When your Rakefile grows, it’s a good idea to bundle tasks into separate namespaces as an additional layer of organization. This is done using namespace.
CAUTION
File tasks are not scoped by the namespace command, since they refer to actual physical files on the system.
For example:
namespace "main" do
task :build do
# Build the main program
end
end
namespace "samples" do
task :build do
# Build the sample programs
end
end
task build: %w[main:build samples:build]
This post should provide sufficient detail to get you started with the majority of tasks. Once you are done with it, I highly recommend checking out other excellent resources that dive deeper into specific details.