TCBuild: A New Build Tool for Fortran

Fortran 90 can include reasonably complex dependencies, which must be taken into account when building a multiple-file program. Unfortunately, most build tools either don’t support Fortran, or don’t help the developer much. A standard make file, for example, requires you to enter dependencies manually, or develop a script to do it for you.

About a year ago, I was fed up with this situation, and decided to take matters into my own hands. I was sitting at SFO after the WWDC conference, and had a few hours to kill. I started writing a Python script to determine the dependencies in a multi-file, multi-directory Fortran program. Over time, this has developed into a reasonably complete build system that I now use for my daily Fortran development, from small utility programs with tens of files, to a million-line monster.

This short tutorial will introduce the tool that I created — TCBuild — and give instructions for downloading and using it (at your own risk).

What’s the Problem?

If you ask me, Fortran developers get the rough end of the pole. There are lots of tools that support C-based languages, but very few that grok Fortran. And yet, if anything, determining the file dependencies and build order of a Fortran 90 program is more difficult than for a C program. (Maybe that’s why most tools don’t handle Fortran.) In C, dependencies arise between source files and the header files they pull in with #include directives. In Fortran 90, a single file can contain multiple modules, each of which can be ‘used’ by other files, creating build order dependencies. Add to that the fact that most Fortran compilers generate module files in addition to object files, and you have a reasonably complicated soup to digest.
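To make the problem concrete, here is a minimal sketch of how module dependencies could be discovered by scanning source text. It is not TCBuild's actual code: the function names are made up for illustration, it assumes a modern Python 3, and it ignores awkward cases such as continuation lines, submodules, and '!' characters inside string literals.

```python
import re

# "module foo" on a line of its own. The end-of-line anchor means that
# "module procedure foo" and "end module foo" do not match.
MODULE_RE = re.compile(r"^\s*module\s+(\w+)\s*$", re.IGNORECASE)
# "use foo", "use foo, only: bar", and "use, intrinsic :: foo".
USE_RE = re.compile(r"^\s*use\b(?:\s*,\s*\w+\s*::\s*|\s+)(\w+)", re.IGNORECASE)


def scan_fortran_source(text):
    """Return (defined_modules, used_modules) for one file's source text."""
    defined, used = set(), set()
    for line in text.splitlines():
        line = line.split("!", 1)[0]  # drop comments (naive about strings)
        match = MODULE_RE.match(line)
        if match:
            defined.add(match.group(1).lower())
            continue
        match = USE_RE.match(line)
        if match:
            used.add(match.group(1).lower())
    return defined, used


def file_dependencies(sources):
    """Map each file name to the set of files that must be compiled first.

    `sources` maps file name -> source text (a hypothetical input format).
    """
    scanned = {name: scan_fortran_source(text) for name, text in sources.items()}
    provider = {}
    for name, (defined, _) in scanned.items():
        for module in defined:
            provider[module] = name
    deps = {}
    for name, (_, used) in scanned.items():
        deps[name] = {provider[m] for m in used
                      if m in provider and provider[m] != name}
    return deps
```

Modules that no file in the project defines (intrinsic modules, or modules from external libraries) are simply ignored, because they impose no ordering on the project's own files.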

What are the Existing Options?

Most Fortran developers stick to make for their building needs, but make is an old tool, and doesn’t have any direct support for Fortran. You either fill in the dependencies by hand, which is error prone, or you use a script to generate the dependencies. This works, but is less than ideal, and there are other reasons not to use make, which will be discussed more below.

A more up-to-date option is SCons. SCons is an advanced build system, and has support for Fortran, in addition to many other languages. SCons is a very worthy tool, and it may be just what you need, but when I used it, I found myself struggling to get things to work properly in Fortran. (It works great for C!) Although there was built-in support, it was a bit flaky, and didn’t really handle the quirks of different Fortran compilers very well. I ended up wasting a lot of time trying to get it to work, when it is designed to be a tool to save you time. My conclusion: SCons is great, but the Fortran support is still flaky.

With my frustration in SCons at a peak, I embarked on the journey that brought me to this point, all in an airport lounge at SFO waiting for my flight home.

Philosophy and Design Requirements

I had many ideas about how a build tool should work when I started, a lot of them inspired by SCons. But in some cases, I wanted to go even further, or at least do things a bit differently. Here follows a list of design requirements that I had in mind for TCBuild, and some justification for them.

TCBuild should

  • Not be general purpose.
    • It should do Fortran well, and only handle enough C to get by. Java — forget it!
  • Scale to millions of lines, but also be easy to use with small programs.
  • Be very simple to install, preferably just one file.
    • Don’t want to have to have a build tool to build your build tool!
  • Favor convention over configuration, à la Ruby on Rails and friends.
    • I was willing to sacrifice generality. TCBuild chooses a reasonable convention for how projects should be laid out, and will work in any project that is structured in that way.
  • Support multiple, interdependent targets.
    • Large projects typically have many libraries and executables. TCBuild needs to handle dependencies between these targets.
  • Support multiple build configurations (eg, debug, release, parallel, serial).
  • Work on all Unix/Linux platforms.
    • Sorry Windows users. I’ve never tested TCBuild on Windows, but I assume it doesn’t work. May not take much to get it to work though.
  • Scale on multi-core systems.
  • Understand Fortran dependencies, and determine them automatically.
  • Not mix build configuration files (eg make files) with source files.
    • All configuration should be in one file in the project root. I don’t like the way make, and even scons, favor recursive builds with a configuration file in every source directory. I don’t like it for build tools, and I don’t like it for source control tools (eg. CVS and Subversion). In my view, tools should not mix directly with the source tree.
  • Have the ability to set different compile options for different groups of files, or individual files.
    • Fortran compilers have bugs. It is rare that one set of compile flags works for all files in a large program. And often you will want to set higher optimization for certain performance-critical files.
  • Consider the compile options used to compile a file when determining if it needs recompiling.
    • This idea is stolen from SCons — I find it very useful. Often you make a change to some compiler flags for a particular subset of source files, and then need to figure out which files need ‘touching’ so that they get recompiled. TCBuild stores the compile flags used for each file, and knows when they have changed and the file thus needs rebuilding.
  • Separate build products from source code, in a standalone directory.
    • Some build systems mix object files and other intermediate products through the source tree. Not good. TCBuild puts all build products in a standalone build directory in the project root.
  • Consider a file modified if its content is modified, as well as its modification date.
    • Another idea taken from SCons. This can be useful if, say, you move a file aside and temporarily replace it with some other file. When you put it back again, most build systems will not rebuild the file, because the modification date of the source file is not newer than that of the object file. TCBuild will do a checksum, and see that the file is changed.
  • Use archives in place of object files.
    • Build systems like make compare the modification date of object files to the corresponding source file to determine if a recompile is needed. This is not very robust, and results in object files being spread all over your project. TCBuild archives object files in static libraries, and stores time stamps in a separate database.
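The requirements about compile flags and content checksums boil down to one question per file: has anything that affects the object code changed since the last build? A minimal sketch of that decision follows. The function names, the choice of SHA-1, and the shape of the `db` dictionary are assumptions made for illustration; TCBuild's own database format is not described here.

```python
import hashlib


def content_checksum(path):
    """Return a hex digest of the file's bytes."""
    with open(path, "rb") as handle:
        return hashlib.sha1(handle.read()).hexdigest()


def needs_rebuild(path, flags, db):
    """Decide whether `path` must be recompiled.

    `db` maps path -> {'checksum': ..., 'flags': ...} as recorded after the
    last successful compile (a hypothetical layout).
    """
    previous = db.get(path)
    if previous is None:
        return True  # never built before
    if previous["flags"] != flags:
        return True  # compile options changed
    return previous["checksum"] != content_checksum(path)


def record_build(path, flags, db):
    """Store the state of `path` after a successful compile."""
    db[path] = {"checksum": content_checksum(path), "flags": flags}
```

Because the record lives in a separate database rather than in the object file's time stamp, swapping a file out and back in is detected by content, and changing flags triggers a rebuild without anyone having to ‘touch’ files.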

Installing TCBuild

To install TCBuild, just download it and ensure the tcbuild tool is somewhere in your path. You will have to make sure you have Python version 2.4 or later to run it. This is standard on Leopard, but on Tiger you will need to install a package, like this one, and make sure it is in your path before the system installed version of Python:

PATH="/Library/Frameworks/Python.framework/Versions/Current/bin:${PATH}"
export PATH

TCBuild Conventions

I mentioned earlier that TCBuild is based on conventions, rather than lots of configurations. That means it may not fit your existing projects. But it might, and you can probably change things a little to conform to TCBuild’s way of doing things.

TCBuild assumes all source code is below a single directory called the project root. Each target in the project must exist in a subdirectory of the project root directory. Many projects are already structured like this. For example, many projects have a directory called ‘src’ in the project root that contains all source code. These projects already conform to TCBuild’s convention (provided they only have a single target). Note that there is no restriction on how deep the directory structure of each target goes; TCBuild will recursively search subdirectories inside a target directory.

Configuring TCBuild

I suggested above that TCBuild favors convention over configuration, and that is largely true, but it is difficult to be completely configuration free. A project’s build configuration must be stored in a file called ‘buildinfo’ in the project root directory. This file contains all the information used by TCBuild to build all targets.

The easiest way to get acquainted with a buildinfo is to take a look at one. Here is the buildinfo file from the example cmc program that is supplied with TCBuild.

{
    'builddir' : '$TCBUILD_PROJECT_ROOT/build',
    'targets' : [
        {
            'name' : 'libbase',
            'rootdir' : 'base',
            'buildsubdir' : 'base',
            'libraryname' : 'base',
            'skipdirs' : ['CVS', '.svn'],
            'skipfiles' : [],
            'dependson' : [],
            'compilegroups' : {
                'safe' : ['HistogramBuilder.f90']
            }
        },
        {
            'name' : 'cmc',
            'rootdir' : 'cmc',
            'buildsubdir' : 'cmc',
            'libraryname' : 'cmc',
            'skipdirs' : ['CVS', '.svn'],
            'skipfiles' : [],
            'dependson' : ['libbase'],
            'exename' : 'cmc',
            'mainprogramfile' : 'cmc.f90',
            'compilegroups' : {
            }
        },
        {
            'name' : 'cmctests',
            'rootdir' : 'cmctests',
            'buildsubdir' : 'cmctests',
            'libraryname' : 'cmctests',
            'skipdirs' : ['CVS', '.svn'],
            'skipfiles' : [],
            'dependson' : ['libbase'],
            'exename' : 'cmctests',
            'mainprogramfile' : 'cmctests.f90',
            'compilegroups' : {
            }
        }
    ],
    'defaultconfig' : 'debug',
    'configs' : [
        {
            'name' : 'default',
            'buildsubdir' : 'default',
            'compileroptions' : {
                'archivecommand'    : 'ar -r',
                'unarchivecommand'  : 'ar -d',
                'ranlibcommand'     : 'ranlib -s',
                'f77compiler'       : 'gfortran',
                'f90compiler'       : 'gfortran',
                'f77flags'          : '-c -m32 -ffixed-form',
                'f90flags'          : '-c -m32 -ffree-form',
                'modpathoption'     : '-I',
                'ccompiler'         : 'gcc',
                'cflags'            : '-c -m32 -O3 -funroll-loops -malign-natural',
                'link'              : 'gfortran',
                'linkflags'         : '',
                'prioritylibs'      : '',
                'otherlibs'         : '',
                'compilegroupflags' : {
                    'default'       : '',
                    'safe'          : '-O0'
                }
            }
        },
        {
            'name'                  : 'release',
            'inherits'              : 'default',
            'buildsubdir'           : 'release',
            'installdir'            : '$TCBUILD_PROJECT_ROOT/bin.release',
            'compileroptions' : {
                'compilegroupflags' : {
                    'default'       : '-O3',
                    'safe'          : '-O2'
                }
            }
        },
        {
            'name'                  : 'debug',
            'inherits'              : 'default',
            'buildsubdir'           : 'debug',
            'installdir'            : '$TCBUILD_PROJECT_ROOT/bin.debug',
            'compileroptions' : {
                'f77flags'          : '-c -g -m32 -ffixed-form -fbounds-check',
                'f90flags'          : '-c -g -m32 -ffree-form -fbounds-check'
            }
        }
    ]
}

This is a standard Python data structure. On Mac OS X, it would be called a property list … in Python syntax. It defines a number of settings needed by TCBuild:

  • The directory used by TCBuild to store all intermediate build products (builddir).
  • The targets in the project. These can be libraries or executables (targets).
  • The build configurations (configs), and the default configuration (defaultconfig).
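Since the file is a plain Python literal, one safe way to read such a file is `ast.literal_eval`, which accepts dictionaries, lists, and strings but refuses to run arbitrary code. The sketch below is an illustration only: whether TCBuild reads the file this way is not stated, and `load_buildinfo` and its list of required keys are assumptions.

```python
import ast

# Top-level keys this sketch insists on (an assumption, based on the list above).
REQUIRED_KEYS = ("builddir", "targets", "configs")


def load_buildinfo(text):
    """Parse buildinfo text into a dictionary without executing any code."""
    info = ast.literal_eval(text)
    if not isinstance(info, dict):
        raise ValueError("buildinfo must contain a single dictionary")
    for key in REQUIRED_KEYS:
        if key not in info:
            raise ValueError("buildinfo is missing the %r key" % key)
    return info
```

In a real project root this might be called as `load_buildinfo(open('buildinfo').read())`.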

Many settings in the buildinfo file will perform shell expansions. The build directory is a case in point:

'builddir' : '$TCBUILD_PROJECT_ROOT/build',

The environment variable $TCBUILD_PROJECT_ROOT will be substituted when determining the path to the build directory.
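The effect of that substitution can be imitated with the standard library's `string.Template`. This is only a sketch of the idea: `expand_setting` is a made-up helper, and TCBuild may well perform fuller shell expansion than this.

```python
import os
from string import Template


def expand_setting(value, project_root):
    """Substitute $NAME and ${NAME} references in a buildinfo string.

    TCBUILD_PROJECT_ROOT is supplied explicitly; every other name is looked
    up in the process environment. Unknown names are left untouched.
    """
    variables = dict(os.environ)
    variables["TCBUILD_PROJECT_ROOT"] = project_root
    return Template(value).safe_substitute(variables)
```

With a hypothetical project root of /home/me/cmc, the builddir setting above would expand to /home/me/cmc/build.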

The list corresponding to the targets key contains target dictionaries. If the target is a library, it looks like this

{
    'name' : 'libbase',
    'rootdir' : 'base',
    'buildsubdir' : 'base',
    'libraryname' : 'base',
    'skipdirs' : ['CVS', '.svn'],
    'skipfiles' : [],
    'dependson' : [],
    'compilegroups' : {
        'safe' : ['HistogramBuilder.f90']
    }
},

name is the target’s name; rootdir is the subdirectory of the project root that holds the target’s source code; buildsubdir is a subdirectory in the build directory used to store the intermediate build product files of the target; libraryname is the name of the library archive used for the target’s object files, excluding the ‘lib’ and ‘.a’ pre- and postfixes; skipdirs is a list of directory names to skip when scanning for source files; skipfiles is a list of file names to ignore; dependson is a list of other targets that must be built before this target gets built; and compilegroups is a dictionary containing lists of files corresponding to special sets of compile flags. In the example above, safe is the name of a compile group, and it contains just the one file HistogramBuilder.f90.
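The roles of rootdir, skipdirs, and skipfiles can be illustrated with a short recursive scan. This is a sketch rather than TCBuild's implementation: `find_sources` is a hypothetical helper, and the set of file extensions it recognizes is an assumption.

```python
import os

# Extensions treated as source files (an assumption for this sketch).
SOURCE_EXTENSIONS = (".f", ".f90", ".c")


def find_sources(target_root, skipdirs=(), skipfiles=()):
    """Recursively list source files below target_root, honoring the skips."""
    found = []
    for dirpath, dirnames, filenames in os.walk(target_root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in skipdirs)
        for name in sorted(filenames):
            if name in skipfiles:
                continue
            if os.path.splitext(name)[1].lower() in SOURCE_EXTENSIONS:
                found.append(os.path.join(dirpath, name))
    return found
```

For the libbase target this would be called along the lines of `find_sources('base', skipdirs=['CVS', '.svn'])`.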

An executable target has a few extra keys:

{
    'name' : 'cmc',
    'rootdir' : 'cmc',
    'buildsubdir' : 'cmc',
    'libraryname' : 'cmc',
    'skipdirs' : ['CVS', '.svn'],
    'skipfiles' : [],
    'dependson' : ['libbase'],
    'exename' : 'cmc',
    'mainprogramfile' : 'cmc.f90',
    'compilegroups' : {
    }
},

Note that even an executable target has a libraryname setting; that’s because all object files get archived in static libraries, even for an executable.

The main difference between the library and executable target dictionaries is the pair of extra settings, exename and mainprogramfile. The exename is the name used for the resulting executable, and mainprogramfile is the name of the source file that contains the main program. (The object file of the main program will not be archived in the static library.)

Build configurations allow you to build targets with a different set of compile settings. For example, they could be used to build separate parallel, serial, and debug versions of a target. A build configuration looks like this:

{
    'name' : 'default',
    'buildsubdir' : 'default',
    'compileroptions' : {
        'archivecommand'    : 'ar -r',
        'unarchivecommand'  : 'ar -d',
        'ranlibcommand'     : 'ranlib -s',
        'f77compiler'       : 'gfortran',
        'f90compiler'       : 'gfortran',
        'f77flags'          : '-c -m32 -ffixed-form',
        'f90flags'          : '-c -m32 -ffree-form',
        'modpathoption'     : '-I',
        'ccompiler'         : 'gcc',
        'cflags'            : '-c -m32 -O3 -funroll-loops -malign-natural',
        'link'              : 'gfortran',
        'linkflags'         : '',
        'prioritylibs'      : '',
        'otherlibs'         : '',
        'compilegroupflags' : {
            'default'       : '',
            'safe'          : '-O0'
        }
    }
},

It is useful, though not compulsory, to define standard build settings in one ‘default’ configuration. This configuration never gets built directly, but is used as the basis of other configurations.

The default configuration above should be fairly self explanatory. It defines fairly standard settings, similar to settings you would see in other build tools. However, there are a couple of settings that could use some additional explanation: prioritylibs is a string used in linking to add any external libraries that should appear early in the link command, before any other libraries. Link order can sometimes be significant for resolving symbols, and that is the reason it has been provided. The compilegroupflags dictionary defines extra compile options that are applied to the files included in the corresponding groups declared earlier in the target settings. One noteworthy point is that all build configurations must have a default key in the compilegroupflags, which is used for all files that do not fall into any other group.
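To show how compile groups and compilegroupflags fit together, here is a sketch that works out the flags for a single file. `flags_for_file` is a hypothetical helper, and the mapping from file extension to flag setting is an assumption; only the key names come from the buildinfo file.

```python
import os

# Which setting supplies the base flags for each extension (assumed mapping).
FLAG_KEYS = {".f90": "f90flags", ".f": "f77flags", ".c": "cflags"}


def flags_for_file(filename, compilegroups, compileroptions):
    """Combine the base compiler flags with the file's compile group flags."""
    group = "default"  # used when the file is in no named group
    for name, files in compilegroups.items():
        if filename in files:
            group = name
            break
    extension = os.path.splitext(filename)[1].lower()
    base = compileroptions[FLAG_KEYS[extension]]
    extra = compileroptions["compilegroupflags"][group]
    return " ".join(part for part in (base, extra) if part)
```

With the default configuration shown earlier, HistogramBuilder.f90 would get the f90flags followed by -O0, while every other .f90 file would get the f90flags alone, because the default group's flags are empty.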

As mentioned, the default configuration is never built itself; it supplies default values to other build configurations. This works via an ‘inheritance’ mechanism, a bit like in object-oriented programming. The release build configuration inherits everything from the default configuration, and overrides a few settings.

{
    'name'                  : 'release',
    'inherits'              : 'default',
    'buildsubdir'           : 'release',
    'installdir'            : '$TCBUILD_PROJECT_ROOT/bin.release',
    'compileroptions' : {
        'compilegroupflags' : {
            'default'       : '-O3',
            'safe'          : '-O2'
        }
    }
},

The inherits key gives the name of the inherited configuration. Anything that appears in the derived configuration overrides the value from the inherited configuration. This works in a recursive way. For example, the release build configuration defines a compileroptions dictionary containing one compilegroupflags key. This does not mean that all the settings in the default configuration’s compileroptions dictionary will be replaced; only the specific ones provided, like the default and safe keys in compilegroupflags, will be replaced. All others will remain intact.
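The recursive override rule is easy to express as a nested dictionary merge. The following is a minimal sketch under that reading of the rule, not TCBuild's own code; `merge` and `resolve_config` are made-up names, and there is no protection against configurations that inherit from each other in a cycle.

```python
import copy


def merge(base, override):
    """Return a copy of base with override applied, recursing into dicts."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)  # merge nested settings
        else:
            result[key] = copy.deepcopy(value)  # plain values replace
    return result


def resolve_config(name, configs):
    """Flatten the configuration called `name`, following 'inherits' links."""
    by_name = {config["name"]: config for config in configs}
    config = by_name[name]
    parent = config.get("inherits")
    if parent is None:
        return copy.deepcopy(config)
    return merge(resolve_config(parent, configs), config)
```

Resolving release against the configurations above would keep f90compiler as gfortran from default, while the safe entry in compilegroupflags would become -O2.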

There is one last aspect of the release configuration that demands consideration: the installdir setting. After a successful build, TCBuild will copy any resulting executable to the install directory. You can set the same install directory for every configuration, in which case only the executables from the last configuration built will survive afterwards, with all others being overwritten, or you can use a different install directory for each configuration.

Building with TCBuild

TCBuild is straightforward to use. It must be run from the project root directory. To build all targets, with the default configuration, you can simply issue:

tcbuild

To build on a multi-core machine, you can set the number of threads using the -j option.

tcbuild -j 2

(You can also set an environment variable to do the same: TCBUILD_NUM_THREADS.)

To only build certain targets, just list them (in any order):

tcbuild cmctests cmc

Any targets that the listed targets depend upon will also be built.
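Pulling in dependencies and putting them in a valid order is a depth-first walk over the dependson lists. The sketch below shows one way it could be done; `build_order` is a hypothetical function, and the actual order TCBuild chooses may differ wherever several orders are valid.

```python
def build_order(requested, targets):
    """Return target names in an order that builds dependencies first.

    `targets` is the list of target dictionaries from buildinfo;
    `requested` is the list of target names given on the command line.
    """
    by_name = {target["name"]: target for target in targets}
    order = []
    visiting = set()  # targets on the current dependency chain

    def visit(name):
        if name in order:
            return  # already scheduled
        if name in visiting:
            raise ValueError("circular dependency involving %r" % name)
        visiting.add(name)
        for dependency in by_name[name]["dependson"]:
            visit(dependency)
        visiting.discard(name)
        order.append(name)

    for name in requested:
        visit(name)
    return order
```

For the example project, asking for cmctests and cmc would schedule libbase first, even though it was not listed.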

You can also build multiple configurations at once using the -b option.

tcbuild -b release -b debug cmctests cmc

Each listed target will be built in each configuration.

Finally, there are a few other useful options: -h for help; -d for verbose debugging output; and -c to do a clean build, in which all files are rebuilt.

Where to Now?

I am releasing TCBuild under a BSD license, so do with it as you please, as long as you don’t claim it to be your own. (The license is at the bottom of the tcbuild script.) In future, I may set up a project at Google Code to further it, or maybe I will just make intermittent releases of my personal development branch. That’ll depend on demand and interest.

