r3 - 27 Nov 2002 - 17:17:43 - RichCawley

Please note that this is not the normative version.
The normative version is in CVS at /mygrid/doc/developer/build/build.html

A build specification for myGrid

Phillip Lord

Abstract

The following is an attempt to define standards for a build environment for myGrid.

Requirements

  • myGrid is being developed on different sites, by different people
  • Although myGrid should exist as a single entity, it is also clear that different parts of the project will be useful to other groups.
  • Other groups should be able to use parts of myGrid without getting the whole thing

From this it follows that myGrid should consist of a number of modules which are largely independent, or whose dependencies are clearly enumerated.

  • In order to have a unified build however, the different modules need to have a build system which can be unified, as well as operating "stand alone"
  • Many of the modules will share external dependencies. These should not be duplicated between modules, as this leads to download bloat and, however careful we are with version numbers, to version conflicts between external dependencies.
  • "Stand-alone" build should also allow developers to build just the subsystems within myGrid that they are actively developing, to reduce the time spent on the code-compile-test cycle. This will also reduce the disk space they need if they keep multiple working copies of their modules, as some people like to.
  • The build system should also be the common point of entry for the unit test system.
  • It may also be appropriate to use it as a common point of entry for the unit test system.
  • Likewise for deployment and installation, and possibly even for launching the system.

Although it's not a specific requirement, it appears at the moment that ant is likely to be the main build tool. This will work so long as we only develop in Java. My experience suggests that ant is clunky when used for other languages.

Overall Design

The best way to address these issues is to define a single unified directory structure which all of the projects should follow. Each project should also design a single build file with standard entry points (targets). These individual build files can then be directed "from above" by a top level build file to perform a unified build. Here I call the individual build files "module build files", and the top level build file "the main build file".
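As a minimal sketch of this arrangement, the main build file could look something like the following. The module names module-a and module-b are hypothetical, and module-b is assumed to depend on module-a. The ant task runs the build.xml it finds in each module directory.

```xml
<!-- Main build file (sketch). module-a and module-b are hypothetical
     module names; module-b is assumed to depend on module-a. -->
<project name="mygrid" default="compile" basedir=".">

  <!-- Each top level target directs the same standard target in every
       module build file, listed explicitly in dependency order. -->
  <target name="compile">
    <ant dir="module-a" target="compile"/>
    <ant dir="module-b" target="compile"/>
  </target>

  <target name="test" depends="compile">
    <ant dir="module-a" target="test"/>
    <ant dir="module-b" target="test"/>
  </target>

  <target name="clean">
    <ant dir="module-a" target="clean"/>
    <ant dir="module-b" target="clean"/>
  </target>

</project>
```

Because each module build file is a complete ant project in its own right, running ant inside module-a alone still works without the main build file.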

This addresses a number of requirements. The projects maintain independence in their build environment, but can still be addressed in a unified way. It will also mean that if external users want to build several different subsystems of myGrid, but not the whole thing, then they have a unified interface to do so. This clearly also addresses subsystem build.

I considered suggesting just a unified build, and not also a unified directory structure. However there are several reasons for arguing against this. Firstly, simple familiarity will make things easier for developers working on more than one package. Secondly, some tasks needed for the unified build will need to operate directly on the directory structure, rather than through the module build files. For example, driving a "javadoc" target through module build files would result in a series of javadoc directories, which would not be hyperlinked to each other. If we want these hyperlinks we have to invoke Javadoc on all of the source files simultaneously (and hope the machine has enough memory). I can see no way around this other than a unified directory structure.
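A sketch of such a unified javadoc target, relying only on the directory structure, might look like this. It assumes every module keeps its source under module-n/src and that external jars live in the top level ext/ directory. The 256m memory figure is an arbitrary illustration.

```xml
<!-- Unified javadoc (sketch). Assumes each module keeps its source in
     <module>/src and that ext/ holds the external jar files. -->
<project name="mygrid-javadoc" default="javadoc" basedir=".">

  <target name="javadoc">
    <mkdir dir="javadoc"/>
    <!-- One javadoc run over all modules, so that the generated pages
         can hyperlink to each other. maxmemory is an arbitrary value. -->
    <javadoc destdir="javadoc" maxmemory="256m">
      <fileset dir="${basedir}">
        <include name="*/src/**/*.java"/>
      </fileset>
      <classpath>
        <fileset dir="ext" includes="*.jar"/>
      </classpath>
    </javadoc>
  </target>

</project>
```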

From a secondary, non-build point of view, this will also have an advantage when using tools. While developing the "ontologytools" package, for instance, we were all using Emacs-JDE, and I was able to write a "project file" which set up all the options appropriately. It would be easy to share this. The same would be true where any two (or more) developers shared an IDE. (See footnote 1.)

The main disadvantage with this system is that we will end up with considerable code duplication between the module build files. It may be possible to get around this by providing a default module build file, with appropriate hooks and settable properties. On the other hand it may not. My suspicion is that it won't.

The second disadvantage is that the compiler will not just work out dependencies for us, and we will have to build modules in the correct order. This is addressed later, under "Automated Top Level Build".

Directory Structure

There are any number of potential directory structures, but most of the projects that I have seen are broadly similar. The main consequences of the directory structure relate to the build files, and hopefully no one will care enough about it to find it worth disagreeing.

The basic directory structure that I am proposing for myGrid is shown in the following table. The table mixes the final directory structure with the directory structure in the CVS. Those directories marked with † should be present in the CVS; those without will be generated directories.

| Directory or File Name | Usage |
| ext/ | All external dependencies should be found here. |
| Licenses/ † | All the license documents should be here. There may be quite a few of these, as external dependencies will need them. |
| doc/ † | Top level documentation which applies to all modules. Would probably include stuff like this document and general help documentation. Also probably an "AUTHORS" document. |
| javadoc/ | Unified Javadoc repository, created from all of the source. |
| build/ | Used for temporary files during the build. What this includes will depend on the "build mode". |
| build/test-results/ | JUnit test results should go in here. |
| build/classes/ | Generated class files. |
| lib/ | For jar files. What this includes will depend on the "build mode". |
| samples/ † | Examples which run and do cute things. |
| module-n/ | The various modules that we need to do things. |

Below this directory structure we will have the various modules which actually do things. Each module has the following structure.

| Directory or File Name | Usage |
| src/ | All source code should be placed in here. By source we mean anything that is needed to generate the functional software, and nothing else. This excludes documentation, and also "source" code that is actually built from something else. |
| build/ | The place where everything that is generated during the build process goes, which is not required for the final software. Obvious candidates would include .class files. Less obvious would be generated .java files. |
| lib/ | The place where all generated files needed to run the system will go. In general this should include only jar files. |
| ext/ | When packaged stand-alone, external dependencies should go here. These should be copied out from the top level ext/ directory. |
| samples/ | Examples which run and do cute things. |

Standard Targets

There will be a number of standard build targets that each module build file will need.

| Target | Depends | Usage |
| compile | none | Compile everything that needs to be compiled for functional code. Not documentation. Compile means generate .class files, and not .jar files. |
| doc | none | Generate any documentation which needs generating. |
| javadoc | none | Generate javadoc. |
| jar | compile | Package everything up into .jars in the lib/ directory. |
| dist | jar | Generate all of the distribution files, both zip and tar.gz. |
| test | compile | Run complete test sets on files in build/ directory. |
| clean | none | Delete everything that is generated. This should return the system to what appears out of the CVS. |
| deploy | compile | Create the web apps structure within the build. |
| install | deploy | Install webapps to tomcat, using the admin upload client. There are also reload, remove, and list targets using the ant tasks shipped with tomcat. |

A couple of these targets could probably do with some further subdivision. In particular "clean" probably needs a separate "clean_classes", which I've found very useful during development. Likewise "dist" could depend on "dist-zip" and "dist-tgz".
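To make the standard targets concrete, here is a rough sketch of a module build file. It is not a canonical version. The dist/ output directory, the module-n file names, and the convention that test classes are named *Test are all assumptions, and the junit task needs JUnit available to ant. The doc, javadoc, deploy and install targets are left out for brevity.

```xml
<!-- Module build file (sketch). dist.dir, the module-n file names and
     the *Test naming convention are assumptions, not part of the spec. -->
<project name="module-n" default="compile" basedir=".">

  <property name="src.dir"   value="src"/>
  <property name="build.dir" value="build"/>
  <property name="lib.dir"   value="lib"/>
  <property name="ext.dir"   value="ext"/>
  <property name="dist.dir"  value="dist"/>

  <!-- External dependencies; ext/ must exist and hold the jar files. -->
  <path id="compile.classpath">
    <fileset dir="${ext.dir}" includes="*.jar"/>
  </path>

  <target name="compile">
    <mkdir dir="${build.dir}/classes"/>
    <javac srcdir="${src.dir}" destdir="${build.dir}/classes"
           classpathref="compile.classpath"/>
  </target>

  <target name="jar" depends="compile">
    <mkdir dir="${lib.dir}"/>
    <jar destfile="${lib.dir}/module-n.jar" basedir="${build.dir}/classes"/>
  </target>

  <target name="dist-zip" depends="jar">
    <mkdir dir="${dist.dir}"/>
    <zip destfile="${dist.dir}/module-n.zip" basedir="${basedir}"
         includes="build.xml,src/**,lib/**,ext/**"/>
  </target>

  <target name="dist-tgz" depends="jar">
    <mkdir dir="${dist.dir}"/>
    <tar destfile="${dist.dir}/module-n.tar.gz" compression="gzip"
         basedir="${basedir}" includes="build.xml,src/**,lib/**,ext/**"/>
  </target>

  <target name="dist" depends="dist-zip,dist-tgz"/>

  <target name="test" depends="compile">
    <mkdir dir="${build.dir}/test-results"/>
    <junit printsummary="yes" haltonfailure="no">
      <classpath>
        <pathelement location="${build.dir}/classes"/>
        <path refid="compile.classpath"/>
      </classpath>
      <formatter type="xml"/>
      <batchtest todir="${build.dir}/test-results">
        <fileset dir="${build.dir}/classes" includes="**/*Test.class"/>
      </batchtest>
    </junit>
  </target>

  <!-- Removes only the class files, leaving other generated files. -->
  <target name="clean_classes">
    <delete dir="${build.dir}/classes"/>
  </target>

  <!-- Removes everything generated by the targets above. -->
  <target name="clean">
    <delete dir="${build.dir}"/>
    <delete dir="${lib.dir}"/>
    <delete dir="${dist.dir}"/>
  </target>

</project>
```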

Build Modes

I think we will need two different build modes. The first will be "modular", which will build a module on its own, within its own directory structure. The second will be "unified", which will build the whole system in a unified manner. Essentially this means redirecting the class path and target directories from "module-n/build" to the top level "build" directory, and so on.

I'm not convinced we need the unified mode at the moment. It might be nice to have a distribution which presents myGrid in a single build environment, and it would probably be reasonably easy to achieve. But it might not be, and it's probably not worth a lot of hassle.

Either way it can be achieved with a unified use of ant properties.
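One way this might work, sketched under the assumption that each module build file defines its directories through properties named build.dir, lib.dir and ext.dir (my names, not agreed ones): properties passed as nested elements of the ant task override the module's own defaults, so the main build file can redirect a module's output without the module build file changing at all. The module name module-a is hypothetical.

```xml
<!-- Unified mode (sketch). Assumes module build files read build.dir,
     lib.dir and ext.dir; these property names are not yet agreed. -->
<project name="mygrid-unified" default="compile-unified" basedir=".">

  <!-- "location" resolves these against the top level directory,
       giving absolute paths that are safe to hand to a module. -->
  <property name="top.build.dir" location="build"/>
  <property name="top.lib.dir"   location="lib"/>
  <property name="top.ext.dir"   location="ext"/>

  <target name="compile-unified">
    <!-- Nested properties override the defaults set inside the
         module build file, redirecting its output directories. -->
    <ant dir="module-a" target="compile">
      <property name="build.dir" value="${top.build.dir}"/>
      <property name="lib.dir"   value="${top.lib.dir}"/>
      <property name="ext.dir"   value="${top.ext.dir}"/>
    </ant>
  </target>

</project>
```

In modular mode the same module build file is simply run directly, and its own default values apply.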

Ant Properties

It may be nice to give individual developers the option of altering certain properties of the ant files without editing the module build files themselves. I use this facility to set up various ant properties, for instance setting "build.compiler" to jikes (which is faster during development), and "build.compiler.emacs" to true, which makes jikes emit emacs-parsable error messages (which is the other reason I use jikes, because it gives column numbers). I normally achieve this by loading a properties file from ".ant/module-name.props". With this technique you can also subvert the build in fairly major ways, which can be very useful under some circumstances (for instance, keeping source in one backed-up directory, and sending all generated files somewhere else entirely). This is easy to support in the build files.
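A sketch of how a module build file could support this follows. It assumes the properties file lives under the developer's home directory, which the text above does not actually specify. It works because ant properties cannot be redefined once set, so whatever the developer's file defines takes precedence over the defaults that follow it.

```xml
<!-- Developer overrides (sketch). The location under ${user.home} is
     an assumption. The file might contain, for example:
         build.compiler=jikes
         build.compiler.emacs=true
         build.dir=/scratch/module-n/build                            -->
<project name="module-n" default="show-settings" basedir=".">

  <!-- Load overrides first. A missing file is silently ignored. -->
  <property file="${user.home}/.ant/module-n.props"/>

  <!-- Defaults, used only where the developer's file set nothing. -->
  <property name="build.dir" value="build"/>

  <target name="show-settings">
    <echo message="build.dir is ${build.dir}"/>
  </target>

</project>
```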

On documentation

We have not got an agreed standard for generating documentation. As with IDEs, it's probably not a good idea to try to get one. My own feeling is that we should include both the source of the documentation, and a unified presentation format, which will be PDF. Sub-directories should be created where this is appropriate for the format (LaTeX, docbook or linuxdoc, and so on). Once we have a specified build platform, we should also include documentation builds in the main build, although, for end user convenience, pre-built copies should be in the download!

Automated Top Level Build

In one sense it would be really nice to have an automated top level build file, which does everything based purely on the directory names below it.

I am not convinced that this is possible at the moment. As noted previously, we have to build modules in the correct order because of dependencies between them. At first sight this is a disadvantage of the recursive build strategy. However it has one advantage. We have to be explicit about these dependencies, so we will know what they are, and when they are inappropriate. We will also not be able to support circular dependencies (a depends on b, b depends on a), because one of them will have to be built first. My own feeling is that if two modules are intertwined in this way, they should actually be one module.

If we cannot support an automated top level build, we should probably have an automated warning system, which checks that the list of modules the build system has is the same as the directories that exist. That way we will know if we have missed something, but will not have to find a way to define dependencies automatically.

The module build files should be considered to be part of the test suite run from the top level build file. The top level build file should check whether the module build files implement all the targets that they should.
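A rough sketch of such a warning target is given below. The list of known directories is hypothetical and would have to be kept in step with the main build file by hand. It only catches directories that exist but are not listed, not listed modules that are missing.

```xml
<!-- Module list check (sketch). known.dirs is a hypothetical,
     hand-maintained list of the directories the build knows about. -->
<project name="mygrid-check" default="check-modules" basedir=".">

  <property name="known.dirs"
            value="module-a,module-b,ext,Licenses,doc,javadoc,build,lib,samples"/>

  <target name="find-unlisted">
    <!-- Collect the top level directories that are not in known.dirs.
         unlisted.dirs stays unset when there are none. -->
    <pathconvert property="unlisted.dirs" pathsep=", " setonempty="false">
      <path>
        <dirset dir="${basedir}" includes="*" excludes="${known.dirs}"/>
      </path>
    </pathconvert>
  </target>

  <target name="check-modules" depends="find-unlisted" if="unlisted.dirs">
    <echo level="warning"
          message="Directories not known to the build: ${unlisted.dirs}"/>
  </target>

</project>
```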

Websites

Some services will require publication of web documents on tomcat. There does not appear to be an automatic way of doing this at the moment. I suggest all web documents go in "ROOT/module-n-site" where "module-n" is the name of the module. This should avoid name clashes.

At the current time there appears to be no easy solution to the problem of file permissions. The prepare target should try to create the directory, and if that fails (as it will under Unix), the developer will have to create that directory by hand as root, and then do a chown. This should be a one-off setup cost.

Axis

Axis doesn't seem to provide an easy install client, as tomcat does, so jar files will have to be copied into the "AXIS_HOME/lib" directory. We can avoid problems here with a few rules. No jar files should be placed in that lib directory unless they are created by the build (in which case they should be named after the module), or unless they are external dependencies identified by the gather target. This way, even if one module overwrites a file placed by another, we can guarantee that it's the same file. For a unified build even this should not be necessary, as the files should all be up to date.
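As an illustration of these rules, a copy step might look like the sketch below. The axis.home property, the install-axis target name and the module-n jar naming are my assumptions. I also assume the external dependencies identified by the gather target end up in the module's ext/ directory, and that the jar target has already been run.

```xml
<!-- Axis install (sketch). axis.home, the target name and the jar
     naming are assumptions. Run the module's jar target first. -->
<project name="module-n-axis" default="install-axis" basedir=".">

  <target name="install-axis">
    <fail unless="axis.home"
          message="Set axis.home, e.g. ant -Daxis.home=/path/to/axis"/>
    <copy todir="${axis.home}/lib">
      <!-- Jars created by this build, named after the module. -->
      <fileset dir="lib" includes="module-n*.jar"/>
      <!-- External dependencies, assumed to be gathered into ext/. -->
      <fileset dir="ext" includes="*.jar"/>
    </copy>
  </target>

</project>
```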

File permission problems also arise here. The lib directory will have to be readable and writable by the user running ant. This is not great, but I can see no other solution.

Who does what?

There is an issue here of who does what, and who is responsible for maintenance of which bits. My own feeling is that this should be as follows...

The top level build file should be the responsibility of Work Package 8. They should also generate a canonical module build file which other work packages can steal. They should also be responsible for investigating the possibility of a unified module build file with appropriate hooks.

If we cannot do unified module build files, individual module developers should be entirely responsible for their own.

Outstanding Issues

There are a number of outstanding issues at the moment.

  • Properties...I've not made suggestions for these yet, but we will require a number of specified ant properties in all build files.
  • Version numbers on external dependencies. Good idea or not?
  • Installation targets. The tomcat "install" target is fine for installing temporarily, but we might also want a more permanent installation.
  • Source gathering targets. We might want targets to gather source from CVS, current head, or tagged, and so on. My own feeling is that we should use an external "driver" build file to do this. For an overall myGrid build this would make three different levels of ant files, which is somewhat scary.
  • Lots of naming issues. Project names (within ant), names of jar files, names of module directories and so on.
  • Webapps directory structure. Do we need something beyond "lib"?

Machine build requirements

There are a number of requirements that a machine will need before the build will work. These are...

  • A tomcat installation
  • A tomcat user, "username" with password "password", with admin, manager and tomcat roles.
  • An axis installation with readable/writeable lib/ directory.
  • The relevant readable/writeable directory in tomcat for web files.
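For reference, the tomcat user is what the install target authenticates as. A sketch of that target follows, assuming the Tomcat 4.1 ant tasks from catalina-ant.jar are on ant's classpath. The manager URL, the webapp path and the build/webapp directory are hypothetical values.

```xml
<!-- Tomcat install (sketch). Assumes the Tomcat 4.1 tasks from
     catalina-ant.jar; the URL and paths below are hypothetical. -->
<project name="module-n-install" default="install" basedir=".">

  <property name="manager.url"     value="http://localhost:8080/manager"/>
  <property name="tomcat.username" value="username"/>
  <property name="tomcat.password" value="password"/>
  <property name="webapp.path"     value="/module-n"/>

  <taskdef name="install" classname="org.apache.catalina.ant.InstallTask"/>
  <taskdef name="reload"  classname="org.apache.catalina.ant.ReloadTask"/>
  <taskdef name="remove"  classname="org.apache.catalina.ant.RemoveTask"/>
  <taskdef name="list"    classname="org.apache.catalina.ant.ListTask"/>

  <!-- Installs the web app structure created by the deploy target. -->
  <target name="install">
    <install url="${manager.url}" username="${tomcat.username}"
             password="${tomcat.password}" path="${webapp.path}"
             war="file://${basedir}/build/webapp"/>
  </target>

  <target name="reload">
    <reload url="${manager.url}" username="${tomcat.username}"
            password="${tomcat.password}" path="${webapp.path}"/>
  </target>

  <target name="remove">
    <remove url="${manager.url}" username="${tomcat.username}"
            password="${tomcat.password}" path="${webapp.path}"/>
  </target>

  <target name="list">
    <list url="${manager.url}" username="${tomcat.username}"
          password="${tomcat.password}"/>
  </target>

</project>
```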

Conclusions

I believe the build system suggested fulfils all of the requirements that I have stated at the beginning of the document. In this version the requirements have come largely from my head, and so are a point for discussion.

The build system also does not appear to be that complex, so should be easy enough to implement. The system described here as the module build files was implemented initially in a few hours by Sean for oiled (which is now in the "ontologytools" section of the CVS), and has probably had a couple more hours' work from myself and Angus during subsequent development.

The module build system is also fairly similar to other standard pieces of software like xerces and xalan, and bears a significant resemblance to the GNU autoconf/automake style of build, so should be easy to understand.

The top level build system should also not be too complex, mostly just invoking module ant files in the appropriate order. It may be slightly more complex if we go for a "unified build" mode.

Footnotes

1 It might be worth doing a survey of the developers to find out who uses what IDE. If there are a few main candidates, then it may be worth providing resources to support one or two within myGrid. Our experience with the ontologytools suggested that this saved time and effort over everyone having to do it independently. I am not suggesting here mandating a single IDE, which is likely to be greeted with a stony silence from many quarters. Including me, unless it's the one I use, of course.

-- RichCawley - 22 Oct 2002
