A Few Notes on Ocamlbuild, Ocamldebug, and Caml-mode

2 Jun 2007

A short writeup on programming in OCaml on Mac OS X: the use of ocamlbuild, ocamldebug, and caml-mode.

So it took me a few weeks to convert the Python version of MJB2Lite into OCaml. The Python and OCaml implementations are about 3000 lines each. While I was writing the OCaml code I also made a few improvements to the structure and design of the program. Perhaps it would have taken a little less time if I were only doing a "line-for-line" translation, but that would have been less fun. Working on an independent and unsupported project allows me to devote the extra time to do it right, rather than just meeting some arbitrary deadline. Now I'll have to worry less about performance when the complexity of some of the algorithms increases.

During the rewrite, version 3.10.0 of OCaml was released (I started out writing in version 3.09.3). I wasn't much affected by the new release (the biggest change was in Camlp4, which I haven't learned to use yet). But a new tool, ocamlbuild, described as a "compilation manager", became part of this release. With it I no longer need the Makefile that I was using to specify how the project is compiled and linked.

To learn to use ocamlbuild, one should certainly read the Ocamlbuild Users Manual. Some details are still sketchy in that document but I'm sure it'll improve over time since ocamlbuild is such a useful tool. In the meantime, some of the information I'll describe below has come from mailing list archives or is the result of experimentation.

If everything has been set up correctly, the commands

ocamlbuild mjb2lite.native

and

ocamlbuild mjb2lite.byte

will generate the native-code and byte-code executables from the main program mjb2lite.ml, respectively. Ocamlbuild determines the source files on which this main program depends and compiles/re-compiles and links them as necessary. The command

ocamlbuild mjb2lite.d.byte

will generate a byte-code executable for debugging. My mjb2lite.ml file is not in the project directory itself but in its subdirectory mjb2lite. So instead of the commands listed above, I actually need to specify its pathname in the ocamlbuild command, for example:

ocamlbuild mjb2lite/mjb2lite.d.byte

Files that implement chord and scale theory objects, harmonic analysis algorithms, etc. are organized in the directories mjb and toe. So the directory structure looks like this:

projdir/
    _tags
    load.ml
    mjb2lite/
        _tags
        mjb2lite.ml
    toe/
        _tags
        toe.mllib
        note.ml
        chord.ml
        scale.ml
        ...
    mjb/
        _tags
        mjb.mllib
        pattern.ml
        compatscales.ml
        tonalana.ml
        ...

Ocamlbuild needs to be told that the directories toe and mjb should be included in the search for modules. The _tags file in the project directory provides this information by containing the line:

<toe> or <mjb>: include

Some of the files in mjb2lite, toe, and mjb use the OCaml libraries Unix, Num, and Str. To specify this, the _tags files in these directories should contain the lines:

true: use_unix
true: use_str
true: use_nums

Instead of building a byte-code or native-code executable, it is sometimes useful to run and test code in the OCaml toplevel (I do this in XEmacs). If this code uses modules defined in toe and mjb, those modules must first be loaded into the toplevel. To do this, the files in each of these directories can be compiled and linked into a byte-code library. The files toe.mllib and mjb.mllib list the modules in their respective directories. For example, toe.mllib contains the lines:

Note
Chord
Scale
...

To build the libraries, issue the commands:

ocamlbuild toe/toe.cma

and

ocamlbuild mjb/mjb.cma

To load either of these libraries, one can use the toplevel command #load. Since this is done quite often, the file load.ml in the project directory contains the lines:

#load "str.cma";;
#load "nums.cma";;
let pp_num formatter x = Format.pp_print_string formatter (Num.string_of_num x);;
#install_printer pp_num;;
#load "unix.cma";;

#directory "_build/toe";;
#load "toe.cma";;
#install_printer Note.pp;;
#install_printer Chord.pp;;
#install_printer Scale.pp;;

#directory "_build/mjb";;
#load "mjb.cma";;

and these can be executed as a whole by the toplevel command:

#use "load.ml";;

Note that "using" the load.ml file loads the OCaml libraries Unix, Num, and Str and also sets up custom printer functions for types defined in our modules. Note also that the generated libraries are located in the directory _build, because ocamlbuild always leaves our project directory otherwise unchanged.
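
For reference, #install_printer accepts a function of type Format.formatter -> t -> unit. The following is only a minimal sketch of such a printer. The Note type shown here, its fields, and to_string are hypothetical stand-ins, not the actual contents of toe/note.ml:

```ocaml
(* Hypothetical stand-in for the real Note module; the record type and
   its fields are assumptions made for illustration only. *)
module Note = struct
  (* accidental: negative = flats, 0 = natural, positive = sharps *)
  type t = { letter : char; accidental : int }

  let to_string n =
    let acc =
      if n.accidental > 0 then String.make n.accidental '#'
      else String.make (- n.accidental) 'b'
    in
    Printf.sprintf "%c%s" n.letter acc

  (* Format.formatter -> t -> unit is the shape #install_printer expects. *)
  let pp fmt n = Format.pp_print_string fmt (to_string n)
end

(* Outside the toplevel, the same printer can be used with %a. *)
let () = Format.printf "%a@." Note.pp { Note.letter = 'F'; accidental = 1 }
(* prints F# *)
```

With a printer like this installed, the toplevel displays a note value through Note.pp rather than as a raw record.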

When a program is debugged under ocamldebug (again this can be done within XEmacs), it is useful to know the directory command. It lets one specify the directories in which ocamldebug can look for source files. If ocamldebug is started from the project directory, we can issue the commands:

directory _build/toe
directory _build/mjb

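so it can find the appropriate files when execution stops within them. The ocamldebug command source, which reads debugger commands from a file, can save you a few keystrokes here: put the directory commands in a file and source it at the start of each session.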

The final usage note on ocamlbuild concerns the target extension .inferred.mli. All my .ml files were written without first writing the .mli files. This is faster and makes more sense to me since we're using a type inference system! Since (for example) chord.ml depends on other files (such as note.ml), we cannot simply use the command ocamlc -i chord.ml to generate its interface file. However,

ocamlbuild toe/chord.inferred.mli

will correctly determine dependencies and generate an interface file in _build/toe/chord.inferred.mli. Then you can copy that file to your source directory, rename it to chord.mli, and edit out the declarations that should be hidden. Two (bash) shell script snippets will prove convenient for batch processing these interface files (the first is run from the project directory, the second from inside toe):

for f in toe/*.ml;do ocamlbuild "${f%.*}.inferred.mli";done

and

for f in ../_build/toe/*.inferred.mli;do name=$(basename "$f" .inferred.mli).mli; cp "$f" "$name";done
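
As an illustration of the editing step (removing declarations that should be hidden), here is a small self-contained sketch that uses an inline signature in place of a separate .mli file. The Chord_full module, its functions, and the representation of chords as lists of semitone offsets are all assumptions made up for this example, not the real chord.ml:

```ocaml
(* Hypothetical miniature of a chord module; all names are illustrative. *)
module Chord_full = struct
  type t = int list                  (* semitone offsets from the root *)

  (* Internal helper: reduce offsets to one octave, sorted, no duplicates. *)
  let normalize c =
    List.sort_uniq compare (List.map (fun i -> ((i mod 12) + 12) mod 12) c)

  let make offsets = normalize offsets
  let size c = List.length c
end

(* This signature plays the role of the hand-edited chord.mli:
   the line for normalize is deleted and t is made abstract. *)
module Chord : sig
  type t
  val make : int list -> t
  val size : t -> int
end = Chord_full

let () = Printf.printf "%d\n" (Chord.size (Chord.make [0; 4; 7; 12]))
(* prints 3 *)
```

Outside the module, Chord.normalize is no longer accessible and Chord.t is abstract, which is the same effect as deleting those lines from the copied chord.mli.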

Finally, I re-examined and installed the caml-mode found in the "emacs" directory of the standard OCaml distribution and have been using it ever since. I must say I cannot see that great a difference between it and tuareg-mode, which I recommended a few weeks ago. (I admit I did that because everyone was recommending it; but I suppose so much of what people write on the Web is hearsay.) I do miss the two key bindings C-x C-e and C-c C-b from tuareg-mode, but these can easily be added with a tiny bit of ELisp:

(add-hook 'caml-mode-hook
          '(lambda ()
             (define-key caml-mode-map "\C-x\C-e" 'caml-eval-phrase)
             (define-key caml-mode-map "\C-c\C-b"
               (lambda ()
                 "Eval the entire buffer"
                 (interactive)
                 (caml-eval-region (point-min) (point-max))))))

I now prefer and recommend caml-mode, if only because it comes standard with OCaml.

One last note: unlike version 3.09.3, OCaml 3.10.0 does not require any additional patches to work correctly on Intel Macs.

Category: Programming