THE RAZORBACK UTILITY LIBRARY
by András Aszódi
Novartis Forschungsinstitut GmbH
Brunnerstrasse 59
A-1235 Vienna, Austria, Europe
Version 2.0
Introduction
The Razorback library is hardly more than a haphazard collection of useful
software modules that I have accumulated over the years. The collection has
grown organically, and its composition reflects my scientific interests and
the projects I have been working on at the Novartis Forschungsinstitut in
Vienna since 1996. I do not pretend that the
routines you will find here are optimal in any sense, but they do their
job reasonably well. Feel free to use and modify them.
The Razorback collection consists of C and C++ sublibraries. Although
you can use the C library in your C++ programs, most of the C routines
have already been ported to the C++ library. These ports often contain
significant enhancements. My recommendation is not to program in C unless
absolutely necessary.
The sublibraries are documented separately. Follow the links below to
get to the detailed documentation (generated by Doxygen).
C Libraries Overview
-
Linear algebra: provides
vector and matrix structures that attempt to make C a bit object-oriented.
Only rectangular and lower triangular (symmetric) matrices are provided.
All linear algebra routines operate on double-precision numbers. Routines
for solving linear equations are provided.
-
Statistics: estimation of distributions
via histograms, simple one-way and two-way statistics, statistical tests,
linear and nonlinear parameter estimation.
-
Bioinformatics: this is just
a buzzword here because no sequence-processing routines are offered. It
is rather structural biology (another pompous buzzword): a PDB read/write
module, a DSSP reader and the GOR-III secondary structure prediction algorithm
are provided.
-
Miscellaneous: everything
else. File and directory manipulation, time stamps, command-line processing.
C++ Libraries Overview
-
Linear algebra: vector and
matrix classes. These are implemented as templates so that you can work
e.g. with complex matrices if you wish. Most people would just use the
double-precision instantiations. This is a very small collection of linear
algebra routines; you would expect much more from a real package. In any
case, you get linear equation solvers, SVD, real symmetric matrix diagonalisers
(both QL all-eigenvalues/all-eigenvectors and a specialised class that
can be used to get only subsets of eigenvalues and/or eigenvectors).
-
Statistics: offers a fairly
complete class hierarchy for representing distributions (both analytically
known and empirically estimated). The parameter estimation classes enhance
the capabilities of the C version considerably; in particular, the SVD-based
linear regression and the orthogonal polynomial fit are worth mentioning.
A simple one-way ANOVA class is also provided.
-
Bioinformatics: manipulation
of macromolecular sequences. Can read and write a variety of formats, but
there are no guarantees :-). A class representing multiple sequence alignments
is also provided.
-
POSIX thread wrappers:
makes life with multithreaded programs easier. Provides thread launchers
and a job queue that links "producer" and "consumer" threads. All POSIX
synchronisation primitives with the exception of read/write locks are implemented.
This is the least portable part of the Razorback library: first, the architecture
must support POSIX threads, and second, the thread launcher checks the
number of available CPUs on your system, which is machine-dependent.
-
Exceptions: these get thrown
by the other sub-libraries if something goes wrong. The Razorback library
has its own exception hierarchy, which is independent of the hierarchy
defined by the C++ standard (see <stdexcept>). To catch Razorback exceptions,
declare the catch parameter as a Utilsexc_& object reference.
-
Miscellaneous: useful
bits (such as a bit vector class, no pun intended :-) ) and pieces.
Licensing
This software is distributed as Novartis Open Source which essentially
amounts to granting a license similar to the GPL. IMPORTANT:
refer to the LICENSE file for the precise legal terms and conditions.
Implementation
Source
The library contains C and C++ modules. The C part is written in ANSI C,
the C++ part is in something that is intended to be as close to ANSI C++
as possible. Given the generic nature of the libraries, no machine-dependent
features are used.
Supported platforms

SGI
The Razorback Library was mostly developed on SGI machines. There is a
bewildering variety of ABI and MIPS instruction set combinations: the
Makefiles are configured for the -n32 -mips3 ("irix32") and the
-64 -mips4 ("irix64") ABIs. You will want at least an R4000 running
IRIX 6.2 (IRIX 6.5 recommended). Compiling for lowlier architectures or
older IRIX versions is theoretically possible but not recommended. The
installation script requires an IRIX version >=6.2.

PC Linux
I have been developing actively for Linux since 1998. That was the year when the G++
2.8.1 compiler became available, and it made the reliable and simple instantiation
of C++ templates possible at last. That release also fixed a number of
annoying bugs that had made C++ development under Linux anything but enjoyable
before. The Razorback libraries can be compiled under any 2.x kernel, but
the GNU compiler must be version 2.95 or later.

Alpha Tru64 UNIX
The Razorback library was ported to Tru64 UNIX V5.1, with the Compaq C++
compiler V6.3. This is the only C++ compiler I have access to that actually
conforms to the ANSI C++ standard, more or less. It is also quite picky and
has often helped me find bugs that went unnoticed under IRIX or Linux.
Other architectures
Porting the Razorback Library to another platform running a UNIX variant
should be straightforward, especially if the GNU C/C++ compiler supports
the platform. Ideally, only the platform-dependent Makefile includes have
to be changed to cope with the idiosyncrasies of compiler and linker command-line
arguments. The platform must support shared libraries and POSIX threads.
I do not know if the library could be ported to non-UNIX architectures
(e.g. Windows or MacOS) because I have never had to do professional work on
these platforms.
Installation
-
Prerequisites: you will need a C and a C++ compiler.
-
Obtain the Razorback source archive razorback_x.y.tar.gz
where x and y
are the major and minor version numbers. Create a razorback top directory
(for example, "razorback") and
copy the tarfile there. Extract the contents: I am not telling you how
to do this :-) Your directory will now contain a directory called "razorback_x.y".
Change to this directory "razorback/razorback_x.y"
now. It should contain the following subdirectories:- admin,
doc, include, cc, c and the files INSTALL,
LICENSE, Makefile.
-
Invoke the script admin/configure.sh.
This is a Bourne shell script that figures out your architecture, and asks
you a few questions. In particular, you have to specify an already existing
directory <ARCHDIR>.
In this directory a new subdirectory <ARCHDIR>/<ARCH>/lib
will be created (where <ARCH>
is your architecture such as "irix64")
that contains the static libraries. The dynamic libraries are put into
<ARCHDIR>/<ARCH>/lib/shared.
The configure script generates a platform-specific file Makefile.defs
that contains the macros needed for compilation and/or installation. Have
a look at these to get an impression of how difficult it is to write portable
software...
-
Next, compile the library by typing 'make
compile'. On multiprocessor architectures that have a parallel make
utility, the compilation will be done in parallel.
-
Install the static libraries and dynamic shared objects (DSOs) to their
final location defined in the configuration step by invoking 'make
install'. This step actually performs the compilation, too,
if it was not done before.
-
When compiling against the Razorback libraries, you should specify the
header directory "razorback/razorback_x.y/include":
this is not moved during installation. Additionally, specify the appropriate
directory for the linker using the -L option.
-
If you wish to link your programs against the Razorback DSOs, you also
have to set your LD_LIBRARY_PATH
environment variable accordingly. The installation script generates a C
shell script
<ARCH>/ldpath.csh
that you can use for setting LD_LIBRARY_PATH:
just source it from your .login
file. Some combinations of the TC-shell and the KDE desktop environment
under Linux do not invoke .login,
which is positively silly: as a workaround, source the script from your .cshrc
file instead.
-
The Razorback user guide is located under "razorback/razorback_x.y/doc/index.html".
Bookmark it now!
You are done! Enjoy programming with the Razorback.
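The generated ldpath.csh is meant for C shells. For Bourne-type shells (sh, bash, ksh), a rough equivalent might look like the sketch below. The function name razorback_ldpath is made up for this illustration, and the sketch assumes only that the DSOs live in <ARCHDIR>/<ARCH>/lib/shared as described above; the actual contents of ldpath.csh may differ.

```sh
# Hypothetical helper: prepend the Razorback DSO directory to LD_LIBRARY_PATH.
#   $1 = <ARCHDIR>, $2 = <ARCH>, both as chosen in the configuration step.
razorback_ldpath() {
    shared_dir="$1/$2/lib/shared"
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        # Keep whatever was already on the path, after our directory.
        LD_LIBRARY_PATH="$shared_dir:$LD_LIBRARY_PATH"
    else
        LD_LIBRARY_PATH="$shared_dir"
    fi
    export LD_LIBRARY_PATH
}
```

Put the function in your .profile and call it there, for instance as 'razorback_ldpath /my/archdir irix64' (the directory is a placeholder).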
Programming
Header Files
Symbolic links to all header files can be found in the "razorback_x.y/include"
directory. The C header files have all-lowercase names and a "*.h"
extension (e.g. "svd.h"). The C++
header files always begin with an uppercase letter and have a "*.hh"
extension (e.g. "Svd.hh"). Both
the C and C++ libraries are wrapped into the namespace 'RazorBack', so
if you link them against C++ code, do not forget to use either the 'using
namespace RazorBack' directive or prefix each name from the library
with 'RazorBack::'. When compiling
a program against the Razorback library, tell the compiler about the header
file location using the "-Irazorback_x.y/include"
switch.
Linking
Link statically against the C library by specifying the options "-L<ARCHDIR>/<ARCH>/lib
-lrazorback" to the compiler. Similarly, link statically against
the C++ library by using the switches "-L<ARCHDIR>/<ARCH>/lib
-lRazorBack" (note the spelling!).
If you want to link against the shared libraries, specify "-L<ARCHDIR>/<ARCH>/lib/shared
-lrazorback" and "-L<ARCHDIR>/<ARCH>/lib/shared
-lRazorBack" for the C and C++ libraries, respectively. On some
architectures you might need other compiler options for dynamic linking;
please consult the relevant manpages.
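The four combinations above are easy to mix up, so here is a small shell sketch that assembles the linker options from the pieces. The function name razorback_ldflags and its argument layout are invented for this example; only the directory layout and the library names come from the text above.

```sh
# Hypothetical helper: print the linker options for one Razorback library.
#   $1 = <ARCHDIR>, $2 = <ARCH>
#   $3 = "c" or "cxx"          (which library)
#   $4 = "static" or "shared"  (which kind of linking)
razorback_ldflags() {
    case "$3" in
        c)   lib=razorback ;;   # C library: all lowercase
        cxx) lib=RazorBack ;;   # C++ library: note the spelling
        *)   echo "unknown library: $3" >&2; return 1 ;;
    esac
    case "$4" in
        static) dir="$1/$2/lib" ;;
        shared) dir="$1/$2/lib/shared" ;;
        *)      echo "unknown link mode: $4" >&2; return 1 ;;
    esac
    printf '%s\n' "-L$dir -l$lib"
}
```

The output can be pasted into a compiler command line together with the "-Irazorback_x.y/include" switch, or substituted with $(razorback_ldflags ...) in shells that support command substitution.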
ANSI C++ issues
-
SGI: Do NOT use the '-LANG:std'
flag on the SGI because the "new" iostream libraries are still broken.
-
Alpha: you can achieve a kind of ANSI C++ conformance by specifying the
flags '-std ansi -D_STANDARD_C_PLUS_PLUS
-D__USE_STD_IOSTREAM'. The '_STANDARD_C_PLUS_PLUS'
macro comes from the SGI compiler actually (defined by -LANG:std)
and indicates that the new-style C++ headers (without the .h extension)
are to be used. The Alpha-specific macro '__USE_STD_IOSTREAM'
enables the "new" iostream libraries. When compiling with these flags,
you'll get a warning on the Alpha:- "cxx:
Warning: /usr/include/stdio.h, line 263: The "extern_prefix" pragma is
not available on this platform. Pragma is ignored." This warning
can be ignored as well.
-
G++/Linux: Before G++ Version 3.0, the C++ standard libraries were incomplete.
If you have G++ 2.95.3, then you must not define __USE_STD_IOSTREAM.
With Version 3.0 and above, you may use the ANSI C++ iostreams the same
way as with the Alpha architecture. The Razorback configuration script
takes care of this automatically.
C++ template instantiation
This is a thorny issue as no compiler writer seems to get it right. General
recommendation:- try to compile your programs as if you were using the
"Borland model". The Razorback template declarations are in *.hh
files, and the template definitions are in *.cc
files. Both are available in the include
directory. The template headers should include the template definition
files: ALWAYS define the macro INCLUDE_TMPL_DEFS
on the compiler command line. Additionally, switch off automatic template
instantiation: on the SGI platform, use the switch -no_auto_include,
on the Alpha use -noimplicit_include.
Wonderfully enough, G++ under Linux needs no additional flags.
SGI warning:- the linker may have problems
with templates instantiated more than once and may issue warnings that
certain functions with horrible mangled names were defined twice. Ignore
these.
Alpha C++ warning:- this compiler uses
repositories for template instantiation. The Razorback C++ library has
its repository under <ARCHDIR>/alpha/lib/cxx_repository. When
linking, you have to tell the compiler where to look for these files: you
have to specify the switches '-ptr ./cxx_repository
-ptr <ARCHDIR>/alpha/lib/cxx_repository'. The
first repository will contain the instantiation records for your program
and must be writable; the second is the Razorback repository. If you link
against other C++ libraries with templates, you have to list those repositories
as well. See the Compaq C++ manual for the gory details :-(
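The template-related switches described in this section can be collected per platform roughly as in the sketch below. The function names and the platform labels "sgi", "alpha" and "linux" are chosen for this illustration only; the flags themselves are the ones listed above.

```sh
# Hypothetical helper: print the template-related compile flags.
#   $1 = platform label: "sgi", "alpha" or "linux"
razorback_tmpl_cflags() {
    flags="-DINCLUDE_TMPL_DEFS"      # always defined, on every platform
    case "$1" in
        sgi)   flags="$flags -no_auto_include" ;;
        alpha) flags="$flags -noimplicit_include" ;;
        linux) ;;                    # G++ needs no additional flags
        *)     echo "unknown platform: $1" >&2; return 1 ;;
    esac
    printf '%s\n' "$flags"
}

# Hypothetical helper: print the extra link-time repository switches.
# Only the Alpha compiler needs them; other platforms get an empty line.
#   $1 = platform label, $2 = <ARCHDIR>
razorback_tmpl_ldflags() {
    case "$1" in
        alpha) printf '%s\n' "-ptr ./cxx_repository -ptr $2/alpha/lib/cxx_repository" ;;
        *)     printf '\n' ;;
    esac
}
```

Repositories of any other template libraries you link against would still have to be appended by hand.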
Thread safety
The libraries are compiled so that they use the reentrant system routines
and are POSIX thread-aware. One needs to set some mysterious flags before
using threads. Here is the list without any explanations; you just
have to take my word for it that it works :-)
-
SGI: '-D_POSIX_C_SOURCE=199506L -D_PTHREADS
-D_XOPEN_SOURCE'
-
Linux: '-D_REENTRANT'
-
Alpha: '-pthread -D_XOPEN_SOURCE=500'
If you use these flags, some not strictly ANSI functions may
disappear from scope; routines from the math library are the usual
victims. You then have to play around with those; reading /usr/include/math.h
and standards.h is recommended
:-).
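If you build on more than one platform, the thread flags listed above can be selected in one place, for example with a shell function like the following. This is only a sketch: the name razorback_thread_flags and the platform labels are hypothetical, and the flags are copied verbatim from the list.

```sh
# Hypothetical helper: print the thread-related compiler flags.
#   $1 = platform label: "sgi", "linux" or "alpha"
razorback_thread_flags() {
    case "$1" in
        sgi)   printf '%s\n' "-D_POSIX_C_SOURCE=199506L -D_PTHREADS -D_XOPEN_SOURCE" ;;
        linux) printf '%s\n' "-D_REENTRANT" ;;
        alpha) printf '%s\n' "-pthread -D_XOPEN_SOURCE=500" ;;
        *)     echo "unknown platform: $1" >&2; return 1 ;;
    esac
}
```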
Cutting corners with Makefile.defs
All these horrendous things described above are taken care of by the macro
settings in Makefile.defs that
we used for compiling the libraries. Permission is hereby granted to steal
most of the macro settings from this file :-). However: NEVER define
-DRAZORBACK_LIB_COMPILE
when linking against the Razorback. That flag is reserved for some internal
tricks, most importantly for taking care of template instantiations within
the library, and if it is defined then most probably you'll get linking
errors.
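When borrowing flags wholesale from Makefile.defs, one way to stay safe is to filter the forbidden macro out before the flags reach your own compiler command line. The sketch below assumes the flags are passed as separate words; strip_lib_compile is a made-up name, and whether the macros you copy contain the flag at all depends on your generated Makefile.defs.

```sh
# Hypothetical helper: print the given flags without -DRAZORBACK_LIB_COMPILE.
# Each flag is expected as a separate argument.
strip_lib_compile() {
    out=""
    for flag in "$@"; do
        if [ "$flag" != "-DRAZORBACK_LIB_COMPILE" ]; then
            # Append with a single separating space (none before the first flag).
            out="${out:+$out }$flag"
        fi
    done
    printf '%s\n' "$out"
}
```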
Why Razorback?
Wild boar are cute and intelligent animals, and (despite their bad reputation)
they are quite friendly, too. If you meet them in the forest, have an apple
ready for them. Avoid the sows with young piglets though -- they can get
real paranoid.