Software and Data Sets for "2D Object Detection and Recognition"

Downloadable files
detect.tgz
The full system. Occupies approximately 54 megabytes.
detectsmall.tgz
All files except the data files. Occupies approximately 2.5 megabytes.
Data files: BRAIN.tgz, CHESS.tgz, CLIP.tgz, FACES.tgz, HEART.tgz, LATEX.tgz, NIST.tgz
This page provides some information regarding the software and data sets that accompany the book. More information can be found in Chapter 12 as well as in the documentation found in the source files and the script files. There is no guarantee attached to this software (it is not too hard to make it crash), nor is any support to be expected. A certain level of proficiency in C++ is essential to understand the program, and some experience with Unix and an X11-based window manager is necessary to get things running smoothly.

Setting things up

The source code provided here will compile on Linux. Download detect.tgz to a directory whose full path will be called base for further reference. Type tar xvfz detect.tgz.

The following must be set for things to work.

  • Set an environment variable $DETDIR to base. In csh add the line setenv DETDIR base to your .cshrc file. In bash add the line export DETDIR=base to your .bashrc file (without export the variable is not passed on to the programs and scripts).

  • Add base/bin and base/bin/script to your path.

  • In csh add the line setenv PATH ${PATH}:base/bin:base/bin/script
    In bash add the line PATH="$PATH:base/bin:base/bin/script"

  • Add the bash shell program to your /bin directory. All the scripts are written in bash and assume it is in directory /bin.
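The bash settings listed above could be collected in .bashrc roughly as follows. This is only a sketch: the location $HOME/detect is a hypothetical stand-in for base, and the guard against duplicate path entries is an addition that is not part of the original instructions.

```shell
# Hypothetical install location; replace with the real full path of "base".
base="$HOME/detect"

# Export DETDIR so that programs started from this shell can see it.
export DETDIR="$base"

# Append the two directories named above to PATH, skipping any that are
# already present (this keeps PATH from growing each time .bashrc is read).
for dir in "$DETDIR/bin" "$DETDIR/bin/script"; do
    case ":$PATH:" in
        *":$dir:"*) ;;               # already on the path
        *) PATH="$PATH:$dir" ;;
    esac
done
export PATH
```

The script directory is written here as bin/script, following the instructions above. If your copy names it differently, adjust the path accordingly.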
In directory base you will now see several directories.
  • source. Contains the code with graphic options. cd into source and type make. The program will compile and face will be written to base/bin.

  • sourcenox. Contains the code with no graphic options. cd into sourcenox and type make. The program will compile and facenox will be written to base/bin.

  • bin. Contains the compiled executables, face and facenox, and a subdirectory scripts where all the script files written in bash are stored.

  • book. This directory has subdirectories corresponding to the chapters of the book (chap1, chap2, ...), as well as some subdirectories with data, whose names are written in upper case. Within each subdirectory corresponding to a particular chapter are parameter files that more or less reproduce the figures in that chapter. Running the program with these parameter files is a good way to begin getting acquainted with the program and the relevant parameters.
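After running make in source and sourcenox, a quick check that both executables ended up in base/bin might look like the sketch below. The function name check_setup is hypothetical and not part of the distributed scripts; it only tests for the two files named above.

```shell
# check_setup BASE: report whether face and facenox exist and are executable
# in BASE/bin. Prints one line per program; returns non-zero if any is missing.
# (Hypothetical helper, not part of the distributed scripts.)
check_setup() {
    base=$1
    status=0
    for exe in face facenox; do
        if [ -x "$base/bin/$exe" ]; then
            echo "ok: $base/bin/$exe"
        else
            echo "missing: $base/bin/$exe"
            status=1
        fi
    done
    return $status
}
```

With the environment variable set as described earlier, it can be called as check_setup "$DETDIR". A missing facenox means that make has not yet succeeded in sourcenox, and likewise for face and source.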
The subdirectories with data are the following.
  • FACES. Contains a subdirectory train with 300 images of faces of size 110 × 96 from the Olivetti dataset, ten images per person. These are used to train the detectors. There is an additional directory test with 100 faces from the same dataset. The directory pgm contains a number of pgm images on which detectors can be tested. The directories filt1, filt_d2, filt_from_edges contain different sparse models trained using different parameters.

  • HEART. Contains ultrasound images of heart ventricles in directories pat1, pat2, pat3, and a couple of angiograms in directory ang.

  • BRAIN. Contains two directories of axial MRI brain scans, train and test, as well as a directory filt3 which contains a sparse model for these images, and a directory grmtch containing parameters for a sparse model of these scans for detection with dynamic programming.

  • LATEX. Contains a directory protos with the prototypes of all the 293 LaTeX symbols, and a subdirectory latex_0 with a sparse model for the symbol 0 as well as a classifier for the hits of this detector (see Chapter 10). Subdirectories latex_1, latex_4, latex_7 contain models for the 1, the 4 and the 7.

  • ESCR. Contains a sparse model and various templates for the `Epsilon' used in Chapter 3, as well as the classifier for hits of this detector on other script style symbols.

  • CLIP. Contains a sparse model for the clip shown in Chapter 8.

  • CHESS. Contains the sparse model and classifiers for the chess-pieces.

  • NIST. Contains one set of classification trees trained on the NIST data set and a small sample of 10,000 NIST digits for testing. The full data set is very large but can be obtained upon request. 
The size of the entire directory is approximately 54 megabytes. It is possible to download only parts of this system.
For the source code, the scripts and the parameter files for the figures in each chapter, download detectsmall.tgz; then separately download any of the data directories of interest from the directory download.
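A partial installation of this kind could be unpacked with a small helper such as the one sketched below. The function name unpack_into is hypothetical. Where each data archive should be unpacked is also an assumption: the description above places the data directories inside book, but it is worth listing an archive with tar tzf first to see which paths it contains.

```shell
# unpack_into DEST ARCHIVE...: extract each gzip-compressed tar archive
# inside directory DEST, creating DEST if it does not exist yet.
# (Hypothetical helper, not part of the distributed scripts.)
unpack_into() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for archive in "$@"; do
        # -C makes tar change into DEST before extracting the members.
        tar xzf "$archive" -C "$dest" || return 1
    done
}
```

For example, unpack_into "$DETDIR" detectsmall.tgz followed by unpack_into "$DETDIR/book" FACES.tgz NIST.tgz would fetch only two of the data sets, assuming the data archives unpack relative to book.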

Running the program

The program face can receive input from a parameter file, from the command line, or from both. The general form for running face from the command line is
face file par1=n1 par2=n2 ... or
face par1=n1 par2=n2 ...

The parameter file, if used, must come first. Parameters set on the command line override values set in the parameter file. Among the parameters, opt must be set to the option that tells the program which routine to use. If no graphics are needed, facenox can be run the same way. The list of parameters needed for each of the algorithms is detailed in Chapter 12. Each subdirectory of book with a name corresponding to a chapter contains prepared parameter files for the figures in that chapter. Type face f.par on the command line, where f.par is one of these parameter files, to have the program run the algorithm producing the corresponding figure.
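The ordering rule above (parameter file first, then name=value settings) can be illustrated with a small wrapper that assembles the command line and prints it without running anything. The function name face_cmd is hypothetical. It treats any argument containing = as a parameter setting, so a parameter file whose name contains = would be misread.

```shell
# face_cmd PROGRAM [PARFILE] [NAME=VALUE ...]
# Print the command line for face or facenox, checking that a parameter
# file, if given, comes before all NAME=VALUE settings.
# (Hypothetical helper, not part of the distributed scripts.)
face_cmd() {
    cmd=$1
    shift
    first=yes
    for arg in "$@"; do
        case $arg in
            ?*=*)
                cmd="$cmd $arg" ;;          # a NAME=VALUE setting
            *)
                if [ "$first" = yes ]; then
                    cmd="$cmd $arg"         # parameter file in first position
                else
                    echo "face_cmd: parameter file '$arg' must come first" >&2
                    return 1
                fi ;;
        esac
        first=no
    done
    echo "$cmd"
}
```

Calling face_cmd face f.par par1=n1 prints face f.par par1=n1, while face_cmd face par1=n1 f.par is rejected. Here par1 and n1 are the same placeholders used in the general form above, not actual parameter names.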

Graphics

The program will show results of the algorithms as they run, as well as the final result, depending on parameter settings. Often the program will show one or several windows and will not continue until prompted by the user. This is done by typing c inside the active window, which should be highlighted by the window manager. Typing q kills the window and the program. Typing a number n magnifies the window to n times the original size of the image.