FSP

NAMES

re, dictionary, concatenate, intersect, kleene, train, union, winnow -- Finite State Playground

SYNOPSIS

re [ -fdecmMRpwx ]
[ -r regular-expression ]
[ -i xml-filename ]
[ -s string ]

concatenate xml-filename xml-filename

dictionary filename

intersect xml-filename xml-filename

kleene xml-filename

train xml-file training-data-file

union xml-filename xml-filename

winnow xml-filename level

DESCRIPTION

These commands create and manipulate (weighted) finite-state automata (FSAs). FSAs can be constructed from regular expressions or from a dictionary of words. The resulting FSAs can be manipulated in various ways, saved as XML, and output in a format suitable for display by Graphviz.

Note that weights are not fully integrated into the program suite at this time. While they are present and can be manipulated with the train and winnow programs, other programs disregard these values.

CREATING AN AUTOMATON

The re program allows one to create an FSA from a regular expression or to manipulate an already created FSA. The manipulations include determinization, two different minimization algorithms, removal of epsilon arcs, reversal, and complementation.

In addition, the -p flag outputs an FSA in a format that the Graphviz program can display, and the -s flag tests a string against an existing automaton.
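The two operations just described can be illustrated with a minimal Python sketch. It is not the FSP implementation: the (start, finals, arcs) representation and the function names accepts and to_dot are assumptions made for this example only, and the DOT text is simply one plausible rendering that Graphviz accepts.

```python
def accepts(start, finals, arcs, string):
    """Return True if the deterministic automaton accepts `string`."""
    state = start
    for symbol in string:
        if (state, symbol) not in arcs:
            return False  # no arc to follow: reject
        state = arcs[(state, symbol)]
    return state in finals


def to_dot(start, finals, arcs):
    """Render the automaton as Graphviz DOT text (hypothetical layout)."""
    lines = ["digraph fsa {", "  rankdir=LR;"]
    states = {start} | set(finals)
    for (src, _), dst in arcs.items():
        states.update((src, dst))
    for state in sorted(states):
        # Final states are drawn with a double circle.
        shape = "doublecircle" if state in finals else "circle"
        lines.append(f"  {state} [shape={shape}];")
    for (src, symbol), dst in sorted(arcs.items()):
        lines.append(f'  {src} -> {dst} [label="{symbol}"];')
    lines.append("}")
    return "\n".join(lines)


# A hand-built automaton for a*b*: state 0 reads a's, state 1 reads b's.
START, FINALS = 0, {0, 1}
ARCS = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}

print(accepts(START, FINALS, ARCS, "aabb"))
print(accepts(START, FINALS, ARCS, "aba"))
print(to_dot(START, FINALS, ARCS))
```

The DOT text printed at the end can be saved to a file and passed to a Graphviz layout program such as dot.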

An automaton created with re is output directly when the -x flag is given. It can thus be saved with the Unix redirection operator '>', e.g.

re -r 'a*b*' -fedMx > mynet.xml

This creates an FSA for a regular expression, removes epsilon arcs, determinizes it, minimizes it with the Brzozowski algorithm, and then saves it in the file mynet.xml.
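For readers unfamiliar with the Brzozowski algorithm, the following Python sketch shows the general idea: reverse the automaton, determinize it, then reverse and determinize again. This is a textbook rendering under assumed data structures (arcs map a (state, symbol) pair to a set of target states, and epsilon arcs are assumed to have been removed already); it makes no claim about how re implements the step.

```python
def reverse(starts, finals, arcs):
    """Flip every arc and swap the start and final state sets."""
    flipped = {}
    for (src, symbol), dsts in arcs.items():
        for dst in dsts:
            flipped.setdefault((dst, symbol), set()).add(src)
    return set(finals), set(starts), flipped


def determinize(starts, finals, arcs):
    """Subset construction; only reachable subsets are built."""
    alphabet = {symbol for (_, symbol) in arcs}
    start = frozenset(starts)
    ids = {start: 0}            # subset -> new state number
    queue = [start]
    new_arcs, new_finals = {}, set()
    while queue:
        subset = queue.pop()
        if subset & set(finals):
            new_finals.add(ids[subset])
        for symbol in sorted(alphabet):
            target = frozenset(
                dst for state in subset for dst in arcs.get((state, symbol), ())
            )
            if not target:
                continue        # leave the dead state implicit
            if target not in ids:
                ids[target] = len(ids)
                queue.append(target)
            new_arcs[(ids[subset], symbol)] = {ids[target]}
    return {0}, new_finals, new_arcs


def brzozowski(starts, finals, arcs):
    """Minimize by reversing and determinizing twice."""
    return determinize(*reverse(*determinize(*reverse(starts, finals, arcs))))


# A deliberately redundant four-state automaton for a*b*.
ARCS = {
    (0, "a"): {1}, (1, "a"): {0},
    (0, "b"): {2}, (1, "b"): {2},
    (2, "b"): {3}, (3, "b"): {2},
}
min_starts, min_finals, min_arcs = brzozowski({0}, {0, 1, 2, 3}, ARCS)
min_states = {src for (src, _) in min_arcs}
min_states |= {dst for dsts in min_arcs.values() for dst in dsts}
print(len(min_states), "states after minimization")
```

The four-state input collapses to the familiar two-state automaton for a*b*, with one state looping on a and one looping on b.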

The dictionary program allows one to compile a dictionary into an automaton. This program works reasonably well for small dictionaries, but slows exponentially for larger ones.

The dictionary is given as a column of words in a text file and the resulting FSA is output in XML.
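As an illustration of the input format and of what a dictionary automaton looks like, here is a small Python sketch that builds a trie-shaped deterministic automaton from a column of words. The names read_word_column and compile_dictionary are hypothetical, the output here is a Python structure rather than XML, and the construction is a standard trie, which is not necessarily the method the dictionary program uses.

```python
def read_word_column(text):
    """Split a one-word-per-line text into a list, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def compile_dictionary(words):
    """Build a trie-shaped automaton that accepts exactly `words`."""
    arcs, finals = {}, set()
    next_state = 1               # state 0 is the start state
    for word in words:
        state = 0
        for symbol in word:
            if (state, symbol) not in arcs:
                # First time this prefix is seen: allocate a new state.
                arcs[(state, symbol)] = next_state
                next_state += 1
            state = arcs[(state, symbol)]
        finals.add(state)        # the end of each word is an accepting state
    return 0, finals, arcs


WORDS = read_word_column("cat\ncar\ncart\n")
start, finals, arcs = compile_dictionary(WORDS)
print(len(arcs), "arcs,", len(finals), "final states")
```

Words that share a prefix share the corresponding arcs, so "cat", "car", and "cart" need five arcs rather than ten.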

MANIPULATING AN AUTOMATON

There is a suite of programs that perform the expected closure operations on FSAs. These all operate on FSAs saved as XML.

The concatenate program concatenates two FSAs.

The intersect program creates the intersection of two FSAs.

The kleene program creates the Kleene closure of an automaton.

The union program creates the union of two FSAs.
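Intersection and union of deterministic automata can both be obtained with the standard product construction, sketched below in Python. This is only an illustration of the idea under an assumed (start, finals, arcs) representation; the helper names are hypothetical, and the intersect and union programs may well use different constructions (for instance, union via epsilon arcs).

```python
def product(a, b, keep):
    """Pair up two deterministic automata; `keep` decides which pairs are final.

    Each automaton is (start, finals, arcs) with arcs[(state, symbol)] = next.
    A missing arc is modelled as a move to the dead state None.
    """
    (start_a, finals_a, arcs_a), (start_b, finals_b, arcs_b) = a, b
    alphabet = {sym for (_, sym) in arcs_a} | {sym for (_, sym) in arcs_b}
    start = (start_a, start_b)
    finals, arcs = set(), {}
    seen, stack = {start}, [start]
    while stack:
        pair = stack.pop()
        p, q = pair
        if keep(p in finals_a, q in finals_b):
            finals.add(pair)
        for sym in alphabet:
            target = (arcs_a.get((p, sym)), arcs_b.get((q, sym)))
            if target == (None, None):
                continue         # both sides are dead: drop the arc
            arcs[(pair, sym)] = target
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return start, finals, arcs


def accepts(automaton, string):
    """Run `string` through a deterministic automaton."""
    state, finals, arcs = automaton
    for sym in string:
        state = arcs.get((state, sym))
        if state is None:
            return False
    return state in finals


A_STAR_B_STAR = (0, {0, 1}, {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1})
EVEN_LENGTH = (0, {0}, {(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0})

both = product(A_STAR_B_STAR, EVEN_LENGTH, lambda x, y: x and y)
either = product(A_STAR_B_STAR, EVEN_LENGTH, lambda x, y: x or y)
print(accepts(both, "aabb"), accepts(both, "a"))
print(accepts(either, "ba"), accepts(either, "bab"))
```

The only difference between the two operations is the acceptance test: a pair of states is final for intersection when both members are final, and for union when either is.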

MISCELLANEOUS

Finally, the train program updates the weights in an automaton based on training data. The automaton is given in XML and the training data takes the same form as for the dictionary program: a column of words in a text file.

The winnow program prunes arcs that fall below a threshold, given as the level argument.
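This page does not specify how weights are computed, so the following Python sketch rests on a loudly stated assumption: an arc's weight is the number of times the training words traverse it, and pruning removes arcs whose weight is below a given level. The function names mirror the program names for readability only; the actual update and pruning rules in FSP may differ.

```python
def train(start, arcs, words):
    """Return a weight per arc: how often the training words cross it.

    Assumption for this sketch: weights are raw traversal counts.
    """
    weights = {arc: 0 for arc in arcs}
    for word in words:
        state, path = start, []
        for symbol in word:
            if (state, symbol) not in arcs:
                path = []        # word falls off the automaton: count nothing
                break
            path.append((state, symbol))
            state = arcs[(state, symbol)]
        for arc in path:
            weights[arc] += 1
    return weights


def winnow(arcs, weights, level):
    """Keep only the arcs whose weight is at least `level`."""
    return {arc: dst for arc, dst in arcs.items() if weights[arc] >= level}


# The a*b* automaton again, trained on three words.
ARCS = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}
WEIGHTS = train(0, ARCS, ["aab", "ab", "b"])
PRUNED = winnow(ARCS, WEIGHTS, 1)
print(WEIGHTS)
print(PRUNED)
```

In this example no training word contains two consecutive b's, so the loop on state 1 keeps a weight of zero and is pruned at level 1.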

SEE ALSO

re(1)

Create and manipulate FSAs.

concatenate(1)

Concatenate FSAs.

dictionary(1)

Build an FSA from a list of words.

intersect(1)

Intersect FSAs.

kleene(1)

Kleene closure of an FSA.

train(1)

Train an FSA with a list of words.

union(1)

Union two FSAs.

winnow(1)

Prune arcs below a threshold.

FILES

flbi/bin

Distribution binaries.

flbi/src

Distribution sources.

flbi/man

Man pages.

flbi/xml

DTD file for the XML format.

AUTHOR

Mike Hammond (hammond@u.arizona.edu)

Copyright (C) 2007. All rights reserved.