re, dictionary, concatenate, intersect, kleene, train, union, winnow -- Finite State Playground
re [ -fdecmMRpwx ]
concatenate xml-filename xml-filename
dictionary filename
intersect xml-filename xml-filename
kleene xml-filename
train xml-file training-data-file
union xml-filename xml-filename
winnow xml-filename level
These commands create and manipulate (weighted) finite-state automata (FSAs). FSAs can be constructed from regular expressions or from a dictionary of words. The resulting FSAs can be manipulated in various ways, saved as XML, and output in a format suitable for display by Graphviz. Note that weights are not fully integrated into the program suite at this time. While they are present and can be manipulated with the train and winnow programs, the other programs disregard these values.
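To make the Graphviz output concrete, here is a minimal Python sketch of how an automaton could be rendered as DOT text. It is not the format produced by the suite itself: the function name to_dot and the (source, label, target) arc representation are assumptions chosen for illustration.

```python
def to_dot(arcs, start, finals):
    """Render an automaton as Graphviz DOT text (illustrative layout only).

    arcs   -- list of (source, label, target) triples
    start  -- the start state
    finals -- set of accepting states
    """
    lines = ["digraph fsa {", "  rankdir=LR;"]
    states = {start} | set(finals)
    for src, _, dst in arcs:
        states.update((src, dst))
    # Accepting states are drawn with a double circle.
    for state in sorted(states):
        shape = "doublecircle" if state in finals else "circle"
        lines.append(f"  {state} [shape={shape}];")
    # An invisible-looking point node marks the start state.
    lines.append("  entry [shape=point];")
    lines.append(f"  entry -> {start};")
    for src, label, dst in arcs:
        lines.append(f'  {src} -> {dst} [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


# a*b* as a two-state deterministic automaton; both states accept.
dot_text = to_dot([(0, "a", 0), (0, "b", 1), (1, "b", 1)], start=0, finals={0, 1})
print(dot_text)
```

Text of this kind can be passed to the Graphviz dot program to draw the automaton.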
CREATING AN AUTOMATON
The re program creates an FSA from a regular expression or manipulates an existing FSA. The manipulations include determinization, two different minimization algorithms, removal of epsilon arcs, reversal, and complement. With the -p flag, the program outputs an FSA in a format that Graphviz can display. With the -s flag, it tests a string against an existing automaton. With the -x flag, an automaton created with re is written to standard output, so it can be saved with the Unix redirection operator '>', e.g.
re -r 'a*b*' -fedMx > mynet.xml
This creates an FSA for a regular expression, removes epsilon arcs, determinizes it, minimizes it with the Brzozowski algorithm, and then saves it in the file mynet.xml.
The dictionary program compiles a dictionary into an automaton. This program works reasonably well for small dictionaries, but slows exponentially for larger ones. The dictionary is given as a column of words in a text file and the resulting FSA is output in XML.
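As an illustration of what compiling a word list into an automaton can look like, the following Python sketch builds a trie-shaped deterministic automaton. It is not the algorithm used by the dictionary program, which is not described here; the names build_trie and accepts and the (state, symbol) table are assumptions made for the example.

```python
def build_trie(words):
    """Return (transitions, finals) for an automaton accepting exactly `words`.

    transitions maps (state, symbol) -> state; state 0 is the start state.
    """
    transitions = {}
    finals = set()
    next_state = 1
    for word in words:
        state = 0
        for symbol in word:
            key = (state, symbol)
            if key not in transitions:
                # First time this prefix is seen: allocate a new state.
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        finals.add(state)
    return transitions, finals


def accepts(transitions, finals, word):
    """Follow the arcs for `word`; accept only if we end in a final state."""
    state = 0
    for symbol in word:
        state = transitions.get((state, symbol))
        if state is None:
            return False
    return state in finals


# The string stands in for a dictionary file with one word per line.
words = "cat\ncar\ncart\ndog".split()
transitions, finals = build_trie(words)
print(accepts(transitions, finals, "cart"), accepts(transitions, finals, "ca"))
```

Words that share a prefix share states, so "cat", "car" and "cart" reuse the arcs for "ca". A trie is not minimal; shared suffixes are only merged by a later minimization step.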
MANIPULATING AN AUTOMATON
There is a suite of programs that perform the expected closure operations on FSAs. These all operate on FSAs saved as XML. The concatenate program concatenates two FSAs. The intersect program creates the intersection of two FSAs. The kleene program creates the Kleene closure of an automaton. The union program creates the union of two FSAs.
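The four closure operations can be sketched in Python with the textbook constructions on automata with epsilon arcs. This is only an illustration of the operations, not the suite's implementation: the NFA class, the EPS marker and every function name below are assumptions, and the automata live in memory rather than in XML files.

```python
from itertools import count

EPS = None          # label used for epsilon arcs
_fresh = count()    # source of globally unique state numbers


class NFA:
    def __init__(self, start, finals, arcs):
        self.start, self.finals, self.arcs = start, set(finals), list(arcs)


def symbol(ch):
    s, t = next(_fresh), next(_fresh)
    return NFA(s, {t}, [(s, ch, t)])


# The constructions below assume the two operands are distinct automata.
def union(a, b):
    s = next(_fresh)
    return NFA(s, a.finals | b.finals,
               a.arcs + b.arcs + [(s, EPS, a.start), (s, EPS, b.start)])


def concatenate(a, b):
    bridge = [(f, EPS, b.start) for f in a.finals]
    return NFA(a.start, b.finals, a.arcs + b.arcs + bridge)


def kleene(a):
    s = next(_fresh)  # new start state, also accepting (empty string)
    loops = [(f, EPS, s) for f in a.finals]
    return NFA(s, {s}, a.arcs + [(s, EPS, a.start)] + loops)


def states(a):
    found = {a.start} | a.finals
    for src, _, dst in a.arcs:
        found.update((src, dst))
    return found


def intersect(a, b):
    """Product construction; an epsilon arc moves one side only."""
    sa, sb = states(a), states(b)
    arcs = []
    for p, x, p2 in a.arcs:
        if x is EPS:
            arcs += [((p, q), EPS, (p2, q)) for q in sb]
        else:
            arcs += [((p, q), x, (p2, q2)) for q, y, q2 in b.arcs if y == x]
    for q, y, q2 in b.arcs:
        if y is EPS:
            arcs += [((p, q), EPS, (p, q2)) for p in sa]
    finals = {(f, g) for f in a.finals for g in b.finals}
    return NFA((a.start, b.start), finals, arcs)


def closure(a, current):
    """All states reachable from `current` by epsilon arcs alone."""
    stack, seen = list(current), set(current)
    while stack:
        state = stack.pop()
        for src, label, dst in a.arcs:
            if src == state and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(a, word):
    current = closure(a, {a.start})
    for ch in word:
        moved = {dst for src, label, dst in a.arcs
                 if src in current and label == ch}
        current = closure(a, moved)
    return bool(current & a.finals)


a_star_b_star = concatenate(kleene(symbol("a")), kleene(symbol("b")))
ab_star = kleene(concatenate(symbol("a"), symbol("b")))
both = intersect(a_star_b_star, ab_star)
either = union(symbol("a"), symbol("b"))
print(accepts(both, "ab"), accepts(both, "abab"))
```

Here both accepts only the strings that a*b* and (ab)* have in common, namely the empty string and "ab". The results contain epsilon arcs and are nondeterministic, which is where epsilon removal and determinization come in.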
MISCELLANEOUS
Finally, the train program updates the weights in an automaton based on training data. The automaton is given in XML and the training data takes the same form as for the dictionary program: a column of words in a text file. The winnow program prunes arcs that fall below a given threshold level.
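One simple way to picture training and pruning is to count how often each arc is used by the training words and then drop the arcs whose count is too low. The Python sketch below does exactly that. The actual update rule of train and the exact meaning of the level argument of winnow are not specified here, so the counting scheme, the "keep if weight >= level" rule and all names are assumptions.

```python
from collections import Counter


def train(transitions, words):
    """Count arc usage over the training words (assumed weighting scheme).

    transitions maps (state, symbol) -> state; state 0 is the start state.
    A word that has no path through the automaton contributes nothing.
    For brevity, ending in an accepting state is not checked.
    """
    weights = Counter({arc: 0 for arc in transitions})
    for word in words:
        state, path = 0, []
        for symbol in word:
            arc = (state, symbol)
            if arc not in transitions:
                path = None  # the word falls off the automaton
                break
            path.append(arc)
            state = transitions[arc]
        if path is not None:
            weights.update(path)  # add 1 for every arc on the path
    return weights


def winnow(transitions, weights, level):
    """Keep only the arcs whose weight is at least `level` (assumed rule)."""
    return {arc: dst for arc, dst in transitions.items()
            if weights[arc] >= level}


# a*b* as a deterministic automaton, trained on a small column of words.
transitions = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}
training_words = "aab\nab\nb\naaa\nba".split()
weights = train(transitions, training_words)
pruned = winnow(transitions, weights, level=1)
print(dict(weights), pruned)
```

In this run the arc (1, "b") is never used by the training words, so pruning at level 1 removes it; "ba" has no path through the automaton and is ignored.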
re(1)           Create and manipulate FSAs.
concatenate(1)  Concatenate FSAs.
dictionary(1)   Build an FSA from a list of words.
intersect(1)    Intersect FSAs.
kleene(1)       Kleene closure of an FSA.
train(1)        Train an FSA with a list of words.
union(1)        Union two FSAs.
winnow(1)       Prune arcs below a threshold.
flbi/bin        Distribution binaries.
flbi/src        Distribution sources.
flbi/man        Man pages.
flbi/xml        DTD file for XML.
Mike Hammond (hammond@u.arizona.edu) Copyright (C) 2007. All rights reserved.