re − Finite State Playground
re [ -fdecmMRpwx ]
The re program creates and manipulates finite state automata in various ways.
-r regular-expression
    introduces a regular expression to be parsed. In isolation, the program will parse the regular expression and quit.
-i xml-filename
    reads in an automaton encoded in XML from a filename argument.
-s string
    tests a string argument against a determinized automaton (with no epsilon-arcs). Therefore this flag only has an effect if: i) an automaton has been created by some other combination of flags; ii) the automaton has had epsilon-arcs removed; and iii) the automaton has been determinized.
-f
    creates a non-deterministic automaton with epsilon-arcs from a regular expression that has been entered with the -r flag.
-e
    removes epsilon-arcs from an automaton read in with -i or created from a regular expression with -r and -f.
-d
    determinizes an automaton read in with -i or created from a regular expression with -r and -f. This operation can only apply if epsilon-arcs have been removed with -e.
-m
    minimizes a deterministic automaton.
-M
    minimizes a deterministic automaton using the Brzozowski algorithm.
-c
    creates the complement language from a deterministic automaton.
-R
    creates the reverse language from a deterministic automaton.
-p
    exports an automaton in a format appropriate for display with the dot utility of the Graphviz package.
-w
    controls whether weights are included when the FSA is exported with -p or when strings are tested with -s.
-x
    exports an automaton in XML format, so that it can be read in again in a later invocation of the program.
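The operations behind -e, -d, -s, -R and -M can be illustrated with a small Python sketch. This is a textbook rendering under stated assumptions, not the program's own implementation: the dictionary-based automaton layout and all function names below are hypothetical, and epsilon removal is folded into the subset construction rather than run as a separate pass.

```python
EPS = "@"  # assumption: reuse the man page's epsilon symbol as the epsilon-arc label

def eps_closure(states, arcs):
    """All states reachable from `states` through epsilon-arcs alone."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for sym, dst in arcs.get(q, ()):
            if sym == EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return frozenset(seen)

def determinize(starts, finals, arcs):
    """Subset construction; each DFA state is a frozenset of NFA states."""
    start = eps_closure(starts, arcs)
    dfa, todo = {}, [start]
    while todo:
        cur = todo.pop()
        if cur in dfa:
            continue
        moves = {}
        for q in cur:
            for sym, dst in arcs.get(q, ()):
                if sym != EPS:
                    moves.setdefault(sym, set()).add(dst)
        dfa[cur] = {sym: eps_closure(dsts, arcs) for sym, dsts in moves.items()}
        todo.extend(dfa[cur].values())
    dfa_finals = {s for s in dfa if s & finals}
    return start, dfa_finals, dfa

def accepts(start, finals, dfa, string):
    """Run the (partial) DFA over `string`; a missing arc means rejection."""
    cur = start
    for ch in string:
        cur = dfa[cur].get(ch)
        if cur is None:
            return False
    return cur in finals

def reverse(starts, finals, arcs):
    """Flip every arc and swap start and final states."""
    rev = {}
    for q, out in arcs.items():
        for sym, dst in out:
            rev.setdefault(dst, []).append((sym, q))
    return set(finals), set(starts), rev

def brzozowski(starts, finals, arcs):
    """Brzozowski minimization: reverse, determinize, reverse, determinize."""
    for _ in range(2):
        starts, finals, arcs = reverse(starts, finals, arcs)
        start, finals, dfa = determinize(starts, finals, arcs)
        starts = {start}
        arcs = {q: list(t.items()) for q, t in dfa.items()}
    return start, finals, dfa

# Hand-built automaton for ab*c with one epsilon-arc (state numbers are arbitrary).
NFA_ARCS = {0: [("a", 1)], 1: [(EPS, 2)], 2: [("b", 2), ("c", 3)]}
start, finals, dfa = determinize({0}, {3}, NFA_ARCS)
m_start, m_finals, m_dfa = brzozowski({0}, {3}, NFA_ARCS)
```

Here `accepts(start, finals, dfa, "abbc")` plays the role of the -s test. The subset construction leaves two equivalent states in this example, which the Brzozowski pass merges.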
The XML syntax is quite simple. An FSA is represented as a net object which is composed of a series of fsanode objects, each of which encodes a separate state in the FSA. Whether the state is a start and/or a final state is encoded with attributes. Arcs between states are represented as arc objects within the fsanode objects, with attributes indicating what the arc symbols are and what the destination states are. The full syntax is encoded in the fsanet.dtd file.
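As a rough illustration of this layout, the following Python sketch parses a small document with the standard xml.etree.ElementTree module. Only the element names net, fsanode and arc come from the description above. Every attribute name and value used here (id, start, final, symbol, dest, "yes"/"no") is a hypothetical placeholder, as is the load_net helper; the real names are defined in fsanet.dtd and may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: attribute names are placeholders, not taken from fsanet.dtd.
SAMPLE = """
<net>
  <fsanode id="0" start="yes" final="no">
    <arc symbol="a" dest="1"/>
  </fsanode>
  <fsanode id="1" start="no" final="yes">
    <arc symbol="b" dest="1"/>
  </fsanode>
</net>
"""

def load_net(text):
    """Return (start states, final states, arcs) from a net document."""
    root = ET.fromstring(text.strip())
    starts, finals, arcs = set(), set(), {}
    for node in root.iter("fsanode"):
        state = node.get("id")
        if node.get("start") == "yes":
            starts.add(state)
        if node.get("final") == "yes":
            finals.add(state)
        # Each arc child carries its symbol and destination state.
        arcs[state] = [(a.get("symbol"), a.get("dest")) for a in node.findall("arc")]
    return starts, finals, arcs

starts, finals, arcs = load_net(SAMPLE)
```

The sample encodes an automaton for ab* with two states; consult fsanet.dtd before relying on any of the attribute names.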
Regular expressions are closed under concatenation, union, and Kleene star. These are represented in the usual fashion. Letters of the alphabet and digits can be used as symbols. Upper and lowercase letters are treated as distinct symbols. Epsilon is represented with '@'. Concatenation is represented by juxtaposition, e.g. ab, x777i, etc. Union is represented with a vertical bar, e.g. a|b, d|e, etc. Kleene star is represented with an asterisk, e.g. a*, 7*, etc. These operations can, of course, be combined, and parentheses can be used to disambiguate, e.g. a*(bc|de)f*, a*(b(c|d)e)f*, b(b|a)*, (bb)|a*, etc. The asterisk and vertical bar have other uses on the Unix command line, so regular expressions using those symbols must be placed in single quotes, e.g. 'a*(bc|de)b*'.
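To experiment with this notation without running the program, one option is to translate it into Python's own regular-expression syntax, which shares the bar, asterisk and parentheses. The sketch below assumes that the only difference is the '@' epsilon symbol; the helper names to_python_regex and matches are hypothetical, and the program's actual parser may treat edge cases differently.

```python
import re as pyre  # aliased to avoid confusion with the re program described here

def to_python_regex(expr):
    """Translate the notation above into a Python regular expression."""
    out = []
    for ch in expr:
        if ch == "@":
            out.append("(?:)")  # epsilon: an empty group matches the empty string
        elif (ch.isascii() and ch.isalnum()) or ch in "|*()":
            out.append(ch)      # letters, digits and operators carry over unchanged
        else:
            raise ValueError(f"symbol not allowed in this notation: {ch!r}")
    return "".join(out)

def matches(expr, string):
    """True if the whole string belongs to the language of expr."""
    return pyre.fullmatch(to_python_regex(expr), string) is not None
```

For instance, `matches("ab*c*", "acc")` checks the string acc against ab*c* in the same spirit as the -s flag, and `matches("a(b|@)c", "ac")` exercises the epsilon symbol.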
Here are a few examples to show how the program works.
re -r 'ab*c' -fedx > mynet.xml
This reads in a regular expression and creates an automaton from it. Epsilon-arcs are removed, the automaton is determinized and an XML representation is saved to a file.
re -i mynet.xml -p > mynet.dot
Here the same XML representation is now read in and then exported in the dot format for display.
re -r 'ab*c*' -feds acc
Here a determinized automaton is created from a regular expression and then tested against the string acc.
fsp(1)
http://www.u.arizona.edu/~hammond/
    Author’s home page
http://www.graphviz.org
    Home page for the Graphviz package
flbi/src/
    Distribution source code
flbi/xml/fsanet.dtd
    DTD file for XML specification of FSAs
flbi/man/re.1
    This man file
Mike Hammond (hammond@u.arizona.edu) Copyright (C) 2007. All rights reserved.