Hi again.
After getting familiar with the FSA toolkit, it seems like a very
nice toolkit. However, I have one question: when big transducers are
composed and determinized, is it normally necessary to perform some
additional operations to improve the performance of the FSA toolkit,
or should "compose", "trim" and "determinize" be enough?
I have an acyclic pronunciation lexicon transducer (50 input labels,
24362 output labels, 20784 states, 45145 arcs) with disambiguation
symbols, and a language model acceptor (24363 labels, 729323 states,
5640013 arcs). Composing them with the AT&T toolkit (using "dmake -a
lex -b lm") takes 4m35s with peak memory consumption around 1,4G. The
RWTH FSA tool, on the other hand, had grown to 11G in 15 minutes,
after which I stopped the process before it started to use swap. I used the
following command line:
fsa --progress=yes bin:lex closure bin:lm compose trim determinize \
write bin:composition
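To narrow down which step eats the memory, I may also try splitting
the pipeline in two and writing the intermediate composition to disk
first (this reuses only the operations above; bin:LG is just a
placeholder file name):

fsa --progress=yes bin:lex closure bin:lm compose trim write bin:LG
fsa --progress=yes bin:LG determinize write bin:composition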
The system is SuSE Linux 9.0 (x86-64).
Any ideas on what I might be missing? Perhaps I should check more
carefully that the conversion between formats is OK. At least the
numbers of states, arcs, input-epsilon arcs and output-epsilon arcs
seem to match.
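For reference, counts like these can be taken from the AT&T text
dump (fsmprint output), assuming a transducer printed with numeric
labels, where an arc line is "src dst ilabel olabel [weight]", a
final-state line has one or two fields, and epsilon is label 0. A
minimal awk sketch; lex.txt is just a placeholder name:

awk 'function seen(s) { if (!(s in st)) { st[s] = 1; n++ } }
     NF >= 4 { arcs++; seen($1); seen($2)         # arc line
               if ($3 == 0) ieps++                # input epsilon
               if ($4 == 0) oeps++ }              # output epsilon
     NF >= 1 && NF <= 2 { seen($1) }              # final-state line
     END { printf "states %d arcs %d in-eps %d out-eps %d\n",
                  n, arcs, ieps, oeps }' lex.txt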
--
Teemu Hirsimäki