- fsa - lists.rwth-aachen.de

default weight i/o
by Bryan Jurish 12 Feb '05

morning all,

i believe i've discovered a bug in fsa-0.9.1 concerning i/o of (default) weights (using CountSemiring). I understand that default weights are not written to output files in order to minimize i/o, but i'd like to load the same automata that i've previously saved...

attached are two automata files which illustrate the problem -- the first (example-in.xml) was created by hand, the second (example-out.xml) is the erroneous output of:

  bash$ ./fsa.linux-intel-standard example-in.xml write -

marmosets,
Bryan

Disabling lazy computation
by Abhay Vardhan 09 Jan '05

Is it possible to disable the lazy computation in FSA?

Thanks,
Abhay

Debian fsa compilation
by Marcin Szkudlarek 20 Dec '04

Hello!

I'm trying to compile fsa on Debian 2.2.20 (woody) with gcc 3.4. Platform: PC. I get the following errors:

make[1]: Entering directory `/home/marcin/PTX/fsa/fsa-0.9.1/src'
../Rules.make:99: warning: overriding commands for target `install'
Makefile:30: warning: ignoring old commands for target `install'
make -C Core libSprintCore.linux-intel-standard.a
make[2]: Entering directory `/home/marcin/PTX/fsa/fsa-0.9.1/src/Core'
compiling Configuration.cc
Configuration.cc:111:2: warning: #warning is a GCC extension
Configuration.cc:111:2: warning: #warning "Configuration::Resource::match() fails to match *.A.B with A.A.B !"
Configuration.cc: In member function `s32 Core::Configuration::Resource::match(const std::vector<std::string, std::allocator<std::string> >&) const':
Configuration.cc:115: error: `StringTokenizer' undeclared in namespace `ost'
Configuration.cc:115: error: parse error before `(' token
Configuration.cc:116: error: no class template named `StringTokenizer' in `ost'
Configuration.cc:116: error: parse error before `=' token
Configuration.cc:116: error: `token' undeclared (first use this function)
Configuration.cc:116: error: (Each undeclared identifier is reported only once for each function it appears in.)
Configuration.cc:116: error: `tokenizer' undeclared (first use this function)
Configuration.cc: In member function `const Core::Configuration::Resource* Core::Configuration::ResourceDataBase::find(const std::string&) const':
Configuration.cc:246: error: `StringTokenizer' undeclared in namespace `ost'
Configuration.cc:246: error: parse error before `(' token
Configuration.cc:247: error: no class template named `StringTokenizer' in `ost'
Configuration.cc:247: error: parse error before `=' token
make[2]: *** [.build/linux-intel-standard/Configuration.o] Error 1
make[2]: Leaving directory `/home/marcin/PTX/fsa/fsa-0.9.1/src/Core'
make[1]: *** [Core] Error 2
make[1]: Leaving directory `/home/marcin/PTX/fsa/fsa-0.9.1/src'
make: *** [build] Error 2

I don't feel up to digging into that code. Maybe I'm making some trivial mistake here; any suggestions?

Marcin Szkudlarek

Archives
by Stephan Kanthak 17 Dec '04

Hi!

Archives for fsa will be disabled until Monday, as I need to modify them, which in turn depends on the central system administration service at RWTH Aachen.

Thanks for your patience,
Stephan Kanthak

Problem with the composition
by Stephan Kanthak 17 Dec '04

Hi Emilian!

> I have a strange problem with composition with the fsa tool - It
> says that " 'some word' is not in second alphabet". What does it
> mean? Could you please have a look at the files that I'm trying to
> compose - a dictionary transducer and a language model transducer.
> I've also attached the error log.
>
> The command I use is
> fsa att:Dict_2003+.extended.fsm.txt att:lm.txt compose write att:LoG.txt
>
> P.S. What is "the second alphabet" anyway? The output or the input
> alphabet of the second transducer?

The "warnings" (yes, I should make that more explicit) appear because the lexicon contains more symbols than the language model, and FSA does NOT silently ignore that fact by default. Internally, you can disable those warnings. However, if you know that your lexicon contains more words (look at the sizes of the alphabets through "info"), then you can safely ignore them, as FSA does exactly what you want.

In your case I doubt that. I see at least two mistakes:

1. FSA still interprets AT&T's format by assuming that 0 is the initial state. (This behaviour is wrong. I will fix that soon.)
2. Your lexicon contains 9589 symbols, but your language model has only 3689, and I suspect that you want to map unknown words to the unknown class. FSA does not do this on its own unless you use the failure symbol *FAIL* instead of UNK, or use an intermediate transducer that maps lexicon words to lm words (you can use the map-fsa script to automate that).

Cheers,
Stephan
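
For readers following the thread, the idea behind the intermediate mapping in point 2 can be sketched in a few lines of C++. This is only an illustration under stated assumptions: mapLexiconToLm, WordMapping and the "UNK" symbol name are hypothetical, the sketch does not use the FSA library or reproduce what the map-fsa script does, and it only shows which word pairs such a mapping transducer would need to contain.

```cpp
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Each pair is (lexicon word, symbol the language model should see).
using WordMapping = std::vector<std::pair<std::string, std::string>>;

// Hypothetical helper, not part of FSA: keep every lexicon word the
// language model knows, and send all other words to one unknown symbol.
WordMapping mapLexiconToLm(const std::vector<std::string>& lexiconWords,
                           const std::unordered_set<std::string>& lmAlphabet,
                           const std::string& unknownSymbol) {
    WordMapping mapping;
    mapping.reserve(lexiconWords.size());
    for (const std::string& word : lexiconWords) {
        const bool known = lmAlphabet.count(word) > 0;
        mapping.emplace_back(word, known ? word : unknownSymbol);
    }
    return mapping;
}
```

Called with the lexicon's output alphabet and the language model's input alphabet, the result lists one pair per lexicon word; each pair would become one arc of a single-state mapping transducer placed between the two machines.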

Problem with the composition
by Emilian Stoimenov 17 Dec '04

Hi Stefan,

I have a strange problem with composition with the fsa tool - It says that " 'some word' is not in second alphabet". What does it mean? Could you please have a look at the files that I'm trying to compose - a dictionary transducer and a language model transducer. I've also attached the error log.

The command I use is

  fsa att:Dict_2003+.extended.fsm.txt att:lm.txt compose write att:LoG.txt

Thanx in advance,
Emilian

P.S. What is "the second alphabet" anyway? The output or the input alphabet of the second transducer?

Compiling and running FSA on a 64bit machine
by Emilian Stoimenov 17 Dec '04

Hi Stefan,

I tried to compile and run FSA on our Suse Linux x86_64 machine (uname: Linux 2.6.4-52-smp #1 SMP Wed Apr 7 01:58:54 UTC 2004 x86_64 x86_64 x86_64 GNU/Linux) with only limited success. I got some warnings in Unicode.cc, namely:

Unicode.cc: In member function `OutIterator Core::UnicodeInputConverter::convert(const InIterator&, const InIterator&, OutIterator) [with InIterator = const char*, OutIterator = std::back_insert_iterator<std::vector<char, std::allocator<char> > >]':
Unicode.cc:199: instantiated from here
Unicode.cc:159: warning: invalid conversion from `const char**' to `char**'
Unicode.cc: In member function `OutIterator Core::UnicodeInputConverter::convert(const InIterator&, const InIterator&, OutIterator) [with InIterator = const char*, OutIterator = std::back_insert_iterator<std::string>]':
Unicode.cc:204: instantiated from here
Unicode.cc:159: warning: invalid conversion from `const char**' to `char**'
Unicode.cc: In member function `OutIterator Core::UnicodeOutputConverter::convert(const InIterator&, const InIterator&, OutIterator) [with InIterator = const char*, OutIterator = std::back_insert_iterator<std::vector<char, std::allocator<char> > >]':
Unicode.cc:266: instantiated from here
Unicode.cc:223: warning: invalid conversion from `const char**' to `char**'
Unicode.cc: In member function `OutIterator Core::UnicodeOutputConverter::convert(const InIterator&, const InIterator&, OutIterator) [with InIterator = const char*, OutIterator = std::back_insert_iterator<std::string>]':
Unicode.cc:271: instantiated from here
Unicode.cc:223: warning: invalid conversion from `const char**' to `char**'
Unicode.cc: In member function `OutIterator Core::UnicodeOutputConverter::convert(const InIterator&, const InIterator&, OutIterator) [with InIterator = __gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, OutIterator = std::ostream_iterator<char, char, std::char_traits<char> >]':
Unicode.cc:276: instantiated from here
Unicode.cc:223: warning: invalid conversion from `const char**' to `char**'
Unicode.cc: In member function `OutIterator Core::UnicodeOutputConverter::convert(const InIterator&, const InIterator&, OutIterator) [with InIterator = const char*, OutIterator = std::ostream_iterator<char, char, std::char_traits<char> >]':
Unicode.cc:281: instantiated from here
Unicode.cc:223: warning: invalid conversion from `const char**' to `char**'
Unicode.cc: In member function `OutIterator Core::UnicodeOutputConverter::convert(const InIterator&, const InIterator&, OutIterator) [with InIterator = const char*, OutIterator = std::ostreambuf_iterator<char, std::char_traits<char> >]':
Unicode.cc:286: instantiated from here
Unicode.cc:223: warning: invalid conversion from `const char**' to `char**'

Also, I had to add an explicit typecast in the call to std::min:

  // u32 bufferSize = std::min(bufferThreshold_ - putBackBufferSize, formatted_.size());
  u32 bufferSize = std::min(size_t(bufferThreshold_ - putBackBufferSize), formatted_.size());

Otherwise the compiler complained that it could not find a matching std::min:

TextStream.cc: In member function `virtual int Core::TextInputStream::Buffer::underflow()':
TextStream.cc:422: error: no matching function for call to `min(unsigned int, size_t)'

Ignoring the warnings and modifying the code a bit got me the executable, which unfortunately I couldn't run, because it gave me a segmentation fault. Running gdb on fsa.linux-x86_64-standard and backtracing printed this output:

(gdb) backtrace
#0 0x000000000048cb9d in Core::Choice::addChoice ()
#1 0x000000000048cdda in Core::Choice::Choice ()
#2 0x000000000049e184 in global constructors keyed to _ZN4Core17AbstractParameterC2ERKS0_ ()
#3 0x00000000004bccf6 in __do_global_ctors_aux ()
#4 0x00000000004352e3 in _init ()
#5 0x00000000004bcc20 in Core::XmlWriter::Buffer::~Buffer ()
#6 0x00000000004bcc91 in __libc_csu_init () at elf-init.c:60
#7 0x0000002a96264e05 in __libc_start_main () from /lib64/tls/libc.so.6
#8 0x0000000000435f7a in _start () at ../sysdeps/x86_64/elf/start.S:96
#9 0x0000007fbffff0a8 in ?? ()
#10 0x0000000000000000 in ?? ()
#11 0x0000000000000001 in ?? ()
#12 0x0000007fbffff3d0 in ?? ()
#13 0x0000000000000000 in ?? ()
#14 0x0000007fbffff40e in ?? ()
#15 0x0000007fbffff427 in ?? ()
#16 0x0000007fbffff437 in ?? ()
#17 0x0000007fbffff45f in ?? ()
#18 0x0000007fbffff4e1 in ?? ()
#19 0x0000007fbffff508 in ?? ()
#20 0x0000007fbffff515 in ?? ()
#21 0x0000007fbffff520 in ?? ()
#22 0x0000007fbffff530 in ?? ()
#23 0x0000007fbffff558 in ?? ()
#24 0x0000007fbffff590 in ?? ()
#25 0x0000007fbffff599 in ?? ()
#26 0x0000007fbffff5ac in ?? ()
#27 0x0000007fbffff5c0 in ?? ()
#28 0x0000007fbffff5db in ?? ()
#29 0x0000007fbffff5e5 in ?? ()
#30 0x0000007fbffff5f2 in ?? ()
#31 0x0000007fbffff787 in ?? ()
#32 0x0000007fbffff79d in ?? ()
#33 0x0000007fbffff7a8 in ?? ()
#34 0x0000007fbffff7b6 in ?? ()
#35 0x0000007fbffff7c3 in ?? ()
#36 0x0000007fbffff7cb in ?? ()
#37 0x0000007fbffff7d9 in ?? ()
#38 0x0000007fbffff7f0 in ?? ()
#39 0x0000007fbffffa13 in ?? ()
#40 0x0000007fbffffa1d in ?? ()
#41 0x0000007fbffffa3f in ?? ()
#42 0x0000007fbffffa54 in ?? ()
#43 0x0000007fbffffa7c in ?? ()
#44 0x0000007fbffffa98 in ?? ()
#45 0x0000007fbffffaa9 in ?? ()
...

Everything compiled and ran OK on a 32bit machine. Is there anything that I can do to run FSA on the 64bit machine?

Best,
Emilian
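
The std::min error reported above can be reproduced outside FSA. std::min is a template that deduces a single type from both arguments, so a 32-bit unsigned int paired with a 64-bit size_t has no match, while on a typical 32-bit Linux build both are the same type and the call compiles. The following is a minimal sketch only: chooseBufferSize and its parameters are hypothetical stand-ins for the members named in the TextStream.cc excerpt, not the actual FSA code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the buffer-size computation in TextStream.cc.
// Writing std::min(threshold - putBack, formatted.size()) would not compile
// where unsigned int and std::size_t differ, because template argument
// deduction sees two different types. Converting the first argument
// explicitly gives both arguments the type std::size_t.
std::size_t chooseBufferSize(unsigned int threshold, unsigned int putBack,
                             const std::string& formatted) {
    return std::min(static_cast<std::size_t>(threshold - putBack),
                    formatted.size());
}
```

An alternative with the same effect is to name the template argument directly, as in std::min<std::size_t>(a, b), which converts both arguments instead of relying on deduction.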

ANNOUNCE fsa version 0.9.1
by Stephan Kanthak 16 Dec '04

Hi!

We are proud to announce the first maintenance release of fsa, namely version 0.9.1.

This release fixes:
* compilation problems on x86_64
* a crash on x86_64
* synchronous pruning
* reading / writing of AT&T's ASCII file format

New in this release:
* simplified interface: removed dump method from Semiring class
* some more documentation in the README
* a NEWS file

Cheers,
Stephan Kanthak

compilation success / at&t format weirdness
by Bryan Jurish 15 Dec '04

hi Stephan, hi list,

On 15 December 2004 at 21:45:18, Stephan Kanthak appears to have written:
> At least some success. You are the first one reporting successful compilation
> outside our institute....

well, here's another success report: compilation works for me on debian unstable (gcc 3.3.5), although i've had no success reading at&t format files (at&t format output works fine though) -- i haven't looked at the code yet, so if this is a known bug, just ignore me.

that said, the fsa library looks like a great toolkit: thanks for releasing it and for getting these lists up and running so quickly!

marmosets,
Bryan

test
by Stephan Kanthak 15 Dec '04

Hi!

This is just a short mail to test if the fsa mailing list is working. You can safely ignore it.

Stephan Kanthak
