Dear Martin,
I think that is it: The number of transitions is almost 2^36, but is stored by MRMC in an integer variable. Most C compilers assign 32 bits for this type.
I noticed that and adapted the variable types from int to long.
I propose that, if you adapt the code, you ask for access to the svn repository. I cannot grant it to you, but somebody at RWTH Aachen should be able to do so. The people responsible have been very reluctant to grant write access, but they can at least create a branch for you.
After the first pass through the file, MRMC allocates space for the actual transitions: initially 12 B per transition, which will later be extended by another 4 B for the so-called backsets. However, it only allocates space for 45866397 transitions (= 64470375837 modulo 2^32), i.e. 550396764 B, or about 525 MB.
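To make the arithmetic above concrete, here is a minimal sketch in C. The helper names count_as_32bit() and initial_bytes() are made up for illustration and do not appear in MRMC; the sketch uses unsigned types because overflow of a signed int is undefined behaviour in C, so it only models the wrap-around described above rather than reproducing MRMC's exact code path.

```c
#include <stdint.h>

/* Hypothetical helper (not part of MRMC): the value a 32-bit counter
   ends up holding when the true count is n. Conversion to an unsigned
   32-bit type is defined as reduction modulo 2^32. */
uint32_t count_as_32bit(uint64_t n)
{
    return (uint32_t)n;
}

/* Hypothetical helper: bytes needed for the initial storage,
   assuming 12 B per transition as stated in the text. */
uint64_t initial_bytes(uint64_t transitions)
{
    return transitions * 12u;
}
```

With the numbers from this thread, count_as_32bit(64470375837) yields 45866397, and initial_bytes(45866397) yields 550396764. For the full transition count, initial_bytes(64470375837) yields 773644510044, i.e. roughly 720 GB before the backsets are added.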
Do you think allocating more space is feasible (in case it's physically available)?
It should be possible, as malloc() and calloc() have parameters of type size_t (which is typedef'd to be unsigned long in many C compilers). Of course, you will also have to change the types of the relevant calculations to size_t. The function allocate_sparse_matrix_ncolse() in src/storage/sparse.c allocates one big array for all transitions and should use size_t. The function set_mtx_val_ncolse() in the same file updates the backset or the backpointer, which are allocated as separate, smaller chunks of memory. It is probably enough to use int there, but it is more consistent to use size_t as well.
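As a rough illustration of doing the size calculation in size_t, here is a minimal sketch. The functions checked_mul_size() and alloc_transitions() are hypothetical and are not taken from src/storage/sparse.c; the point is only that the product count * element size is computed and checked in size_t before it reaches malloc().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: stores count * elem_size in *out and returns 1
   if the product fits in size_t; returns 0 otherwise. */
int checked_mul_size(size_t count, size_t elem_size, size_t *out)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return 0;               /* product would wrap around */
    *out = count * elem_size;
    return 1;
}

/* Hypothetical stand-in for the allocation step: returns a zeroed
   array of `count` elements, or NULL if the size overflows, is zero,
   or the memory is not available. */
void *alloc_transitions(size_t count, size_t elem_size)
{
    size_t bytes;
    void *p;

    if (!checked_mul_size(count, elem_size, &bytes) || bytes == 0)
        return NULL;
    p = malloc(bytes);
    if (p != NULL)
        memset(p, 0, bytes);
    return p;
}
```

The caller has to test the result for NULL and free() it later. On a platform where size_t has only 32 bits, the transition count from this thread cannot be represented at all, so the check only helps on a 64-bit build.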
When this space is full, MRMC will (without checking further) store remaining transitions outside its assigned memory, which then leads to the core dump. I think this is why MRMC uses (a bit more than) 3 GB before crashing.
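A bounds check at the point where transitions are stored would turn this kind of silent overrun into a reportable error. The following is only a sketch under assumed names: the type transition_buf and the function buf_append() are invented for illustration and do not correspond to MRMC's actual data structures.

```c
#include <stddef.h>

/* Hypothetical container (not MRMC's layout): a fixed-capacity
   array of column indices. */
typedef struct {
    size_t capacity;   /* number of slots that were allocated */
    size_t used;       /* number of slots already filled */
    unsigned *cols;    /* storage provided by the caller */
} transition_buf;

/* Stores one entry. Returns 0 on success and -1 if the buffer is
   already full, instead of writing past the end of the array. */
int buf_append(transition_buf *buf, unsigned col)
{
    if (buf->used >= buf->capacity)
        return -1;
    buf->cols[buf->used++] = col;
    return 0;
}
```

A reader loop could then stop with an error message as soon as buf_append() returns -1, rather than continuing until the process crashes.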
I'll try to find a workaround for that.
Please have a close look at the source file src/io/read_tra_file.c. That file reads the transition file and allocates memory.
Kind regards,
David N. Jansen.