[Termtools] Re: survey: what software for SAT/SMT encoding (for

17 Aug 2022

      Dear Johannes, dear all,
...
- how do you write the constraints
In AProVE, arithmetic constraints are created programmatically and represented using recursive data structures, without an explicit intermediate language. These data structures are then either exported to SMT-LIB2 strings or bit-blasted to propositional formulas.

For RPO, a dedicated propositional encoding is used:

Michael Codish, Jürgen Giesl, Peter Schneider-Kamp, René Thiemann:
SAT Solving for Termination Proofs with Recursive Path Orders and Dependency Pairs. J. Autom. Reason. 49(1): 53-93 (2012)

http://verify.rwth-aachen.de/giesl/papers/RPO-Journal-distribute.pdf

The representation of propositional formulas allows for arbitrary nesting of connectives. Formula construction uses structural hashing (aka sharing of subformulas), implemented via suitable instances of the Factory pattern. Formulas are converted to CNF via the implementation of Tseitin's equisatisfiable encoding in SAT4J.
...
- what logics do you use
Beyond propositional logic:

-- QF_NIA is used for arithmetic interpretations that are at least as expressive as linear polynomial interpretations (max-polynomials, matrix interpretations, non-linear polynomials, ...); carriers of these interpretations are the naturals, the integers, and the non-negative rationals.

Solved via bit-blasting with structural hashing/sharing of subformulas and several simplifications during formula construction, using the encoding in

Carsten Fuhs, Jürgen Giesl, Aart Middeldorp, Peter Schneider-Kamp, René Thiemann, Harald Zankl:
SAT Solving for Termination Analysis with Polynomial Interpretations. SAT 2007: 340-354

https://www.dcs.bbk.ac.uk/~carsten/papers/SAT07-satpolo.pdf

Experiments with Z3 v3.2 in 2011 (for a talk at a Z3 workshop) showed that solving QF_NIA formulas from the search for linear polynomial interpretations with /unbounded/ arithmetic variables was very slow.

However, adding constraints a_i <= N for variables a_i and small values for a constant bound N led to a performance that was only slightly worse than that of AProVE's dedicated SAT encoding for the same bound N. QF_NIA has the benefit over QF_BV that no overflow-related issues can occur, and internally SMT solvers can make use of the bound N.

(Rerunning such experiments with /current/ versions of Z3, Yices, CVC5, ... might be interesting.)

-- QF_LIA is used for linear polynomials with hard-coded coefficient 1, where only the additive weight is needed, and for KBO following the encoding in

Harald Zankl, Nao Hirokawa, Aart Middeldorp:
KBO Orientability. J. Autom. Reason. 43(2): 173-201 (2009)

http://cl-informatik.uibk.ac.at/users/hzankl/new/publications/ZA09_01.pdf

-- Quantifier-free arctic naturals/integers (as pointed out earlier, closely related to QF_LIA/QF_LRA, but with an additional -\infty value) are used in the search for arctic matrix interpretations. AProVE uses a variant of the unary order encoding described in Section 5.3 of

Carsten Fuhs:
SAT encodings: from constraint-based termination analysis to circuit synthesis. RWTH Aachen University, 2012, pp. 1-188

http://sunsite.informatik.rwth-aachen.de/Publications/AIB/2011/2011-17.pdf
...
- how do you manage the (external) solver process
-- External solvers called via Java API: SAT4J, SMTInterpol.

For quick queries on very small inputs (e.g., checking validity of conditional polynomial constraints from concrete max-polynomial interpretations or checking effectiveness of a semantic labelling), SAT4J is used - the overhead of spawning an external process is avoided.

This matters in particular for JVM-based termination tools: creating a new process first makes a copy of the parent process - with all its used memory - and only then runs the called program, with undesired effects (even if the process copy is actually created lazily by the OS):

https://www.tapatalk.com/groups/starexec/strange-memouts-t106.html

In this sense, for "heavy-duty" constraint solving in a competitive setting, also systems-level considerations matter.

-- External solvers used as processes: MiniSat, Yices, Z3.

MiniSat 2.2 is arguably not the most recent solver (last update in 2008?), so improvements via (suitably configured) more modern SAT solvers may be possible. According to Section 4 of

Emre Yolcu, Scott Aaronson, Marijn J. H. Heule:
An Automated Approach to the Collatz Conjecture. CADE 2021: 468-484

https://link.springer.com/content/pdf/10.1007/978-3-030-79876-5_27.pdf

a suitable configuration for modern solvers may be to change the solver's branching behaviour from phase-saving (default in many recent solvers) to negative branching (default in MiniSat - always try 'false' before 'true').

AProVE kills external processes on OS level if either their timer (configured in the strategy language) has elapsed or separate progress in the termination proof makes the remaining computation by the process unnecessary: for example, when AProVE searches for an RPO and a polynomial interpretation with two running MiniSat instances in parallel, and an RPO that simplifies the termination problem is found, then the search for a polynomial interpretation, along with its MiniSat process, are aborted immediately. Being able to abort computations immediately, also in a parallel setting, is a motivation to use external processes rather than a solver API.

An overview of AProVE's constraint solving automation via SAT and SMT solving with further pointers is available in Section 4 of

Jürgen Giesl, Cornelius Aschermann, Marc Brockschmidt, Fabian Emmes, Florian Frohn, Carsten Fuhs, Jera Hensel, Carsten Otto, Martin Plücker, Peter Schneider-Kamp, Thomas Ströder, Stephanie Swiderski, René Thiemann:
Analyzing Program Termination and Complexity Automatically with AProVE. J. Autom. Reason. 58(1): 3-31 (2017)

https://www.dcs.bbk.ac.uk/~carsten/papers/JAR-AProVE.pdf

Best regards,

   Carsten

On 16/08/2022 13:29, Johannes Waldmann wrote:
...
Dear all,
I am preparing my course at ISR 22
about "constraint programming for analysis of rewriting".
(plan: https://www.imn.htwk-leipzig.de/~waldmann/talk/22/isr/ )
If you are using SAT/SMT in your termination prover -
how do you actually do it?
- how do you write the constraints
- what logics do you use
- how do you manage the (external) solver process
Example answers for matchbox - see below.
I'd appreciate brief answers here on the list
(or in email, and I can later summarize back)
perhaps with pointers to source and publications.
If there are no (current) publications - then we could write one?
Thanks - Johannes.
For example - for matchbox, the constraints are really
QF_NIA (standard matrices) and QF_LRA (arctic)
but I am not writing them as such,
but instead I am using an (eager) bit-blasting translation to SAT,
with the Ersatz library. That library neatly solves the
memoization problem (with hidden unsafePerformIO
https://hackage.haskell.org/package/ersatz-0.4.12/docs/src/Ersatz.Problem.ht...
)
so I can write the constraints in a functional way
(e.g., unknown numbers, and matrices, are instances of Num,
so `sum` automatically works, etc.)
For actual SMT writing, a nicely typed embedded DSL is
https://hackage.haskell.org/package/smtlib2-1.0/docs/Language-SMTLib2.html
but it seems unmaintained, and I found it hard to add patches
(necessary because of changes in SMTLIB format specs)
I recently looked into https://hackage.haskell.org/package/simple-smt
but it requires some ugly monadic code, cf. evaluate_rule(_arc)
https://gitlab.imn.htwk-leipzig.de/waldmann/pure-matchbox/-/blob/457-find-ar...
For actual SMT solving - I found these problems:
- QF_NIA (polynomial integer arithmetic) is hopeless,
   we could use QF_BV (bitvectors) instead
   but their arithmetic operators are silently overflowing
   (it's modular arithmetic)  so it requires extra (Boolean) programming
- QF_LRA is nice, but for each arctic number, I need an extra Boolean
   (to encode MinusInfinity)
in all, the Boolean aspect seems to dominate,
so I could bit-blast directly (which I do).
Perhaps it's time for an arctic theory solver ...
Oh, and another technical point - I prefer to call the solver
via its (C) API, instead of writing to file, and calling an external
process. It seems faster - and I can stop/kill the solver (process)
much more reliably. But - while the textual input/output format is
fixed, those binary interfaces require extra (solver-specific) code
that will bit-rot quickly. SAT solvers seem to settle on IPASIR,
perhaps there's a similar standard emerging for SMT?
_______________________________________________
Termtools mailing list -- termtools@lists.rwth-aachen.de
To unsubscribe send an email to termtools-leave@lists.rwth-aachen.de
WARNING: This email originated from outside of Birkbeck. Do not click links or open attachments unless you recognise the sender. Under testing so any comments sg@dcs.bbk.ac.uk
-- 
Carsten Fuhs
Birkbeck, University of London
Department of Computer Science and Information Systems
http://www.dcs.bbk.ac.uk/~carsten/