Dear all,
some jobs of the first run have been started. Please monitor via: https://termcomp.github.io/Y2022/
(Not all jobs because of the popular permission problems.)
Good luck! Akihisa
Dear all,
most jobs are started. And already a conflict is detected! And the fault is clearly on NaTT...
Best, Akihisa
On 7/26/2022 7:30 PM, YAMADA, Akihisa wrote:
Dear all,
some jobs of the first run have been started. Please monitor via: https://termcomp.github.io/Y2022/
(Not all jobs because of the popular permission problems.)
Good luck! Akihisa
On 7/26/22 13:25, YAMADA, Akihisa wrote:
.. conflict ... NaTT...
that's in TRS-Relative. This makes me a bit skeptical about SRS-relative "new results" (NB: I like this, and other options for navigation that you implemented) where NaTT claims to solve Zantema_06_relative/rel{11,12}
But maybe (certainly) I don't understand your paper (JAR 2017) well enough.
rel11 is "obviously" terminating but the argument seems to require arithmetics. I think Hans crafted these to show exactly that. Was there an accompanying paper? (*)
I once found a proof (matrices of dimension 5) for rel12, but never for rel11.
Best regards, Johannes.
(*) some of these problems had annotations when submitted - but these were lost in XMLification?
tpdb-4.0/SRS/Zantema06/rel11.srs:
(RULES b p b -> b a p b , p ->= a p a , a p a a ->= p ) (COMMENT: invariant after first rule: left from p more a's than right from p )
Dear Johannes,
no sorry, I'm sure NaTT 2.3 on relative is incorrect. It seems some processors don't look at relative rules. I can't debug soon, though...
Best, Akihisa
On 7/27/2022 8:54 PM, Johannes Waldmann wrote:
On 7/26/22 13:25, YAMADA, Akihisa wrote:
.. conflict ... NaTT...
that's in TRS-Relative. This makes me a bit skeptical about SRS-relative "new results" (NB: I like this, and other options for navigation that you implemented) where NaTT claims to solve Zantema_06_relative/rel{11,12}
But maybe (certainly) I don't understand your paper (JAR 2017) well enough.
rel11 is "obviously" terminating but the argument seems to require arithmetics. I think Hans crafted these to show exactly that. Was there an accompanying paper? (*)
I once found a proof (matrices of dimension 5) for rel12, but never for rel11.
Best regards, Johannes.
(*) some of these problems had annotations when submitted - but these were lost in XMLification?
tpdb-4.0/SRS/Zantema06/rel11.srs:
(RULES b p b -> b a p b , p ->= a p a , a p a a ->= p ) (COMMENT: invariant after first rule: left from p more a's than right from p )
Termtools mailing list -- termtools@lists.rwth-aachen.de To unsubscribe send an email to termtools-leave@lists.rwth-aachen.de
Dear all,
Jobs are ready for the final run. https://termcomp.github.io/Y2022/
I'll start them tomorrow morning in Haifa time.
Best, Akihisa
Dear all,
it turned out we don't have a live screen until IJCAR starts tomorrow. I'll start some big jobs (TRS/SRS Standard, Complexity) later today, and other categories tomorrow morning.
Best, Akihisa
On 8/6/2022 1:36 PM, YAMADA, Akihisa wrote:
Dear all,
Jobs are ready for the final run. https://termcomp.github.io/Y2022/
I'll start them tomorrow morning in Haifa time.
Best, Akihisa _______________________________________________ Termtools mailing list -- termtools@lists.rwth-aachen.de To unsubscribe send an email to termtools-leave@lists.rwth-aachen.de
Dear all,
some complexity jobs are started.
Best, Akihisa
On 8/7/2022 9:13 AM, YAMADA, Akihisa wrote:
Dear all,
it turned out we don't have a live screen until IJCAR starts tomorrow. I'll start some big jobs (TRS/SRS Standard, Complexity) later today, and other categories tomorrow morning.
Best, Akihisa
On 8/6/2022 1:36 PM, YAMADA, Akihisa wrote:
Dear all,
Jobs are ready for the final run. https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftermcomp.g...
I'll start them tomorrow morning in Haifa time.
Best, Akihisa _______________________________________________ Termtools mailing list -- termtools@lists.rwth-aachen.de To unsubscribe send an email to termtools-leave@lists.rwth-aachen.de
Termtools mailing list -- termtools@lists.rwth-aachen.de To unsubscribe send an email to termtools-leave@lists.rwth-aachen.de
Hi Akihisa, and all,
some complexity jobs are started.
Thanks for putting in the work. I am looking forward to the competition. I regret I can't be in Haifa/at WST.
I encourage participants, and observers, to comment on what they are seeing - to let remote participants in on some of the fun.
For example, for the complexity categories, I noticed several "new" results for Aprove-2021, and I thought - how can that be, since Aprove (if it's really -2021, yes it is, config id matches) should have got them last year as well?
It turns out that all of these new results are obtained at 295 seconds (or later), so it is noise in the timing (in Aprove, in JVM, in bench/star-exec) that makes the difference here?
It's fine - other (sports) competitions are stochastic as well.
- J.
Dear Johannes, and all,
It turns out that all of these new results are obtained at 295 seconds (or later), so it is noise in the timing (in Aprove, in JVM, in bench/star-exec) that makes the difference here?
last evening SMT-comp organizers informed me that StarExec machines are renewed. I guess this resulted in former timeouts to turn to success. (It was also quicker than I expected.)
I'll confirm this with Aaron.
Best, Akihisa
It's fine - other (sports) competitions are stochastic as well.
- J.
Termtools mailing list -- termtools@lists.rwth-aachen.de To unsubscribe send an email to termtools-leave@lists.rwth-aachen.de
renewed. I guess this resulted in former timeouts to turn to success.
but in these cases (complexity) I think that Aprove has some (trivial) bound very early, but keeps looking for better bounds, and prints the trivial bound only at the very last moment.
"faster stareexec" does not help here - but exact timing does.
One could have a different mode for competition: output is a sequence of (true) statements BOUNDS(low,hi) (of nested intervals), and the last (narrowest) one counts. I think applications would prefer something like this as well.
- J.
Dear Johannes, all,
Aaron confirmed that we are on the same nodes. So it must be due to the stochastic behavior, indeed.
Some of the new bounds are not trivial, BTW. You can filter with "new results" to inspect them: https://termcomp.herokuapp.com/Y2022/Derivational_Complexity__TRS.html?filte...
One could have a different mode for competition: output is a sequence of (true) statements BOUNDS(low,hi) (of nested intervals), and the last (narrowest) one counts. I think applications would prefer something like this as well.
Upper and lower bounds (or even termination claims) could better be given independently. But I'll wait until tool authors are motivated.
Best, Akihisa
Dear all,
I've started other demanding jobs (TRS/SRS, Complexity, Certified). I'll start other smaller jobs in next break and try to present it on the conference screen.
Best, Akihisa
Dear all,
all but a few jobs are finished. Sorry that it was not really successfully live. There was a connection trouble.
"Teams" are redefined after a long discussion.
Best, Akihisa
On 8/8/2022 8:32 AM, YAMADA, Akihisa wrote:
Dear all,
I've started other demanding jobs (TRS/SRS, Complexity, Certified). I'll start other smaller jobs in next break and try to present it on the conference screen.
Best, Akihisa
Dear Akihisa,
Thanks for running TermCOMP 2022!
On the "Advancing-the-State-of-the-Art Ranking" the score of MU-TERM is 0.
I do not know very well how such an score is obtained, but
(1) We obtained better results than last year in CTRS_conditional(operational termination); ok, this was due to an increase of the CTRS_conditional collection; and (2) We promoted a "demo" on termination of CTRSs.
I thought this could be considered as "advancing the state of the art", specially (2). Perhaps (2) does not count because it is not an 'official' competition. If this is the case, I suggest that, in the future, tools aiming to obtain results in such demo competitions (addressing new problems in termination and complexity should also be recognized as "advacing the state of the art", as I think this is primarily a scientific concept, rather than a subject for competition.
Best regards,
Salvador.
El 8/8/22 a las 14:16, YAMADA, Akihisa escribió:
Dear all,
all but a few jobs are finished. Sorry that it was not really successfully live. There was a connection trouble.
"Teams" are redefined after a long discussion.
Best, Akihisa
On 8/8/2022 8:32 AM, YAMADA, Akihisa wrote:
Dear all,
I've started other demanding jobs (TRS/SRS, Complexity, Certified). I'll start other smaller jobs in next break and try to present it on the conference screen.
Best, Akihisa
Termtools mailing list -- termtools@lists.rwth-aachen.de To unsubscribe send an email to termtools-leave@lists.rwth-aachen.de
Dear Johannes,
but in these cases (complexity) I think that Aprove has some (trivial) bound very early, but keeps looking for better bounds, and prints the trivial bound only at the very last moment.
Yes, most likely that's what happening. AProVE keeps searching for better bounds until it either manages to prove a tight complexity bound, or all available techniques terminated. If the time limit is exceeded, it tries to output the best result that has been found so far before it gets killed, but that can fail if it's stuck in an expensive computation.
One could have a different mode for competition: output is a sequence of (true) statements BOUNDS(low,hi) (of nested intervals), and the last (narrowest) one counts. I think applications would prefer something like this as well.
That's a good idea (even though it might be quite a bit of work to adapt the implementation).
Best Florian
termtools@lists.rwth-aachen.de