Dear all, I was just informed that the benchmarking framework that we used in the past (BenchExec) is no longer available on StarExec. Hence, we'll have to work with "runsolver" instead. If I remember correctly, we used runsolver at some point, but then we switched to BenchExec due to issues with the measurement of the runtime. If you remember any details / your tool was affected by these problems, please let me know. Best Florian
Dear Florian, thanks for all your work.
If I remember correctly, we used runsolver at some point, but then we switched to BenchExec due to issues with the measurement of the runtime. If you remember any details / your tool was affected by these problems, please let me know.
Wasn't runsolver this piece of software that couldn't handle memory management properly, especially for Java applications like Aprove and MnM? We had lively discussions on this topic in 2019, but without helpful results. Only replacing runsolver by benchexec finally solved the issue. I definitely don't want to experience this disaster again. In 2019, I seriously considered removing MnM from TermComp for that reason. Excerpts from emails 29 Mar 2019: DH -> Akihisa Yamada and Jera Hensel: "[...] thanks a lot, also to Johannes, for your hints and suggestions to overcome the memory/core dump problem - but without success. The current workaround (start with 20G instead of 128G as last year) will most certainly result in losing proofs [...]" Answer by Jera Hensel: "AProVE is also affected. We run it with only 4G, which reduces the memouts significantly but does not solve the problem completely. We think that for most categories, at least it does not affect the ranking. But for example in „C Integer“, we lost the first place due to some memouts. However, we do not want to use even less than 4G since this might also lead to a loss of solved problems. Therefore, we just gave up on this." Best Dieter On Thu, 2024-06-13 at 15:06 +0200, Florian Frohn wrote:
Dear all,
I was just informed that the benchmarking framework that we used in the past (BenchExec) is no longer available on StarExec. Hence, we'll have to work with "runsolver" instead. If I remember correctly, we used runsolver at some point, but then we switched to BenchExec due to issues with the measurement of the runtime. If you remember any details / your tool was affected by these problems, please let me know.
-- Dieter Hofbauer dieter_hofbauer@web.de Mühlengasse 16, D-34125 Kassel, Germany phone (+49) 561 739 29 54
Dear Dieter,
If I remember correctly, we used runsolver at some point, but then we switched to BenchExec due to issues with the measurement of the runtime. If you remember any details / your tool was affected by these problems, please let me know. Wasn't runsolver this piece of software that couldn't handle memory management properly, especially for Java applications like Aprove and MnM? We had lively discussions on this topic in 2019, but without helpful results. Only replacing runsolver by benchexec finally solved the issue. I definitely don't want to experience this disaster again. In 2019, I seriously considered removing MnM from TermComp for that reason.
You're right, it was about memory, not runtime. I made clear that this is a serious issue for us, and asked whether they can revive benchexec. Best Florian
Dear all, I got new from the StarExec team. In the best case, BenchExec will be back at the beginning of next week. If they don't manage to get it running, we'll ditch the memory limit completely (by setting it to an unrealistically huge value), and we'll only run one benchmark per node at a time. If a solver exhausts the memory of its node, it will start swapping, slow down, and time out, eventually. I think both options are acceptable for us, even though the latter will double the time that's needed to run the competition, as we can only run one instead of two benchmarks per node. So in this case, we'll have to think about countermeasures (e.g., a smaller timeout). Best Florian -------- Forwarded Message -------- Subject: Re: [Termtools] Re: BenchExec no longer available Date: Thu, 13 Jun 2024 18:15:14 +0200 From: Florian Frohn <florian.frohn@informatik.rwth-aachen.de> To: termtools@lists.rwth-aachen.de Dear Dieter,
If I remember correctly, we used runsolver at some point, but then we switched to BenchExec due to issues with the measurement of the runtime. If you remember any details / your tool was affected by these problems, please let me know. Wasn't runsolver this piece of software that couldn't handle memory management properly, especially for Java applications like Aprove and MnM? We had lively discussions on this topic in 2019, but without helpful results. Only replacing runsolver by benchexec finally solved the issue. I definitely don't want to experience this disaster again. In 2019, I seriously considered removing MnM from TermComp for that reason.
You're right, it was about memory, not runtime. I made clear that this is a serious issue for us, and asked whether they can revive benchexec. Best Florian
Dear Florian, concerning BenchExec, that's good news. In any case we need reliable information about the allowed memory size for the competition. Otherwise, memory settings like Xmx$((${STAREXEC_MAX_MEM}-40))G fail spectacularly: Today's test on new.q reports "Could not reserve enough space for 137397010432KB object heap". Best Dieter On Fri, 2024-06-14 at 08:49 +0200, Florian Frohn wrote:
I got new from the StarExec team. In the best case, BenchExec will be back at the beginning of next week.
If they don't manage to get it running, we'll ditch the memory limit completely (by setting it to an unrealistically huge value), and we'll only run one benchmark per node at a time. If a solver exhausts the memory of its node, it will start swapping, slow down, and time out, eventually.
I think both options are acceptable for us, even though the latter will double the time that's needed to run the competition, as we can only run one instead of two benchmarks per node. So in this case, we'll have to think about countermeasures (e.g., a smaller timeout).
-- Dieter Hofbauer dieter_hofbauer@web.de Mühlengasse 16, D-34125 Kassel, Germany phone (+49) 561 739 29 54
Dear Dieter, I think STAREXEC_MAX_MEM is in Mb, so please try Xmx$((${STAREXEC_MAX_MEM}-40))M Best Florian On 6/14/24 13:51, Dieter Hofbauer wrote:
Dear Florian,
concerning BenchExec, that's good news.
In any case we need reliable information about the allowed memory size for the competition. Otherwise, memory settings like Xmx$((${STAREXEC_MAX_MEM}-40))G fail spectacularly: Today's test on new.q reports "Could not reserve enough space for 137397010432KB object heap".
Best Dieter
On Fri, 2024-06-14 at 08:49 +0200, Florian Frohn wrote:
I got new from the StarExec team. In the best case, BenchExec will be back at the beginning of next week.
If they don't manage to get it running, we'll ditch the memory limit completely (by setting it to an unrealistically huge value), and we'll only run one benchmark per node at a time. If a solver exhausts the memory of its node, it will start swapping, slow down, and time out, eventually.
I think both options are acceptable for us, even though the latter will double the time that's needed to run the competition, as we can only run one instead of two benchmarks per node. So in this case, we'll have to think about countermeasures (e.g., a smaller timeout). --
Dieter Hofbauer dieter_hofbauer@web.de Mühlengasse 16, D-34125 Kassel, Germany phone (+49) 561 739 29 54
_______________________________________________ Termtools mailing list -- termtools@lists.rwth-aachen.de To unsubscribe send an email to termtools-leave@lists.rwth-aachen.de
Dear all, BenchExec is available again on StarExec. However, it doesn't work yet in my tests (all pairs fail with 'run script errors'). I'll let you know as soon as it is fixed. Best Florian On 6/14/24 08:49, Florian Frohn wrote:
Dear all,
I got new from the StarExec team. In the best case, BenchExec will be back at the beginning of next week.
If they don't manage to get it running, we'll ditch the memory limit completely (by setting it to an unrealistically huge value), and we'll only run one benchmark per node at a time. If a solver exhausts the memory of its node, it will start swapping, slow down, and time out, eventually.
I think both options are acceptable for us, even though the latter will double the time that's needed to run the competition, as we can only run one instead of two benchmarks per node. So in this case, we'll have to think about countermeasures (e.g., a smaller timeout).
Best Florian
-------- Forwarded Message -------- Subject: Re: [Termtools] Re: BenchExec no longer available Date: Thu, 13 Jun 2024 18:15:14 +0200 From: Florian Frohn <florian.frohn@informatik.rwth-aachen.de> To: termtools@lists.rwth-aachen.de
Dear Dieter,
If I remember correctly, we used runsolver at some point, but then we switched to BenchExec due to issues with the measurement of the runtime. If you remember any details / your tool was affected by these problems, please let me know. Wasn't runsolver this piece of software that couldn't handle memory management properly, especially for Java applications like Aprove and MnM? We had lively discussions on this topic in 2019, but without helpful results. Only replacing runsolver by benchexec finally solved the issue. I definitely don't want to experience this disaster again. In 2019, I seriously considered removing MnM from TermComp for that reason.
You're right, it was about memory, not runtime. I made clear that this is a serious issue for us, and asked whether they can revive benchexec.
Best Florian
_______________________________________________ Termtools mailing list --termtools@lists.rwth-aachen.de To unsubscribe send an email totermtools-leave@lists.rwth-aachen.de
Dear all, finally, BenchExec is working again! Note that the current post-processors (plain.ari..., ceta.ari...) don't work with BenchExec. I'll upload new post-processors. Best Florian On 6/20/24 09:28, Florian Frohn wrote:
Dear all,
BenchExec is available again on StarExec. However, it doesn't work yet in my tests (all pairs fail with 'run script errors'). I'll let you know as soon as it is fixed.
Best Florian
On 6/14/24 08:49, Florian Frohn wrote:
Dear all,
I got new from the StarExec team. In the best case, BenchExec will be back at the beginning of next week.
If they don't manage to get it running, we'll ditch the memory limit completely (by setting it to an unrealistically huge value), and we'll only run one benchmark per node at a time. If a solver exhausts the memory of its node, it will start swapping, slow down, and time out, eventually.
I think both options are acceptable for us, even though the latter will double the time that's needed to run the competition, as we can only run one instead of two benchmarks per node. So in this case, we'll have to think about countermeasures (e.g., a smaller timeout).
Best Florian
-------- Forwarded Message -------- Subject: Re: [Termtools] Re: BenchExec no longer available Date: Thu, 13 Jun 2024 18:15:14 +0200 From: Florian Frohn <florian.frohn@informatik.rwth-aachen.de> To: termtools@lists.rwth-aachen.de
Dear Dieter,
If I remember correctly, we used runsolver at some point, but then we switched to BenchExec due to issues with the measurement of the runtime. If you remember any details / your tool was affected by these problems, please let me know. Wasn't runsolver this piece of software that couldn't handle memory management properly, especially for Java applications like Aprove and MnM? We had lively discussions on this topic in 2019, but without helpful results. Only replacing runsolver by benchexec finally solved the issue. I definitely don't want to experience this disaster again. In 2019, I seriously considered removing MnM from TermComp for that reason.
You're right, it was about memory, not runtime. I made clear that this is a serious issue for us, and asked whether they can revive benchexec.
Best Florian
_______________________________________________ Termtools mailing list --termtools@lists.rwth-aachen.de To unsubscribe send an email totermtools-leave@lists.rwth-aachen.de
_______________________________________________ Termtools mailing list --termtools@lists.rwth-aachen.de To unsubscribe send an email totermtools-leave@lists.rwth-aachen.de
participants (2)
-
Dieter Hofbauer
-
Florian Frohn