From dam at opencsw.org Wed Jun 12 14:52:22 2019
From: dam at opencsw.org (Dagobert Michelsen)
Date: Wed, 12 Jun 2019 14:52:22 +0200
Subject: [buildfarm] About libpcap-rc and tcpdump-rc setups
In-Reply-To:
References: <52d7029e-e8ca-e5f3-1cf4-2174fe6c9584@orange.fr>
Message-ID: <4AF5398E-26AF-4F48-9D17-4AC3E06DF108@opencsw.org>

Hi Francois,

On 12.06.2019 at 11:40, Francois-Xavier Le Bail wrote:
> On 11/06/2019 20:48, Dagobert Michelsen wrote:
>> On 09.06.2019 at 15:32, Francois-Xavier Le Bail wrote:
>>> Could you add 'touch .devel' at the beginning of these build setups?
>>> (Trying to get more warnings, if not already the case)
>>
>> This is now added to all builders for tcpdump and libpcap right after checkout before configure.
>
> Sorry but the 'Set developer mode' is missing for 'tcpdump-solaris10-amd64' and
> 'tcpdump-solaris10-sparcv9' setups.

Rats, tcpdump has two builders for technical reasons, one for 32 bit and one for 64 bit,
and guess where I added it :-/ Hopefully this is fixed now.


Best regards

  -- Dago

--
"You don't become great by trying to be great, you become great by wanting to do something,
and then doing it so hard that you become great in the process." - xkcd #896

From dam at opencsw.org Wed Jun 19 12:49:12 2019
From: dam at opencsw.org (Dagobert Michelsen)
Date: Wed, 19 Jun 2019 12:49:12 +0200
Subject: Ruby CI on Solaris
In-Reply-To:
References: <1EA8DF6D-92BB-4C73-87F5-5BA4C3973775@opencsw.org>
Message-ID:

Hi Yusuke,

On 18.06.2019 at 01:08, Yusuke Endoh wrote:
> I'm Yusuke Endoh, who is a developer of the Ruby programming language.
>
> We are using OpenCSW build machines to test Ruby on Solaris 10/11 x86/SPARC.
> We highly appreciate your support.
>
> Recently, our test suite often fails on OpenCSW due to timeout, ENOSPC, or ENOMEM.
> A few years ago, it took much less than three hours, but currently it sometimes
> takes more than ten hours. This makes it difficult for us to manage the CIs.
>
> In fact, when I log in to the machines (especially unstable10s), they are painfully slow.
> Are they working properly? Or do you know if any trouble is happening?
>
> I'd be happy if you could give me any advice to address this issue.

The problem is that the zfs cache in the kernel we are using has a memory leak which results
in memory getting low after a couple of months. We cannot patch the system as the newer
kernel also introduces a new libc which makes it harder to build packages for older
versions of Solaris, so we must stick to it. The solution is to reboot the machine every couple
of months if it gets too slow. The solution is to drop me a note when the response times
become unacceptable, sorry :-/

I am currently on vacation with limited connectivity.
@Jan: Can you please reboot the 5220 sometime?


Best regards

  -- Dago

>
> Best regards,
> Yusuke Endoh
>
>
> On Tue, 18 Jun 2019 at 7:32, SHIBATA Hiroshi wrote:
> ---------- Forwarded message ---------
> From: NARUSE, Yui
> Date: Mon, Mar 23, 2015 at 3:57 AM
> Subject: Re: Ruby CI on Solaris
> To: Dagobert Michelsen
> Cc: SHIBATA Hiroshi
>
>
> Hi Dago,
>
> Thank you for the offer.
> rubyci.org runs chkbuild (https://github.com/akr/chkbuild).
> If you can provide a VM or an ssh account, I'll set it up.
>
> Best regards,
>
> --- naruse
>
> 2015-03-23 0:25 GMT+09:00 Dagobert Michelsen :
> > Hi,
> >
> > I got your contact from Luis Lavena.
> >
> > I package up Ruby on Solaris for OpenCSW and regularly have problems compiling
> > recent versions of Ruby. For other upstream projects I already offer buildbot
> > CI to enhance the general compatibility and noticed you already have a CI system
> > in place:
> > http://rubyci.org
> >
> > I would like to offer Solaris build hosts which can be connected to that CI
> > instance (Jenkins?). I hope you are the right person to talk to for the addition.
> >
> >
> > Best regards
> >
> > -- Dago
> >
> > --
> > "You don't become great by trying to be great, you become great by wanting to do something,
> > and then doing it so hard that you become great in the process." - xkcd #896
> >
> >
>
>
> --
> NARUSE, Yui
>
>
> --
> SHIBATA Hiroshi hsbt at ruby-lang.org
> https://www.hsbt.org/

--
"You don't become great by trying to be great, you become great by wanting to do something,
and then doing it so hard that you become great in the process." - xkcd #896

From jh at opencsw.org Wed Jun 19 12:57:55 2019
From: jh at opencsw.org (Jan Holzhueter)
Date: Wed, 19 Jun 2019 12:57:55 +0200
Subject: Ruby CI on Solaris
In-Reply-To:
References: <1EA8DF6D-92BB-4C73-87F5-5BA4C3973775@opencsw.org>
Message-ID: <680ba675-c137-254f-5827-6328a7d109d4@opencsw.org>

Hi,

On 19.06.19 at 12:49, Dagobert Michelsen wrote:
> Hi Yusuke,
>
> On 18.06.2019 at 01:08, Yusuke Endoh wrote:
>> I'm Yusuke Endoh, who is a developer of the Ruby programming language.
>>
>> We are using OpenCSW build machines to test Ruby on Solaris 10/11 x86/SPARC.
>> We highly appreciate your support.
>>
>> Recently, our test suite often fails on OpenCSW due to timeout, ENOSPC, or ENOMEM.
>> A few years ago, it took much less than three hours, but currently it sometimes
>> takes more than ten hours. This makes it difficult for us to manage the CIs.
>>
>> In fact, when I log in to the machines (especially unstable10s), they are painfully slow.
>> Are they working properly? Or do you know if any trouble is happening?
>>
>> I'd be happy if you could give me any advice to address this issue.
>
> The problem is that the zfs cache in the kernel we are using has a memory leak which results
> in memory getting low after a couple of months. We cannot patch the system as the newer
> kernel also introduces a new libc which makes it harder to build packages for older
> versions of Solaris, so we must stick to it. The solution is to reboot the machine every couple
> of months if it gets too slow.
> The solution is to drop me a note when the response times
> become unacceptable, sorry :-/
>
> I am currently on vacation with limited connectivity.
> @Jan: Can you please reboot the 5220 sometime?
>
>

Will put it on my list for tomorrow.

Greetings
Jan

From mame at ruby-lang.org Wed Jun 19 14:02:55 2019
From: mame at ruby-lang.org (Yusuke Endoh)
Date: Wed, 19 Jun 2019 21:02:55 +0900
Subject: Ruby CI on Solaris
In-Reply-To: <680ba675-c137-254f-5827-6328a7d109d4@opencsw.org>
References: <1EA8DF6D-92BB-4C73-87F5-5BA4C3973775@opencsw.org> <680ba675-c137-254f-5827-6328a7d109d4@opencsw.org>
Message-ID:

Dago and Jan,

I understand the situation.
When I see timeouts often, I'll contact you.
Thank you very much!

On Wed, 19 Jun 2019 at 19:58, Jan Holzhueter wrote:
> Hi,
>
> On 19.06.19 at 12:49, Dagobert Michelsen wrote:
> > Hi Yusuke,
> >
> > On 18.06.2019 at 01:08, Yusuke Endoh wrote:
> >> I'm Yusuke Endoh, who is a developer of the Ruby programming language.
> >>
> >> We are using OpenCSW build machines to test Ruby on Solaris 10/11 x86/SPARC.
> >> We highly appreciate your support.
> >>
> >> Recently, our test suite often fails on OpenCSW due to timeout, ENOSPC, or ENOMEM.
> >> A few years ago, it took much less than three hours, but currently it sometimes
> >> takes more than ten hours. This makes it difficult for us to manage the CIs.
> >>
> >> In fact, when I log in to the machines (especially unstable10s), they are painfully slow.
> >> Are they working properly? Or do you know if any trouble is happening?
> >>
> >> I'd be happy if you could give me any advice to address this issue.
> >
> > The problem is that the zfs cache in the kernel we are using has a memory leak which results
> > in memory getting low after a couple of months. We cannot patch the system as the newer
> > kernel also introduces a new libc which makes it harder to build packages for older
> > versions of Solaris, so we must stick to it.
> > The solution is to reboot the machine every couple
> > of months if it gets too slow. The solution is to drop me a note when the response times
> > become unacceptable, sorry :-/
> >
> > I am currently on vacation with limited connectivity.
> > @Jan: Can you please reboot the 5220 sometime?
> >
> >
> >
>
> Will put it on my list for tomorrow.
>
> Greetings
> Jan
>
>

From jh at opencsw.org Thu Jun 20 14:38:10 2019
From: jh at opencsw.org (Jan Holzhueter)
Date: Thu, 20 Jun 2019 14:38:10 +0200
Subject: Ruby CI on Solaris
In-Reply-To:
References: <1EA8DF6D-92BB-4C73-87F5-5BA4C3973775@opencsw.org> <680ba675-c137-254f-5827-6328a7d109d4@opencsw.org>
Message-ID:

Hi,

On 19.06.19 at 14:02, Yusuke Endoh wrote:
> Dago and Jan,
>
> I understand the situation.
> When I see timeouts often, I'll contact you.
> Thank you very much!

Reboot done. Should be better now.

Greetings
Jan

From dam at opencsw.org Thu Jun 20 22:03:51 2019
From: dam at opencsw.org (Dagobert Michelsen)
Date: Thu, 20 Jun 2019 22:03:51 +0200
Subject: Usage of /tmp on Solaris buildfarm
Message-ID: <0EDED39B-535C-4E6F-9E33-668F41A84D4B@opencsw.org>

Hi Chuck,

we noticed that you store a lot of stuff in /tmp. This is discouraged in Solaris as /tmp
is a memory filesystem - files there directly occupy RAM which is then missing from
file system cache making the whole system unusably slow. Please do not store stuff there,
if you need more space I can increase your quota. Your regular home is on ZFS with lazy write
enabled, so writing to /tmp should also not be faster than writing to your home directory.


Best regards

  -- Dago

--
"You don't become great by trying to be great, you become great by wanting to do something,
and then doing it so hard that you become great in the process."
- xkcd #896

From chuck.atkins at kitware.com Thu Jun 20 22:42:58 2019
From: chuck.atkins at kitware.com (Chuck Atkins)
Date: Thu, 20 Jun 2019 16:42:58 -0400
Subject: Usage of /tmp on Solaris buildfarm
In-Reply-To: <0EDED39B-535C-4E6F-9E33-668F41A84D4B@opencsw.org>
References: <0EDED39B-535C-4E6F-9E33-668F41A84D4B@opencsw.org>
Message-ID:

Hi Dago,

I moved the builds to run out of /tmp a while ago because it seemed to speed them up
significantly by doing so. I haven't looked at them in quite some time though as they've
been working well. There were many moving parts at the time and it's possible one of the
other changes I made to the automation is responsible for the speed up (like shifting the
start time to run when the machine is less loaded). I'll take a look at them tomorrow and
try running them out of home instead to stay tidy. IIRC the performance was a problem only
on the Solaris 10 machine so I should be able to move all the Solaris 11 builds without
issue. Best case scenario it will have little impact on build times. Worst case scenario
the Solaris 10 builds are significantly slower out of home and I just reduce the number of
builds running to focus on the most significant instead of trying to run all possible
combinations.

Thanks for the heads up. I'll clean it up.

- Chuck

On Thu, Jun 20, 2019, 16:03 Dagobert Michelsen wrote:
> Hi Chuck,
>
> we noticed that you store a lot of stuff in /tmp. This is discouraged in Solaris as /tmp
> is a memory filesystem - files there directly occupy RAM which is then missing from
> file system cache making the whole system unusably slow. Please do not store stuff there,
> if you need more space I can increase your quota. Your regular home is on ZFS with lazy write
> enabled, so writing to /tmp should also not be faster than writing to your home directory.
>
>
> Best regards
>
> -- Dago
>
> --
> "You don't become great by trying to be great, you become great by wanting to do something,
> and then doing it so hard that you become great in the process." - xkcd #896
>

From Joerg.Schilling at fokus.fraunhofer.de Fri Jun 21 12:34:52 2019
From: Joerg.Schilling at fokus.fraunhofer.de (Joerg Schilling)
Date: Fri, 21 Jun 2019 12:34:52 +0200
Subject: Usage of /tmp on Solaris buildfarm
In-Reply-To: <0EDED39B-535C-4E6F-9E33-668F41A84D4B@opencsw.org>
References: <0EDED39B-535C-4E6F-9E33-668F41A84D4B@opencsw.org>
Message-ID: <5d0cb2cc.AIVAB0WIV10PiFVL%Joerg.Schilling@fokus.fraunhofer.de>

Dagobert Michelsen via buildfarm wrote:

> Hi Chuck,
>
> we noticed that you store a lot of stuff in /tmp. This is discouraged in Solaris as /tmp
> is a memory filesystem - files there directly occupy RAM which is then missing from
> file system cache making the whole system unusably slow. Please do not store stuff there,
> if you need more space I can increase your quota. Your regular home is on ZFS with lazy write
> enabled, so writing to /tmp should also not be faster than writing to your home directory.

These files do not occupy RAM but rather virtual memory.

So they do not really affect the system performance.

Jörg

--
 EMail:joerg at schily.net (home) Jörg Schilling D-13353 Berlin
 joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'

From dam at opencsw.org Sun Jun 23 22:43:07 2019
From: dam at opencsw.org (Dagobert Michelsen)
Date: Sun, 23 Jun 2019 22:43:07 +0200
Subject: Usage of /tmp on Solaris buildfarm
In-Reply-To: <5d0cb2cc.AIVAB0WIV10PiFVL%Joerg.Schilling@fokus.fraunhofer.de>
References: <0EDED39B-535C-4E6F-9E33-668F41A84D4B@opencsw.org> <5d0cb2cc.AIVAB0WIV10PiFVL%Joerg.Schilling@fokus.fraunhofer.de>
Message-ID:

Hi Jörg,

On 21.06.2019 at 12:34, Joerg Schilling wrote:
> Dagobert Michelsen via buildfarm wrote:
>> we noticed that you store a lot of stuff in /tmp. This is discouraged in Solaris as /tmp
>> is a memory filesystem - files there directly occupy RAM which is then missing from
>> file system cache making the whole system unusably slow. Please do not store stuff there,
>> if you need more space I can increase your quota. Your regular home is on ZFS with lazy write
>> enabled, so writing to /tmp should also not be faster than writing to your home directory.
>
> These files do not occupy RAM but rather virtual memory.
>
> So they do not really affect the system performance.

Ah, a memory discussion, something I have been missing out on for a long time! :-)

Indeed /tmp is taken from virtual memory if there is no further configuration, which
is SWAP + RAM. When you fill up /tmp with more stuff than SWAP you do take away RAM.
Additionally, you get paging activity in terms of pageout if RAM gets low (or increased
scanner frequency if you go below "lotsfree"), which is slower than the ZFS delayed
writes (remember I have enabled delayed writes and disabled ZIL on this box and lofs
does not change semantics). Please also note that this is a rather old Solaris 10 kernel
and especially the memory system changed considerably during Solaris 11 development.

If you have further insights I am all ears :-)


Best regards

  -- Dago

--
"You don't become great by trying to be great, you become great by wanting to do something,
and then doing it so hard that you become great in the process." - xkcd #896
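
The /tmp arithmetic from the last message in this thread (virtual memory is swap plus RAM, so whatever /tmp holds beyond the size of swap has to sit in RAM) can be sketched roughly as below. This is only a simplified model of that one statement, not a description of how the Solaris kernel actually accounts for tmpfs pages. The function name and the example sizes are hypothetical and do not come from the buildfarm machines.

```python
def min_ram_held_by_tmp(tmp_bytes: int, swap_bytes: int) -> int:
    """Return a lower bound on the RAM occupied by the contents of /tmp.

    Assumption (simplified from the thread): /tmp is backed by virtual
    memory, i.e. swap + RAM. Even if every page that can be pushed to
    swap is pushed there, anything beyond the swap size must stay in RAM.
    """
    if tmp_bytes < 0 or swap_bytes < 0:
        raise ValueError("sizes must be non-negative")
    return max(0, tmp_bytes - swap_bytes)


GIB = 1024 ** 3

# Hypothetical sizes, chosen only for illustration.
print(min_ram_held_by_tmp(12 * GIB, 8 * GIB) // GIB)  # prints 4
print(min_ram_held_by_tmp(2 * GIB, 8 * GIB) // GIB)   # prints 0
```

The result is a lower bound only: pages in /tmp that would fit into swap can still be resident in RAM until the pageout scanner moves them, which is the paging cost mentioned in the thread.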