hang when building packages on unstable9s

Dagobert Michelsen dam at opencsw.org
Sun Apr 5 20:02:29 CEST 2015


Hi folks,

On 05.04.2015 at 19:41, Dagobert Michelsen <dam at opencsw.org> wrote:
> On 29.03.2015 at 22:21, Maciej Bliziński <maciej at opencsw.org> wrote:
>> On Sun, Mar 29, 2015 at 10:19:58PM +0200, Dagobert Michelsen wrote:
>>> On 29.03.2015 at 20:23, Maciej Bliziński via buildfarm <buildfarm at lists.opencsw.org> wrote:
>>>> Hangs in unstable9x too.
>>> 
>>> This is very strange, as unstable9s uses a loopback filesystem and unstable10s works
>>> fine as far as I can tell. The testing I did worked fine. Do you have a specific recipe
>>> I can run to reproduce the issue? I would like to understand the problem before rebooting
>>> the whole farm.
>> 
>> I've submitted my changes, try building pkg/arc/trunk. This triggers the
>> problem for me.
> 
> I just rebooted the global zone where NFS, login, unstable10s and others are hosted,
> and the problem is still present. One more good reason not to reboot servers when there
> is a problem :-)
> 
> The situation looks like this at the moment: on unstable9s some specific
> commands are stuck during fork, while the machine continues to run and
> login and some commands still work. Here is the process tree and
> stack for one of the hanging commands. Maybe someone has a bright idea.
> 
> unstable9s# ptree 18150
> 16482 /usr/lib/ssh/sshd
>  16816 /usr/lib/ssh/sshd
>    16823 /usr/lib/ssh/sshd
>      16826 -zsh
>        17145 /bin/bash /opt/csw/bin/mgar package
>          17182 gmake -I /home/dam/mgar/pkg/.buildsys/v2 package
>            18150 gmake GAR_PLATFORM=solaris9-sparc MODULATION=isa-sparcv8 ISA=
>              18519 /bin/sh -c ( if ggrep -q 'diff --git' /home/dam/mgar/pkg/ar
>                18520 /bin/sh -c ( if ggrep -q 'diff --git' /home/dam/mgar/pkg/
>                  18525 git am --ignore-space-change --ignore-whitespace /home/
>                    18527 git am --ignore-space-change --ignore-whitespace /hom
> 
> unstable9s# pstack 18527
> 18527:  git am --ignore-space-change --ignore-whitespace /home/dam/mgar/pkg/ar
> ff3dcba4 lwp_park (0, 0, 0)
> ff3d3d3c s9_lwp_park (0, 0, 0, 0, 0, 0) + 6c
> ff3dcb08 s9_handler (0, 0, 0, 0, 0, 0) + 90
> feca11bc mutex_lock_queue (fecb8c04, 0, feda0428, 0, 0, 0) + 104
> feca18cc mutex_lock_internal (0, 0, fecb0000, 0, 0, 0) + 610
> fed78440 atfork_append (0, 0, fef1d060, feda0424, feda0428, 0) + 4
> fed785ac pthread_atfork (0, 0, fef1d060, fffbf8dc, 40400, 406e0) + 24
> fef1d0dc ???????? (fef1d060, fef63a18, 0, 6000, fef63a18, 6000)
> fed78788 run_postfork_child (feda0410, feda0428, 23854, fec9cfa8, 5, 0) + 24
> fec9cfa8 fork1    (1, 1afc8, 0, 0, 0, 0) + f4
> fec9d05c fork     (0, 4, ffbff4a8, 223544, 1b5d80, 0) + 28
> 00151534 start_command (ffbff58c, 0, 9, 0, 0, 0) + 220
> 00151b80 run_command (ffbff58c, 0, 20, 0, 8, 8) + 4
> 00151cac run_command_v_opt (ffbff748, 28, 1e6c00, 223544, 1b5d80, 0) + 14
> 000212a8 ???????? (ffbff748, ffffffff, 1fc000, 223544, 1aadd0, 0)
> 00021308 ???????? (ffbff724, ffbff728, ffbffb80, 1dadc8, 0, 1b5000)
> 00021448 main     (ffbff748, ffbff724, 1fc818, 1b6fe8, ffbff728, ffbff890) + f8
> 0001f840 _start   (0, ffbff744, 1, ff3dca78, ff3ee850, ff3ee000) + 108

Just FYI: the same issue affects the buildbot slave and probably other applications:

buildbot@unstable9s :~/slave > pstack 20922
20922:  /opt/csw/bin/python2.7 /opt/csw/bin/buildslave start .
 ff3dcba4 lwp_park (0, 0, 0)
 ff3d3d3c s9_lwp_park (0, 0, 0, 0, 0, 0) + 6c
 ff3dcb08 s9_handler (0, 0, 0, 0, 0, 0) + 90
 ff1e11bc mutex_lock_queue (ff1f8c04, 0, ff1c0428, 0, 0, 0) + 104
 ff1e18cc mutex_lock_internal (0, 0, ff1b0000, 0, 0, 0) + 610
 ff198440 atfork_append (0, 0, fe7bd060, ff1c0424, ff1c0428, 0) + 4
 ff1985ac pthread_atfork (0, 0, fe7bd060, fffbf8dc, 40400, 406e0) + 24
 fe7bd0dc ???????? (fe7bd060, fe803a18, 0, 6000, fe803a18, 6000)
 ff198788 run_postfork_child (ff1c0410, ff1c0428, 23854, ff1dcfa8, b59ec740, fede1060) + 24
 ff1dcfa8 fork1    (1, 1afc8, ff0d7db0, feeca000, b59ec740, 0) + f4
 ff1dd05c fork     (fede81c0, fede1060, fee58c00, ff0e11a8, ff0c1614, 1) + 28
 ff05bd60 posix_fork (0, 0, a8c4eefc, 11aa90, ff0d9c78, 1) + 18
 ff0167bc PyEval_EvalFrameEx (fe4868a0, 0, 0, fedd5925, 20ab0, fe4869e8) + 556c
 ff017f5c PyEval_EvalFrameEx (fe5b4bf0, 0, fe4868a4, fedee4b6, 20ab0, 20ab0) + 6d0c
 ff017f5c PyEval_EvalFrameEx (feddc030, 0, 0, fedbc785, 20ab0, 20ab0) + 6d0c
 ff018c28 PyEval_EvalCodeEx (fee4dcc8, fee589c0, feddc030, 0, 0, 0) + 91c
 ff018ddc PyEval_EvalCode (fee4dcc8, fee589c0, fee589c0, ff0e1080, fed39488, ffffffff) + 28
 ff03d540 PyRun_FileExFlags (ff1c01f4, ffbffbd3, 97fa8, fee589c0, fee589c0, fee4dcc8) + 7c
 ff03e120 PyRun_SimpleFileExFlags (ffbffbe6, ffbffbd3, 1, ffbffa24, ff1c01f4, fee56f10) + e4
 ff053848 Py_Main  (1, ffbffa8c, ff0e3af8, 0, ff1c01f4, 0) + d3c
 000105c0 _start   (0, ffbffa8c, 1, ff3dca78, ff3ee850, ff3ee000) + 5c
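
Both stacks above share the same shape when read from the bottom up: fork calls
run_postfork_child, the handler it runs (the unresolved ???????? frame) calls
pthread_atfork, and that call ends up parked in mutex_lock_queue. That would be
consistent with a fork handler trying to register another handler while the
atfork list is still locked, but this is only a reading of the trace, not a
confirmed diagnosis. To check other stuck processes for the same pattern, a
rough Python sketch like the following could be fed pstack output. The function
names parse_frames and looks_like_atfork_reentry are made up for this example,
and the frame format is assumed to match the dumps in this mail.

```python
import re

# One pstack frame line: optional quote markers/indent, a hex address,
# a function name (possibly "????????"), then the argument list.
FRAME_RE = re.compile(r"^[>\s]*([0-9a-f]{8,16})\s+(\S+)\s*\(")


def parse_frames(pstack_text):
    """Return function names in pstack order (innermost frame first)."""
    frames = []
    for line in pstack_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.group(2))
    return frames


def looks_like_atfork_reentry(pstack_text):
    """True if the thread is parked inside pthread_atfork, which was
    itself called from below run_postfork_child (the pattern seen here)."""
    frames = parse_frames(pstack_text)
    if "pthread_atfork" not in frames or "run_postfork_child" not in frames:
        return False
    # pstack prints the innermost frame first, so a smaller index
    # means the call happened deeper in the stack.
    nested = frames.index("pthread_atfork") < frames.index("run_postfork_child")
    return nested and frames[0] == "lwp_park"


# Excerpt taken from the buildslave stack in this mail.
SAMPLE = """\
20922:  /opt/csw/bin/python2.7 /opt/csw/bin/buildslave start .
 ff3dcba4 lwp_park (0, 0, 0)
 ff1e11bc mutex_lock_queue (ff1f8c04, 0, ff1c0428, 0, 0, 0) + 104
 ff198440 atfork_append (0, 0, fe7bd060, ff1c0424, ff1c0428, 0) + 4
 ff1985ac pthread_atfork (0, 0, fe7bd060, fffbf8dc, 40400, 406e0) + 24
 fe7bd0dc ???????? (fe7bd060, fe803a18, 0, 6000, fe803a18, 6000)
 ff198788 run_postfork_child (ff1c0410, ff1c0428, 23854, ff1dcfa8, b59ec740, fede1060) + 24
 ff1dcfa8 fork1    (1, 1afc8, ff0d7db0, feeca000, b59ec740, 0) + f4
"""

if __name__ == "__main__":
    print(looks_like_atfork_reentry(SAMPLE))
```

On the farm, the output of "pstack <pid>" for each suspicious process could be
passed to looks_like_atfork_reentry instead of the SAMPLE string.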


Best regards

  -- Dago

-- 
"You don't become great by trying to be great, you become great by wanting to do something,
and then doing it so hard that you become great in the process." - xkcd #896
