strange lockups on sol. 9 machines

Riccardo Mottola rmottola at opencsw.org
Tue Apr 7 16:13:43 CEST 2015


Hi,

Dagobert Michelsen wrote:
> Hi Riccardo,
>
> On 26.03.2015 at 16:12, Riccardo Mottola <rmottola at opencsw.org> wrote:
>> did you have perhaps time investigating why I got those lockups on the solaris 9 build hosts?
> Yes. As I wrote earlier it is some kind of locking problem. The stacktrace looks
> like this:
>
> unstable9s% pstack 15521
> 15521:  git am --ignore-space-change --ignore-whitespace /home/dam/mgar/pkg/ar
>   ff3dcba4 lwp_park (0, 0, 0)
>   ff3d3d3c s9_lwp_park (0, 0, 0, 0, 0, 0) + 6c
>   ff3dcb08 s9_handler (0, 0, 0, 0, 0, 0) + 90
>   feca11bc mutex_lock_queue (fecb8c04, 0, feda2408, 0, 0, 0) + 104
>   feca18cc mutex_lock_internal (0, 0, fecb0000, 0, 0, 0) + 610
>   fed78a28 atfork_append (0, 0, fef1d060, feda2404, feda2408, 0) + 4
>   fed78b94 pthread_atfork (0, 0, fef1d060, fffbf8dc, 40400, 406e0) + 24
>   fef1d0dc ???????? (fef1d060, fef63a18, 0, 6000, fef63a18, 6000)
>   fed78d70 run_postfork_child (feda23f0, feda2408, 2526c, fec9cfa8, 5, 0) + 24
>   fec9cfa8 fork1    (1, 1afc8, 0, 0, 0, 0) + f4
>   fec9d05c fork     (0, 4, ffbff3b8, 223544, 1b5d80, 0) + 28
>   00151534 start_command (ffbff49c, 0, 9, 0, 0, 0) + 220
>   00151b80 run_command (ffbff49c, 0, 20, 0, 8, 8) + 4
>   00151cac run_command_v_opt (ffbff658, 28, 1e6c00, 223544, 1b5d80, 0) + 14
>   000212a8 ???????? (ffbff658, ffffffff, 1fc000, 223544, 1aadd0, 0)
>   00021308 ???????? (ffbff634, ffbff638, ffbffa90, 1dadc8, 0, 1b5000)
>   00021448 main     (ffbff658, ffbff634, 1fc818, 1b6fe8, ffbff638, ffbff7a0) + f8
>   0001f840 _start   (0, ffbff654, 1, ff3dca78, ff3ee850, ff3ee000) + 108
>
> I just updated unstable9s to the latest Solaris 9 patchcluster with no change in
> behaviour. If anyone has suggestions I am all ears.

Since it happens on both SPARC and x86, I wonder when and why it 
started happening. It used to work: I have built several packages before. Did 
we update mgar, or one of the packages it depends on?

If the trace is from git, I suppose it locks up when mgar invokes git. The 
problem would then be in git or one of its dependencies. Did we update them?
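
In case it helps to narrow this down: the trace seems to show 
pthread_atfork being called from inside run_postfork_child, that is, from 
a child-side fork handler, and then waiting on a mutex. Below is a minimal 
C sketch of that pattern that could be tried outside of git. The names 
probe_atfork_reentry and child_handler are made up for illustration, and 
it is only my assumption that this re-entrant registration is what blocks. 
I do not know which library registers the real handler.

```c
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical child-side fork handler: it registers another (empty)
 * handler while fork() is still running its post-fork handlers, which
 * is the call sequence the pstack output appears to show. */
static void child_handler(void)
{
    alarm(2);                          /* watchdog: SIGALRM ends a hung child */
    pthread_atfork(NULL, NULL, NULL);  /* re-entrant registration */
    alarm(0);                          /* got through, cancel the watchdog */
}

/* Returns 0 if the child got through the handler, 1 if it hung and was
 * ended by the watchdog, -1 on any other error. */
int probe_atfork_reentry(void)
{
    static int registered = 0;
    int status;
    pid_t pid;

    if (!registered) {
        if (pthread_atfork(NULL, NULL, child_handler) != 0)
            return -1;
        registered = 1;
    }

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);                      /* child: handler has already run */

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM)
        return 1;                      /* child was stuck in the handler */
    return 0;
}
```

Calling probe_atfork_reentry() from a small main() on unstable9s and on a 
host that still works would show whether the C library on the affected 
machines blocks in this situation. A result of 1 would point at the 
handler registration rather than at git itself.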

Riccardo


