[csw-maintainers] [csw-buildfarm] Fwd: OpenCSW catalog update report (ganglia)

Daniel Pocock daniel at opencsw.org
Wed Nov 30 09:37:51 CET 2011


On 29/11/11 16:27, Dagobert Michelsen wrote:
> Hi Daniel,
> 
> Am 29.11.2011 um 06:08 schrieb Daniel Pocock:
>> On 28/11/11 22:43, Daniel Pocock wrote:
>>> On 28/11/11 21:22, Dagobert Michelsen wrote:
>>>> Am 28.11.2011 um 14:14 schrieb Daniel Pocock:
>>>>>>> 3. be aware that this gmond version doesn't run in a non-global zone
>>>>>>>
>>>>>>> http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=100
>>>>>>
>>>>>> Of course the webserver runs in a non-global zone :-P
>>>>>> Could you please apply Brian's patch as cited in the patch report?
>>>>>>
>>>>>
>>>>> There are comments in the bug report suggesting the patch may cause
>>>>> gmond not to work in a global zone, so I'm not sure if the patch is ideal
>>>>>
>>>>> I'll have to find a quick way to test if running in a zone or not
>>>>
>>>> Comment #19 reads ok for me:
>>>>
>>>>  "Just meet this issue in a Solaris zone. I compiled a small test with Brian's
>>>>   fix. And I found it works great, both in a non-global zone and global zone."
>>>
>>> http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=100#c3
>>>
>>>   'The problem with the "simple" solution is that it breaks normal
>>> (non-zone)
>>> setups.
>>>
>>> e.g. the following is from a Solaris-10 HA configuration:......'
>>>
>>> sounds a bit ominous to me.
> 
> But http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=100#c19 sounds promising.
> Do you have upstream contacts who can aid in finally fixing this?

Most of the emphasis on the upstream mailing list is on the various
Linux distributions.

It is definitely something I can fix if necessary, but I would have to
allocate some time to study the real requirements for both zone support
and the way Ganglia deals with network interface metrics.

> 
>>> At the very least, Ganglia 3.1 series might be able to just refuse to
>>> run in a zone, display a more meaningful error than `ioctl failed' or
>>> just disable this code in a zone or some other `safe' hack, and then
>>> comprehensive zone support can be introduced through trunk
>>>
>>> I don't know enough about zones, HA setups and other permutations to say
>>> the ideal way to address the issue
>>>
>>> However, another issue that does come to mind: if someone runs gmond in
>>> a zone, is it meaningful to report all of the CPU stats for every zone?
>>> Or is the CPU inside a zone not really measurable in the same way as a
>>> physical CPU in the global zone?
> 
> It is measurable, but by default you would also get some of the performance data
> from the global zone (like system load).

That's what I suspected. Given that each metric RRD file takes up some
disk space, a good implementation of Ganglia for zones should not send
metrics that are not meaningful there (e.g. the load/CPU-related metrics
that should be measured in the global zone).
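
A minimal sketch of that idea, not taken from the Ganglia source. It
assumes getzoneid() from <zone.h> is available (Solaris 10 and later) and
falls back to "global zone" on other platforms. The function names and the
list of metrics to suppress are hypothetical; load_one, load_five and
load_fifteen are ordinary gmond metric names, but whether they are the
right ones to drop in a zone is a guess.

```c
#include <stddef.h>
#include <string.h>

#if defined(__sun)
#include <zone.h>   /* getzoneid(), GLOBAL_ZONEID (assumes Solaris 10+) */
#endif

/* Returns 1 when running inside a non-global zone, 0 otherwise.
 * On non-Solaris platforms there are no zones, so this returns 0. */
int in_non_global_zone(void)
{
#if defined(__sun)
    return getzoneid() != GLOBAL_ZONEID;
#else
    return 0;
#endif
}

/* Hypothetical list of host-wide metrics that would only be reported
 * from the global zone. The selection is an assumption. */
static const char *const host_wide_metrics[] = {
    "load_one", "load_five", "load_fifteen"
};

/* Returns 1 if the named metric should be sent, 0 if it should be
 * suppressed because we are in a non-global zone. */
int metric_is_meaningful(const char *name, int non_global_zone)
{
    size_t i;
    size_t n = sizeof(host_wide_metrics) / sizeof(host_wide_metrics[0]);

    if (!non_global_zone)
        return 1;               /* global zone: send everything */

    for (i = 0; i < n; i++) {
        if (strcmp(name, host_wide_metrics[i]) == 0)
            return 0;           /* host-wide metric: skip in a zone */
    }
    return 1;
}
```

A caller would evaluate in_non_global_zone() once at startup and pass the
result to metric_is_meaningful() for each metric before creating or
sending it, so no RRD file is ever created for a suppressed metric.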

>> Another thought:
>>
>> - has anyone tried the Host sFlow agent on Solaris (and in a zone)?
>> http://host-sflow.sourceforge.net/
> 
> Solaris doesn't seem to be on the list of supported operating systems. Have
> you tried compiling it?
>
Just had a brief look at it today - the tarball contains a Linux
directory and a Windows directory.  Looking in SVN/trunk, it doesn't
look like they have started on Solaris support yet.

A Solaris port would probably involve working through the Linux files
and adapting them to read kstat, e.g.:
http://host-sflow.svn.sourceforge.net/viewvc/host-sflow/trunk/src/Linux/readCpuCounters.c?revision=238&view=markup
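
As a rough sketch of what the kstat side might look like (this is not
Host sFlow code): the block below walks the kstat chain and sums the
per-CPU tick counters from the "cpu_stat" module using libkstat. The
cpu_ticks_t struct and both function names are hypothetical, and on
non-Solaris platforms the reader simply returns -1. On Solaris it would
need to be linked with -lkstat.

```c
#include <stdint.h>
#include <string.h>

#if defined(__sun)
#include <kstat.h>
#include <sys/sysinfo.h>   /* cpu_stat_t, CPU_USER, CPU_KERNEL, ... */
#endif

/* Hypothetical container for summed CPU ticks; the field names are
 * illustrative and do not match any Host sFlow structure. */
typedef struct {
    uint64_t user;
    uint64_t kernel;
    uint64_t wait;
    uint64_t idle;
} cpu_ticks_t;

/* Add one CPU's tick counts into the running total. */
void cpu_ticks_add(cpu_ticks_t *total, uint64_t user, uint64_t kernel,
                   uint64_t wait, uint64_t idle)
{
    total->user   += user;
    total->kernel += kernel;
    total->wait   += wait;
    total->idle   += idle;
}

/* Sum ticks over all CPUs. Returns the number of CPUs read, or -1 if
 * kstat cannot be opened or the platform has no kstat at all. */
int read_cpu_ticks(cpu_ticks_t *total)
{
    memset(total, 0, sizeof(*total));
#if defined(__sun)
    kstat_ctl_t *kc = kstat_open();
    kstat_t *ksp;
    int ncpu = 0;

    if (kc == NULL)
        return -1;

    for (ksp = kc->kc_chain; ksp != NULL; ksp = ksp->ks_next) {
        cpu_stat_t cs;

        if (strcmp(ksp->ks_module, "cpu_stat") != 0)
            continue;                       /* not a per-CPU kstat */
        if (kstat_read(kc, ksp, &cs) == -1)
            continue;                       /* skip unreadable entries */

        cpu_ticks_add(total,
                      cs.cpu_sysinfo.cpu[CPU_USER],
                      cs.cpu_sysinfo.cpu[CPU_KERNEL],
                      cs.cpu_sysinfo.cpu[CPU_WAIT],
                      cs.cpu_sysinfo.cpu[CPU_IDLE]);
        ncpu++;
    }
    kstat_close(kc);
    return ncpu;
#else
    return -1;
#endif
}
```

The values are raw ticks; converting them to whatever unit the agent
reports would be a separate step in a real port.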

>> - if I update my package for Ganglia 3.2 and build a package of Host
>> sFlow agent, then that may provide an alternative solution, as the Host
>> sFlow agent may not face the zone problem, and users can still use
>> Ganglia's web interface as the reporting tool:
>> http://blog.sflow.com/2011/07/ganglia-32-released.html
> 
> sflow support in ganglia would certainly be a good thing.
> 

Ok, the package is in experimental now - I'd be interested to know
whether anyone has an sFlow agent they can test it with.

