Cacti SSH DISKFREE returning -1 randomly on just one volume

  • Cacti SSH DISKFREE returning -1 randomly on just one volume

    Hello

    We are using the template and script for getting Linux box stats via SSH. All is working well, except that it randomly reports -1 for the root partition's DISKFREE_available and DISKFREE_used. Strangely, it does this for only one of the several volumes checked on this box, all of which are checked at the same time. We have thold running on this, so it annoyingly triggers an email about low free space.

    Code:
    02/13/2014 12:01:28 PM - POLLER: Poller[0] Parsed MULTI output field 'nj:-1' [map nj->DISKFREE_used]
    02/13/2014 12:01:28 PM - POLLER: Poller[0] Parsed MULTI output field 'nk:-1' [map nk->DISKFREE_available]
    02/13/2014 12:01:28 PM - POLLER: Poller[0] Parsed MULTI output field 'nj:216727552' [map nj->DISKFREE_used]
    02/13/2014 12:01:28 PM - POLLER: Poller[0] Parsed MULTI output field 'nk:786444288' [map nk->DISKFREE_available]
    02/13/2014 12:01:28 PM - POLLER: Poller[0] Parsed MULTI output field 'nj:67098263552' [map nj->DISKFREE_used]
    02/13/2014 12:01:28 PM - POLLER: Poller[0] Parsed MULTI output field 'nk:33221103616' [map nk->DISKFREE_available]
    02/13/2014 12:01:28 PM - POLLER: Poller[0] Parsed MULTI output field 'nj:1564028928' [map nj->DISKFREE_used]
    02/13/2014 12:01:28 PM - POLLER: Poller[0] Parsed MULTI output field 'nk:339497058304' [map nk->DISKFREE_available]
    02/13/2014 12:01:28 PM - POLLER: Poller[0] CACTI2RRD: diskfree_used_336.rrd --template DISKFREE_used:DISKFREE_available 1392292864:-1:-1
    02/13/2014 12:01:28 PM - POLLER: Poller[0] CACTI2RRD: diskfree_used_337.rrd --template DISKFREE_used:DISKFREE_available 1392292864:216727552:786444288
    02/13/2014 12:01:28 PM - POLLER: Poller[0] CACTI2RRD: diskfree_used_341.rrd --template DISKFREE_used:DISKFREE_available 1392292864:67098263552:33221103616
    02/13/2014 12:01:28 PM - POLLER: Poller[0] CACTI2RRD: diskfree_used_342.rrd --template DISKFREE_used:DISKFREE_available 1392292864:1564028928:339497058304
    Can I get it to save the df output to a log file, maybe, so I can see better what's happening?
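While waiting for the next occurrence, one way to see which data source received the -1 values, and when, is to scan the poller log for CACTI2RRD lines like those in the excerpt above. The following is only a minimal sketch: the function name `find_failed_updates` is hypothetical, and the regular expression assumes the log lines look exactly like the excerpt, which may not hold for every Cacti version or log level.

```python
import re

# Matches CACTI2RRD lines in the format shown in the excerpt above (assumed format).
CACTI2RRD = re.compile(
    r"^(?P<when>\S+ \S+ \S+) - POLLER: Poller\[\d+\] CACTI2RRD: "
    r"(?P<rrd>\S+) --template (?P<names>\S+) (?P<ts>\d+):(?P<values>\S+)$"
)


def find_failed_updates(lines):
    """Return (when, rrd_file, field) for every field that was updated with -1."""
    failed = []
    for line in lines:
        match = CACTI2RRD.match(line.strip())
        if not match:
            continue  # not a CACTI2RRD line, ignore it
        names = match.group("names").split(":")
        values = match.group("values").split(":")
        for name, value in zip(names, values):
            if value == "-1":
                failed.append((match.group("when"), match.group("rrd"), name))
    return failed


# Two lines copied from the log excerpt in the post.
sample = [
    "02/13/2014 12:01:28 PM - POLLER: Poller[0] CACTI2RRD: diskfree_used_336.rrd "
    "--template DISKFREE_used:DISKFREE_available 1392292864:-1:-1",
    "02/13/2014 12:01:28 PM - POLLER: Poller[0] CACTI2RRD: diskfree_used_337.rrd "
    "--template DISKFREE_used:DISKFREE_available 1392292864:216727552:786444288",
]

for when, rrd, field in find_failed_updates(sample):
    print(when, rrd, field)
```

Run against the real log file, this would list each affected .rrd file with its timestamp, which makes it easier to check whether the failures cluster around particular times.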

    Thanks for your help

  • #2
    Ok, I've found the $debug_log option in the ss_get_by_ssh.php script. I shall see what it records when it happens next

    • #3
      Right, it has happened again now I have the debug log going.

      So it is timing out connecting to the server and thus logging it as "-1".

      Is there a better way to log this, or does anyone know of a way to stop thold triggering?

      Code:
      array (
        0 => 'result of ssh -q -o "ConnectTimeout 10" -o "StrictHostKeyChecking no" user@host -p 22 -i id_rsa \'df -k -P\'',
        1 => 'timed out',
      )
      I'll try increasing the timeout to see if it reduces the occurrences, but I'd prefer a better solution to stop false-alarm emails. I understand this may be better aimed at the thold people, so I'll go and find where I can ask them too.
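Besides a longer timeout, another option would be to retry a timed-out command before giving up and reporting a failure. The sketch below illustrates that idea only; it is not how ss_get_by_ssh.php works, the helper name `run_with_retry` is hypothetical, and a harmless stand-in command is used so the example runs without SSH access.

```python
import subprocess
import sys


def run_with_retry(cmd, timeout=10, attempts=3):
    """Run cmd, retrying on timeout or non-zero exit; return stdout, or None on failure."""
    for _ in range(attempts):
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            continue  # same failure mode as the 'timed out' entry in the debug log
        if result.returncode == 0:
            return result.stdout
    return None  # caller decides how to report "no value"


# Stand-in command; in practice cmd would be the ssh invocation from the debug log.
out = run_with_retry([sys.executable, "-c", "print('ok')"], timeout=5)
print(repr(out))
```

Note that the retries have to fit inside the poller interval, so the timeout multiplied by the number of attempts should stay well below it.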

      Thank you!

      • #4
        You can increase the timeout by raising this setting in ss_get_by_ssh.php.cnf:

        $cmd_tout = 10; # Command exec timeout (ssh itself or local cmd)

        • #5
          I have done that.

          I'm wondering though, is the -1 the correct way to log 'no value' to Cacti? Should it not be null or something?

          • #6
            There is no null for Cacti. 0 can mean a genuine value of 0, but -1 can mean that something is wrong and worth checking...
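For comparison, RRDtool's own update syntax does accept the letter U in place of a number to mark a value as unknown. The sketch below shows how an update string like the ones in the poller log could be built with U in place of failed readings. It is an illustration under assumptions: the helper name `to_rrd_update` is hypothetical, and the thread does not confirm whether the script, the Cacti poller, or thold can be made to pass U through in this setup.

```python
def to_rrd_update(timestamp, values):
    """Build an rrdtool update argument, writing U for missing or negative readings."""
    # Disk usage can never be negative, so a negative value is treated as "no data".
    fields = ["U" if v is None or v < 0 else str(v) for v in values]
    return ":".join([str(timestamp)] + fields)


# Values taken from the CACTI2RRD lines in the first post.
print(to_rrd_update(1392292864, [-1, -1]))                # 1392292864:U:U
print(to_rrd_update(1392292864, [216727552, 786444288]))  # 1392292864:216727552:786444288
```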
