munin-node (1.4.0-1) unstable; urgency=low munin-node-configure-snmp command is no longer available, use munin-node-configure --snmp to configure snmp hosts. If upgrading from 1.2.6, please review the /usr/share/doc/munin/UPGRADING file as there is an issue with truncated field names in plugins (especially with the df plugin), resulting in loss of history, which can be fixed manually. -- Tom Feiner Fri, 04 Dec 2009 18:29:16 +0200 munin-node (1.2.0-1) unstable; urgency=low * There are two major bugfixes in the 1.2.x series of Munin since 1.0.x that could not be accomplished without introducing a risk of losing historical data after upgrades. Or more precisely: no data will be lost, but the exact name of the RRD file will change, so that the update process will start collecting data into a new, emtpy, file, which in turn will be read by munin-graph, and the final result is that the graph will appear to have lost all data. The historical data will still be present in the old graph. In the last two sections of this file I will attempt to detail how you can minimize the data loss by carefully planning how to perform the upgrade. * The infrastructure for sending warnings if values drop below or rise above preset boundaries has been redesigned to improve flexibility, and are no longer specific to NSCA/Nagios. The old nsca_* settings are still recognized, and are automatically mapped into a contact with the name "old-nagios". Hence the now deprecated munin.conf entries nsca /bin/nsca nsca_server sloth.fud.no nsca_config /etc/nsca.cf would implicitly be converted to the entry contact.old-nagios.command /bin/nsca sloth.fud.no -c /etc/nsca.cf -to 60 unless the latter was explicitly defined in /etc/munin/munin.conf, in which case the deprecated entries would be ignored. * Data loss issue 1 ================= A number of plugins which in the 1.0.x series used the COUNTER data type has now been changed to use the DERIVE type, with a minimum of 0. The reason is to hinder RRDtool from misdetecting counter wraps when a service or machine is restarted, which resulted in abnormal spikes in those graphs. The munin-update component from the 1.2.x series are able to recognize that a plugin has changed thusly, and will automatically copy all the historic data from the old RRD file into the new one, ensuring a smooth transition. However, the munin-update component from the 1.0.x series are not aware of this, and will react to this data type change by starting to collect data into a new, empty, RRD file. The method to ensure a painless upgrade is simple: Ensure that you upgrade the "munin" package BEFORE you upgrade the ================================================================== "munin-node" package on any of the hosts it collects data from. =============================================================== Should you however have already upgraded the packages in the wrong order, you may salvage your graphs by manually change the data type in the old RRD file, and afterwards rename it. For instance, you may have this RRD file containing the "user" field from the "cpu" plugin of munin-node 1.0.x: /var/lib/munin/fud.no/lust.fud.no-cpu-user-c.rrd After upgrading to version 1.2.x of munin-node, this will have changed to: /var/lib/munin/fud.no/lust.fud.no-cpu-user-d.rrd If the "munin" package wasn't upgraded before "munin-node" one, you will have both files, and the latter one will only contain the data gathered since the upgrade of the "munin-node" package. In order to make the old data reappear in the graph, you may do so using the following procedure: cd /var/lib/munin/fud.no rrdtool tune lust.fud.no-cpu-user-c.rrd -d 42:DERIVE mv -f lust.fud.no-cpu-user-c.rrd lust.fud.no-cpu-user-d.rrd You will have to repeat this process once for each field in each affected plugin. Also remember to ensure that the "munin" system user have write access to the resulting RRD file when you are finished. Be warned, however, that by doing this you will lose all data collected since munin-node was upgraded to version 1.2.x. * Data loss issue 2 ================= The 1.0.x series had rather nasty design flaw that caused field names longer than 18 characters be truncated, removing any excessive characters from the start of the field name. This led to a nasty bug; if a plugin reported values for two fields, who both had long names where the last 18 charaters were the same, only one RRD file would be generated, and its contents would be unpredictable. The 1.2.x series do not exhibit this behaviour, and will store the entire field name as part of the RRD file name. As this leads to the fact that a new, empty, file will be created with the non-truncated field name, the graphs will appear to have been reset. To solve this you need to manually figure out which RRD files are affected, and rename them so that they are called what the new version of Munin expects them to. To figure out which files may be affected, you can do the following: cd /var/lib/munin ls */*.rrd | awk '-F[/-]' '{if(length($4)==18) print}' This will output one line for each file that may be affected, for instance: fud.no/pride.fud.no-df-v_mapper_pride_usr-g.rrd The three first strings separated by hyphens in the filename is the interesting ones. The first is the host as named in /etc/munin/munin.conf, the second is the plugin name, and the third is the possibly mangled field name. I say "possibly", because any RRD files with a field name that is exactly 18 characters long will also be reported, even though they are not affected by the change. To figure out if the file is indeed affected, and what the new name should be, you need to ask the host's Munin-node process. First, you need to figure out the DNS hostname or IP address of the node, unless you already know it. This information can be found in the file /etc/munin/munin.conf, and will for this example look like this: [pride.fud.no] address 127.0.0.1 Next, connect to the host's Munin-node process: telnet 127.0.0.1 munin After receiving the welcoming "# munin node at pride.fud.no" banner, input: fetch df "df" is of course the plugin name as found embedded in the RRD file name above. You should now get the values reported by the plugin in return: _dev_hda5.value 54 _dev_mapper_pride_usr.value 88 _dev.value 54 The field names are the strings before the periods. At this point the correct field name is obvious - the truncated field name "v_mapper_pride_usr" is the last 18 characters of "_dev_mapper_pride_usr", so the latter must be the correct one. Now that you know that, you can rename the RRD file so that the new version can find it: cd /var/lib/munin/fud.no mv pride.fud.no-df-v_mapper_pride_usr-g.rrd \ pride.fud.no-df-_dev_mapper_pride_usr-g.rrd If you find no possible matches, it may be because the RRD file contains data that are no longer collected, which could've happened in this example if the filesystem in /dev/mapper/pride-usr was unmounted in the past. To find out if that is the case, look at the time stamp of the file to see when it was last modified. If that's a long time ago, chances are the file isn't used anyway and can be left alone. If you're really unfortunate, you may end up with multiple possibilities, which could've happened in the example used here if both a device named /udev/mapper/pride-usr and also one named /dev/mapper/pride-usr was mounted simultaneously. If this is the case, you can't do anything but inspect the relevant graph as created with Munin 1.0 to see if the field seems to contain the correct data for at least one of the fields, and rename the RRD accordingly. However, there is a possibility that the RRD will contain useless data that isn't correct for either of the fields. In any case, you won't be able to bring back correct data for both the fields, as it wasn't collected properly to begin with. You will have to repeat the process for every possibly affected RRD file, after which you may safely upgrade your "munin" package. -- Tore Anderson Mon, 21 Feb 2005 00:16:25 +0100