CentOS

Segmentation fault using RPM

I’ve got a CentOS 4.2 machine (x86_64 variety) that usually runs very sweetly. With hindsight, the adage ‘if it ain’t broke, don’t fix it’ comes to mind!

There I was doing the usual ‘yum update’ when the next thing I saw was ‘segmentation fault’ – agghhh! Anything I tried with yum resulted in the same problem, so I turned to the underlying rpm commands instead. Same problem: anything to do with rpm resulted in a segfault. Oh dear (or similar sentiments), thinks I, best check what’s going on elsewhere on the system. All seemed fine as usual: the web server, mail etc. were still running.

I tried digging around in the yum, rpm and message logs – nothing of any interest. Then I remembered that you can do an rpm --rebuilddb, as detailed on rpm’s site. However, that also failed with a segfault, so it was obviously something a little more serious. At least I’d now backed up the /var/lib/rpm directory, which I’d need later.
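Taking that backup before touching the database is worth spelling out. Here is a minimal sketch, assuming a POSIX shell; the helper name backup_rpmdb and the default backup location are made up for illustration and are not part of rpm:

```shell
#!/bin/sh
# Sketch only: copy the RPM database aside before attempting a rebuild.
# backup_rpmdb is a hypothetical helper, not something rpm ships.
backup_rpmdb() {
    src="${1:-/var/lib/rpm}"            # database directory to copy
    dest="${2:-$HOME/rpmdb-backup}"     # assumed backup location
    mkdir -p "$dest" || return 1
    # -R recurses, -p keeps modes and timestamps;
    # "src/." copies the directory's contents rather than the directory itself.
    cp -Rp "$src/." "$dest/" || return 1
    echo "$dest"
}
```

Run as root, something like `backup_rpmdb /var/lib/rpm /root/rpmdb-backup && rpm --rebuilddb` would only attempt the rebuild once the copy has succeeded.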

This stumped me for a while, so I started looking at a lower level, using ‘strace -f rpm’ to see where things failed. It was always at the same point, just after loading /lib64/tls/libc.so.6. So I uploaded (did I mention this machine is remote?) a replacement copy of the underlying libc-2.3.4.so and rsync’d it into place. No change: rpm was still segfaulting. After much pondering, googling and trawling with no enlightenment, I headed over to the nice folks at #rpm on irc.freenode.net, who were initially equally puzzled. Then jbj (I think) mentioned rpm2cpio.sh – a shell script that unpacks an rpm file when rpm is not installed or not working – which I’d obviously not spotted before.

So here’s how it was fixed – so I don’t forget, and because it may be of some use to someone else 🙂 :

  • I decided it was worth tackling the problem en masse, so I uploaded the rpm-related rpms from the install media to a temporary directory: ~/rpm_rescue
  • Made sure I was the root user (su), then unpacked the rpms (one at a time – wildcards don’t seem to be supported by the script) using rpm2cpio.sh, e.g.:

    # /usr/lib/rpm/rpm2cpio.sh rpm-4.3.3-11_nonptl.x86_64.rpm | cpio -dim

  • Cpio doesn’t seem to set directory permissions properly when run as root, so the following was necessary to sort that out:

    # find ./ -type d -exec chmod 755 {} \;

  • Then the ‘official’ method is to use tar cf - ./ | (cd /; tar xvf -), which I thought was a little confusing, so I used:

    # rsync -av --exclude='*.rpm' ./ /

  • and that, as they say, did the trick: the rpm command returned the usual usage message as expected. Whichever element of rpm had been hosed by yum had been overwritten by a fresh copy.
  • The only thing remaining was to copy the backup of /var/lib/rpm back into place, so that the current database was used rather than the default one from the rpm rpms themselves.
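Since rpm2cpio.sh wants one file per call, the unpacking and permission-fixing steps above could be wrapped in a small loop. This is only a sketch, assuming a POSIX shell; the function names extract_one and unpack_rpms and the RPM2CPIO variable are made up for illustration, and the script path is the one used above:

```shell
#!/bin/sh
# Sketch only: unpack every .rpm in a directory, one rpm2cpio.sh call per file.
# extract_one, unpack_rpms and RPM2CPIO are hypothetical names for this example.
RPM2CPIO="${RPM2CPIO:-/usr/lib/rpm/rpm2cpio.sh}"

# Unpack a single package into the current directory.
extract_one() {
    "$RPM2CPIO" "$1" | cpio -dim
}

# Unpack all packages found in the given directory (default: current directory).
unpack_rpms() {
    (
        # Subshell, so the caller's working directory is left alone.
        cd "${1:-.}" || exit 1
        for pkg in ./*.rpm; do
            [ -f "$pkg" ] || continue   # glob matched nothing
            echo "unpacking $pkg"
            extract_one "$pkg" || exit 1
        done
        # Reset directory modes, as in the find step above.
        find . -type d -exec chmod 755 {} \;
    )
}
```

Called as `unpack_rpms ~/rpm_rescue`, it stops at the first package that fails to unpack, so nothing half-extracted gets copied over the live system by accident.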

So that was panic over, job done: rpm and yum were back in action, and the system has since had several hundred files updated via yum – nice. 🙂