[csw-maintainers] Garbage collecting the package database

Maciej (Matchek) Bliziński maciej at opencsw.org
Fri Nov 30 17:34:35 CET 2012


The package database stores metadata about every package ever analyzed
with checkpkg. Only a small fraction of built and checked packages
make it into catalogs, so the database holds a boatload of packages
that no catalog uses.

I wrote a script to find and delete unused packages:
http://sourceforge.net/apps/trac/gar/changeset/19788
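
The general idea can be sketched as follows. This is not the script
from the changeset above, only a minimal illustration under assumed
names: the tables `packages` and `catalog_entries`, their columns, and
the function `collect_garbage` are all hypothetical, and an in-memory
SQLite database stands in for the real one.

```python
import sqlite3


def collect_garbage(conn):
    """Delete packages that no catalog references; return how many were deleted.

    Assumes a hypothetical schema: packages(md5, pkgname) and
    catalog_entries(catalog, md5). The real schema may differ.
    """
    cur = conn.execute(
        "DELETE FROM packages "
        "WHERE md5 NOT IN (SELECT md5 FROM catalog_entries)"
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Build a tiny example database and run the garbage collection on it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE packages (md5 TEXT PRIMARY KEY, pkgname TEXT);
        CREATE TABLE catalog_entries (catalog TEXT, md5 TEXT NOT NULL);
        """
    )
    # Three analyzed packages, made-up identifiers.
    conn.executemany(
        "INSERT INTO packages VALUES (?, ?)",
        [("a1", "CSWfoo"), ("b2", "CSWfoo"), ("c3", "CSWbar")],
    )
    # Only one of them is referenced by a catalog.
    conn.executemany(
        "INSERT INTO catalog_entries VALUES (?, ?)",
        [("unstable", "b2")],
    )
    deleted = collect_garbage(conn)
    remaining = [row[0] for row in conn.execute(
        "SELECT md5 FROM packages ORDER BY md5")]
    conn.close()
    return deleted, remaining
```

Calling `demo()` removes the two unreferenced rows and keeps the one
package that a catalog still points at.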

There were about 80k packages (defined as SVR4 .pkg files), of which
only 16.5k are actually used in catalogs (since the stable and current
catalogs have now been dropped from the database). Before garbage
collection, the compressed database dump was 815 MB; now it is
150 MB. I'm guessing that the size reduction will also have performance
benefits.
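
For a sense of scale, the figures above work out to roughly 79% of the
packages and 82% of the compressed dump being removed. A quick check,
treating the approximate numbers above as exact (the helper
`fraction_removed` is just for illustration):

```python
def fraction_removed(before, after):
    """Return the share of `before` that is gone once only `after` remains."""
    return (before - after) / before


# Approximate figures quoted above.
packages_removed = fraction_removed(80_000, 16_500)  # package count
dump_removed = fraction_removed(815, 150)            # compressed dump size in MB

print(f"packages removed: {packages_removed:.0%}")
print(f"dump size removed: {dump_removed:.0%}")
```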

Maciej

