[csw-maintainers] Garbage collecting the package database
Maciej (Matchek) Bliziński
maciej at opencsw.org
Fri Nov 30 17:34:35 CET 2012
The package database stores metadata about every package ever analyzed
with checkpkg. Only a small fraction of built and checked packages
make it into catalogs, so the database holds a boatload of packages
that no catalog uses.
I wrote a script to find and delete unused packages:
http://sourceforge.net/apps/trac/gar/changeset/19788
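The idea behind the cleanup can be sketched as follows. This is not the
script from the changeset above, only a minimal illustration against an
in-memory SQLite database. The table names (package, catalog_entry) and
column names (md5, basename, catalog, package_md5) are made up for the
example and do not reflect the real checkpkg schema.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE package (md5 TEXT PRIMARY KEY, basename TEXT);
        CREATE TABLE catalog_entry (catalog TEXT, package_md5 TEXT);
    """)
    conn.executemany(
        "INSERT INTO package VALUES (?, ?)",
        [("aaa", "foo-1.0.pkg"), ("bbb", "foo-0.9.pkg"), ("ccc", "bar-2.0.pkg")],
    )
    # Only one of the three packages is referenced by a catalog.
    conn.execute("INSERT INTO catalog_entry VALUES ('unstable', 'aaa')")
    conn.commit()
    return conn


def collect_garbage(conn):
    """Delete packages that no catalog references; return the number removed."""
    cur = conn.execute("""
        DELETE FROM package
        WHERE NOT EXISTS (
            SELECT 1 FROM catalog_entry
            WHERE catalog_entry.package_md5 = package.md5
        )
    """)
    conn.commit()
    return cur.rowcount
```

Calling collect_garbage(make_demo_db()) removes the two unreferenced
rows and leaves the one package that a catalog still points at. A real
cleanup would also have to remove the per-package metadata hanging off
each deleted row.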
There were about 80k packages (defined as SVR4 .pkg files), of which
only 16.5k are actually used in catalogs (since the stable and current
catalogs are now dropped from the database). Before garbage
collection, the compressed database dump was 815MB; now it is 150MB.
I'm guessing that the size reduction will also have performance
benefits.
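For a sense of scale, the reductions implied by the figures above can
be worked out directly. The inputs are the rounded numbers quoted in
this message, so the results are approximate; percent_reduction is a
helper name made up for this example.

```python
def percent_reduction(before, after):
    """Return how much smaller 'after' is than 'before', in percent."""
    return 100.0 * (before - after) / before


# Rounded figures from this message: ~80k packages before, 16.5k kept.
packages_removed_pct = percent_reduction(80_000, 16_500)

# Compressed dump size in MB before and after garbage collection.
dump_shrink_pct = percent_reduction(815, 150)

print(f"packages removed: {packages_removed_pct:.1f}%")
print(f"dump size reduction: {dump_shrink_pct:.1f}%")
```

With these inputs, roughly four out of five packages were removed, and
the dump shrank by a similar proportion.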
Maciej