[csw-devel] SF.net SVN: gar:[19788] csw/mgar/gar/v2/lib/python/garbage_collection.py

wahwah at users.sourceforge.net
Fri Nov 30 16:48:14 CET 2012


Revision: 19788
          http://gar.svn.sourceforge.net/gar/?rev=19788&view=rev
Author:   wahwah
Date:     2012-11-30 15:48:12 +0000 (Fri, 30 Nov 2012)
Log Message:
-----------
garbage-collection: Delete unused packages from DB

By default, the package database indexes and stores every package ever checked
by checkpkg. While this is useful in many cases, there is no need to store
these packages forever. This script finds and drops all unused packages.

It currently doesn't do anything like "drop unused packages older than
X days"; it drops all unused packages.

Added Paths:
-----------
    csw/mgar/gar/v2/lib/python/garbage_collection.py

Added: csw/mgar/gar/v2/lib/python/garbage_collection.py
===================================================================
--- csw/mgar/gar/v2/lib/python/garbage_collection.py	                        (rev 0)
+++ csw/mgar/gar/v2/lib/python/garbage_collection.py	2012-11-30 15:48:12 UTC (rev 19788)
@@ -0,0 +1,43 @@
+#!/opt/csw/bin/python2.6
+# coding=utf-8
+#
+# $Id$
+#
+# The idea is to remove the package stats entries (and their blobs, and files)
+# for packages that aren't part of any catalogs.
+#
+# The main query can take a couple minutes. Please be careful with editing
+# this script, because if you screw up the main query, it can obliterate the
+# whole database. Make backups!
+
+import logging
+import sys
+
+import configuration
+import models as m
+from sqlobject import sqlbuilder
+
+def main():
+  configuration.SetUpSqlobjectConnection()
+  total_pkgs = m.Srv4FileStats.select().count()
+  logging.info("There are {0} packages to inspect.".format(total_pkgs))
+  res = m.Srv4FileStats.select(
+      sqlbuilder.NOTEXISTS(
+        sqlbuilder.Select(m.Srv4FileInCatalog.q.id,
+                          where=(
+            sqlbuilder.Outer(m.Srv4FileStats).q.id == m.Srv4FileInCatalog.q.srv4file))
+      )
+    ).orderBy('id')
+  deleted_pkgs = 0
+  for stats in res:
+    # logging.info("Package {0} ({1}) is not in any catalogs. Removing.".format(stats.basename, stats.md5_sum))
+    stats.DeleteAllDependentObjects()
+    stats.destroySelf()
+    deleted_pkgs += 1
+    sys.stdout.write(".")
+    sys.stdout.flush()
+  logging.info("Deleted {0} unused packages.".format(deleted_pkgs))
+
+if __name__ == '__main__':
+  logging.basicConfig(level=logging.INFO)
+  main()
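
For readers who want to see what the main query selects before running it
against a real database, here is a minimal sketch of the same NOT EXISTS
anti-join, written against an in-memory SQLite database with the standard
sqlite3 module. The table and column names (srv4_file_stats,
srv4_file_in_catalog, srv4file_id) and all function names are assumptions made
for illustration; the actual schema generated by SQLObject is not shown in
this commit, and the real script also removes dependent objects via
DeleteAllDependentObjects(), which this sketch does not model.

```python
import sqlite3


def make_demo_db():
  """Build a tiny in-memory database with a hypothetical schema."""
  conn = sqlite3.connect(":memory:")
  conn.execute(
      "CREATE TABLE srv4_file_stats (id INTEGER PRIMARY KEY, basename TEXT)")
  conn.execute(
      "CREATE TABLE srv4_file_in_catalog ("
      "id INTEGER PRIMARY KEY, srv4file_id INTEGER)")
  conn.executemany(
      "INSERT INTO srv4_file_stats (id, basename) VALUES (?, ?)",
      [(1, "a.pkg"), (2, "b.pkg"), (3, "c.pkg")])
  # Packages 1 and 3 are referenced by a catalog; package 2 is not.
  conn.executemany(
      "INSERT INTO srv4_file_in_catalog (srv4file_id) VALUES (?)",
      [(1,), (3,), (3,)])
  conn.commit()
  return conn


def find_unused_ids(conn):
  """Return ids of stats rows that no catalog row points at, ordered by id."""
  rows = conn.execute(
      """
      SELECT s.id FROM srv4_file_stats AS s
      WHERE NOT EXISTS (
        SELECT c.id FROM srv4_file_in_catalog AS c
        WHERE c.srv4file_id = s.id)
      ORDER BY s.id
      """)
  return [row[0] for row in rows]


def delete_unused(conn):
  """Delete the unused rows one by one and return how many were removed."""
  ids = find_unused_ids(conn)
  conn.executemany(
      "DELETE FROM srv4_file_stats WHERE id = ?", [(i,) for i in ids])
  conn.commit()
  return len(ids)
```

Calling find_unused_ids() on its own gives a dry run: it lists what would be
deleted without touching anything, which may be a useful check given the
warning in the script header about a faulty query wiping the database.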
