[csw-maintainers] automated catalog promotion for packages

Fri Nov 11 10:54:59 CET 2011

On 11.11.2011 at 09:17, Maciej (Matchek) Bliziński wrote:

> 2011/11/11 Ben Walton <bwalton at opencsw.org>:
>> Do we agree that all package removals should be done manually or does
>> anyone think this should be done automatically too?
> 
> If it were to be done automatically, there would need to be a
> mechanism detecting whether it's safe to do so.  We could have a rule
> e.g. that we remove *_stub packages that nothing depends on.  This
> way, to remove package foo from the catalog, you would replace it with
> an empty foo_stub, by uploading foo_stub to the catalog.  Then the
> automation would scan the catalog, see that foo_stub exists and has no
> dependent packages - and harvest it.

The problem is that the stub must exist until the user has actually
updated. That means the stub must stay inside a named release until
the next named release or the packages obsoleted by the stub will
not get removed.
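
A minimal sketch of the stub rule with the retention constraint added.
It is an illustration only: the function name, the dict-based catalog
and the "shipped" set are assumptions, not existing tooling.

```python
def harvestable_stubs(catalog, shipped_in_named_release):
    """Return the *_stub packages that look safe to remove.

    catalog: dict mapping package name -> set of package names it depends on
             (hypothetical in-memory view of the unstable catalog).
    shipped_in_named_release: set of package names that were already part of
             a named release, so users had a chance to pick up the stub.
    """
    depended_on = set()
    for deps in catalog.values():
        depended_on |= set(deps)
    return sorted(
        name
        for name in catalog
        if name.endswith("_stub")
        and name not in depended_on               # nothing depends on it
        and name in shipped_in_named_release      # users could already update
    )
```

A stub that has not yet been part of a named release is kept, even if
nothing depends on it.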

>> For package adds and updates we'll need to determine the point in time
>> that a package (add or update) enters the unstable catalog.  The
>> simplistic view would use the REV= date stamp for adds/updates but
>> that can be misleading as a package might sit in experimental for an
>> unknown amount of time before being pushed.  Thus we need to watch the
>> catalog(s) and detect package entry.  This is easy enough to do as we
>> have machinery for this task already.  The initial spotting of a new
>> package should trigger a clock for this package that upon expiry will
>> see it moved forward.
> 
> I agree, we can't use REV for this purpose.  We need some kind of a
> datestamp denoting when a certain package entered the catalog.  I'm
> thinking that the srv4_file_in_catalog table could have an additional
> timestamp column for this purpose.
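
To make the clock idea concrete, here is a rough sketch that records
when a (package, REV) pair is first seen in a catalog snapshot. It keeps
the timestamps in a plain dict instead of the proposed database column,
and the 14-day waiting period is an assumed value, not something that
was decided in this thread.

```python
from datetime import datetime, timedelta


def record_entries(first_seen, catalog, now):
    """Remember when each (name, rev) pair was first spotted.

    first_seen: dict mapping (name, rev) -> datetime, updated in place.
    catalog: dict mapping package name -> REV string (hypothetical snapshot).
    """
    for name, rev in catalog.items():
        # setdefault keeps the original timestamp for entries already known
        first_seen.setdefault((name, rev), now)
    return first_seen


def clock_expired(first_seen, name, rev, now, wait=timedelta(days=14)):
    """True once the package has been in the catalog for at least `wait`."""
    return now - first_seen[(name, rev)] >= wait
```

Because the key includes the REV, a re-upload of the same package shows
up as a new entry and gets a fresh clock.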
> 
>> What criteria can be used to stop the clock?  Some of the following
>> are obvious and some are possibilities.  I'm sure I'm missing items
>> here too.
>> 
>> 1. A bug against the package in question.
> 
> Meaning, a new bug.  In the general case, if you have two bugs open
> and you fix one, we would like to be able to push the improved
> (although not fixed completely) package to testing.

Really? I would think that this is only allowed if you close both,
probably one with "Resolution: Fix Later", but both closed.

>> 2. A bug against another package in the same 'bundle.'
> 
> Right, the clock would be set for the bundle rather than an individual
> package.  Or it would be set for max(package.insertion_time).

Or, even better, bugs would be reported against the bundle, which would
require an updated bugtracker ;-)

>> 3. A subsequent update of the same package.

Which means that only the latest package version should be taken into
account for propagation.

>> 4. A bug against a dependency if it has a ticking promotion clock too.
> 
> I would achieve the same effect in a different way.  I would not reset
> the clocks for dependent packages; I would handle that by always
> analyzing the package promotion as promotion of the package in
> question, plus all its dependencies.
> 
> When you install a package with pkgutil, it always installs or updates
> all dependencies of that package.  We would have the same requirement
> for promotion: if we want to promote "bar" that depends on "libfoo1",
> we can only promote bar together with libfoo1.
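
The "package plus all its dependencies" unit could be computed roughly
like this. The function name and the dict of dependencies are made up
for the example.

```python
def promotion_set(package, depends):
    """Return the package together with all of its transitive dependencies.

    depends: dict mapping package name -> iterable of direct dependencies.
    """
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # also protects against dependency cycles
        seen.add(current)
        stack.extend(depends.get(current, ()))
    return seen
```

With depends = {"bar": ["libfoo1"]}, promoting "bar" would mean
promoting the set {"bar", "libfoo1"} as one unit.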
> 
>> All of the bug-based items are complicated by the fact that mantis
>> carries no information about which catalog the particular bug is
>> relevant to.  A bug could be filed against the package in the named
>> release, not the version in unstable but we have no way to ascertain
>> that currently.
> 
> We would assume that the bug is relevant for unstable.  In reality the
> bug might be present in testing, but even if so, the fix will first go
> into unstable.

Again, an updated bugtracker would allow opening a bug against a specific
REV of a package and tracking that. Until then, I think Maciej's assumption
is viable.

>> To handle the mantis limitations, we could either add a facility to
>> the promotion tool to ignore various bug IDs (a manual action) or
>> look at alternate bug tracking systems.

I suggest first implementing some very simple propagation where any bug
stops propagation, no special cases, no elaborate coding. When that works
we can start working on a new bugtracker and do the more advanced stuff
with the new capabilities.
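
As a sketch of how small that first version could be: a package may
propagate only if neither it nor anything it depends on has an open bug.
The names and data structures are placeholders, and the open bug lists
are assumed to come from mantis in some way not shown here.

```python
def may_propagate(package, depends, open_bugs):
    """Simple rule: any open bug on the package or a dependency blocks it.

    depends: dict mapping package name -> iterable of direct dependencies.
    open_bugs: dict mapping package name -> list of open bug ids.
    """
    pending = [package]
    seen = set()
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        if open_bugs.get(current):
            return False  # no special cases: one open bug is enough
        pending.extend(depends.get(current, ()))
    return True
```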

>> A subsequent update of the same package could simply overwrite the
>> existing entry in the current promotion tracking object.  That would
>> be the equivalent of saying "we assume that a new update addresses
>> open bugs.  It would require a new bug to stop the clock again."  Does
>> that seem like a valid assumption to build in?
> 
> My instinct says no.  I'd rather say: "we assume that a new update
> addresses the bug if the bug gets closed, and: bug_open_timestamp <
> package_build_timestamp < bug_close_timestamp."
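
That condition translates almost directly into code. A small sketch,
assuming the three timestamps are available as datetime objects and that
a bug which is still open has no close timestamp (None):

```python
from datetime import datetime


def update_addresses_bug(bug_opened, package_built, bug_closed):
    """True if the build falls strictly between bug open and bug close."""
    if bug_closed is None:
        return False  # bug is still open, so the update did not resolve it
    return bug_opened < package_built < bug_closed
```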
> 
>> I think the dependency issue will be the biggest trap here.  I think
>> we need to track bugs on dependencies as a broken dep could mean a
>> broken app.  We may want to only consider deps that also have ticking
>> clocks.  This would allow for the fact that the package under
>> consideration was built against the dep (library, etc) that is already
>> in the named release catalog.  A dep that pre-exists in the promotion
>> object would (assuming we keep the buildfarm very fresh) mean that the
>> new package was built against the dep package that has a ticking clock
>> itself.
>> 
>> We could choose to ignore dependencies, but I'm not sure that's a good
>> idea.
> 
> Definitely not, it would cause a lot of breakage.  We must take
> dependencies into account.

I am also for stopping propagation on any issue. If there is a block,
it needs fixing, and the deps get attention which they would otherwise
not necessarily get.

>> If we decide that blocking a package based on its dependencies is the
>> best action, then we have further complications.  An update of the
>> dependent package must also remove the block that it set on the
>> original.  I guess that's more of a bookkeeping issue in the data
>> structure than a hard challenge but it's still something to consider.
>> 
>> We should also offer a mechanism for a maintainer to signal that a
>> package is not suitable for promotion.  That could come in the form of
>> a bug filed or through a command that is run.  I think filing a bug is
>> more consistent (and visible) in this case.
> 
> I'm not sure I understand the bug filing idea here.  I was imagining
> something like this:
> 
> $ promote-package foo
> (...)
> Can't promote package foo, because:
> - package foo depends on libbar1
>  - package libbar1 can't be promoted, because:
>    - mantis bug 4321 is open against libbar-utils
> - mantis bug 4123 is open against package foo

But the promotion is done in the background, so how will the maintainer
get the output of the command? From my understanding, a new bug on foo
would be opened as a "Propagation Stopped" bug.
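
Either way, the explanation has to be produced as text that a background
job can attach to a bug or a mail. A rough sketch, modelled on the
output quoted above, with bugs tracked per package only; all names are
hypothetical:

```python
def explain_blockers(package, depends, open_bugs, indent=0, seen=None):
    """Return indented lines describing what blocks `package`.

    depends: dict mapping package name -> iterable of direct dependencies.
    open_bugs: dict mapping package name -> list of open mantis bug ids.
    Each package is explained at most once, which also guards against
    dependency cycles. For the top-level call, an empty list means that
    nothing blocks the package.
    """
    seen = set() if seen is None else seen
    if package in seen:
        return []
    seen.add(package)
    pad = " " * indent
    lines = []
    for dep in depends.get(package, ()):
        sub = explain_blockers(dep, depends, open_bugs, indent + 4, seen)
        if sub:
            lines.append(f"{pad}- package {package} depends on {dep}")
            lines.append(f"{pad}  - package {dep} can't be promoted, because:")
            lines.extend(sub)
    for bug_id in open_bugs.get(package, ()):
        lines.append(
            f"{pad}- mantis bug {bug_id} is open against package {package}"
        )
    return lines
```

The returned lines could be joined with newlines and used as the body of
the "Propagation Stopped" bug.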

>> Now, if we end up with several packages in the promotion object that
>> have stopped clocks, we'll want to eventually garbage collect them.
>> We should be able to set quite a long expiry time on this but at some
>> point, things will fall out the bottom.  Do we need a way for the
>> maintainer to signal that the clock should start again?
> 
> That sounds like something that should happen automatically.

And it should be visible to the maintainer. Maybe on a QA page for the
catalog with something like this:
  foo    OLD=2011.04.02,1.23 NEW=2011.11.03,1.25  5 days until propagation
  bar    OLD=2011.04.02,1.23 NEW=2011.11.03,1.25  would propagate, but 2 open bugs

Maybe one page with all migrations per catalog, and one page per maintainer.
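
A sketch of how one such line could be generated. The function and its
parameters are invented for the example, and open bugs are reported
ahead of the remaining waiting time because they need action from the
maintainer.

```python
def status_line(name, old_rev, new_rev, days_left, open_bug_count):
    """Format one row of the (hypothetical) propagation overview page."""
    if open_bug_count:
        noun = "bug" if open_bug_count == 1 else "bugs"
        state = f"would propagate, but {open_bug_count} open {noun}"
    elif days_left > 0:
        state = f"{days_left} days until propagation"
    else:
        state = "ready to propagate"
    return f"  {name:<7}OLD={old_rev} NEW={new_rev}  {state}"
```

The per-maintainer page would then just be the same rows filtered by
maintainer.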

>> I think the
>> simplest way to achieve that is to have the maintainer submit the
>> package again and start a new clock.  (This would require a different
>> REV= stamp to ensure it is detected.)
> 
> In general, as a maintainer I'd like to know: if my package is not
> promoted, is it only because it needs to wait more (no action needed),
> or are there bugs that need to be fixed (action needed).
> 
>> Now, what about different architectures and os releases?  The bug
>> system isn't granular enough to pick out this info so I think we need
>> to stall the package in all catalogs until the normal criteria are
>> met.  This would have the benefit of keeping catalogs in lock step
>> which I think is a good thing.
> 
> I agree.  It's better to keep the catalogs in sync.  There are some
> cases though, where a package exists in the i386 catalog only (acrobat
> reader for instance), so the system would need to cope with these too.

If in doubt, halt propagation. After the first simple migration approach,
let's tackle a new bugtracker and do the finer-grained stuff later.

Best regards

  -- Dago

-- 
"You don't become great by trying to be great, you become great by wanting to do something,
and then doing it so hard that you become great in the process." - xkcd #896