[csw-maintainers] Python 2.7

Peter FELECAN pfelecan at opencsw.org
Mon Jul 29 12:19:27 CEST 2013


"Maciej (Matchek) Bliziński" <maciej at opencsw.org> writes:

> 2013/7/29 Peter FELECAN <pfelecan at opencsw.org>:
>> "Maciej (Matchek) Bliziński" <maciej at opencsw.org> writes:
>>> I did not implement compiling one .py file multiple times for
>>> different Python versions.
>>
>> Do you plan to do it later?
>
> No, unless we figure out and agree on how to do it.
>
>>> The main concern for the current CAS is the suitability of pattern
>>> matching. Is it good enough to just match a part of the path to know
>>> which interpreter to use? It should be good enough for most packages,
>>> but I'm not sure what the corner cases / contrived path names could
>>> be.
>>
>> Even if it misses some, that is not blocking, as the code is there. Anyhow,
>> the difference between .py and .pyc is not so great (not to mention .pyo,
>> which seems to me a scam) in my experience, and if I understood the
>> mechanism of dynamic compilation, the overhead is incurred only
>> once. This is why supplying the compiled components in the package
>> seemed to me a very good solution.
>
> I'm wondering why Debian's policy is so strict about not shipping
> them. The document doesn't explain the reasons, unfortunately, but
> there must be some. If we learn what the reasons are and it turns out
> they are not relevant for us, we might well do as you're suggesting. I
> found a Python packaging FAQ, but the question wasn't on the list.
>
> http://wiki.debian.org/Python/FAQ
>
> Is anyone onboard a Debian developer and can provide any pointers?

I'm not, but I searched Google for "pyc portability" and skimmed the
hits on the first page. I think the reason Debian imposes this is
network shares between systems running different interpreter
versions.

The one thing I didn't do is ask on the appropriate mailing list
which, by the way, I don't know.

Here follow the salient citations and my observations:

** [[http://stackoverflow.com/questions/2263356/are-python-2-5-pyc-files-compatible-with-python-2-6-pyc-files][stackoverflow]]
   "In general, .pyc files are specific to one Python version
   (although portable across different machine architectures, as long
   as they're running the same version); the files carry the
   information about the relevant Python version in their headers --
   so, if you leave the corresponding .py files next to the .pyc ones,
   the .pyc will be rebuilt every time a different Python version is
   used to import those modules."
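
As a hedged illustration of the header check described in that answer,
here is a minimal sketch. It assumes a Python 3 interpreter, where the
magic number is exposed as importlib.util.MAGIC_NUMBER (on 2.7 the
counterpart is imp.get_magic()). The function name and the throwaway
spam.py module are mine, for illustration only.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_matches_running_interpreter(pyc_path):
    """Return True if the .pyc header starts with this interpreter's magic number."""
    with open(pyc_path, "rb") as handle:
        magic = handle.read(4)  # the first four bytes identify the bytecode version
    return magic == importlib.util.MAGIC_NUMBER


# Compile a throwaway module and inspect the header of the resulting file.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "spam.py")
with open(source, "w") as handle:
    handle.write("answer = 42\n")
compiled = py_compile.compile(source, cfile=os.path.join(workdir, "spam.pyc"))
print(pyc_matches_running_interpreter(compiled))
```

A file compiled by a different interpreter version would fail this
comparison, which is the case where the .pyc gets rebuilt or ignored.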
** [[http://docs.python.org/2/tutorial/modules.html#compiled-python-files][tutorial]]
   6.1.3. "Compiled" Python files

   As an important speed-up of the start-up time for short programs
   that use a lot of standard modules, if a file called spam.pyc
   exists in the directory where spam.py is found, this is assumed to
   contain an already-"byte-compiled" version of the module spam. The
   modification time of the version of spam.py used to create spam.pyc
   is recorded in spam.pyc, and the .pyc file is ignored if these
   don't match.

   Normally, you don't need to do anything to create the spam.pyc
   file. Whenever spam.py is successfully compiled, an attempt is made
   to write the compiled version to spam.pyc. It is not an error if
   this attempt fails; if for any reason the file is not written
   completely, the resulting spam.pyc file will be recognized as
   invalid and thus ignored later. The contents of the spam.pyc file
   are platform independent, so a Python module directory can be
   shared by machines of different architectures.

   Some tips for experts:

   1. When the Python interpreter is invoked with the -O flag,
      optimized code is generated and stored in .pyo files. The
      optimizer currently doesn't help much; it only removes assert
      statements. When -O is used, all bytecode is optimized; .pyc
      files are ignored and .py files are compiled to optimized
      bytecode.

   2. Passing two -O flags to the Python interpreter (-OO) will cause
      the bytecode compiler to perform optimizations that could in
      some rare cases result in malfunctioning programs. Currently
      only __doc__ strings are removed from the bytecode, resulting in
      more compact .pyo files. Since some programs may rely on having
      these available, you should only use this option if you know
      what you're doing.

   3. A program doesn't run any faster when it is read from a .pyc or
      .pyo file than when it is read from a .py file; the only thing
      that's faster about .pyc or .pyo files is the speed with which
      they are loaded.

   4.  When a script is run by giving its name on the command line,
       the bytecode for the script is never written to a .pyc or .pyo
       file. Thus, the startup time of a script may be reduced by
       moving most of its code to a module and having a small
       bootstrap script that imports that module. It is also possible
       to name a .pyc or .pyo file directly on the command line.

   5.  It is possible to have a file called spam.pyc (or spam.pyo when
       -O is used) without a file spam.py for the same module. This
       can be used to distribute a library of Python code in a form
       that is moderately hard to reverse engineer.

   6.  The module compileall can create .pyc files (or .pyo files when
       -O is used) for all modules in a directory.

   Item 5 refers to distributing code; item 6 names the module
   (compileall) that produces the compiled files.
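
For the compileall module mentioned in the tips, a minimal sketch of
byte-compiling a whole directory follows. It assumes a Python 3
interpreter, where legacy=True requests the Python 2 layout (spam.pyc
next to spam.py); on 2.7 that layout is the default and the argument
does not exist. The module names are placeholders.

```python
import compileall
import os
import tempfile

# Create a small directory of modules to compile (placeholder names).
workdir = tempfile.mkdtemp()
for name in ("spam", "eggs"):
    with open(os.path.join(workdir, name + ".py"), "w") as handle:
        handle.write("value = %r\n" % name)

# legacy=True writes spam.pyc beside spam.py instead of into __pycache__.
ok = compileall.compile_dir(workdir, quiet=1, legacy=True)

compiled = sorted(f for f in os.listdir(workdir) if f.endswith(".pyc"))
print(ok, compiled)
```

The same thing is available from the shell as "python -m compileall DIR",
which is probably the more convenient form for a packaging script.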
       
** [[http://www.network-theory.co.uk/docs/pytut/CompiledPythonfiles.html][An Introduction to Python]]
   Guido van Rossum himself says:

   "[...]The contents of the `spam.pyc' file are platform independent,
   so a Python module directory can be shared by machines of different
   architectures."
** [[http://www.velocityreviews.com/forums/t643545-are-pyc-files-portable.html][forum discussion]]
   A discussion of using .pyc files across different versions of
   Solaris.

   The conclusion is that the architecture doesn't matter; the
   interpreter version does.
** [[http://www.debian.org/doc/packaging-manuals/python-policy/ch-module_packages.html#s-byte_compilation][Debian Python Policy]]
   The discussion has more to do with interpreter version changes
   than with architecture.
** [[http://www.ehow.com/info_12214796_pyc-files.html][random]]
   "Uses for ".PYC" Files

    Modules that are imported into user scripts get compiled by the
    interpreter before execution. Because these modules tend to
    undergo repeated use, the interpreter compiles the module and
    stores the ".pyc" file in a directory. This way, when a script
    imports that module, the bytecode version already exists, ready
    for use. Furthermore, bytecode ".pyc" files are portable across
    multiple platforms, making pre-compiling Python scripts useful for
    distributing Python programs across different operating systems."

From my standpoint, this is the exceptional case where Debian is not to
be followed. Note that they have a quite similar policy for pre-compiled
Emacs Lisp files, which is a matter of the same kind.

What we should do is supply the compiled files in the package, for
each version for which the maintainer thinks it's
appropriate. Simpler and leaner, isn't it?
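
To make the per-version idea a bit more concrete, here is a rough
sketch of picking the interpreter from the installation path, i.e. the
kind of pattern matching discussed earlier in this thread. Everything
in it is an assumption for illustration: the pythonX.Y directory
convention, the /opt/csw/bin location of the interpreters, and the
function names. It is not the code of the existing CAS.

```python
import re

# Assumption: modules live under a directory named pythonX.Y, for example
# /opt/csw/lib/python2.7/site-packages. The trailing slash avoids matching
# names such as python2.6-docs.
VERSION_IN_PATH = re.compile(r"/python(\d+\.\d+)/")


def interpreter_for(path, bindir="/opt/csw/bin"):
    """Return the interpreter implied by a module path, or None if unknown."""
    match = VERSION_IN_PATH.search(path)
    if match is None:
        return None
    return "%s/python%s" % (bindir, match.group(1))


def compile_command(path):
    """Build the byte-compilation command for one .py file, or None."""
    interpreter = interpreter_for(path)
    if interpreter is None:
        return None  # leave the file alone; the .py source still works
    return [interpreter, "-m", "py_compile", path]


print(compile_command("/opt/csw/lib/python2.7/site-packages/foo.py"))
```

A file whose path carries no version simply stays uncompiled, which
is harmless since the source is shipped anyway. Contrived path names
could still fool such a pattern.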
-- 
Peter

