Hi Maciej,<br> <br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="gmail_quote"><div><br></div><div><br></div></div></blockquote></div><div>Yes, this is a good approach. If you store the data in a way that well resembles the original structure, it'll be easier to make write future checks.</div>
</div></blockquote><div><br></div><div>I made the modifications.</div><div>The elfdump output is not so easy to parse correctly, I wonder if we should rather directly some python library that can directly read the elf headers, but anyway the full patch contains all new checks and stats is attached, and I put the new part at the end of this mail for comment.</div>
<div><br></div><div>I added a new check to make sure that binaries don't use interface version available only on recent Solaris releases.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="gmail_quote">
<div><br></div><div>What do you think about all this ?</div></div></blockquote><div><br></div></div><div>We need to plan the database update. The rough outline of steps to take, from memory:</div></div></blockquote><div>
<br></div><div><div><br></div><div>I don't know how to do at least some of these steps.</div><div>Maybe we can do this live on irc and document on the wiki at the same time ?</div></div><div><br></div><div>Yann</div><div>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div><br></div><div>- increment the data structure version in the code so that checkpkg knows we're messing with it</div>
<div>- create a new empty database, named checkpkg2 (will be renamed later) </div></div></blockquote>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div>- alter ~/.checkpkg/... config in your home directory to make checkpkg use the new database (needs to use the relmgr credentials, I'll give them to you)</div>
<div>- stop the catalog generation to avoid race conditions</div><div>- using the new code import the complete catalog into the database</div><div>- submit the code into the repository</div><div>- send a notification to maintainers@ that people should update the code and not use the buildfarm for the next 2h or so (to have some margin)</div>
<div>- move the old checkpkg database out of the way, rename the 'checkpkg2' database to 'checkpkg' (I'm quoting because I mean the literals)</div><div>- now people with the old code can't do anything, will be getting error messages about data mismatches (they shouldn't index new packages and storing the old data structure, there's a safety check to prevent that)</div>
<div>- people with the new code should be able to work normally</div><div>- revert ~/.checkpkg (your homedir only) to the old state (or remove it altogether)</div><span><font color="#888888"><div><br></div>
<div>Maciej</div></font></span></div>
<br>_______________________________________________<br>
maintainers mailing list<br>
<a href="mailto:maintainers@lists.opencsw.org" target="_blank">maintainers@lists.opencsw.org</a><br>
<a href="https://lists.opencsw.org/mailman/listinfo/maintainers" target="_blank">https://lists.opencsw.org/mailman/listinfo/maintainers</a><br>
.:: This mailing list's archive is public. ::.<br></blockquote></div><br><div><br></div><div><div>--- dependency_checks.py<span style="white-space:pre-wrap"> </span>(revision 19130)</div><div>+++ dependency_checks.py<span style="white-space:pre-wrap"> </span>(working copy)</div>
<div>@@ -39,6 +39,24 @@</div><div> </div><div> PREFERRED_DIRECTORY_PROVIDERS = set([u"CSWcommon"])</div><div> </div><div>[...]</div><div>+</div><div>+ALLOWED_VERSION_DEPENDENCIES = { "libc.so.1": [ 'SYSVABI_1.3', 'SUNWprivate_1.1', 'SUNW_1.22.6', 'SUNW_1.22.5', </div>
<div>+ 'SUNW_1.22.4', 'SUNW_1.22.3', 'SUNW_1.22.2', 'SUNW_1.22.1', </div><div>+ 'SUNW_1.22', 'SUNW_1.21.3', 'SUNW_1.21.2', 'SUNW_1.21.1', </div>
<div>+ 'SUNW_1.21', 'SUNW_1.20.4', 'SUNW_1.20.1', 'SUNW_1.20', </div><div>+ 'SUNW_1.19', 'SUNW_1.18.1', 'SUNW_1.18', 'SUNW_1.17', </div>
<div>+ 'SUNW_1.16', 'SUNW_1.15', 'SUNW_1.14', 'SUNW_1.13', </div><div>+ 'SUNW_1.12', 'SUNW_1.11', 'SUNW_1.10', 'SUNW_1.9', </div>
<div>+ 'SUNW_1.8', 'SUNW_1.7', 'SUNW_1.6', 'SUNW_1.5', </div><div>+ 'SUNW_1.4', 'SUNW_1.3', 'SUNW_1.2', 'SUNW_1.1', </div>
<div>+ 'SUNW_0.9', 'SUNW_0.8', 'SUNW_0.7', 'SISCD_2.3' ] }</div><div>+</div><div>+</div></div><div><br></div><div><div>@@ -147,6 +165,57 @@</div>
<div> error_mgr,</div><div> pkgname, messenger)</div><div> orphan_sonames.extend(orphan_sonames_tmp)</div><div>+</div><div>+</div><div>+ ldd_info = pkg_data['ldd_info'][binary_info["path"]]</div>
<div>+ for ldd_response in ldd_info:</div><div>+ if ldd_response['state'] == 'soname-unused' and ldd_response['soname'] not in BASE_SOLARIS_LIBRARIES:</div><div>+ messenger.Message(</div>
<div>+ "Binary %s links to library %s but doesn't seem to use any of its symbols. "</div><div>+ "It usually happens because superfluous libraries were added to the linker options, "</div>
<div>+<span style="white-space:pre-wrap"> </span> "either because of the configure script itself or because of the \"pkg-config --libs\""</div><div>+<span style="white-space:pre-wrap"> </span> " output of one the dependency."</div>
<div>+ % ("/" + binary_info["path"], ldd_response['soname']))</div><div>+ error_mgr.ReportError(</div><div>+ pkgname, "soname-unused",</div><div>+ "%s is needed by %s but never used" % (ldd_response['soname'], "/" + binary_info["path"]))</div>
<div>+</div><div>+ # Even when direct binding is enabled, some symbols might not be</div><div>+ # directly bound because the library explicitely requested the symbol</div><div>+ # not to be drectly bound to.</div>
<div>+ # For example, libc.so.1 does it for symbol sigaction, free, malloc and realloc</div><div>+ # So we consider that direct binding is enabled if at least one symbol is directly</div><div>+ # bound to because that definitely means that -B direct or -z direct was used.</div>
<div>+ directly_binded = set()</div><div>+ for syminfo in pkg_data["binaries_elf_info"][binary_info["path"]]['symbol table']:</div><div>+ if syminfo['external'] and syminfo['flags'] and 'D' in syminfo['flags'] and 'B' in syminfo['flags']:</div>
<div>+ directly_binded.add(syminfo['soname'])</div><div>+ not_directly_binded = directly_binded.symmetric_difference(binary_info["needed sonames"])</div><div>+</div><div>+ if not_directly_binded:</div>
<div>+ messenger.Message(</div><div>+<span style="white-space:pre-wrap"> </span>"No symbol of binary %s is directly binded against the following libraries: %s. "</div><div>+ "Please make sure the binaries are compiled using the \"-Bdirect\" linker option."</div>
<div>+ % ("/" + binary_info["path"], ", ".join(not_directly_binded)))</div><div>+ for soname in not_directly_binded:</div><div>+ error_mgr.ReportError(</div><div>+ pkgname, "no-direct-binding",</div>
<div>+ "%s is not directly binded to soname %s" % ("/" + binary_info["path"], soname))</div><div>+</div><div>+</div><div>+ for version_dep in pkg_data["binaries_elf_info"][binary_info["path"]]['version needed']:</div>
<div>+ if (version_dep['soname'] in ALLOWED_VERSION_DEPENDENCIES and </div><div>+ not version_dep['version'] in ALLOWED_VERSION_DEPENDENCIES[version_dep['soname']]):</div><div>+ messenger.Message(</div>
<div>+ "Binary %s requires interface version %s in library %s which is only available "</div><div>+<span style="white-space:pre-wrap"> </span> "in recent Solaris releases."</div>
<div>+ % ("/" + binary_info["path"], version_dep['version'], version_dep['soname']))</div><div>+ error_mgr.ReportError(</div><div>+ pkgname, "forbidden-version-interface-dependencies",</div>
<div>+ "%s requires forbidden interface version %s in library %s" </div><div>+<span style="white-space:pre-wrap"> </span> % ("/" + binary_info["path"], version_dep['version'], version_dep['soname']))</div>
<div>+ </div><div>+ </div><div> orphan_sonames = set(orphan_sonames)</div><div> for soname, binary_path in orphan_sonames:</div><div> if soname not in ALLOWED_ORPHAN_SONAMES:</div></div><div><br></div>
<div>
<br></div><div><div>--- inspective_package.py<span style="white-space:pre-wrap"> </span>(revision 19130)</div><div>+++ inspective_package.py<span style="white-space:pre-wrap"> </span>(working copy)</div>
<div>@@ -229,17 +229,132 @@</div><div> </div><div> return defined_symbols</div><div> </div><div>+</div><div>+ def GetBinaryElfInfo(self):</div><div>+ """Returns various informations symbol and version present in elf header</div>
<div>+</div><div>+ To do this we parse output lines from elfdump -syv, it's the </div><div>+ only command that will give us all informations we need on symbols and versions.</div><div>+ We will analyse 3 sections:</div>
<div>+ - version section: contains soname needed, version interface required for each soname, </div><div>+ and version definition</div><div>+ - symbol table section: contains list of symbol and soname/version interface providing it</div>
<div>+ (the latter is an index in the version section)</div><div>+ - syminfo section: contains special linking flags for each symbol</div><div>+ """</div><div>+ binaries = self.ListBinaries()</div>
<div>+ binaries_elf_info = {}</div><div>+</div><div>+ for binary in binaries:</div><div>+ binary_abspath = os.path.join(self.directory, "root", binary)</div><div>+ # elfdump is the only tool that give us all informations</div>
<div>+ args = ["/usr/ccs/bin/elfdump", "-svy", binary_abspath]</div><div>+ elfdump_proc = subprocess.Popen(</div><div>+ args,</div><div>+ stdout=subprocess.PIPE,</div><div>
+ stderr=subprocess.PIPE)</div>
<div>+ stdout, stderr = elfdump_proc.communicate()</div><div>+ retcode = elfdump_proc.wait()</div><div>+ if retcode or stderr:</div><div>+ logging.error("%s returned one or more errors: %s", args, stderr.splitlines()[0])</div>
<div>+ continue</div><div>+ elfdump_out = stdout.splitlines()</div><div>+</div><div>+ symbols = {}</div><div>+ binary_info = { 'version definition': [], </div><div>+<span style="white-space:pre-wrap"> </span> 'version needed': [],</div>
<div>+<span style="white-space:pre-wrap"> </span> 'symbol table': [] }</div><div>+ # we will merge syminfo and symbol table information in one list</div><div>+ # so the syminfo list is the same as the symbol table one</div>
<div>+ binary_info['syminfo'] = binary_info['symbol table']</div><div>+</div><div>+ # The list of fields we want to retrieve in the elfdump output by section</div><div>+ # if the field is a tuple, it means we will map the original field name</div>
<div>+ # to another name in the final data structure</div><div>+ elf_fields = { 'version defined': [ 'version', 'dependency' ],</div><div>+ 'version needed': [ ('file', 'soname'), 'version' ],</div>
<div>+<span style="white-space:pre-wrap"> </span> 'symbol table': [ ('name', 'symbol'), ('ver', 'version'), </div><div>+<span style="white-space:pre-wrap"> </span> 'bind', ('shndx', 'external') ],</div>
<div>+<span style="white-space:pre-wrap"> </span> 'syminfo': [ ('library', 'soname'), 'symbol', 'flags' ] }</div><div>+</div><div>+ cur_section = None</div>
<div>+ for line in elfdump_out:</div><div>+</div><div>+ elfdump_data, cur_section = self._ParseElfdumpLine(line, cur_section)</div><div>+</div><div>+<span style="white-space:pre-wrap"> </span># header or blank line contains no information</div>
<div>+<span style="white-space:pre-wrap"> </span>if not elfdump_data:</div><div>+ continue</div><div>+</div><div>+ elf_info = {}</div><div>+<span style="white-space:pre-wrap"> </span>for field in elf_fields[cur_section]:</div>
<div>+ if type(field) == tuple:</div><div>+ elf_info[field[1]] = elfdump_data[field[0]]</div><div>+ else:</div><div>+ elf_info[field] = elfdump_data[field]</div><div>+ </div>
<div>+ # we merge symbol table and syminfo informations so we have to check</div><div>+<span style="white-space:pre-wrap"> </span># if the symbol has not already been added</div><div>+<span style="white-space:pre-wrap"> </span>if cur_section in ('symbol table', 'syminfo'):</div>
<div>+<span style="white-space:pre-wrap"> </span> if elf_info['symbol'] in symbols:</div><div>+<span style="white-space:pre-wrap"> </span> symbols[elf_info['symbol']].update(elf_info)</div>
<div>+<span style="white-space:pre-wrap"> </span> continue </div><div>+ else:</div><div>+<span style="white-space:pre-wrap"> </span> symbols[elf_info['symbol']] = elf_info</div>
<div>+ </div><div>+ binary_info[cur_section].append(elf_info)</div><div>+</div><div>+ # elfdump doesn't repeat the name of the soname in the version section</div><div>+ # if it's the same on two contiguous line, so we have to make sure</div>
<div>+ # the information is present in each entry</div><div>+ for i, version in enumerate(binary_info['version needed'][1:]):</div><div>+ if not version['soname']:</div><div>+ version['soname'] = binary_info['version needed'][i]['soname']</div>
<div>+</div><div>+ # if it exists, the first "version definition" entry is the base soname </div><div>+ # we don't need this information</div><div>+ if binary_info['version definition']:</div>
<div>+<span style="white-space:pre-wrap"> </span>binary_info['version definition'].pop(0)</div><div>+</div><div>+ # To not rely of the section order output of elfdump, we resolve symbol version</div>
<div>+ # informations here after having parsed all elfdump output</div><div>+ nb_versions_definition = len(binary_info['version definition'])</div><div>+ for sym_info in binary_info['symbol table']:</div>
<div>+ version_index = int(sym_info['version']) - 2</div><div>+<span style="white-space:pre-wrap"> </span>if version_index > 1:</div><div>+<span style="white-space:pre-wrap"> </span> if version_index < nb_versions_definition:</div>
<div>+<span style="white-space:pre-wrap"> </span> version = binary_info['version definition'][version_index]</div><div>+ else:</div><div>+<span style="white-space:pre-wrap"> </span> version = binary_info['version needed'][version_index - nb_versions_definition]</div>
<div>+ sym_info['version'] = version['version']</div><div>+<span style="white-space:pre-wrap"> </span> sym_info['soname'] = version['soname']</div><div>+ else:</div>
<div>+ sym_info['version'] = None</div><div>+</div><div>+ if sym_info['external'] == 'UNDEF':</div><div>+ sym_info['external'] = True</div><div>+ else:</div>
<div>+ sym_info['external'] = False</div><div>+</div><div>+ # we make sure the field are present even if the syminfo section is not</div><div>+ sym_info.setdefault('soname')</div><div>
+ sym_info.setdefault('flags')</div><div>+</div><div>+ binaries_elf_info[binary] = binary_info</div><div>+</div><div>+ return binaries_elf_info</div><div>+</div><div>+</div></div><div><br></div><div>
<div>@@ -263,7 +390,38 @@</div><div> sym = { 'address': fields[0], 'type': fields[1], 'name': fields[2] }</div><div> return sym</div><div> </div><div>- def _ParseLddDashRline(self, line):</div>
<div>+</div><div>+ def _ParseElfdumpLine(self, line, section = None):</div><div>+</div><div>+ headers_re = (r'(?P<section>Version Needed|Version Definition|Symbol Table|Syminfo) Section:\s+(?:\.SUNW_version|\.dynsym|\.SUNW_syminfo)\s*$|'</div>
<div>+ '\s*(?:index\s+)?version\s+dependency\s*$|'</div><div>+ '\s*(?:index\s+)?file\s+version\s*$|'</div><div>+ '\s*index\s*value\s+size\s+type\s+bind\s+oth\s+ver\s+shndx\s+name\s*$|'</div>
<div>+ '\s*index\s+flags\s+bound to\s+symbol\s*$|'</div><div>+<span style="white-space:pre-wrap"> </span> '\s*$')</div><div>+</div><div>+ re_by_section = { 'version definition': (r'\s*(?:\[(?P<index>\d+)\]\s+)?(?P<version>.*\S)\s+(?P<dependency>\S+)?\s*$'),</div>
<div>+<span style="white-space:pre-wrap"> </span> 'version needed': (r'\s*(?:\[(?P<index>\d+)\]\s+)?(?:(?P<file>\S+)\s+(?!\[ (?:INFO|WEAK) \]))?(?P<version>\S+)(?:\s+\[ (?:INFO|WEAK) \])?\s*$'),</div>
<div>+ 'symbol table': (r'\s*\[\d+\]\s+0x[0-9a-f]+\s+0x[0-9a-f]+\s+\S+\s+(?P<bind>\S+)\s+\S+\s+(?P<ver>\S+)\s+(?P<shndx>\S+)\s+(?P<name>\S+)?\s*$'),</div><div>
+<span style="white-space:pre-wrap"> </span> 'syminfo': (r'\s*\[\d+\]\s+(?P<flags>[ABCDFILNPS]+)\s+(?:(?:\[\d+\]\s+(?P<library>.*\S)|<self>)\s+)?(?P<symbol>.*\S)\s*') }</div>
<div>+</div><div>+ if section:</div><div>+ m = re.match(re_by_section[section], line)</div><div>+ if m:</div><div>+ elfdump_data = m.groupdict()</div><div>+ return elfdump_data, section</div><div>
+</div><div>+ m = re.match(headers_re, line)</div><div>+ if not m:</div><div>+ raise package.StdoutSyntaxError("Could not parse %s" % (repr(line)))</div><div>+</div><div>+ if m.lastindex:</div><div>
+ section = m.group('section').lower()</div><div>+</div><div>+ return None, section</div><div>+</div></div>