trying to improve mandoc Solaris support

Joerg Schilling via buildfarm buildfarm at lists.opencsw.org
Thu Mar 19 14:14:27 CET 2015


Ingo Schwarze <schwarze at usta.de> wrote:

> It is true that tbl(7) support is still not as good as mdoc(7)
> support.  That is mostly due to the fact that BSD manuals rarely
> use tables, but it can certainly be improved.  In fact, i implemented
> some improvements recently, for example horizontal centering of
> tables as a whole.
>
> > or disappear from the output.
>
> That is one of the things i recently improved, so it should be better
> in mandoc-1.13.3 than in 1.13.2.  I don't claim that all of such
> issues are gone, though.

I tested two months ago using 1.13.2.


> > In general, the mandoc _format_ (introduced with BSD-4.4-lite)
>
> I assume you are talking about the mdoc(7) format designed and
> implemented by Cynthia Livingston - as opposed to the traditional
> Version 7 AT&T UNIX man(7) format.

I did not try to find out who was the author. It was hard enough to find a 
useful version from dozens of variants that appear on the BSD snapshot from 
Kirk McKusick that comes with SCCS and a most recent edit timestamp from 
June 23 1995.

> > https://sourceforge.net/p/schillix-on/schillix-on/ci/default/tree/usr/src/cmd/man/
> > 
> > What I did was to take the mandoc troff macros from BSD-4.4-lite (those
> > that still work with original troff) and added a few minor fixes (e.g.
> > a Y2000 fix).
>
> I wasn't aware of that and have not checked your work in detail,
> but i have a rough idea of the quality of Cynthia's original macro
> implementation because it is still in use in Heirloom roff; Carsten
> Kunze is working on improvements right now.

If there are improvements, please keep me informed.

> These macros are usable for historical manuals, but less so for modern
> manuals.  For example, they still have the nine argument limit, and
> some macros are not parsed and/or not callable that are generally
> assumed to be in modern manuals.  Basically, this old version of

Are there *BSD man pages that go beyond the official 6 argument limit for troff 
macros? From my interpretation, using such features is not a good idea if you 
intend to create portable man pages. For this reason I decided not to enhance 
the Solaris man macros to support more than 6 arguments, just to be sure to 
detect problems early if I write or enhance man pages.

> the macros lacks all the improvements that Werner Lemberg (GNU) and
> Ruslan Ermilov (FreeBSD) did for groff, in particular the rewrite
> for groff-1.17, which many modern manuals now depend on.  Note that
> Werner was kind enough to not change the license from BSD to GPL
> on the occasion of his rewrite, but the macros need groff features
> that are not available in older roff implementations.

I cannot speak for *BSD man pages as a whole, but the man pages I checked did 
not show problems with the macros I provide with SchilliX-ON.

> > What I don't understand is why the mandoc program was written at all.
>
> There are a number of reasons:
>
>  1. Kristaps wasn't satisfied with groff's HTML output quality,
>     and due to the basic architecture of groff, that's almost
>     impossible to improve.

Is there something that is significantly better than man2html? This is what I 
use when I format man pages for the Web.

>  2. OpenBSD wanted to remove groff from the base system, both
>     for licensing reasons, to get rid of a major chunk of code
>     written in C++ (which is generally considered an inappropriate
>     language for the OpenBSD base system), and because groff is
>     relatively hard to maintain properly.

Well, I agree that there are three basic problems with groff:

-	It is huge with respect to AT&T's troff

-	It is written in C++

-	It uses a license that contains ambiguous text and thus is 
	misinterpreted by too many people in an extremely restrictive way.
	Note that the GPL was listed as a non-free license by opensource.org
	for some years. This changed after the FSF confirmed that the GPL
	has to be interpreted in a way that makes it compatible with the
	OpenSource.org OSS guidelines.

	Given the fact that you cannot enforce all of the GPL in court and
	that what you can enforce is not more than what's in the CDDL, the GPL
	is a license with unneeded restrictions that just afflict those people
	who intend to follow the written rules.

>  3. While groff is a rather fast typesetting system, it is a rather
>     slow documentation formatter (though still *much*, much better
>     than DocBook-based toolchains which are positively ridiculously
>     slow in addition to producing man(7) output of abysmal quality;
>     never use DocBook for anything!).  Typically, mandoc is by a
>     factor of 3 to 20 faster than groff, depending on the source
>     language and manual size.  That allowed switching from
>     installing preformatted manuals to installing manual sources
>     and formatting on the fly, even on older, slower architectures.
>     That also sped up the system build, which is considered an
>     important point in OpenBSD's developer-centric culture.

The performance of troff (the AT&T original code) is comparable to what you get 
from mandoc. My tests showed an approx. 30% performance benefit with mandoc, 
but this seems to be negligible if you compare it to gtroff.

>  4. mandoc has support for semantic searching in apropos(1),
>     see my conference presentations available from mdocml.bsd.lv
>     for details.

I'll need to remember this next week, when I am back from CLT. Tomorrow, I'll 
travel to Chemnitz, and there is not much time before that.

> Anyway, these arguments convinced FreeBSD, OpenBSD, NetBSD, Minix 3
> and illumos to use mandoc as the system default formatter.

Well, illumos originally had a major bug in their i18n configuration tables for 
character classification. For this reason, "col -x" does not work correctly on
illumos. The illumos people had hoped to use mandoc to cover up their i18n 
bug, but last week, they reported similar problems with mandoc ;-)


> It is still much smaller than groff, even though it does support
> a subset of roff(7) that is now and then used in manuals (including
> full support for user-defined macros, conditional and numerical
> expressions).

UNIX-V7 man pages did not use more from tbl than what seems to be implemented 
in mandoc. This changed in the 1980s for some UNIX distros, and UNIX man pages 
now usually expect to be able to define column widths.
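
As a minimal sketch of the kind of tbl input meant here, assuming the standard
tbl "w(...)" column modifier and a made-up two-column option table (the widths,
the options and the descriptions are purely illustrative):

```roff
.\" Hypothetical option table; the column widths are illustrative only
.TS
tab(;);
lw(1i) lw(4i).
\-v;Print verbose output.
\-o file;T{
Write the result to file instead of standard output.
T}
.TE
```

The second column's width also determines the line length used for the T{ ... T}
text block, which is why pages that rely on it look wrong when a formatter
ignores the width.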

> > and the current state seems to be that only 80-90% of the Solaris
> > man pages work with the mandoc program. 
>
> That sounds like Solaris manuals tend to use some features that are
> less commonly used elsewhere - now that i have an account on your
> build cluster, i can have a look what exactly that is.
>
> To put this into perspective, all OpenBSD manuals work with mandoc,
> and i'm not aware of any current complaints with respect to FreeBSD,
> NetBSD, or illumos base system manuals.  The OpenBSD ports tree
> contains about 9000 third-party packages; more than 3000 of those
> contain manuals; less than 300 still use groff for formatting, and
> the vast majority of those 300 are DocBook generated manuals.  There
> are plans to improve mandoc to deal with even DocBook insanity, but
> that will still need a bit of work.

Most of the man pages for my OSS packages use tbl features that do not work 
with mandoc.

> > The former problem that groff is under GPL is no longer a problem
> > since the original AT&T troff has been made OpenSource under a free
> > license together with OpenSolaris in June 2005.
>
> CDDL-licensed software is considered free software by the OSI,
> but the OpenBSD project regards the CDDL as a non-free license,
> and CDDL-licensed code cannot be used in the OpenBSD base system,
> but only in the OpenBSD ports tree.  The CDDL has at least two
> defects that disqualify it for free software:
>
>  1. It has strings attached with respect to patent law (6.2).

The lack of patent defense provisions is usually seen as a problem with the BSD
license. Note that this usually _is_ a problem if a company that owns patents
publishes code under the BSD license. This is why most people (including me and 
opensource.org) recommend Apache-2 to people who like to publish 
software under an "academic license" (see the book by Lawrence Rosen for the 
definition of OSS license categories).

>  2. It is viral (3.2).

Lawrence Rosen calls it "reciprocal". I convinced the FreeBSD people in 
February 2006 at the "Chemnitzer Linux Tage" that the CDDL is not a problem for 
FreeBSD, and they started to port ZFS after that.

> Even if Heirloom roff were free software, i doubt the code
> quality would be sufficient for inclusion into OpenBSD.
> It probably would have been good enough in 1994, but i doubt
> it is still good enough by today's standards.  Cynthia's original
> mdoc(7) macros - even though i highly admire her work, it was
> excellent for the time, and the language itself is still the best
> documentation language available IMHO - certainly aren't good
> enough by today's standards.

My impression of the mdoc() macro set is that it mainly tries to provide new 
macros for use cases that work with classical man(5) macros if you know that 
there is \c to interrupt line processing with [nt]roff. The mdoc() macros look 
needlessly complex to me. This is why I still prefer man(5).
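
A minimal sketch of the \c idiom referred to above, assuming the usual man
macros .B, .I and .BI; the option "-o" and its argument "file" are made up for
illustration, and details may differ between roff implementations:

```roff
.\" Bold "-o" directly followed by italic "file", with no space in between:
.\" \c keeps the output line open so the next macro continues it.
.B \-o\c
.I file
.\" For comparison, the alternating-font macro gives the same result here.
.BI \-o file
```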

I did not yet check the quality of Heirloom troff. I know however that there is 
not much maintenance for all the projects in Heirloom that I checked before.

Example 1:

	The Bourne Shell. The Heirloom Bourne Shell still uses the old sbrk(2)
	based allocator and thus is not really portable. Few bugs have been 
	fixed since the code was taken from OpenSolaris.

	My Bourne Shell fixed all "documented bugs" and converted the code to
	be based on malloc()/free(). I added my command line history editor 
	from 1982..1984 and a lot of other new features like an advanced alias 
	system. My Bourne Shell even works on Cygwin and is much faster than 
	bash on Cygwin.

Example 2:
	SCCS.... The Heirloom variant changed 1-2% of the code and introduced
	limited portability. The first published version did not even work
	on Linux, as multiple calls to fclose() caused the software to dump core
	on Linux.

	My SCCS implementation fixed a lot of problems in the code and enhanced
	performance by a factor of 3. Many new features have been introduced 
	and the code works nearly everywhere. Approx. 60-70% of the current SCCS
	code base is code written by me. There soon will be a "project mode" 
	that supports groups of files, and network support so that it can be
	used as a distributed SCM.


> > Can you enlighten me please?
>
> Sure, i hope the above helps.  Note that we may well disagree
> on parts of it, but you asked for the motivation for mandoc,
> and these are the main points motivating it.
>
> I hope you don't consider this information too off-topic here,
> but i didn't want to let the questions go unanswered.

Thank you for the explanation.

When I have time, I'll give the new version a try. I will however stay with the 
classical man(1) + troff on SchilliX. BTW: One reason is line formatting. I 
prefer full justification over left-aligned text.


Jörg

-- 
 EMail:joerg at schily.net                    (home) Jörg Schilling D-13353 Berlin
       joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.org/private/ http://sourceforge.net/projects/schilytools/files/'

