#998165 debian-policy: document and allow Description in the source paragraph

#998165#3
Date:
2021-10-31 10:18:35 UTC
From:
To:
Hi!

dpkg 1.19.0 introduced, following the request in #555743, a bunch of new
substvars.  Notably, it now handles ${source:Synopsis} and
${source:Extended-Description}, which are described as follows:

       source:Synopsis
           The source package synopsis, extracted from the source stanza
           Description field, if it exists

       source:Extended-Description
           The source package extended description, extracted from the
           source stanza Description field, if it exists
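
To make the split concrete, here is a minimal Python sketch of how a
Description value could be divided into the two parts those substvars carry.
It is only an illustration: the function name split_description is
hypothetical, and stripping the leading space of continuation lines is an
assumption about the parsed form, not a statement of what dpkg does internally.

```python
from typing import Tuple


def split_description(value: str) -> Tuple[str, str]:
    """Split a Description field value into (synopsis, extended description).

    Hypothetical helper for illustration only; not dpkg's implementation.
    """
    # The synopsis is the first line; everything after it is the long part.
    first, _, rest = value.partition("\n")
    # Assumption: continuation lines carry one leading space, which we drop.
    extended = "\n".join(
        line[1:] if line.startswith(" ") else line for line in rest.splitlines()
    )
    return first.strip(), extended


desc = (
    "GNOME XML library\n"
    " XML is a metalanguage to let you design your own markup language.\n"
    " .\n"
    " It can do this because it's written in SGML."
)
synopsis, extended = split_description(desc)
print(synopsis)
print(extended)
```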


Currently Policy §5.2 lists the allowed known fields, and Description is
accepted only in the "binary package paragraphs", not in the one for the
source package.


As documented in the bug report mentioned above, these are the main
benefits of having a Description in the source paragraph:
 * helps de-duplicate the description in the binary paragraphs (mostly
   relevant for libraries and other packages that build many binaries
   and share a common description).  Note that this would only
   de-duplicate d/control, the binary DEBIAN/control of each binary
   package would still keep the generated long description.
 * ship a generic source-level description of the package, which just
   makes sense if one thinks about it
 * as a consequence of the above, a bunch of tools (DDPO, PTS, etc)
   would be able to drop the weird and unnatural logic that they use to
   pick a description for the source package
The main "bad" consequence would be that Description would be exported
in the .dsc and as such end up in the Sources index.  This is probably
what we want anyway, but with all the people complaining about how big
the index is getting it's something to consider.  However it's also true
that realistically very few packages are going to make use of this
facility in the near future so it shouldn't really matter IMHO.



If I get no pushback I'll also propose some text later on when I'm
freer (unless somebody beats me to it!).

#998165#8
Date:
2021-11-02 00:32:19 UTC
From:
To:
Hello,

Hrm.  Could dak be modified to filter this out of the Sources file?

#998165#11
Date:
2021-11-03 11:39:40 UTC
From:
To:

I'm sure it could, but I don't think we would want that?  As I wrote,
it's likely what we would like to have regardless, so that, say
`apt-cache showsrc` can bring some more information.

#998165#16
Date:
2021-11-03 22:40:02 UTC
From:
To:
Could you clarify what source packages that produce several binary
packages should do?  Maybe give an example?

Cheers,

#998165#19
Date:
2021-11-04 11:09:30 UTC
From:
To:
Sure.

Here is how I would use it, for example.

--- a/debian/control
+++ b/debian/control
@@ -25,6 +25,13 @@ Rules-Requires-Root: no
 Homepage: http://xmlsoft.org
 Vcs-Git: https://salsa.debian.org/xml-sgml-team/libxml2.git
 Vcs-Browser: https://salsa.debian.org/xml-sgml-team/libxml2
+Description: GNOME XML library
+ XML is a metalanguage to let you design your own markup language.
+ A regular markup language defines a way to describe information in
+ a certain class of documents (eg HTML). XML lets you define your
+ own customized markup languages for many classes of document. It
+ can do this because it's written in SGML, the international standard
+ metalanguage for markup languages.
 
 Package: libxml2
 Architecture: any
@@ -36,13 +43,8 @@ Depends:
 Conflicts:
  w3c-dtd-xhtml,
 Multi-Arch: same
-Description: GNOME XML library
- XML is a metalanguage to let you design your own markup language.
- A regular markup language defines a way to describe information in
- a certain class of documents (eg HTML). XML lets you define your
- own customized markup languages for many classes of document. It
- can do this because it's written in SGML, the international standard
- metalanguage for markup languages.
+Description: ${source:Synopsis}
+ ${source:Extended-Description}
  .
  This package provides a library providing an extensive API to handle
  such XML data files.
@@ -54,13 +56,8 @@ Depends:
  ${misc:Depends},
  ${shlibs:Depends},
 Multi-Arch: foreign
-Description: XML utilities
- XML is a metalanguage to let you design your own markup language.
- A regular markup language defines a way to describe information in
- a certain class of documents (eg HTML). XML lets you define your
- own customized markup languages for many classes of document. It
- can do this because it's written in SGML, the international standard
- metalanguage for markup languages.
+Description: ${source:Synopsis} - utilities
+ ${source:Extended-Description}
  .
  This package provides xmllint, a tool for validating and reformatting
  XML documents, and xmlcatalog, a tool to parse and manipulate XML or
@@ -76,13 +73,8 @@ Depends:
 Suggests:
  pkg-config,
 Multi-Arch: same
-Description: Development files for the GNOME XML library
- XML is a metalanguage to let you design your own markup language.
- A regular markup language defines a way to describe information in
- a certain class of documents (eg HTML). XML lets you define your
- own customized markup languages for many classes of document. It
- can do this because it's written in SGML, the international standard
- metalanguage for markup languages.
+Description: ${source:Synopsis} - development files
+ ${source:Extended-Description}
  .
  Install this package if you wish to develop your own programs using
  the GNOME XML library.
@@ -95,13 +87,8 @@ Depends:
 Suggests:
  devhelp,
 Multi-Arch: foreign
-Description: Documentation for the GNOME XML library
- XML is a metalanguage to let you design your own markup language.
- A regular markup language defines a way to describe information in
- a certain class of documents (eg HTML). XML lets you define your
- own customized markup languages for many classes of document. It
- can do this because it's written in SGML, the international standard
- metalanguage for markup languages.
+Description: ${source:Synopsis} - documentation
+ ${source:Extended-Description}
  .
  This package contains general information about the GNOME XML library
  and more specific API references.
@@ -115,13 +102,8 @@ Depends:
  ${misc:Depends},
  ${python3:Depends},
  ${shlibs:Depends},
-Description: Python3 bindings for the GNOME XML library
- XML is a metalanguage to let you design your own markup language.
- A regular markup language defines a way to describe information in
- a certain class of documents (eg HTML). XML lets you define your
- own customized markup languages for many classes of document. It
- can do this because it's written in SGML, the international standard
- metalanguage for markup languages.
+Description: ${source:Synopsis} - Python3 bindings
+ ${source:Extended-Description}
  .
  This package contains the files needed to use the GNOME XML library
  in Python3 programs.

#998165#22
Date:
2021-12-12 17:47:21 UTC
From:
To:
I propose the following diff, which is based on
daa7d69fbffc1c438002993860f0df407e4aaeb1 (4.6.0.1):

|--- a/policy/ch-controlfields.rst
|+++ b/policy/ch-controlfields.rst
|@@ -131,6 +131,8 @@ package) are:
|
| -  :ref:`Rules-Requires-Root <s-f-Rules-Requires-Root>`
|
|+-  :ref:`Description <s-f-Description>`
|+
| The fields in the binary package paragraphs are:
|
| -  :ref:`Package <s-f-Package>` (mandatory)
|@@ -652,9 +654,14 @@ orderings.  [#]_
| ~~~~~~~~~~~~~~~
|
| In a source or binary control file, the ``Description`` field contains a
|-description of the binary package, consisting of two parts, the synopsis
|-or the short description, and the long description. It is a multiline
|-field with the following format:
|+description of the package, consisting of two parts, the synopsis or the short
|+description, and the long description.
|+
|+When used in a source control file in the general paragraph (i.e., the first
|+one, for the source package), the text in this field is relevant for all binary
|+packages built by the given source package.
|+
|+It is a multiline field with the following format:
|
| ::
|

I also pushed my change here:
https://salsa.debian.org/mattia/policy/-/commit/807bd3ea551087df33c54c270e7f11151c8b0ae2

#998165#27
Date:
2021-12-12 22:51:43 UTC
From:
To:
I believe the 2nd "in" in this line is too much.

and then I'd rewrite "the general paragraph (i.e., the first one, for
the source package)" into "the first paragraph" and the whole paragraph into

When used in a source control file the first paragraph is used as the first
paragraph for all binary packages built by the given source package.

I'd second this change too.

#998165#30
Date:
2021-12-14 14:52:23 UTC
From:
To:
Mh, not really?  However indeed it might be clearer this other way:

    When used in the general paragraph of a source control file [...]
talks about the "Description" and it's linked to from the various places
this field is used (a source control file, a binary control file, a
changes control file); it's not describing the source control file (i.e.
d/control).  Furthermore, note that the wording "general paragraph"
comes from §5.2 "Source package control files – debian/control":

    The first paragraph of the control file contains information about
    the source package in general. [...]

    The fields in the general paragraph (the first one, for the source
    package) are: [...]


Of course I'm happy to use a different expression (like, "first
paragraph of a source control file") if I'm recommended to.

#998165#35
Date:
2021-12-22 00:53:31 UTC
From:
To:
Hello Mattia,

Thanks for the patch.

Is there really no name for the first paragraph other than "general
paragraph"?  Maybe "the source package's stanza"?

Also, how about "the text in this field describes all binary packages
which do not have their own Description: fields" ?

#998165#38
Date:
2021-12-24 10:03:55 UTC
From:
To:
You should tell me :D

As I said, I'm happy to change those 3 words, but first you should
probably decide and then make it uniform with §5.2 "Source package
control files – debian/control" (and I have no idea how it is referred
to in the rest of the document).

Where would you add this line?  The whole section is talking about
Description regardless of where it is used, so your suggestion doesn't
sound any good there.

#998165#43
Date:
2021-12-24 12:42:07 UTC
From:
To:
That's how the dpkg documentation (man and perl modules POD) refers to
it (or first block of information, which is even worse), but I agree
it's rather suboptimal, and I'd like to get a better name for it. See
below.

Something like this might work, which is probably what we have been
calling it for some time now, but that ties up with something that has
been bothering me for some time now, and that is that I've found our
naming of the various stanzas and the various control filenames rather
confusingly similar, which for dpkg I'd really like to clear up (and
have a few tentative commits already), as that's even affecting its perl
API currently. :/

For example we have «Debian source control file» or «Debian source
packages' control file» for .dsc, then we have «Source package control
file» or in dpkg «Debian source packages' master control file» for
debian/control. Which are almost the same. I've been considering naming
debian/control something like «Debian template source control file», as
that is used to generate both the source and binary control files.

But I think I'll open a new bug to cover and discuss that.

I'm not sure whether you are implying this (or whether the text would),
but the Description in the source stanza does not get inherited by the
binary stanzas when generating the binary package control file; one
needs to add references to it via substvars.
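
As a rough illustration of that point, here is a small Python sketch of
substvar-style expansion. It is not dpkg's code: the expand function and the
regular expression are hypothetical, unknown variables are simply expanded to
an empty string as a choice of this sketch, and the source description is
kept to a single line to sidestep multi-line handling.

```python
import re

# Hypothetical pattern for ${name} references; an assumption for this sketch.
SUBSTVAR = re.compile(r"\$\{([A-Za-z0-9][-:A-Za-z0-9]*)\}")


def expand(text: str, substvars: dict) -> str:
    """Replace ${name} references; unknown names become "" in this sketch."""
    return SUBSTVAR.sub(lambda m: substvars.get(m.group(1), ""), text)


# Values that would come from the Description in the source stanza.
source_vars = {
    "source:Synopsis": "GNOME XML library",
    "source:Extended-Description": "XML is a metalanguage.",
}

# A binary stanza that references the source description explicitly.
with_refs = (
    "${source:Synopsis} - utilities\n"
    " ${source:Extended-Description}\n"
    " .\n"
    " This package provides xmllint."
)
# A binary stanza without references is left exactly as written.
without_refs = "XML utilities\n This package provides xmllint."

expanded = expand(with_refs, source_vars)
untouched = expand(without_refs, source_vars)
print(expanded)
print(untouched)
```

The second stanza shows the behaviour described above: nothing from the
source stanza shows up unless the binary stanza asks for it.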

Thanks,
Guillem

#998165#48
Date:
2021-12-27 20:20:14 UTC
From:
To:
Hello Guillem, Mattia,

Okay, fair enough, then let's just use "general paragraph" for now.

Cool.

Oh, right.

In that case, returning to Mattia's patch, it is probably not correct to
say that the source Description is relevant for all binary packages,
because perhaps the substvar is used for some but not all of them?

#998165#51
Date:
2021-12-27 20:51:39 UTC
From:
To:
Mh, we'll probably need Guillem to confirm/deny this, but here I
really really was trying to not even mention the substvar thing.
That to me feels like an implementation detail on how to fill a binary
package Description (that can already be accomplished in several other
ways).
In my mind I was mostly focusing on being able to provide a
**description for the source package** (that is, then, relevant to
everything that source package builds); said description being picked up
by a substvar and used again later on is more like a nicety that comes
after describing the source first.

Should I perhaps express my intention differently?  For example:

|+When used in a source control file in the general paragraph (i.e., the first
|+one, for the source package), the text in this field is used to describe the
|+source package itself, and consequently all of the binary packages
|+built from itself.

?


(fwiw, Guillem: do you think the same text, once picked, should be
copied verbatim on deb-src-control(5)?)

#998165#56
Date:
2021-12-27 21:53:25 UTC
From:
To:
Mattia Rizzolo <mattia@debian.org> writes:

What if we just left off that paragraph entirely?  I'm not sure it's
adding anything.  The new text would then read:

   In a source or binary control file, the ``Description`` field contains a
   description of the package, consisting of two parts, the synopsis or
   the short description, and the long description.

If it's in a source control file, it's a description of the source
package; if it's in a binary control file, it's a description of the
binary package.  That seems obvious, so I'm not sure we need to say it
explicitly.

That said, 5.6.13 currently technically doesn't document Description for a
source package control file, only for source or binary control files or
(later, with entirely different syntax) for *.changes files.  Maybe that's
the root of the problem.  In that case, I think the paragraph we need is:

   The ``Description`` fields in source package control files are used to
   construct the ``Description`` fields for the source and binary control
   files when the package is built.  Any ``Description`` field in the
   first paragraph of the source package control file becomes the
   description of the source package for the source control file.
   ``Description`` fields in subsequent paragraphs become the description
   of the corresponding binary packages.  See deb-substvars(5) for some
   substitution variables that may be useful when writing binary package
   descriptions, such as ``source:Synopsis`` and
   ``source:Extended-Description``.

BTW, I think "3.4 The description of the package" may also need some minor
updates.  At the least, "Every Debian package" should probably say "Every
Debian binary package" since I don't think we're requiring source packages
to have descriptions.  It may also be worth adding a paragraph explaining
that source packages may have descriptions as well, but are not required
to.

#998165#61
Date:
2021-12-27 22:08:03 UTC
From:
To:
Hi!

Reply follows inline,

Mattia Rizzolo <mattia@debian.org> writes:

The following is only Informational level, but the existence of
Lintian's "duplicate-long-description" tag suggests that producing
duplicate bin:Descriptions in bin:libfoo and bin:foo packages is not
ideal, thus a straight copy from src:Description is not ideal.  I'm not
sure what the best way to solve this is, but substvar looks like a good
solution.  Alternatively, simply appending "\n\n$binary_pkg\n" to the
src:Description when generating the bin:pkg Descriptions would do the
trick.  Maybe there's an even better way?

This appears to conflict with the "duplicate-long-description" tag.  Of
course, Lintian isn't Policy, but I hope most will agree that it's worth
considering this precedent in some way.

Regards,
Nicholas

#998165#66
Date:
2021-12-27 22:36:56 UTC
From:
To:
I believe the intention is to automate this pattern, which a lot of
packages with shared libraries are already using:

    Source: dbus

    Package: dbus
    Description: simple interprocess messaging system (system message bus)
     D-Bus is a message bus, used for sending messages between applications.
     [the real Description goes into more detail here]
     .
     This package provides a fully-functional D-Bus system bus [etc.]

    Package: libdbus-1-3
    Description: simple interprocess messaging system (library)
     D-Bus is a message bus, used for sending messages between applications.
     [the real Description goes into more detail here]
     .
     This package provides the runtime library for use by applications.

    Package: libdbus-1-dev
    Description: simple interprocess messaging system (development files)
     D-Bus is a message bus, used for sending messages between applications.
     [the real Description goes into more detail here]
     .
     This package provides development headers and a static library.

by turning it into something like this:

    Source: dbus
    Description: simple interprocess messaging system
     D-Bus is a message bus, used for sending messages between applications.
     [the real Description goes into more detail here]

    Package: dbus
    Description: simple interprocess messaging system (system message bus)
     ${source:Extended-Description}
     .
     This package provides a fully-functional D-Bus system bus [etc.]

    Package: libdbus-1-3
    Description: simple interprocess messaging system (library)
     ${source:Extended-Description}
     .
     This package provides the runtime library for use by applications.

    Package: libdbus-1-dev
    Description: simple interprocess messaging system (development files)
     ${source:Extended-Description}
     .
     This package provides development headers and a static library.

which eliminates a lot of the duplication. Is that correct?
causality should be the other way round (Lintian should remind maintainers
about things that are already undesirable, rather than something being
undesirable solely because Lintian says so).

However, the rationale given in the long descriptions of Lintian
tags/hints should point to a reason why it's better to avoid the tagged
behaviour, and that reason is the thing to pay attention to.

In the case of binary package descriptions, I believe the reasoning is:
the Description of a package should tell you whether you might want to
install it. If the Description is identical, then by definition it can't
tell you why you would want to install dbus but not libdbus-1-dev, or
vice versa; and if there is no reason why you would want to install one
but not the other, then they should usually be combined into one larger
package (although multiarch and Architecture: any vs. all sometimes mean
that things need to be split for technical reasons even though there is
no user-facing reason for them to be separate).
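
For illustration, the kind of check being discussed could be sketched in
Python roughly as follows. This is not Lintian's implementation; the function
name is hypothetical, and comparing the whole extended description verbatim
is an assumption made for this sketch.

```python
from collections import defaultdict


def find_duplicate_long_descriptions(descriptions: dict) -> list:
    """Group binary package names whose extended descriptions are identical.

    Hypothetical helper; assumes a verbatim comparison of the long part.
    """
    by_text = defaultdict(list)
    for package, description in descriptions.items():
        # Drop the synopsis (first line) and compare only the long part.
        _, _, extended = description.partition("\n")
        by_text[extended.strip()].append(package)
    return [sorted(names) for names in by_text.values() if len(names) > 1]


descriptions = {
    "dbus": "simple interprocess messaging system (system message bus)\n"
            " D-Bus is a message bus.\n"
            " .\n"
            " This package provides the system bus.",
    "libdbus-1-3": "simple interprocess messaging system (library)\n"
                   " D-Bus is a message bus.",
    "libdbus-1-dev": "simple interprocess messaging system (development files)\n"
                     " D-Bus is a message bus.",
}
duplicates = find_duplicate_long_descriptions(descriptions)
print(duplicates)
```

With this input only the two library packages are grouped, because the dbus
package adds a paragraph of its own.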

    smcv

#998165#71
Date:
2021-12-27 23:32:41 UTC
From:
To:
Nicholas D Steeves <sten@debian.org> writes:

What's wrong with duplicate descriptions?  I think we need to answer that
question first before deciding whether the Lintian tag is telling us
anything meaningful.

Policy has a fairly good description of the intention of the package
description:

https://www.debian.org/doc/debian-policy/ch-binary.html#s-descriptions

I think the additional caveat, and the primary place where I've seen
packages duplicate the same long description, is that there are a bunch of
packages in Debian that are essentially never installed directly by a
systems administrator, and thus for which the package description doesn't
matter a ton if there's nothing else to describe (like conflicts or
dependencies).  Library packages are a typical example; they're pulled in
as dependencies.

I personally tend to add a sentence to the top of the binary package
description saying something like "Provides the shared libraries for foo"
or "Provides architecture-independent support files used by foo" and then
put the shared long description in the second paragraph, which wouldn't be
flagged by Lintian.  But I'm not sure the practice of putting "- shared
library" in the synopsis and using the same long description is all that
bad or something we should worry about discouraging.

#998165#76
Date:
2021-12-27 23:35:22 UTC
From:
To:
Simon McVittie <smcv@debian.org> writes:

[...]

This is somewhat of an aside, but the order of those two paragraphs should
really be reversed, under the maxim that the most important information
about a package should go first due to possible truncation in the UI used
to view package descriptions (and also in the human tendency to only read
the first paragraph).

In other words, rather than:

I think we should prefer:

    Package: libdbus-1-3
    Description: simple interprocess messaging system (library)
     The runtime D-Bus library for use by applications.
     .
     D-Bus is a message bus, used for sending messages between applications.
     [the real Description goes into more detail here]

#998165#81
Date:
2021-12-29 19:43:46 UTC
From:
To:
Hello Mattia, Russ,

Thank you both for your input on this.
in the source package paragraph was for the sake of substituting it into
binary package descriptions.

Could those who have been involved in non-Policy discussions of source
package paragraph Description: fields confirm that the purpose here
really is to add descriptions for source packages, as well as to provide
something to substitute?

Introducing descriptions for source packages seems fine, but I want to
be surer of our intent.

Looks good, once my question above is addressed.

Right.  I don't think we even want to recommend them at this point.  I
would not like to put any pressure on maintainers to write them.