The information about apt repositories is seriously lacking. As far as I know there is no way to verify that a repository layout is correct other than trying to download an index file and a package with apt. The apt documentation does not tell how to infer, from its configuration, which files apt tries to download. There is no documentation on creating Release files. There are some outdated howtos around the net that suggest different fields than the ones supported by apt-ftparchive. However, Release files created with apt-ftparchive are ignored (Ign) by apt without any explanation whatsoever. The apt security feature requires working Release files as only these are signed. Thanks Michal
Could you clarify how this differs from #481129?
Excerpts from Filipus Klutiero's message of Wed May 16 18:44:21 +0200 2012: It's 4 years later. Sorry, forgot that I filed the bug already. It's been quite some time. Given there is no feedback in 4 years I guess it is futile reporting this. Admittedly there is no text in the social contract about using Debian-proprietary formats. And a format only defined by "apt can read that" is definitely Debian-proprietary; there is no better term for that. I'd say it's slightly discriminatory against software not part of Debian that cannot rely on getting notified when "apt can read that" silently changes: there is no document defining what apt should be able to read that software authors can rely on to interoperate with apt, one of the core Debian tools. Apt in turn relies on open standards like HTTP and FTP to interoperate with the rest of the world. Thanks Michal
Michal Suchanek writes ("Re: Bug#671503: general: APT repository format is not documented"):
Well, it's useful to bring it up again.
Everyone agrees that it would be better if this were documented.
(I have struggled on occasion myself due to the lack of
documentation.)
But I think the use of the word "proprietary" is going too far. It's
certainly a special Debian format, but that wouldn't be changed if it
were documented. But it's not secret and we publish at least two
writer implementations and one reader implementation AFAIK, with
proper Free licences.
I think this is not an appropriate use of the social contract or its
concepts.
Rather than complaining that this documentation doesn't exist, how
about writing the document yourself ? It's not a trivial job but it
should be feasible by looking at the apt source code.
Once such a document exists, even if it's a bit sketchy or perhaps not
entirely accurate, it will be much easier to insist that future
changes are likewise documented.
Thanks,
Ian.
Excerpts from Ian Jackson's message of Thu May 17 14:53:30 +0200 2012: However, it's easier to reverse-engineer an existing repository than the source code so for all practical purposes it's the same as if it were closed source. For me it is not feasible at all. I can, of course, describe what current repositories look like or what the current apt code accepts. However, that has silently changed in the past and is considered an apt feature, not a bug. I am not so sure about that. So long as the document merely describes what apt happens to do at the moment rather than apt implementing what the document says there is no saying this document has any value. The status was 'documented' by existing repositories which stopped working. Thanks Michal
Hello, As someone who had to reverse-engineer the APT repository format I fully agree with the above. With one minor addition: some software which is a (non-core) part of Debian suffers from the same problem.
That is nonsense. You said yourself that the repository is not sufficient to understand it, yet you say that it is easier to understand with it than with looking at the source (and the various bits and pieces where parts are documented). Even if you don't like apt sources, Debian has a lot of other tools working with the same repository in a bunch of different languages, so choose which you like most and you will probably find a tool in that language to either download from repositories or to create them. But I don't know why we are still talking here. Russ already said he would like to have it as a subpolicy in the debian-policy. ftpmasters already said they would accept maintaining it. Everything left is writing this goddamn piece of documentation. So, maybe you should just write it… If you want to be extra fancy, start a wikipage in the Debian wiki, but start typing. Go to debian-dak@l.d.o and discuss your work there. Would be way more productive than talking about this document being missing… Everybody knows that. Everybody doesn't like it. Now go and fix that. That everyone would like to have such a document but nobody has it so far is a strong indication that the current people are busy with other stuff. An opportunity to get involved, I would say. It hasn't silently changed. It was and is still the same. Your script was just horribly wrong and older APT versions just happened to work with that brokenness a little better. What you created with that script was NEVER intended to work, it just happened to be working out of complete luck (A Release file is supposed to include current data, not non-existent data, this conclusion is reachable even without too much guessing. Besides, this is actually documented in apt-secure and co, but that is the problem with most of the documentation, nobody really reads it even if it exists… which in the specific case of the Release file is even translated to a few languages -- I am too lazy to look it up now…).
At the time you reported that bug I also told you what was wrong in that script and how to fix it if you want to continue to use that script, so please don't keep up the myth that we changed everything and nobody told you why and how and what not. If you are really that unsatisfied with apt, feel free to use something else, we have enough tools to choose from, it's not like apt would be the only package manager in existence. It might be the most used one, but that doesn't say anything about documentation or available manpower. that you are now no longer able to follow the oil drops back home, right? If we want to follow that thought, the only repository which should give any indication on how APT might work is the one created by the ftpmasters. Actually that is not really true as APT could basically do anything, but if it wants to keep the status 'debian native' package it better should. And ftpmaster could basically do anything, but this would probably make quite a few people (a bit) angry. Anyway: it is absolutely not "documented" by any random repository which just happened to sort of work because its maintainer was lucky and thinks that "if it builds without a fatal error, it must be perfect". Best regards David Kalnischkies
reassign 481129 debian-policy merge 481129 671503 thanks Merging the reports then.
Excerpts from David Kalnischkies's message of Thu May 17 18:21:59 +0200 2012: No, understanding the repository (or current apt source) is not sufficient to ensure that your repository will be readable by future apt or that future repositories will not become unreadable by your apt without any warning or explanation. Both have happened. That would be awesome. Maybe he just forgot to CC this bug as well? As said earlier, just writing a random document does not keep apt from diverging from it. My script did exactly what apt-secure says. Well, at least so much as apt-secure is specific about it. And the data was existing and well recognized by apt so it passed all available tests. Indeed, you did. After I reported it as a bug. Were the format documented I would know from the start ;-) If all cars up to that day would leak oil enough so you could trace them by the oil drops I would be indeed surprised. Thanks Michal
Michal Suchanek <michal.suchanek@ruk.cuni.cz> writes:
I would suggest you look at existing repositories, whatever scraps of
information are in the manuals and maybe a bit at the source and start to
write the documentation. Once you have that, offer it for review and other
people can pitch in their bits of knowledge. Getting the current format
documented right shouldn't be that hard if someone just starts.
And once such a document exists it is much easier to get people to
document changes or hit them over the head if they don't.
Remember that you don't have to be 100% right in what you write. You
only need to write a draft to start the process. Getting people to
comment and correct any mistakes you simply don't know about is much
much easier than getting someone else to write the whole thing.
MfG
Goswin
CC'ing the apt list deity@lists.debian.org.
Goswin von Brederlow writes ("Re: Bug#481129: Bug#671503: general: APT repository format is not documented"):
Right.
Can the apt maintainers confirm that once such a document exists, they
will insist that future contributions to apt which change the
repository format update the document ?
What form do the apt maintainers think the document should take ?
Should it eventually be in the apt source package or somewhere else ?
Indeed so.
Thanks,
Ian.
I do not think that APT is responsible for the repository format. The repository format is defined by ftpmaster, not by APT. APT has to my knowledge not defined anything new, but only implemented changes to the repository format after they were introduced by ftpmaster (see InRelease files). We currently have three independent implementations of the repository format in the archive: APT, cupt, smartpm. Furthermore, tools like debian-cd probably also have some knowledge about the repository format. The repository format should thus be part of Policy, not part of APT. APT is one of the users of that format, not the one defining it (it might just get stricter in behavior from time to time, just like compilers). Changes to the format should require approval of ftpmaster, as they have to implement them on the server-side.
describes the current format for Release, Packages, and Sources
files. It's thus missing Contents and Translations, pdiffs, and
stuff, but it's a beginning.
It specifies different requirements for servers and clients,
in order to have clients be backwards compatible with more
repositories, and forcing servers to be stricter. Don't know
how good that is, though.
============================
= Debian Repository Format =
============================
This documents a subset of the Debian repository format. This is a work
in progress.
"Release" files
===============
The file "dists/$DIST/Release" shall contain meta-information about the
distribution and checksums for the indices. The file "dists/$DIST/Release.gpg"
shall be a GPG signature of the "Release" file, compatible with the format
used by the GPG options "-a -b -s". The file "dists/$DIST/InRelease" shall be
the "Release" file with a GPG clearsign signature compatible with the format
used by the GPG options "-a -s --clearsign".
The following fields might be available:
- Origin
- Label
- Suite
- Codename
- Date
- Valid-Until
- Architectures
- Components
- Description
- MD5sum, SHA1, SHA256
- NotAutomatic and ButAutomaticUpgrades
Servers shall provide the Release file, and its signed counterparts with at
least the following keys:
- SHA256
- Origin
- Suite and/or Codename
- Architectures
Clients shall accept missing Release files, and Release files without the
fields required for servers. They might reject Release files that do not
contain at least one of the fields defined herein.
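As an illustration of the server-side requirement above, here is a minimal Python sketch that parses a plain "Release" file and reports which of the required fields are absent. The function names are hypothetical and not taken from any existing tool; the sketch assumes an unsigned "Release" file (it does not strip the armor of a clearsigned "InRelease") and assumes continuation lines start with a space or tab.

```python
def parse_release(text):
    """Parse a Release file into a dict mapping field name to value.

    Continuation lines (starting with a space or tab) are appended to the
    previous field, one entry per line, as used by the checksum fields.
    """
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t":
            if current is not None:
                fields[current] += "\n" + line.strip()
        else:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return fields


def missing_server_fields(fields):
    """Return the fields this draft requires of servers that are absent."""
    missing = [name for name in ("SHA256", "Origin", "Architectures")
               if name not in fields]
    if "Suite" not in fields and "Codename" not in fields:
        missing.append("Suite and/or Codename")
    return missing
```

A client following the draft would still accept a file for which `missing_server_fields` returns a non-empty list; the check is only meant for linting a server.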
Architectures
-------------
Whitespace separated unique single words identifying Debian machine architectures
as described in Architecture specification strings, Section 11.1.
Origin
------
Shall indicate the origin of the repository.
Label
-----
Optional field including some kind of label.
Suite
-----
The Suite field shall describe the suite. In Debian, this shall be one of
oldstable, stable, testing, unstable, or experimental; with optional suffixes
separated by "-" (such as "stable-updates").
Codename
--------
The Codename field shall describe the codename of the release. This is mostly
a free-form string used to give a name to a release.
Date, Valid-Until
-----------------
The Date field shall specify the time at which the Release file was created. The
Valid-Until field shall specify at which time the Release file should be
considered expired by the client. Client behaviour on expired Release files
is unspecified.
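A minimal Python sketch of an expiry check follows. The draft does not fix the syntax of the date values; this sketch assumes the RFC 2822 style seen in existing Release files (for example "Sat, 02 Jul 2011 05:05:26 UTC") and treats a value without a recognised zone as UTC. The function names are hypothetical, and what a client does with an expired file remains unspecified.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_release_date(value):
    """Parse a Date or Valid-Until value (assumed RFC 2822 style)."""
    parsed = parsedate_to_datetime(value)
    if parsed.tzinfo is None:
        # Assumption: values without a recognised zone are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_expired(fields, now=None):
    """Return True if Valid-Until is present and lies in the past."""
    valid_until = fields.get("Valid-Until")
    if valid_until is None:
        return False  # no expiry information given
    if now is None:
        now = datetime.now(timezone.utc)
    return now > parse_release_date(valid_until)
```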
Components
-----------
A whitespace separated list of areas.
Example:
Components: main contrib non-free
MD5sum, SHA1, SHA256
--------------------
Those fields shall be multi-line fields containing multiple lines of whitespace
separated data. Each line shall contain
(1) The checksum of the file in the format corresponding to the field
(2) The size of the file (integer >= 0)
(3) The filename relative to the directory of the Release file
Each datum may be separated by one or more whitespace characters.
Server requirements:
The field shall contain data about all uncompressed files, and should also
contain information about all compressed files. The checksum and sizes shall
match the actual existing files.
Client behaviour:
Any file should be checked at least once, either in compressed or
uncompressed form, depending on which data is available. If a file
has no associated data, the client shall inform the user about this
under possibly dangerous situations (such as installing a package
from that repository). If a file does not match the data specified
in the release file, the client shall not use any information from
that file, inform the user, and might use old information (such as
the previous locally kept information) instead.
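The checksum lines described above could be checked along the following lines. This is a sketch only: the function names are hypothetical, filenames are assumed to contain no whitespace, and the three-valued result (match, mismatch, file absent) is a choice made here so that a caller can tell a listed but missing file from a corrupted one.

```python
import hashlib
import os


def parse_checksum_field(value):
    """Return (checksum, size, filename) tuples from a checksum field."""
    entries = []
    for line in value.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        checksum, size, filename = parts
        entries.append((checksum.lower(), int(size), filename))
    return entries


def verify_entry(release_dir, entry, algorithm="sha256"):
    """Check one entry against the file below the Release directory.

    Returns True on a match, False on a size or checksum mismatch,
    and None if the file does not exist locally.
    """
    checksum, size, filename = entry
    path = os.path.join(release_dir, filename)
    if not os.path.exists(path):
        return None
    digest = hashlib.new(algorithm)
    total = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
            total += len(chunk)
    return total == size and digest.hexdigest() == checksum
```

Pass `algorithm="md5"` or `algorithm="sha1"` when checking the MD5sum or SHA1 field instead of SHA256.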
NotAutomatic and ButAutomaticUpgrades
-------------------------------------
The NotAutomatic and ButAutomaticUpgrades fields are boolean fields
instructing the package manager. They may contain the values "yes"
and "no".
If "NotAutomatic: yes" is specified, the client should prevent installation
of packages from this repository unless explicitly requested (APT will assign
priority 1 to that repository).
If "ButAutomaticUpgrades: yes" is specified in addition to "NotAutomatic: yes",
the client should cause upgrades to packages from that repository to be
installed automatically (APT will assign priority 100 to that repository).
If both are either missing or set to "no", the repository should behave like
any other repository (APT will assign either priority 500 or 990 by default,
depending on whether the release is its target release).
Other combinations are undefined.
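The priority rules above can be summarised in a small Python function. This is a sketch of the mapping as stated in this section, not of APT's actual implementation; the function name and the `is_target_release` parameter are hypothetical, and the undefined combination is reported as `None`.

```python
def default_priority(fields, is_target_release=False):
    """Map NotAutomatic/ButAutomaticUpgrades to the priorities given above.

    Returns None for the combination this section leaves undefined
    (ButAutomaticUpgrades set without NotAutomatic).
    """
    def flag(name):
        return fields.get(name, "no").strip().lower() == "yes"

    not_automatic = flag("NotAutomatic")
    but_automatic_upgrades = flag("ButAutomaticUpgrades")
    if not_automatic and but_automatic_upgrades:
        return 100
    if not_automatic:
        return 1
    if but_automatic_upgrades:
        return None  # undefined combination
    return 990 if is_target_release else 500
```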
"Packages" Indices
==================
The files dists/$DIST/$COMP/binary-$ARCH/Packages are called Binary Packages
Indices. They consist of multiple paragraphs, where each paragraph has the
format defined in Policy 5.3 (Binary package control files -- DEBIAN/control),
and the additional fields defined in this section.
Filename
--------
The Filename field shall list the path of the package archive relative to the
base directory of the repository.
Example:
Filename: pool/main/a/apt/apt_0.9.3_amd64.deb
Required: yes
Size
----
The size field shall give the size of the package file, in bytes.
Example:
Size: 1158196
Required: yes
MD5sum, SHA1, SHA256, SHA512
----------------------------
Checksums for the package. They shall be represented in hexadecimal
notation. The SHA512 field is not in active use prior to this
specification, the MD5sum and SHA1 fields should be considered
deprecated, but should still be provided.
Examples:
MD5sum: 2519c8c1afd27e70cf4ac10a5fa46e32
SHA1: 646eda5b6d51190181c15f5537428161f6f04c1d
SHA256: 3183eff291d1e9d905e78a6b467bbfb90b20fc2808d50b5e91bf55158b4c18be
Server requirements: SHA256 shall be available
Client requirements: Shall accept files without any such fields, should warn
if those fields are missing and a package is used.
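For a server-side tool, the Filename, Size, and checksum fields of one package could be generated roughly as follows. This is a hedged sketch: `package_index_fields` is a hypothetical helper, it only emits the fields discussed in this section (the control fields from the package itself would still have to be added), and it reads the file once while updating all digests.

```python
import hashlib
import os


def package_index_fields(repo_root, filename):
    """Return the Filename, Size and checksum lines for one package file.

    `filename` is the path relative to the base directory of the repository.
    """
    hashes = {
        "MD5sum": hashlib.md5(),
        "SHA1": hashlib.sha1(),
        "SHA256": hashlib.sha256(),
    }
    size = 0
    with open(os.path.join(repo_root, filename), "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            size += len(chunk)
            for digest in hashes.values():
                digest.update(chunk)
    lines = ["Filename: %s" % filename, "Size: %d" % size]
    lines += ["%s: %s" % (name, digest.hexdigest())
              for name, digest in hashes.items()]
    return "\n".join(lines)
```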
Description-md5
----------------
An md5sum of the English description. This will be used to look up the
translations in the translation indices. If this field is not defined,
the md5sum shall be calculated from the Description field.
Server requirements:
Either Description or Description-md5 shall be specified.
Client requirements:
If neither Description nor Description-md5 is defined, the result shall
be the same as if an empty description was specified for all languages.
If Description-md5 is defined, the long description shall be looked up
via translation indices if requested.
Example:
Description-md5: 9fb97a88cb7383934ef963352b53b4a7
Description
-----------
The Description field shall contain the complete package description, if
Description-md5 is not defined; or only the short description of the package,
if Description-md5 is defined.
"Sources" Indices
=================
The files dists/$DIST/$COMP/source/Sources are called Sources indices. They
consist of multiple paragraphs, where each paragraph has the format defined
in Policy 5.4 (Debian source control files -- .dsc), with the following
changes and additional fields. The changes are:
- The "Source" field is renamed to "Package"
- A new mandatory field "Package-List"
- A new mandatory field "Directory"
- A new optional field "Priority"
- A new optional field "Section"
Package-List
------------
The Package-List field shall contain multiple lines of package information,
where each line begins with a whitespace and has the following format:
$PKGNAME $TYPE $SECTION $PRIORITY
$PKGNAME is the name of the package, $TYPE is "deb" or "udeb", $SECTION
is the section of the package, and $PRIORITY is the priority of the package.
Example:
Package-List:
 apt deb admin important
 apt-doc deb doc optional
 apt-transport-https deb admin optional
 apt-utils deb admin important
 libapt-inst1.5 deb admin important
 libapt-pkg-dev deb libdevel optional
 libapt-pkg-doc deb doc optional
 libapt-pkg4.12 deb admin important
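A reader of the Package-List field might parse it as in the following Python sketch. The names `PackageListEntry` and `parse_package_list` are hypothetical; the sketch takes the field value (everything after "Package-List:") and rejects lines with fewer than the four items described above.

```python
from typing import NamedTuple


class PackageListEntry(NamedTuple):
    name: str
    type: str
    section: str
    priority: str


def parse_package_list(value):
    """Parse the value of a Package-List field into a list of entries."""
    entries = []
    for line in value.splitlines():
        parts = line.split()
        if not parts:
            continue  # the first line after the field name is empty
        if len(parts) < 4:
            raise ValueError("malformed Package-List line: %r" % line)
        entries.append(PackageListEntry(*parts[:4]))
    return entries
```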
Directory
---------
The directory field shall list the location of the source package in the
repository, relative to the base directory of the repository.
Example:
Directory: pool/main/a/apt
Priority
--------
Shall contain the value "source".
Example:
Priority: source
Section
-------
Shall contain the section specified for the source package??
Example:
Section: admin
Ubuntu. One of those is the "Important: yes" field, which is like Essential, but does not force installation of the package like Essential would do (and does not force immediate configuration nowadays, so that we can use it for custom meta packages [so that users cannot accidentally remove the meta package that configures the complete system]). I don't know of any other extensions, though. In any case, they should probably not be part of an official specification, but rather documented in APT.
FWIW posted on the wiki: http://wiki.debian.org/RepositoryFormat Thanks Michal
* Julian Andres Klode <jak@debian.org> [120518 14:43]:
I think it does not make much sense to document all legacy information
when starting to standardize. I'd opt for only "InRelease".
In *Release files it is MD5Sum not MD5sum.
That sounds like nothing all is allowed. I think a client should ignore
everything it does not know about instead.
Isn't Components also required currently?
I think a client should not be forced to accept a server without a
Release.
That section does not specify much, except the output of "dpkg-architecture -L",
which makes no sense for servers (as a server should not need a dpkg
with a architecture like that to support clients having it, and a client
should not need to know about what other clients have).
s/Shall/Can/ to make it more clear that it is optional?
As this is the primary information, some data about how it can be formed
would be nice.
must be instead?
How about suggesting that a client should not download any files not
listed in there or listed in other files already downloaded?
There are more clients than installers and there is no point in saying
what a client should do there. Better describe its meaning.
ditto
Again, that is for the client to decide...
Here it would be nice to state that uncompressed Packages files do not
need to exist even if listed in InRelease.
Has SHA512 any sense? Perhaps wait till there is something more secure
than SHA256 is around?
If there is none of them, I'd rather suggest a client to reject the
package.
I'd suggest that a client should also support files without either of
those.
That field is quite new. Not sure if it makes sense to make it
mandatory.
Bernhard R. Link
InRelease is mostly unused by 3rd parties, so better not. It's not even used by Ubuntu yet, and most clients do not support it yet. So, I'd say that all three must be available (or actually, Release and Release.gpg shall be, and InRelease may be) - whereas Clients may assume that either Release or InRelease is available. OK. I don't really understand what you mean. Well, APT does not use it AFAIK, but all official servers create it. Making it required seems OK. Yes The Codename field contains a name given to the release. In Debian systems, this field contains a name from a Toy Story character. Like Suite, it may have an optional suffix. Somehow, yes. It's a bit redundant with the initial sentence, so they have to be merged. That's a good idea. It should not have been in the draft. The current behavior is to request confirmation from the user in APT, I don't know what the others do. Yes. The statement for Description-md5 contains that, they could be merged. I don't know from when this field is, I just saw it.
I have now documented the Contents indices and the diffs
as well, mostly (sans the exact format we use for the
patches), and Translation indices. Now we're basically
only missing details, it is fairly complete otherwise
(i.e. we should have mentioned every file and field
currently in use, but may not have explained all of
them completely).
We now have documented
dists/$DIST/Release (and InRelease, Release.gpg)
dists/$DIST/$COMP/binary-$ARCH/Packages
dists/$DIST/$COMP/source/Sources
dists/$DIST/$COMP/Contents-$ARCH.gz
dists/$DIST/$COMP/i18n/{Index,Translation-*.bz2}
*.diff/Index *.diff/%Y-%m-%d-%H%M.%S.gz
The other Release files have been omitted, as they are not
used anywhere. We are only missing udeb content files and
packages files now, which are just small subsentences.
In a few months, I'd like to rework this in DocBook form,
and submit it to debian-policy for inclusion into official
Policy, as a sub-policy like copyright-format.
What's the opinion about the flat repository format, where you just have one directory with Release, Packages, Sources, and friends and no sub-directories? Should it be documented as well then? We would then have two kinds of documented repository formats: 1. Debian-style, with a pool (or similar) and a dists directory 2. Flat-style, with just one directory This should cover everything we currently support. Although I don't know much about how much stuff we support in flat directories WRT Translation, Contents, and diffs.
+++ Julian Andres Klode [2012-05-18 13:38 +0200]: I think reprepro is another? /usr/share/doc/reprepro/manual.html contains a 'repository basics' section which includes useful layout/format information. Wookey
Of course, I was just only talking about clients. When it comes to creating we probably have much more than 3 programs needing some knowledge of the repository format. We have dak, apt-ftparchive, reprepro, debian-cd, everyone's small script, mini-dinstall (the latter using flat repositories, if I am not mistaken).
Excerpts from Julian Andres Klode's message of Fri May 18 18:49:10 +0200 2012: Yes, looks fairly complete. The formatting is not consistent but that will have to be changed for docbook anyway. Also would need some proof-reading. If nothing else somebody should look in a few weeks from now if it still makes sense ;-) I put a link on the RepositoryHowto page for more exposure. I am not so sure documenting Debian installer files is tremendously useful. I don't think anyone outside Debian Installer team makes Debian Installer repositories and there are other aspects of Debian Installer that would need to be documented in order for it to be usable for 'outside' people in non-default configurations. Thanks Michal
Yes, and it will also be more readable then, than the current wiki version. I'd also like to hear bits from the launchpad team about their implementation and see whether they agree with everything then. We still need to at least document the udeb stuff, the images and other stuff is not relevant to the core format and probably defined by the d-i team anyway (and installed by-hand). The udeb stuff is relatively easy to document, as it just adds one new directory and one new filename for Contents files, so can be done in about two sentences (and udebs are uploaded by standard .changes with the rest of the package, and are thus standard).
I would like to see the flat-style repository documented too, since some of the derivatives in the Debian derivatives census use it and I would like to lint their apt repositories.
Julian Andres Klode <jak@debian.org> writes:
This describes repositories of the form
deb uri suite component [...]
There should be a mention of flat repositories of the form
deb url path/
This changes nothing for the contents of files but it does change their
location and I think it's worth mentioning how that sources.list entry
maps to a repository.
MfG
Goswin
I added (and others edited formatting a bit)
= Flat Repository Format =
A flat repository does not use the {{{dists}}} hierarchy of directories,
and instead places meta index and indices directly into the archive root
(or some part below it). In sources.list syntax, a flat repository is specified
like this:
{{{
deb uri directory/
}}}
Where {{{uri}}} specifies the archive root, and {{{directory}}} specifies the
position of the meta index and the indices relative to the archive root. In
Flat repositories, the following indices are supported:
* Packages (under the location {{{directory/Packages}}})
* Sources (under the location {{{directory/Sources}}})
!InRelease, Release, Release.gpg meta-information are supported as well. Diffs,
Translations, and Contents indices are not defined for that repository format.
Indices may be compressed just like in the standard Debian repository format.
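To make the mapping from a sources.list entry to repository files concrete, here is a Python sketch covering both the Debian-style and the flat form. It is an illustration under stated assumptions, not a description of what APT actually requests: `index_urls` and its `arch` parameter are hypothetical, bracketed options in the entry are not handled, only the uncompressed index names and "Release" are listed, and the example.org addresses in the usage note are placeholders.

```python
def index_urls(entry, arch="amd64"):
    """List the Release and index URLs implied by one sources.list entry."""
    parts = entry.split()
    if len(parts) < 3 or parts[0] not in ("deb", "deb-src"):
        raise ValueError("unsupported sources.list entry: %r" % entry)
    kind, uri, rest = parts[0], parts[1].rstrip("/"), parts[2:]
    index = "Packages" if kind == "deb" else "Sources"

    if rest[0].endswith("/"):
        # Flat repository: "deb uri directory/", no components allowed.
        if len(rest) > 1:
            raise ValueError("flat repositories take no components")
        directory = rest[0].strip("/")
        base = uri if directory in ("", ".") else uri + "/" + directory
        return [base + "/Release", base + "/" + index]

    # Debian-style repository: "deb uri suite component [...]".
    suite, components = rest[0], rest[1:]
    base = "%s/dists/%s" % (uri, suite)
    subdir = "binary-%s" % arch if kind == "deb" else "source"
    urls = [base + "/Release"]
    for component in components:
        urls.append("%s/%s/%s/%s" % (base, component, subdir, index))
    return urls
```

For example, `index_urls("deb http://example.org/repo packages/")` yields the Release and Packages URLs directly below `http://example.org/repo/packages`, while a Debian-style entry yields paths below `dists/`.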
I don't think defining sources.list syntax in a client-agnostic document is a good move. APT has the 'sources.list' manpage for it and other clients might or might not have different ways to specify repositories. (Besides that, it would be deb-src, too.) (and co) instead of Translation-en. For Contents I am not sure, but I think apt-file downloads these, too. (Not sure if this should be a reason to include it in a specification though, or just keep it as some legacy cruft around.) Diffs are supported by apt, but they will not be used if not in Release. (If no Release file is present, diffs will not be tried.) It's the same for the non-flat repository and true for other files as well - and should be a reasonable thing to allow clients to do. In that train of thought, I think it would be a good idea to require a repository to have a Release (or InRelease) file including all files [in their current state] composing this repository. They are easy to create and this way a client could stop guessing if they like to, possibly avoiding a lot of 404's. Best combined with a strong recommendation on signing them. Best regards David Kalnischkies P.S.: Could we please stop talking to three bugs and two mailinglists? Especially as [0] suggests it is the wrong list… [0] http://lists.debian.org/debian-devel/2012/05/msg00222.html