#1127087#5
Date:
2026-02-06 00:59:07 UTC
Hello,
I don't know which package is to blame for this issue, so I've assigned it to two packages intentionally.
I'm on Debian Trixie and wanted to try 'rdfpipe' to read RDFa from a web page. We have a downstream, Debian-specific manual page, rdfpipe(1), which says:
OPTIONS
	-i INPUT_FORMAT, --input-format=INPUT_FORMAT
	Format of the input document(s). Available input formats are: ..., rdfa, application/xhtml+xml, rdfa1.0, rdfa1.1, text/html, html

The manual page is dated 2013, and I see a lot has changed since then. However, it looks like RDFa in HTML is still supposed to work, although upstream has been substantially restructured and the parsing may now be offloaded to a plugin. As a hint of what is going wrong, one can try the following:
$ rdfpipe https://johnscott.me/index.xhtml
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/rdflib/plugin.py", line 134, in get
    p: Plugin[PluginT] = _plugins[(name, kind)]
                         ~~~~~~~~^^^^^^^^^^^^^^
KeyError: ('rdfa', <class 'rdflib.parser.Parser'>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/rdflib/graph.py", line 1497, in parse
    parser = plugin.get(format, Parser)()
             ~~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/rdflib/plugin.py", line 136, in get
    raise PluginException("No plugin registered for (%s, %s)" % (name, kind))
rdflib.plugin.PluginException: No plugin registered for (rdfa, <class 'rdflib.parser.Parser'>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/rdflib/plugin.py", line 134, in get
    p: Plugin[PluginT] = _plugins[(name, kind)]
                         ~~~~~~~~^^^^^^^^^^^^^^
KeyError: ('rdfa', <class 'rdflib.parser.Parser'>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/rdfpipe", line 8, in <module>
    sys.exit(main())
             ~~~~^^
  File "/usr/lib/python3/dist-packages/rdflib/tools/rdfpipe.py", line 199, in main
    parse_and_serialize(
    ~~~~~~~~~~~~~~~~~~~^
        args, opts.input_format, opts.guess, outfile, opts.output_format, ns_bindings
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3/dist-packages/rdflib/tools/rdfpipe.py", line 53, in parse_and_serialize
    graph.parse(fpath, format=use_format, **kws)
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/rdflib/graph.py", line 2295, in parse
    context.parse(source, publicID=publicID, format=format, **args)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/rdflib/graph.py", line 1507, in parse
    parser = plugin.get(format, Parser)()
             ~~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/rdflib/plugin.py", line 136, in get
    raise PluginException("No plugin registered for (%s, %s)" % (name, kind))
rdflib.plugin.PluginException: No plugin registered for (rdfa, <class 'rdflib.parser.Parser'>)


This is interesting, though, because rdfpipe was nevertheless smart enough to know that the 'xhtml' file extension meant it should parse the input as RDFa.
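
The two steps involved here, the suffix-based format guess and the parser plugin lookup, can be checked without fetching the page. The following is only a minimal sketch: it uses rdflib.util.guess_format and rdflib.plugin.get (the latter is the call that fails in the traceback), the helper name diagnose() is made up for illustration, and it says nothing about why the plugin is missing. If a plugin is registered but its module fails to import, the error from plugin.get is deliberately left to propagate.

```python
def diagnose(path):
    """Return (guessed_format, parser_found), or None if rdflib is not importable."""
    try:
        from rdflib import plugin
        from rdflib.parser import Parser
        from rdflib.util import guess_format
    except ImportError:
        return None

    # Suffix-based guess; for a '.xhtml' name the traceback suggests 'rdfa'.
    fmt = guess_format(path)
    try:
        # Same registry lookup that raises in graph.py in the traceback.
        plugin.get(fmt, Parser)
    except plugin.PluginException:
        return fmt, False
    return fmt, True


if __name__ == "__main__":
    print(diagnose("index.xhtml"))
```

Going by the traceback, an affected system should report the format as 'rdfa' with no parser found.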

The situation is confusing: I see that at some point RDFa support was split out into python3-pyrdfa, and that package then began to provide a plugin for python3-rdflib to invoke.
https://github.com/RDFLib/pyrdfa3/issues/33#issuecomment-689465980
https://github.com/RDFLib/rdflib/discussions/1582#discussioncomment-1879756
https://github.com/RDFLib/rdflib/commit/638a867168f05e2d3903f4a6e4ba9fa63807db6a


Maybe there's a reason why the plugin can't be discovered even when it's installed? Apparently there's build system magic that's supposed to help with this: https://packaging.python.org/en/latest/guides/creating-and-discovering-plugins/
I also notice that python3-pyrdfa depends on python3-html5lib, but python3-rdflib build-depends on and recommends the python3-html5rdf fork.

Also see https://github.com/RDFLib/rdflib/issues/2099#issue-1352359511

https://github.com/pangaea-data-publisher/fuji/issues/243 is a user experience report; it was said there that python3-rdflib version 7 should have the quirks sorted out.

This is as far as my search has taken me, but I don't know what most of it means. Maybe these leads will help someone get started on addressing this.