Requirements for fannish resource identifiers
Published: .
The following blog post is a summary of a discussion held in the Fandom Coders Discord about I·D requirements for various types of fannish resources, and how these things might federate out or be handled by other services. Our goal is to create a decentralized network of fannish platforms, so figuring out resource identification requirements is an important first step.
Note that in the discussion which follows, a “resource” might be a work, an author, a tag, a bookmark, or something else… anything which might be a metadata subject.
① Resources should have Tag U·R·I’s.
Resource Tag U·R·I’s should be U·R·I’s of the form:
tag:<domain>,<date>:<path>
where <domain> is the domain name of a site, <date> is some date (in YYYY-MM-DD format), and <path> is some path decided by the person or people who owned <domain> at <date> to uniquely identify the resource. Tag U·R·I’s are ideal for fannish resources for the following reasons:
-
In order for a fannish resource to be published on the internet, it must be published at a domain on a date. So these requirements are easily satisfied.
-
No external registration (beyond owning a domain name) is necessary to mint U·R·I’s, and no maintenance is necessary.
-
The domain name in the Tag U·R·I indicates who should be the trusted party when it comes to information about the resource: <domain>. If you hear about the resource from somewhere else, you know to view the information you receive with some level of suspicion.
Some additional notes:
-
The term “Tag U·R·I” has no relation to the normal fannish use of “tag”; it’s just what the U·R·I scheme happens to be called.
-
<date> does not (and maybe should not) have to be the actual date a resource was created. My recommendation would be to set <date> to the date that a service was founded, so that e·g if a service dies and a new one is started at the same domain, the two generate clearly distinguishable U·R·I’s.
-
It’s not possible to distinguish between beneficial reasons for content changes at <domain> (an author editing a work) and malicious ones (hostile domain takeover). It’s also not possible to verify that the people at <domain> actually controlled the domain at <date>. But if people play by the rules, an accidental name collision will never happen.
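As a rough illustration of the form described above, here is a minimal Python sketch for minting and parsing Tag U·R·I’s. The names `TagURI`, `parse_tag_uri`, and `mint_tag_uri`, along with the example domain and path, are my own assumptions for illustration. The pattern is deliberately simplified: it accepts only the full YYYY-MM-DD date form and does not check the path against the character rules in RFC 4151.

```python
import re
from typing import NamedTuple


class TagURI(NamedTuple):
    """The three parts of a Tag URI as described in this post."""
    domain: str
    date: str
    path: str

    def __str__(self) -> str:
        return f"tag:{self.domain},{self.date}:{self.path}"


# Simplified pattern (an assumption, not the full RFC 4151 grammar):
# a domain name, a full YYYY-MM-DD date, then any non-empty path.
_TAG_PATTERN = re.compile(
    r"tag:(?P<domain>[A-Za-z0-9.-]+),(?P<date>\d{4}-\d{2}-\d{2}):(?P<path>.+)"
)


def parse_tag_uri(uri: str) -> TagURI:
    """Split a Tag URI into its parts, lowercasing the domain."""
    match = _TAG_PATTERN.fullmatch(uri)
    if match is None:
        raise ValueError(f"not a Tag URI of the expected form: {uri!r}")
    return TagURI(match["domain"].lower(), match["date"], match["path"])


def mint_tag_uri(domain: str, founded: str, path: str) -> str:
    """Mint a Tag URI using the date the service was founded."""
    return str(parse_tag_uri(f"tag:{domain},{founded}:{path}"))


# Hypothetical service founded on 2020-01-01 at archive.example.
print(mint_tag_uri("archive.example", "2020-01-01", "works/1234"))
```

Passing the founding date rather than the publication date follows the recommendation above: every resource on the service shares one date, and a later service at the same domain would mint with a different one.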
② Resources should have canonical U·R·L’s containing their Tag.
The canonical U·R·L for a resource should look like this:
https://<domain>/<subpath>/tag:<domain>,<date>:<path>
There are a few important things of note here:
-
Both instances of <domain> must be the same, or else the U·R·L is not canonical.
-
The entire Tag U·R·I is present in the U·R·L, allowing it to be identified even if the U·R·L ceases to be dereferenceable.
-
<path> may contain anything, including a query or fragment part.
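The canonicality rule can be sketched in a few lines of Python. The function names and example U·R·L’s below are hypothetical. I am also assuming that the Tag U·R·I starts at the first `/tag:` in the U·R·L and runs to the end, so that a query or fragment part stays attached to it.

```python
from typing import Optional
from urllib.parse import urlsplit


def tag_domain(tag_uri: str) -> str:
    """Return the <domain> part of a Tag URI (the text before the comma)."""
    return tag_uri[len("tag:"):].split(",", 1)[0].lower()


def canonical_url(subpath: str, tag_uri: str) -> str:
    """Build the canonical URL for a Tag URI under the given subpath."""
    parts = [part for part in (subpath.strip("/"), tag_uri) if part]
    return f"https://{tag_domain(tag_uri)}/" + "/".join(parts)


def embedded_tag(url: str) -> Optional[str]:
    """Extract the Tag URI from a URL, or None if there is none.

    Assumption: everything from the first "/tag:" onward is the Tag URI,
    including any query or fragment part.
    """
    index = url.find("/tag:")
    return url[index + 1:] if index != -1 else None


def is_canonical(url: str) -> bool:
    """A URL is canonical only if its host matches the Tag URI's domain."""
    tag = embedded_tag(url)
    host = urlsplit(url).hostname
    return tag is not None and host is not None and host == tag_domain(tag)


tag = "tag:archive.example,2020-01-01:works/1234"
print(is_canonical(canonical_url("works", tag)))                # True
print(is_canonical(f"https://mirror.example/cache/{tag}"))      # False
```

Because `embedded_tag` works on the raw string, the Tag U·R·I can still be recovered from a U·R·L that no longer resolves.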
It is possible for resources to be mirrored. Mirrors must have U·R·L’s like the following:
https://<mirror-domain>/<mirror-subpath>/tag:<domain>,<date>:<path>
that is, the same easily-recognizable Tag U·R·I, but at a different domain and subpath. Mirrors must identify the canonical U·R·L of the resource they are mirroring. owl:sameAs might be one mechanism for doing this in R·D·F.
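To make the owl:sameAs suggestion concrete, here is a small Python sketch that builds a mirror U·R·L and emits a single N-Triples statement pointing back at the canonical U·R·L. The helper names and example domains are assumptions of mine, and whether owl:sameAs is the right term is still an open question, as noted above. The sketch rejects characters that cannot appear in an N-Triples I·R·I rather than trying to escape them.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Characters that may not appear unescaped inside <...> in N-Triples.
_FORBIDDEN = set('<>"{}|^`\\ ')


def mirror_url(mirror_domain: str, mirror_subpath: str, tag_uri: str) -> str:
    """Build a mirror URL carrying the original, unchanged Tag URI."""
    parts = [part for part in (mirror_subpath.strip("/"), tag_uri) if part]
    return f"https://{mirror_domain}/" + "/".join(parts)


def same_as_statement(mirror: str, canonical: str) -> str:
    """Return one N-Triples line stating that the mirror is the same
    resource as the canonical URL."""
    for iri in (mirror, canonical):
        if _FORBIDDEN.intersection(iri):
            raise ValueError(f"cannot serialize IRI without escaping: {iri!r}")
    return f"<{mirror}> <{OWL_SAME_AS}> <{canonical}> ."


tag = "tag:archive.example,2020-01-01:works/1234"
canonical = f"https://archive.example/works/{tag}"
mirror = mirror_url("mirror.example", "cache", tag)
print(same_as_statement(mirror, canonical))
```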
③ Crossposted resources should link to each other.
If a work is crossposted in two locations, one is not necessarily “canonical” and the other a “mirror”. Likely, both will be canonical and have their own Tag U·R·I’s (and this is a good thing). Crossposted works should instead identify themselves by linking to each other in some reciprocal fashion. We may need to come up with our own metadata term for specifying this, but see e·g dcterms:hasFormat and dcterms:isFormatOf which encode a similar (but not necessarily reciprocal in the same way) relationship.
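One way a discovery platform might check the reciprocal requirement is sketched below in Python. The data shape is an assumption: a mapping from each work’s Tag U·R·I to the set of Tag U·R·I’s it claims as crossposts, however that claim ends up being expressed in metadata. A crosspost only counts as confirmed when both works claim each other.

```python
from typing import Dict, FrozenSet, Set


def reciprocal_crossposts(
    claims: Dict[str, Set[str]]
) -> Set[FrozenSet[str]]:
    """Return the pairs of works that each list the other as a crosspost.

    One-sided claims are ignored, since anyone can claim to be a
    crosspost of someone else's work.
    """
    confirmed: Set[FrozenSet[str]] = set()
    for work, others in claims.items():
        for other in others:
            if other != work and work in claims.get(other, set()):
                confirmed.add(frozenset((work, other)))
    return confirmed


# Hypothetical works on three different services.
a = "tag:archive.example,2020-01-01:works/1234"
b = "tag:journal.example,2018-06-15:posts/42"
c = "tag:copycat.example,2023-03-03:stolen/1"

claims = {a: {b}, b: {a}, c: {a}}
print(reciprocal_crossposts(claims) == {frozenset((a, b))})  # True
```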
④ Platforms should only trust mirrors as a last resort.
And with copious warnings. If at all possible, platforms should direct users to the canonical U·R·L associated with a resource. However, this may not be possible (if an archive moves or goes down). In that case, a platform may direct users to a mirror, with a warning that the mirrored version is not the original published work and may differ in significant ways.
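The fallback rule might look something like the following Python sketch. The `Referral` type, the warning text, and the idea that the caller already knows whether the canonical U·R·L is reachable are all assumptions; how a platform actually tests reachability is left out.

```python
from typing import NamedTuple, Optional, Sequence

MIRROR_WARNING = (
    "This is a mirrored copy. It is not the original published work "
    "and may differ in significant ways."
)


class Referral(NamedTuple):
    """Where to send the user, plus a warning to show (if any)."""
    url: str
    warning: Optional[str]


def choose_referral(
    canonical_url: str,
    canonical_reachable: bool,
    mirror_urls: Sequence[str],
) -> Optional[Referral]:
    """Prefer the canonical URL; fall back to a mirror only with a warning."""
    if canonical_reachable:
        return Referral(canonical_url, None)
    if mirror_urls:
        # Last resort: the first known mirror, always with the warning.
        return Referral(mirror_urls[0], MIRROR_WARNING)
    return None  # Nothing trustworthy to point at.


tag = "tag:archive.example,2020-01-01:works/1234"
canonical = f"https://archive.example/works/{tag}"
mirrors = [f"https://mirror.example/cache/{tag}"]
print(choose_referral(canonical, False, mirrors).url)
```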
Additional thoughts.
These things were either only briefly touched on, or else are my own ideas which came as I was writing this post.
-
Mirroring should be explicitly opt·in, and ideally automated (to reduce the likelihood of intentional or unintentional error). We will need to develop protocols for this.
-
For added security, publishing platforms might implement Webfinger, to guard against mirrors which correctly identify works they control but misidentify their path (thus making them appear to be down). Discovery platforms may, and probably should, attempt to make a Webfinger request for the resource with its Tag U·R·I instead of trusting the canonical path. However, supporting Webfinger should not be required of all publishing platforms, and the attack vector from mirrors in this sense is pretty small.
-
Webfinger or ordinary H·T·T·P redirects could be used to forward services to new “canonical” U·R·L’s in the case that a service moves. However, this trail would only be followable for as long as the redirects or Webfinger endpoint remain up at the original domain.
-
Instead of mirroring tags, a service might indicate that its version of a tag is intended to be synonymous with another service’s version of a tag using skos:closeMatch. The stronger statement skos:exactMatch requires agreement from both services. Tag mirrors are useful in case the canonical service for a tag goes down, but should not be relied upon otherwise.
-
Publishing platforms may serve a “tombstone” at the canonical U·R·L for a resource, indicating that it was intentionally deleted. In this case, a mirrored version must not be used.
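For the Webfinger idea above, a discovery platform could derive the lookup U·R·L directly from the Tag U·R·I, as in this Python sketch. The `/.well-known/webfinger` path and the `resource` query parameter come from the Webfinger specification (RFC 7033). The function name is hypothetical, and whether a given publishing platform answers queries for Tag U·R·I’s is an assumption.

```python
from urllib.parse import urlencode


def webfinger_url(tag_uri: str) -> str:
    """Build the Webfinger query URL for a Tag URI.

    The request goes to the domain named in the Tag URI itself, not to
    whatever host a mirror claims is canonical.
    """
    if not tag_uri.startswith("tag:") or "," not in tag_uri:
        raise ValueError(f"not a Tag URI: {tag_uri!r}")
    domain = tag_uri[len("tag:"):].split(",", 1)[0].lower()
    # urlencode percent-encodes the Tag URI for use as a query value.
    query = urlencode({"resource": tag_uri})
    return f"https://{domain}/.well-known/webfinger?{query}"


print(webfinger_url("tag:archive.example,2020-01-01:works/1234"))
# https://archive.example/.well-known/webfinger?resource=tag%3Aarchive.example%2C2020-01-01%3Aworks%2F1234
```

The response, if any, would then tell the platform where the resource currently lives, independently of any path a mirror has reported.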