Lady’s Computer: MY computer over YOUR internets

Lady’s Weblog

Federated Tagging of Fanworks

Lady

Published: .

As recent conversations regarding a·i and racism in fandom continue to exert pressure on the idea that the Organization for Transformative Works is in any way a good steward of the 11 million fanworks on their website Archive of Our Own, some fans are again beginning to wonder if a centralized repository controlled by a single organization is truly the best solution for a community as diverse as fandom.

There are a few reasons why one might say it is, but in this post I want to confront one: tagging. Tags (short labels which have been “wrangled” and organized into a hierarchical concept scheme) are the life·blood of search, filtering, and discovery on A·O·3, and powering this system is a huge cadre of volunteers whose duty it is to ensure that each label an author assigns to their work is organized correctly. A·O·3’s tagging system isn’t perfect—far from it—but the fact remains that a small, independent fan repository is unlikely to ever be able to muster the level of volunteer effort needed to build a tagging system of comparable utility. So, the argument goes, posting works to A·O·3 is, at the very least, a necessary evil.

I think this argument suffers from a failure of imagination. Even if we agree on the premise that tag wrangling work is labour‐intensive and fannish needs are best accommodated by a large, centralized tag authority, there is absolutely no reason why that tag authority couldn’t be shared by multiple small, independent repositories instead of being entirely controlled by one big one. We just need a way to get the tags from point A to point B.

This blogpost describes one such way. It uses the Tagging vocabulary to describe and federate tags from one centralized authority—which could be a large, volunteer‐run tag‐wrangling service in the model of A·O·3—out to any number of independent repositories. It’s not necessarily a complete solution, and it’s unlikely that a system built with these technologies would work exactly like things do at A·O·3—but this is not necessarily a bad thing.

Tag Model

The Tagging vocabulary is an R·D·F ontology, which is to say it is a collection of terms and definitions which might be used to convey information between computers on the Web. It’s not a syntax or a protocol, although the conventions of the present day dictate that this information should probably be conveyed in J·son‐L·D via normal H·T·T·P·S GET requests. A tag might look as follows :⁠—

{ "@context": "https://ns.1024.gdn/Tagging/context.jsonld"
, "@id": "https://ladys.example/tag:ladys.example,2023-05-23:5NJ-8SKG"
, "@type": "SexualRelationshipTag"
, "prefLabel":
  { "literalForm":
    { "@value": "Shadow × O·C"
    , "@language": "en" } }
, "altLabel":
  { "literalForm":
    { "@value": "Sexual Relationship: Shadow the Hedgehog & Original Character"
    , "@language": "en" } }
, "involves":
  [ "https://ladys.example/tag:ladys.example,2023-05-23:N54-M773"
  , "https://ladys.example/tag:ladys.example,2023-05-23:V2G-36D8" ] }

This object describes the tag tag:ladys.example,2023-05-23:5NJ-8SKG—or, as humans might refer to it, ‹ Shadow × O·C ›. It is a sexual relationship tag involving two other tags (tag:ladys.example,2023-05-23:N54-M773 and tag:ladys.example,2023-05-23:V2G-36D8—presumably ‹ Shadow › and ‹ Original Character ›), and it provides both a preferred label and an alternate one. If ladys.example were a real tag authority and this were a real tag, any repository would be able to fetch this U·R·L and retrieve the tag, its labels, and its relationships.
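As a rough sketch of what a consuming repository might do with such an object, the Python below reads the example tag from a string and pulls out its labels and the tags it involves. It treats the J·son‐L·D as plain J·son in exactly the compact shape shown above, which is an assumption; a real consumer would probably want a proper J·son‐L·D processor and an actual H·T·T·P·S fetch. The names `as_list` and `summarize_tag` are hypothetical helpers, not part of the Tagging vocabulary.

```python
import json

# The example tag from above, as a repository might receive it.
TAG_JSON = """
{ "@context": "https://ns.1024.gdn/Tagging/context.jsonld"
, "@id": "https://ladys.example/tag:ladys.example,2023-05-23:5NJ-8SKG"
, "@type": "SexualRelationshipTag"
, "prefLabel":
  { "literalForm": { "@value": "Shadow × O·C", "@language": "en" } }
, "altLabel":
  { "literalForm":
    { "@value": "Sexual Relationship: Shadow the Hedgehog & Original Character"
    , "@language": "en" } }
, "involves":
  [ "https://ladys.example/tag:ladys.example,2023-05-23:N54-M773"
  , "https://ladys.example/tag:ladys.example,2023-05-23:V2G-36D8" ] }
"""

def as_list(value):
    """Compacted JSON-LD may give one bare item or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

def summarize_tag(tag):
    """Hypothetical helper: pick out the fields a repository is likely to need."""
    def labels(key):
        return [label["literalForm"]["@value"] for label in as_list(tag.get(key))]
    return {
        "id": tag["@id"],
        "type": tag["@type"],
        "preferred": labels("prefLabel"),
        "alternate": labels("altLabel"),
        "involves": as_list(tag.get("involves")),
    }

summary = summarize_tag(json.loads(TAG_JSON))
print(summary["preferred"][0], "involves", len(summary["involves"]), "tags")
```

Normalizing everything to lists up front means the rest of the repository code does not have to care whether a tag has one alternate label or several.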

If A·O·3 provided this kind of service, we could begin building fanfiction repositories which leveraged their tagging system today.

Works and Tag Labels

Of course, simply having dereferenceable tags on the internet is only half of the struggle. We also need a means of associating them with works. In the Tagging vocabulary, this happens through a layer of indirection, because the labels that authors provide often do not match up exactly with the tag authority’s preferred labels for the tag. The way this works is as follows :⁠—

  • The work itself has a tag label collection, probably an ordered one.

  • This tag label collection then contains tag labels, which provide the author‐assigned labels attached to the work.

  • Labels may then label a specific tag, linking to the formal definition of the tag in some authority.
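The three layers above can be sketched as a pair of small Python classes. This is only one possible in‐memory shape, assuming a repository which stores the tag label collection as an ordered list; the class and field names (`TagLabel`, `Work`, `literal_form`, `canonical_tags`) and the sample labels are mine, chosen for illustration, and are not terms from the Tagging vocabulary.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TagLabel:
    literal_form: str             # the label exactly as the author wrote it
    labels: Optional[str] = None  # URL of the canonical tag, if one is known

@dataclass
class Work:
    title: str
    # The tag label collection, kept in the order the author supplied.
    tag_labels: list[TagLabel] = field(default_factory=list)

    def canonical_tags(self):
        """URLs of the canonical tags this work's labels point at, in order."""
        return [label.labels for label in self.tag_labels if label.labels]

BASE = "https://ladys.example/tag:ladys.example,2023-05-23:"
work = Work("an example work", [
    TagLabel("shadow/oc", BASE + "5NJ-8SKG"),
    TagLabel("written at 4am"),  # a freeform label with no canonical tag
])
print(work.canonical_tags())
```

Keeping the author’s literal text separate from the canonical tag is what lets a repository display labels as written while still filtering by the authority’s tags.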

How all of this actually gets managed in practice is up to the repository in question; it’s certainly possible to use tags with a simpler (or more complex) model than this, and questions of how works themselves might get federated are a bit out·of·scope for this discussion. However, my own immediate target is building an Atom feed of fanfiction I’ve written; in this case, the tags might be added using AtomTriples like so :⁠—

<feed
  xmlns="http://www.w3.org/2005/Atom"
  xmlns:at="http://purl.org/syndication/atomtriples/1"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:t="https://ns.1024.gdn/Tagging/#"
  xmlns:xl="http://www.w3.org/2008/05/skos-xl#"
  xml:lang="en"
>
  <title>Lady’s Fanfiction</title>
  <link rel="alternate" type="text/html" href="https://fanfiction.ladys.example/"/>
  <updated>2023-05-23T12:53:37-07:00</updated>
  <author>
    <name>Lady</name>
  </author>
  <id>https://fanfiction.ladys.example/tag:fanfiction.ladys.example,2023-05-23:</id>
  <entry>
    <title>fucky wucky shadow daddy dom me</title>
    <link rel="alternate" type="text/html" href="https://fanfiction.ladys.example/tag:fanfiction.ladys.example,2023-05-23:EQW-D8HX"/>
    <id>https://fanfiction.ladys.example/tag:fanfiction.ladys.example,2023-05-23:EQW-D8HX</id>
    <updated>2023-05-23T12:53:37-07:00</updated>
    <summary type="text">
      big bad shadow the hedgehog thinks hes the ultimate lifeform. but
      hes actually kind of short. can he still dom a hot dyke like me?
    </summary>
    <at:md>
      <!-- The tag label collection for this entry -->
      <t:hasTagLabelCollection rdf:parseType="Resource">
        <t:hasTagLabelList rdf:parseType="Collection">
          <t:TagLabel xl:literalForm="fuuck me shaddw">
            <t:labels>
              <!-- Shadow × Original Character -->
              <rdf:Description rdf:about="https://ladys.example/tag:ladys.example,2023-05-23:5NJ-8SKG"/>
            </t:labels>
          </t:TagLabel>
          <t:TagLabel xl:literalForm="daddy th ehedgeog">
            <t:labels>
              <!-- Shadow -->
              <rdf:Description rdf:about="https://ladys.example/tag:ladys.example,2023-05-23:N54-M773"/>
            </t:labels>
          </t:TagLabel>
          <t:TagLabel xl:literalForm="me">
            <t:labels>
              <!-- Original Character -->
              <rdf:Description rdf:about="https://ladys.example/tag:ladys.example,2023-05-23:V2G-36D8"/>
            </t:labels>
          </t:TagLabel>
        </t:hasTagLabelList>
      </t:hasTagLabelCollection>
    </at:md>
  </entry>
</feed>

Note how each <t:TagLabel> above has an @xl:literalForm which supplies the author‐provided tag label, while also labelling an R·D·F resource which gives the canonical tag. This is obviously quite verbose (by nature of being X·M·L), and more sophisticated solutions will doubtless be developed in the future, but it is already good enough to start building things on top of.
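For a sense of how little is needed to read these labels back out, here is a minimal sketch using Python’s standard `xml.etree.ElementTree`. It is not an R·D·F/X·M·L parser: it simply walks the element structure used in the feed above, and it assumes that structure does not vary. The `tag_labels` function is a hypothetical name, and the sketch accepts either `rdf:about` or `rdf:resource` on the inner description so that it tolerates both spellings.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "t": "https://ns.1024.gdn/Tagging/#",
    "xl": "http://www.w3.org/2008/05/skos-xl#",
}

# A cut-down entry in the same shape as the feed above.
ENTRY = """
<entry xmlns="http://www.w3.org/2005/Atom"
  xmlns:at="http://purl.org/syndication/atomtriples/1"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:t="https://ns.1024.gdn/Tagging/#"
  xmlns:xl="http://www.w3.org/2008/05/skos-xl#">
  <at:md>
    <t:hasTagLabelCollection rdf:parseType="Resource">
      <t:hasTagLabelList rdf:parseType="Collection">
        <t:TagLabel xl:literalForm="me">
          <t:labels>
            <rdf:Description rdf:about="https://ladys.example/tag:ladys.example,2023-05-23:V2G-36D8"/>
          </t:labels>
        </t:TagLabel>
      </t:hasTagLabelList>
    </t:hasTagLabelCollection>
  </at:md>
</entry>
"""

def tag_labels(entry):
    """Yield (author label, canonical tag URL or None) pairs in document order."""
    for label in entry.iterfind(".//t:TagLabel", NS):
        literal = label.get("{%s}literalForm" % NS["xl"])
        target = label.find("t:labels/rdf:Description", NS)
        url = None
        if target is not None:
            # Accept either attribute spelling on the description element.
            url = (target.get("{%s}about" % NS["rdf"])
                   or target.get("{%s}resource" % NS["rdf"]))
        yield literal, url

pairs = list(tag_labels(ET.fromstring(ENTRY)))
print(pairs)
```

A feed reader which knows nothing about tags will ignore the `<at:md>` element entirely, so adding it should not get in the way of ordinary Atom consumers.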

Pulling in Tags

Simply having the ability to fetch tags on request is likely insufficient for complex tagging systems or large repositories. A tag authority could have thousands of tags, and a repository needs to be able to not only discover them all but also keep them up·to·date without individually fetching each one. Inspired by the I·I·I·F Change Discovery A·P·I, the current plan is for tag authorities to publish an ActivityStreams event log which chronicles their changes, which repositories can then pull to update their own models. The creation of a new romantic relationship tag tag:ladys.example,2023-05-23:$BF-7607 might look like this :⁠—

{ "@context": "https://ns.1024.gdn/Tagging/discovery.context.jsonld"
, "type":
  [ "TagActivity"
  , "Create" ]
, "context": "https://ladys.example/tag:ladys.example,2023-05-23:"
, "object": "https://ladys.example/tag:ladys.example,2023-05-23:$BF-7607"
, "endTime": "2023-05-23T19:33:33-07:00"
, "label": "Add Shadow ❤ O·C tag"
, "comment": "This is the romantic equivalent of the existing (sexual) Shadow × O·C tag."
, "states":
  [ { "predicate": "a"
    , "object": "RomanticRelationshipTag" }
  , { "predicate": "prefLabel"
    , "object":
      { "@value": "Shadow ❤ O·C"
      , "@language": "en" } }
  , { "predicate": "involves"
    , "object": "https://ladys.example/tag:ladys.example,2023-05-23:N54-M773" }
  , { "predicate": "involves"
    , "object": "https://ladys.example/tag:ladys.example,2023-05-23:V2G-36D8" } ] }

A change in the preferred label of that tag might look as follows :⁠—

{ "@context": "https://ns.1024.gdn/Tagging/discovery.context.jsonld"
, "type":
  [ "TagActivity"
  , "Update" ]
, "context": "https://ladys.example/tag:ladys.example,2023-05-23:"
, "object": "https://ladys.example/tag:ladys.example,2023-05-23:$BF-7607"
, "endTime": "2023-05-23T19:34:20-07:00"
, "label": "Add variation selector 15 after heart"
, "comment": "Helps to prevent emojification on mobile platforms."
, "unstates":
  [ { "predicate": "prefLabel"
    , "object":
      { "@value": "Shadow ❤ O·C"
      , "@language": "en" } } ]
, "states":
  [ { "predicate": "prefLabel"
    , "object":
      { "@value": "Shadow ❤\uFE0E O·C"
      , "@language": "en" } } ] }

Note that each activity unstates and states various triples, with the subject of each assumed to be the same as the object of the activity. As these are published in a chronological, paginated list, it is easy for consumers to skip activities that they have seen before. And because each activity identifies as its object the tag being changed, they can easily skip activities for tags they don’t care about, too.
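To make the consumer side concrete, here is a minimal sketch of a repository replaying these activities into a local set of triples. Several things here are assumptions of mine rather than settled design: the activities are represented as already‐parsed Python dicts (trimmed from the examples above), the “seen before” check uses `endTime` as a cursor, and the names `freeze`, `apply_activity`, and `sync` are hypothetical. A cursor based on `endTime` would mishandle two activities with identical timestamps, so a real implementation may prefer to track activity identifiers or page position instead.

```python
from datetime import datetime

TAG = "https://ladys.example/tag:ladys.example,2023-05-23:$BF-7607"

# Trimmed versions of the two activities above, as Python dicts.
ACTIVITIES = [
    {"type": ["TagActivity", "Create"], "object": TAG,
     "endTime": "2023-05-23T19:33:33-07:00",
     "states": [
         {"predicate": "a", "object": "RomanticRelationshipTag"},
         {"predicate": "prefLabel",
          "object": {"@value": "Shadow ❤ O·C", "@language": "en"}}]},
    {"type": ["TagActivity", "Update"], "object": TAG,
     "endTime": "2023-05-23T19:34:20-07:00",
     "unstates": [
         {"predicate": "prefLabel",
          "object": {"@value": "Shadow ❤ O·C", "@language": "en"}}],
     "states": [
         {"predicate": "prefLabel",
          "object": {"@value": "Shadow ❤\uFE0E O·C", "@language": "en"}}]},
]

def freeze(value):
    """Turn a language-tagged literal into a hashable (value, language) pair."""
    if isinstance(value, dict):
        return (value["@value"], value.get("@language"))
    return value

def apply_activity(triples, activity):
    """Remove the unstated triples, then add the stated ones."""
    subject = activity["object"]  # the tag being changed is the implied subject
    for statement in activity.get("unstates", []):
        triples.discard((subject, statement["predicate"], freeze(statement["object"])))
    for statement in activity.get("states", []):
        triples.add((subject, statement["predicate"], freeze(statement["object"])))

def sync(triples, activities, last_seen=None, wanted=None):
    """Replay activities newer than last_seen; return the new cursor."""
    for activity in activities:
        end = datetime.fromisoformat(activity["endTime"])
        if last_seen is not None and end <= last_seen:
            continue  # already processed on an earlier pull
        last_seen = end
        if wanted is None or activity["object"] in wanted:
            apply_activity(triples, activity)
    return last_seen

triples = set()
cursor = sync(triples, ACTIVITIES)
print(len(triples), "triples known after sync")
```

Passing the returned cursor back in as `last_seen` on the next pull makes the replay a no‐op for anything already applied, and passing a set of tag U·R·Ls as `wanted` skips changes to tags the repository does not use.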

Implementation Plans

I’m currently working on a library for processing tag streams and a command‐line tool for generating them. Because this is a pull‐based model, it should be pretty easy to get a proof‐of‐concept up that is backed by static files and a minimal Caddy configuration. Then I will attempt to create a simple web viewer for browsing the tags, as well as start publishing works which use them. I’ll be sure to post updates to this blog when·ever any of this happens!

A lot of the initial discussion for this work has taken place on the Fandom Coders Discord server. Shoutout to the people there for putting up with my many 4 A·M Freeform diagrams and rants about ActivityStreams.