Comments in Tags: Examining Bookmarking Cultures on AO3

This paper examines the bookmarking tags present on Archive of Our Own (AO3) through a study of the practices visible in their public annotations. The probe examines the presence of cultures of practice and the usable research data available on the platform. Examined fics had a minimum of 25,000 words and seven chapters, published over at least four months. In addition, the selection was limited to works with 300 or more bookmarks, drawn from disparate fandoms. Bookmarks with relevant content averaged 11% of the total, a share large enough to point to general trends within the archive. The probe revealed several layers of communication, in the forms of interactive tags, pure text commentary, and connections to larger collections. Each individual bookmark is classed, based on its content, as “Annotation”, “Curation” or “Communication” for purposes of analysis. These categories also pointed to annotation practices that target specific, individual audiences. The material shows several different trends in users’ application of the bookmark function. These trends of practice go beyond individual fandoms, pointing to cultures pervasive on AO3 as a platform. The probe also presents the difficulties this brings to the study of bookmark data.


Introduction
The purpose of this paper is to examine the bookmarking tags present in the "Archive of Our Own" (AO3) repository. The inspiration for this study is a pilot study conducted in the spring of 2021 (Gyhagen, 2021), where user practices were studied. The respondents of that pilot reported several different motivations for using the bookmark function on AO3, sparking an interest in closer examination of that specific data. The examination is focused on the idea that the freely taggable and annotatable bookmark feature might serve as a passive, localized channel for communicating with the creator directly, as an alternative to the more public comment options. This seemed especially likely, as AO3 does not have a direct messaging function. In short, the paper examines a curated selection of bookmark sets from the archive. The selection is intended to gauge whether there is, in fact, any user/creator/work interaction present in them, and to map any other trends present.
First, the inspiration for the study and its initial assumptions are briefly summarized. Following this, the criteria and methods for the collection of data are presented, along with the reasoning for the collection methods used. The data itself is split and presented in two sections: first the broad numerical data describing the fics and their bookmarks, then a section where the actual content of the text material is given a deeper reading and placed in context as bookmarking practices. The intention is to first examine the trend over works in general, and then to give more detail on the nature of bookmarks in the archive. A concluding discussion addresses the nature of the data collected and the possible applications it presents, along with a few reflections on challenges and possible points of error in the study. In closing, in the form of a postscript, there are some thoughts on specific derived studies and the difficulties in this work.

Background
The fan fiction community at large is built by creators and consumers in tandem (De Kosnik, 2016). This creates a sense of ownership of the community based on mutual investment. This phenomenon applies both to the writing of fics and to the shaping of repositories. Interactions with the fics are functionally synonymous with interaction with the creators, and by extension the fan community at large. The nature of AO3's functionality, and specifically the decision not to implement direct messaging in the design of the platform, requires any interaction to be presented in a public format.
A pilot study was conducted in early 2021, to track user behavior across a period of time through the use of research diaries, with accompanying interviews.
In the interviews, the respondents reported several different observations on their own bookmarking practices. One user described their use of bookmarking as a reminder to revisit the work for their own sake. A different respondent reported using bookmarks to store recommendations they had received, and those they intended to share. In fact, this second respondent went so far as to seek out recommendations by visiting the bookmark collections of creators whose fics they enjoyed. This probe is spun off from that study, where these diverging uses of bookmarks were highlighted as a point of specific interest (Gyhagen, 2021).

Method
The criteria for selecting fics for examination were designed to produce data from fics that were (1) widely read and interacted with [1], (2) published over a significant time period, and (3) representative of disparate fandoms. The selection was based on the number of words in the fic (25,000 and above), assumed to be an indicator of the number of published chapters, as chapter count is not a searchable parameter. The number of chapters was taken to indicate the period of publication for the work, and the lowest number of chapters in the selection is 7. The selection was also limited to fics tagged as "finished" [2]. In addition, the selection was limited to works with 300 or more bookmarks, to increase the probability of relevant data. After selection, the fics all had a publication period spanning at least four months. They were explicitly selected to reflect, as far as possible, unrelated fandoms, with the intention of reaching a wide range of practices. Given the prevalence of overlap between fandoms (Lulu, 2013), the overlap among the selected fics is difficult to gauge, even with these active criteria. Narrowing the selection through further criteria was deemed unnecessary for the limited scope of this examination. The bookmarks were manually collected and processed in spreadsheets. Given the relative size of the expected data and the uncertain content of the bookmarks, this was preferred over automated aggregation.
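The selection criteria can be summarized as a simple filter over an ordered list of search results. The following is a minimal sketch in Python, not the procedure actually used, since the selection in this study was made by hand. The `Work` record, its field names, and the `select_works` function are hypothetical, and the rule of keeping one work per fandom is this sketch's reading of the aim of reflecting unrelated fandoms. The four-month publication span is not part of the filter, as it was observed after selection.

```python
from dataclasses import dataclass


@dataclass
class Work:
    # Hypothetical record; the field names are illustrative, not AO3's.
    title: str
    fandom: str
    words: int
    chapters: int
    bookmarks: int
    complete: bool


def select_works(results, limit=7, min_words=25_000, min_chapters=7,
                 min_bookmarks=300):
    """Walk results in order and keep the first qualifying work per fandom."""
    selected = []
    seen_fandoms = set()
    for work in results:
        if len(selected) == limit:
            break
        meets_criteria = (
            work.complete
            and work.words >= min_words
            and work.chapters >= min_chapters
            and work.bookmarks >= min_bookmarks
        )
        if not meets_criteria:
            continue
        # Skip later results from a fandom that is already represented.
        if work.fandom in seen_fandoms:
            continue
        seen_fandoms.add(work.fandom)
        selected.append(work)
    return selected
```

The thresholds are keyword arguments so that a follow-up study could loosen or tighten them without changing the logic.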
The bookmarks in the material were created both during and after the fics' publication periods. This was intentional, so as to include user interaction from those who discovered the work after the end date. It was also in part because the variations on "completed" tags could not be dated, and may have been added or modified long after the work's publication period.
For this shallow exploration, seven fics were deemed enough to produce useful data. The fics selected for examination were: "Mudsnake", "a prayer for which no words exist", "Where the Cliff Greets the Sea", "Fools Gold", "Infamia", "And Baby Makes Eight", and "Superman". These were manually selected from the first results pages generated by AO3's internal search engine, while logged in as a registered user. The results page used the default sorting, "Best Match", and fics were selected in descending order from the first page of results presented. Where a fandom was repeated among the top fics, the results after the first were skipped.
[1] Many views and kudos.
[2] Works on AO3 can be marked as "finished" by the creator, to indicate that no updates can be expected. This label exists at a higher level than the content tags.
Table 1 presents overarching metadata for the fics used in this study. It shows how each work relates to the selection criteria, and to the rest of the source material. All figures are based on the manual collection on December 21, 2021.
Table 2 presents the entirety of the bookmark material, separated by category. The three main categories for relevant content are grouped based on their different levels of interaction. "Tags" are interactable tags within the bookmarker's own collection. These mimic the function and structure of the tags in the archive, but only generate recall locally. "Collections" are bookmarks added to externally curated bookmark collections, moderated or unmoderated. "Free text" entries are purely non-interactable notes, generally presented in the form of commentary text. The fourth group, "Blank", is a purely referential category denoting those bookmarks in the material without content, included to account for the fact that several bookmarks across all fandoms fall into more than one category.
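The grouping used for Table 2 can be expressed as a small classification routine. The following is a minimal sketch in Python, assuming each collected bookmark is stored as a record with a list of tags, a list of collections, and a note. The `Bookmark` class and its field names are hypothetical and do not correspond to any AO3 export format. Because the groups overlap, the function returns a set rather than a single label, and "Blank" is assigned only when nothing else applies.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    # Hypothetical record of one manually collected bookmark.
    tags: list = field(default_factory=list)
    collections: list = field(default_factory=list)
    note: str = ""


def content_types(bookmark):
    """Return the set of content groups a bookmark falls into."""
    types = set()
    if bookmark.tags:
        types.add("Tags")
    if bookmark.collections:
        types.add("Collections")
    if bookmark.note.strip():
        types.add("Free text")
    # A bookmark with no content of any kind is counted as "Blank".
    return types or {"Blank"}


def relevant_share(bookmarks):
    """Fraction of bookmarks that carry any content at all."""
    if not bookmarks:
        return 0.0
    with_content = sum(1 for b in bookmarks if content_types(b) != {"Blank"})
    return with_content / len(bookmarks)
```

Counting each bookmark once in `relevant_share`, regardless of how many groups it belongs to, keeps the overlap between groups from inflating the share of bookmarks with content.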

Findings
The findings presented are taken from the bookmarks with relevant content (Table 2, above). Content data points are present in 11% of the total bookmarks. This ratio matched expectations going into the study, and was judged prevalent enough for analysis. The number of relevant data points shows content to be a substantial factor across the total bookmarks. The even spread also gives the impression of a general trend being present across the broader archive. The bookmarks are separated into three distinct categories: "Annotation", "Curation" and "Communication". These categories are present across all of the material. They serve to give an organized sense of the nature of the bookmarks in the data.
"Curation" tags and comments are intended in large part as "reminder" text. This also includes actively retrievable information, such as interactable tags. They are more descriptive and specialized in nature, intended for users' own functional retrieval, and also include most collection tags. On the whole they refer to the content in general terms and to the bookmarker's intended use. Examples are tags such as "Rec", "Fav", and "To Read". They also refer to private, usually moderated collection links, such as "Teen Wolf Recs" and "Reading", both from "Superman" bookmarks.
"Communication" bookmarks are generally much broader. They are directly meaningful commentary on the work and the bookmarker's experience. The group includes more niche and unwrangled [6] tags than the rest of the material. These are tags which may not see wide use outside of the specific fandom, or even beyond the individual user's collection. This occurs across both the tags and the free text entries. Considering their active/inactive nature, it is worth examining them separately, but the relationship and overlap between the two groups is notable. For simplicity, these two groups can be separated into "Additional content" and "Open commentary".
[6] "Wrangling" is the term used internally within AO3 to denote tags that have been curated by volunteers, so-called "tag wranglers" (AO3: "Archive FAQ > Tags"). "Unwrangled tags" refers to tags which exist outside of this curated system.
Tags which fall under the additional-content function span a wide spectrum of content, usually relating to specific tropes of character roles ("Antagonist Ron Weasly"), plot themes ("Angst with Happy Ending", "Idiots in Love"), and general content categories such as "plotty porn". These tags are spread across both wrangled and unwrangled tags. The wrangled tags, suggested by the system, generally capitalize all words in a tag spanning several words, whereas the tags created by the user can appear in a variety of text formats (Price & Robinson, 2020, p. 328). While this distinction is less useful when it comes to extracting the wrangled tags, it serves a purpose in that it gives a picture of those tags in the data which are wholly generated by the users.
The open commentary tags in the category are much broader, in the sense that they are freer and less uniform in their construction. They are at the same time narrower, in the sense that they do not seem intended to generate any form of recall. Rather, they seem to function as commentary on the work or the reader's experience of the fic.
In the free text commentary there are also examples of referential, and even archival, tags. This last group is grouped as "Referential", rather than "Archival", as they do not contain interactable tags or other active elements. "@ch 7", "rereader", "To Read" are examples of purely referential notations, marking the fics' archival status and the reader's progress with them. "Cute Spider-Man au", "Gladiator John, Emperor Sherlock, To Read" are notes which overlap in a direct, meaningful way with the interactable tags. It can likely be assumed these are used for similar purposes, as index markers for the reader's own recall.
Beyond these, though, the "communication" content bookmarks are much more prominent in this data. They usually carry some degree of affect, and are often seemingly directed towards an outside observer, whether the creator of the work or other readers. Several of the tags are commentaries on the bookmarker's own experience with the work, directed at other users. There are also critiques seemingly intended for the creator directly. These commentaries take the form of both negative ("Bare bones of a story, would be great if it were fleshed out or continued.") and positive ("screaming. I love this story") feedback.

"The author claims that this fic was supposed to have "cute and consequences". Hoo boy, does it ever. The story starts with Maria Hill finding out she's pregnant with a now-deceased Phil Coulson's baby, and just keeps going from there."
Finished. The one with Hermione fooling people into thinking she's a halfblood on Snape's advice with a lie and she gets adopted by him. Not the one where she specified the Dagworth-Grangers although her friends make assumptions and guess Also, Hermione and Pansy love chocolate and the House heads make bets on Sorting
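
The capitalization pattern noted earlier in this section, where system-suggested tags generally capitalize their words while user-created tags vary in format, can be turned into a rough screening heuristic. The following Python sketch is an illustration only. The list of minor words is an assumption introduced here to accommodate tags such as "Angst with Happy Ending", and the function names are hypothetical. The check cannot confirm that a tag is wrangled, since a user can type a capitalized tag such as "To Read" by hand. It only flags tags whose formatting suggests free entry.

```python
# Short words that title-style tags often leave lowercase (an assumed list).
MINOR_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to", "with"}


def looks_system_style(tag):
    """Rough check: every word capitalized, except minor words after the first."""
    words = tag.split()
    if not words:
        return False
    for index, word in enumerate(words):
        if index > 0 and word.lower() in MINOR_WORDS:
            continue
        if not word[0].isupper():
            return False
    return True


def user_styled(tags):
    """Tags whose formatting suggests they were typed freely by the bookmarker."""
    return [tag for tag in tags if not looks_system_style(tag)]
```

A screening of this kind would only narrow down candidates for the qualitative reading, not replace it.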

Conclusion
The categorizations suggested here are, in a sense, defined by the intended audience: the "curation" tags are intended to be read systematically, the "annotation" tags only, or mainly, by the bookmarkers themselves, and the "communication" tags by secondary viewers. Through this audience-based perspective we can observe some concrete trends within the bookmarking practices.
The free text entries in particular, such as "Bare bones of a story, would be great if it were fleshed out or continued." and "screaming. I love this story", seem meant to be read as direct feedback and communication directed at the creator.
It is of note that several of the bookmarks were created after the work was completed. In the context of this probe, these should likely be read as reflecting a passive interaction rather than an active one. This applies especially to the several bookmarks that were created months, or even years, after the final published chapter. While these cannot affect the writing of the work in progress, they are included here because they might still affect the tagging of the fic post-publication. This is especially the case for the "curation" notes. This dynamic archiving is available to all works on AO3, and it is not unreasonable to envision a creator monitoring this "hidden feedback" for possible improvements to their indexing. This is likely to be the case especially for less system-savvy creators.
This democratization of the author/reader dynamic is something generally unavailable in traditional reader culture. The dynamic arises from the publication form of longer fics, since they are essentially published in serial form, frequently in unstructured or semi-structured release formats. This posting of chapters opens an arena for audience interaction. It is important to note that creators do not receive any direct notification when a work is bookmarked (AO3 "Archive FAQ > Bookmarks"). It is unclear how widely known this is to bookmarkers across the platform, making it difficult to give a clean reading of the true level of communication reaching the creator.
A tool in reading this material is the connection fandom culture has to the culture of Tumblr. On that platform there is a well-established culture of using functional tags to directly comment on, or add to, content (Brett & Maslen, 2021). The two motivations, commenting and adding, are not mutually exclusive, but they generate very different readings of the intended message in the Communication tags. One reads as a desire to contribute to the metadata, i.e. the tag pool, of the fic and other similar works. The other is intended as direct or indirect communication, and may therefore belong more in the categorization applied to the free text commentary.
Crucially, the probe establishes that there is a substantial amount of potentially valuable data in the material. While the data available in the bookmarks is plentiful, the scope of this study only scratches the surface of the possibilities.

Limitations, Challenges, and Future Work
There are several readings available of the data presented in the AO3 bookmarks. Many of them seem to support the viability of scraping this content. The most obvious continuation would be an expansion of the study presented here. The data gathering could be expanded to include a much larger set of fics. This might require the use of a scraper or another automated resource, and would likely still need a qualitative reading of the contents at some level of the analysis.
A study of incomplete works could be fruitful, and would likely yield different findings than those presented here. The number of post-publication tags is likely a consequence of the selection criteria applied here. It could also have been reduced by focusing on more recently completed fics, but that level of refinement in selection was deemed to be beyond the scope of this study.
Creators do not get notifications when a work is bookmarked, nor of the content of these bookmarks. This could reduce the effect of the annotations on the evolution of the fic. On the other hand, it is also plausible that the creator will seek out interactions with the fic as it is being written, for inspiration and encouragement. Comparing the tagging to the actual comments on the individual fics or chapters as they are published could give some insight into the question this study leaves unanswered, namely whether the interactions affect the writing or tagging of the fic.
One specific event occurred in the data collection phase of the study which warrants specific mention. While curating the tags for "And Baby Makes Eight", the content of one of the bookmarks changed entirely, from "RE-READ LOVE IT" to "READ √ STEVE AND MARIA LOVE IT". This event raised the possibility that content can be edited by the bookmarking user without any indication to the viewer. This makes tracking the chronology of the bookmarks, as it relates to the active writing of the individual fic, extremely difficult. This factor also goes a long way towards explaining the seeming inconsistency of the "Finished" and "WiP" style tags, which appear in the material seemingly out of chronological order. It is likely that they are edited, possibly several times, over the life of the fic. These changes to the bookmarks over time could, however, serve as an entry point for studying the practice of a smaller fandom or a subset of dedicated users. Any such collection would need to access past iterations of the page itself, perhaps through the use of archival tools like the archive.org Wayback Machine.
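
Silent edits of this kind could be detected by comparing two collection passes over the same bookmarks. The following is a minimal sketch in Python, assuming each pass is stored as a mapping from some stable bookmark identifier to the note text. The identifiers `bookmarker_17` and `bookmarker_42` are invented for illustration, and only the two note texts for the first of them are taken from the observation above.

```python
def changed_notes(earlier, later):
    """Compare two snapshots mapping a bookmark identifier to its note text."""
    changes = {}
    # Only bookmarks present in both passes can be compared.
    for key in earlier.keys() & later.keys():
        if earlier[key] != later[key]:
            changes[key] = (earlier[key], later[key])
    return changes


# Hypothetical identifiers; the first pair of notes is from the study.
first_pass = {
    "bookmarker_17": "RE-READ LOVE IT",
    "bookmarker_42": "Fav",
}
second_pass = {
    "bookmarker_17": "READ √ STEVE AND MARIA LOVE IT",
    "bookmarker_42": "Fav",
}

for key, (before, after) in changed_notes(first_pass, second_pass).items():
    print(f"{key}: {before!r} -> {after!r}")
```

A comparison like this only shows that a note differs between two passes. It cannot show when the edit happened or how many intermediate versions existed, which is why archived page snapshots would still be needed for a chronology.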