fandom – Data Horde https://datahorde.org Join the Horde! Mon, 31 May 2021 16:55:57 +0000

Yahoo! Groups Archive Metadata Now Available https://datahorde.org/yahoo-groups-archive-metadata-now-available/ Sun, 06 Dec 2020 13:40:00 +0000

After months of work and preparation, the metadata for over 1.1 million Yahoo! Groups retrieved by Archive Team’s Python script as well as from other grabs has been organized and is now available on the Internet Archive. Special thanks to Doranwen for organizing this data.

Yahoo! Groups’ mailing lists, which are the last remaining part of Yahoo! Groups, will be shutting down in 10 days, on December 15, 2020. However, since group content is no longer accessible to the public, there is little left to archive.

Next year, volunteers will be needed to sort and organize the full group data so related groups can be uploaded to the Internet Archive together. This will make it easier to access and browse archives for multiple groups related to similar topics.

For more information about Yahoo! Groups, please see Doranwen’s blog or our Yahoo! Groups articles.

October Status Update on the Save Yahoo Groups! Project https://datahorde.org/october-status-update-on-the-save-yahoo-groups-project/ Thu, 15 Oct 2020 23:00:35 +0000

Last November, Yahoo announced that they would be shutting down many key features on the ancient Yahoo Groups. There was a major project to rescue data, led by Archive Team and fandoms who traced their origins to Yahoo Groups. In fact, we had written all about it back in January:

The story did not end there, however. So let’s talk about what has transpired since…


Although we had reported 30 January as the final deadline, Yahoo continued to accept Get My Data (GMD) requests for about a week afterwards, so active efforts ceased around that time. Then came the waiting game, as it took a few more weeks for some of those GMD requests to process.

By late February, most of the volunteers had disbanded or moved on to other projects. But there was still much to be done. For one thing, people had rushed so much to grab everything they could that a lot of these group files were a total mess, not made any better by how Yahoo’s GMD exports worked. So the remaining volunteers stuck around to label their massive collection.

Doranwen, one of the leads on the Yahoo-Geddon (aka Save Yahoo Groups) project, frequently documented their progress during this time.

A few numbers and random other bits of info:

~2 TB of fandom data saved (that I know of, for now)
~200,000 confirmed fandom groups saved in some fashion
~2,000 Sims groups saved* …

*The only reason I know the Sims number is because I was tracking those groups on Google spreadsheets in order to find all of them and get volunteers to join them. For other fandoms it’s impossible to give any sort of number at this point (although I know there was a ton of LOTR, HP, Buffy, and Westlife, lol). Yahoo’s categorization was terrible and a group name doesn’t always give good clues as to whether it’s fandom/non-fandom. Getting that sort of data will take a good deal of time and work.

Doranwen, The end of Yahoo Groups – a few thoughts & stats

Another issue was that the collection was not actually unified. Archive Team had also archived a bunch of data, so the Yahoo-Geddon team continued to label those batch by batch for a few more months.

It truly is endless!!

Yahoo-Geddon volunteer, 14 July 2020

Yet another reason the Yahoo-Geddon team was taking so long was how meticulous they were. They worked to curate this collection not only for the sake of archiving and to trace the history of fandom, but also to provide a rich dataset that researchers might want to use in the future.

-[Stage] 4.5b: Remember that we got a bunch of groups from scrounging the links of other groups for new groups to join? Some of the commands used to process that data generated “groups” that never existed (with http: stuck at the end, apostrophes or commas in them, etc.). Also one stage of the spreadsheet work ended up with a certain number of groups getting a duplicate version added to the spreadsheet with _dupe after the name.

So for this stage I send the spreadsheets to my assistant who runs a script against them to find groups with punctuation in them or _dupe at the end. A very very tiny number of very old (grandfathered from who knows which list service) groups actually legitimately have periods in their names, but in most cases groups with periods never existed either.

This process is fairly quick for each letter but varies greatly in what has to be done, as sometimes group folders are affected (and some punctuation marks Yahoo simply ignored everything from that mark onwards and treated the letters before it as a group name).

Yahoo Groups metadata processing steps, stage 4.5b
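
The punctuation and _dupe check described in stage 4.5b could look something like the following Python sketch. This is not the script Yahoo-Geddon actually ran; the function name `flag_group_name`, the reason labels, and the assumption that legitimate group names contain only letters, digits, underscores, and hyphens are all hypothetical and chosen for illustration. Periods are flagged separately for manual review, since a very small number of old groups legitimately have them.

```python
import re

# Assumption (not from the source): a legitimate group name uses only
# letters, digits, underscores, and hyphens.
VALID_NAME = re.compile(r"[A-Za-z0-9_-]+")


def flag_group_name(name: str) -> list[str]:
    """Return the reasons a group name looks suspicious (empty list if none)."""
    reasons = []
    # Spreadsheet artifact: a duplicate row added with _dupe after the name.
    if name.endswith("_dupe"):
        reasons.append("dupe_suffix")
    # Link-scraping artifact: "http:" stuck at the end of the name.
    if name.endswith("http:"):
        reasons.append("trailing_http")
    # Periods are rare but occasionally legitimate, so only ask for review.
    if "." in name:
        reasons.append("period_needs_review")
    # Any other punctuation (apostrophes, commas, colons, ...) is flagged.
    if not VALID_NAME.fullmatch(name.replace(".", "")):
        reasons.append("punctuation")
    return reasons


if __name__ == "__main__":
    # Made-up example names, not real groups from the archive.
    for candidate in ["simsfans", "simsfans_dupe", "lotrhttp:", "old.list"]:
        print(candidate, flag_group_name(candidate))
```

Names that come back with an empty list pass the check; anything else would still need a human to decide whether the group ever existed.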

Sadly, Yahoo!, blind as ever to Yahoo-Geddon’s efforts, has decided to permanently shut down Yahoo Groups. Although Yahoo Groups had retained only its bare-bones features, this will put an end to some decade-old mailing lists…

On a related note, an interesting discovery Yahoo-Geddon made is that Yahoo actually has not deleted archives, photos and files but only removed public access.

The files are still there, from what I can tell! They’ve just blocked us from getting to them.

The monthly reminder emails with attachments are still coming in – and the attachments come from files in the files sections. Clearly those were never removed.

Which means that Yahoo could have chosen to grant us access to all of that for a full year before closing Groups entirely, but did not.

via the Save Yahoo Groups Discord server

It just goes to show that curation is one half of archiving/preservation… If you would like to learn more or even participate in Yahoo Groups dissection, check out the Save Yahoo Groups Discord server: https://discord.com/invite/DyCNddf

Community Spotlight: Fanlore https://datahorde.org/community-spotlight-fanlore/ Sat, 30 May 2020 14:08:12 +0000

Who are they?

It’s all in the name: Fanlore is an extensive lore of derivative works made by fans, an encyclopedia of fan works! Fanfiction, fanart, filks, you name it! Fanlore is a wiki which operates under the Organization for Transformative Works.

What do they do?

Art history is considered a core discipline in the humanities. Fanlore takes its work very seriously, trying to cover what is a much-neglected portion of the history of modern art.

Fanlore not only documents online and offline fan works, but also critically analyses them. They build timelines, codify tropes and research bibliographic information on authors or artists who might have been deemed “lacking in notoriety” for an actual encyclopedia.

The Comet, the oldest known fanzine from 1930 https://fanlore.org/wiki/Zine#Zines_in_Media_Fandom

How do they do it?

Most of this activity takes place on a MediaWiki, namely Fanlore Wiki. They currently sport a whopping 52,017 articles and 940,737 edits.

How do I sign up?

Though Fanlore is technically a project under the umbrella of the OTW, OTW membership or a similar position is not necessary to join. Anyone who’s willing to edit a wiki is a potential member.

So what are you waiting for? Become a lore keeper today!


Looking to discover other archiving communities? Just follow Data Horde’s Twitter List and check out our other Community Spotlights.
