New Project – Data Horde
https://datahorde.org
Join the Horde!
Sun, 18 Jul 2021 18:36:17 +0000

Help Archive Team archive older unlisted YouTube videos!
https://datahorde.org/help-archive-team-archive-older-unlisted-youtube-videos/
Sat, 17 Jul 2021 06:47:57 +0000

With fewer than 5 days left until YouTube makes most unlisted videos uploaded before 2017 private, time is running out before these videos are lost forever!

Fortunately, Archive Team has started a project to back up the metadata and 360p resolution video files for as many of these items as possible, and contributing is really easy! In addition to the videos themselves, data to be archived by this project includes the video watch page (including titles, descriptions, uploader channel, etc.), captions, comments, attributions, and thumbnails. The data archived by this project will be made available in WARC format on the Internet Archive and through the Internet Archive Wayback Machine.

To help out with this project, simply follow the steps to download and run an Archive Team Warrior, and then select the YouTube project. (You can also run the project in a Docker container, using atdr.meo.ws/archiveteam/youtube-grab as the image address.)

Additionally, people with lists of unlisted video IDs/URLs and unlisted playlist IDs/URLs are encouraged to share them so they can be archived.
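
If your list mixes bare IDs with different URL styles, it may help to normalize it before sharing. The following is only a minimal sketch, not part of the Archive Team tooling: the function names are made up for illustration, and it assumes that video IDs are 11 characters long and that the URLs are in the common watch, youtu.be, or embed forms.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumption: video IDs are 11 characters drawn from A-Z, a-z, 0-9, '-' and '_'.
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(entry):
    """Return the video ID in a bare ID or a common YouTube URL, else None."""
    entry = entry.strip()
    if VIDEO_ID_RE.fullmatch(entry):
        return entry
    # urlparse needs a scheme to separate the host from the path.
    if "://" not in entry:
        entry = "https://" + entry
    try:
        parsed = urlparse(entry)
        host = (parsed.hostname or "").lower()
    except ValueError:
        return None  # malformed input, e.g. an unbalanced '[' in the host
    candidate = ""
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        elif parsed.path.startswith("/embed/"):
            candidate = parsed.path.split("/")[2]
    return candidate if VIDEO_ID_RE.fullmatch(candidate) else None


def unique_video_ids(entries):
    """Collect valid IDs, dropping duplicates while keeping first-seen order."""
    seen = {}
    for entry in entries:
        video_id = extract_video_id(entry)
        if video_id is not None:
            seen.setdefault(video_id, None)
    return list(seen)
```

Calling unique_video_ids on the lines of a text file gives one ID per video, which is easier to hand over than a mix of link formats. Entries the sketch cannot recognize are skipped silently, so it is worth checking the count against your original list.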

In order to stay up-to-date with the project and be reachable in case of an issue, project contributors are encouraged to connect and stay connected to the project discussion channel, #down-the-tube on irc.hackint.org, also available through webchat.

Archiving progress statistics for this project are available on the Archive Team project tracker, and source code is available on GitHub.

After older unlisted videos are made private on July 23, this project will shift to archiving the metadata for as many YouTube videos as possible, though not the actual video files themselves in most cases due to the amount of storage video takes and limited resources of the Internet Archive.

Help Archive YouTube’s Community Contributions!
https://datahorde.org/help-archive-youtubes-community-contributions/
Sat, 26 Sep 2020 00:27:21 +0000

YouTube is removing their community contributions feature on September 28. In case you haven’t already heard, that’s the feature which allows viewers to add captions/subtitles, translated titles and video descriptions on videos. And YouTube seems to be pretty insistent on removing the feature, despite massive backlash.

Now, although YouTube have given their word to keep published community captions (and other contributions) online, there’s a small detail many people have overlooked. Last year, YouTube restricted the feature so that only uploaders can publish contributions. As a result, there are many unpublished captions and title/description translations stuck in review. Furthermore, no information has been given on the fate of Caption Credits (the names of contributors who opted to have them shown).

Although unpublished on videos, these contributions are still visible in the community captions editor. So for the last few days we have been developing a tool to archive all this data! We have finally reached a mature enough stage that anyone reading this can now run the “YouTube Community Contribution Archiver” (YCCA) on their computer, to help us collect as many of these contribution drafts as we can:

https://github.com/Data-Horde/ytcc-archive

Ideally, channels would accept the contributions on their own videos, not only from a moral standpoint but also because our archiving method hides some information (formatting, stylization, authors of unpublished captions, etc.). So beyond archiving these, we’ve also done our best to reach out to content creators across YouTube.

The good news is that we won’t be archiving these for naught: projects such as YouTubexternal CC will likely be a new home for these captions and other content which has been trapped for so long.

We also have a Discord server where we are coordinating all these efforts, so feel free to hop on board if you have any questions or want to just meet the team!

Discord

Good Luck archiving! Click here to view current stats.

For further context on how we wound up in this predicament, check out our YouTube CC History series:

Part 1: Unusual Beginnings on Google Video

Part 2: Pioneering Online Accessibility

Part 3: Scaling the Waterfall, Captions for All

Part 4: The Untold Story of why YouTube is removing Community Contributions
