deaf – Data Horde https://datahorde.org Join the Horde! Wed, 12 May 2021 23:15:45 +0000

YouTube Community Contributions Archive Now Available: A Look at the Stats https://datahorde.org/youtube-community-contributions-archive-now-available-a-look-at-the-stats/ https://datahorde.org/youtube-community-contributions-archive-now-available-a-look-at-the-stats/#respond Fri, 05 Mar 2021 22:22:55 +0000 https://datahorde.org/?p=2091 The YouTube Community Contributions Archive is now available on the Internet Archive! You can download the entire collection, or simply search for and download files for a particular video. The collection is composed of 4,096 ZIP archives which contain 406,394 folders and 1,361,998 files. Compressed, the collection is 3.83 GB, and once decompressed, the collection is 9.46 GB.

YouTube Community Contributions allowed users to create and translate closed captions/subtitles, titles, and descriptions of YouTube videos uploaded by channels who enabled the feature. Users could optionally choose to be credited for their captioning contributions.

While over 50 million videos were scanned, community contributions data was found for only 406,394 of them, indicating that the feature was used on only a small portion of the videos on YouTube. Of these, 198,609 videos had YouTube Community Contributions enabled but only had captions or metadata provided by the uploader, leaving 207,785 videos in the collection with community-contributed captions or metadata. In other words, approximately 0.4% of the videos scanned while creating this archive had community-contributed captions or metadata. This was likely because the community contributions feature was hard to discover in the YouTube interface, which limited the number of people who were aware of it.
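As a quick sanity check, the arithmetic behind these figures can be reproduced in a few lines. This is only a sketch using the numbers quoted above; because the post says "over 50 million" videos were scanned, 50,000,000 is treated here as a lower bound, so the resulting share is an upper bound.

```python
# Figures quoted in the post.
videos_with_data = 406_394        # videos with any community contributions data
uploader_only = 198_609           # videos with only uploader-provided captions/metadata
scanned_lower_bound = 50_000_000  # "over 50 million" scanned, used as a lower bound

# Videos that actually received community-contributed captions or metadata.
community_contributed = videos_with_data - uploader_only

# Share of scanned videos; an upper bound, since more than 50 million were scanned.
share = community_contributed / scanned_lower_bound

print(f"community-contributed videos: {community_contributed:,}")
print(f"share of scanned videos: at most about {share:.2%}")
```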

Breaking down these numbers further, 80,746 videos had community-contributed draft metadata, 127,164 had community-contributed draft captions, 38,440 videos had community-contributed published metadata, 93,499 videos had community-contributed published captions, 179,366 videos had uploader-provided published metadata, and 225,466 videos had uploader-provided published captions.

YouTube Community Contributions allowed those who contributed captions to optionally be credited for their published work. 38,939 videos had credits for published captions created by the community. Although captioning credits became inaccessible two weeks before the rest of the community contributions data, the number of videos with captioning credits was still low. It is estimated that, had the credits remained accessible until the rest of the community contributions feature was shut off, about 80 thousand videos would have been found to have credits.

The community contributions feature supported 196 languages, though not all languages were used equally. Below is a chart of the 25 most popular supported languages, and the number of videos that contain at least 1 file for each language (graphing all of the languages did not display well). This chart includes uploader-provided content.

When the query excludes the uploader-provided content, we see significant shifts in the 25 most popular supported languages.

This shift indicates that community contributions were often used to translate content.

A look at the language distribution of the collected metadata, including uploader-provided metadata, appears to be similar to the distribution of languages in the overall collection.

A look at just the community-provided metadata shows a slightly different distribution.

The distribution of captioning languages, including uploader-provided captions, is similar to the collection overall.

The distribution of captioning languages, excluding uploader-provided captions, also resembles the overall collection.

It is also interesting to look at the distribution of the draft community captions and metadata that were collected in comparison to the published community captions and metadata.

The published community contributions data appears to be more evenly distributed across languages compared to the draft community contributions data.

Some users contributed many captions and were credited for their work on many videos. In total, 83,563 channels appeared in our credits collection. On average, a channel was credited on 1.47 caption tracks. 55 channels were credited for more than 50 caption tracks, and 14 channels were credited for more than 100 caption tracks! The top three channels which were credited on the most caption tracks in our collection created 255, 522, and 912 caption tracks, respectively.
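For a rough sense of scale, the channel count and the average can be multiplied back out. This is a sketch only: the total below is implied by the rounded average of 1.47 and is not a figure reported in the post, so it should be read as approximate.

```python
# Figures quoted in the post.
credited_channels = 83_563
mean_tracks_per_channel = 1.47   # rounded to two decimals in the post
top_three = [255, 522, 912]      # caption tracks credited to the top three channels

# Implied number of credited caption tracks (approximate, because the mean is rounded).
implied_total = credited_channels * mean_tracks_per_channel

# Rough share of all credited tracks attributed to the top three channels.
top_three_share = sum(top_three) / implied_total

print(f"implied credited caption tracks: about {implied_total:,.0f}")
print(f"top three channels' share: about {top_three_share:.1%}")
```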

Thank you to everyone who contributed to this project! Additional details about the collection itself are available in the Internet Archive item description. If you have any additional questions, please feel free to join the project Discord server!

We Just Rescued Thousands of Unpublished YouTube Captions https://datahorde.org/we-just-rescued-thousands-of-unpublished-youtube-captions/ https://datahorde.org/we-just-rescued-thousands-of-unpublished-youtube-captions/#respond Fri, 30 Oct 2020 21:33:41 +0000 https://datahorde.org/?p=1690 Community contributions were a feature on YouTube which allowed viewers to provide translations and captions for their favorite channels. Last year, YouTube realized that the feature had some problems and so began restricting it. And this year, believing the feature to be broken beyond salvation, they decided to axe it for good.

Unfortunately, in the process they were going to be getting rid of caption drafts, some of which were complete but stuck in review. So, Data Horde initiated a project to grab as many of these unpublished captions as possible, with a lot of assistance from Archive Team.

Although the feature was officially removed on September 28, we were able to continue accessing caption drafts for a whole month, until the endpoint was cut off at around 8 PM (UTC) on October 28. In total, we scanned and pooled nearly 52 million items for drafts, including videos, channels, playlists, and mix playlists. We also have two or three other bulky collections which were retrieved manually by archivists. In the coming days we will be working on organizing these drafts, with the hope of giving them a collection on the Internet Archive.

We also have a few other ideas in mind for what to do with this massive collection of captions, so stay tuned these next couple of days to find out! In the meantime, check out our YouTube Captioner’s Toolkit page for information on alternatives to the retired community captions feature.

YouTube is hiding Attributions to Fan-Captioners and Translators who wanted to be credited https://datahorde.org/youtube-is-hiding-attributions-to-fan-captioners-and-translators-who-wanted-to-be-credited/ https://datahorde.org/youtube-is-hiding-attributions-to-fan-captioners-and-translators-who-wanted-to-be-credited/#respond Thu, 15 Oct 2020 09:00:46 +0000 https://datahorde.org/?p=1589 On YouTube you sometimes come across videos which have subtitles for a bunch of languages. Take for instance this Japanese music video with translations in 20 languages! Have you ever asked yourself where these come from? Is the uploader a polyglot or something?

These translations were contributed by fans of the channel, and if you had gone into the video description a few days ago, you would have seen authors listed for some –but not all– of the languages. However, if you check the description now, you will notice that YouTube is hiding all of these caption authors.

And to make matters worse, there is a good reason why the original list did not have all translators listed. To be able to show up on the list, contributors had to check a box titled “Credit my contribution”, which was turned off by default. That means that anyone who was showing up on the list had explicitly volunteered to appear non-anonymously.

This comes following YouTube’s deprecation of the community contributions feature. While YouTube has assured users that they will keep published translations online, it would seem that they do not wish translators to receive any credit for these captions beyond this date.

If you as a captioner or content creator have been adversely affected by the removal of community contributions, check out our YouTube Captioner’s Toolkit for alternatives and useful resources.

[Obsoleted] YouTube removed community translations, but there is a workaround! https://datahorde.org/how-to-submit-accept-community-translations-on-youtube-a-work-around/ https://datahorde.org/how-to-submit-accept-community-translations-on-youtube-a-work-around/#comments Mon, 12 Oct 2020 20:04:36 +0000 https://datahorde.org/?p=1555 Edit: On October 28 around 8 PM (GMT) the old caption editor was shut down for good, blocking off further contributions. For external alternatives to community captioning, check out our Captioner’s Toolkit page.

For the past few years YouTube had been supporting community captions, a feature which allowed users to submit captions or translations for videos of other channels. On September 28 the feature was removed and the menu to access it was hidden.

However, you might still spot new videos with community contributions published after September 28. Take a look at this video uploaded on October 9. Notice that the Caption author is Dark_Kuroh, different from the video uploader.

But how, time travel? As it so happens, even with all the menus hidden, it is still possible to access the old captions editor. This method requires the uploader to know where to check, so it’s best that if you are submitting captions or translations using this method, you let the uploader know the language and the video.


How to submit community captions

So you want to caption or translate someone else’s video… Assuming that the channel still has community captions enabled, go to the following URL:

youtube.com/timedtext_editor?action_mde_edit_form=1&v={video code}&lang={language code}

Example: http://youtube.com/timedtext_editor?action_mde_edit_form=1&v=vCxz2lSeer4&lang=en

where {video code} is the video’s ID (the string at the end of its URL) and {language code} is the abbreviation for the language you want to translate into. Fortunately, you can also switch between languages later, so if you don’t know the abbreviation you can use en to open the editor for English and then switch to your actual language through the Switch Language button.

When you’re done, don’t forget to submit by clicking on the Submit Contributions button in the upper right corner.
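The URL pattern above can also be assembled programmatically. The following is a minimal sketch, assuming only the query parameters shown in the example URL; the helper name `caption_editor_url` is hypothetical and not part of any YouTube API. Note that, per the edit at the top of this post, the old caption editor was shut down on October 28, so the resulting link no longer opens the editor.

```python
from urllib.parse import urlencode

# Base address of the old caption editor, as given in the example above.
EDITOR_BASE = "http://youtube.com/timedtext_editor"


def caption_editor_url(video_code: str, language_code: str = "en") -> str:
    """Build the old caption editor URL for a video and language (hypothetical helper)."""
    query = urlencode({
        "action_mde_edit_form": 1,  # fixed parameter from the example URL
        "v": video_code,            # the video's ID
        "lang": language_code,      # abbreviation of the target language
    })
    return f"{EDITOR_BASE}?{query}"


# Reproduces the example URL given above.
print(caption_editor_url("vCxz2lSeer4", "en"))
```

Defaulting `language_code` to `en` mirrors the tip above: open the editor in English first, then switch languages from inside the editor.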


How to accept community captions

Previously, you were able to view community submissions from the Community Tab on YouTube Studio. Unfortunately, these are now hidden. So you will need to have an idea of which videos and languages to check.

If you hadn’t enabled community contributions before, it’s not too late! Just go into YouTube Studio > Videos and choose the videos you would like to enable contributions on. Go into Edit > Community Contributions and switch it on. Lastly, don’t forget to click on “Update Videos”.

You, as an uploader, can also use the youtube.com/timedtext_editor?action_mde_edit_form=1&v={video code}&lang={language code} URL to access the caption/translation submissions you have received. A good place to start could be some of your most viewed videos, and you should definitely pay attention to your subscribers to see if they are trying to tell you to accept any of their submissions.

All you have to do when you do find a community submission is to click on the Publish or Publish edits button in the upper right corner.


While YouTube is still working on their permissions system and the community is banding together to find alternatives of their own, it’s important to endure through this transition period. So here’s hoping this tutorial helps you continue to add/receive translations on your videos for a little longer…

Help Archive YouTube’s Community Contributions! https://datahorde.org/help-archive-youtubes-community-contributions/ https://datahorde.org/help-archive-youtubes-community-contributions/#respond Sat, 26 Sep 2020 00:27:21 +0000 https://datahorde.org/?p=1478 YouTube is removing their community contributions feature on September 28. In case you haven’t already heard, that’s the feature which allows viewers to add captions/subtitles, translated titles and video descriptions on videos. And YouTube seems to be pretty insistent on removing the feature, despite massive backlash.

Now, although YouTube have given their word to keep published community captions (and other contributions) online, there’s a small detail many people have overlooked. Last year, YouTube restricted the feature to only allow uploaders to publish contributions. As such, there are many, many unpublished captions and title/description translations stuck in review. Furthermore, no information has been given on the fate of Caption Credits (people who opted to have their name shown).

Although unpublished on videos, these contributions are still visible in the community captions editor. So for the last few days we have been developing a tool to archive all this data! We have finally reached a mature enough stage that anyone reading this can now run the “YouTube Community Contribution Archiver” (YCCA) on their computer, to help us collect as many of these contribution drafts as we can:

https://github.com/Data-Horde/ytcc-archive

Ideally, it’s best if channels accept the contributions on their own videos, not only from a moral standpoint but also because our archiving method loses some information (formatting, stylization, authors of unpublished captions, etc.). So beyond archiving these, we’ve also done our best to reach out to content creators across YouTube.

The good news is that we won’t be archiving these for naught: projects such as YouTubexternal CC will likely be a new home for these captions and other content which have been trapped for so long.

We also have a Discord server where we are coordinating all these efforts, so feel free to hop on board if you have any questions or want to just meet the team!


Good Luck archiving! Click here to view current stats.

For further context on how we wound up in this predicament, check out our YouTube CC History series:

Part 1: Unusual Beginnings on Google Video

Part 2: Pioneering Online Accessibility

Part 3: Scaling the Waterfall, Captions for All

Part 4: The Untold Story of why YouTube is removing Community Contributions
