tech234a – Data Horde
https://datahorde.org
Join the Horde!
Sat, 11 Sep 2021 00:51:02 +0000

Help Archive Team Archive public Google Drive files before September 13!
https://datahorde.org/help-archive-team-archive-public-google-drive-files-before-september-13/
Sat, 11 Sep 2021 00:50:54 +0000

On September 13, Google is going to start requiring longer URLs to access many Google Drive files, breaking links to public files across the web unless users opt out! Because of this, Archive Team has launched a project to archive as many publicly-available Google Drive files as possible and make them accessible on the Internet Archive Wayback Machine. (Note that video files are not included at this time due to their size.)

You can help! Simply follow the steps to download and run an Archive Team Warrior, and then select the Google Drive project. (You can also run the project in a Docker container, using atdr.meo.ws/archiveteam/google-drive-grab as the image address.)

Additionally, people with lists of public Google Drive file URLs are encouraged to share them so they can be archived.

In order to stay up-to-date with the project and be reachable in case of an issue, project contributors are encouraged to connect and stay connected to the project discussion channel, #googlecrash on irc.hackint.org, also available through webchat.

Archiving progress statistics for this project are available on the Archive Team project tracker, and source code is available on GitHub.

YouTube Attributions to be removed in September
https://datahorde.org/youtube-attributions-to-be-removed-in-september/
Sat, 28 Aug 2021 22:59:17 +0000

On August 18, YouTube quietly announced that due to “low usage”, they will be removing video attribution pages. One version of the announcement said that this will happen in “early September” and another said “after September”. YouTube instead recommends using the description to attribute videos.

Video attribution pages were intended to list which videos were used to make the current video. This created a network of videos, connecting remixes/compilations/shorter versions of videos with their original source videos. These pages also helped ensure that credit was given to the original authors of video clips, even if the original uploader might have forgotten to do so.

Until some point between 2017 and 2019, video attribution pages also listed the videos that used the current video. The attributions were automatically associated with a video when someone used the online YouTube video editor to add a Creative Commons-licensed clip to their video. If a video had attributions, a link to its attributions page would automatically be placed below its description. On the mobile YouTube app, this link would open the attributions page in the user’s web browser, but more recently all of the attributions links in the mobile app would open the channel that claimed the “Attribution” custom URL.

The video attributions page is one of the oldest pages on YouTube, and is believed to be the last page on YouTube that still uses the old, pre-polymer layout. In fact, the HTML content of the attribution web pages (excluding headers, footers, and video thumbnail overlays) has not been modified since 2011!

No formal archival efforts have been initiated as of this time, but it is anticipated that one will start soon.

Help Archive Team archive older unlisted YouTube videos!
https://datahorde.org/help-archive-team-archive-older-unlisted-youtube-videos/
Sat, 17 Jul 2021 06:47:57 +0000

With fewer than 5 days left until YouTube makes most unlisted videos uploaded before 2017 private, time is running out before these videos are lost forever!

Fortunately, Archive Team has started a project to back up the metadata and 360p resolution video files for as many of these items as possible, and contributing is really easy! In addition to the videos themselves, data to be archived by this project includes the video watch page (including titles, descriptions, uploader channel, etc.), captions, comments, attributions, and thumbnails. The data archived by this project will be made available in WARC format on the Internet Archive and through the Internet Archive Wayback Machine.

To help out with this project, simply follow the steps to download and run an Archive Team Warrior, and then select the YouTube project. (You can also run the project in a Docker container, using atdr.meo.ws/archiveteam/youtube-grab as the image address.)

Additionally, people with lists of unlisted video IDs/URLs and unlisted playlist IDs/URLs are encouraged to share them so they can be archived.

In order to stay up-to-date with the project and be reachable in case of an issue, project contributors are encouraged to connect and stay connected to the project discussion channel, #down-the-tube on irc.hackint.org, also available through webchat.

Archiving progress statistics for this project are available on the Archive Team project tracker, and source code is available on GitHub.

After older unlisted videos are made private on July 23, this project will shift to archiving the metadata for as many YouTube videos as possible, though not the actual video files themselves in most cases due to the amount of storage video takes and limited resources of the Internet Archive.

Why We Shouldn’t Worry About YouTube’s Inactive Accounts Policy
https://datahorde.org/why-we-shouldnt-worry-about-youtubes-inactive-accounts-policy/
Wed, 14 Jul 2021 02:31:52 +0000

From time to time, YouTube users and archivists worry that, because of YouTube’s Inactive Accounts Policy, YouTube channels will be deleted if they are left inactive for more than six months. The policy reads:

Inactive accounts policy
In general, users are expected to be active members within the YouTube community. If an account is found to be overly inactive, the account may be reclaimed by YouTube without notice. Inactivity may be considered as:
- Not logging into the site for at least six months
- Never having uploaded video content
- Not actively partaking in watching or commenting on videos or channels

This policy is not new. Much of the text of this policy actually dates back to at least June 17, 2009, when the policy was originally introduced as part of YouTube’s username policy for username squatting. At the time, the policy was designed to prevent inactive users from holding valuable usernames or usernames that match brand names. This is because, from YouTube’s launch in 2005 until March 2012, every YouTube channel had to choose a unique username that would form its permanent /user/ URL. Additionally, from 2012 until November 2014, all channels could optionally sign up with or create a permanent username without having to meet any eligibility requirements. Because usernames were in such high demand, the original policy stated that the usernames of reclaimed accounts may be “made available for registration by another party” and that “YouTube may release usernames in cases of a valid trademark complaint”, though the former passage was removed by October 9, 2010.

Since November 24, 2014, YouTube’s username system has been replaced by a custom URL system with minimum eligibility requirements that are more difficult to meet using inactive accounts or accounts created just for squatting usernames. As of July 2021, accounts need to be at least 30 days old, have at least 100 subscribers, and have uploaded a custom profile picture and banner in order to claim a custom URL. Additionally, with the new system, YouTube is able to “change, reclaim, or remove” these custom URLs without otherwise affecting the associated channel. As such, the Username Squatting Policy was no longer necessary for its original purpose.

At some time between February 2013 and March 2014, the Username Squatting Policy was renamed to the “Inactive accounts policy” and the sentence about releasing trademarked usernames was removed. As of July 2021, the policy has not been revised since then. It also appears that the policy has fallen into disuse: in March 2021, a Reddit user posted “As a trusted flagger I can tell you that YouTube hasn’t used that policy in years.”

Additionally, at some point between September 2014 and March 2015, YouTube created a new support article which stated that “Once a username was taken by a channel it could never be used again, even if the original channel was inactive or deleted”, which directly contradicts the purpose of the original Username Squatting Policy.

Some archivists fear that the large amounts of video data being stored from inactive accounts may be lost if YouTube decides to delete those accounts. However, it appears that YouTube has found a way to help offset some of the cost of storing these videos. On November 18, 2020, YouTube announced that they would enable advertisements on videos posted by channels that are not members of the YouTube Partner Program. While no explanation was given for this change, it was announced during the same 3 months in which Google announced several major changes that would reduce the amount of storage being used across the company’s products [1] [2], so it can be inferred that this policy change was made for the same reason.

So, why does the policy still exist? One possible reason is that the policy is simply forgotten. YouTube’s support site is large and contains many articles, and many of them have outdated passages and describe discontinued features that were removed long ago [1] [2] [3] [4]. Many pages also contain references to the old version of YouTube, which has been inaccessible to the public since December 2020 [1] [2] [3]. Also, as of 2021, the text of the Inactive Accounts Policy hasn’t been updated for at least 7 years, though the surrounding page was updated in September 2020 to remove the policy on vulgar language, which had been given its own page. So, YouTube could have simply forgotten that the Inactive Accounts Policy exists, and the people responsible for updating the support pages could have just left the policy because they weren’t specifically instructed to remove it.

Another possible reason the policy still exists is that, while unlikely, YouTube could be preserving the policy for possible use in the future. However, YouTube would provide advance warning to users, likely via email and updated support articles, before enacting this policy, and since we have seen none of those shared online, we have no reason to believe this policy is being enacted at the current time.

So, while YouTube has an Inactive Accounts Policy, it hasn’t used it in years because URLs on the service can now be changed and removed without deleting and recreating accounts, and it appears it has found a way to help offset some of the cost of storing the videos uploaded by these channels. At this time, users and archivists shouldn’t worry about this policy, but should instead focus on specific content removal announcements such as annotations, liked videos lists, draft community contribution closed captions and metadata, playlist notes as well as older unlisted videos.

Site Update: New service for email subscriptions
https://datahorde.org/site-update-new-service-for-email-subscriptions/
Wed, 30 Jun 2021 04:46:12 +0000

Because Google will be shutting down Feedburner’s email subscription service in July, we will be migrating our email subscriptions to Feedio. Existing subscribers should keep an eye out for a subscription invitation from our new email service within the next few days and follow the link to confirm their subscription. We will not be automatically unsubscribing users from the Feedburner email list, so subscribers will continue to receive emails from Feedburner until that service is shut down unless they unsubscribe from that list beforehand. We are still waiting for Feedio to fully activate our account, so it might take a few days for email notifications for new posts to start being sent. If any new users want to subscribe to receive new post notifications via email, the subscription widget in the sidebar has been updated to use the new email delivery service.

This change only affects our email subscriptions; our RSS feed will continue to be provided by Feedburner.

Help Archive Team Save Yahoo! Answers!
https://datahorde.org/help-archive-team-save-yahoo-answers/
Thu, 22 Apr 2021 02:35:47 +0000

Yahoo! Answers is shutting down on May 4th, 2021, taking nearly 15 years’ worth of content with it!

Archive Team is trying to save as much of it as possible, and you can help!

By setting up the Archive Team Warrior and letting it run in the background, you can back up questions and answers from Yahoo! Answers and make them available in the Internet Archive Wayback Machine. The Archive Team Warrior is easy to set up and uses very few of your system resources. The Archive Team Warrior can work on up to 6 items concurrently.

Advanced users can also run the project with Docker using the atdr.meo.ws/archiveteam/yahooanswers-grab Docker image, which can easily be deployed on large networks and allows for running projects at a higher concurrency rate per container (maximum 20 concurrent items, though users running the project with this many concurrent items might be rate-limited by Yahoo!).
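As a rough illustration of the Docker route, here is a minimal Python sketch that assembles a `docker run` command for the image named above and enforces the 20-item ceiling. The `--concurrent N nickname` arguments are an assumption based on how Archive Team project images are commonly invoked, and the container name is hypothetical; check the project's own instructions before running anything.

```python
import shlex

# Image address quoted in the text above.
IMAGE = "atdr.meo.ws/archiveteam/yahooanswers-grab"
# Per-container ceiling mentioned in the text.
MAX_CONCURRENT = 20


def build_docker_command(nickname: str, concurrent: int = 2) -> list[str]:
    """Return a `docker run` argument list for the project image.

    Assumption: the container accepts `--concurrent N` followed by a
    nickname, as Archive Team project images commonly do; check the
    project's README before relying on this.
    """
    if not 1 <= concurrent <= MAX_CONCURRENT:
        raise ValueError(f"concurrent must be between 1 and {MAX_CONCURRENT}")
    return [
        "docker", "run", "--detach",
        "--name", "yahooanswers-grab",      # hypothetical container name
        "--restart", "unless-stopped",
        IMAGE,
        "--concurrent", str(concurrent),
        nickname,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_docker_command("YOUR_NICKNAME", concurrent=6)))
```

The sketch only prints the command. Given the rate-limiting warning above, starting with a low concurrency value and raising it gradually is probably the safer choice.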

If you need any help or have any questions about the project, please feel free to refer to the project page on the Archive Team Wiki or ask in Archive Team’s IRC channel for the Yahoo! Answers project. (Please be patient and stay connected if your question isn’t immediately answered so you don’t miss any responses.)

YouTube Community Contributions Archive Now Available: A Look at the Stats
https://datahorde.org/youtube-community-contributions-archive-now-available-a-look-at-the-stats/
Fri, 05 Mar 2021 22:22:55 +0000

The YouTube Community Contributions Archive is now available on the Internet Archive! You can download the entire collection, or simply search for and download files for a particular video. The collection is composed of 4096 ZIP archives which contain 406,394 folders and 1,361,998 files. Compressed, the collection is 3.83GB, and once decompressed, the collection is 9.46GB.

YouTube Community Contributions allowed users to create and translate closed captions/subtitles, titles, and descriptions of YouTube videos uploaded by channels who enabled the feature. Users could optionally choose to be credited for their captioning contributions.

While over 50 million videos were scanned for community contributions data, community contributions data was found for only 406,394 videos, indicating that the feature was used on only a small portion of the videos on YouTube. Some videos had YouTube Community Contributions enabled, but only had captions or metadata that was provided by the uploader. This accounted for 198,609 videos, meaning that 207,785 videos in the collection had community-contributed captions or metadata, further indicating that few videos on YouTube received community contributions. This means that approximately 0.4% of the videos that were scanned while creating this archive had community-contributed captions or metadata. This was likely because the community contributions feature was hard to discover in the YouTube interface, which limited the number of people who were aware of the feature.
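The arithmetic behind these figures can be checked with a short Python sketch. The counts come from the paragraph above; treating "over 50 million" as exactly 50 million is an assumption, so the resulting share is an upper bound.

```python
# Figures quoted in the text above.
videos_with_data = 406_394        # videos for which any data was found
uploader_only = 198_609           # videos with only uploader-provided data
videos_scanned = 50_000_000       # "over 50 million", so this is a lower bound

community_contributed = videos_with_data - uploader_only
share = community_contributed / videos_scanned

print(community_contributed)      # 207785
print(f"{share:.2%}")             # 0.42%
```

Because more than 50 million videos were actually scanned, the true share is slightly below 0.42%, which agrees with the "approximately 0.4%" stated above.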

Breaking down these numbers further, 80,746 videos had community-contributed draft metadata, 127,164 had community-contributed draft captions, 38,440 videos had community-contributed published metadata, 93,499 videos had community-contributed published captions, 179,366 videos had uploader-provided published metadata, and 225,466 videos had uploader-provided published captions.

YouTube Community Contributions allowed those who contributed captions to optionally be credited for their published work. 38,939 videos had credits for published captions created by the community. Although captioning credits became inaccessible two weeks before the rest of the community contributions data, the number of videos with captioning credits was still low. It is estimated that, had the credits remained accessible until the rest of the community contributions feature was made inaccessible, about 80 thousand videos would have been found to have had credits.

The community contributions feature supported 196 languages, though not all languages were used equally. Below is a chart of the 25 most popular supported languages, and the number of videos that contain at least 1 file for each language (graphing all of the languages did not display well). This chart includes uploader-provided content.

When the query excludes the uploader-provided content, we see significant shifts in the 25 most popular supported languages.

This shift indicates that community contributions were often used to translate content.

A look at the language distribution of the collected metadata, including uploader-provided metadata, appears to be similar to the distribution of languages in the overall collection.

A look at just the community-provided metadata provides a slightly different distribution of data.

The distribution of captioning languages, including uploader-provided captions, is similar to the collection overall.

The distribution of captioning languages, excluding uploader-provided captions, also resembles the overall collection.

It is also interesting to look at the distribution of the draft community captions and metadata that were collected in comparison to the published community captions and metadata.

The published community contributions data appears to be more evenly distributed across languages compared to the draft community contributions data.

Some users contributed many captions and were credited for their work on many videos. In total, 83,563 channels appeared in our credits collection. On average, a channel was credited on 1.47 caption tracks. 55 channels were credited for more than 50 caption tracks, and 14 channels were credited for more than 100 caption tracks! The top three channels which were credited on the most caption tracks in our collection created 255, 522, and 912 caption tracks, respectively.

Thank you to everyone who contributed to this project! Additional details about the collection itself are available in the Internet Archive item description. If you have any additional questions, please feel free to join the project Discord server!

A Preview of the Flash Kill Switch: January 12, 2021 and Beyond
https://datahorde.org/a-preview-of-the-flash-kill-switch-january-12-2021-and-beyond/
Fri, 01 Jan 2021 01:00:00 +0000

Adobe Flash has a kill switch that has been included in versions released since mid-2020. On January 12, 2021, Adobe will activate this kill switch, rendering internet-based Flash content inaccessible.

Do you still use Flash Player? Data Horde is conducting a survey to see how frequently people continue to use Flash Player even at the very end of its lifespan. It would mean a lot to us if you could spare 5-10 minutes to complete a very short survey.



Why is there a kill switch built into my Flash Player?

Adobe has decided to retire Flash Player, which means no more updates. Adobe has stated their reasoning behind the inclusion of this kill switch is to “help secure” users, seeing as Flash Player might still have undiscovered security vulnerabilities.

In doing so, Adobe hopes to no longer be liable for any damage caused by vulnerabilities present in Flash Player, by making Flash Player outright unusable. Overkill? Kind of. For more details, see Update: 12 Day Grace Period on the Flash Player Killswitch.


What does the Flash Player kill switch do?

Here is an in-depth preview of the effects of this kill switch. When accessing Flash content in a web browser, if your system time is set to January 12, 2021, 12:00AM (midnight) or later, Flash content will be replaced with a Flash information button. Note that if you have already loaded Flash content before this time, you will be able to continue viewing it until you reload the webpage in your browser.
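The date check described above can be sketched in a few lines of Python. This only illustrates the observed behavior, not Adobe's actual implementation; the function and constant names are hypothetical.

```python
from datetime import datetime

# Cutoff described in the text: local system time, midnight on January 12, 2021.
KILL_SWITCH_START = datetime(2021, 1, 12, 0, 0)


def flash_content_blocked(system_time: datetime) -> bool:
    """Return True if Flash content would be replaced by the information button."""
    return system_time >= KILL_SWITCH_START


print(flash_content_blocked(datetime(2021, 1, 11, 23, 59)))  # False
print(flash_content_blocked(datetime(2021, 1, 12, 0, 0)))    # True
```

As noted above, the check appears to apply when content is loaded, so content that was already running before the cutoff keeps working until the page is reloaded.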

While you roll over the widget, it shows a blue outline.

While you left click the widget, it turns blue.

Right clicking still works as normal for a Flash applet.

Clicking the icon opens a new tab to the URL https://www.adobe.com/go/fp, which redirects to https://www.adobe.com/products/flashplayer/end-of-life.html (Adobe’s End-Of-Life info page) as of this writing.


Browser Support

Browsers will also be removing support for the Flash plugin. The following updates are scheduled to remove Flash support in common browsers:

Most of the browser updates simply drop support for loading the Flash plugin.

The Flash component is fully removed in Chrome 88. Flash Player permissions are removed from site content settings, and clicking on a Flash download link no longer prompts the user to allow Flash (the link functions as a normal link).


Chrome Specifics

If a website specifies fallback HTML code, it is displayed.

If a website does not specify fallback HTML code, Chrome 88 replaces Flash components with a message stating “Adobe Flash Player is no longer supported”. Nothing happens when attempting to right click the widget.
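For readers unfamiliar with the term, "fallback HTML" is the markup nested inside the element that embeds the Flash content; browsers show it when the plugin cannot load. The Python sketch below uses a hypothetical page snippet (the file name and message are made up) and the standard library's `html.parser` to pull out the fallback text a browser would display.

```python
from html.parser import HTMLParser

# Hypothetical embed markup: the <p> inside <object> is the fallback content.
PAGE = """
<object type="application/x-shockwave-flash" data="game.swf">
  <p>This game requires Flash Player.</p>
</object>
"""


class FallbackExtractor(HTMLParser):
    """Collect the text nested inside <object> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # how many <object> elements we are inside
        self.fallback = []

    def handle_starttag(self, tag, attrs):
        if tag == "object":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "object" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Keep only non-whitespace text found inside an <object>.
        if self.depth and data.strip():
            self.fallback.append(data.strip())


parser = FallbackExtractor()
parser.feed(PAGE)
print(parser.fallback)  # ['This game requires Flash Player.']
```

A page whose `<object>` element has no nested content would yield an empty list here, which corresponds to the "no fallback HTML" case described above.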

Other Chromium-based browsers, including the current versions of Edge, Opera, and Vivaldi are expected to behave similarly.

Chrome also plans to block Flash from loading in previous versions of Chrome by marking the component as outdated (you can view the components manager in Chromium-based browsers at chrome://components).

After loading a page and approving Flash permissions, you will get a message bar similar to this, stating that “Adobe Flash Player was blocked because it is out of date”.

Right-clicking on the Flash widget provides a few options:

Clicking “Update plugin” goes directly to Adobe’s Flash Player End-Of-Life page. Clicking “Learn more” opens the URL https://support.google.com/chrome/?p=ib_outdated_plugin, which redirects to Google’s Flash Player End-Of-Life support article.

After clicking “Update plugin”, returning to the tab with the Flash player widget reveals a new message on the Flash player widget: “When finishes updating, reload the page to activate it” [sic].

As of this writing, the “Run this time” option functions as intended: it allows Flash player to run once, but accessing Flash content after reloading or navigating to another page requires clicking the button again.

Attempting to update the Adobe Flash Player component from Chrome’s component manager at chrome://components reveals no available updates.

This particular method of blocking Flash does not appear to be active in other Chromium-based browsers at this time.

Note: Loading a custom PPAPI Flash Player DLL through command-line flags to Chrome does not seem to fix the problem, but further testing (including with older and modified versions of Flash) is needed. A sample command would be "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --disable-bundled-ppapi-flash --ppapi-flash-path="<path to Flash Player DLL>". Note that you should fully quit any existing instances of Chrome before using this command by entering chrome://quit in the address bar, and that you can verify the command flags and Flash DLL used by the current instance of Chrome by visiting chrome://version. Using a custom Flash DLL path seems to cause Chrome to always report the Flash version as 11.2.999.999. Special thanks to @krum110487 and @nosamu from the Flashpoint Discord server for providing this information.
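For anyone who wants to repeat this test from a script, the sample command can be assembled in Python as sketched below. The Chrome path and both flags are taken from the note above; the helper names are hypothetical, and the DLL path is left as a parameter because the note does not specify one.

```python
import subprocess

# Install path taken from the sample command in the text (Windows).
CHROME = r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"


def chrome_command(flash_dll_path: str) -> list[str]:
    """Build the Chrome argument list for loading a custom PPAPI Flash DLL."""
    return [
        CHROME,
        "--disable-bundled-ppapi-flash",
        f"--ppapi-flash-path={flash_dll_path}",
    ]


def launch(flash_dll_path: str) -> subprocess.Popen:
    """Start Chrome; quit all running Chrome instances first (chrome://quit)."""
    return subprocess.Popen(chrome_command(flash_dll_path))
```

Passing the arguments as a list avoids having to quote the spaces in the Chrome path by hand. After launching, chrome://version should show whether the flags were picked up.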


Firefox Specifics

Flash support is also entirely removed in Firefox 85. The plugins manager at about:plugins no longer lists Shockwave Flash, even if it is installed on the system.

In Firefox 85, if a website does not specify fallback HTML code, nothing replaces the Flash widget.

If fallback HTML code is specified, it will be displayed.

While Firefox 85’s lack of warning or user notification of missing Flash content when there is no fallback HTML code may seem weird, it is likely that this behavior follows the HTML standard more closely.


Other Browsers

Internet Explorer and Edge Legacy aren’t expected to exhibit any special behavior for the Flash shutdown; however, on January 12, 2021, Microsoft is expected to release an update to remove the preinstalled Flash player used by those browsers from Windows 8.1 and 10.


Workarounds

The Flash Player kill switch will break your Flash Player, but there are still workarounds to continue to view/interact with Flash Media after January 12!

There are many projects such as Ruffle and Flashpoint to ensure compatibility and preservation of Flash media, so hang onto your .swf’s!

For a thorough list of workarounds and other resources see this page.

Suggestions? Contact [email protected]

Update January 22, 2021: Linked new EOL killswitch workarounds.

Update January 11, 2021: Added screenshots and additional information for Chrome’s interface for marking the Flash Player component as outdated. Added a note that loading a custom Flash Player DLL in Chrome does not seem to prevent the block.

Special thanks to @themadprogramer for expanding the context for the article, adding a video, providing workarounds, and linking related Data Horde content.

Yahoo! Groups Archive Metadata Now Available
https://datahorde.org/yahoo-groups-archive-metadata-now-available/
Sun, 06 Dec 2020 13:40:00 +0000

After months of work and preparation, the metadata for over 1.1 million Yahoo! Groups retrieved by Archive Team’s Python script as well as from other grabs has been organized and is now available on the Internet Archive. Special thanks to Doranwen for organizing this data.

Yahoo! Groups’ mailing lists, which are the last remaining part of Yahoo! Groups, will be shutting down in 10 days, on December 15, 2020. However, since group content is no longer accessible to the public, there is little left to archive.

Next year, volunteers will be needed to sort and organize the full group data so related groups can be uploaded to the Internet Archive together. This will make it easier to access and browse archives for multiple groups related to similar topics.

For more information about Yahoo! Groups, please see Doranwen’s blog or our Yahoo! Groups articles.

Google Cuts Free Unlimited Storage in Photos, Drive
https://datahorde.org/google-cuts-free-unlimited-storage-in-photos-drive/
Thu, 12 Nov 2020 06:22:45 +0000

On Wednesday, Google announced that it will be ending free unlimited storage of high quality uploads in Google Photos, as well as free unlimited storage of Docs, Sheets, Slides, Drawings, Forms, and Jamboard files. This change will go into effect on June 1, 2021. Existing files and photos will remain unaffected, but editing a Google Drive file will make it count against the storage limit.

This change will not apply to Sites, Keep, Blogger, or YouTube.

Google also announced that if a user does not use Gmail, Google Drive, or Google Photos for two years, Google may delete the user’s data from the product in which the user is inactive. Additionally, if a user remains above their storage limit for more than two years, their data may be deleted.

These changes align with other recent changes made by Google. For example, starting Friday, November 13, files that have been in a user’s Google Drive trash folder for more than 30 days will be permanently deleted. Additionally, Google recently replaced G Suite with Google Workspace, making unlimited storage only available on enterprise-level plans.

All the while, Google has been increasingly encouraging customers to subscribe to their new Google One service, which provides expanded storage and other Google benefits to customers.
