Reddit follows Twitter and announces paid access to the data API
Reddit this week announced updated terms for developer tools and services, paid access to the Reddit Data API, and more native moderation tools.
While the Reddit blog explained the changes as part of creating a healthy ecosystem, The New York Times reported that paid API access would prevent large companies from using Reddit content to train large language models (LLMs) for free.
The updated documentation confirms that developers can only use Reddit content for LLM training with prior approval from Reddit and that this constitutes commercial access.
Bard cannot confirm whether Google included Reddit content in its training data, saying only that it was “probably used” as part of publicly available datasets.
ChatGPT cannot share a specific list of sources, but Reddit may be one of them.
Bing AI confirms that Microsoft uses multiple data sources, including the Bing index and algorithm, together with OpenAI GPT models.
Considering that ChatGPT may have used Reddit data, one might assume Microsoft may have done the same via its partnership with OpenAI.
How much will access to the Reddit Data API cost?
According to the updated Developer Terms effective June 19, 2023, Reddit charges for commercial access and use of the API:
- When a monetized business or service connects to the API, it counts as commercial access.
- When a business or service directly or indirectly generates revenue from Reddit data or derived data, it counts as commercial use.
The following are specific examples of monetized services from Reddit’s Developer Platforms page:
- Services that generate revenue from ads and paywalls.
- Search engines that generate revenue from ads.
- Services that charge users for access to research or data.
- Services for which users pay subscription fees.
- Services included as an upsell in another product.
- Services that publish Reddit content to monetized websites and apps.
- Services that use Reddit data to train models.
Researchers using the API for non-commercial purposes can continue to do so if they agree not to post sensitive Reddit data or products made with Reddit content. A fee may apply for accessing large amounts of data to cover the costs associated with bulk access to the API.
Commenting on the news in a machine learning subreddit discussion, Reddit CTO Christopher Slowe wrote:
“We are passionate about LLM and ML research and overall very proud of the role Reddit has played in this work over the years. So while we need to do more to ensure our users’ data is shared in a responsible manner, we don’t want to hinder academic research or monetize researchers.”
Developers must also acknowledge that User Content on Reddit belongs to the user and is subject to the user’s stated rights and usage restrictions. The user agreement confirms that users retain the rights to their content, but they also grant Reddit a royalty-free license to use it.
Reddit will announce pricing details once they are finalized.
Reddit assured moderators that API changes will not impact tools that help enforce subreddit rules and remove content that violates Reddit policies.
Moderators are encouraged to follow the Mod News subreddit to keep up to date with the latest developments in moderation tools. Reddit is reportedly looking to maintain stricter community moderation to keep advertisers happy.
Will the Reddit Data API changes affect social media management tools?
If you use a third-party tool to post on Reddit, search for posts on Reddit, or generate analytics reports for your Reddit account, there are three ways this could affect you.
- You may lose access to Reddit features through some third-party services.
- You may have to pay for some third-party services that used to offer free pricing plans to absorb the increased cost of accessing the Reddit Data API.
- You may have to pay more for some third-party services than you already pay.
We’ll see the impact once Reddit releases API pricing details. Platforms that integrate with Reddit include Zapier, HootSuite, IFTTT, Feedly, Vista Social, Tray.io, and Social Rise. These platforms allow users to gain valuable insights into Reddit engagement.
How much of an increase could you expect if your social media management tool passes the cost on to its users? For third-party services with over a million users, it could be as little as an extra dollar per month per user. For services with fewer users, it could be significantly more.
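The pass-through arithmetic can be sketched in a few lines of Python. This is only an illustration. Reddit had not announced pricing at the time of writing, so the fee below borrows the $210,000 monthly figure reported for Twitter's largest enterprise package (covered later in this article) as a stand-in. The function name and user counts are hypothetical, and the sketch assumes a tool splits a flat fee evenly across all of its users.

```python
def per_user_monthly_cost(monthly_api_fee: float, users: int) -> float:
    """Split a flat monthly API fee evenly across a tool's users."""
    if users <= 0:
        raise ValueError("users must be a positive number")
    return monthly_api_fee / users


# Assumption: a stand-in fee taken from the reported Twitter enterprise
# pricing, NOT an announced Reddit price.
ASSUMED_MONTHLY_FEE = 210_000.0

# Hypothetical user counts for large, mid-sized, and small services.
for users in (1_000_000, 100_000, 10_000):
    cost = per_user_monthly_cost(ASSUMED_MONTHLY_FEE, users)
    print(f"{users:>9,} users: ${cost:,.2f} per user per month")
```

Under these assumptions, a service with a million users would add about $0.21 per user per month, while one with 10,000 users would add about $21.00, which is why smaller services are more exposed to flat API fees.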
Related News: How Twitter API Changes Disrupted Popular Services
Two weeks after users started posting images hinting at enterprise pricing for the Twitter API, Twitter officially updated its website with pricing plans for premium access to the Twitter API v2.
The API allows developers to build applications that pull and analyze data from Twitter, enabling these tools to search for tweets on a specific topic, discover influencers, and generate analytics reports on a Twitter account’s audience and engagement.
The API also allows applications to post updates to Twitter, which lets social media management tools schedule and send tweets to an account.
Twitter offers three pricing options for API v2.
Twitter invited users who need more data to apply for Enterprise API access via a Google form.
Enterprise APIs provide real-time coverage of public tweets with specific operators and rules, advanced search filtering, full historical access to archived tweets and account activity from specific users (tweets, replies, follows, likes, blocks, etc.).
Twitter does not list pricing for enterprise-level access to the Twitter API on its website. A tweet shared by Wired suggests a monthly price range of $42,000 to $210,000.
Here are the documents. “Large Package” costs $210,000 per month or $2.5 million per year (tip @techmeme) https://t.co/RfGyWqpIgF pic.twitter.com/xuBiCBzoe7
Chris Stokel-Walker (@stokel), March 10, 2023
According to users in private Twitter developer communities who have contacted the platform for more information, it doesn’t offer any plans between Basic (for $100 a month) and Enterprise.
Twitter also deprecated previous versions of the API, including Standard (v1.1), Essential (v2), Elevated (v2), and Premium API access levels.
Increased costs and deprecated access affected the following services that relied on the Twitter API.
- Life-saving weather alerts from multiple National Weather Service accounts were limited.
- IFTTT, an automation service with 18 million users, ran into issues with API changes made in early April.
- Feedly, a newsreader service that integrated AI capabilities for over 18 million users in 2020, has discontinued Twitter capabilities and has begun exploring integrations with Mastodon.
- Flipboard, a news aggregation service with 145 million users, announced that Twitter feeds would remain broken and that Mastodon would be in its future.
- HootSuite, a social media management tool with 18 million users, no longer offers free plans for users who manage Twitter and other social profiles.
We reached out to the makers of several popular social media management tools for comment. So far they have been reluctant to comment as they are working with Twitter on custom solutions.
Elon Musk, CEO of Twitter (now X Corp), said paid API access would decrease bot abuse.
He also suggested that Microsoft’s refusal to pay Twitter API fees could lead to legal action over allegedly “ripping off the Twitter database” and “selling our [Twitter] data to others.”
GitHub, Microsoft and OpenAI are facing a class action lawsuit in San Francisco, California, for alleged use of submitted user-generated content in violation of multiple open source license policies. Microsoft, GitHub and OpenAI have moved to dismiss the lawsuit.
The same firm also filed a class action lawsuit against Stability AI, DeviantArt, and Midjourney over their use of Stable Diffusion, which is accused of using copyrighted art in its training data.
SEJ will follow developments in how other companies with large repositories of public data and conversations respond to AI companies using them for training data.
Featured image: Dennis Diatel/Shutterstock