Reddit announced this week updated terms for developer tools and services, paid access to the Reddit Data API, and more native moderation tools.
While the Reddit blog explained the changes as part of creating a healthy ecosystem, the New York Times reported that paid API access would stop large companies from using Reddit content to train large language models (LLMs) for free.
Updated documentation confirms that developers can only use Reddit content for LLM training with prior approval from Reddit and that it constitutes commercial access.
Bard cannot confirm whether Google included Reddit content in its training data, saying only that it may be among the publicly available datasets that were “likely used.”
ChatGPT cannot share a specific list of sources, but Reddit may be one of them.
Bing AI confirms that Microsoft uses multiple data sources, including the Bing index and algorithm, along with OpenAI GPT models.
Considering that ChatGPT may have used Reddit data, one could assume that Microsoft may have too, via its partnership with OpenAI.
How Much Will Access To The Reddit Data API Cost?
According to the updated developer terms – effective June 19, 2023 – Reddit will charge for what it deems commercial access and use of the API:
- If a monetized business or service connects with the API, it is considered commercial access.
- If a business or service generates revenue, directly or indirectly, from any Reddit data or derived data, it is considered commercial use.
The following are specific examples of monetized services from Reddit’s Developer Platform page:
- Services that generate revenue from ads and paywalls.
- Search engines that generate revenue from ads.
- Services that charge users for access to research or data.
- Services for which users pay subscription fees.
- Services included in another product upsell.
- Services that publish Reddit content on monetized websites and apps.
- Services that use Reddit data for training models.
Researchers who use the API for non-commercial purposes may continue to do so if they agree not to release sensitive Reddit data or products built using Reddit content. Access to large volumes of data may incur a fee to cover costs associated with bulk access to the API.
Christopher Slowe, CTO of Reddit, commented on a Machine Learning subreddit discussion about the news, writing:
“We are excited about LLM and ML research and overall very proud of the role that Reddit has played in that work over the years. So, while we do need to do more to ensure that our users’ data is being shared in a responsible manner, we are not looking to inhibit academic research or make money from researchers.”
Developers must also acknowledge that user content on Reddit belongs to the users and is subject to the user’s specified rights and usage restrictions. The user agreement confirms that users retain the rights to their content, but they also grant Reddit a royalty-free license to use it.
Reddit will share pricing details as soon as they are finalized.
Reddit assured moderators that API changes will not affect tools that assist in enforcing subreddit rules and removing content that violates Reddit policies.
Moderators are encouraged to follow the Mod News subreddit to stay updated about the latest developments in moderation tools. Reddit reportedly strives to maintain stricter community moderation to keep advertisers happy.
Will Reddit Data API Changes Affect Social Media Management Tools?
If you use any third-party tool to post on Reddit, search for posts on Reddit, or create analytics reports for your Reddit account, there are three ways this could impact you.
- You may lose access to Reddit features through some third-party services.
- You may have to start paying for some third-party services that once offered free pricing plans, as they pass on the increased cost of accessing the Reddit Data API.
- You may have to pay more than you already are for some third-party services.
We will see the impact once Reddit releases API pricing details. Platforms that integrate with Reddit include Zapier, HootSuite, IFTTT, Feedly, Vista Social, Tray.io, and Social Rise. These platforms allow users to get valuable insights into Reddit engagement.
As for what kind of increase you could expect if your social media management tool passes the cost to its users: For third-party services with over a million users, it could be as little as an extra dollar per month per user. For services with fewer users, it could be much more.
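The reasoning behind that estimate can be sketched with simple arithmetic. The example below is only an illustration: it assumes a service pays one flat monthly API fee and spreads it evenly across its users. The fee amount and the user counts are hypothetical placeholders, since Reddit has not published pricing, and the function name is made up for this sketch.

```python
def per_user_monthly_cost(monthly_api_fee: float, active_users: int) -> float:
    """Spread a flat monthly API fee evenly across a service's users."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return monthly_api_fee / active_users


# Hypothetical flat fee, chosen only for illustration.
# Reddit has not announced actual API pricing.
HYPOTHETICAL_MONTHLY_FEE = 1_000_000.0

# Hypothetical user counts, from a small service to a large one.
for users in (50_000, 250_000, 1_000_000, 5_000_000):
    cost = per_user_monthly_cost(HYPOTHETICAL_MONTHLY_FEE, users)
    print(f"{users:>9,} users -> ${cost:,.2f} per user per month")
```

Under these assumptions, the same fee costs a smaller service far more per user than a larger one. If Reddit instead prices by usage, the gap between large and small services could look different.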
Related News: How Changes to Twitter API Disrupted Popular Services
Two weeks after users began circulating images implying enterprise pricing for the Twitter API, Twitter officially updated its website with pricing plans for premium access to Twitter API v2.
The Twitter API allows developers to build applications that retrieve and analyze data from Twitter – allowing these tools to search for Tweets on a specific topic, discover influencers, and create analytics reports about a Twitter account’s audience and engagement.
The API also allows applications to post updates to Twitter, which lets social media management tools schedule and post Tweets to an account.
Twitter offers three pricing options for API v2.
Twitter invited users who need more data to apply for enterprise API access via a Google Form.
Enterprise APIs offer real-time coverage of public Tweets with specific operators and rules, advanced search filtering, full historical access to archived Tweets, and account activity by particular users (tweets, replies, follows, likes, blocks, etc.).
Twitter does not list pricing for enterprise-level Twitter API access on its website. A Tweet shared by Wired suggests a $42,000 – $210,000 monthly price range.
Here’s the docs. “Large package” is $210,000 a month, or $2.5 million a year (tip @techmeme) https://t.co/RfGyWqpIgF pic.twitter.com/xuBiCBzoe7
Chris Stokel-Walker (@stokel), March 10, 2023
According to users in private Twitter developer communities who have contacted the platform for more information, it does not offer any plans between Basic (at $100 per month) and Enterprise.
Twitter also deprecated previous versions of the API, including Standard (v1.1), Essential (v2), Elevated (v2), and Premium API access tiers.
Increased costs and deprecated access impacted the following services that relied on the Twitter API.
- Life-saving weather alerts from several National Weather Service accounts were limited.
- IFTTT, an automation service with 18 million users, ran into issues with API changes made at the beginning of April.
- Feedly, a news reader service that integrated AI features in 2020 for over 18 million users, retired Twitter features and began exploring integrations with Mastodon.
- Flipboard, a news aggregation service with 145 million users, announced that Twitter feeds would remain broken and that Mastodon would be in its future.
- HootSuite, a social media management tool with 18 million users, stopped offering free plans to users who manage Twitter and other social profiles.
We contacted the makers of several popular social media management tools for comment. So far, they’ve hesitated to comment as they work with Twitter on custom solutions.
Elon Musk, Twitter (Now X Corp) CEO, said paid API access would reduce bot abuse.
He also suggested Microsoft’s refusal to pay Twitter API fees could lead to a lawsuit over allegedly “ripping off the Twitter database” and “selling our [Twitter] data to others.”
GitHub, Microsoft, and OpenAI face a class action lawsuit in San Francisco, California, for allegedly leveraging user-submitted content in violation of several open-source licenses. Microsoft, GitHub, and OpenAI have asked to have the lawsuit dismissed.
The law firm behind that suit also filed a class action lawsuit against Stability AI, DeviantArt, and Midjourney over their use of Stable Diffusion, which is accused of using copyrighted art in its training data.
SEJ will follow what other companies with large repositories of public data and conversations do in response to AI companies using them for training data.
Featured image: Dennis Diatel/Shutterstock