Roblox Deploys AI to Auto-Censor Player Chats in Real Time

■ Roblox has launched AI-powered real-time chat rephrasing that automatically rewrites toxic messages into civil language, according to the company’s announcement

■ The system converts phrases like ‘Hurry TF up!’ into ‘Hurry up!’ while notifying all chat participants that AI editing occurred

■ This marks a shift from Roblox’s previous hashtag-replacement filtering to active content manipulation on a platform serving 70+ million daily users

■ The deployment raises questions about AI content moderation boundaries and whether automated message editing crosses privacy lines

Roblox just flipped the switch on AI-powered content moderation that rewrites toxic chat messages before they reach other players. The gaming platform’s new real-time chat rephrasing feature, announced today via investor relations, goes way beyond simple word filters. Instead of replacing profanity with hashtag symbols, the AI now translates offensive language into sanitized versions while supposedly preserving intent. It’s a bold move for a platform with over 70 million daily active users, most of them kids.

Roblox is rewriting the rules of online content moderation, literally. The gaming giant started rolling out AI-powered chat rephrasing today that automatically sanitizes toxic messages in real time across its platform. According to the company’s announcement, the feature represents a major evolution from simple word filtering to active content manipulation.

Here’s how it works in practice. When a player types something like “Hurry TF up!” into chat, Roblox’s AI intercepts the message and rewrites it to “Hurry up!” before delivering it to other players. Everyone in the conversation gets notified that the text has been rephrased to maintain civility, but the sanitized version is what actually appears. It’s a far cry from the platform’s previous approach, which simply replaced banned words and phrases with strings of “#” symbols.
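To make that flow concrete, here is a minimal Python sketch of the intercept, rewrite, and notify steps. It is an illustration only, not Roblox's implementation: the actual system reportedly uses an AI model, whereas this stand-in simply drops tokens from a hypothetical `FLAGGED_TOKENS` set. The `Delivery` type and the `rephrase` function are likewise invented names.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the rephrasing model: tokens to drop outright.
# The real system is described as AI-based; its rules are not public.
FLAGGED_TOKENS = {"tf"}


@dataclass
class Delivery:
    text: str        # what the other players actually see
    rephrased: bool  # True if participants should get the "rephrased" notice


def rephrase(message: str) -> Delivery:
    """Intercept a chat message, drop flagged tokens, and report whether it changed."""
    words = message.split()
    kept = [w for w in words if w.strip("!?.,").lower() not in FLAGGED_TOKENS]
    return Delivery(text=" ".join(kept), rephrased=len(kept) != len(words))


result = rephrase("Hurry TF up!")
print(result.text)       # Hurry up!
print(result.rephrased)  # True
```

The `rephrased` flag is what would drive the notice shown to everyone in the conversation; a message with nothing flagged passes through unchanged and with no notice.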

Roblox is pitching this as keeping things civil while preserving “gameplay flow.” The hashtag method often rendered messages incomprehensible, frustrating players who weren’t actually trying to be toxic. Someone typing “I love this assassin character!” might see their message appear as “I love this ######## character!” because the system flagged part of the word. The new AI system supposedly understands context well enough to clean up actual toxicity while leaving innocent messages alone.
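The false-positive problem with the older approach is easy to reproduce. The sketch below assumes a naive filter that masks any word containing a blocked substring. The one-entry `BLOCKLIST` and the `hashtag_filter` function are hypothetical, since Roblox's actual word list and matching rules are not public.

```python
import re

# Hypothetical one-entry blocklist, chosen only to reproduce the false positive.
BLOCKLIST = ["ass"]


def hashtag_filter(message: str) -> str:
    """Mask every word that contains a blocked substring with '#' characters."""
    def mask(match: re.Match) -> str:
        word = match.group(0)
        if any(bad in word.lower() for bad in BLOCKLIST):
            return "#" * len(word)  # one '#' per letter of the flagged word
        return word

    # Only runs of letters are treated as words; punctuation passes through.
    return re.sub(r"[A-Za-z]+", mask, message)


print(hashtag_filter("I love this assassin character!"))
# I love this ######## character!
```

Because the match is on substrings rather than meaning, an innocent word gets masked entirely, which is the kind of garbling a context-aware rephraser is meant to avoid.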