AutoModerator - Moderate chat using public machine-learning based models



AutoModerator is a server mod that uses publicly available machine learning models, accessed via API, to automatically moderate your server's chat. You can even fine-tune what sort of things it is sensitive to, with 8 different moderation models to choose from. Contrary to what I might be implying in the title, this mod does not implement any machine learning in TorqueScript, as that would be horrendous. Rather, it uses Google's Perspective API to analyze chat after it is sent. This mod obviously cannot completely replace human moderators, but it can help moderate chat in extremely hectic situations, and can help keep your server more peaceful when there are no moderators online.


Features:
  • Keep your server moderated to some degree 24/7. Give your fleshbag human moderators a break.
  • Unbiased and impeccable scrutiny.
  • Highly customizable sensitivity. Choose what moderation models to care about, what models to ignore, and everything in-between.
  • Keeps records separately from clients. Clients rejoining won't make the AutoModerator forget what they have done.
  • Accountable. Admins are always notified about the AutoModerator's decisions, and those notifications are also logged in the console for your record keeping.
  • Fault tolerance. In the case of a service outage, chat is cached and analysis is re-attempted until service is available again.
  • Experimental mode. You can set AutoModerator into an 'experimental mode' in which it will never take actions, but will inform you of the actions it would take under its current settings. This makes fine-tuning your sensitivity values to fit your needs much safer.
  • AutoModerator will never betray you. Probably.
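The fault tolerance feature above can be pictured roughly like this. This is only a sketch in Python (not TorqueScript, and not the mod's actual code), assuming a hypothetical `analyze` callable that raises `ConnectionError` while the service is down. The class and method names are made up for illustration.

```python
from collections import deque


class AnalysisQueue:
    """Caches chat lines until the analysis service accepts them."""

    def __init__(self, analyze):
        # analyze: hypothetical callable that takes a chat line and returns
        # scores, raising ConnectionError while the service is unreachable.
        self.analyze = analyze
        self.pending = deque()

    def submit(self, line):
        """Queue a new chat line, then try to work through the backlog."""
        self.pending.append(line)
        return self.flush()

    def flush(self):
        """Analyze cached lines in order; stop at the first failure."""
        results = []
        while self.pending:
            line = self.pending[0]
            try:
                scores = self.analyze(line)
            except ConnectionError:
                break  # keep the line cached and retry on the next flush
            self.pending.popleft()
            results.append((line, scores))
        return results
```

Lines are only removed from the cache after a successful analysis, so an outage delays moderation without losing any chat.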


Installation:
  • Acquire a Google API key with the Perspective API enabled. Follow these instructions until you have created an API key and enabled the API.
  • Download Server_AutoModerator.zip
  • Place it into your add-ons folder.
  • Start a server with Server_AutoModerator enabled.
  • Take the API key from step 1 and do /autoModeratorSetKey <your api key>.
  • Shut down the server. This will ensure the package preloader is installed correctly and your API key saves.
  • (Optional) Open Server_AutoModerator.zip and edit settings.cs to configure your settings.
  • Host anything you like with Server_AutoModerator enabled!


Settings and configuration:

AutoModerator has a lot of settings. To edit the settings, you simply open settings.cs from within the zip file and edit the settings to suit your needs, then restart your server. The comments in the file explain what each setting does.

Of these settings, the sensitivity settings are probably the most important. These settings show the 8 models the AutoModerator can evaluate chat by. A higher value in a category will make the AutoModerator care more about that category and take action more quickly when a user's chat violates that category. A value of 0 will completely disable that category. As long as a value is lower than 11, it is impossible for the AutoModerator to take action based off of a single message, but it will take action when it sees a trend of violations in that category. How quickly it determines that trend is based off the sensitivity value.

Available moderation models:
  • Toxicity - A general model aimed at detecting behavior that will generally drive people away from discussions.
  • SevereToxicity - A general model also aimed at detecting behavior that will drive people away from discussions, but focused on much more 'severe' behavior which overlooks a lot of behavior that could be considered more acceptable in informal communities.
  • IdentityAttack - A model that detects language that attacks a person or group's identity.
  • Insult - A model that detects insults.
  • Profanity - A model that detects profane language. (Default configuration disables moderating this category)
  • Threat - A model that detects threats of harm or injury towards other people or groups.
  • SexuallyExplicit - A model that detects language that is highly sexual. (Default configuration disables moderating this category)
  • Flirtation - A model that detects flirtatious language, pickup lines, etc. (Default configuration disables moderating this category)

Default values for moderation models in settings.cs:
Code:
$autoModSensitivity["Toxicity"] = 1.5;
$autoModSensitivity["severeToxicity"] = 4;
$autoModSensitivity["IdentityAttack"] = 5;
$autoModSensitivity["Insult"] = 4;
$autoModSensitivity["Profanity"] = 0;
$autoModSensitivity["Threat"] = 2;
$autoModSensitivity["SexuallyExplicit"] = 0;
$autoModSensitivity["Flirtation"] = 0;
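For anyone curious how these eight categories line up with the Perspective API, here is a rough Python sketch that turns a sensitivity table into a request body. The attribute names and the `comment` / `requestedAttributes` fields are Perspective's own. Leaving categories with a value of 0 out of the request is an assumption on my part, and `build_request` is a hypothetical helper, not something from the mod.

```python
# Mod setting name -> Perspective API attribute name.
ATTRIBUTE_NAMES = {
    "Toxicity": "TOXICITY",
    "SevereToxicity": "SEVERE_TOXICITY",
    "IdentityAttack": "IDENTITY_ATTACK",
    "Insult": "INSULT",
    "Profanity": "PROFANITY",
    "Threat": "THREAT",
    "SexuallyExplicit": "SEXUALLY_EXPLICIT",
    "Flirtation": "FLIRTATION",
}

# Default values as listed in the post.
DEFAULT_SENSITIVITY = {
    "Toxicity": 1.5,
    "SevereToxicity": 4,
    "IdentityAttack": 5,
    "Insult": 4,
    "Profanity": 0,
    "Threat": 2,
    "SexuallyExplicit": 0,
    "Flirtation": 0,
}


def build_request(text, sensitivity):
    """Build a Perspective request body for every enabled category.

    Assumption: a sensitivity of 0 means the category is not requested at all.
    """
    requested = {
        ATTRIBUTE_NAMES[name]: {}
        for name, value in sensitivity.items()
        if value > 0
    }
    return {"comment": {"text": text}, "requestedAttributes": requested}
```

With the defaults, only Toxicity, SevereToxicity, IdentityAttack, Insult, and Threat end up in the request.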

In addition to these, there are also settings for changing the AutoModerator's name, adjusting the time of mutes, writing custom warning messages, and configuration for running your own middleware if you so desire.

Commands (Super-admin only):
  • /autoModeratorSetKey <api key>
  • /toggleAutoModerator

Attack of the middleware

Unfortunately, the Perspective API only accepts HTTPS connections. This means that Blockland cannot connect directly to it, so some 'middleware' is required: Blockland connects to the middleware via HTTP, and the middleware then makes the HTTPS connection to the API to request chat analysis. AutoModerator is pre-configured with a public middleware I am hosting, so really you don't have to do anything about this if you don't want to. However, if you wish and have the capability, you may want to run your own copy of the middleware. The greatest benefit to running your own middleware would probably be that you don't have to rely on the uptime of my server for your AutoModerator. If you want to run your own copy of the middleware, you can download a copy of it here: https://leopard.hosting.pecon.us/dl/qeorw/middleware.php. Once you have set up the middleware, you can change the middleware settings in settings.cs to configure your AutoModerator to use it.
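To give an idea of what the HTTPS half of a middleware does, here is a minimal Python sketch. It is not a port of middleware.php: the function names are hypothetical, and how the real middleware receives requests from Blockland is not shown. The endpoint URL and the `key` query parameter are the Perspective API's documented ones.

```python
import json
import urllib.parse
import urllib.request

PERSPECTIVE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_upstream_request(api_key, payload):
    """Wrap a request body in an HTTPS POST aimed at the Perspective API."""
    url = PERSPECTIVE_URL + "?" + urllib.parse.urlencode({"key": api_key})
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def forward(api_key, payload, timeout=10):
    """Send the request over HTTPS and return the decoded JSON response."""
    request = build_upstream_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A middleware would call something like `forward` for each chat line it receives over plain HTTP, then hand the scores back to the game server.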


FAQ:

Q: How does the AutoModerator punish players?
A: When a player triggers the AutoModerator, they will first receive a warning letting them know that the AutoModerator doesn't approve of their chat behavior. Subsequent incidents that trigger the AutoModerator for the same category of behavior will cause mutes of increasing duration (the durations can be modified in the configuration). The AutoModerator will never ban/kick players.
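As a rough illustration of that escalation (not the mod's real code), here is a Python sketch. The mute durations are placeholders, `player_id` stands in for whatever persistent ID the server keys its records by, and tracking each category separately is my reading of "the same category of behavior". The `experimental` flag mimics the experimental mode from the feature list by only reporting what would have happened.

```python
from collections import defaultdict

# Hypothetical mute durations in seconds; the real ones live in settings.cs.
MUTE_SECONDS = [60, 300, 900]


class OffenseRecord:
    """Tracks incidents per (player, category) and picks the next action."""

    def __init__(self, mute_seconds=MUTE_SECONDS, experimental=False):
        self.mute_seconds = list(mute_seconds)
        self.experimental = experimental
        # Keyed by a persistent player ID, so rejoining does not reset it.
        self.counts = defaultdict(int)

    def register(self, player_id, category):
        """Record one incident and return (action, mute duration in seconds)."""
        key = (player_id, category)
        self.counts[key] += 1
        incidents = self.counts[key]
        if incidents == 1:
            action = ("warn", 0)
        else:
            # Second incident gets the first mute; later ones get longer
            # mutes, capped at the last configured duration.
            index = min(incidents - 2, len(self.mute_seconds) - 1)
            action = ("mute", self.mute_seconds[index])
        if self.experimental:
            return ("would_" + action[0], action[1])
        return action
```

There is deliberately no kick or ban branch, matching the answer above.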

Q: Can this prevent spam/moderate gameplay problems?
A: No. This is only a chat moderator. And, at least for the moment, it doesn't deal with chat spam either.

Q: Will this stop me from saying [...] on servers??
A: That will depend entirely on how the host decides to configure the AutoModerator. By default, the AutoModerator is fairly lenient and will let you do just about anything at least once without triggering, but the host could easily decide to change the behavior to be more or less sensitive.

Q: Is this an invasion of privacy? Will the AutoModerator remember everything I say forever and stalk me on myspace?
A: Chat sent to the analysis API is fairly anonymized. Usernames are not sent alongside chat, and the chat typically lacks conversational context. Additionally, the fact that most AutoModerators will run through the public middleware means that the chat coming from your server will be mixed up with the chat coming from any other servers also using AutoModerator. Sorting through this information to track people or log their chat in any meaningful way would be basically impossible. That being said, all chat sent to the analysis API is recorded and used to further train the machine learning models, but it's anonymous data.


« Last Edit: December 29, 2019, 06:15:01 PM by Pecon »

this is basically what i was wanting to do for years except i have no idea how ai works

I can't wait until this add-on bans all of my black friends out of my server for using the n word


Seems like moderation for server hosts who either have stuffty moderation or are themselves handicapped. I’m for this

no more free speech

I can't download the middleware, content warning screen is looping back on itself

I'm not sure why that is happening. Download it from here in the meantime: https://pecon.us/perspective/middleware.php.txt

Edit: This is now fixed, the download in the OP should work just fine.
« Last Edit: June 17, 2019, 09:41:23 PM by Pecon »

Wow. Cool mod Pecon. Seems pretty useful for dedicated servers.

i'm all for anything that would help with toxicity, nice work! only worried if it'll detect when someone's just joking around and not meaning any harm but since it only warns and mutes then i'm sure it won't be much of an issue anyway

The future is now, old man.

this is like Terminator in Blockland haha can't wait for judgement day

wtf how am i supposed to flirt with drydess ingame now?

can i still say brother?