Battling online toxic behaviour
“rekt top for mid to do nothign all game”
“f*** you quit dota f*** you”
“**** ***** *********** *** ******* *****”
Anyone who has played an online multiplayer game is probably familiar with harassment like the examples above. While gaming in general is great, being met with personal attacks and slurs is not. According to the ADL, 81 percent of U.S. adults aged 18-45 who played online multiplayer games experienced some form of harassment in 2020.
There are different theories on how to combat harassment in games. You could, for example, simply ban the people who behave toxically. You could also deploy a carrot-and-stick approach, where you incentivise helpful behaviour and penalise bad behaviour. Although it is somewhat unclear how best to counter bad behaviour, I think we can all agree that identifying it is the first step. So how exactly would we go about identifying toxic behaviour?
We reached out to our friends at Save the Children Sweden who kindly supplied us with some chat data from the game Dota 2. Our goal is to build some kind of tool that would allow us to identify toxic behaviour.
We are going to do this by using Labelf. We start out by uploading the data and creating a dataset.
Next, we head over to the models page and create a new model.
We now connect the dataset we uploaded earlier to the model.
Once we are done, we have to set up a few things before we can start teaching the model. The first is the task question. The task question is mainly used to help the user formulate the issue they are trying to solve. In our case, we’ll use the task question “Is this text toxic?”
Now it is time to add the labels. We are only interested in whether a text is toxic or not, so we will add the labels “Toxic” and “Not toxic”. We will also add examples to each label to give the model a head start! Below is an example of an example… ;)
Once we are done, we can start labeling the data! Labelf will feed you examples, and you just press the correct label. This one is, for example, toxic!
In just a matter of minutes, Labelf starts giving you bulk recommendations. This will speed up your labeling process! Below is an example of a bulk recommendation of texts which Labelf believes should belong to the label “Toxic”.
Fast forward a bit, and we have labeled around 600 chat messages. We can now head over to the model overview and look at some data. First off, we can see that the model currently predicts that around ⅔ of all chats in this dataset are not toxic.
We can also test the model by submitting a text for it to classify!
Next, let's head over to the metrics page. As we have only labeled around 600 examples so far, the model still needs more data for both training and validation, but it will do for our example!
So, now that we have a model capable of finding toxic behaviour, we can simply go to the API page, grab an API key, and start making calls to our model!
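As a rough sketch of what such a call might look like: note that the endpoint URL, payload shape, and response format below are placeholder assumptions, not the actual Labelf API, so consult the API page for the real details.

```python
import json
import urllib.request

# Hypothetical endpoint and key -- substitute the real values from the
# Labelf API page. The request/response shapes here are assumptions.
API_URL = "https://api.example.com/models/MODEL_ID/inference"
API_KEY = "YOUR_API_KEY"

def classify(texts):
    """Send a batch of chat messages to the model and return its predictions."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps({"texts": texts}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

def top_label(prediction):
    """Pick the highest-scoring label from one prediction's score dict."""
    return max(prediction["scores"], key=prediction["scores"].get)
```

With a response shaped like `{"scores": {"Toxic": 0.92, "Not toxic": 0.08}}`, `top_label` would return `"Toxic"`.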
As I mentioned before, there are several ideas on how best to combat toxic behaviour in gaming. We won’t be creating a real-world application in this example, so there is no need to plan for how to deal with toxic behaviour once it is identified. An overall assessment of a player does, however, seem fair, so next we are going to create a model that looks for helpful and friendly behaviour!
I will not cover all the steps above again, but the principle is exactly the same: you simply set up a model with the labels “Helpful/friendly” and “other”. Once we are done, we can score a player’s chat sentiment. If you were to track this over time, you could look for trends and so on.
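One simple way to turn that second model’s output into a sentiment score is the share of a player’s messages labelled “Helpful/friendly”. A minimal sketch, assuming the scoring scheme is just this fraction (one possible choice among many):

```python
def friendliness_score(labels):
    """Fraction of a player's chat messages labelled Helpful/friendly."""
    if not labels:
        return 0.0  # no messages yet, nothing to score
    return sum(label == "Helpful/friendly" for label in labels) / len(labels)
```

Three friendly messages out of four would give a score of 0.75; tracked per game, these scores are what you would plot over time to spot trends.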
Now, let us test our models! I’ve written a small program that lets the user input typical chat data, which is then classified by calling our models. Finally, the results are scored.
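The program itself isn’t shown here, but its core logic can be sketched roughly like this; the two classifier functions are stand-ins for the actual API calls to our two models:

```python
def score_game(messages, classify_toxic, classify_friendly):
    """Tally toxic and friendly messages for one player's game.

    classify_toxic / classify_friendly stand in for calls to our two
    models; each takes a chat message and returns a label string.
    """
    toxic = sum(1 for m in messages if classify_toxic(m) == "Toxic")
    friendly = sum(1 for m in messages if classify_friendly(m) == "Helpful/friendly")
    return {"messages": len(messages), "toxic": toxic, "friendly": friendly}
```

Feeding each game’s chat log through `score_game` gives the per-game tallies used in the comparison below.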
Below you see the results from two games where we track an individual player. In the first game, I went for a more friendly tone. In game 2, I tried a more toxic tone.
Now that we have our models up and running, we could hypothetically create a system for incentivising friendly behaviour and warn/ban people based on their toxicity score.
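Such a system could be as simple as thresholds on a player’s toxicity score. In this sketch the score is the fraction of a player’s messages labelled “Toxic”, and the threshold values are made up purely for illustration:

```python
def moderation_action(toxicity_score, warn_at=0.25, ban_at=0.6):
    """Map a toxicity score in [0, 1] to a hypothetical moderation step.

    The thresholds are illustrative; a real system would tune them
    against labelled data and likely consider history, not one game.
    """
    if toxicity_score >= ban_at:
        return "ban"
    if toxicity_score >= warn_at:
        return "warn"
    return "reward"  # incentivise friendly behaviour
```

A player with 5 percent toxic messages would be rewarded, one at 30 percent warned, and one at 70 percent banned.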