Remove the wordfilter

Started by anthunk

anthunk Lv 1

The wordfilter which has recently been installed has become a major nuisance for the Foldit community.

The most pressing problem is that it blocks perfectly innocent words (e.g. "monkey", "stupid") that appear in chat. This hinders communication and adds confusion. Also, some strings (http://en.wikipedia.org/wiki/Scunthorpe_problem) have the potential to be blocked, causing the same problem.

Additionally, it fails to block the potty-mouth language it was intended to handle. There are too many ways to spell a given word for a filter to handle (this has been well documented).

The global channel already has a team of mods who effectively prevent the use of swear words. In this environment, the filter is superfluous; and on group and private channels, the use of language should be left to the owners' discretion.

Numerous folders have expressed concerns regarding this wordfilter, and so this change should be rolled back, as it has little or no utility and impedes communication in Foldit's chat.

lynnai Lv 1

I have yet to find the extent of this filter, and although the intent is harmless enough, "stupid" is not an inherently offensive word. Once in a while one needs to be able to point out that something is idiotic.

It requires very little imagination to work around the filter and if you don't have enough of that then you probably can't spell well enough for it to filter you in the first place.

Cute but daft.

tokens Lv 1

I was intrigued to find that the word "monkey" is blocked by the word filter. I never knew this was a filthy word (I blame my English teachers for not teaching me). However, given this new information, I am surprised to find papers with such provocative content as the paper "Crystal structure of a monomeric retroviral protease solved by protein folding game players". This paper repeatedly talks about some "Mason-Pfizer monkey virus (M-PMV) retroviral protease". Such papers clearly need to be filtered before they are published.

jflat06 Staff Lv 1

The words which are blocked by default are taken from a template list, and aren't necessarily tuned to Foldit.

Instead of removing it, we are working with players to add/remove words to/from the list to make it more reasonable.

anthunk Lv 1

That still doesn't address the other problems mentioned (see third and fourth paragraphs).

Also, may we obtain the wordfilter used now, in order to characterize its behavior?

jflat06 Staff Lv 1

I believe the filter was implemented at the suggestion of some of our ops. They are not always around, and during times after a paper release, such as the monkey virus paper, it can help them out to have such a filter.

No filter is ever perfect, and we are aware there are ways to circumvent the system. But we can probably tune it to catch the majority of cases while minimizing the false positives. Additionally, the filter only applies to the game's interface, so people using external clients will see no filtering. We may consider adding an option to the client to allow for disabling it as well.

I think some people have already asked for the filter list, and are adding/removing entries. Tamirh might be able to get you the list.

tamirh Lv 1

I pulled the list from another project I worked on at CGS and it happened to be one geared towards a younger audience. I forgot to trim out the tamer words from the filter. That is why you see words like 'stupid' getting filtered.

As for the Scunthorpe problem: I wrote the filter to err on the side of allowing possible profanity rather than blocking it. So anything that is blocked should be exactly on the filter list and not a false positive. If there are any bugs with this, I will address them.
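To illustrate the difference between substring matching and exact matching, here is a minimal Python sketch. It is not the actual Foldit filter, whose code is not shown in this thread; the blocklist entries and function names are mild placeholders invented for the example.

```python
import re

# Hypothetical blocklist; the real Foldit list is not shown in this thread.
BLOCKED = {"heck", "darn"}

def substring_filter(message):
    """Naive approach: flags a message if a blocked string appears anywhere in it."""
    lowered = message.lower()
    return any(word in lowered for word in BLOCKED)

def whole_word_filter(message):
    """Flags a message only when a whole token is exactly on the blocklist."""
    tokens = re.findall(r"[a-z']+", message.lower())
    return any(token in BLOCKED for token in tokens)

# "checker" contains the letters "heck", so only the naive filter flags it.
print(substring_filter("pass me the checker"))   # True
print(whole_word_filter("pass me the checker"))  # False
print(whole_word_filter("what the heck"))        # True
```

The whole-word version accepts more false negatives (anything not spelled exactly as listed slips through) in exchange for avoiding Scunthorpe-style false positives.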

As for 'monkey' being blocked: This is a bug. There was a space in the word list, so the filter was catching 'monkey' by itself when the intended profanity was a phrase with 'monkey' in it.
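One plausible way such a bug can arise is a loader that splits the list on any whitespace, so a multi-word phrase becomes several single-word entries. The Python sketch below is an assumption for illustration only, not the real loader; the phrase "monkey wrench" is a harmless stand-in and the function names are hypothetical.

```python
import re

# Hypothetical word list file contents: one entry per line, one entry is a phrase.
RAW_LIST = "darn\nmonkey wrench\n"

def load_split_on_whitespace(text):
    """Buggy loader: the space inside a phrase splits it into separate entries."""
    return set(text.lower().split())

def load_one_entry_per_line(text):
    """Fixed loader: each non-empty line is kept as a single entry."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}

def is_blocked(message, entries):
    """True if any entry appears in the message as a whole word or whole phrase."""
    normalized = " ".join(message.lower().split())
    return any(
        re.search(r"(?<![a-z])" + re.escape(entry) + r"(?![a-z])", normalized)
        for entry in entries
    )

buggy = load_split_on_whitespace(RAW_LIST)
fixed = load_one_entry_per_line(RAW_LIST)
print(is_blocked("Mason-Pfizer monkey virus", buggy))  # True: "monkey" alone is an entry
print(is_blocked("Mason-Pfizer monkey virus", fixed))  # False: only the full phrase is listed
```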

As for catching more words when people are trying to get around the filter: I have code lying around that would help do this, but I did not enable it because I thought false positives would be more annoying than false negatives. One approach involves using a phonetic reduction algorithm to catch spelling variations on profanity. This also catches things like using numbers or symbols instead of letters. The downside is that it would of course bring about false positives.
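A rough Python sketch of this kind of reduction follows. It is a crude stand-in, not the algorithm mentioned above (which is not shown in this thread): the substitution table, the vowel-dropping step, and the placeholder blocklist are all assumptions made for the example.

```python
import re

# Hypothetical mapping of common digit/symbol substitutions back to letters.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                      "5": "s", "7": "t", "@": "a", "$": "s"})

def reduce_word(word):
    """Reduce a word to a rough phonetic key."""
    word = word.lower().translate(LEET)    # undo digit/symbol substitutions
    word = re.sub(r"[^a-z]", "", word)     # drop remaining punctuation
    word = re.sub(r"(.)\1+", r"\1", word)  # collapse repeated letters
    # Crude phonetic step: keep the first letter, drop later vowels.
    return word[:1] + re.sub(r"[aeiouy]", "", word[1:])

# Placeholder blocklist, stored in reduced form.
REDUCED_BLOCKED = {reduce_word(w) for w in {"darn"}}

def is_blocked_reduced(message):
    """True if any whitespace-separated token reduces to a blocked key."""
    return any(reduce_word(tok) in REDUCED_BLOCKED for tok in message.split())

print(is_blocked_reduced("oh D@rn!"))         # True: symbol substitution is caught
print(is_blocked_reduced("oh daaarn"))        # True: stretched spelling is caught
print(is_blocked_reduced("a drone flew by"))  # True: a false positive
```

The last line shows the trade-off: "drone" reduces to the same key as the blocked word, so an innocent message is flagged.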

I am happy to edit the filter list (either adding or removing words). I have sent it out to some players, but I haven't gotten any response back as to what words they would like added or removed. Feel free to PM or email me (my username @gmail) if you would like the plain text filtered word list.

BootsMcGraw Lv 1

As a chat mod, I feel this filter is unnecessary. Enough of us are online at any given time to quickly squash any potty-mouth who gets out of line.

I still feel that the solution to the profanity problem is to chuck the idea of the "puzzle" chat room, and to replace it with an "adult" room. The room would have a key or password, clearly visible in some public place, most likely on the FoldIt web site.

You might say "what good will that do, if kids can see it?" Well, you'll never be able to keep out underaged folks who want to play with the grownups; but by making anyone who is willing to be exposed to "adult" language take an extra step to do so, you prevent people from accidentally stumbling into a room where people are freely using George Carlin's "seven dirty words you can't say on TV". And isn't that why the filter was created, in the first place?

I think I'll open my own feedback to officially suggest this.

anthunk Lv 1

An excellent idea, in fact.

The point is that the adult chatroom would be self-selected. If people wish to eschew the invective, they are free to not participate. But anyone who wants to, can. Profanity, unfettered debate, and porcupine offerings can be done here while leaving global on-topic.

I am fully in support of this.

Deleted user

I don't think an "adult" chatroom has any place on a University-run science project.

There's plenty of other places on the interweb for that kind of thing.