Creating a Banned Words Filter

I have a website with a user forum and a content management system that certain users (or anyone, in the case of the forum) can post to. I therefore needed an easy way to filter and replace certain naughty or controversial words before they are displayed on my pages. I wanted the solution to be easy to maintain and robust enough to pick up any combination of characters I chose because, let's face it, there is always a way round the system.

Solution:

  • Hold the banned word list and the replacement words in an XML document – see image on the right.
  • Each replacement word can be used with many banned words, e.g. duck, d*ck, du*k, d**k and miss are all replaced by the word scurry.
  • Use regular expressions to build a single match pattern from all the banned words that share a replacement.
  • When the banned word list needs to be updated, only the XML file needs to be sent to the server; no recompilation is required.
  • Read the input string, replace any banned words found with their alternatives and return the new string – see below for code:

    // Required namespaces (place the method inside your page or helper class)
    using System.Text.RegularExpressions;
    using System.Web;
    using System.Xml;

    // Banned words filter
    protected static string BadLanguageFilter(string content)
    {
        // Nothing to filter for null or empty input
        if (string.IsNullOrEmpty(content))
            return content;

        // Trim() returns a new string, so the result must be assigned
        content = content.Trim();

        string strPattern = "";
        string strReplacement = "";

        // The using block closes the reader even if an exception is thrown
        using (XmlTextReader reader = new XmlTextReader(
            HttpContext.Current.Server.MapPath("~/Filter/Words.xml")))
        {
            reader.WhitespaceHandling = WhitespaceHandling.None;

            // Read the XML file node by node
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Read the attributes; the last value read becomes the replacement word
                        while (reader.MoveToNextAttribute())
                            strReplacement = reader.Value;
                        break;

                    case XmlNodeType.Text:
                        // Each text node is a banned-word pattern; join them with "|"
                        if (string.IsNullOrEmpty(strPattern))
                            strPattern = reader.Value;
                        else
                            strPattern += "|" + reader.Value;
                        break;

                    case XmlNodeType.EndElement:
                        // End of a replacement element: apply its patterns.
                        // An empty pattern is skipped, as it would match at every position.
                        if (reader.Name == "replacement" && !string.IsNullOrEmpty(strPattern))
                        {
                            // Replace banned words with flowery language
                            Regex myRegex = new Regex(strPattern,
                                RegexOptions.Compiled | RegexOptions.IgnoreCase);
                            content = myRegex.Replace(content, strReplacement);
                            strPattern = "";
                            strReplacement = "";
                        }
                        break;
                }
            }
        }

        return content;
    }
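
For reference, here is a rough sketch of a Words.xml layout that the code above would read correctly. Only the element name replacement appears in the code; the root element words, the child element word and the attribute name with are placeholders I have assumed for illustration, so the real file may well use different names. The code treats each text node as a regular expression, which means a literal asterisk has to be escaped as \* in the file. Left unescaped, d*ck would mean "zero or more d, then ck" and would match the "ck" inside ordinary words.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Assumed layout: "words", "word" and "with" are hypothetical names.
     Only the "replacement" element name is taken from the code. -->
<words>
  <!-- The attribute value is the replacement word -->
  <replacement with="scurry">
    <!-- Each text node becomes one alternative in the regex pattern -->
    <word>duck</word>
    <word>d\*ck</word>
    <word>du\*k</word>
    <word>d\*\*k</word>
    <word>miss</word>
  </replacement>
</words>
```

With this file the pattern built for the first group is duck|d\*ck|du\*k|d\*\*k|miss. Note that the pattern has no word boundaries, so it also matches inside longer words; wrapping an entry in \b...\b restricts it to whole words if that is what you want.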
