Secure
ASP.NET
LANGUAGES: ALL
ASP.NET VERSIONS: 2.0
Better HTML and URL Encoding Functions
Defend Against Cross-site Scripting Attacks
By Don Kiely
The HtmlEncode and UrlEncode methods of the
HttpServerUtility class in the System.Web namespace have long provided a first
line of defense against cross-site scripting attacks. In these attacks, someone
enters text that includes script into an input box on a Web page. A simple
example is to enter this literal text into a text box that prompts for a
person's first name:
<script>alert("Ha ha! We've attacked your site!")</script>
When you redirect to another page and display what you
thought was the person's first name, an alert box pops up with nefarious text.
This is a trivial example of cross-site scripting. If you create Web
pages, you should be well aware of this kind of attack and know how to protect
against it. Google "cross-site scripting" for lots of good information.
The HtmlEncode and UrlEncode methods provide protection by
converting known bad characters in a string of text to either HTML entity
notation (such as &lt; or &#DECIMAL;) or percent-encoded single- and double-byte
notation (%XX or %uXXXX), respectively. Encoding the characters this
way keeps the browser from interpreting them as script. When you pass the <script>
code above through these methods, you get these results:
HtmlEncode
&lt;script&gt;alert(&quot;Ha ha! We've attacked your site!&quot;)&lt;/script&gt;
UrlEncode
%3cscript%3ealert(%e2%80%9cHa+ha!+We%e2%80%99ve+attacked+your+site!%e2%80%9d)%3c%2fscript%3e
These methods take a "known bad" approach to protecting
against attacks. The idea is that there are certain characters that are known
to be a problem in these kinds of attacks, notably these characters: <, >,
&, ", and characters with character codes of 160-255, inclusive. As long as
you encode those characters, you should be safe, or so goes the concept.
The key word in the previous sentence is "should." You should be safe as long
as an attacker doesn't come up with a way
to attack your Web site using other characters. Unfortunately, that's exactly what
has been happening lately, making the .NET encoding methods less useful. So
Microsoft has shifted away from a "known bad" strategy to a "known good"
strategy, with its new Anti-Cross Site Scripting Library. The idea is that you
shouldn't encode only the characters that you know are bad, because that
list changes all the time. Instead, leave alone only the characters that you
know are okay.
The library provides HtmlEncode and UrlEncode methods with the same
names as those in the .NET Framework, but they encode all characters
other than the following:
- a to z
- A to Z
- 0 to 9
- , (Comma)
- . (Period)
- - (Dash)
- _ (Underscore)
- Space (only in the HtmlEncode function; UrlEncode encodes it as %20)
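The difference between the two strategies is easy to see in a few lines of code. The following is a minimal sketch in Python, not the library's actual implementation: the function names encode_known_bad and encode_known_good are hypothetical, and space is left out of the safe set because the two library functions treat it differently.

```python
import string

# Characters the article lists as left alone by the "known good" approach.
# Space is deliberately omitted here (an assumption for this sketch).
SAFE_CHARS = set(string.ascii_letters + string.digits + ",.-_")

# Characters the article names as the "known bad" list.
KNOWN_BAD = {"<": "&lt;", ">": "&gt;", "&": "&amp;", '"': "&quot;"}


def encode_known_bad(text: str) -> str:
    """Encode only the listed bad characters and character codes 160-255."""
    out = []
    for ch in text:
        if ch in KNOWN_BAD:
            out.append(KNOWN_BAD[ch])
        elif 160 <= ord(ch) <= 255:
            out.append(f"&#{ord(ch)};")
        else:
            out.append(ch)  # everything else passes through untouched
    return "".join(out)


def encode_known_good(text: str) -> str:
    """Encode every character that is not in the safe set as &#DECIMAL;."""
    return "".join(ch if ch in SAFE_CHARS else f"&#{ord(ch)};" for ch in text)


if __name__ == "__main__":
    sample = "<b>alert('x')</b>"
    print(encode_known_bad(sample))   # parentheses and apostrophes survive
    print(encode_known_good(sample))  # only letters survive
```

With the "known bad" encoder, the parentheses and apostrophes in the sample pass through unchanged; with the "known good" encoder, only the letters do.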
When you run the <script> code above through these
new methods, here is what you get:
HtmlEncode
&#60;script&#62;alert&#40;&#8220;Ha ha&#33; We&#8217;ve attacked your site&#33;&#8221;&#41;&#60;&#47;script&#62;
UrlEncode
%3cscript%3ealert%28%u201cHa%20ha%21%20We%u2019ve%20attacked%20your%20site%21%u201d%29%3c%2fscript%3e
As you can see, far less of the original text remains in
its character format, meaning that less of the text could be considered
executable by the browser. This isn't exactly a monumental change, and the code
in the library is quite simple. However, it leaves far less
opportunity for cross-site scripting attacks to succeed.
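The UrlEncode result shown above can be reproduced with a similarly short sketch. This is an illustration in Python under stated assumptions, not the library's code: the function name url_encode_known_good is hypothetical, the safe set is taken from the list in this article, and the %xx and %uxxxx forms (lowercase hex, with space encoded as %20) are inferred from the sample output.

```python
import string

# Safe characters taken from the article's list; space is encoded,
# matching the %20 sequences in the sample UrlEncode output.
URL_SAFE_CHARS = set(string.ascii_letters + string.digits + ",.-_")


def url_encode_known_good(text: str) -> str:
    """Percent-encode every character outside the safe set.

    Character codes up to 255 become %xx; larger ones become %uxxxx.
    Characters above U+FFFF are not handled specially in this sketch.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if ch in URL_SAFE_CHARS:
            out.append(ch)
        elif code <= 0xFF:
            out.append(f"%{code:02x}")
        else:
            out.append(f"%u{code:04x}")
    return "".join(out)


if __name__ == "__main__":
    attack = "<script>alert(\u201cHa ha! We\u2019ve attacked your site!\u201d)</script>"
    print(url_encode_known_good(attack))
```

The sample input uses curly quotation marks (U+201C, U+201D, and U+2019) because those are the characters that appear in the encoded output.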
One difference in the AntiXSSLibrary versions of the
HtmlEncode and UrlEncode functions is that each has only a single
overload, which takes a string and returns the encoded string. The .NET
Framework versions also have an overload that takes both a
string and a TextWriter object and writes the resulting output to the
specified output stream. While you can easily code around this to use the
AntiXSSLibrary versions, it could break some code, so be careful if you use
the new functions in existing applications.
This initial release contains the binaries for versions
1.x and 2.0 of the .NET Framework. You can download the library here (http://www.microsoft.com/downloads/details.aspx?FamilyID=9A2B9C92-7AD9-496C-9A89-AF08DE2E5982&displaylang=en).
Don Kiely, MVP,
MCSD, is a senior technology consultant, building custom applications as well
as providing business and technology consulting services. His development work
involves tools such as SQL Server, Visual Basic, C#, ASP.NET, and Microsoft
Office. He writes regularly for several trade journals, and trains developers
in database and .NET technologies. You can reach Don at donkiely@computer.org and read
his blog at http://www.sqljunkies.com/weblog/donkiely/.