Sunday, February 26, 2006

New HTMLEncoding and URLEncoding

A widely used way to reduce the risk posed by cross-site scripting (XSS) is to encode all untrusted input into a non-executable form before rendering it as output. System.Web.HttpUtility.HtmlEncode is one method provided by Microsoft that encodes characters into safer HTML forms.

This approach looks for bad characters in the input, which assumes you have anticipated every invalid input an attacker might attempt. It can protect an application against XSS attacks, but only to the extent that those assumptions were correct. For example, some of the currently valid encodings of the character “<” are listed below (I have separated each encoded value with a ':').

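< : %3C : &lt : &lt; : &LT : &LT; :
&#60 : &#060 : &#0060 : &#00060 : &#000060 : &#0000060 :
&#60; : &#060; : &#0060; : &#00060; : &#000060; : &#0000060; :
&#x3c : &#x03c : &#x003c : &#x0003c : &#x00003c : &#x000003c :
&#x3c; : &#x03c; : &#x003c; : &#x0003c; : &#x00003c; : &#x000003c; :
&#X3c : &#X03c : &#X003c : &#X0003c : &#X00003c : &#X000003c :
&#X3c; : &#X03c; : &#X003c; : &#X0003c; : &#X00003c; : &#X000003c; :
&#x3C : &#x03C : &#x003C : &#x0003C : &#x00003C : &#x000003C :
&#x3C; : &#x03C; : &#x003C; : &#x0003C; : &#x00003C; : &#x000003C; :
&#X3C : &#X03C : &#X003C : &#X0003C : &#X00003C : &#X000003C :
&#X3C; : &#X03C;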
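
To make the point concrete, here is a small Python sketch, offered as an illustration only and not taken from either library. It generates the percent-encoded, named, and zero-padded numeric variants of "<" and checks that every one of them decodes back to the same character. The helper names lt_variants and decode are hypothetical; html.unescape and urllib.parse.unquote come from the Python standard library.

```python
from html import unescape
from urllib.parse import unquote


def lt_variants():
    """Build encoded spellings of '<' (hypothetical helper for illustration)."""
    # Literal, percent-encoded, and named-entity forms.
    variants = ["<", "%3C", "&lt", "&lt;", "&LT", "&LT;"]
    # Numeric references: decimal 60, and hex 3c in every case combination.
    families = [("&#", "60"), ("&#x", "3c"), ("&#X", "3c"),
                ("&#x", "3C"), ("&#X", "3C")]
    for prefix, digits in families:
        for terminator in ("", ";"):      # the semicolon is optional
            for padding in range(6):      # zero to five leading zeros
                variants.append(prefix + "0" * padding + digits + terminator)
    return variants


def decode(value):
    """Undo percent-encoding first, then HTML character references."""
    return unescape(unquote(value))


if __name__ == "__main__":
    forms = lt_variants()
    print(len(forms), "spellings generated")
    print("all decode to '<':", all(decode(f) == "<" for f in forms))
```

A filter that blocks "bad" input has to recognize every one of these spellings, plus any it has not thought of yet.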

Tough, isn't it? The Anti-Cross Site Scripting Library V1.0 by Microsoft takes the opposite approach: it allows only known-good input and rejects everything else. Allowing only what is known to be good is more comprehensive than trying to block everything that might be bad, because it does not depend on anticipating every attack.
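
The allow-list idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the library's actual implementation: the function name allowlist_html_encode and the exact set of safe characters (ASCII letters, digits, space, comma, period, dash, underscore) are chosen here for the example.

```python
import string

# Assumed allow-list for illustration: characters passed through unchanged.
SAFE_CHARS = frozenset(string.ascii_letters + string.digits + " ,.-_")


def allowlist_html_encode(text: str) -> str:
    """Encode every character outside the allow-list as a decimal
    numeric character reference (hypothetical helper)."""
    return "".join(
        ch if ch in SAFE_CHARS else f"&#{ord(ch)};"
        for ch in text
    )


if __name__ == "__main__":
    print(allowlist_html_encode("<script>alert('x')</script>"))
```

Because anything not explicitly trusted is encoded, the function never needs a list of dangerous characters, so a new attack spelling cannot slip past it simply by being unknown.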

You can download the installer here. The library adds HtmlEncode and UrlEncode methods that are called the same way as their System.Web.HttpUtility counterparts (HttpUtility.HtmlEncode and HttpUtility.UrlEncode), but they are exposed as AntiXSSLibrary.HtmlEncode and AntiXSSLibrary.UrlEncode!
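
The same allow-list idea carries over to URL encoding. The Python sketch below is again only an illustration under assumptions: allowlist_url_encode is a hypothetical name, and the safe set (ASCII letters, digits, dash, underscore, period) is picked for the example rather than copied from the library. Everything else is percent-encoded byte by byte using UTF-8.

```python
import string
from urllib.parse import unquote

# Assumed allow-list for illustration: characters left as-is in a URL value.
SAFE_URL_CHARS = frozenset(string.ascii_letters + string.digits + "-_.")


def allowlist_url_encode(text: str) -> str:
    """Percent-encode every character outside the allow-list
    (hypothetical helper), using the UTF-8 bytes of the character."""
    pieces = []
    for ch in text:
        if ch in SAFE_URL_CHARS:
            pieces.append(ch)
        else:
            pieces.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(pieces)


if __name__ == "__main__":
    encoded = allowlist_url_encode("name=<b>&x=1")
    print(encoded)
    # Decoding with the standard library recovers the original value.
    print(unquote(encoded))
```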