ControlFreak
LANGUAGES: VB.NET | C#
ASP.NET VERSIONS: 2.x
CAPTCHASP
Defend Your ASP.NET Web Sites against Evil Bots
By Steve C. Orr
Robots are taking control of the Internet! Don't let them
overwhelm your Web site with their unrelenting, self-serving probes. Now you
can fight back with this free control that allows you to discriminate between
human and computer visitors.
While this might sound like a sci-fi promotion for the
next Terminator or Transformers movie, in a way, that
ominous sci-fi future is already here. But don't be too afraid; just like in
the movies, there are robots here to help us, too.
Not All Robots Are Bad
Robots are automated software systems that perform
functions normally expected to be done by people. They send e-mail, surf the Web,
send instant messages, etc. Such bots can be used for good. For example, Google's
multitude of bots surf virtually all public Web sites and collect bits of
information that it uses to help people search and find those Web sites. Google's
bots are generally considered to be respected and responsible members of the
Internet, because they abide by requests for privacy and, in exchange for the
small amount of shared Internet resources they consume, provide a useful
service that's valuable to nearly everybody.
The problem is that bad people can make bots, too, and
their bots may do bad things. At their worst, bots have been used to take
control of unsuspecting victims' computers. When a hacker owns enough
computers through such a process, their army of zombie PCs can march across the
Internet doing evil deeds, such as swiping credit card numbers, breaking into
more computer systems, and taking down major Web sites by overwhelming them
with phony Web page requests.
Then there are the bots in the middle ground. Maybe their
creators didn't intend for them to do bad things, but, nevertheless, many
people consider them to be nuisances or worse. In the early days of the World
Wide Web, the idea of bots caught fire, and this borderline kind of bot ran
rampant and unhindered. This caused problems for many pioneers of the Internet.
Companies such as eBay and Ticketmaster were dominant on
the Web in their respective industries; their tiny competitors were struggling,
in comparison. They couldn't compete with Ticketmaster if they didn't have
tickets to sell, and they couldn't compete with eBay without plenty of auctions
to attract members. So they made bots that would simply use these big-name
sites in the background. If you bought a ticket from their budding Web site,
their bots would actually go to Ticketmaster's site and buy the ticket and then
pass it on to you. This might not have worked out so badly if it weren't for
the fact that these bots had to constantly probe the sites they were using to
keep their lists of data fresh and in sync. In time, and as such tactics caught
on, the number of bots increased, and their constant automated requests started
to take a heavy toll on the servers they were accessing. Ticketmaster and eBay
were getting irked that their competitors were making money by consuming their
expensive data, resources, and computer systems without permission.
Friendly tactics were tried at first, such as asking these
little companies to go away and buy their own computers. That didn't work,
mostly because these little companies couldn't afford to compete with the big
boys. Then legal actions were tried. But even today there are precious few laws
regarding such matters, and back then there were virtually no laws covering
such newfangled concepts. So what was the solution to be?
Yahoo had a bot problem, too. Its free e-mail system was
being abused by spam bots that were automatically signing up for thousands of e-mail
accounts and then exploiting them to send junk e-mail to people all over the
world. Yahoo enlisted the help of a Carnegie Mellon University
team that came up with a brilliant technological solution.
CAPTCHA to the Rescue
CAPTCHA effectively immunizes a Web site against bots. It
stands for "Completely Automated Public Turing Test To Tell Computers and
Humans Apart" (http://www.captcha.net). Its
foundation lies in the fact that while computers are brilliantly useful at some
kinds of things (like calculating equations and tracking data), there are still
many tasks that the human brain is much better equipped to handle.
For example, a human can glance at a painting such as the
Mona Lisa and see in an instant, without even thinking about it, that it's a
picture of a beautiful woman sitting down with a demure smile. On the other
hand, even today's most cutting-edge optical recognition computer systems would
be hard-pressed to even be able to tell you definitively that there is a human
being in such a picture.
Optical character recognition systems (which read text
from an image) fare a little better, primarily because there are a finite
number of characters in the alphabet. However, even these programs work only
semi-reliably when under optimal conditions: clear black print on a plain white
background. Colors, blurriness, fancy fonts, handwriting, symbols, embedded
pictures, and crooked text are just a few of the common conditions that tend to
confuse optical character recognition systems, making them unable to recognize
the writing contained in a scanned image.
So these days, Web sites that wish to defend themselves
against bots use the CAPTCHA concept: they display a picture that contains
crooked, colorful, blurry text in varying fonts and ask the user to type in
what they see (see Figure 1). It is a simple task for any legitimate human
user, but a virtually insurmountable chore for bots. Therefore, a Web site can
be reasonably sure that any user who makes it through its CAPTCHA gateway is
a person, not a computer.
Figure 1: Ticketmaster implements an
advanced CAPTCHA system that lets users in and keeps bots out.
Using CAPTCHASP
CAPTCHASP is a custom Web control I created to easily add
CAPTCHA verification to any ASP.NET Web site (see Figure 2). Simply drag the
CAPTCHASP.DLL onto your Visual Studio toolbox, then drag it from there onto any
Web form and you've got instant CAPTCHA (see end of article for download
details). Unlike most controls that generate images, there is no dependency on
outside pages, resources, HTTP Handlers, or web.config settings. This is because
of a novel development technique I used that will be detailed in next month's
follow-up article that delves into CAPTCHASP's source code. Simple, standard
xcopy deployment is all that's needed, and it's virtually impossible to mess
up.
Figure 2: The CAPTCHASP control can
be dragged onto any ASP.NET Web form to provide instant, highly customizable
CAPTCHA verification.
After the control's been dropped on your Web form, the
only other thing that's vitally important for you to know is that the control
will raise its UserVerified event when the user has entered the correct
codeword and therefore been proven to be a real human. From that event you can
then choose to let the user in and perhaps set some kind of flag to remember
that the user has successfully been verified.
This is all you really need to know to use CAPTCHASP, but I
suggest you keep reading to learn how to take advantage of the many optional
features the control offers (see Figure 3 for the complete list of CAPTCHASP
events).
| CAPTCHASP Events | Parameter | Description |
|---|---|---|
| UserVerified | n/a | This event is raised when the user has entered the correct code and therefore been proven to be human. Alternatively, you could ignore this event and instead call the Validate method followed by a check of the IsValid property. |
| VerificationFailure | FailCount (Integer), ByVal (in) | This event is raised when the user has entered an incorrect code. The FailCount parameter specifies how many consecutive times they've failed to enter the correct code. An exception will be thrown after 15 invalid attempts, so you may want to handle this exception or deal with the suspected bot in some other way before that happens. |
| CodeWordSelection | CodeWord (String), ByRef (in/out) | This event is raised when it's time to choose a codeword for display in the image portion of the control. The control's suggested codeword will be provided by the modifiable CodeWord parameter, unless the CodeWordType property is set to Custom, in which case you'll be required to provide your own codeword via the CodeWord parameter. |

Figure 3: CAPTCHASP has three events that supply potentially useful information to the
page.
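To make the event flow in Figure 3 concrete, here is a minimal C# sketch of a stand-in class that behaves the way the table describes. It is not the CAPTCHASP control: the class name, the delegate signature, the EnteredText property, the exception type, and the choice to throw on the 15th consecutive failure are all assumptions made for illustration. Only the event names, Validate, IsValid, FailCount, and the limit of 15 come from the article.

```csharp
using System;

// Illustrative stand-in only: this is NOT the CAPTCHASP class, and the
// delegate signature below is an assumption made for this sketch.
public delegate void VerificationFailureSketchHandler(int failCount);

public class CaptchaStandIn
{
    public const int MaxFailures = 15; // limit quoted in Figure 3

    public event EventHandler UserVerified;
    public event VerificationFailureSketchHandler VerificationFailure;

    private readonly string codeWord;
    private int failCount;
    private bool isValid;

    public CaptchaStandIn(string codeWord)
    {
        this.codeWord = codeWord;
    }

    // Hypothetical property holding what the user typed.
    public string EnteredText { get; set; }

    public bool IsValid { get { return isValid; } }
    public int FailCount { get { return failCount; } }

    // Checks the entry, raises the matching event, and counts
    // consecutive failures.
    public void Validate()
    {
        isValid = string.Equals(EnteredText, codeWord,
                                StringComparison.OrdinalIgnoreCase);
        if (isValid)
        {
            failCount = 0; // assumption: a success resets the streak
            if (UserVerified != null) UserVerified(this, EventArgs.Empty);
            return;
        }

        failCount++;
        if (VerificationFailure != null) VerificationFailure(failCount);

        // Assumption: the exception fires on the 15th consecutive failure.
        if (failCount >= MaxFailures)
            throw new InvalidOperationException("Too many failed CAPTCHA attempts.");
    }
}
```

A page using a class shaped like this could either subscribe to UserVerified or call Validate and then read IsValid. A VerificationFailure handler is a natural place to react to a suspected bot before the failure limit is reached.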
Choosing CodeWords
The CodeWord is the string of CAPTCHA characters the user sees in
the image and types in to be validated. The CodeWord may be a series of random
characters, an actual word, or some combination thereof, depending on how
CAPTCHASP has been configured.
By default, CAPTCHASP's CodeWordType property is set to
its RandomCharacters enumeration value. This will cause the control to generate
a random series of lowercase characters. The AddSymbols property, when set to
its default value of True, will mix in some symbol characters, as well. The
number of characters generated for each CodeWord is determined by the
NumberOfCharacters property, which has a default value of 5, a minimum value of
3, and a maximum value of 10.
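The random-character mode described above can be sketched in a few lines of C#. This is an illustration of the idea, not the control's source: the symbol set, the decision to clamp out-of-range lengths rather than reject them, and the use of a cryptographic random number generator are all assumptions. Only the lowercase letters, the optional symbols, and the 3 to 10 length range come from the article.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class RandomCodeWordSketch
{
    private const string Letters = "abcdefghijklmnopqrstuvwxyz";
    private const string Symbols = "@#$%&?"; // hypothetical symbol set

    // Builds a random codeword of lowercase letters, optionally mixed
    // with symbols.
    public static string Generate(int numberOfCharacters, bool addSymbols)
    {
        // Assumption: clamp to the 3..10 range given for NumberOfCharacters.
        int length = Math.Max(3, Math.Min(10, numberOfCharacters));
        string alphabet = addSymbols ? Letters + Symbols : Letters;

        byte[] randomBytes = new byte[length];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(randomBytes);
        }

        StringBuilder codeWord = new StringBuilder(length);
        foreach (byte b in randomBytes)
        {
            // The modulo introduces a slight bias, acceptable for a sketch.
            codeWord.Append(alphabet[b % alphabet.Length]);
        }
        return codeWord.ToString();
    }
}
```

Calling `RandomCodeWordSketch.Generate(5, true)` mirrors the defaults described in the text: five characters with symbols mixed in.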
If the CodeWordType property is set to the UseWordList
enumeration value, CAPTCHASP will randomly choose a CodeWord from the comma-separated
list of words in the WordList property. The WordList property comes
pre-populated with a list of more than 150 English words chosen to be
clear to humans but confusingly similar to one another for bots. Of course, you can
add to this list, modify it, replace it with your own list, or customize it in
any way you wish.
In all cases, the control's randomly chosen CodeWord is
sent as a parameter through the CodeWordSelection event. This gives the page
code a chance to observe the value or request a different CodeWord be randomly
generated by calling the GenerateNewCodeWord method. The CodeWord parameter is
also modifiable so you can optionally replace the CodeWord with one of your own
choosing.
However, if the CodeWordType property is set to the Custom
enumeration value, the CodeWordSelection event's CodeWord parameter becomes
required; the CAPTCHASP control will not take the time to randomly select a
CodeWord. Instead, the page will be expected to supply it with one. This
option is nice for situations where you want to always use your own function to
generate a custom CodeWord or dynamically grab one from a data source of your
choosing (see Figure 4).
| CodeWord-related Properties | Property Type | Description |
|---|---|---|
| CodeWordType | Enumeration | Specifies whether the CAPTCHASP control should automatically generate a random series of letters for the CodeWord, whether it should randomly choose a word from the WordList property, or whether you prefer to supply it with a custom CodeWord. |
| AddSymbols | Boolean | When set to its default value of True and the CodeWordType property is set to its default of RandomLetters, symbol characters will be randomly mixed in with lowercase letters to create the CodeWord. |
| NumberOfCharacters | Byte | When the CodeWordType property is set to its default of RandomLetters, this property specifies how many randomly generated characters each CodeWord should contain. |
| WordList | String | A comma-separated list of words from which the control will randomly pick a CodeWord when the CodeWordType property is set to UseWordList. There are more than 150 pre-populated default words. |

Figure 4: The CAPTCHASP control provides a variety of ways to customize the CodeWords that
are displayed to the user.
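The word-list mode and the by-reference CodeWord parameter can be illustrated with a small C# sketch. The class, method, and delegate names here are hypothetical, and the control's real event signature may differ; the sketch only shows the pattern of picking a word from a comma-separated list and then giving the page a chance to inspect or replace it.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of a handler that receives the suggestion by reference.
public delegate void CodeWordSelectionSketchHandler(ref string codeWord);

public static class WordListSketch
{
    // Splits a comma-separated list, trimming spaces and dropping blanks.
    public static List<string> ParseWordList(string wordList)
    {
        List<string> words = new List<string>();
        foreach (string raw in wordList.Split(','))
        {
            string word = raw.Trim();
            if (word.Length > 0) words.Add(word);
        }
        return words;
    }

    // Picks a random word, then lets the caller observe or replace it.
    public static string Choose(string wordList, Random random,
                                CodeWordSelectionSketchHandler onSelection)
    {
        List<string> words = ParseWordList(wordList);
        if (words.Count == 0)
            throw new ArgumentException("The word list contains no words.", "wordList");

        string codeWord = words[random.Next(words.Count)];
        if (onSelection != null) onSelection(ref codeWord);
        return codeWord;
    }
}
```

Because the handler takes the codeword by reference, it can leave the suggestion untouched or overwrite it, which is the same choice the CodeWordSelection event is described as offering.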
For security reasons, CodeWords of at least three characters
are required. For clarity and usability reasons, you should avoid supplying
CodeWords of more than 10 characters. It's also good to be aware that the
letter "l" and number "1" look confusingly similar; therefore, you may want to
avoid using one or both of them. Likewise, the number zero ("0") is often
confused with the letter "o". The CAPTCHASP control takes these issues into
consideration when generating its random CodeWords. One way it does this is by
not using any numbers. Additionally, user input is case-insensitive, so users
need not worry about accidentally entering an uppercase "O" where a similar
looking lowercase "o" was expected. And the default WordList contains neither
numbers nor the letter "l".
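If you supply your own CodeWords, a small vetting helper can enforce the guidance above before a word is ever shown. The following C# sketch is hypothetical and not part of CAPTCHASP. It treats the advice as a strict rule, accepting only words of 3 to 10 characters that contain no digits and no letter "l", which matches how the default WordList is described.

```csharp
using System;

public static class CodeWordVetting
{
    // Returns true when a custom codeword follows the article's guidance:
    // 3 to 10 characters, no digits, and no lowercase "l".
    public static bool IsAcceptable(string codeWord)
    {
        if (codeWord == null || codeWord.Length < 3 || codeWord.Length > 10)
            return false;

        foreach (char c in codeWord)
        {
            // Digits are easily confused with letters (1/l, 0/o).
            if (char.IsDigit(c) || c == 'l')
                return false;
        }
        return true;
    }
}
```

A page that generates CodeWords from its own data source could call `CodeWordVetting.IsAcceptable` on each candidate and skip any that fail.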
Cosmetic Customizations
The look and feel of CAPTCHASP can be configured in a
variety of ways. Virtually every aspect of the control's appearance can be
altered via properties and styles. Figure 5 demonstrates many of the control's
optional user interface elements and customizations. I don't necessarily
recommend altering the control's appearance to this much of an extreme, but it's
nice to know you can.
Figure 5: Virtually every aspect of
CAPTCHASP's appearance (including several optional elements) can be altered via
properties and styles, even to ugly extremes such as this!
The optional title area (at the top of Figure 5 in green)
can be shown by setting the TitleText property to the text you'd like to appear
there. You can adjust the TitleStyle property elements to change how it looks
in a variety of standard ways.
The InstructionText property can be used to change the
text that is displayed above the textbox. The InstructionStyle property
elements can be used to adjust its look in many ways, and have been used in
Figure 5 to apply a fancy italic font.
A hyperlink can be displayed to explain in more detail why
the user must go through this process. This "Why?" element (shown in blue in
Figure 5) pops up a customizable message when clicked. The WhyStyle property
elements can be used to adjust the look and feel of this hyperlink. The
CodeWord entry textbox can also be adjusted in a variety of ways via the
TextEntryStyle property elements. Figure 5 demonstrates this with purple text.
The Submit button (shown in orange in Figure 5) has
ButtonStyle property elements associated with it to adjust its appearance. The
ButtonText property can be used to change the button text from its default of
Submit. The ShowSubmitButton Boolean property can be changed to False to make
the button invisible, in case you'd like to implement your own submit button
(or link) elsewhere on the page. Such a custom submit element would need to
call the CAPTCHASP control's Validate method to trigger the control to check
whether the user's entry is correct.
If the user enters the wrong code, the FailMessage will
appear, as shown in Figure 5 in red. The FailMessageStyle property elements can
be used to adjust visual aspects, and the FailMessageText property can be used
to change what it says.
Finally, there is an optional ChangeCodeWord hyperlink
that can be shown at the bottom of the control. Figure 5 displays this link
highlighted in yellow via the ChangeCodeWordStyle property elements. The
ChangeCodeWordText property can be used to change what the link says. The
ShowChangeCodeWordLink property can be changed to True (from its default of
False) to get this link to appear. When the user clicks this link the control
will generate and display a new CodeWord. The FailCount property will be
incremented each time this happens to help prevent abuse by any cherry-picking
bots that feel brave enough to attempt to decode a CAPTCHA image. Instead of
displaying this built-in link, you could implement your own link elsewhere on
the page to change the CodeWord. It would simply need to call CAPTCHASP's
server-side GenerateNewCodeWord method.
Conclusion
You should now understand what CAPTCHA is, as well as how
and why it came to be. With the CAPTCHASP control you can now easily immunize
your ASP.NET Web site to keep bots out and let legitimate users in.
The CAPTCHASP control is freely downloadable to everyone. You
can download it or try it out live right now from the demo pages I've assembled
at http://SteveOrr.net/demo/CAPTCHASP.
Additionally, asp.netPRO subscribers
can download the complete source code for the CAPTCHASP control in both VB.NET
and C# flavors.
Stay tuned next month, when we'll examine the
architecture of the CAPTCHASP control, including the source
code of some of its more interesting and innovative functions.
Sample code accompanying
this article is available for download.
Steve C. Orr is an
ASPInsider, MCSD, Certified ScrumMaster, Microsoft MVP in ASP.NET, and author
of the book Beginning ASP.NET 2.0 AJAX by Wrox. He's
been developing software solutions for leading companies in the Seattle
area for more than a decade. When he's not busy designing software systems or
writing about them, he can often be found loitering at local user groups and
habitually lurking in the ASP.NET newsgroup. Find out more about him at http://SteveOrr.net or e-mail him at Steve@Orr.net.