CAPTCHASP

August 09, 2007 12:08 AM
DevConnections

ControlFreak

LANGUAGES: VB.NET | C#

ASP.NET VERSIONS: 2.x

 

Defend Your ASP.NET Web Sites against Evil Bots

Robots are taking control of the Internet! Don't let them overwhelm your Web site with their unrelenting, self-serving probes. Now you can fight back with this free control that allows you to discriminate between human and computer visitors.

While this might sound like a sci-fi promotion for the next Terminator or Transformers movie, in a way, that ominous sci-fi future is already here. But don't be too afraid: just like in the movies, there are robots here to help us, too.

Not All Robots Are Bad

Robots are automated software systems that perform functions normally expected to be done by people. They send e-mail, surf the Web, send instant messages, etc. Such bots can be used for good. For example, Google's multitude of bots surf virtually all public Web sites and collect bits of information that Google uses to help people search and find those Web sites. Google's bots are generally considered to be respected and responsible members of the Internet, because they abide by requests for privacy and, in exchange for the small amount of shared Internet resources they consume, provide a useful service that's valuable to nearly everybody.

The problem is that bad people can make bots, too, and their bots may do bad things. At their worst, bots have been used to take control of unsuspecting victims' computers. When a hacker "owns" enough computers through such a process, their army of zombie PCs can march across the Internet doing evil deeds, such as swiping credit card numbers, breaking into more computer systems, and taking down major Web sites by overwhelming them with phony Web page requests.

Then there are the bots in the middle ground. Maybe their creators didn't intend for them to do bad things, but, nevertheless, many people consider them to be nuisances, or worse. In the early days of the World Wide Web, the idea of bots caught fire, and this borderline kind of bot ran rampant and unhindered. This caused problems for many pioneers of the Internet.

Companies such as eBay and Ticketmaster were dominant on the Web in their respective industries; their tiny competitors were struggling in comparison. These competitors couldn't compete with Ticketmaster if they didn't have tickets to sell, and they couldn't compete with eBay without plenty of auctions to attract members. So they made bots that would simply use these big-name sites in the background. If you bought a ticket from their budding Web site, their bots would actually go to Ticketmaster's site, buy the ticket, and then pass it on to you. This might not have worked out so badly if it weren't for the fact that these bots had to constantly probe the sites they were using to keep their lists of data fresh and in sync. In time, as such tactics caught on, the number of bots increased, and their constant automated requests started to take a heavy toll on the servers they were accessing. Ticketmaster and eBay were getting irked that their competitors were making money by consuming their expensive data, resources, and computer systems without permission.

Friendly tactics were tried at first, such as asking these little companies to go away and buy their own computers. That didn't work, mostly because these little companies couldn't afford to compete with the big boys. Then legal actions were tried. But even today there are precious few laws regarding such matters, and back then there were virtually no laws covering such newfangled concepts. So what was the solution to be?

Yahoo had a bot problem, too. Its free e-mail system was being abused by spam bots that were automatically signing up for thousands of e-mail accounts and then exploiting them to send junk e-mail to people all over the world. Yahoo enlisted the help of a Carnegie Mellon University team that came up with a brilliant technological solution.

CAPTCHA to the Rescue

CAPTCHA effectively immunizes a Web site against bots. It stands for "Completely Automated Public Turing Test To Tell Computers and Humans Apart" (http://www.captcha.net). Its foundation lies in the fact that while computers are brilliantly useful at some kinds of things (like calculating equations and tracking data), there are still many tasks that the human brain is much better equipped to handle.

For example, a human can glance at a painting such as the Mona Lisa and see in an instant, without even thinking about it, that it's a picture of a beautiful woman sitting down with a demure smile. On the other hand, even today's most cutting-edge optical recognition computer systems would be hard-pressed to tell you definitively that there is a human being in such a picture.

Optical character recognition systems (which "read" text from an image) fare a little better, primarily because there is a finite number of characters in the alphabet. However, even these programs work only semi-reliably, and only under optimal conditions: clear black print on a plain white background. Colors, blurriness, fancy fonts, handwriting, symbols, embedded pictures, and crooked text are just a few of the common conditions that tend to confuse optical character recognition systems, making them unable to recognize the writing contained in a scanned image.

So these days, Web sites that wish to defend themselves against bots use the CAPTCHA concept: they display a picture that contains crooked, colorful, blurry text in varying fonts and ask the user to type in what they see (see Figure 1). It is a simple task for any legitimate human user, but a virtually insurmountable chore for bots. Therefore, a Web site can be reasonably sure that any user who makes it through its CAPTCHA gateway is a person, not a computer.


Figure 1: Ticketmaster implements an advanced CAPTCHA system that lets users in and keeps bots out.

Using CAPTCHASP

CAPTCHASP is a custom Web control I created to easily add CAPTCHA verification to any ASP.NET Web site (see Figure 2). Simply drag the CAPTCHASP.DLL onto your Visual Studio toolbox, then drag it from there onto any Web form and you've got instant CAPTCHA (see end of article for download details). Unlike most controls that generate images, there is no dependency on outside pages, resources, HTTP Handlers, or web.config settings. This is because of a novel development technique I used that will be detailed in next month's follow-up article that delves into CAPTCHASP's source code. Simple, standard xcopy deployment is all that's needed, and it's virtually impossible to mess up.


Figure 2: The CAPTCHASP control can be dragged onto any ASP.NET Web form to provide instant, highly customizable CAPTCHA verification.

After the control's been dropped on your Web form, the only other thing that's vitally important for you to know is that the control will raise its UserVerified event when the user has entered the correct codeword and has therefore been proven to be a real human. From that event you can then choose to let the user in and perhaps set some kind of flag to remember that the user has successfully been verified.

This is all you really need to know to use CAPTCHASP, but I suggest you keep reading to learn how to take advantage of the many optional features the control offers (see Figure 3 for the complete list of CAPTCHASP events).

CAPTCHASP Events

| Event | Parameter | Description |
| --- | --- | --- |
| UserVerified | n/a | This event is raised when the user has entered the correct code and therefore been proven to be human. Alternatively, you could ignore this event and instead call the Validate method followed by a check of the IsValid property. |
| VerificationFailure | FailCount (Integer), ByVal (in) | This event is raised when the user has entered an incorrect code. The FailCount parameter specifies how many consecutive times they've failed to enter the correct code. An exception will be thrown after 15 invalid attempts, so you may want to handle this exception or deal with the suspected bot in some other way before that happens. |
| CodeWordSelection | CodeWord (String), ByRef (in/out) | This event is raised when it's time to choose a codeword for display in the image portion of the control. The control's suggested codeword will be provided by the modifiable CodeWord parameter, unless the CodeWordType property is set to Custom, in which case you'll be required to provide your own codeword via the CodeWord parameter. |

Figure 3: CAPTCHASP has three events that supply potentially useful information to the page.
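
As a rough illustration of how a page might react to the first two events in Figure 3, the following C# sketch shows possible handler bodies. Everything named here is an assumption made for the example: the class and method names, the "IsHuman" flag, the dictionary standing in for ASP.NET session state, and the cutoff of 10 failures (picked only to stay below the 15 attempts at which the control is said to throw). The parameter shapes follow Figure 3; how the methods get wired to the control's events is not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of CAPTCHASP.
public static class CaptchaEventSketch
{
    // Assumed cutoff, deliberately lower than the 15 failed attempts
    // after which the article says the control throws an exception.
    public const int MaxFailuresBeforeBlock = 10;

    // Possible body for a UserVerified handler (the event has no parameters).
    // The dictionary stands in for session state in this sketch.
    public static void OnUserVerified(IDictionary<string, object> session)
    {
        session["IsHuman"] = true; // remember that this visitor passed
    }

    // Possible body for a VerificationFailure handler; FailCount arrives by value.
    // Returns true when the page should stop offering retries.
    public static bool OnVerificationFailure(int failCount)
    {
        return failCount >= MaxFailuresBeforeBlock;
    }
}
```

A page could check the remembered flag on later requests instead of showing the CAPTCHA again, and could redirect or log the visitor once OnVerificationFailure returns true.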

 

Choosing CodeWords

The "CodeWord" is the string of CAPTCHA characters the user sees in the image and types in to be validated. The CodeWord may be a series of random characters, an actual word, or some combination thereof, depending on how CAPTCHASP has been configured.

By default, CAPTCHASP's CodeWordType property is set to its RandomCharacters enumeration value. This will cause the control to generate a random series of lowercase characters. The AddSymbols property, when set to its default value of True, will mix in some symbol characters, as well. The number of characters generated for each CodeWord is determined by the NumberOfCharacters property, which has a default value of 5, a minimum value of 3, and a maximum value of 10.
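
To make the behavior described above concrete, here is a minimal C# sketch of random CodeWord generation. It is not CAPTCHASP's source code. The symbol set, the choice to clamp out-of-range lengths rather than reject them, and the class and method names are all assumptions made for illustration; only the 3 to 10 range, the lowercase letters, and the optional symbols come from the description.

```csharp
using System;
using System.Text;

// Illustrative sketch only; this is not CAPTCHASP's source code.
public static class RandomCodeWordSketch
{
    private const string Letters = "abcdefghijklmnopqrstuvwxyz";
    private const string Symbols = "@#$%&+=?"; // assumed symbol set

    // numberOfCharacters and addSymbols mirror the property names in the article.
    public static string Generate(Random rng, int numberOfCharacters, bool addSymbols)
    {
        // Assumption: lengths outside the documented 3..10 range are clamped.
        int length = Math.Max(3, Math.Min(10, numberOfCharacters));
        string alphabet = addSymbols ? Letters + Symbols : Letters;

        StringBuilder codeWord = new StringBuilder(length);
        for (int i = 0; i < length; i++)
        {
            // Pick one character uniformly from the allowed alphabet.
            codeWord.Append(alphabet[rng.Next(alphabet.Length)]);
        }
        return codeWord.ToString();
    }
}
```

Calling RandomCodeWordSketch.Generate(new Random(), 5, true) would produce a five-character mix of lowercase letters and symbols. System.Random is used here for brevity; a production CAPTCHA may prefer a cryptographic random number generator.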

 

If the CodeWordType property is set to the UseWordList enumeration value, CAPTCHASP will randomly choose a CodeWord from the comma-separated list of words in the WordList property. The WordList property comes pre-populated with a list of more than 150 English words, chosen to be clear to humans but confusingly similar to one another from a bot's point of view. Of course, you can add to this list, modify it, replace it with your own list, or customize it in any way you wish.
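
Picking a word from a comma-separated list can be sketched in a few lines of C#. This is an illustration of the idea, not the control's implementation; the class and method names are hypothetical, and trimming whitespace and skipping empty entries are assumptions about how a forgiving parser might behave.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only; this is not CAPTCHASP's source code.
public static class WordListSketch
{
    // wordList is a comma-separated string, like the WordList property.
    public static string Pick(string wordList, Random rng)
    {
        List<string> words = new List<string>();
        foreach (string entry in wordList.Split(','))
        {
            string word = entry.Trim();
            if (word.Length > 0)
            {
                words.Add(word); // ignore empty entries such as a trailing comma
            }
        }

        if (words.Count == 0)
        {
            throw new ArgumentException("The word list contains no words.", "wordList");
        }
        return words[rng.Next(words.Count)];
    }
}
```

For example, WordListSketch.Pick("garden, market, summit", new Random()) returns one of the three words.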

 

In all cases, the control's randomly chosen CodeWord is sent as a parameter through the CodeWordSelection event. This gives the page code a chance to observe the value or request that a different CodeWord be randomly generated by calling the GenerateNewCodeWord method. The CodeWord parameter is also modifiable so you can optionally replace the CodeWord with one of your own choosing.

However, if the CodeWordType property is set to the Custom enumeration value, the CodeWordSelection event's CodeWord parameter becomes required; the CAPTCHASP control will not take the time to randomly select a CodeWord. Instead, the page will be expected to supply it with one. This option is nice for situations where you want to always use your own function to generate a custom CodeWord or dynamically grab one from a data source of your choosing (see Figure 4).
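
A handler for the Custom case might look something like the C# sketch below. The by-reference string parameter follows the CodeWordSelection entry in Figure 3; the class name, the method name, the extra Random argument, and the in-memory word array (standing in for whatever data source you choose) are assumptions made so the example is self-contained.

```csharp
using System;

// Hypothetical helper; not part of CAPTCHASP.
public static class CustomCodeWordSketch
{
    // Stand-in for a data source of your choosing.
    private static readonly string[] Words = { "garden", "market", "bridge", "summit" };

    // Possible body for a CodeWordSelection handler.
    // CodeWord is passed by reference, so assigning to it hands the value back.
    public static void OnCodeWordSelection(ref string codeWord, Random rng)
    {
        // With CodeWordType set to Custom, no suggestion is provided,
        // so the page must always assign a value here.
        codeWord = Words[rng.Next(Words.Length)];
    }
}
```

In the other CodeWordType modes, the same handler shape could inspect the suggested value first and replace it only when needed.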

 

CodeWord-related Properties

| Property | Type | Description |
| --- | --- | --- |
| CodeWordType | Enumeration | Specifies whether the CAPTCHASP control should automatically generate a random series of letters for the CodeWord, whether it should randomly choose a word from the WordList property, or whether you prefer to supply it with a custom CodeWord. |
| AddSymbols | Boolean | When set to its default value of True and the CodeWordType property is set to its default of RandomLetters, symbol characters will be randomly mixed in with lowercase letters to create the CodeWord. |
| NumberOfCharacters | Byte | When the CodeWordType property is set to its default of RandomLetters, this property specifies how many randomly generated characters each CodeWord should contain. |
| WordList | String | A comma-separated list of words from which the control will randomly pick a CodeWord when the CodeWordType property is set to UseWordList. There are more than 150 pre-populated default words. |

Figure 4: The CAPTCHASP control provides a variety of ways to customize the CodeWords that are displayed to the user.

 

For security reasons, CodeWords of at least three characters are required. For clarity and usability reasons, you should avoid supplying CodeWords of more than 10 characters. It's also good to be aware that the letter "l" and the number "1" look confusingly similar; therefore, you may want to avoid using one or both of them. Likewise, the number zero ("0") is often confused with the letter "o". The CAPTCHASP control takes these issues into consideration when generating its random CodeWords. One way it does this is by not using any numbers. Additionally, user input is case-insensitive, so users need not worry about accidentally entering an uppercase "O" where a similar-looking lowercase "o" was expected. The default WordList likewise contains neither numbers nor the letter "l".
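
If you supply your own CodeWords, a small pre-check along the following lines can enforce the guidelines above. This is a hedged sketch: the class and method names are hypothetical, rejecting every digit and the lowercase letter "l" is one possible reading of the advice, and the comparison method only illustrates what case-insensitive matching means rather than how the control performs it.

```csharp
using System;

// Hypothetical helper; not part of CAPTCHASP.
public static class CodeWordRules
{
    // True when a custom CodeWord follows the guidelines in the text:
    // 3 to 10 characters, no digits, and no lowercase "l".
    public static bool IsAcceptable(string codeWord)
    {
        if (codeWord == null || codeWord.Length < 3 || codeWord.Length > 10)
        {
            return false;
        }
        foreach (char c in codeWord)
        {
            if (char.IsDigit(c) || c == 'l')
            {
                return false; // avoids the 1/l and 0/o look-alike problems
            }
        }
        return true;
    }

    // Case-insensitive comparison of the expected CodeWord and the user's entry.
    // Trimming surrounding whitespace is an extra assumption of this sketch.
    public static bool Matches(string expected, string entered)
    {
        return string.Equals(expected, (entered ?? "").Trim(),
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

With these rules, "garden" is acceptable, while "ab" (too short), "he11o" (digits), and "hello" (contains "l") are not.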

 

Cosmetic Customizations

The look and feel of CAPTCHASP can be configured in a variety of ways. Virtually every aspect of the control's appearance can be altered via properties and styles. Figure 5 demonstrates many of the control's optional user interface elements and customizations. I don't necessarily recommend altering the control's appearance to this much of an extreme, but it's nice to know you can.


Figure 5: Virtually every aspect of CAPTCHASP's appearance (including several optional elements) can be altered via properties and styles, even to ugly extremes such as this!

The optional title area (at the top of Figure 5 in green) can be shown by setting the TitleText property to the text you'd like to appear there. You can adjust the TitleStyle property elements to change how it looks in a variety of standard ways.

The InstructionText property can be used to change the text that is displayed above the textbox. Its InstructionStyle property elements can be used to adjust its look in many ways, and have been used in Figure 5 to apply a fancy italic font.

A hyperlink can be displayed to explain in more detail why the user must go through this process. This "Why?" element (shown in blue in Figure 5) pops up a customizable message when clicked. The WhyStyle property elements can be used to adjust the look and feel of this hyperlink. The CodeWord entry textbox can also be adjusted in a variety of ways via the TextEntryStyle property elements. Figure 5 demonstrates this with purple text.

The Submit button (shown in orange in Figure 5) has ButtonStyle property elements associated with it to adjust its appearance. The ButtonText property can be used to change the button text from its default of Submit. The ShowSubmitButton Boolean property can be changed to False to make the button invisible, in case you'd like to implement your own submit button (or link) elsewhere on the page. Such a custom submit element would need to call the CAPTCHASP control's Validate method to trigger the control to check whether the user's entry is correct.
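
The custom-submit flow can be sketched as follows. Because the control's actual type is not shown in this article, the example uses a hypothetical ICaptchaValidator interface that exposes only the two members named in the text, Validate and IsValid; the real signatures may differ, and the interface and class names are assumptions.

```csharp
using System;

// Hypothetical stand-in exposing only the two members the article names.
// The real control's type and signatures may differ.
public interface ICaptchaValidator
{
    void Validate();
    bool IsValid { get; }
}

// Hypothetical helper; not part of CAPTCHASP.
public static class CustomSubmitSketch
{
    // Intended to be called from your own button's (or link's) click handler.
    public static bool HandleSubmit(ICaptchaValidator captcha)
    {
        captcha.Validate();     // ask the control to check the user's entry
        return captcha.IsValid; // true only when the entry matched the CodeWord
    }
}
```

The click handler would then continue with the protected action only when HandleSubmit returns true.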

 

If the user enters the wrong code, the FailMessage will appear, as shown in Figure 5 in red. The FailMessageStyle property elements can be used to adjust visual aspects, and the FailMessageText property can be used to change what it says.

Finally, there is an optional ChangeCodeWord hyperlink that can be shown at the bottom of the control. Figure 5 displays this link highlighted in yellow via the ChangeCodeWordStyle property elements. The ChangeCodeWordText property can be used to change what the link says. The ShowChangeCodeWordLink property can be changed to True (from its default of False) to get this link to appear. When the user clicks this link, the control will generate and display a new CodeWord. The FailCount property will be incremented each time this happens to help prevent abuse by any cherry-picking bots that feel brave enough to attempt to decode a CAPTCHA image. Instead of displaying this built-in link, you could implement your own link elsewhere on the page to change the CodeWord. It would simply need to call CAPTCHASP's server-side GenerateNewCodeWord method.

Conclusion

You should now understand what CAPTCHA is, as well as how and why it came to be. With the CAPTCHASP control you can now easily immunize your ASP.NET Web site to keep bots out and let legitimate users in.

The CAPTCHASP control is freely downloadable to everyone. You can download it or try it out live right now from the demo pages I've assembled at http://SteveOrr.net/demo/CAPTCHASP. Additionally, asp.netPRO subscribers can download the complete source code for the CAPTCHASP control in both VB.NET and C# flavors.

Stay tuned next month, when we'll examine the architecture of the CAPTCHASP control, including the source code of some of its more interesting and innovative functions.

Sample code accompanying this article is available for download.

Steve C. Orr is an ASPInsider, MCSD, Certified ScrumMaster, Microsoft MVP in ASP.NET, and author of the book Beginning ASP.NET 2.0 AJAX by Wrox. He's been developing software solutions for leading companies in the Seattle area for more than a decade. When he's not busy designing software systems or writing about them, he can often be found loitering at local user groups and habitually lurking in the ASP.NET newsgroup. Find out more about him at http://SteveOrr.net or e-mail him at Steve@Orr.net.