Archive for November, 2006

Thwarting form spammers in ColdFusion

Thursday, November 30th, 2006

As I have posted before, spammers increasingly attempt to use form posts to send e-mail through unsuspecting web hosts. The most dangerous kind, of course, are posts through which the spammer injects code that runs invisibly on the server. This approach can often be foiled by validating form post data and properly escaping characters before they are displayed anywhere.

I have added validation to all the forms on the web site for the main company I work for, but one spammer is still attempting to use a form. However, the only destination is an administrative e-mail box, so the spammer has expended a lot of effort to annoy only a few people. We have been receiving about 100 spam messages a day in that box, so further action was needed to stop this spammer and any others who might follow.

One common solution has been the “captcha,” a graphic with hard-to-read text that, in theory, only a human can interpret and relay back to the server to prove the post did not originate from a robot. For our purposes, however, this approach is undesirable since it makes forms less usable. I wanted to come up with a verification process that is invisible to humans and would be (nearly?) impossible for a robot to reproduce.

My solution is to create a checksum from data on the server, from the client, and from the form, which can be matched before and after the post to “prove” that the post is authentic. Taking a page from our login guidelines, I also time stamp each form. This makes the checksum harder to duplicate and makes it impossible for a “good” post to be reused at a later time.

This first solution is in ColdFusion since that is what this company uses the most right now. I plan to follow shortly with a PHP example followed by a Java/JSP example.

Creating the Checksum

The following code creates the checksum value. Any number of variables can be used to create the checksum; the only requirement is that each value be the same at validation time as it was when the form was created.

This code uses the following four values:

  • dateTime - The time the form was created, which allows us to validate that the form POST was submitted within a certain time frame.
  • remoteAddress - The client machine’s address, which allows us to validate that the form POST arrived from the same machine that requested the original form. Specifically, this would thwart spammers who use Trojans to hijack unprotected computers and make them spam zombies.
  • serverName - This is the web server’s domain name in this example. If your servers are inside a firewall, and/or if you have multiple, load balanced servers with “sticky persistence” (i.e., a client is guaranteed to always communicate with one particular server during its session), you might use either the server’s IP address or a global variable specific to a given server.
  • formName - On sites that I program, each form is given a unique name. Originally, this was for statistical usage purposes, but it can also come in handy here. This is essentially a “password” for the form data, preventing a given set of validation code from attempting to validate data from a different form.

Overall, this system attempts to guarantee that communication about one form is transferred between a known host and a known client within a known time span. Any detected violation prevents validation from proceeding. Hashing this data using a technique like MD5 creates a checksum that should be virtually impossible to reverse-engineer. Even though spammers are persistent, they are more likely to hack someone else’s less protected site than tackle this problem.

<cfparam name="form.dateTime" default="#DateFormat(Now(),'yyyy-mm-dd')# #TimeFormat(Now(),'HH:mm:ss')#">
<cfset remoteAddress = cgi.REMOTE_ADDR>
<cfset serverName = cgi.SERVER_NAME>
<cfset formName = "formSomeName">
<cfset form.checksum = hash("#form.dateTime#~#formName#~#remoteAddress#~#serverName#")>
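For readers who do not work in ColdFusion, the same recipe can be sketched in JavaScript for Node.js. This is an illustration rather than part of the site's code: the function name createFormChecksum and the sample values are assumptions, and MD5 is used only to mirror the hashing technique mentioned above.

```javascript
const nodeCrypto = require("node:crypto");

// Hypothetical helper: joins the four values with "~" in the same order as
// the ColdFusion example and hashes the result with MD5.
function createFormChecksum({ dateTime, formName, remoteAddress, serverName }) {
  const payload = [dateTime, formName, remoteAddress, serverName].join("~");
  return nodeCrypto.createHash("md5").update(payload, "utf8").digest("hex");
}

// Sample values, made up for illustration.
const checksum = createFormChecksum({
  dateTime: "2006-11-30 14:05:00",
  formName: "formSomeName",
  remoteAddress: "192.0.2.10",
  serverName: "www.example.com",
});
console.log(checksum.length); // an MD5 hex digest is 32 characters long
```

Changing any one of the four inputs, or the separator, produces a different checksum, which is what the validation step relies on.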

The next piece of code is added to the form. The only two pieces of information in clear text are the dateTime and checksum values. (Now that I think about it, these two values could be encrypted and given different names to further enhance security; I will most likely do that in future form work.) All of the other variables live on the server, and the spammer would have to know all of them, including your separators and the exact sequence of variables used, in order to forge the checksum value (not likely without inside help or an outright takeover of your server).

<input type="hidden" name="dateTime" value="#form.dateTime#" />

<input type="hidden" name="checksum" value="#form.checksum#" />

On the validation side, you want to add the following code at the top.

<cfif cgi.REQUEST_METHOD IS NOT "POST" OR NOT isDefined("form.checksum") OR NOT isDefined("form.dateTime")>
    <cflocation url="/" addtoken="No">
</cfif>
<cfset remoteAddress = cgi.REMOTE_ADDR>
<cfset serverName = cgi.SERVER_NAME>
<cfset formName = "formSomeName">
<cfset checksumTest = hash("#form.dateTime#~#formName#~#remoteAddress#~#serverName#")>
<cfif checksumTest IS NOT form.checksum>
    <cfset errorMessage = "The form data was corrupted. We are sorry, but we cannot send your request at this time.">
</cfif>
<cfif isDate(form.dateTime)>
    <cfif checksumTest IS form.checksum AND DateDiff("n", form.dateTime, now()) GT 30>
        <cfset errorMessage = "The form data has expired. We are sorry, but we cannot send your request at this time.">
    </cfif>
<cfelse>
    <cfset errorMessage = "The form date was corrupted. We are sorry, but we cannot send your request at this time.">
</cfif>
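The validation flow above can also be restated as a JavaScript sketch for Node.js. It assumes the same "~" separator, field order, and 30-minute window as the ColdFusion code; the names computeChecksum and validateFormPost, and the shape of the post and server objects, are hypothetical. Unlike the ColdFusion version, it returns on the first failure instead of letting a later check overwrite the message.

```javascript
const nodeCrypto = require("node:crypto");

const MAX_AGE_MINUTES = 30; // same window as the DateDiff check

function computeChecksum(dateTime, formName, remoteAddress, serverName) {
  const payload = [dateTime, formName, remoteAddress, serverName].join("~");
  return nodeCrypto.createHash("md5").update(payload, "utf8").digest("hex");
}

// Hypothetical validator: returns an error message, or null when the post passes.
// `post` holds the submitted dateTime and checksum; `server` holds the values
// the server supplies itself (formName, remoteAddress, serverName).
function validateFormPost(post, server, now = new Date()) {
  const expected = computeChecksum(
    post.dateTime, server.formName, server.remoteAddress, server.serverName);
  if (expected !== post.checksum) {
    return "The form data was corrupted.";
  }
  // "yyyy-mm-dd HH:mm:ss" becomes "yyyy-mm-ddTHH:mm:ss", parsed as local time.
  const created = new Date(String(post.dateTime).replace(" ", "T"));
  if (Number.isNaN(created.getTime())) {
    return "The form date was corrupted.";
  }
  // Whole elapsed minutes, compared against the allowed window.
  const ageMinutes = Math.floor((now.getTime() - created.getTime()) / 60000);
  if (ageMinutes > MAX_AGE_MINUTES) {
    return "The form data has expired.";
  }
  return null;
}
```

Passing the current time in as a parameter is a design choice for the sketch: it lets the expiry rule be exercised with fixed dates.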

If errorMessage is set, you most likely want to halt further processing so that potentially harmful data cannot reach anything you have missed validating. At this point you are already aware that something is amiss! This should not be your only line of defense, however. You still want to validate every field to make sure it contains data of the right datatype and length and is free from characters that could lead to other hacks.
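As one illustration of the per-field checks described above, the JavaScript sketch below tests a single-line text field for type, length, and characters that could be abused, such as the carriage returns and line feeds used to inject extra e-mail headers. The function name, the default length limit, and the rejected character set are assumptions for the example, not rules taken from the site's forms.

```javascript
// Hypothetical check for one single-line text field (for example a name or subject).
// Returns a short reason string on failure, or null when the field passes.
function validateTextField(value, maxLength = 100) {
  if (typeof value !== "string") return "wrong type";
  const trimmed = value.trim();
  if (trimmed.length === 0) return "empty";
  if (trimmed.length > maxLength) return "too long";
  // Control characters (including CR and LF) and angle brackets are rejected outright.
  if (/[\u0000-\u001f\u007f<>]/.test(trimmed)) return "illegal characters";
  return null;
}
```

A real form would need a rule per field type (e-mail address, phone number, free text), so this covers only the simplest case.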

CSS hack to differentiate between IE 6 and 7

Thursday, November 30th, 2006

I discovered this quite by accident, and thought it might be helpful for those who find they need to set up different stylesheet rules for formatting in Internet Explorer 6 and IE 7. The crux of this solution is the CSS attribute selector (denoted by square brackets [] in CSS). It turns out that IE6 does not understand attribute selectors, and, in fact, will ignore the entire associated rule (including rules with more than one selector).

To leverage this difference, two rules can be defined: a normal rule that will be rendered in IE6, and a duplicate rule containing an attribute selector that will be rendered in IE7. In the example below, I use the IE-specific “if” statement to show how you can further differentiate between IE and “other browsers.” The solution below was tested in the Windows environment using IE6, IE7, Opera 9, and Firefox 2.

If you use a different browser or environment, try the code out and let me know what you see.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>CSS hack to differentiate between IE 6 and 7</title>
    <meta name="author" content="John A. Marsh" />
    <meta name="copyright" content="&copy; 2006 ThreeLeaf.com" />
    <style type="text/css">
        P {
            background:green;
        }
    </style>
    <!--[if IE]>
        <style type="text/css">
            P {
                background:red;
            }
            P[],P {
                background:blue;
            }
        </style>
    <![endif]-->
</head>
<body>
    <p>Blue in IE 6, Red in IE 7, Green in non-IE browsers.</p>
</body>
</html>

JavaScript parent / child communication example

Monday, November 20th, 2006

Problem

I have been asked to come up with a solution that will provide the following web page behavior:

  • The Parent page opens a Child window
  • The Child page can change the location (address) of the Parent window
  • After the location change, the Child page brings the Parent window to the top
  • If the Parent page is navigated away from, or the Parent window is closed, the Child page opens the target URL in its own window

Solution

The following two pages serve as a prototype for the solution. They have been tested in Internet Explorer 6, IE7, Firefox 2, and Opera 9. The only exception is that Firefox does not bring the Parent window back to the foreground. If someone has a workaround for the focus() function in Firefox, please let me know.

parent.htm

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Parent Window</title>
    <script type="text/javascript">
        function openChildWindow(childWindowLocation) {
            childWindow = window.open(childWindowLocation, "child", "width=800,height=600");
            window.childWindow.focus();
        }
        function disableOpener() {
            if (window.childWindow && window.childWindow.open && !window.childWindow.closed)
                window.childWindow.opener = null;
        }
    </script>
</head>
<body onunload="disableOpener();">
    <p><a href="child.htm" onclick="openChildWindow(this);return false;">Open Child Window</a></p>
</body>
</html>

child.htm

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Parent Window</title>
    <script type="text/javascript">
        function openChildWindow(childWindowLocation) {
            childWindow = window.open(childWindowLocation, "child", "width=800,height=600");
            window.childWindow.focus();
        }
        function disableOpener() {
            if (window.childWindow && window.childWindow.open && !window.childWindow.closed)
                window.childWindow.opener = null;
        }
    </script>
</head>
<body onunload="disableOpener();">
    <p><a href="child.htm" onclick="openChildWindow(this);return false;">Open Child Window</a></p>
</body>
</html>
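The child.htm listing above repeats the parent page's markup, so the child-side script is not shown. The sketch below is one guess at how the behaviors listed under Problem could be handled from the child window. The function name navigateParent and its win parameter (the child's own window object, passed in so the logic can be exercised outside a browser) are assumptions, not the original code.

```javascript
// Hypothetical child-side helper. `win` is the child's window object.
// Returns "parent" if the parent window was navigated, or "self" if the
// child had to load the URL in its own window.
function navigateParent(win, url) {
  const parent = win.opener;
  // opener is null once the parent's onunload handler (disableOpener) has run,
  // and `closed` is true if the parent window no longer exists.
  if (parent && !parent.closed) {
    parent.location.href = url;
    parent.focus(); // may be ignored by some browsers, as noted for Firefox
    return "parent";
  }
  win.location.href = url;
  return "self";
}
```

In a page, a link could call it as onclick="navigateParent(window, this.href); return false;", following the same pattern as the link in parent.htm.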

DOS Batch File - Looping Over Several Days - Solution

Thursday, November 9th, 2006

Several people have asked how to create batch files that loop over several days (usually a week) and process files based on those dates. I have put together a solution with the help of several web sites.

In my particular scenario, I rename log files on a server so that they all begin with a date in yyyy-mm-dd format (I will post that solution later). I copy all the logs to a main directory for permanent storage, and then I want to copy several days' worth of logs into a special folder that is used by my log analysis solution.

The core of my solution depends on XSET, which is shareware I will be purchasing. One problem is that there is no easy way to get date parts in DOS. Several solutions I have seen don’t work if you do not use standard US mm/dd/yyyy settings on your system. XSET can get the date parts through an internal function.

@call xset32 year DATE YYYY
@call xset32 month DATE MM
@call xset32 day DATE DD

The only problem is that if the day is less than 10, the day variable is populated with a leading zero, and the SET /A command treats a number with a leading zero as octal, so 08 and 09 are rejected as invalid. To strip the leading zero, I use the XSET /MATH command:

@call xset32 /MATH day=%day%+0
@call xset32 /MATH month=%month%+0

Another website (sorry, I did not write down the reference) provided the code for looping through the days. Since leading zeros are added to the day and month variables, I modified the code to use XSET in two places. There is some debugging code, so you can delete the ECHO statements if you like.

SET i=0
:LOOP
SET /A i+=1
echo %i%
IF /I %i% GTR 7 GOTO END

:: Subtract 1 day here
@call xset32 /MATH day=%day%-1
echo day = %day%
@call xset32 /MATH month=%month%+0

echo month = %month%

if /I %day% GTR 0 goto DONE
set /A month=%month%-1

if /I %month% GTR 0 goto ADJUSTDAY

set /A month=12
set /A year=%year%-1

:ADJUSTDAY
echo adjustday
if %month%==1 goto SET31
if %month%==2 goto LEAPCHK
if %month%==3 goto SET31
if %month%==4 goto SET30
if %month%==5 goto SET31
if %month%==6 goto SET30
if %month%==7 goto SET31
if %month%==8 goto SET31
if %month%==9 goto SET30
if %month%==10 goto SET31
if %month%==11 goto SET30
if %month%==12 goto SET31

goto ERROR

:SET31

set /A day=31 + %day%

goto DONE

:SET30

set /A day=30 + %day%

goto DONE

:LEAPCHK

set /A tt=%year% %% 4

if not %tt%==0 goto SET28

set /A tt=%year% %% 100

if not %tt%==0 goto SET29

set /A tt=%year% %% 400

if %tt%==0 goto SET29

:SET28

set /A day=28 + %day%

goto DONE

:SET29

set /A day=29 + %day%

:DONE

if /i %day% LSS 10 set day=0%day%
if /I %month% LSS 10 set month=0%month%

copy d:\backups\_Logs\%year%-%month%-%day%* d:\backups\_Current_Logs\
GOTO LOOP
:ERROR
echo Unexpected month value: %month%
:END
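For comparison, the date arithmetic in the batch file can be restated as a JavaScript sketch. It is not a replacement for the script, only a compact way to check edge cases such as month, year, and leap-day rollovers. The function names are made up for the example.

```javascript
// Leap-year rule used by the LEAPCHK section: divisible by 4, except
// century years that are not divisible by 400.
function isLeapYear(year) {
  return year % 4 === 0 && (year % 100 !== 0 || year % 400 === 0);
}

function daysInMonth(year, month) {
  if (month === 2) return isLeapYear(year) ? 29 : 28;
  return [4, 6, 9, 11].includes(month) ? 30 : 31;
}

// Step one day back, rolling over month and year boundaries.
function previousDay({ year, month, day }) {
  day -= 1;
  if (day < 1) {
    month -= 1;
    if (month < 1) {
      month = 12;
      year -= 1;
    }
    day = daysInMonth(year, month);
  }
  return { year, month, day };
}

// Hypothetical helper: the yyyy-mm-dd prefixes for the `count` days before `start`.
function previousDatePrefixes(start, count) {
  const pad = (n) => String(n).padStart(2, "0");
  const prefixes = [];
  let current = start;
  for (let i = 0; i < count; i += 1) {
    current = previousDay(current);
    prefixes.push(`${current.year}-${pad(current.month)}-${pad(current.day)}`);
  }
  return prefixes;
}

console.log(previousDatePrefixes({ year: 2008, month: 3, day: 2 }, 3));
// [ '2008-03-01', '2008-02-29', '2008-02-28' ]
```

Each prefix corresponds to one pass through the batch loop, where it would be used as the %year%-%month%-%day% part of the copy command.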

Alt and Title attribute behavior in <img /> elements

Thursday, November 9th, 2006

At the company where I work, we were discussing search engine optimization, and one of the topics that came up was the use of alt attributes for images. My understanding is that search engines may use the alt attribute when indexing the graphic itself or any surrounding link. I brought up the topic of the title attribute as well. I do not know if any search engines use the title attribute in indexing, but it might provide another avenue to insert relevant keywords.

Part of the discussion concerned how alt attributes are used for rollover text. However, title attributes are also used for rollover text, so I was wondering who wins when both are present. To test this, I created an HTML page with the following code:

<img title="title attribute" alt="alt attribute" src="bogus.gif" />

I tested in four different browsers: IE6, IE7, Firefox 2, and Opera 9. The results were the same in each. The title attribute appears as the tooltip when the cursor hovers over the image, while the alt attribute is used in the image placeholder. The placeholder is easy to see in this test because bogus.gif does not exist.

Since it is the same across all the browsers, this is most likely a standard that I have not run across before. The question as to whether a search engine will index the title attribute is still open. If anyone knows the answer, please add a comment.