XSS sanitization for PHP

November 7th, 2008 by Ritesh Gurung

Sanitising your code:
OK, time to add some techie stuff to your PHP knowledge. Before accepting anything from the web, make sure it is clean input. You can never trust user input, since it may be malicious, so always check what comes into your PHP scripts. To sanitise it, check all the global arrays such as $_GET, $_POST, $_REQUEST and $_COOKIE, allow only known variables, and make sure they contain the right type of data.

What does this mean? It means that if your script has a $_GET['id'] variable that has to be an integer, always check that it really is an integer.

Also, don’t allow other variables in $_GET or the other globals; keep only the variables your scripts need. So if your script uses only one variable, $_GET['id'], discard the others.
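
As a rough sketch of the whitelist idea, the helpers below keep only the keys a script expects and reject an id that is not a whole number. The names keepKnownInputs and readIntegerId and the sample data are made up for illustration; they do not come from the referenced post.

```php
<?php
// Hypothetical helper: keep only the keys this script actually uses.
function keepKnownInputs(array $source, array $allowedKeys) {
    // array_flip() turns array('id') into array('id' => 0) so keys can be compared.
    return array_intersect_key($source, array_flip($allowedKeys));
}

// Hypothetical helper: return the id as an integer, or null if it is not one.
function readIntegerId(array $source) {
    if (!isset($source['id']) || !is_string($source['id'])) {
        return null;
    }
    // ctype_digit() accepts only non-empty strings made of the digits 0-9.
    return ctype_digit($source['id']) ? (int) $source['id'] : null;
}

// Stand-in for $_GET so the example runs from the command line.
$get = array('id' => '42', 'debug' => '1', 'admin' => 'yes');

$clean = keepKnownInputs($get, array('id'));
$id    = readIntegerId($clean);
```

In a real script you would pass $_GET instead of the sample array and stop with an error when the id comes back as null.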

Ref: post by Denham Coote at http://www.denhamcoote.com/php-howto-sanitize-database-inputs

Stripping out malicious code
When reading data from POST or GET, look out for malicious HTML tags.

———– Code snippet starts ————————-
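<?php
function cleanInput($input) {

    $search = array(
        '@<script[^>]*?>.*?</script>@si',   // Strip out javascript
        '@<style[^>]*>.*</style>@siU',      // Strip style tags and their contents (U makes the quantifiers lazy)
        '@<![\s\S]*?--[ \t\n\r]*>@',        // Strip multi-line comments
        '@<[\/\!]*?[^<>]*?>@si'             // Strip out the remaining HTML tags
    );

    // Repeat until nothing changes, so a tag reassembled by an earlier pass is caught too
    do {
        $previous = $input;
        $input = preg_replace($search, '', $input);
    } while ($input !== $previous);

    return $input;
}
?>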
———– Code snippet Ends ————————-
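
Stripping tags with regular expressions is hard to make airtight, so it helps to also escape data at the moment it is printed into a page. That step is not part of the snippet above; the sketch below assumes a made-up wrapper name, escapeForHtml, around PHP's htmlspecialchars().

```php
<?php
// Hypothetical wrapper name; htmlspecialchars() itself is a standard PHP function.
function escapeForHtml($text) {
    // ENT_QUOTES also converts single quotes, which matters inside attributes.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Sample input: the nested-tag trick posted in the comments section.
$comment = '<scri<b></b>pt>alert(/xss/)</scrip<b></b>t>';

// The browser shows the markup as text instead of running it.
$safe = escapeForHtml($comment);
echo $safe, "\n";
```

The idea is to store the data as it came in and escape it every time it is written into HTML, rather than relying on one cleanup at input time.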

Slashing
Add a backslash before the following characters: ' (single quote), " (double quote), \ (backslash) and NULL. If magic_quotes is on, this is done automatically. Otherwise you can use addslashes(), which is the manual version of magic_quotes. If your server supports it, mysql_real_escape_string() can be used instead.
———– Code snippet starts ————————-
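<?php
function sanitize($input) {
    if (is_array($input)) {
        // Start from an empty array so an empty input still returns an array
        $output = array();
        foreach ($input as $var => $val) {
            $output[$var] = sanitize($val);
        }
    }
    else {
        // Undo magic_quotes first so the data is not escaped twice
        if (get_magic_quotes_gpc()) {
            $input = stripslashes($input);
        }
        $input  = cleanInput($input);
        // Needs an open MySQL connection
        $output = mysql_real_escape_string($input);
    }
    return $output;
}
?>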
———– Code snippet Ends ————————-

To use, we simply pass any input to the function. The function works on single strings, as well as deep arrays.

<?php
$bad_string = "Hi!
<SCRIPT SRC=http://ha.ckers.org/xss.js></SCRIPT> It's a good day!";

$_POST = sanitize($_POST);
$_GET  = sanitize($_GET);
$good_string = sanitize($bad_string);
// $good_string no longer contains the script tag, and the quote is escaped: It\'s a good day!
?>
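
To try the recursion over nested arrays without a database connection, here is a small sketch in which the escaping step is passed in as a callback. The name sanitizeWith is made up, and strip_tags() and addslashes() only stand in for cleanInput() and mysql_real_escape_string() so the example runs on its own.

```php
<?php
// Hypothetical variant of sanitize(): the escaping function is a parameter.
function sanitizeWith($input, $escape) {
    if (is_array($input)) {
        $output = array();
        foreach ($input as $key => $value) {
            // Recurse so arrays nested at any depth are handled.
            $output[$key] = sanitizeWith($value, $escape);
        }
        return $output;
    }
    // strip_tags() stands in for cleanInput() to keep the example self-contained.
    return call_user_func($escape, strip_tags((string) $input));
}

// Stand-in for $_POST with a nested array, as produced by form fields named tags[].
$post = array(
    'name' => "O'Reilly",
    'tags' => array('<b>php</b>', 'x"y'),
);

$clean = sanitizeWith($post, 'addslashes');
print_r($clean);
```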

Typecasting
Before doing anything else, typecast the incoming data.
<?php
$age = (int) $_GET['age'];
?>
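
A cast always gives you an integer, but it does not tell you whether the input was a clean number or whether it was there at all. The sketch below, with a made-up helper name intParam, shows what the cast does with a missing or messy value.

```php
<?php
// Hypothetical helper: read an integer parameter, falling back to a default when it is missing.
function intParam(array $source, $key, $default = 0) {
    if (!isset($source[$key]) || is_array($source[$key])) {
        return $default;
    }
    return (int) $source[$key];
}

// Stand-in for $_GET.
$get = array('age' => '27', 'page' => '3 OR 1=1');

$age  = intParam($get, 'age');      // 27
$page = intParam($get, 'page', 1);  // 3: the cast keeps the leading digits and drops the rest
$size = intParam($get, 'size', 20); // 20: not present, so the default is used
```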

In addition to all this, PHP has the filter_var_array() function, which is supported in PHP 5 >= 5.2.0 and is also available as a PECL package.
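
A minimal sketch of how filter_var_array() can check several fields in one call. The field names and the age range are assumptions made up for the example.

```php
<?php
// Stand-in for $_GET; the field names are invented for the example.
$get = array('id' => '15', 'age' => 'abc', 'email' => 'user@example.com', 'extra' => 'x');

// One rule per expected field. Keys that are not listed here are left out of the result.
$rules = array(
    'id'    => FILTER_VALIDATE_INT,
    'age'   => array(
        'filter'  => FILTER_VALIDATE_INT,
        'options' => array('min_range' => 0, 'max_range' => 150),
    ),
    'email' => FILTER_VALIDATE_EMAIL,
);

$clean = filter_var_array($get, $rules);
var_dump($clean);
```

A field that fails its rule comes back as false, so the script can tell a bad value from a valid one before using it.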

Other useful links to dig into:

http://htmlpurifier.org/

http://www.codinghorror.com/blog/archives/001175.html


8 Comments


  1. dblackshell, November 10, 2008:

    try this on your cleaning function ->
    <script>alert(/xss/)</script>
    >:)


  2. dblackshell, November 10, 2008:

    ah… stupid strip_tags… i mean
    <scri<b></b>pt>alert(/xss/)</scrip<b></b>t>


  3. leeotzu, November 10, 2008:

    Hi dblackshell,

    Thanks for your comments. I tested the input from your comment on my cleanup function and it passed the test. This depends on how and where you use this function.

    Do let me know in case you find any bug.


  4. Apprentice, November 13, 2008:

    What are the @ and si and U bits in your regex? I’m not familiar with those particular regex characters.


  5. Apprentice, November 13, 2008:

    Okay… so I figured out that @ is just your delimiter. But what about si and the siU? What do those do?


  6. leeotzu, November 13, 2008:

    Hi Apprentice,

    i: letters in the pattern match both upper and lower case letters.

    s: the dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl’s /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.

    U: this modifier inverts the “greediness” of the quantifiers so that they are not greedy by default, but become greedy if followed by “?”. It is not compatible with Perl. It can also be set with (?U) inside the pattern.

    More pattern modifiers can be checked out at http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php

    Cheers


  7. Denham Coote, February 8, 2009:

    Wow, well done on plagiarizing my work. Original at http://www.denhamcoote.com/php-howto-sanitize-database-inputs


  8. v2, March 10, 2009:

    is this actually safe?
    i mean… really?
