You are currently browsing the Markus Breitenbach weblog archives for May, 2007.
Safe Strings in PHP
May 7, 2007 5:01 pm by Markus.
A while ago I read about an idea for making it easier to avoid common PHP programming mistakes in the handling of strings. There are dozens of attacks to watch out for when using strings: you have to escape a string one way when you embed it in an SQL statement, a different way when you output it as part of a web page (XSS attacks), and a third way when you output it as part of an HTTP header. It’s not surprising that eventually something somewhere will not be escaped the right way.
Wells suggests a SafeString class that encapsulates every string and offers different access methods, each of which automatically escapes the string the right way. So if you were to output the string back to the user, you’d call a toHTML() method that properly escapes any HTML tags and special characters embedded in the string. A method to access the raw string would be called “UnsafeRawString” to remind the programmer that the string contains “tainted” user input. While it is still possible to do something wrong, the wrong parts stick out in the code (for example, one might call String->toHTML() when building an SQL statement - obviously wrong, but much easier to find). See “Making Wrong Code look Wrong” for the underlying philosophy.
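To make this concrete, here is a minimal sketch of what such a class might look like. It is not Wells’ implementation: only toHTML() and the raw accessor come from the description above, while toSQL(), toHeaderValue() and the constructor are names I made up for illustration. The SQL variant assumes a PDO connection is available and simply delegates to PDO::quote(); prepared statements would usually be the better choice.

```php
<?php
// Hypothetical SafeString sketch. Method names other than toHTML() and the
// raw accessor are assumptions for illustration, not part of the proposal.
final class SafeString
{
    private $raw;

    public function __construct($raw)
    {
        $this->raw = (string) $raw;
    }

    // Escape for embedding in HTML text or quoted attribute values.
    public function toHTML()
    {
        return htmlspecialchars($this->raw, ENT_QUOTES, 'UTF-8');
    }

    // Quote for embedding in an SQL statement, using the given connection.
    public function toSQL(PDO $pdo)
    {
        return $pdo->quote($this->raw);
    }

    // Strip CR and LF so the value cannot start a new HTTP header line.
    public function toHeaderValue()
    {
        return str_replace(array("\r", "\n"), '', $this->raw);
    }

    // Deliberately awkward name: every call site is a place to audit.
    public function unsafeRawString()
    {
        return $this->raw;
    }
}

$name = new SafeString('<script>alert("x")</script>');
echo $name->toHTML(), "\n"; // &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

The point is less the escaping itself than the call sites: a toHTML() next to a query, or any unsafeRawString(), is easy to spot in a code review.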
I really like the idea, but I see a few practical problems:
- All strings, including server variables and superglobals, should be automatically converted to the new string class. Otherwise the programmer constantly has to figure out whether he/she is dealing with an encapsulated string or not.
- You’d need a database abstraction layer that returns this kind of string as the result of queries.
- All the existing PHP string operations (from strcmp to soundex) must remain usable. This can be tricky, but interestingly PHP5 offers a way, with __call, to overload the object with arbitrarily named functions (see the overload() function in PHP4). With some eval-magic this could be doable. Technically you wouldn’t want anybody ever to work with the UnsafeRawString…
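The last point can be sketched without any eval, assuming it is acceptable to whitelist the functions that take the subject string as their first argument. The class name ForwardingSafeString and the whitelist are hypothetical; the sketch only shows that __call plus call_user_func_array() is enough to forward calls and re-wrap string results so the taint is not silently dropped.

```php
<?php
// Hypothetical sketch: forward selected built-in string functions via __call.
final class ForwardingSafeString
{
    // Only functions whose first parameter is the subject string.
    private static $allowed = array('strlen', 'strtoupper', 'strcmp', 'soundex', 'trim');

    private $raw;

    public function __construct($raw)
    {
        $this->raw = (string) $raw;
    }

    public function __call($name, $args)
    {
        if (!in_array($name, self::$allowed, true)) {
            throw new BadMethodCallException("String function not allowed: $name");
        }
        // Unwrap wrapped arguments so e.g. strcmp() can compare two wrappers.
        foreach ($args as $i => $arg) {
            if ($arg instanceof self) {
                $args[$i] = $arg->raw;
            }
        }
        array_unshift($args, $this->raw);
        $result = call_user_func_array($name, $args);
        // Re-wrap string results; ints and bools are returned as they are.
        return is_string($result) ? new self($result) : $result;
    }

    public function toHTML()
    {
        return htmlspecialchars($this->raw, ENT_QUOTES, 'UTF-8');
    }
}

$s = new ForwardingSafeString('  <b>hi</b> ');
echo $s->trim()->strtoupper()->toHTML(), "\n"; // &lt;B&gt;HI&lt;/B&gt;
echo $s->strlen(), "\n";                       // 12
```

Functions that take the string in a different argument position (str_replace, for example) would need their own explicit wrapper methods, which is where this approach starts to get tedious.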
Posted in Coding / Programming, Security