Saturday, December 19, 2009

PHP Arrays sort

PHP has lots of built-in functions to sort arrays (the core data structure of PHP; technically they are ordered hash maps): sorting by value or by key, preserving the key associations or not, using a user-defined function, in normal or reverse order, etc.

Below are some clear examples of the main ones, and some tips on how to remember them.

ORDER BY VALUE

sort(): order by VALUE (keys destroyed and renumbered starting from zero)
$a = array('z'=>'A', 'y'=>'C','k'=>'B','x'=>'B');
sort( $a ); # Array ( A ,B ,B ,C )

asort(): like sort(), but it maintains the key associations
$a = array('z'=>'A', 'y'=>'C','k'=>'B','x'=>'B');
asort( $a ); # Array ( [z] => A , [k] => B , [x] => B , [y] => C )

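natsort(): order by value using "natural" ordering (numbers inside strings are compared as numbers), maintaining the key associations
$a = array('z'=>'5p', 'y'=>'10p','20p');
sort( $a ); # Array ( 10p ,20p ,5p )
natsort( $a ); # Array ( [2] => 5p , [0] => 10p , [1] => 20p ) - the keys are the ones left by the previous sort()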

ORDER BY KEY

ksort(): order by key (maintaining associations)
$a = array('z'=>'A', 'y'=>'C','k'=>'B','x'=>'B');
ksort( $a ); # Array ( [k] => B , [x] => B , [y] => C , [z] => A )


USER-DEFINED COMPARISON FUNCTION

usort(): compare using a user-defined function (keys destroyed and renumbered starting from zero)
$a = array(
    '0'=>array('id'=>4,'n'=>'aaa'),
    '1'=>array('id'=>2,'n'=>'bbb'),
    '2'=>array('id'=>1,'n'=>'ccc'),
);
# order by the 'id' contained in each element, using a closure (PHP 5.3)
usort($a, function ($x, $y) {
    if ($x['id'] == $y['id']) return 0; # equal elements
    return ($x['id'] < $y['id']) ? -1 : 1;
});
# Array ( Array ( [id] => 1 [n] => ccc ) ,Array ( [id] => 2 [n] => bbb ) ,Array ( [id] => 4 [n] => aaa ) )

uasort(): like usort(), but it MAINTAINS the KEYS
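
A minimal sketch of uasort(), assuming data similar to the usort() example but with string keys (the keys 'first', 'second' and 'third' are made up for illustration, as is the `$byId` name):

```php
<?php
// Hypothetical data: string keys chosen only to show that they survive the sort.
$rows = array(
    'first'  => array('id' => 4, 'n' => 'aaa'),
    'second' => array('id' => 2, 'n' => 'bbb'),
    'third'  => array('id' => 1, 'n' => 'ccc'),
);

// Compare two elements by their 'id'; return <0, 0 or >0 as the u*sort functions expect.
$byId = function ($x, $y) {
    if ($x['id'] == $y['id']) {
        return 0;
    }
    return ($x['id'] < $y['id']) ? -1 : 1;
};

uasort($rows, $byId);
print_r(array_keys($rows)); // keys are kept and now ordered: third, second, first
```

The same comparison function can be passed to usort(); the only difference is that usort() would renumber the keys.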

HOW TO REMEMBER FUNCTION NAMES

sort = default sort behaviour: order by value and destroy keys
If the function name contains:
u = sort using [U]ser-defined function (Usort, Uasort, Uksort)
a = [A]ssociative: maintains key associations (Asort, Arsort, uAsort)
k = order by [K]ey (Ksort, Krsort)
r = [R]everse order (aRsort, kRsort, Rsort)
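
The letters can be checked quickly with a small script. This is only a sketch on a made-up array, with one copy of the array per function:

```php
<?php
$a = array('z' => 'A', 'y' => 'C', 'k' => 'B');

$r = $a;
rsort($r);             // [R]everse, by value, keys renumbered: C, B, A

$ar = $a;
arsort($ar);           // [A]ssociative + [R]everse, by value: y => C, k => B, z => A

$kr = $a;
krsort($kr);           // [K]ey + [R]everse: z => A, y => C, k => B

$uk = $a;
uksort($uk, 'strcmp'); // [U]ser-defined + [K]ey: k => B, y => C, z => A

print_r($r);
print_r($ar);
print_r($kr);
print_r($uk);
```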

OTHER USEFUL ARRAY FUNCTIONS

shuffle(): scrambles the array's contents (keys destroyed and renumbered starting from 0)

array_rand($array, $number): returns $number random KEYS (not values) from $array; a single key if $number is 1, otherwise an array of keys

array_multisort(): sorts several arrays at once, or a multi-dimensional array by one or more dimensions
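
A short sketch of array_multisort() and array_rand() on two made-up parallel arrays (the names and scores are invented for the example):

```php
<?php
// Hypothetical data: two parallel arrays (names and the matching scores).
$scores = array(30, 10, 20);
$names  = array('carl', 'anna', 'bob');

// Sort $scores ascending and reorder $names in the same way.
array_multisort($scores, $names);
print_r($scores); // 10, 20, 30
print_r($names);  // anna, bob, carl

// array_rand() returns KEYS: use them to read the values.
$keys = array_rand($names, 2);
foreach ($keys as $k) {
    echo $names[$k], "\n";
}
```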

array_shift( array &$array ): removes the first element of the array and returns it; the remaining numeric keys are renumbered starting from zero
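
A tiny sketch of array_shift(), on a made-up array that mixes numeric and string keys to show which keys are renumbered:

```php
<?php
$queue = array(5 => 'a', 6 => 'b', 'x' => 'c');

$first = array_shift($queue); // removes and returns the first element

echo $first, "\n"; // a
print_r($queue);   // [0] => b, [x] => c : numeric keys restart from 0, string keys are kept
```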

array_slice( array $array , 3, 2, $preserve_keys): RETURNS 2 elements starting from the 4th. If $preserve_keys is true, the integer keys in the returned array are the same as in the original one; otherwise they are renumbered (string keys are always preserved)
$a = array(0=>1, 1=>2, 2=>3, 4=>4, 3=>5, 5=>6);
$a = array_slice($a,3,2); #Array ( 4 ,5 )

array_splice(): removes the slice from the array (passed by REFERENCE) and returns the removed part
$a = array(1, 2, 3, 4, 5, 6);
$returned = array_splice($a,3,2); #remove 2 elements from 4th elem. return the removed slice
# $returned = Array ( 4 ,5 )
# $a = Array ( 1 ,2 , 3 , 6 )

Thursday, December 10, 2009

Netbeans 6.8

Netbeans 6.8 is now available !
New features:
  • Symfony support (integrated prompt, syntax + help for command line code generator, shortcuts) + YAML syntax support
  • better code completion, supporting PHP 5.3 features (namespaces)
  • PHPUnit improvements
  • PHP application from remote servers
  • better SQL auto-completion
  • Embedded Browser + Web Preview for HTML and CSS

Thursday, November 26, 2009

Checking PHP script performance with Xdebug

Xdebug [http://www.xdebug.org/] is a useful tool to debug PHP scripts. One interesting feature is script profiling.

If the option is enabled, Xdebug traces and saves information (time, details) about all the functions/methods called in the script (CLI or Apache).

The main aim of profiling is to find bottlenecks, or simply to see which parts of the code are slow.

In order to analyze the log file created, use KCacheGrind or WinCacheGrind (see screenshots below).

Setup and docs at [http://www.xdebug.org/docs/profiler].

Configuration for Wamp (PHP 5.3)

#php.ini
[xdebug]
zend_extension=c:/wamp/bin/php/php5.3.0/ext/php_xdebug-2.0.5-5.3-vc6.dll
xdebug.profiler_enable = 1
xdebug.profiler_output_dir=C:/wamp/www/profile/
xdebug.remote_enable=on
xdebug.remote_handler=dbgp
xdebug.remote_host=localhost
xdebug.remote_port=9000
xdebug.remote_mode=req

Screenshots


Thursday, November 19, 2009

Profiling MySQL

To analyze the db server usage of a complex PHP application, the first step is to profile the db server.
There are lots of profiling tools, but I think it's very easy to write some custom code that saves only the data really needed.
The idea is to save information about a sample of the queries in the production environment (about 1% of the queries is usually enough, depending on the traffic).


MySQL profiling

Assuming there is a class used to manage queries (or at least the mysqli class), it doesn't take long to replace the method that runs the queries with something similar to the following code (written in a simple way to show the idea):


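class DB extends mysqli {
...
    function query ($q, $resultmode = MYSQLI_STORE_RESULT) {

        $start = microtime(1);
        $result = parent::query($q, $resultmode); #parent::, otherwise the method would call itself forever
        $wtime = microtime(1) - $start;
        #save about 1% of the queries + info
        if ( rand(1,100) == 1 ) {
            $qEscaped = $this->real_escape_string($q);
            parent::query("INSERT DELAYED INTO `ProfileData` (q,wtime,created,...) VALUES ('$qEscaped',$wtime, NOW(), ...) ");
        }
        return $result;

    }
...
}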


What other info to save? Some ideas:
  • client IP
  • other $_SERVER info: user-agent, request_method, etc...
  • PHP backtrace (to understand which line launched the query)
  • web server load
  • mysql server load
  • ...
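
A possible way to collect some of the details listed above in one place. This is only a sketch: the function name `profileContext()` and the array keys are my own choices, and the MySQL server load is left out because it needs a db connection.

```php
<?php
// Hypothetical helper: gathers some extra details to store next to each sampled query.
function profileContext()
{
    // Frame 0 of the backtrace holds the file and line that called profileContext().
    $trace  = debug_backtrace();
    $caller = isset($trace[0]['file'], $trace[0]['line'])
        ? $trace[0]['file'] . ':' . $trace[0]['line']
        : '';

    // sys_getloadavg() is not available on Windows, hence the check.
    $load = function_exists('sys_getloadavg') ? sys_getloadavg() : false;

    return array(
        'ip'         => isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '',
        'user_agent' => isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '',
        'method'     => isset($_SERVER['REQUEST_METHOD']) ? $_SERVER['REQUEST_METHOD'] : '',
        'caller'     => $caller,
        'web_load'   => is_array($load) ? $load[0] : null, // 1 minute load average
    );
}

$info = profileContext();
print_r($info);
```

Each value would still have to be escaped before being added to the INSERT query.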
The results can then be analyzed by running queries on the `ProfileData` table.
Example: group the rows by query and show the average time of each query. In this way, you can find which queries are the slowest.



-- select the slowest queries (average time) in the last 24 h
-- queries executed only once are excluded, to ignore results that simply missed the query cache
SELECT `q`, AVG(`wtime`) AS "average time", COUNT(`id`) AS "occurrences"
FROM `ProfileData`
WHERE `created` > DATE_ADD(NOW(), INTERVAL -1 DAY)
GROUP BY `q`
HAVING COUNT(`id`) > 1
ORDER BY AVG(`wtime`) DESC

Simple effective PHP debugging + backtracking

During PHP debugging I often need to inspect complex data. It is not always possible to use Xdebug and the IDE debugging features with MVC frameworks, and some arrays/objects are too big and unwieldy for FirePHP.

A valid solution might be a traditional "print_r"/"var_dump" + "exit".
Two problems:
1) accidental commits to the staging/production environment.
2) it takes time to find where the calls are placed in the code, also because of the "exit".

Solutions:
Write a debug function that
1) uses an (external) constant (define) that identifies the environment, and returns without printing or exiting if the environment is not the local one.
2) prints the backtrace, to easily find and remove the "breakpoints".

Code:

function pd($var, $useVarDump=false, $exit=true){

    if (IS_PRODUCTION_ENV) return; #never print or exit in production
    echo '<pre>';
    if ($useVarDump) var_dump($var); else print_r($var);
    echo "\n\nBACKTRACE:";
    print_r(array_slice(debug_backtrace(false), 1)); #skip the first frame (the call to pd() itself)
    echo '</pre>';
    if ($exit) exit;

}

MySQL dump importing

Today I realized that "mysqlimport" does not work as expected in the Wamp environment.
A working way to import an sql/dump file is to use the "mysql" executable:

#localhost
mysql -u root -p --force [DBNAME] < [FILE.SQL]

Monday, November 16, 2009

PHP 5.3

I've just read this pdf from I.A.'s blog about PHP 5.3 performance.
My comments:

Performance
What I really consider good is the performance increase (5-10%), which includes smarter behaviour with require/include, a smaller binary size and better stack performance.

Features
- Namespaces are OK, but not really necessary. Good code can be written without them too.
- I think the best feature is late static binding. It was the only big gap in PHP's OO support.
- Closures are also sometimes useful to write clearer code. I've tested their performance (*) and there seems to be no slowdown when using them, that's cool.
(*) array_map(function ($n){return($n * $n * $n);}, array(1, 2, 3, 4, 5));
- "goto": its usefulness (especially in an OO language!) doesn't make any sense to me. It makes code very hard to read.
- mysqlnd (MySQL Native Driver) sounds very interesting for better performance with MySQL. Client side query cache, written specifically for PHP (not a generic C/C++ library), performance statistics for bottleneck analysis [read here]... wow ! I'll probably test it soon, although (according to some blog posts) the performance increase does not seem very high.
- Hundreds of bug fixes and improvements, including the Directory iterator and the date functions => well done PHP community !
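
To show what late static binding adds, here is a minimal sketch (the class names `Model` and `User` are invented for the example):

```php
<?php
class Model
{
    public static function create()
    {
        // "static" is resolved at call time (late static binding);
        // "new self()" would always build a Model, even when called on a subclass.
        return new static();
    }
}

class User extends Model
{
}

$u = User::create();
echo get_class($u), "\n"; // User
```

Before PHP 5.3 every subclass had to redefine create() to get an instance of its own class.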

Tuesday, November 3, 2009

How to optimize PHP applications

There is a lot of advice on the web about how to speed up PHP applications.
The best readings I've found are written by Ilia Alshanetsky [blog]:

PHP & PERFORMANCE [pdf]
By: Ilia Alshanetsky

Common Optimization Mistakes [pdf]
PHP Quebec 2009

Enjoy !

Thursday, October 15, 2009

a Javascript function to manage function timeout

Today I worked with form validations and AJAX requests associated with the keyup event of the input fields.
Sometimes the validation requires an AJAX call (for instance to check if the typed text already exists in the DB). In order to avoid a request for each character typed, a good solution might be to use setTimeout/clearTimeout.

To have a single function that manages the timeouts of the functions called for each field, I've written a general function that does it, taking the function name dynamically.

It uses an array (stored as a property of "this", which is the global object when the function is called directly) to store the timeouts of the functions called, and an eval to start/stop each function with its own timeout.

Tested only on Firefox 3.5.2



/*javascript*/

function callTimeout(funcName, timeout, args) {
    if (args == undefined) { args = ""; }
    if (timeout == undefined) { timeout = 500; }

    // one pending timeout per function name
    if (this.timeOutsArray == undefined) {
        this.timeOutsArray = new Array();
    }
    if (this.timeOutsArray[funcName] == undefined) {
        this.timeOutsArray[funcName] = 0;
    }
    // cancel the call that is still pending for this function, if any
    if (this.timeOutsArray[funcName]) {
        clearTimeout(this.timeOutsArray[funcName]);
    }
    // schedule the call: funcName(args) will run after "timeout" ms
    eval("this.timeOutsArray['"+funcName+"'] = setTimeout('" + funcName + "("+args+")', "+timeout+");");
}



Example of use with JQuery:



$("#field1").keyup(function(){callTimeout("validatefield1",500);})
...
...
$("#anotherFieldN").keyup(function(){callTimeout("validatefieldN",500);})







Behaviour: if the user types very quickly in the search box (id=field1), the function is not called hundreds of times, but just once, 500ms after the last key press.

Monday, October 12, 2009

printing the backtrace in a complex PHP application

The PHP debug_backtrace [man] function is very useful to understand where a function/method is called from.

It returns the backtrace of the code as an array.

Example: in the framework I'm currently using there are ORM classes to access the DB, so it takes a long time to understand where a query is launched from. Solution: save in a global variable (or in a field of the class) the list of the queries launched and the code that launched each query.


Code:


class DB {
    # ...
    public function query($query) {
        if (DEBUG_MODE) {
            $_dbt = debug_backtrace();
            #frame 0 holds the file and line that called query()
            $_fromFile = isset( $_dbt[0]['file']) ? str_replace(_ROOT, "",$_dbt[0]['file']) : "";
            $_fromLine = isset( $_dbt[0]['line']) ? $_dbt[0]['line'] : "";
            $_launchedFrom = "launched from $_fromFile:$_fromLine";
            $this->logQueries[] = "[$query][$_launchedFrom]";
        }
        # ..query.
    }
    # ...
}



Example of result:


Array
(
[0] => [set names utf8][launched from C:\wamp\www\dev\index.php:68]
[1] => [SELECT * FROM `Setting` WHERE `id` = 1][launched from \class\Setting.class.php:20]
[2] => [SELECT id FROM `User`WHERE id = 293968 LIMIT 1][launched from \class\User.class.php:598]
[3] => [SELECT * FROM `User` WHERE `id` = 293968][launched from \class\User.class.php:45]
[4] => [SELECT f.store_id FROM `Store_Monthly_Featured` f ORDER BY f.order ASC][launched from \class\Store_Monthly_Featured.class.php:16]
)


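Useful function to get only the file and line number of each step


function debug_backtrace_filelines() {

    $ret = array();
    $b = debug_backtrace();
    foreach ($b as $elements) {
        #frames of internal calls may have no file/line
        if (isset($elements['file'], $elements['line'])) {
            $ret[] = $elements['file'].':'.$elements['line'];
        }
    }
    return $ret;
}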

How to search complex source code using regular expressions

Today I needed to search the source code for the places that were calling a function with a complex array of parameters

URL::Make( 'site/store.inc.php', array( 'action'=>'view','id'=> [... STH...] ,'idTwo'=> [... STH...] ) );

Considering these requirements:

- there are lots of similar function calls (to exclude from the search) with one argument less or one argument more

- there may be additional spaces in the line

- some other developers have probably used double quotes instead of single quotes for array keys

- the value of the keys may be PHP code

- every function call is on one line

An elegant solution is to search using regular expressions. Netbeans (my favorite editor) supports POSIX extended regular expressions and is very quick at searching for a complex expression in a huge amount of source code.

Solution 1

#using char lengths and the pattern ".{0,MAX}" (that is: any sequence of at most MAX chars). Easy !

URL::Make.{0,10}site/store.{0,40}action.{0,9}view.{0,20}id.{0,200}idTwo.{0,30}\).{0,10}\)

Solution 2

#using regular expressions for the separators. More complex, more accurate in some cases, but fails in others (e.g. comments inside)

URL::Make.*\([ ]*['"]site/store.inc.php['"][ ]*,[ ]*array[ ]*\([ ]*['"]action['"][ ]*=>[ ]*['"]view['"][ ]*,[ ]*['"]id['"][ ]*=>.{0,100}['"]idTwo['"][ ]*=>.{0,100}[ ]*\)[ ]*\)
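
Both expressions can be tried outside the editor with preg_match(). The following is only a sketch: the three source lines are made up, and the patterns are wrapped in "~" delimiters because they contain "/".

```php
<?php
// "Solution 1" and "Solution 2", copied as they are and wrapped in "~" delimiters.
$solution1 = '~URL::Make.{0,10}site/store.{0,40}action.{0,9}view.{0,20}id.{0,200}idTwo.{0,30}\).{0,10}\)~';
$solution2 = <<<'RE'
~URL::Make.*\([ ]*['"]site/store.inc.php['"][ ]*,[ ]*array[ ]*\([ ]*['"]action['"][ ]*=>[ ]*['"]view['"][ ]*,[ ]*['"]id['"][ ]*=>.{0,100}['"]idTwo['"][ ]*=>.{0,100}[ ]*\)[ ]*\)~
RE;

// Hypothetical source lines, invented for this test.
$lines = array(
    // single quotes, as in the original call
    "URL::Make( 'site/store.inc.php', array( 'action'=>'view','id'=> 12 ,'idTwo'=> 34 ) );",
    // double quotes and different spacing
    'URL::Make("site/store.inc.php",  array("action" => "view", "id" => 5, "idTwo" => 7));',
    // one argument less: must not be found
    "URL::Make( 'site/store.inc.php', array( 'action'=>'view','id'=> 12 ) );",
);

foreach ($lines as $line) {
    // prints 1 (found) or 0 (not found) for each solution, then the line
    printf("%d %d  %s\n", preg_match($solution1, $line), preg_match($solution2, $line), $line);
}
```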

An idea for DB server load balancing

If the application has a heavy load of read queries and the application requirements allow it,
a simple idea to balance the db server load is to run the INSERT/UPDATE/ALTER queries on all the servers and run the SELECT queries on just one server (chosen randomly).
Have a look at the 'loadBalancedQuery' function: if the query modifies the db it will be executed on both servers, otherwise on just one.

class BalancedDB extends Mysqli {

    private function queryServer($query, $server = "mainServer" ) {
        $db = DB::GetInstance($server);
        return $db->query($query);
    }

    public function loadBalancedQuery($query) {
        #write queries: run them on both servers
        if (preg_match('/^\s*(insert|update|alter)\b/i', $query)) {
            $result = $this->queryServer( $query, "mainServer" );
            $this->queryServer( $query, "secondaryServer" );
            return $result;
        } else {
            #read queries: run them on one server chosen randomly
            return $this->queryServer( $query, rand(0,1) ? "mainServer": "secondaryServer" );
        }
    }

}
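
The routing decision can also be isolated in two small functions, which makes it easy to test without any db connection. This is a sketch only: `isWriteQuery()` and `serversFor()` are names I made up, and the keyword list (with DELETE and REPLACE added) is an assumption to adapt to the queries the application really uses.

```php
<?php
// Hypothetical helper: tells whether a query changes data.
// The keyword list is an assumption, extend it as needed.
function isWriteQuery($query)
{
    return preg_match('/^\s*(insert|update|delete|replace|alter)\b/i', $query) === 1;
}

// Hypothetical helper: returns the list of servers the query should be sent to.
function serversFor($query, array $servers)
{
    if (isWriteQuery($query)) {
        return $servers; // writes go to every server
    }
    return array($servers[array_rand($servers)]); // reads go to one random server
}

$servers = array('mainServer', 'secondaryServer');
print_r(serversFor("UPDATE `User` SET name = 'x' WHERE id = 1", $servers));
print_r(serversFor("SELECT * FROM `User` WHERE note = 'insert coin'", $servers));
```

The pattern is anchored to the start of the query, so a SELECT that merely contains the word "insert" in a string is still treated as a read.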

Sunday, October 11, 2009

UTF8 in LAMP applications: overview and how to solve the common issues

The problem
In a LAMP application the text is frequently saved to and retrieved from a database and files. We must consider all the different encodings (character-to-byte mappings): latin1 (ISO-8859-1), latin9, UTF-8 (utf8), etc.
Lots of applications use an ISO-8859 encoding and some PHP functions to convert the characters (htmlentities, htmlspecialchars, etc.)


The solution
Converting all the text from one encoding to another using PHP functions is unsafe, difficult and annoying.
For example, the character "é" encoded as UTF-8 will be printed as "Ã©" if the text is assumed to be ISO-8859-1.
The solution is to use only one charset for the files, the Content-Type of the pages and the db. UTF-8 [wiki] is the best choice: a variable-length character encoding for the Unicode standard. If you use this charset in the HTML files, there is no need to convert the characters to the respective entities.

How To use UTF-8
  • set your IDE to save and open source files using utf8 encoding
  • set the content-type of your application to utf-8 (an Apache/.htaccess rule is better than the meta tag).
  • set the database server to use utf8 encoding (the tables must be converted too). If the db is utf8 but the client encoding is latin1, execute the query "SET NAMES utf8" before any other query
  • if the application was using latin1 and PHP conversion functions, remove all the existing calls that encode/decode special characters/entities.
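
The encoding mismatch is easy to reproduce in a few lines. This sketch assumes the mbstring extension is loaded; the character is written as raw bytes so the result does not depend on how the script file itself is saved.

```php
<?php
// Declare the charset once, in the HTTP header, before any output (harmless on the command line).
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

$e = "\xC3\xA9"; // the character "é" as UTF-8 bytes

echo strlen($e), "\n";             // 2: one character, two bytes in UTF-8
echo mb_strlen($e, 'UTF-8'), "\n"; // 1

// Reading the same two bytes as ISO-8859-1 gives two characters: the classic garbled output.
$garbled = mb_convert_encoding($e, 'UTF-8', 'ISO-8859-1');
echo $garbled, "\n";               // Ã©
```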

Monday, September 28, 2009

Mass rows copying (duplicating) with field customization - MySQL

Let's suppose we have a table with the following structure and data:

# `table`
id | a | b | c |
---------------------------
1 | "aaa" | "xxx" | "ccc"
2 | "aab" | "yyy" | "ccc"
3 | "aba" | "yyy" | "ccc"
4 | "abc" | "xxx" | "ccc"
5 | "dcz" | "xxx" | "eee"



Now, we want to copy some records (only some columns) to the same table AND change some of them with a fixed value.
Requirements:
- duplicate all the rows with `b` equal to "xxx"
- for the new rows inserted, the value of `c` has to be changed to "ddd" (fixed value!)
- `id` is the primary key, auto increment

Query (pay attention to the quote characters):
INSERT INTO `table`
(SELECT
NULL as `id`,
`a`,
`b`,
"ddd" as `c`
FROM `table`
WHERE `b`="xxx")

Result:

id | a | b | c |
---------------------------
1 | "aaa" | "xxx" | "ccc" #row1
2 | "aab" | "yyy" | "ccc"
3 | "aba" | "yyy" | "ccc"
4 | "abc" | "xxx" | "ccc" #row4
5 | "dcz" | "xxx" | "eee" #row5
6 | "aaa" | "xxx" | "ddd" #copied from #row1
7 | "abc" | "xxx" | "ddd" #copied from #row4
8 | "dcz" | "xxx" | "ddd" #copied from #row5



Note 1:
The insert operation using a subquery is not an 'atomic' operation in recent versions of MySQL. If there are table constraints and a new record is not accepted (e.g. duplicate record for a unique key defined on some columns), only that record won't be inserted (not all of them!).
Example: if the table already contained a record with the same values as one of the new rows and there was a unique key on columns (a,b,c), the query would insert only the other new rows (the duplicate fails).
The exact behaviour depends on the storage engine: a transactional engine such as InnoDB rolls back the whole statement on error, while MyISAM keeps the rows inserted before the error.
Some old versions of MySQL stop the execution in case of duplicate entries. In this case, add IGNORE to the query to skip the duplicates: INSERT IGNORE INTO ...
Note 2:
If you add a new column to the table, the query will fail !! The INSERT has no column list, so the SELECT must return exactly one value for each column of the table.


Tested on MySQL 5.1.36


Tuesday, September 15, 2009

How to recursively check syntax of PHP files

The PHP executable supports the '-l' option, which checks the syntax of a file without executing it.
Using the 'find' command, it's possible to do an interesting operation: checking the syntax of all the files recursively, to catch parse errors in any script !!

find ./ -type f -name \*.php -exec php -l {} \;

The result will be a list of files, for example:

No syntax errors detected in ./codebase/controller/competition.inc.php
No syntax errors detected in ./codebase/controller/feed_data.inc.php
Errors parsing ./codebase/controller/site/contact.inc.php
No syntax errors detected in ./codebase/controller/compare_prices.inc.php


We can improve the command and print only the files with syntax errors using 'grep'

find ./ -type f -name \*.php -exec php -l {} \; | grep "Errors parsing ";

To launch it from a PHP script

passthru('find ./ -type f -name \*.php -exec php -l {} \; | grep "Errors parsing " ');

Updated: to skip the .svn directories, add this option to 'find':
-not -regex '.*/\.svn/.*'
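
Where 'find' is not available (for instance on Windows), a similar check can be written in PHP only. This is a sketch under some assumptions: `findSyntaxErrors()` is a name I made up, exec() must be allowed, and PHP_BINARY (PHP 5.4+) must point to the command line executable.

```php
<?php
// Hypothetical helper: returns the .php files under $dir that fail "php -l".
function findSyntaxErrors($dir)
{
    $failed = array();
    $files  = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    $svnDir = DIRECTORY_SEPARATOR . '.svn' . DIRECTORY_SEPARATOR;

    foreach ($files as $file) {
        $path = $file->getPathname();
        // Skip non-PHP files and anything inside a .svn directory.
        if (substr($path, -4) !== '.php' || strpos($path, $svnDir) !== false) {
            continue;
        }
        $output = array();
        exec(escapeshellarg(PHP_BINARY) . ' -l ' . escapeshellarg($path) . ' 2>&1', $output, $status);
        if ($status !== 0) { // "php -l" exits with a non-zero status on a parse error
            $failed[] = $path;
        }
    }
    return $failed;
}
```

Calling print_r(findSyntaxErrors('./')) from the root of the project prints the list of files with parse errors.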
 
