Unless you want loads of weird characters displayed from a PHP DOMDocument object, it is very important to set the document's encoding, and to convert the original string to that encoding before loading it.

E.g. this uses UTF-8:

$xml = new DOMDocument();
$xml->encoding = 'utf-8';

Bye bye weird characters (DOMDocument objects in PHP tend to change the character encoding of the text they handle, so setting it explicitly avoids the problem).
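
A slightly fuller sketch of the same idea, in case it helps. It assumes the original string is in ISO-8859-1 and that the mbstring extension is available; the sample string and the element name "name" are made up for illustration. The string is converted to UTF-8 first, and the encoding is passed to the DOMDocument constructor, which has the same effect as setting the encoding property afterwards.

```php
<?php
// Hypothetical input: "café" as ISO-8859-1 bytes (0xE9 is "é" in Latin-1).
$latin1 = "caf\xE9";

// Convert the original string to UTF-8 before handing it to the DOM.
// Assumption: the source encoding is known to be ISO-8859-1.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Declare the document encoding up front (version, encoding).
$xml = new DOMDocument('1.0', 'utf-8');

// Build a one-element document holding the converted text.
$root = $xml->createElement('name');
$root->appendChild($xml->createTextNode($utf8));
$xml->appendChild($root);

// Serialise; the declared encoding is used for the output.
$out = $xml->saveXML();
echo $out;
```

If the source encoding is something else, only the third argument to mb_convert_encoding() should need to change; iconv() would work here as well.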