Working with XML in PHP via DOM: Parsing, Creating, and Modifying

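    This article does not teach XML itself; please learn that separately. It also skips advanced XML operations and covers only simple create, read, update, and delete.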

Class diagram

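     As the diagram shows, DOMNode is the parent of most of the classes, and many of the methods and properties used below are inherited from it. The classes used most often are the ones marked in green, and they are the main subject of this article.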

[Figure 1: class diagram of the PHP DOM classes]

Creating and assembling XML

Creating a DOMDocument

$doc=new DOMDocument("1.0","utf-8"); 
$doc->formatOutput=true; //Nicely formats output with indentation and extra space
echo  $doc->saveXML();

This gives us an empty XML document: just the declaration, with no elements.

<?xml version="1.0" encoding="utf-8"?>

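Note: saveXML() returns the XML as a string. Below we create and assemble the document step by step and use this function to display it. Keep appending the snippets below to see the overall result.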

Creating an 'empty' root element

$root = $doc->createElement("breakfast_menu");
$doc->appendChild($root);
echo $doc->saveXML();

Result:

<?xml version="1.0" encoding="utf-8"?>
<breakfast_menu/>


Creating an 'empty' child element

$food = $doc->createElement("food"); // create an empty child element named food
$root->appendChild($food);
echo $doc->saveXML();

Result:

<?xml version="1.0" encoding="utf-8"?>
<breakfast_menu>
  <food/>
</breakfast_menu>


Adding an attribute, method 1

$food->setAttribute("onsale","yes"); // add an attribute, method 1
echo $doc->saveXML();

Result:

<?xml version="1.0" encoding="utf-8"?>
<breakfast_menu>
  <food onsale="yes"/>
</breakfast_menu>


Adding an attribute, method 2

$food_attr = $doc->createAttribute("fresh"); // add an attribute, method 2
$food_attr_txt = $doc->createTextNode("no");
$food_attr->appendChild($food_attr_txt);
$food->appendChild($food_attr);
echo $doc->saveXML();

Result:

<?xml version="1.0" encoding="utf-8"?>
<breakfast_menu>
  <food onsale="yes" fresh="no"/>
</breakfast_menu>


Adding a child element with text, method 1

$food_name = $doc->createElement("name","Waffles"); // create an element with text, method 1
$food->appendChild($food_name);
echo $doc->saveXML();

Result:

<?xml version="1.0" encoding="utf-8"?>
<breakfast_menu>
  <food onsale="yes" fresh="no">
    <name>Waffles</name>
  </food>
</breakfast_menu>


Adding a child element with text, method 2

$food_price = $doc->createElement("price"); // create an element with text, method 2
$food_price_txt = $doc->createTextNode("$4");
$food_price->appendChild($food_price_txt);
$food->appendChild($food_price);
echo $doc->saveXML();

Result:

<?xml version="1.0" encoding="utf-8"?>
<breakfast_menu>
  <food onsale="yes" fresh="no">
    <name>Waffles</name>
    <price>$4</price>
  </food>
</breakfast_menu>
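
The two methods differ in one respect that is worth illustrating. According to the PHP manual, the value passed as the second argument of createElement() is not escaped, whereas text created with createTextNode() is escaped when the document is serialized. The following standalone sketch is separate from the running example; `$demo`, the element name `title`, and the sample text are made up for illustration. It uses method 2 with text that contains an ampersand:

```php
<?php
// Standalone sketch; $demo, "title" and the sample text are illustrative only.
$demo = new DOMDocument("1.0", "utf-8");
$title = $demo->createElement("title");
// createTextNode() takes raw text; the serializer escapes it on output.
$title->appendChild($demo->createTextNode("Tom & Jerry"));
$demo->appendChild($title);
$out = $demo->saveXML();
echo $out; // the ampersand is written as &amp;
```

For values that may contain `&` or `<`, method 2 is therefore the safer habit.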


Adding a comment

$food_comment = $doc->createComment("Hello, I am a comment"); // create a comment
$food->appendChild($food_comment);
echo $doc->saveXML();

Result:

<?xml version="1.0" encoding="utf-8"?>
<breakfast_menu>
  <food onsale="yes" fresh="no">
    <name>Waffles</name>
    <price>$4</price>
    <!--Hello, I am a comment-->
  </food>
</breakfast_menu>


Adding text

$txt = $doc->createTextNode("TXT");
$food->appendChild($txt);
echo $doc->saveXML();

Result (once the element has mixed content, the indentation inside it is no longer applied):

<?xml version="1.0" encoding="utf-8"?>
<breakfast_menu>
  <food onsale="yes" fresh="no"><name>Waffles</name><price>$4</price><!--Hello, I am a comment-->TXT</food>
</breakfast_menu>


Saving to a file

$doc->save("./create.xml");



A file named create.xml is generated in the current directory.
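
As a quick check that the file really contains the document, it can be loaded back with load(). This is a minimal sketch, assuming a temporary file instead of ./create.xml so that it leaves nothing behind; the variable names are illustrative.

```php
<?php
// Sketch: write a document to a temporary file and read it back.
// The temp-file location is an assumption; the article saves to ./create.xml.
$path = tempnam(sys_get_temp_dir(), "dom");
$doc = new DOMDocument("1.0", "utf-8");
$doc->appendChild($doc->createElement("breakfast_menu"));
$bytes = $doc->save($path);      // number of bytes written, or false on failure
if ($bytes === false) {
    exit("could not write $path\n");
}
$copy = new DOMDocument();
$copy->load($path);              // parse the file into a second document
$root_name = $copy->documentElement->nodeName;
echo $root_name, "\n";
unlink($path);                   // clean up the temporary file
```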

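Summary

    Above we created an XML document. The basic approach is to use the DOMDocument object to create the components you need and then assemble them with appendChild() (some things can also be set directly, for example with setAttribute()). A few more points are worth noting: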

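    1. There is exactly one root element, but it is otherwise an ordinary element and can have attributes, text content, comments, and child elements.

    2. More complex XML is likewise built from elements and child elements.

    3. In my observation, when an element has several child elements there are text nodes between them. For a document parsed from indented text, these are the whitespace-only text nodes that hold the line breaks and indentation (unless preserveWhiteSpace is set to false before loading). A tree built purely in code contains only the nodes you appended.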
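
Point 3 can be checked by counting child nodes. The sketch below uses a made-up three-line document and compares a default parse, a parse with preserveWhiteSpace turned off, and a tree built in code; the variable names are illustrative.

```php
<?php
$xml = "<menu>\n  <food/>\n  <food/>\n</menu>";

// Default parse: whitespace between elements is kept as text nodes.
$doc = new DOMDocument();
$doc->loadXML($xml);
$with_ws = $doc->documentElement->childNodes->length;

// preserveWhiteSpace must be set before loading to drop blank text nodes.
$doc2 = new DOMDocument();
$doc2->preserveWhiteSpace = false;
$doc2->loadXML($xml);
$without_ws = $doc2->documentElement->childNodes->length;

// A tree built in code has only the nodes that were appended.
$doc3 = new DOMDocument();
$menu = $doc3->appendChild($doc3->createElement("menu"));
$menu->appendChild($doc3->createElement("food"));
$menu->appendChild($doc3->createElement("food"));
$built = $menu->childNodes->length;

echo "$with_ws $without_ws $built\n";
```

In the default parse the two food elements are surrounded by three whitespace-only text nodes, which is why the first count is larger than the other two.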

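Also: some tutorials say that XML attributes are hard to extend and recommend using only child elements. Looking into it, I found that DOMElement offers many methods for working with attributes, considerably more than for child elements. On reflection, I think the following split may work better: use attributes for information that occurs only once, and child elements for information that can occur several times. For example, when describing a person, gender, age, father, and mother would be attributes, while siblings, homes, and vehicles would be child elements. Discussion is welcome.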
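
A minimal sketch of that convention, with invented element names and values (`person`, `sibling`, and the data are not from any real schema):

```php
<?php
// Sketch of the suggested convention; all names and values are invented.
$doc = new DOMDocument("1.0", "utf-8");
$doc->formatOutput = true;
$person = $doc->createElement("person");
$doc->appendChild($person);
// Single-valued facts become attributes.
$person->setAttribute("gender", "female");
$person->setAttribute("age", "30");
// Facts that can occur several times become child elements.
foreach (["Anna", "Ben"] as $name) {
    $sibling = $doc->createElement("sibling");
    $sibling->appendChild($doc->createTextNode($name));
    $person->appendChild($sibling);
}
$xml = $doc->saveXML();
echo $xml;
```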

The complete code for the breakfast_menu walkthrough above:

<?php

echo "========= Create a document ========\n";
$doc=new DOMDocument("1.0","utf-8");
$doc->formatOutput=true;//Nicely formats output with indentation and extra space
echo $doc->saveXML();
echo "========= Create an 'empty' root element ========\n";
$root = $doc->createElement("breakfast_menu"); // create an empty root element named breakfast_menu
$doc->appendChild($root);
echo $doc->saveXML();
echo "========= Create an 'empty' child element ========\n";
$food = $doc->createElement("food"); // create an empty child element named food
$root->appendChild($food);
echo $doc->saveXML();
echo "========= Add an attribute, method 1 ========\n";
$food->setAttribute("onsale","yes"); // add an attribute, method 1
echo $doc->saveXML();
echo "========= Add an attribute, method 2 ========\n";
$food_attr = $doc->createAttribute("fresh"); // add an attribute, method 2
$food_attr_txt = $doc->createTextNode("no");
$food_attr->appendChild($food_attr_txt);
$food->appendChild($food_attr);
echo $doc->saveXML();
echo "========= Add a child element with text, method 1 ========\n";
$food_name = $doc->createElement("name","Waffles"); // create an element with text, method 1
$food->appendChild($food_name);
echo $doc->saveXML();
echo "========= Add a child element with text, method 2 ========\n";
$food_price = $doc->createElement("price"); // create an element with text, method 2
$food_price_txt = $doc->createTextNode("$4");
$food_price->appendChild($food_price_txt);
$food->appendChild($food_price);
echo $doc->saveXML();
echo "========= Add a comment ========\n";
$food_comment = $doc->createComment("Hello, I am a comment"); // create a comment
$food->appendChild($food_comment);
echo $doc->saveXML();
echo "========= Add text (breaks the indented formatting) ========\n";
$txt = $doc->createTextNode("TXT");
$food->appendChild($txt);
echo $doc->saveXML();
$doc->save("./create.xml"); // write the result to create.xml in the current directory
?>



DOMAttr operations in detail

$att_txt1 = $doc->createTextNode("111"); // "111"
$att_txt2 = $doc->createTextNode("222"); // "222"
$att_txt3 = $doc->createTextNode("333"); // "333"
$att_txt4 = $doc->createTextNode("444"); // "444"

$att = $doc->createAttribute("id");      // id=""
$att->appendChild($att_txt1);            // id="111"
$att->appendChild($att_txt2);            // id="111222"
$att->replaceChild($att_txt3,$att_txt2); // id="111333"
$att->insertBefore($att_txt4,$att_txt3); // id="111444333"
$att->removeChild($att_txt4);            // id="111333"
echo $att->name;                 // id
echo "\n";
echo $att->value;                // 111333
echo "\n";
var_dump($att->hasAttributes()); // bool(false)
echo "\n";
var_dump($att->hasChildNodes()); // bool(true): yes, an attribute has child nodes,
                                 // so its value can be assembled from several text nodes.
$txt_list = $att->childNodes;    // take the child nodes out one by one
foreach ($txt_list as $txt)      // other properties inherited from DOMNode include:
{                                // $att->parentNode;
        print_r($txt);           // $att->firstChild;
        echo $txt->wholeText;         // $att->previousSibling;
        echo "\n";               // and so on
}


DOMText operations in detail


        $txt = $doc->createTextNode("aab");  // "aab"
        $txt->appendData("bcc");             // "aabbcc"
        $txt->deleteData(2,2);               // "aacc"
        $txt->insertData(2,"bb");            // "aabbcc"
        $txt->replaceData(2,2,"BB");         // "aaBBcc"
        
        echo $txt->substringData(1,4);       // "aBBc"
        echo "\n";
        echo $txt->length;                   // 6
        echo "\n";
        echo $txt->wholeText;                // aaBBcc
        echo "\n";
        echo $txt->data;                     // aaBBcc
        echo "\n";



DOMComment operations in detail


        $com = $doc->createComment("AAB"); // <!--AAB-->
        $com->appendData("BCC");           // <!--AABBCC-->
        $com->deleteData(2,2);             // <!--AACC-->
        $com->insertData(2,"BB");          // <!--AABBCC-->
        $com->replaceData(2,2,"bb");       // <!--AAbbCC-->
        echo $com->substringData(1,4);     // AbbC
        echo "\n";
        echo $com->data;                   // AAbbCC
        echo "\n";
        echo $com->length;                 // 6
        echo "\n";


Parsing XML

Example:
$xml=<<<LLL
<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu id="1">
        AAA
        <food onsale="yes" fresh="yes">
                <name>Waffles</name>
                <price>$7.95</price>
                <calories>900</calories>
        </food>
        BBB
        <food onsale="yes" fresh="no">
                <name>Bread</name>
                <price>$8.95</price>
                <calories>900</calories>
        </food>
        CCC
        <drink onsale="no" fresh="yes">
                <name>Milk</name>
                <price>$4.50</price>
                <calories>600</calories>
        </drink>
        DDD
        <drink onsale="no" fresh="no">
                <name>Orange Juice</name>
                <price>$6.95</price>
                <calories>950</calories>
        </drink>
        EEE
        <!-- this is comment  -->
        FFF
</breakfast_menu>
LLL;



Creating the document and loading the XML

        $doc=new DOMDocument("1.0","utf-8"); 
        $doc->formatOutput=true; # Nicely formats output with indentation and extra space
        $doc->validateOnParse=true;
        $doc->loadXML($xml); // load from a string  #$doc->load($xmlfilename); // load from a file
        echo $doc->saveXML(); // produce a string  #echo $doc->save($xmlfilename); // save to a file

Getting the root node (three methods)

if ($doc->hasChildNodes())
{
                # Method 1
                $root = $doc->firstChild; # because it has one and only one root node
                # Method 2
                $root = $doc->childNodes->item(0); # the same as above
                # Method 3 (recommended)
                $root_list = $doc->getElementsByTagName("breakfast_menu"); # the XML may have a root that is not breakfast_menu
                if ($root_list->length > 0)
                {
                        $root = $root_list->item(0);
                }
                echo 'root name -> '.$root->nodeName;
                echo "\n";
}

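
Methods 1 and 2 rely on the root element being the first child of the document. That holds for the sample XML, but a comment or DOCTYPE placed before the root is also a child of the document. DOMDocument additionally exposes a documentElement property (not used in the original walkthrough) that points at the root element directly. A small sketch with a made-up document:

```php
<?php
// Sketch with a made-up document whose root is preceded by a comment.
$doc = new DOMDocument();
$doc->loadXML('<!-- header --><breakfast_menu/>');
$first = $doc->firstChild->nodeName;        // the comment node comes first
$root  = $doc->documentElement->nodeName;   // always the root element
echo $first, "\n", $root, "\n";
```

Here firstChild is the comment node, while documentElement is still breakfast_menu.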

Getting child elements from the root node (child element name known) and attributes (attribute names known or unknown)

        $food_list = $root->getElementsByTagName("food");
        if ($food_list->length > 0) # check before use, anything is possible.
        {
                foreach ($food_list as $key => $value)
                {
                        print_list($key, $value); // user-defined function, see below
                        # Get all attributes; unknown ones are found this way too
                        if ($value->hasAttributes())
                        {
                                $attribute_list = $value->attributes;
                                foreach ($attribute_list as $attr) # separate name so the outer $value is not overwritten
                                {
                                        echo 'name -> '.$attr->name;
                                        echo "\n";
                                        echo 'value -> '.$attr->value;
                                        echo "\n";
                                }
                        }
                }
        }


        $drink_list = $root->getElementsByTagName("drink");
        if ($drink_list->length > 0) # check before use, anything is possible.
        {
                foreach ($drink_list as $key => $value) # iterate the drink list, not the food list
                {
                        print_list($key, $value);

                        if ($value->hasAttribute("onsale"))
                        {
                                echo "onsale -> ".$value->getAttribute("onsale")."\n";

                        }
                        if ($value->hasAttribute("fresh"))
                        {
                                echo "fresh -> ".$value->getAttribute("fresh")."\n";

                        }

                }
        }

        function print_list($key, $value) # prints name, price and calories of one element; no references needed
        {
                echo 'key -> '.$key."\n";
                #echo 'value type ->';print_r($value)."\n";
                $name_list = $value->getElementsByTagName('name');
                if ($name_list->length > 0)
                {
                        echo 'name -> '.$name_list->item(0)->nodeValue."\n";
                }
                $price_list = $value->getElementsByTagName('price');
                if ($price_list->length > 0)
                {
                        echo 'price -> '.$price_list->item(0)->nodeValue."\n";
                }
                $calories_list = $value->getElementsByTagName('calories');
                if ($calories_list->length > 0)
                {
                        echo 'calories -> '.$calories_list->item(0)->nodeValue."\n";
                }


        }


If the nesting goes deeper, simply keep going in the same way.
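
For arbitrary depth, "keep going in the same way" can be written as a recursive function. The sketch below is one possible way to do it; `walk_elements()` is a hypothetical helper, not part of the DOM extension, and the sample XML is made up.

```php
<?php
// Sketch: walk element children recursively, whatever the depth.
// walk_elements() is a hypothetical helper, not part of the DOM extension.
function walk_elements(DOMNode $node, int $depth, array &$lines): void
{
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMElement) {
            // Record the tag name, indented by nesting depth.
            $lines[] = str_repeat("  ", $depth) . $child->tagName;
            walk_elements($child, $depth + 1, $lines);
        }
    }
}

$doc = new DOMDocument();
$doc->loadXML("<menu><food><name>Waffles</name><price><amount>7.95</amount></price></food></menu>");
$lines = [];
walk_elements($doc, 0, $lines);
echo implode("\n", $lines), "\n";
```

Text and comment nodes are skipped here; only element names are collected.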

Finally, here is how to parse the XML when you do not know in advance which child elements exist

        if ($root->hasChildNodes())
        {
                # let us see what we can get from an element's childNodes:
                # text, comments and sub-elements, but no attributes
                print_childs($root->childNodes); // see below
        }



        function print_childs($nodelist)
        {
                echo "=====================\n";
                foreach ($nodelist as $key => $value)
                {
                        echo $key;
                        echo "\n";
                        print_r($value);
                        if (is_a($value,'DOMText'))
                        {
                                #echo $value->wholeText;
                                #echo '-------------';
                                echo $value->nodeValue;
                        }
                        if (is_a($value,'DOMElement'))
                        {
                                echo $value->tagName;
                                echo "\n-------------\n";
                                echo $value->nodeValue;
                        }
                        # this shows that comments can be retrieved as well
                        if (is_a($value,'DOMComment'))
                        {
                                echo $value->nodeValue;
                        }
                        echo "\n";
                }
                echo "=====================\n";
        }




Modifying XML

Original XML

$xml=<<<LLL
<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu id="1">
        <food onsale="yes" fresh="yes">
                <name>Waffles</name>
                <price>$7.95</price>
                <calories>900</calories>
        </food>
        <food onsale="yes" fresh="no">
                <name>Bread</name>
                <price>$8.95</price>
                <calories>900</calories>
        </food>
        <!-- this is comment  -->
</breakfast_menu>
LLL;



What gets modified: child elements, element text values, and element attributes

        $doc=new DOMDocument("1.0","utf-8"); # version and encoding for the XML declaration
        $doc->formatOutput=true; # Nicely formats output with indentation and extra space
        $doc->validateOnParse=true;
        $doc->loadXML($xml);
        #$doc->load($xmlfilename);
        # print 
        echo $doc->saveXML();
        #echo $doc->save($xmlfilename);


        # get root
        if ($doc->hasChildNodes())
        {
                $root_list = $doc->getElementsByTagName("breakfast_menu");# this xml may have root but not breakfast_menu
                if ($root_list->length > 0)
                {
                        $root = $root_list->item(0);
                }

                echo 'root name -> '.$root->nodeName;
                echo "\n";
        }


        # get value
        echo " ===== food =====\n";
        $food_list = $root->getElementsByTagName("food");
        if ($food_list->length > 0) # check before use, anything is possible.
        {
                foreach ($food_list as $key => $value)
                {
                        $name_list = $value->getElementsByTagName('name');
                        if ($name_list->length > 0)
                        {
                              $name_list->item(0)->nodeValue="rice";
                        }
                        $price_list = $value->getElementsByTagName('price');
                        if ($price_list->length > 0)
                        {
                                $value->removeChild($price_list->item(0));
                        }
                        $calories_list = $value->getElementsByTagName('calories');
                        if ($calories_list->length > 0)
                        {
                                $calories_list->item(0)->nodeValue = "3999";
                        }

                        if ($value->hasAttribute("onsale"))
                        {
                                $value->setAttribute("onsale","modify");

                        }

                        if ($value->hasAttribute("fresh"))
                        {
                                $value->removeAttribute("fresh");

                        }
                }
        }


        echo $doc->saveXML();

?>
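
In the loop above each food element has exactly one price, so removing it inside the loop is unproblematic. Removing the very nodes you are iterating over needs more care: getElementsByTagName() returns a live list, and deleting items while looping over it can skip nodes. One common workaround, sketched here with a made-up document, is to copy the list into an array first:

```php
<?php
$doc = new DOMDocument();
$doc->loadXML("<menu><food/><food/><food/><drink/></menu>");
$root = $doc->documentElement;
// Copy the live list into a plain array before changing the tree.
$foods = iterator_to_array($root->getElementsByTagName("food"));
foreach ($foods as $food) {
    $root->removeChild($food);
}
$left = $root->getElementsByTagName("food")->length;
echo $left, "\n";                 // number of food elements still present
echo $doc->saveXML($root), "\n";  // serialize just the root element
```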




