From 186e491d1e7f7bddc04d5169084b224a648aa457 Mon Sep 17 00:00:00 2001 From: "arseny.kapoulkine" Date: Sun, 31 Oct 2010 07:45:27 +0000 Subject: docs: Regenerated HTML documentation git-svn-id: http://pugixml.googlecode.com/svn/trunk@790 99668b35-9821-0410-8761-19e4c4f06640 --- docs/manual/access.html | 102 ++++++------- docs/manual/apiref.html | 285 +++++++++++++++++++++++++++++++++--- docs/manual/changes.html | 228 ++++++++++++++++++++++++++++- docs/manual/dom.html | 277 +++++++++++++++++++---------------- docs/manual/install.html | 128 ++++++++--------- docs/manual/loading.html | 183 ++++++++++++----------- docs/manual/modify.html | 128 +++++++++-------- docs/manual/saving.html | 111 +++++++------- docs/manual/toc.html | 10 +- docs/manual/xpath.html | 368 +++++++++++++++++++++++++++++++++++++++-------- 10 files changed, 1286 insertions(+), 534 deletions(-) (limited to 'docs/manual') diff --git a/docs/manual/access.html b/docs/manual/access.html index 4581583..d4b38c2 100644 --- a/docs/manual/access.html +++ b/docs/manual/access.html @@ -4,14 +4,15 @@ Accessing document data - - + + -
pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: @@ -79,10 +80,11 @@

parent function returns the - node's parent; all nodes except the document have non-null parent. first_child and last_child - return the first and last child of the node, respectively; note that only - document nodes and element nodes can have non-empty child node list. If node - has no children, both functions return null nodes. next_sibling + node's parent; all non-null nodes except the document have non-null parent. + first_child and last_child return the first and last child + of the node, respectively; note that only document nodes and element nodes + can have non-empty child node list. If node has no children, both functions + return null nodes. next_sibling and previous_sibling return the node that's immediately to the right/left of this node in the children list, respectively - for example, in <a/><b/><c/>, @@ -93,9 +95,9 @@ results in handle pointing to <a/>. If node does not have next/previous sibling (this happens if it is the last/first node in the list, respectively), the functions return null nodes. first_attribute, last_attribute, - next_attribute and previous_attribute functions behave the - same way as corresponding child node functions and allow to iterate through - attribute list in the same way. + next_attribute and previous_attribute functions behave similarly + to the corresponding child node functions and allow to iterate through attribute + list in the same way.

@@ -111,7 +113,8 @@ Calling any of the functions above on the null handle results in a null handle - i.e. node.first_child().next_sibling() returns the second child of node, - and null handle if there is no children at all or if there is only one. + and null handle if node is + null, has no children at all or if it has only one child node.

With these functions, you can iterate through all child nodes and display @@ -142,12 +145,13 @@

Apart from structural information (parent, child nodes, attributes), nodes can have name and value, both of which are strings. Depending on node type, - name or value may be absent. node_document - nodes do not have name or value, node_element - and node_declaration nodes - always have a name but never have a value, node_pcdata, - node_cdata and node_comment nodes never have a name but - always have a value (it may be empty though), node_pi + name or value may be absent. node_document + nodes do not have a name or value, node_element + and node_declaration nodes always + have a name but never have a value, node_pcdata, + node_cdata, node_comment + and node_doctype nodes never have a name + but always have a value (it may be empty though), node_pi nodes always have a name and a value (again, value may be empty). In order to get node's name or value, you can use the following functions:

@@ -161,18 +165,18 @@

It is common to store data as text contents of some node - i.e. <node><description>This is a node</description></node>. In this case, <description> node does not have a value, but instead - has a child of type node_pcdata - with value "This is a node". - pugixml provides two helper functions to parse such data: + has a child of type node_pcdata with value + "This is a node". pugixml + provides two helper functions to parse such data:

const char_t* xml_node::child_value() const;
 const char_t* xml_node::child_value(const char_t* name) const;
 

child_value() - returns the value of the first child with type node_pcdata - or node_cdata; child_value(name) is - a simple wrapper for child(name).child_value(). + returns the value of the first child with type node_pcdata + or node_cdata; child_value(name) + is a simple wrapper for child(name).child_value(). For the above example, calling node.child_value("description") and description.child_value() will both produce string "This is a node". If there is no child with relevant type, or if the handle is null, child_value functions return empty string. @@ -194,15 +198,14 @@ const char_t* xml_attribute::value() const;

- In case attribute handle is null, both functions return empty strings - they - never return null pointers. + In case the attribute handle is null, both functions return empty strings + - they never return null pointers.

In many cases attribute values have types that are not strings - i.e. an attribute may always contain values that should be treated as integers, despite the fact that they are represented as strings in XML. pugixml provides several - accessors that convert attribute value to some other type. The accessors - are as follows: + accessors that convert attribute value to some other type:

int xml_attribute::as_int() const;
 unsigned int xml_attribute::as_uint() const;
@@ -241,7 +244,7 @@
         value to boolean as follows: if attribute handle is null or attribute value
         is empty, false is returned.
         Otherwise, true is returned
-        if first character is one of '1', 't',
+        if the first character is one of '1', 't',
         'T', 'y', 'Y'.
         This means that strings like "true"
         and "yes" are recognized
@@ -370,11 +373,11 @@
         return past-the-end iterator for node/attribute list, respectively - this
         iterator can't be dereferenced, but decrementing it results in an iterator
         pointing to the last element in the list (except for empty lists, where decrementing
-        past-the-end iterator is not defined). Past-the-end iterator is commonly
-        used as a termination value for iteration loops (see sample below). If you
-        want to get an iterator that points to an existing handle, you can construct
-        the iterator with the handle as a single constructor argument, like so:
-        xml_node_iterator(node).
+        past-the-end iterator results in undefined behavior). Past-the-end iterator
+        is commonly used as a termination value for iteration loops (see sample below).
+        If you want to get an iterator that points to an existing handle, you can
+        construct the iterator with the handle as a single constructor argument,
+        like so: xml_node_iterator(node).
         For xml_attribute_iterator,
         you'll have to provide both an attribute and its parent node.
       

@@ -527,7 +530,7 @@

While there are existing functions for getting a node/attribute with known contents, they are often not sufficient for simple queries. As an alternative - to iterating manually through nodes/attributes until the needed one is found, + for manual iteration through nodes/attributes until the needed one is found, you can make a predicate and call one of find_ functions:

@@ -546,24 +549,24 @@

find_attribute function iterates through all attributes of the specified node, and returns the first attribute - for which predicate returned true. - If predicate returned false + for which the predicate returned true. + If the predicate returned false for all attributes or if there were no attributes (including the case where the node is null), null attribute is returned.

find_child function iterates through all child nodes of the specified node, and returns the first node - for which predicate returned true. - If predicate returned false + for which the predicate returned true. + If the predicate returned false for all nodes or if there were no child nodes (including the case where the node is null), null node is returned.

find_node function performs a depth-first traversal through the subtree of the specified node (excluding - the node itself), and returns the first node for which predicate returned - true. If predicate returned + the node itself), and returns the first node for which the predicate returned + true. If the predicate returned false for all nodes or if subtree was empty, null node is returned.
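The traversal order described for find_node can be sketched without pugixml. The following is a minimal illustration, not the library's implementation: Node is a hypothetical stand-in type, and the sketch assumes that each node is tested before its own children (pre-order) and that the start node itself is never tested.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM node; this is not a pugixml type.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Depth-first search through the subtree of `start`, excluding `start` itself.
// Returns the first node for which `pred` returns true, or nullptr if the
// predicate returned false for all nodes or the subtree is empty.
template <typename Predicate>
const Node* find_node(const Node& start, Predicate pred) {
    for (const Node& child : start.children) {
        if (pred(child)) return &child;                            // test the node first
        if (const Node* hit = find_node(child, pred)) return hit;  // then its subtree
    }
    return nullptr;
}
```

With a tree root -> (a -> b), b, a predicate matching the name "b" yields the nested b under a, because the subtree of a is exhausted before the second child of root is visited.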

@@ -622,11 +625,9 @@
xml_node xml_node::root() const;
 

- This function returns the node with type node_document, + This function returns the node with type node_document, which is the root node of the document the node belongs to (unless the node - is null, in which case null node is returned). Currently this function has - logarithmic complexity, since it simply finds such ancestor of the given - node which itself has no parent. + is null, in which case null node is returned).

While pugixml supports complex XPath expressions, sometimes a simple path @@ -643,9 +644,9 @@ path returns the path to the node from the document root, first_element_by_path looks for a node represented by a given path; a path can be an absolute one - (absolute paths start with delimiter), in which case the rest of the path - is treated as document root relative, and relative to the given node. For - example, in the following document: <a><b><c/></b></a>, + (absolute paths start with the delimiter), in which case the rest of the + path is treated as document root relative, or relative to the given node. + For example, in the following document: <a><b><c/></b></a>, node <c/> has path "a/b/c"; calling first_element_by_path for document with path "a/b" @@ -672,7 +673,7 @@

path function returns the - result as STL string, and thus is not available if PUGIXML_NO_STL + result as STL string, and thus is not available if PUGIXML_NO_STL is defined.

@@ -689,7 +690,7 @@ If the offset is not available (this happens if the node is null, was not originally parsed from a stream, or has changed in a significant way), the function returns -1. Otherwise it returns the offset to node's data from - the beginning of XML buffer in pugi::char_t + the beginning of XML buffer in pugi::char_t units. For more information on parsing offsets, see parsing error handling documentation.

@@ -704,7 +705,8 @@

-
pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: diff --git a/docs/manual/apiref.html b/docs/manual/apiref.html index 24120ad..5737c51 100644 --- a/docs/manual/apiref.html +++ b/docs/manual/apiref.html @@ -4,14 +4,15 @@ API Reference - - + + -
pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: @@ -110,7 +111,10 @@ node_pi
  • - node_declaration

    + node_declaration +
  • +
  • + node_doctype

  • @@ -261,12 +265,18 @@
  • parse_default
  • +
  • + parse_doctype +
  • parse_eol
  • parse_escapes
  • +
  • + parse_full +
  • parse_minimal
  • @@ -334,6 +344,10 @@ const;

    + +
  • + size_t hash_value() const;

    +
  • xml_attribute next_attribute() const; @@ -463,6 +477,10 @@ const;

    +
  • +
  • + size_t hash_value() const;

    +
  • xml_node_type type() @@ -621,6 +639,10 @@ xml_attribute append_attribute(const char_t* name);
  • +
  • + xml_attribute prepend_attribute(const char_t* + name); +
  • xml_attribute insert_attribute_after(const char_t* name, @@ -637,6 +659,11 @@ type = node_element);
  • +
  • + xml_node prepend_child(xml_node_type + type = + node_element); +
  • xml_node insert_child_after(xml_node_type type, @@ -647,10 +674,33 @@ type, const xml_node& node);

    +
  • +
  • + xml_node append_child(const char_t* + name); +
  • +
  • + xml_node prepend_child(const char_t* + name); +
  • +
  • + xml_node insert_child_after(const char_t* + name, + const xml_node& node); +
  • +
  • + xml_node insert_child_before(const char_t* + name, + const xml_node& node);

    +
  • xml_attribute append_copy(const xml_attribute& proto);
  • +
  • + xml_attribute prepend_copy(const xml_attribute& + proto); +
  • xml_attribute insert_copy_after(const xml_attribute& proto, @@ -666,6 +716,10 @@ xml_node append_copy(const xml_node& proto);
  • +
  • + xml_node prepend_copy(const xml_node& + proto); +
  • xml_node insert_copy_after(const xml_node& proto, @@ -738,7 +792,10 @@
  • xpath_node select_single_node(const char_t* - query) + query, + xpath_variable_set* + variables = + 0) const;
  • @@ -748,7 +805,10 @@
  • xpath_node_set select_nodes(const char_t* - query) + query, + xpath_variable_set* + variables = + 0) const;
  • @@ -769,6 +829,15 @@
  • ~xml_document();

    +
  • +
  • + void reset(); +
  • +
  • + void reset(const xml_document& + proto); +

    +
  • xml_parse_result load(std::istream& @@ -800,7 +869,15 @@ = parse_default, xml_encoding encoding = encoding_auto); -

    +
  • +
  • + xml_parse_result load_file(const wchar_t* + path, + unsigned int + options = + parse_default, + xml_encoding encoding + = encoding_auto);

  • @@ -839,6 +916,17 @@ encoding = encoding_auto) const; +
  • +
  • + bool save_file(const wchar_t* + path, + const char_t* indent + = "\t", unsigned + int flags + = format_default, xml_encoding + encoding = + encoding_auto) + const;

  • @@ -876,6 +964,10 @@ xml_encoding encoding = encoding_auto) const;


    + +
  • + xml_node document_element() const;

    +
  • @@ -960,32 +1052,58 @@ +
  • + struct xpath_parse_result +
    +
  • class xpath_query
  • class xpath_exception: public std::exception -
    • +
        +
      • virtual const char* what() const throw();

        -
      +
    • +
    • + const xpath_parse_result& result() const;

      + +
    • +
  • class xpath_node @@ -1055,6 +1186,18 @@
  • class xpath_node_set
      +
    • + xpath_node_set(); +
    • +
    • + xpath_node_set(const_iterator + begin, + const_iterator end, type_t + type = + type_unsorted); +

      + +
    • typedef const xpath_node* @@ -1100,6 +1243,97 @@
    • void sort(bool reverse = false); +

      + +
    • +
    +
  • +
  • + class xpath_variable +
      +
    • + const char_t* name() const; +
    • +
    • + xpath_value_type type() + const; +

      + +
    • +
    • + bool get_boolean() const; +
    • +
    • + double get_number() const; +
    • +
    • + const char_t* get_string() const; +
    • +
    • + const xpath_node_set& get_node_set() const;

      + +
    • +
    • + bool set(bool value); +
    • +
    • + bool set(double + value); +
    • +
    • + bool set(const char_t* + value); +
    • +
    • + bool set(const xpath_node_set& + value); +

      + +
    • +
    +
  • +
  • + class xpath_variable_set +
      +
    • + xpath_variable* + add(const char_t* + name, + xpath_value_type type);

      + +
    • +
    • + bool set(const char_t* + name, + bool value); +
    • +
    • + bool set(const char_t* + name, + double value); +
    • +
    • + bool set(const char_t* + name, + const char_t* value); +
    • +
    • + bool set(const char_t* + name, + const xpath_node_set& value);

      + +
    • +
    • + xpath_variable* + get(const char_t* + name); +
    • +
    • + const xpath_variable* get(const char_t* + name) + const; +

      +
  • @@ -1109,19 +1343,29 @@

    @@ -1134,7 +1378,8 @@

    -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: diff --git a/docs/manual/changes.html b/docs/manual/changes.html index 38e0cda..78cde23 100644 --- a/docs/manual/changes.html +++ b/docs/manual/changes.html @@ -4,14 +4,15 @@ Changelog - - + + -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: @@ -29,6 +30,224 @@ +
    + 1.11.2010 - version + 1.0 +
    +

    + Major release, featuring many XPath enhancements, wide character filename support, + miscellaneous performance improvements, bug fixes and more. +

    +
      +
    • + XPath: +
        +
      1. + XPath implementation is moved to pugixml.cpp (which is the only source + file now); use PUGIXML_NO_XPATH if you want to disable XPath to reduce + code size +
      2. +
      3. + XPath is now supported without exceptions (PUGIXML_NO_EXCEPTIONS); + the error handling mechanism depends on the presence of exception + support +
      4. +
      5. + XPath is now supported without STL (PUGIXML_NO_STL) +
      6. +
      7. + Introduced variable support +
      8. +
      9. + Introduced new xpath_query::evaluate_string, which works without + STL +
      10. +
      11. + Introduced new xpath_node_set constructor (from an iterator range) +
      12. +
      13. + Evaluation functions now accept attribute context nodes +
      14. +
      15. + All internal allocations use custom allocation functions +
      16. +
      17. + Improved error reporting; now a last parsed offset is returned together + with the parsing error +
      18. +
      +
    • +
    • + Bug fixes: +
        +
      1. + Fixed memory leak for loading from streams with stream exceptions + turned on +
      2. +
      3. + Fixed custom deallocation function calling with null pointer in one + case +
      4. +
      5. + Fixed missing attributes for iterator category functions; all functions/classes + can now be DLL-exported +
      6. +
      7. + Worked around Digital Mars compiler bug, which led to minor read + overfetches in several functions +
      8. +
      9. + load_file now works with 2+ Gb files in MSVC/MinGW +
      10. +
      11. + XPath: fixed memory leaks for incorrect queries +
      12. +
      13. + XPath: fixed xpath_node() attribute constructor with empty attribute + argument +
      14. +
      15. + XPath: fixed lang() function for non-ASCII arguments +
      16. +
      +
    • +
    • + Specification changes: +
        +
      1. + CDATA nodes containing ]]> are printed as several nodes; while + this changes the internal structure, this is the only way to escape + CDATA contents +
      2. +
      3. + Memory allocation errors during parsing now preserve last parsed + offset (to give an idea about parsing progress) +
      4. +
      5. + If an element node has only one child, and it is of CDATA type, then + the extra indentation is omitted (previously this behavior only held + for PCDATA children) +
      6. +
      +
    • +
    • + Additional functionality: +
        +
      1. + Added xml_parse_result default constructor +
      2. +
      3. + Added xml_document::load_file and xml_document::save_file with wide + character paths +
      4. +
      5. + Added as_utf8 and as_wide overloads for std::wstring/std::string + arguments +
      6. +
      7. + Added DOCTYPE node type (node_doctype) and a special parse flag, + parse_doctype, to add such nodes to the document during parsing +
      8. +
      9. + Added parse_full parse flag mask, which extends parse_default with + all node type parsing flags except parse_ws_pcdata +
      10. +
      11. + Added xml_node::hash_value() and xml_attribute::hash_value() functions + for use in hash-based containers +
      12. +
      13. + Added internal_object() and additional constructor for both xml_node + and xml_attribute for easier marshalling (useful for language bindings) +
      14. +
      15. + Added xml_document::document_element() function +
      16. +
      17. + Added xml_node::prepend_attribute, xml_node::prepend_child and xml_node::prepend_copy + functions +
      18. +
      19. + Added xml_node::append_child, xml_node::prepend_child, xml_node::insert_child_before + and xml_node::insert_child_after overloads for element nodes (with + name instead of type) +
      20. +
      21. + Added xml_document::reset() function +
      22. +
      +
    • +
    • + Performance improvements: +
        +
      1. + xml_node::root() and xml_node::offset_debug() are now O(1) instead + of O(logN) +
      2. +
      3. + Minor parsing optimizations +
      4. +
      5. + Minor memory optimization for strings in DOM tree (set_name/set_value) +
      6. +
      7. + Memory optimization for string memory reclaiming in DOM tree (set_name/set_value + now reallocate the buffer if memory waste is too big) +
      8. +
      9. + XPath: optimized document order sorting +
      10. +
      11. + XPath: optimized child/attribute axis step +
      12. +
      13. + XPath: optimized number-to-string conversions in MSVC +
      14. +
      15. + XPath: optimized concat for many arguments +
      16. +
      17. + XPath: optimized evaluation allocation mechanism: constant and document + strings are not heap-allocated +
      18. +
      19. + XPath: optimized evaluation allocation mechanism: all temporaries' + allocations use fast stack-like allocator +
      20. +
      +
    • +
    • + Compatibility: +
        +
      1. + Removed wildcard functions (xml_node::child_w, xml_node::attribute_w, + etc.) +
      2. +
      3. + Removed xml_node::all_elements_by_name +
      4. +
      5. + Removed xpath_type_t enumeration; use xpath_value_type instead +
      6. +
      7. + Removed format_write_bom_utf8 enumeration; use format_write_bom instead +
      8. +
      9. + Removed xml_document::precompute_document_order, xml_attribute::document_order + and xml_node::document_order functions; document order sort optimization + is now automatic +
      10. +
      11. + Removed xml_document::parse functions and transfer_ownership struct; + use xml_document::load_buffer_inplace and xml_document::load_buffer_inplace_own + instead +
      12. +
      13. + Removed as_utf16 function; use as_wide instead +
      14. +
      +
    • +
    1.07.2010 - version 0.9 @@ -548,7 +767,8 @@

    -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: diff --git a/docs/manual/dom.html b/docs/manual/dom.html index 2d65070..22509a9 100644 --- a/docs/manual/dom.html +++ b/docs/manual/dom.html @@ -4,14 +4,15 @@ Document object model - - + + -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: @@ -46,10 +47,10 @@

    pugixml stores XML data in DOM-like way: the entire XML document (both document structure and element data) is stored in memory as a tree. The tree can be - loaded from character stream (file, string, C++ I/O stream), then traversed - via special API or XPath expressions. The whole tree is mutable: both node - structure and node/attribute data can be changed at any time. Finally, the - result of document transformations can be saved to a character stream (file, + loaded from a character stream (file, string, C++ I/O stream), then traversed + with the special API or XPath expressions. The whole tree is mutable: both + node structure and node/attribute data can be changed at any time. Finally, + the result of document transformations can be saved to a character stream (file, C++ I/O stream or custom transport).

    @@ -58,12 +59,11 @@

    The XML document is represented with a tree data structure. The root of the - tree is the document itself, which corresponds to C++ type xml_document. Document has one or more - child nodes, which correspond to C++ type xml_node. - Nodes have different types; depending on a type, a node can have a collection - of child nodes, a collection of attributes, which correspond to C++ type - xml_attribute, and some additional - data (i.e. name). + tree is the document itself, which corresponds to C++ type xml_document. + Document has one or more child nodes, which correspond to C++ type xml_node. Nodes have different types; depending + on a type, a node can have a collection of child nodes, a collection of attributes, + which correspond to C++ type xml_attribute, + and some additional data (i.e. name).

    The tree nodes can be of one of the following types (which together form @@ -73,13 +73,13 @@

  • Document node (node_document) - this is the root of the tree, which consists of several child nodes. This - node corresponds to xml_document - class; note that xml_document - is a sub-class of xml_node, - so the entire node interface is also available. However, document node - is special in several ways, which will be covered below. There can be - only one document node in the tree; document node does not have any XML - representation.

    + node corresponds to xml_document + class; note that xml_document is + a sub-class of xml_node, so the entire + node interface is also available. However, document node is special in + several ways, which are covered below. There can be only one document + node in the tree; document node does not have any XML representation. +

  • @@ -87,13 +87,13 @@ is the most common type of node, which represents XML elements. Element nodes have a name, a collection of attributes and a collection of child nodes (both of which may be empty). The attribute is a simple name/value - pair. The example XML representation of element node is as follows: + pair. The example XML representation of element nodes is as follows:
  • <node attr="value"><child/></node>
     

    - There are two element nodes here; one has name "node", + There are two element nodes here: one has name "node", single attribute "attr" and single child "child", another has name "child" @@ -102,10 +102,10 @@

    • Plain character data nodes (node_pcdata) represent plain text in XML. PCDATA nodes have a value, but do not have - name or children/attributes. Note that plain character data is not a - part of the element node but instead has its own node; for example, an - element node can have several child PCDATA nodes. The example XML representation - of text node is as follows: + a name or children/attributes. Note that plain character data is not + a part of the element node but instead has its own node; for example, + an element node can have several child PCDATA nodes. The example XML + representation of text nodes is as follows:
    <node> text1 <child/> text2 </node>
     
    @@ -128,9 +128,9 @@

    • Comment nodes (node_comment) represent - comments in XML. Comment nodes have a value, but do not have name or - children/attributes. The example XML representation of comment node is - as follows: + comments in XML. Comment nodes have a value, but do not have a name or + children/attributes. The example XML representation of a comment node + is as follows:
    <!-- comment text -->
     
    @@ -138,14 +138,14 @@ Here the comment node has value "comment text". By default comment nodes are treated as non-essential part of XML markup and are not loaded during XML parsing. You can override - this behavior by adding parse_comments + this behavior with parse_comments flag.

    • Processing instruction nodes (node_pi) represent processing instructions (PI) in XML. PI nodes have a name and an optional value, but do not have children/attributes. The example XML representation - of PI node is as follows: + of a PI node is as follows:
    <?name value?>
     
    @@ -153,17 +153,17 @@ Here the name (also called PI target) is "name", and the value is "value". By default PI nodes are treated as non-essential part of XML markup and - are not loaded during XML parsing. You can override this behavior by adding - parse_pi flag. + are not loaded during XML parsing. You can override this behavior with + parse_pi flag.

    • Declaration node (node_declaration) represents document declarations in XML. Declaration nodes have a name ("xml") and an - optional collection of attributes, but does not have value or children. + optional collection of attributes, but do not have value or children. There can be only one declaration node in a document; moreover, it should be the topmost node (its parent should be the document). The example - XML representation of declaration node is as follows: + XML representation of a declaration node is as follows:
    <?xml version="1.0"?>
     
    @@ -172,12 +172,28 @@ and a single attribute with name "version" and value "1.0". By default declaration nodes are treated as non-essential part of XML markup - and are not loaded during XML parsing. You can override this behavior by - adding parse_declaration - flag. Also, by default a dummy declaration is output when XML document - is saved unless there is already a declaration in the document; you can - disable this by adding format_no_declaration - flag. + and are not loaded during XML parsing. You can override this behavior with + parse_declaration flag. Also, + by default a dummy declaration is output when XML document is saved unless + there is already a declaration in the document; you can disable this with + format_no_declaration flag. +

    +
    • + Document type declaration node (node_doctype) + represents document type declarations in XML. Document type declaration + nodes have a value, which corresponds to the entire document type contents; + no additional nodes are created for inner elements like <!ENTITY>. There can be only one document type + declaration node in a document; moreover, it should be the topmost node + (its parent should be the document). The example XML representation of + a document type declaration node is as follows: +
    +
    <!DOCTYPE greeting [ <!ELEMENT greeting (#PCDATA)> ]>
    +
    +

    + Here the node has value "greeting [ <!ELEMENT + greeting (#PCDATA)> ]". By default document type + declaration nodes are treated as non-essential part of XML markup and are + not loaded during XML parsing. You can override this behavior with parse_doctype flag.

    Finally, here is a complete example of XML document and the corresponding @@ -227,40 +243,45 @@

    Note

    - All pugixml classes and functions are located in pugi + All pugixml classes and functions are located in the pugi namespace; you have to either use explicit name qualification (i.e. pugi::xml_node), or to gain access to relevant symbols via using directive (i.e. using pugi::xml_node; or using - namespace pugi;). The namespace will be omitted from declarations - in this documentation hereafter; all code examples will use fully-qualified - names. + namespace pugi;). The namespace will be omitted from all + declarations in this documentation hereafter; all code examples will use + fully qualified names.

    Despite the fact that there are several node types, there are only three - C++ types representing the tree (xml_document, + C++ classes representing the tree (xml_document, xml_node, xml_attribute); some operations on xml_node - are only valid for certain node types. They are described below. + are only valid for certain node types. The classes are described below.

    -

    +

    xml_document is the owner of the entire document structure; it is a non-copyable class. The interface of xml_document consists of loading functions (see Loading document), saving functions (see Saving document) - and the interface of xml_node, + and the entire interface of xml_node, which allows for document inspection and/or modification. Note that while xml_document is a sub-class of xml_node, xml_node is not a polymorphic type; the - inheritance is only used to simplify usage. + inheritance is present only to simplify usage. Alternatively you can use + the document_element function + to get the element node that's the immediate child of the document.

    -

    +

    Default constructor of xml_document initializes the document to the tree with only a root node (document node). You can then populate it with data using either tree modification functions or loading functions; all loading functions destroy the previous tree with - all occupied memory, which puts existing nodes/attributes from this document - to invalid state. Destructor of xml_document + all occupied memory, which puts existing node/attribute handles for this + document to invalid state. If you want to destroy the previous tree, you + can use the xml_document::reset + function; it destroys the tree and replaces it with either an empty one or + a copy of the specified document. Destructor of xml_document also destroys the tree, thus the lifetime of the document object should exceed the lifetimes of any node/attribute handles that point to the tree.

    @@ -271,7 +292,7 @@

    While technically node/attribute handles can be alive when the tree they're - referring to is destroyed, calling any member function of these handles + referring to is destroyed, calling any member function for these handles results in undefined behavior. Thus it is recommended to make sure that the document is destroyed only after all references to its nodes/attributes are destroyed. @@ -279,16 +300,17 @@

    xml_node is the handle to - document node; it can point to any node in the document, including document - itself. There is a common interface for nodes of all types; the actual node - type can be queried via xml_node::type() method. Note that xml_node + document node; it can point to any node in the document, including the document + node itself. There is a common interface for nodes of all types; the actual + node type can be queried via the xml_node::type() + method. Note that xml_node is only a handle to the actual node, not the node itself - you can have several xml_node handles pointing to the same underlying object. Destroying xml_node handle does not destroy the node and does not remove it from the tree. The size of xml_node is equal to that of a pointer, so it is nothing more than a lightweight wrapper around - pointer; you can safely pass or return xml_node + a pointer; you can safely pass or return xml_node objects by value without additional overhead.

    @@ -300,14 +322,14 @@ for specific functions for more detailed information). This is useful for chaining calls; i.e. you can get the grandparent of a node like so: node.parent().parent(); if a node is a null node or it does not have a parent, the first parent() call returns null node; the second parent() - call then also returns null node, so you don't have to check for errors twice. + call then also returns null node, which makes error handling easier.
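The null-handle chaining described in this hunk can be illustrated with a toy model. The sketch below is not pugixml code: `toy_node`, `toy_handle` and their members are hypothetical names invented for this illustration. They only mimic the documented semantics, assuming a handle is a thin wrapper around one pointer, that operations on a null handle return another null handle, and that equality is decided by the underlying object.

```cpp
// Hypothetical stand-in for the library's internal node storage.
struct toy_node
{
    toy_node* parent; // null for the root of the toy tree
};

// Hypothetical handle: copying it never copies or destroys the node.
class toy_handle
{
public:
    toy_handle() : node_(nullptr) {}
    explicit toy_handle(toy_node* node) : node_(node) {}

    // On a null handle this returns another null handle instead of failing,
    // which is what makes chained calls safe.
    toy_handle parent() const
    {
        return node_ ? toy_handle(node_->parent) : toy_handle();
    }

    bool empty() const { return node_ == nullptr; }

    // Handles are equal exactly when they refer to the same underlying object.
    bool operator==(const toy_handle& rhs) const { return node_ == rhs.node_; }
    bool operator!=(const toy_handle& rhs) const { return node_ != rhs.node_; }

private:
    toy_node* node_;
};

// The wrapper adds no storage on top of the pointer it holds.
static_assert(sizeof(toy_handle) == sizeof(toy_node*), "handle should be pointer-sized");
```

With this model, `h.parent().parent()` needs a single `empty()` check at the end of the chain, whichever step produced the null handle.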

    xml_attribute is the handle to an XML attribute; it has the same semantics as xml_node, i.e. there can be several xml_attribute - handles pointing to the same underlying object, there is a special null attribute - value, which propagates to function results. + handles pointing to the same underlying object and there is a special null + attribute value, which propagates to function results.

    Both xml_node and xml_attribute have the default constructor @@ -316,16 +338,23 @@

    xml_node and xml_attribute try to behave like pointers, that is, they can be compared with other objects of the same type, making - it possible to use them as keys of associative containers. All handles to + it possible to use them as keys in associative containers. All handles to the same underlying object are equal, and any two handles to different underlying objects are not equal. Null handles only compare as equal to themselves. The result of relational comparison can not be reliably determined from the - order of nodes in file or other ways. Do not use relational comparison operators - except for search optimization (i.e. associative container keys). + order of nodes in file or in any other way. Do not use relational comparison + operators except for search optimization (i.e. associative container keys). +

    +

    + If you want to use xml_node + or xml_attribute objects + as keys in hash-based associative containers, you can use the hash_value member functions. They return + the hash values that are guaranteed to be the same for all handles to the + same underlying object. The hash value for null handles is 0.

    - Additionally handles they can be implicitly cast to boolean-like objects, - so that you can test if the node/attribute is empty by just doing if (node) { ... + Finally handles can be implicitly cast to boolean-like objects, so that you + can test if the node/attribute is empty with the following code: if (node) { ... } or if (!node) { ... } else { ... }. @@ -336,13 +365,14 @@ bool xml_node::empty() const;

    - Nodes and attributes do not exist outside of document tree, so you can't - create them without adding them to some document. Once underlying node/attribute + Nodes and attributes do not exist without a document tree, so you can't create + them without adding them to some document. Once underlying node/attribute objects are destroyed, the handles to those objects become invalid. While this means that destruction of the entire tree invalidates all node/attribute - handles, it also means that destroying a subtree (by calling remove_child) or removing an attribute - invalidates the corresponding handles. There is no way to check handle validity; - you have to ensure correctness through external mechanisms. + handles, it also means that destroying a subtree (by calling xml_node::remove_child) + or removing an attribute invalidates the corresponding handles. There is + no way to check handle validity; you have to ensure correctness through external + mechanisms.

    @@ -352,12 +382,14 @@

    There are two choices of interface and internal representation when configuring pugixml: you can either choose the UTF-8 (also called char) interface or - UTF-16/32 (also called wchar_t) one. The choice is controlled via PUGIXML_WCHAR_MODE define; you can set - it via pugiconfig.hpp or via preprocessor options, as discussed in Additional configuration - options. - If this define is set, the wchar_t interface is used; otherwise (by default) - the char interface is used. The exact wide character encoding is assumed - to be either UTF-16 or UTF-32 and is determined based on size of wchar_t type. + UTF-16/32 (also called wchar_t) one. The choice is controlled via PUGIXML_WCHAR_MODE + define; you can set it via pugiconfig.hpp or via preprocessor options, as + discussed in Additional configuration + options. If this define is set, the wchar_t + interface is used; otherwise (by default) the char interface is used. The + exact wide character encoding is assumed to be either UTF-16 or UTF-32 and + is determined based on the size of wchar_t + type.

    @@ -365,9 +397,9 @@
    Note

    - If size of wchar_t is 2, pugixml - assumes UTF-16 encoding instead of UCS-2, which means that some characters - are represented as two code points. + If the size of wchar_t is + 2, pugixml assumes UTF-16 encoding instead of UCS-2, which means that some + characters are represented as two code points.
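To make the UTF-16 versus UCS-2 distinction in this note concrete, here is a minimal sketch of how a single character above U+FFFF becomes two 16-bit values (usually called code units, forming a surrogate pair). `encode_utf16` is a hypothetical helper written for this illustration, not a pugixml function, and it assumes its input is a valid Unicode scalar value.

```cpp
#include <cstdint>
#include <vector>

// Encodes one Unicode code point as UTF-16 code units.
// Assumption: cp is a valid scalar value (at most 0x10FFFF, not a surrogate).
std::vector<std::uint16_t> encode_utf16(std::uint32_t cp)
{
    std::vector<std::uint16_t> units;

    if (cp < 0x10000)
    {
        // Basic Multilingual Plane: one unit, identical to UCS-2.
        units.push_back(static_cast<std::uint16_t>(cp));
    }
    else
    {
        // Supplementary planes: split the 20-bit offset into two 10-bit halves.
        std::uint32_t offset = cp - 0x10000;
        units.push_back(static_cast<std::uint16_t>(0xD800 + (offset >> 10)));
        units.push_back(static_cast<std::uint16_t>(0xDC00 + (offset & 0x3FF)));
    }

    return units;
}
```

For example, U+20AC fits in one unit, while U+1F600 is encoded as the pair 0xD83D, 0xDE00; code that counts `wchar_t` elements on a platform with 2-byte `wchar_t` therefore counts units, not characters.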

    @@ -399,7 +431,7 @@ pugi::char_t upon document saving happen automatically, which also carries minor performance penalty. The general advice however is to select the character mode based on usage scenario, i.e. if UTF-8 is - inconvenient to process and most of your XML data is localized, wchar_t mode + inconvenient to process and most of your XML data is non-ASCII, wchar_t mode is probably a better choice.

    @@ -410,13 +442,18 @@ std::wstring as_wide(const char* str);

    - Both functions accept null-terminated string as an argument str, and return the converted string. + Both functions accept a null-terminated string as an argument str, and return the converted string. as_utf8 performs conversion from UTF-16/32 to UTF-8; as_wide performs conversion from UTF-8 to UTF-16/32. Invalid UTF sequences are silently discarded upon conversion. str has to be a valid string; passing null pointer results in undefined behavior. + There are also two overloads with the same semantics which accept a string + as an argument:

    +
    std::string as_utf8(const std::wstring& str);
    +std::wstring as_wide(const std::string& str);
    +
    @@ -425,8 +462,8 @@
    [Note]

    Most examples in this documentation assume char interface and therefore - will not compile with PUGIXML_WCHAR_MODE. - This is to simplify the documentation; usually the only changes you'll + will not compile with PUGIXML_WCHAR_MODE. + This is done to simplify the documentation; usually the only changes you'll have to make is to pass wchar_t string literals, i.e. instead of

    @@ -453,7 +490,7 @@

    • - it is safe to call free functions from multiple threads + it is safe to call free (non-member) functions from multiple threads
    • it is safe to perform concurrent read-only accesses to the same tree @@ -470,7 +507,7 @@ structure and altering individual node/attribute data, i.e. changing names/values.

      - The only exception is set_memory_management_functions; + The only exception is set_memory_management_functions; it modifies global variables and as such is not thread-safe. Its usage policy has more restrictions, see Custom memory allocation/deallocation functions. @@ -488,15 +525,16 @@ This is not applicable to functions that operate on STL strings or IOstreams; such functions have either strong guarantee (functions that operate on strings) or basic guarantee (functions that operate on streams). Also functions that - call user-defined callbacks (i.e. xml_node::traverse - or xml_node::find_node) do not provide any exception - guarantees beyond the ones provided by callback. + call user-defined callbacks (i.e. xml_node::traverse + or xml_node::find_node) do not + provide any exception guarantees beyond the ones provided by the callback.

      - XPath functions may throw xpath_exception - on parsing error; also, XPath implementation uses STL, and thus may throw - i.e. std::bad_alloc in low memory conditions. Still, - XPath functions provide strong exception guarantee. + If exception handling is not disabled with PUGIXML_NO_EXCEPTIONS + define, XPath functions may throw xpath_exception + on parsing errors; also, XPath functions may throw std::bad_alloc + in low memory conditions. Still, XPath functions provide strong exception + guarantee.

    @@ -514,10 +552,10 @@ functions

    - All memory for tree structure/data is allocated via globally specified - functions, which default to malloc/free. You can set your own allocation - functions with set_memory_management functions. The function interfaces - are the same as that of malloc/free: + All memory for tree structure, tree data and XPath objects is allocated + via globally specified functions, which default to malloc/free. You can + set your own allocation functions with set_memory_management function. + The function interfaces are the same as that of malloc/free:

    typedef void* (*allocation_function)(size_t size);
     typedef void (*deallocation_function)(void* ptr);
    @@ -532,14 +570,18 @@
     

    Allocation function is called with the size (in bytes) as an argument and - should return a pointer to memory block with alignment that is suitable - for pointer storage and size that is greater or equal to the requested - one. If the allocation fails, the function has to return null pointer (throwing - an exception from allocation function results in undefined behavior). Deallocation - function is called with the pointer that was returned by the previous call - or with a null pointer; null pointer deallocation should be handled as - a no-op. If memory management functions are not thread-safe, library thread - safety is not guaranteed. + should return a pointer to a memory block with alignment that is suitable + for storage of primitive types (usually a maximum of void* and double + types alignment is sufficient) and size that is greater than or equal to + the requested one. If the allocation fails, the function has to return + null pointer (throwing an exception from allocation function results in + undefined behavior). +

    +

    + Deallocation function is called with the pointer that was returned by some + call to allocation function; it is never called with a null pointer. If + memory management functions are not thread-safe, library thread safety + is not guaranteed.
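The contract described above can be sketched as a pair of functions with the same shapes as the quoted typedefs. This is an illustration only, separate from the sample file the manual refers to: `counting_allocate`, `counting_deallocate` and `g_live_blocks` are hypothetical names, and the typedefs are repeated locally so the fragment compiles without the library headers.

```cpp
#include <cstddef>
#include <cstdlib>

// Local copies of the function pointer shapes quoted in the manual text.
typedef void* (*allocation_function)(std::size_t size);
typedef void (*deallocation_function)(void* ptr);

// Hypothetical bookkeeping: number of blocks currently handed out.
// Not thread-safe; a real replacement would need synchronization.
std::size_t g_live_blocks = 0;

void* counting_allocate(std::size_t size)
{
    // malloc returns memory suitably aligned for primitive types and
    // reports failure with a null pointer, so nothing is thrown here.
    void* ptr = std::malloc(size);
    if (ptr) ++g_live_blocks;
    return ptr;
}

void counting_deallocate(void* ptr)
{
    // The null check is purely defensive; it keeps the counter correct
    // even if a null pointer is ever passed in.
    if (!ptr) return;
    --g_live_blocks;
    std::free(ptr);
}

// Registration would go through the library function named in this section, e.g.
// pugi::set_memory_management_functions(counting_allocate, counting_deallocate);
// (left as a comment so this fragment has no dependency on pugixml).
```

Because the counter is a plain global, this pair is exactly the kind of non-thread-safe replacement the text warns about.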

    This is a simple example of custom memory management (samples/custom_memory_management.cpp): @@ -572,16 +614,6 @@ are destroyed, the new deallocation function will be called with the memory obtained by the old allocation function, resulting in undefined behavior.

    -
    - - - - - -
    [Note]Note

    - Currently memory for XPath objects is allocated using default operators - new/delete; this will change in the next version. -

    @@ -590,7 +622,7 @@

    Constructing a document object using the default constructor does not result - in any allocations; document node is stored inside the xml_document + in any allocations; document node is stored inside the xml_document object.

    @@ -598,11 +630,11 @@ function is used (see Loading document from memory), a complete copy of character stream is made; all names/values of nodes and attributes are allocated in this buffer. This buffer is allocated via a single large allocation - and is only freed when document memory is reclaimed (i.e. if the xml_document object is destroyed or if - another document is loaded in the same object). Also when loading from - file or stream, an additional large allocation may be performed if encoding - conversion is required; a temporary buffer is allocated, and it is freed - before load function returns. + and is only freed when document memory is reclaimed (i.e. if the xml_document object is destroyed or if another + document is loaded in the same object). Also when loading from file or + stream, an additional large allocation may be performed if encoding conversion + is required; a temporary buffer is allocated, and it is freed before load + function returns.

    All additional memory, such as memory for document structure (node/attribute @@ -632,7 +664,8 @@


    -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: diff --git a/docs/manual/install.html b/docs/manual/install.html index d6d0327..9809a39 100644 --- a/docs/manual/install.html +++ b/docs/manual/install.html @@ -4,14 +4,15 @@ Installation - - - + + + -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: @@ -64,14 +65,16 @@ You can download the latest source distribution via one of the following links:

    -
    http://pugixml.googlecode.com/files/pugixml-0.9.zip
    -http://pugixml.googlecode.com/files/pugixml-0.9.tar.gz
    +
    http://pugixml.googlecode.com/files/pugixml-1.0.zip
    +http://pugixml.googlecode.com/files/pugixml-1.0.tar.gz
     

    The distribution contains library source, documentation (the manual you're reading now and the quick start guide) and some code examples. After downloading the distribution, install pugixml by extracting all files from the compressed - archive. + archive. The files have different line endings depending on the archive + format - .zip archive has Windows line endings, .tar.gz archive has Unix + line endings. Otherwise the files in both archives are identical.

    If you need an older version, you can download it from the version @@ -91,7 +94,7 @@

    For example, to checkout the current version, you can use this command:

    -
    svn checkout http://pugixml.googlecode.com/svn/tags/release-0.9 pugixml
    +
    svn checkout http://pugixml.googlecode.com/svn/tags/release-1.0 pugixml

    To checkout the latest version, you can use this command:

    @@ -120,9 +123,9 @@ have to build them yourself.

    - The complete pugixml source consists of four files - two source files, pugixml.cpp and - pugixpath.cpp, and two header files, pugixml.hpp and pugiconfig.hpp. pugixml.hpp is - the primary header which you need to include in order to use pugixml classes/functions; + The complete pugixml source consists of three files - one source file, pugixml.cpp, + and two header files, pugixml.hpp and pugiconfig.hpp. pugixml.hpp is the primary + header which you need to include in order to use pugixml classes/functions; pugiconfig.hpp is a supplementary configuration file (see Additional configuration options). The rest of this guide assumes that pugixml.hpp is either in the current directory @@ -131,40 +134,31 @@ or include directory-relative path (i.e. #include <xml/thirdparty/pugixml/src/pugixml.hpp>).

    -
    - - - - - -
    [Note]Note

    - You don't need to compile pugixpath.cpp unless you use XPath. -

    - The easiest way to build pugixml is to compile two source files, pugixml.cpp and - pugixpath.cpp, along with the existing library/executable. This process - depends on the method of building your application; for example, if you're - using Microsoft Visual Studio[1], Apple Xcode, Code::Blocks or any other IDE, - just add pugixml.cpp and pugixpath.cpp to one of your projects. + The easiest way to build pugixml is to compile the source file, pugixml.cpp, + along with the existing library/executable. This process depends on the + method of building your application; for example, if you're using Microsoft + Visual Studio[1], Apple Xcode, Code::Blocks or any other IDE, just add pugixml.cpp to + one of your projects.

    If you're using Microsoft Visual Studio and the project has precompiled headers turned on, you'll see the following error messages:

    -
    pugixpath.cpp(3477) : fatal error C1010: unexpected end of file while looking for precompiled header. Did you forget to add '#include "stdafx.h"' to your source?
    +
    pugixml.cpp(3477) : fatal error C1010: unexpected end of file while looking for precompiled header. Did you forget to add '#include "stdafx.h"' to your source?

    - The correct way to resolve this is to disable precompiled headers for pugixml.cpp and - pugixpath.cpp; you have to set "Create/Use Precompiled Header" - option (Properties dialog -> C/C++ -> Precompiled Headers -> Create/Use - Precompiled Header) to "Not Using Precompiled Headers". You'll - have to do it for both pugixml.cpp and pugixpath.cpp, for all project configurations/platforms - (you can select Configuration "All Configurations" and Platform - "All Platforms" before editing the option): + The correct way to resolve this is to disable precompiled headers for pugixml.cpp; + you have to set "Create/Use Precompiled Header" option (Properties + dialog -> C/C++ -> Precompiled Headers -> Create/Use Precompiled + Header) to "Not Using Precompiled Headers". You'll have to do + it for all project configurations/platforms (you can select Configuration + "All Configurations" and Platform "All Platforms" before + editing the option):

    @@ -251,7 +245,7 @@ process does not differ from building any other library as DLL (adding -shared to compilation flags should suffice); if you're using MSVC-based toolchain, you'll have to explicitly mark exported symbols with a declspec - attribute. You can do it by defining PUGIXML_API + attribute. You can do it by defining PUGIXML_API macro, i.e. via pugiconfig.hpp:

    #ifdef _DLL
    @@ -260,6 +254,20 @@
     #define PUGIXML_API __declspec(dllimport)
     #endif
     
    +
    + + + + + +
    [Caution]Caution

    + If you're using STL-related functions, you should use the shared runtime + library to ensure that a single heap is used for STL allocations in your + application and in pugixml; in MSVC, this means selecting the 'Multithreaded + DLL' or 'Multithreaded Debug DLL' to 'Runtime library' property (/MD + or /MDd linker switch). You should also make sure that your runtime library + choice is consistent between different projects. +

    @@ -270,13 +278,13 @@ pugixml uses several defines to control the compilation process. There are two ways to define them: either put the needed definitions to pugiconfig.hpp (it has some examples that are commented out) or provide them via compiler - command-line. Define consistency is important, i.e. the definitions should - match in all source files that include pugixml.hpp (including pugixml sources) - throughout the application. Adding defines to pugiconfig.hpp lets you guarantee - this, unless your macro definition is wrapped in preprocessor #if/#ifdef - directive and this directive is not consistent. pugiconfig.hpp will never - contain anything but comments, which means that when upgrading to new version, - you can safely leave your modified version intact. + command-line. Consistency is important: the definitions should match in + all source files that include pugixml.hpp (including pugixml sources) throughout + the application. Adding defines to pugiconfig.hpp lets you guarantee this, + unless your macro definition is wrapped in preprocessor #if/#ifdef directive and this directive + is not consistent. pugiconfig.hpp will never contain anything but comments, + which means that when upgrading to a new version, you can safely leave + your modified version intact.

    PUGIXML_WCHAR_MODE define toggles @@ -289,10 +297,9 @@

    PUGIXML_NO_XPATH define disables XPath. - Both XPath interfaces and XPath implementation are excluded from compilation; - you can still compile the file pugixpath.cpp (it will result in an empty - translation unit). This option is provided in case you do not need XPath - functionality and need to save code space. + Both XPath interfaces and XPath implementation are excluded from compilation. + This option is provided in case you do not need XPath functionality and + need to save code space.

    PUGIXML_NO_STL define disables use of @@ -301,33 +308,11 @@ provided in case your target platform does not have a standard-compliant STL implementation.

    -
    - - - - - -
    [Note]Note

    - As of version 0.9, STL is used in XPath implementation; therefore, XPath - is also disabled if this macro is defined. This will change in version - 1.0. -

    PUGIXML_NO_EXCEPTIONS define disables use of exceptions in pugixml. This option is provided in case your target - platform does not have exception handling capabilities + platform does not have exception handling capabilities.

    -
    - - - - - -
    [Note]Note

    - As of version 0.9, exceptions are only - used in XPath implementation; therefore, XPath is also disabled if this - macro is defined. This will change in version 1.0. -

    PUGIXML_API, PUGIXML_CLASS and PUGIXML_FUNCTION defines let you @@ -414,8 +399,8 @@



    -

    [1] All trademarks used are properties - of their respective owners.

    +

    [1] All trademarks used are properties of their respective + owners.

    @@ -427,7 +412,8 @@

    -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: diff --git a/docs/manual/loading.html b/docs/manual/loading.html index a3c1515..5b5576b 100644 --- a/docs/manual/loading.html +++ b/docs/manual/loading.html @@ -4,14 +4,15 @@ Loading document - - + + -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: @@ -44,11 +45,11 @@ non-validating parser. This parser is not fully W3C conformant - it can load any valid XML document, but does not perform some well-formedness checks. While considerable effort is made to reject invalid XML documents, some validation - is not performed because of performance reasons. Also some XML transformations - (i.e. EOL handling or attribute value normalization) can impact parsing speed - and thus can be disabled. However for vast majority of XML documents there - is no performance difference between different parsing options. Parsing options - also control whether certain XML nodes are parsed; see Parsing options for + is not performed for performance reasons. Also some XML transformations (i.e. + EOL handling or attribute value normalization) can impact parsing speed and + thus can be disabled. However for vast majority of XML documents there is no + performance difference between different parsing options. Parsing options also + control whether certain XML nodes are parsed; see Parsing options for more information.

    @@ -65,43 +66,36 @@

    -

    - The most common source of XML data is files; pugixml provides a separate - function for loading XML document from file: +

    + The most common source of XML data is files; pugixml provides dedicated functions + for loading an XML document from file:

    xml_parse_result xml_document::load_file(const char* path, unsigned int options = parse_default, xml_encoding encoding = encoding_auto);
    +xml_parse_result xml_document::load_file(const wchar_t* path, unsigned int options = parse_default, xml_encoding encoding = encoding_auto);
     

    - This function accepts file path as its first argument, and also two optional - arguments, which specify parsing options (see Parsing options) and - input data encoding (see Encodings). The path has the target + These functions accept the file path as its first argument, and also two + optional arguments, which specify parsing options (see Parsing options) + and input data encoding (see Encodings). The path has the target operating system format, so it can be a relative or absolute one, it should - have the delimiters of target system, it should have the exact case if target - file system is case-sensitive, etc. File path is passed to system file opening - function as is. + have the delimiters of the target system, it should have the exact case if + the target file system is case-sensitive, etc. +

    +

    + File path is passed to the system file opening function as is in case of + the first function (which accepts const + char* path); the second function either uses + a special file opening function if it is provided by the runtime library + or converts the path to UTF-8 and uses the system file opening function.

    load_file destroys the existing document tree and then tries to load the new tree from the specified file. - The result of the operation is returned in an xml_parse_result - object; this object contains the operation status, and the related information + The result of the operation is returned in an xml_parse_result + object; this object contains the operation status and the related information (i.e. last successfully parsed position in the input file, if parsing fails). See Handling parsing errors for error handling details.

    -
    - - - - - -
    [Note]Note

    - As of version 0.9, there is no function for loading XML document from wide - character path. Unfortunately, there is no portable way to do this; the - version 1.0 will provide such function only for platforms with the corresponding - functionality. You can use stream-loading functions as a workaround if - your STL implementation can open file streams via wchar_t - paths. -

    This is an example of loading XML document from file (samples/load_file.cpp):

    @@ -122,7 +116,7 @@ Loading document from memory

    - Sometimes XML data should be loaded from some other source than file, i.e. + Sometimes XML data should be loaded from some other source than a file, i.e. HTTP URL; also you may want to load XML data from file using non-standard functions, i.e. to use your virtual file system facilities or to load XML from gzip-compressed files. All these scenarios require loading document @@ -177,12 +171,12 @@

    It is equivalent to calling load_buffer - with size = - strlen(contents). - This function assumes native encoding for input data, so it does not do any - encoding conversion. In general, this function is fine for loading small - documents from string literals, but has more overhead and less functionality - than buffer loading functions. + with size being either strlen(contents) + or wcslen(contents) * sizeof(wchar_t), + depending on the character type. This function assumes native encoding for + input data, so it does not do any encoding conversion. In general, this function + is fine for loading small documents from string literals, but has more overhead + and less functionality than the buffer loading functions.
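The size computation mentioned above differs between the two character modes because buffer sizes are expressed in bytes. A minimal sketch, using a hypothetical overloaded helper `string_size_in_bytes` that is not part of pugixml:

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Size in bytes of a null-terminated char string, terminator excluded.
std::size_t string_size_in_bytes(const char* contents)
{
    return std::strlen(contents);
}

// Size in bytes of a null-terminated wide string, terminator excluded:
// the element count has to be scaled by the size of wchar_t.
std::size_t string_size_in_bytes(const wchar_t* contents)
{
    return std::wcslen(contents) * sizeof(wchar_t);
}
```

Passing the unscaled `wcslen` result as a byte size would cut a wide string short, which is why the two cases are spelled out separately.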

    This is an example of loading XML document from memory using different functions @@ -246,7 +240,7 @@ Loading document from C++ IOstreams

    - For additional interoperability pugixml provides functions for loading document + To enhance interoperability, pugixml provides functions for loading document from any object which implements C++ std::istream interface. This allows you to load documents from any standard C++ stream (i.e. file stream) or any third-party compliant implementation (i.e. Boost @@ -267,10 +261,10 @@

    load with std::wstream argument treats the stream contents as a wide character stream (encoding - is always encoding_wchar). - Because of this, using load - with wide character streams requires careful (usually platform-specific) - stream setup (i.e. using the imbue + is always encoding_wchar). Because + of this, using load with + wide character streams requires careful (usually platform-specific) stream + setup (i.e. using the imbue function). Generally use of wide streams is discouraged, however it provides you the ability to load documents from non-Unicode encodings, i.e. you can load Shift-JIS encoded data if you set the correct locale. @@ -330,7 +324,7 @@

  • status_io_error is returned by load_file function and by load functions with std::istream/std::wstream arguments; it means that some - I/O error has occured during reading the file/stream. + I/O error has occurred during reading the file/stream.
  • status_out_of_memory means that @@ -407,11 +401,11 @@ member, which contains the offset of last successfully parsed character if parsing failed because of an error in source data; otherwise offset is 0. For parsing efficiency reasons, pugixml does not track the current line during parsing; this offset is in - units of pugi::char_t (bytes for character mode, wide - characters for wide character mode). Many text editors support 'Go To Position' - feature - you can use it to locate the exact error position. Alternatively, - if you're loading the document from memory, you can display the error chunk - along with the error description (see the example code below). + units of pugi::char_t (bytes for character + mode, wide characters for wide character mode). Many text editors support + 'Go To Position' feature - you can use it to locate the exact error position. + Alternatively, if you're loading the document from memory, you can display + the error chunk along with the error description (see the example code below).

    @@ -490,9 +484,15 @@
  • parse_declaration determines if XML document declaration (node with type node_declaration) - are to be put in DOM tree. If this flag is off, it is not put in the - tree, but is still parsed and checked for correctness. This flag is - off by default.

    + is to be put in DOM tree. If this flag is off, it is not put in the tree, + but is still parsed and checked for correctness. This flag is off by default.

    + +
  • +
  • + parse_doctype determines if XML document + type declaration (node with type node_doctype) + is to be put in DOM tree. If this flag is off, it is not put in the tree, + but is still parsed and checked for correctness. This flag is off by default.

  • @@ -525,13 +525,13 @@ the cost of allocating and storing such nodes (both memory and speed-wise) can be significant. For example, after parsing XML string <node> <a/> </node>, <node> element will have three children when parse_ws_pcdata - is set (child with type node_pcdata + is set (child with type node_pcdata and value " ", - child with type node_element - and name "a", and - another child with type node_pcdata - and value " "), - and only one child when parse_ws_pcdata + child with type node_element and + name "a", and another + child with type node_pcdata and value + " "), and only + one child when parse_ws_pcdata is not set. This flag is off by default.
  • @@ -551,7 +551,7 @@ that as pugixml does not handle DTD, the only allowed entities are predefined ones). If character/entity reference can not be expanded, it is left as is, so you can do additional processing later. Reference expansion - is performed in attribute values and PCDATA content. This flag is on by default.

    + is performed on attribute values and PCDATA content. This flag is on by default.

  • @@ -569,9 +569,9 @@ if attribute value normalization should be performed for all attributes. This means, that whitespace characters (new line, tab and space) are replaced with space (' '). - New line characters are always treated as if parse_eol + New line characters are always treated as if parse_eol is set, i.e. \r\n - is converted to single space. This flag is on + is converted to a single space. This flag is on by default.

  • @@ -579,10 +579,10 @@ parse_wnorm_attribute determines if extended attribute value normalization should be performed for all attributes. This means, that after attribute values are normalized as - if parse_wconv_attribute + if parse_wconv_attribute was set, leading and trailing space characters are removed, and all sequences of space characters are replaced by a single space character. The value - of parse_wconv_attribute + of parse_wconv_attribute has no effect if this flag is on. This flag is off by default. @@ -595,24 +595,25 @@

    parse_wconv_attribute option performs transformations that are required by W3C specification for attributes - that are declared as CDATA; parse_wnorm_attribute + that are declared as CDATA; parse_wnorm_attribute performs transformations required for NMTOKENS attributes. - In the absence of document type declaration all attributes behave as if - they are declared as CDATA, thus parse_wconv_attribute + In the absence of document type declaration all attributes should behave + as if they are declared as CDATA, thus parse_wconv_attribute is the default option.
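    The two normalization modes described above can be modelled in a few lines. This is only a sketch of the documented behavior, not pugixml's parser code; the names wconv_model and wnorm_model are hypothetical, and treating a lone \r as a new line is an assumption based on the EOL handling mentioned above.

```cpp
#include <cstddef>
#include <string>

// Illustrative model of parse_wconv_attribute: every whitespace character
// becomes a space, and "\r\n" counts as one new line (one space).
std::string wconv_model(const std::string& value) {
    std::string out;
    for (std::size_t i = 0; i < value.size(); ++i) {
        char c = value[i];
        if (c == '\r') {
            // Assumption: a lone '\r' is treated like a new line as well.
            if (i + 1 < value.size() && value[i + 1] == '\n') ++i;
            out += ' ';
        } else if (c == '\n' || c == '\t') {
            out += ' ';
        } else {
            out += c;
        }
    }
    return out;
}

// Illustrative model of parse_wnorm_attribute: apply the conversion above,
// then drop leading/trailing spaces and collapse runs of spaces.
std::string wnorm_model(const std::string& value) {
    std::string converted = wconv_model(value);
    std::string out;
    for (std::size_t i = 0; i < converted.size(); ++i) {
        char c = converted[i];
        if (c != ' ') {
            out += c;
        } else if (!out.empty() && out[out.size() - 1] != ' ') {
            out += ' ';  // keep only the first space of a run
        }
    }
    // A run at the end leaves one trailing space behind; remove it.
    if (!out.empty() && out[out.size() - 1] == ' ') out.erase(out.size() - 1);
    return out;
}
```

    With these models, the value "  a \t\n b  " stays nine spaces-and-letters long under the first mode and shrinks to "a b" under the second.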

    - Additionally there are two predefined option masks: + Additionally there are three predefined option masks:

    • parse_minimal has all options turned off. This option mask means that pugixml does not add declaration nodes, - PI nodes, CDATA sections and comments to the resulting tree and does - not perform any conversion for input data, so theoretically it is the - fastest mode. However, as discussed above, in practice parse_default is usually equally fast. -

      + document type declaration nodes, PI nodes, CDATA sections and comments + to the resulting tree and does not perform any conversion for input data, + so theoretically it is the fastest mode. However, as mentioned above, + in practice parse_default is usually + equally fast.

    • @@ -622,7 +623,18 @@ entity reference expansion, replacing whitespace characters with spaces in attribute values and performing EOL handling. Note, that PCDATA sections consisting only of whitespace characters are not parsed (by default) - for performance reasons. + for performance reasons.

      + +
    • +
    • + parse_full is the set of flags which adds + nodes of all types to the resulting tree and performs default conversions + for input data. It includes parsing CDATA sections, comments, PI nodes, + document declaration node and document type declaration node, performing + character and entity reference expansion, replacing whitespace characters + with spaces in attribute values and performing EOL handling. Note, that + PCDATA sections consisting only of whitespace characters are not parsed + in this mode.

    @@ -705,36 +717,36 @@

  • encoding_utf8 corresponds to UTF-8 encoding - as defined in Unicode standard; UTF-8 sequences with length equal to - 5 or 6 are not standard and are rejected. + as defined in the Unicode standard; UTF-8 sequences with length equal + to 5 or 6 are not standard and are rejected.
  • encoding_utf16_le corresponds to - little-endian UTF-16 encoding as defined in Unicode standard; surrogate + little-endian UTF-16 encoding as defined in the Unicode standard; surrogate pairs are supported.
  • encoding_utf16_be corresponds to - big-endian UTF-16 encoding as defined in Unicode standard; surrogate + big-endian UTF-16 encoding as defined in the Unicode standard; surrogate pairs are supported.
  • encoding_utf16 corresponds to UTF-16 - encoding as defined in Unicode standard; the endianness is assumed to - be that of target platform. + encoding as defined in the Unicode standard; the endianness is assumed + to be that of the target platform.
  • encoding_utf32_le corresponds to - little-endian UTF-32 encoding as defined in Unicode standard. + little-endian UTF-32 encoding as defined in the Unicode standard.
  • encoding_utf32_be corresponds to - big-endian UTF-32 encoding as defined in Unicode standard. + big-endian UTF-32 encoding as defined in the Unicode standard.
  • encoding_utf32 corresponds to UTF-32 - encoding as defined in Unicode standard; the endianness is assumed to - be that of target platform. + encoding as defined in the Unicode standard; the endianness is assumed + to be that of the target platform.
  • encoding_wchar corresponds to the encoding @@ -823,7 +835,8 @@

  • -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: diff --git a/docs/manual/modify.html b/docs/manual/modify.html index f00e657..3db02e1 100644 --- a/docs/manual/modify.html +++ b/docs/manual/modify.html @@ -4,14 +4,15 @@ Modifying document data - - + + -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: @@ -62,12 +63,13 @@

    As discussed before, nodes can have name and value, both of which are strings. - Depending on node type, name or value may be absent. node_document - nodes do not have name or value, node_element - and node_declaration nodes - always have a name but never have a value, node_pcdata, - node_cdata and node_comment nodes never have a name but - always have a value (it may be empty though), node_pi + Depending on node type, name or value may be absent. node_document + nodes do not have a name or value, node_element + and node_declaration nodes always + have a name but never have a value, node_pcdata, + node_cdata, node_comment + and node_doctype nodes never have a name + but always have a value (it may be empty though), node_pi nodes always have a name and a value (again, value may be empty). In order to set node's name or value, you can use the following functions:

    @@ -78,16 +80,15 @@ Both functions try to set the name/value to the specified string, and return the operation result. The operation fails if the node can not have name or value (for instance, when trying to call set_name - on a node_pcdata node), if - the node handle is null, or if there is insufficient memory to handle the - request. The provided string is copied into document managed memory and can - be destroyed after the function returns (for example, you can safely pass - stack-allocated buffers to these functions). The name/value content is not - verified, so take care to use only valid XML names, or the document may become - malformed. + on a node_pcdata node), if the node handle + is null, or if there is insufficient memory to handle the request. The provided + string is copied into document managed memory and can be destroyed after + the function returns (for example, you can safely pass stack-allocated buffers + to these functions). The name/value content is not verified, so take care + to use only valid XML names, or the document may become malformed.

    - There is no equivalent of child_value + There is no equivalent of child_value function for modifying text children of the node.

    @@ -185,7 +186,7 @@ These operators simply call the right set_value function and return the attribute they're called on; the return value of set_value is ignored, so - errors are not detected. + errors are ignored.

    This is an example of setting attribute name and value (samples/modify_base.cpp): @@ -214,36 +215,48 @@

    -

    - Nodes and attributes do not exist outside of document tree, so you can't - create them without adding them to some document. A node or attribute can - be created at the end of node/attribute list or before/after some other node: +

    + Nodes and attributes do not exist without a document tree, so you can't create + them without adding them to some document. A node or attribute can be created + at the end of node/attribute list or before/after some other node:

    xml_attribute xml_node::append_attribute(const char_t* name);
    +xml_attribute xml_node::prepend_attribute(const char_t* name);
     xml_attribute xml_node::insert_attribute_after(const char_t* name, const xml_attribute& attr);
     xml_attribute xml_node::insert_attribute_before(const char_t* name, const xml_attribute& attr);
     
     xml_node xml_node::append_child(xml_node_type type = node_element);
    +xml_node xml_node::prepend_child(xml_node_type type = node_element);
     xml_node xml_node::insert_child_after(xml_node_type type, const xml_node& node);
     xml_node xml_node::insert_child_before(xml_node_type type, const xml_node& node);
    +
    +xml_node xml_node::append_child(const char_t* name);
    +xml_node xml_node::prepend_child(const char_t* name);
    +xml_node xml_node::insert_child_after(const char_t* name, const xml_node& node);
    +xml_node xml_node::insert_child_before(const char_t* name, const xml_node& node);
     

    append_attribute and append_child create a new node/attribute at the end of the corresponding list of the node the method is called on; - insert_attribute_after, + prepend_attribute and prepend_child create a new node/attribute + at the beginning of the list; insert_attribute_after, insert_attribute_before, insert_child_after and insert_child_before add the node/attribute - before or after specified node/attribute. + before or after the specified node/attribute.

    Attribute functions create an attribute with the specified name; you can specify the empty name and change the name later if you want to. Node functions - create the node with the specified type; since node type can't be changed, - you have to know the desired type beforehand. Also note that not all types - can be added as children; see below for clarification. + with the type argument create + the node with the specified type; since node type can't be changed, you have + to know the desired type beforehand. Also note that not all types can be + added as children; see below for clarification. Node functions with the + name argument create the + element node (node_element) with the + specified name.

    - All functions return the handle to newly created object on success, and null + All functions return the handle to the created object on success, and null handle on failure. There are several reasons for failure:

      @@ -251,32 +264,30 @@ Adding fails if the target node is null;
    • - Only node_element nodes - can contain attributes, so attribute adding fails if node is not an element; + Only node_element nodes can contain + attributes, so attribute adding fails if node is not an element;
    • - Only node_document and - node_element nodes can - contain children, so child node adding fails if target node is not an - element or a document; + Only node_document and node_element + nodes can contain children, so child node adding fails if the target + node is not an element or a document;
    • - node_document and node_null nodes can not be inserted - as children, so passing node_document - or node_null value as - type results in operation failure; + node_document and node_null + nodes can not be inserted as children, so passing node_document + or node_null value as type results in operation failure;
    • - node_declaration nodes - can only be added as children of the document node; attempt to insert - declaration node as a child of an element node fails; + node_declaration nodes can only + be added as children of the document node; attempt to insert declaration + node as a child of an element node fails;
    • Adding node/attribute results in memory allocation, which may fail;
    • - Insertion functions fail if the specified node or attribute is not in - the target node's children/attribute list. + Insertion functions fail if the specified node or attribute is null or + is not in the target node's children/attribute list.

    @@ -302,17 +313,14 @@

    // add node with some name
    -pugi::xml_node node = doc.append_child();
    -node.set_name("node");
    +pugi::xml_node node = doc.append_child("node");
     
     // add description node with text child
    -pugi::xml_node descr = node.append_child();
    -descr.set_name("description");
    +pugi::xml_node descr = node.append_child("description");
     descr.append_child(pugi::node_pcdata).set_value("Simple node");
     
     // add param node before the description
    -pugi::xml_node param = node.insert_child_before(pugi::node_element, descr);
    -param.set_name("param");
    +pugi::xml_node param = node.insert_child_before("param", descr);
     
     // add attributes to param node
     param.append_attribute("name") = "version";
    @@ -400,29 +408,32 @@
     
    -

    +

    With the help of previously described functions, it is possible to create trees with any contents and structure, including cloning the existing data. However since this is an often needed operation, pugixml provides built-in node/attribute cloning facilities. Since nodes and attributes do not exist - outside of document tree, you can't create a standalone copy - you have to + without a document tree, you can't create a standalone copy - you have to immediately insert it somewhere in the tree. For this, you can use one of the following functions:

    xml_attribute xml_node::append_copy(const xml_attribute& proto);
    +xml_attribute xml_node::prepend_copy(const xml_attribute& proto);
     xml_attribute xml_node::insert_copy_after(const xml_attribute& proto, const xml_attribute& attr);
     xml_attribute xml_node::insert_copy_before(const xml_attribute& proto, const xml_attribute& attr);
    +
     xml_node xml_node::append_copy(const xml_node& proto);
    +xml_node xml_node::prepend_copy(const xml_node& proto);
     xml_node xml_node::insert_copy_after(const xml_node& proto, const xml_node& node);
     xml_node xml_node::insert_copy_before(const xml_node& proto, const xml_node& node);
     

    These functions mirror the structure of append_child, - insert_child_before and related - functions - they take the handle to the prototype object, which is to be - cloned, insert a new attribute/node at the appropriate place, and then copy - the attribute data or the whole node subtree to the new object. The functions - return the handle to the resulting duplicate object, or null handle on failure. + prepend_child, insert_child_before and related functions + - they take the handle to the prototype object, which is to be cloned, insert + a new attribute/node at the appropriate place, and then copy the attribute + data or the whole node subtree to the new object. The functions return the + handle to the resulting duplicate object, or null handle on failure.

    The attribute is copied along with the name and value; the node is copied @@ -445,7 +456,7 @@

  • Node cloning starts with insertion of the node of the same type as that of the prototype; for this reason, cloning functions can not be directly - used to clone entire documents, since node_document + used to clone entire documents, since node_document is not a valid insertion type. The example below provides a workaround.
  • @@ -524,7 +535,8 @@

  • -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: diff --git a/docs/manual/saving.html b/docs/manual/saving.html index e12b31d..2cbf06e 100644 --- a/docs/manual/saving.html +++ b/docs/manual/saving.html @@ -4,14 +4,15 @@ Saving document - - + + -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: @@ -49,42 +50,45 @@ the relevant functionality.

    - The node/attribute data is written to the destination properly formatted according - to the node type; all special XML symbols, such as < and &, are properly - escaped. In order to guard against forgotten node/attribute names, empty node/attribute - names are printed as ":anonymous". - For proper output, make sure all node and attribute names are set to meaningful + Before writing to the destination the node/attribute data is properly formatted + according to the node type; all special XML symbols, such as < and &, + are properly escaped. In order to guard against forgotten node/attribute names, + empty node/attribute names are printed as ":anonymous". + For well-formed output, make sure all node and attribute names are set to meaningful values.

    -
    - - - - - -
    [Caution]Caution

    - Currently the content of CDATA sections is not escaped, so CDATA sections - with values that contain "]]>" - will result in malformed document. This will be fixed in version 1.0. -

    +

    + CDATA sections with values that contain "]]>" + are split into several sections as follows: section with value "pre]]>post" is written as <![CDATA[pre]]]]><![CDATA[>post]]>. + While this alters the structure of the document (if you load the document after + saving it, there will be two CDATA sections instead of one), this is the only + way to escape CDATA contents. +
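    The splitting rule above can be sketched as a small standalone function. It is a model of the described output only, under the assumption that each "]]>" is split between "]]" and ">"; write_cdata_model is a hypothetical name and this is not the library's writer code.

```cpp
#include <cstddef>
#include <string>

// Returns the serialized form of a CDATA value, starting a new section
// whenever the value contains the terminator "]]>".
std::string write_cdata_model(const std::string& value) {
    const std::string open = "<![CDATA[";
    const std::string close = "]]>";
    std::string out = open;
    std::size_t start = 0;
    for (;;) {
        std::size_t pos = value.find(close, start);
        if (pos == std::string::npos) break;
        // Keep "]]" in the current section; ">" opens the next one.
        out.append(value, start, pos + 2 - start);
        out += close;
        out += open;
        start = pos + 2;
    }
    out.append(value, start, std::string::npos);
    out += close;
    return out;
}
```

    A value without "]]>" yields exactly one section, so the document structure only changes for values that could not be written otherwise.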

    -

    - If you want to save the whole document to a file, you can use the following - function: +

    + If you want to save the whole document to a file, you can use one of the + following functions:

    bool xml_document::save_file(const char* path, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto) const;
    +bool xml_document::save_file(const wchar_t* path, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto) const;
     

    - This function accepts file path as its first argument, and also three optional + These functions accept a file path as their first argument, and also three optional arguments, which specify indentation and other output options (see Output options) and output data encoding (see Encodings). The path has the target operating system format, so it can be a relative or absolute one, it should - have the delimiters of target system, it should have the exact case if target - file system is case-sensitive, etc. File path is passed to system file opening - function as is. + have the delimiters of the target system, it should have the exact case if + the target file system is case-sensitive, etc. +

    +

    + File path is passed to the system file opening function as is in case of + the first function (which accepts const + char* path); the second function either uses + a special file opening function if it is provided by the runtime library + or converts the path to UTF-8 and uses the system file opening function.

    save_file opens the target @@ -96,19 +100,6 @@ handle as the only constructor argument and then calling save; see Saving document via writer interface for writer interface details.

    -
    - - - - - -
    [Note]Note

    - As of version 0.9, there is no function for saving XML document to wide - character paths. Unfortunately, there is no portable way to do this; the - version 1.0 will provide such function only for platforms with the corresponding - functionality. You can use stream-saving functions as a workaround if your - STL implementation can open file streams via wchar_t paths. -

    This is a simple example of saving XML document to file (samples/save_file.cpp):

    @@ -126,11 +117,11 @@ Saving document to C++ IOstreams

    - For additional interoperability pugixml provides functions for saving document - to any object which implements C++ std::ostream interface. This allows you - to save documents to any standard C++ stream (i.e. file stream) or any third-party - compliant implementation (i.e. Boost Iostreams). Most notably, this allows - for easy debug output, since you can use std::cout + To enhance interoperability pugixml provides functions for saving document + to any object which implements C++ std::ostream + interface. This allows you to save documents to any standard C++ stream (e.g. + file stream) or any third-party compliant implementation (e.g. Boost Iostreams). + Most notably, this allows for easy debug output, since you can use std::cout stream as saving target. There are two functions, one works with narrow character streams, another handles wide character ones:

    @@ -142,7 +133,7 @@ argument saves the document to the stream in the same way as save_file (i.e. with requested header and with encoding conversions). On the other hand, save with std::wstream argument saves the document to - the wide stream with encoding_wchar + the wide stream with encoding_wchar encoding. Because of this, using save with wide character streams requires careful (usually platform-specific) stream setup (i.e. using the imbue @@ -201,7 +192,7 @@

    write function is called with relatively large blocks (size is usually several kilobytes, except for - the first block with BOM, which is output only if format_write_bom + the first block with BOM, which is output only if format_write_bom is set, and last block, which may be small), so there is often no need for additional buffering in the implementation.

    @@ -231,9 +222,8 @@ Saving a single subtree

    - While the previously described functions saved the whole document to the - destination, it is easy to save a single subtree. The following functions - are provided: + While the previously described functions save the whole document to the destination, + it is easy to save a single subtree. The following functions are provided:

    void xml_node::print(std::ostream& os, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto, unsigned int depth = 0) const;
     void xml_node::print(std::wostream& os, const char_t* indent = "\t", unsigned int flags = format_default, unsigned int depth = 0) const;
    @@ -246,10 +236,10 @@
           

    Saving a subtree differs from saving the whole document: the process behaves - as if format_write_bom is - off, and format_no_declaration - is on, even if actual values of the flags are different. This means that - BOM is not written to the destination, and document declaration is only written + as if format_write_bom is off, and + format_no_declaration is on, + even if actual values of the flags are different. This means that BOM is + not written to the destination, and document declaration is only written if it is the node itself or is one of node's children. Note that this also holds if you're saving a document; this example (samples/save_subtree.cpp) illustrates the difference: @@ -308,8 +298,8 @@ by default). If this flag is on, before every node the indentation string is output several times, where the amount of indentation depends on the node's depth relative to the output subtree. This flag has no effect - if format_raw is enabled. - This flag is on by default.

    + if format_raw is enabled. This flag + is on by default.

  • @@ -318,9 +308,9 @@ and also no newlines that are not part of document text are printed. Raw mode can be used for serialization where the result is not intended to be read by humans; also it can be useful if the document was parsed - with parse_ws_pcdata - flag, to preserve the original document formatting as much as possible. - This flag is off by default. + with parse_ws_pcdata flag, to + preserve the original document formatting as much as possible. This flag + is off by default.
  • @@ -429,7 +419,7 @@

    Also note that wide stream saving functions do not have encoding - argument and always assume encoding_wchar + argument and always assume encoding_wchar encoding.

    @@ -456,7 +446,8 @@

    -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: diff --git a/docs/manual/toc.html b/docs/manual/toc.html index e078307..97d0b6c 100644 --- a/docs/manual/toc.html +++ b/docs/manual/toc.html @@ -4,13 +4,14 @@ Table of Contents - - + + -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: @@ -134,7 +135,8 @@

    -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: diff --git a/docs/manual/xpath.html b/docs/manual/xpath.html index 731a969..5a97a79 100644 --- a/docs/manual/xpath.html +++ b/docs/manual/xpath.html @@ -4,14 +4,15 @@ XPath - - + + -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: @@ -33,6 +34,7 @@
    XPath types
    Selecting nodes via XPath expression
    Using query objects
    +
    Using variables
    Error handling
    Conformance to W3C specification
    @@ -54,18 +56,6 @@ at tizag.com, and the XPath 1.0 specification.

    -
    - - - - - -
    [Note]Note

    - As of version 0.9, you need both STL and exception support to use XPath; - XPath is disabled if either PUGIXML_NO_STL - or PUGIXML_NO_EXCEPTIONS - is defined. -

    XPath types @@ -76,7 +66,7 @@ type, number type corresponds to double type, string type corresponds to either std::string or std::wstring, depending on whether wide - character interface is enabled, and node set corresponds to xpath_node_set type. There is an enumeration, + character interface is enabled, and node set corresponds to xpath_node_set type. There is an enumeration, xpath_value_type, which can take the values xpath_type_boolean, xpath_type_number, xpath_type_string or xpath_type_node_set, @@ -117,12 +107,14 @@ for equality with each other.

    - You can also create XPath nodes with one of tree constructors: the default + You can also create XPath nodes with one of the three constructors: the default constructor, the constructor that takes node argument, and the constructor that takes attribute and node arguments (in which case the attribute must - belong to the attribute list of the node). However, usually you don't need - to create your own XPath node objects, since they are returned to you via - selection functions. + belong to the attribute list of the node). The constructor from xml_node is implicit, so you can usually + pass xml_node to functions + that expect xpath_node. Apart + from that you usually don't need to create your own XPath node objects, since + they are returned to you via selection functions.

    XPath expressions operate not on single nodes, but instead on node sets. @@ -153,7 +145,7 @@ the iterators are random-access, all of the above operations are constant time, and accessing the element at index that is greater or equal than the set size results in undefined behavior. You can use both iterator-based and - index-based access for iteration, however the iterator-based can be faster. + index-based access for iteration, however the iterator-based one can be faster.

    The order of iteration depends on the order of nodes inside the set; the @@ -195,6 +187,21 @@ the complexity does - if the set is sorted, the complexity is constant, otherwise it is linear in the number of elements or worse.

    +

    + While in the majority of cases the node set is returned by XPath functions, + sometimes there is a need to manually construct a node set. For such cases, + a constructor is provided which takes an iterator range (const_iterator + is a typedef for const xpath_node*), and an optional type: +

    +
    xpath_node_set::xpath_node_set(const_iterator begin, const_iterator end, type_t type = type_unsorted);
    +
    +

    + The constructor copies the specified range and sets the specified type. The + objects in the range are not checked in any way; you'll have to ensure that + the range contains no duplicates, and that the objects are sorted according + to the type parameter. Otherwise + XPath operations with this set may produce unexpected results. +

    @@ -204,8 +211,8 @@ If you want to select nodes that match some XPath expression, you can do it with the following functions:

    -
    xpath_node xml_node::select_single_node(const char_t* query) const;
    -xpath_node_set xml_node::select_nodes(const char_t* query) const;
    +
    xpath_node xml_node::select_single_node(const char_t* query, xpath_variable_set* variables = 0) const;
    +xpath_node_set xml_node::select_nodes(const char_t* query, xpath_variable_set* variables = 0) const;
     

    select_nodes function compiles @@ -219,7 +226,7 @@ returns null XPath node.

    - Both functions throw xpath_exception + If exception handling is not disabled, both functions throw xpath_exception if the query can not be compiled or if it returns a value with type other than node set; see Error handling for details.

    @@ -235,7 +242,7 @@ xpath_node_set xml_node::select_nodes(const xpath_query& query) const;

    - Both functions throw xpath_exception + If exception handling is not disabled, both functions throw xpath_exception if the query returns a value with type other than node set.

    @@ -268,8 +275,8 @@

    When you call select_nodes with an expression string as an argument, a query object is created behind - the scene. A query object represents a compiled XPath expression. Query objects - can be needed in the following circumstances: + the scenes. A query object represents a compiled XPath expression. Query + objects can be needed in the following circumstances:

    • @@ -296,33 +303,34 @@ You can create a query object with the constructor that takes XPath expression as an argument:

      -
      explicit xpath_query::xpath_query(const char_t* query);
      +
      explicit xpath_query::xpath_query(const char_t* query, xpath_variable_set* variables = 0);
       

      The expression is compiled and the compiled representation is stored in the - new query object. If compilation fails, xpath_exception - is thrown (see Error handling for details). After the query is created, - you can query the type of the evaluation result using the following function: + new query object. If compilation fails, xpath_exception + is thrown if exception handling is not disabled (see Error handling for + details). After the query is created, you can query the type of the evaluation + result using the following function:

      xpath_value_type xpath_query::return_type() const;
       

      You can evaluate the query using one of the following functions:

      -
      bool xpath_query::evaluate_boolean(const xml_node& n) const;
      -double xpath_query::evaluate_number(const xml_node& n) const;
      -string_t xpath_query::evaluate_string(const xml_node& n) const;
      -xpath_node_set xpath_query::evaluate_node_set(const xml_node& n) const;
      +
      bool xpath_query::evaluate_boolean(const xpath_node& n) const;
      +double xpath_query::evaluate_number(const xpath_node& n) const;
      +string_t xpath_query::evaluate_string(const xpath_node& n) const;
      +xpath_node_set xpath_query::evaluate_node_set(const xpath_node& n) const;
       

      All functions take the context node as an argument, compute the expression - and return the result, converted to the requested type. By XPath specification, - value of any type can be converted to boolean, number or string value, but - no type other than node set can be converted to node set. Because of this, - evaluate_boolean, evaluate_number and evaluate_string - always return a result, but evaluate_node_set - throws an xpath_exception - if the return type is not node set. + and return the result, converted to the requested type. According to XPath + specification, value of any type can be converted to boolean, number or string + value, but no type other than node set can be converted to node set. Because + of this, evaluate_boolean, + evaluate_number and evaluate_string always return a result, + but evaluate_node_set results + in an error if the return type is not node set (see Error handling).

      @@ -334,6 +342,36 @@ is equivalent to calling xpath_query("query").evaluate_node_set(node).

      +

      + Note that evaluate_string + function returns the STL string; as such, it's not available in PUGIXML_NO_STL + mode and also usually allocates memory. There is another string evaluation + function: +

      +
      size_t xpath_query::evaluate_string(char_t* buffer, size_t capacity, const xpath_node& n) const;
      +
      +

      + This function evaluates the string, and then writes the result to buffer (but at most capacity + characters); then it returns the full size of the result in characters, including + the terminating zero. If capacity + is not 0, the resulting buffer is always zero-terminated. You can use this + function as follows: +

      +
        +
      • + First call the function with buffer + = 0 + and capacity = + 0; then allocate the returned amount + of characters, and call the function again, passing the allocated storage + and the amount of characters; +
      • +
      • + First call the function with a small buffer and its capacity; then, + if the result is larger than the capacity, the output has been trimmed, + so allocate a larger buffer and call the function again. +
      • +
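The two strategies in the list above can be sketched as follows. To keep the sketch compilable without the library, it uses a hypothetical stand-in, `evaluate_string_stub`, that only mimics the contract described above (copy at most `capacity` characters, always zero-terminate when `capacity` is not 0, return the full size including the terminating zero). The helper names `evaluate_exact` and `evaluate_small_first` are likewise illustrative and are not part of pugixml.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the buffer-based evaluate_string overload: it copies
// at most `capacity` characters (terminator included) and returns the full size
// of the result, counting the terminating zero.
std::size_t evaluate_string_stub(char* buffer, std::size_t capacity, const std::string& result)
{
    const std::size_t full_size = result.size() + 1;

    if (capacity > 0)
    {
        const std::size_t size = full_size < capacity ? full_size : capacity;
        std::memcpy(buffer, result.c_str(), size - 1);
        buffer[size - 1] = 0; // always zero-terminated, even when trimmed
    }

    return full_size;
}

// Strategy 1: query the size with a null buffer, then allocate exactly that much.
std::string evaluate_exact(const std::string& result)
{
    const std::size_t size = evaluate_string_stub(0, 0, result);

    std::vector<char> storage(size);
    evaluate_string_stub(&storage[0], storage.size(), result);

    return std::string(&storage[0]);
}

// Strategy 2: try a small stack buffer first; retry on the heap only if trimmed.
std::string evaluate_small_first(const std::string& result)
{
    char small[16];
    const std::size_t size = evaluate_string_stub(small, sizeof(small), result);

    if (size <= sizeof(small)) return std::string(small);

    std::vector<char> storage(size);
    evaluate_string_stub(&storage[0], storage.size(), result);

    return std::string(&storage[0]);
}
```

With the real function the calls would presumably have the same shape, with the query object and context node in place of the stand-in and its `result` argument. The second strategy avoids a heap allocation whenever the result fits into the small buffer, at the cost of evaluating twice when it does not.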

      This is an example of using query objects (samples/xpath_query.cpp):

      @@ -367,22 +405,237 @@
    +

    + XPath queries may contain references to variables; this is useful if you + want to use queries that depend on some dynamic parameter without manually + preparing the complete query string, or if you want to reuse the same query + object for similar queries. +

    +

    + Variable references have the form $name; in order to use them, you have to provide + a variable set, which includes all variables present in the query with correct + types. This set is passed to xpath_query + constructor or to select_nodes/select_single_node functions: +

    +
    explicit xpath_query::xpath_query(const char_t* query, xpath_variable_set* variables = 0);
    +xpath_node xml_node::select_single_node(const char_t* query, xpath_variable_set* variables = 0) const;
    +xpath_node_set xml_node::select_nodes(const char_t* query, xpath_variable_set* variables = 0) const;
    +
    +

    + If you're using query objects, you can change the variable values before + evaluate/select + calls to change the query behavior. +

    +
    + + + + + +
    [Note]Note

    + The variable set pointer is stored in the query object; you have to ensure + that the lifetime of the set exceeds that of the query object. +

    +

    + Variable sets correspond to xpath_variable_set + type, which is essentially a variable container. +

    +

    + You can add new variables with the following function: +

    +
    xpath_variable* xpath_variable_set::add(const char_t* name, xpath_value_type type);
    +
    +

    + The function tries to add a new variable with the specified name and type; + if a variable with that name does not exist in the set, the function adds + a new variable and returns the variable handle; if there is already a variable + with the specified name, the function returns the variable handle if the variable + has the specified type. Otherwise the function returns a null pointer; it also + returns a null pointer on allocation failure. +

    +

    + New variables are assigned the default value which depends on the type: + 0 for numbers, false for booleans, empty string for strings + and empty set for node sets. +

    +

    + You can get the existing variables with the following functions: +

    +
    xpath_variable* xpath_variable_set::get(const char_t* name);
    +const xpath_variable* xpath_variable_set::get(const char_t* name) const;
    +
    +

    + The functions return the variable handle, or null pointer if the variable + with the specified name is not found. +

    +

    + Additionally, there are the helper functions for setting the variable value + by name; they try to add the variable with the corresponding type, if it + does not exist, and to set the value. If the variable with the same name + but with different type is already present, they return false; + they also return false on allocation + failure. Note that these functions do not perform any type conversions. +

    +
    bool xpath_variable_set::set(const char_t* name, bool value);
    +bool xpath_variable_set::set(const char_t* name, double value);
    +bool xpath_variable_set::set(const char_t* name, const char_t* value);
    +bool xpath_variable_set::set(const char_t* name, const xpath_node_set& value);
    +
    +

    + The variable values are copied to the internal variable storage, so you can + modify or destroy them after the functions return. +

    +

    + If setting variables by name is not efficient enough, or if you have to inspect + variable information or get variable values, you can use variable handles. + A variable corresponds to the xpath_variable + type, and a variable handle is simply a pointer to xpath_variable. +

    +

    + In order to get variable information, you can use one of the following functions: +

    +
    const char_t* xpath_variable::name() const;
    +xpath_value_type xpath_variable::type() const;
    +
    +

    + Note that each variable has a distinct type which is specified upon variable + creation and can not be changed later. +

    +

    + In order to get variable value, you should use one of the following functions, + depending on the variable type: +

    +
    bool xpath_variable::get_boolean() const;
    +double xpath_variable::get_number() const;
    +const char_t* xpath_variable::get_string() const;
    +const xpath_node_set& xpath_variable::get_node_set() const;
    +
    +

    + These functions return the value of the variable. Note that no type conversions + are performed; if the type mismatch occurs, a dummy value is returned (false for booleans, NaN + for numbers, empty string for strings and empty set for node sets). +

    +

    + In order to set variable value, you should use one of the following functions, + depending on the variable type: +

    +
    bool xpath_variable::set(bool value);
    +bool xpath_variable::set(double value);
    +bool xpath_variable::set(const char_t* value);
    +bool xpath_variable::set(const xpath_node_set& value);
    +
    +

    + These functions modify the variable value. Note that no type conversions are + performed; if the type mismatch occurs, the functions return false; they also return false + on allocation failure. The variable values are copied to the internal variable + storage, so you can modify or destroy them after the functions return. +

    +

    + This is an example of using variables in XPath queries (samples/xpath_variables.cpp): +

    +

    + +

    +
    // Select nodes via compiled query
    +pugi::xpath_variable_set vars;
    +vars.add("remote", pugi::xpath_type_boolean);
    +
    +pugi::xpath_query query_remote_tools("/Profile/Tools/Tool[@AllowRemote = string($remote)]", &vars);
    +
    +vars.set("remote", true);
    +pugi::xpath_node_set tools_remote = query_remote_tools.evaluate_node_set(doc);
    +
    +vars.set("remote", false);
    +pugi::xpath_node_set tools_local = query_remote_tools.evaluate_node_set(doc);
    +
    +std::cout << "Remote tool: ";
    +tools_remote[2].node().print(std::cout);
    +
    +std::cout << "Local tool: ";
    +tools_local[0].node().print(std::cout);
    +
    +// You can pass the context directly to select_nodes/select_single_node
    +pugi::xpath_node_set tools_local_imm = doc.select_nodes("/Profile/Tools/Tool[@AllowRemote = string($remote)]", &vars);
    +
    +std::cout << "Local tool imm: ";
    +tools_local_imm[0].node().print(std::cout);
    +
    +

    +

    +
    +
    + -

    - As of version 0.9, all XPath errors result in thrown exceptions. The errors - can arise during expression compilation or node set evaluation. In both cases, - an xpath_exception object - is thrown. This is an exception object that implements std::exception - interface, and thus has a single function what(): +

    + There are two different mechanisms for error handling in XPath implementation; + the mechanism used depends on whether exception support is disabled (this + is controlled with PUGIXML_NO_EXCEPTIONS + define). +

    +

    + By default, XPath functions throw xpath_exception + object in case of errors; additionally, in the event any memory allocation + fails, an std::bad_alloc exception is thrown. Also xpath_exception is thrown if the query + is evaluated to a node set, but the return type is not node set. If the query + constructor succeeds (i.e. no exception is thrown), the query object is valid. + Otherwise you can get the error details via one of the following functions:

    virtual const char* xpath_exception::what() const throw();
    +const xpath_parse_result& xpath_exception::result() const;
    +
    +

    + If exceptions are disabled, then in the event of parsing failure the query + is initialized to invalid state; you can test if the query object is valid + by using it in a boolean expression: if + (query) { ... + }. Additionally, you can get parsing + result via the result() accessor: +

    +
    const xpath_parse_result& xpath_query::result() const;
     

    - This function returns the error message. Currently it is impossible to get - the exact place where query compilation failed. This functionality, along - with optional error handling without exceptions, will be available in version - 1.0. + Without exceptions, evaluating an invalid query results in false, + an empty string, NaN or an empty node set, depending on the type; evaluating + a query as a node set results in an empty node set if the return type is + not node set. +

    +

    + The information about parsing result is returned via xpath_parse_result + object. It contains parsing status and the offset of last successfully parsed + character from the beginning of the source stream: +

    +
    struct xpath_parse_result
    +{
    +    const char* error;
    +    ptrdiff_t offset;
    +
    +    operator bool() const;
    +    const char* description() const;
    +};
    +
    +

    + Parsing result is represented as the error message; it is either a null pointer, + in case there is no error, or the error message in the form of ASCII zero-terminated + string. +

    +

    + description() + member function can be used to get the error message; it never returns the + null pointer, so you can safely use description() even if query parsing succeeded. +

    +

    + In addition to the error message, parsing result has an offset + member, which contains the offset of last successfully parsed character. + This offset is in units of pugi::char_t (bytes + for character mode, wide characters for wide character mode). +

    +

    + Parsing result object can be implicitly converted to bool + like this: if (result) { ... } + else { ... }.

    This is an example of XPath error handling (samples/xpath_error.cpp): @@ -440,7 +693,7 @@ but instead has three.

  • - Since document can't have a document type declaration, id() + Since the document type declaration is not used for parsing, id() function always returns an empty node set.
  • @@ -459,13 +712,7 @@ value, depending on the library configuration; this means that some string functions are not fully Unicode-aware. This affects substring(), string-length() and translate() functions.
  • -
  • - Variable references are not supported. -
  • -

    - Some of these incompatibilities will be fixed in version 1.0. -

    @@ -477,7 +724,8 @@

    -
    pugixml 0.9 manual | + +pugixml 1.0 manual | Overview | Installation | Document: