Diffstat (limited to 'docs/manual.adoc')
-rw-r--r-- | docs/manual.adoc | 472 |
1 files changed, 245 insertions, 227 deletions
diff --git a/docs/manual.adoc b/docs/manual.adoc
index 6ffd844..856d1b3 100644
--- a/docs/manual.adoc
+++ b/docs/manual.adoc
@@ -252,16 +252,16 @@ NOTE: In that example `PUGIXML_API` is inconsistent between several source files
 pugixml is written in standard-compliant C{plus}{plus} with some compiler-specific workarounds where appropriate. pugixml is compatible with the C{plus}{plus}11 standard, but does not require C{plus}{plus}11 support. Each version is tested with a unit test suite (with code coverage about 99%) on the following platforms:
 * Microsoft Windows:
- * Borland C{plus}{plus} Compiler 5.82
- * Digital Mars C{plus}{plus} Compiler 8.51
- * Intel C{plus}{plus} Compiler 8.0, 9.0 x86/x64, 10.0 x86/x64, 11.0 x86/x64
- * Metrowerks CodeWarrior 8.0
- * Microsoft Visual C{plus}{plus} 6.0, 7.0 (2002), 7.1 (2003), 8.0 (2005) x86/x64, 9.0 (2008) x86/x64, 10.0 (2010) x86/x64, 11.0 (2011) x86/x64/ARM, 12.0 (2013) x86/x64/ARM and some CLR versions
- * MinGW (GCC) 3.4, 4.4, 4.5, 4.6 x64
+** Borland C{plus}{plus} Compiler 5.82
+** Digital Mars C{plus}{plus} Compiler 8.51
+** Intel C{plus}{plus} Compiler 8.0, 9.0 x86/x64, 10.0 x86/x64, 11.0 x86/x64
+** Metrowerks CodeWarrior 8.0
+** Microsoft Visual C{plus}{plus} 6.0, 7.0 (2002), 7.1 (2003), 8.0 (2005) x86/x64, 9.0 (2008) x86/x64, 10.0 (2010) x86/x64, 11.0 (2011) x86/x64/ARM, 12.0 (2013) x86/x64/ARM and some CLR versions
+** MinGW (GCC) 3.4, 4.4, 4.5, 4.6 x64
 * Linux (GCC 4.4.3 x86/x64, GCC 4.8.1 x64, Clang 3.2 x64)
 * FreeBSD (GCC 4.2.1 x86/x64)
-* Apple MacOSX (GCC 4.0.1 x86/x64/PowerPC)
+* Apple MacOSX (GCC 4.0.1 x86/x64/PowerPC, Clang 3.5 x64)
 * Sun Solaris (sunCC x86/x64)
 * Microsoft Xbox 360
 * Nintendo Wii (Metrowerks CodeWarrior 4.1)
@@ -804,14 +804,14 @@ include::samples/load_options.cpp[tags=code]
 [#xml_encoding]
 pugixml supports all popular Unicode encodings (UTF-8, UTF-16 (big and little endian), UTF-32 (big and little endian); UCS-2 is naturally supported since it's a strict subset of UTF-16) and handles all encoding conversions. Most loading functions accept the optional parameter `encoding`. This is a value of enumeration type `xml_encoding`, that can have the following values:
-* [anchor encoding_auto] means that pugixml will try to guess the encoding based on source XML data. The algorithm is a modified version of the one presented in Appendix F.1 of XML recommendation; it tries to match the first few bytes of input data with the following patterns in strict order:
- * If first four bytes match UTF-32 BOM (Byte Order Mark), encoding is assumed to be UTF-32 with the endianness equal to that of BOM;
- * If first two bytes match UTF-16 BOM, encoding is assumed to be UTF-16 with the endianness equal to that of BOM;
- * If first three bytes match UTF-8 BOM, encoding is assumed to be UTF-8;
- * If first four bytes match UTF-32 representation of [^<], encoding is assumed to be UTF-32 with the corresponding endianness;
- * If first four bytes match UTF-16 representation of [^<?], encoding is assumed to be UTF-16 with the corresponding endianness;
- * If first two bytes match UTF-16 representation of [^<], encoding is assumed to be UTF-16 with the corresponding endianness (this guess may yield incorrect result, but it's better than UTF-8);
- * Otherwise encoding is assumed to be UTF-8.
+* [anchor encoding_auto] means that pugixml will try to guess the encoding based on source XML data. The algorithm is a modified version of the one presented in http://www.w3.org/TR/REC-xml/#sec-guessing-no-ext-info[Appendix F.1 of XML recommendation]; it tries to match the first few bytes of input data with the following patterns in strict order:
+** If first four bytes match UTF-32 BOM (Byte Order Mark), encoding is assumed to be UTF-32 with the endianness equal to that of BOM;
+** If first two bytes match UTF-16 BOM, encoding is assumed to be UTF-16 with the endianness equal to that of BOM;
+** If first three bytes match UTF-8 BOM, encoding is assumed to be UTF-8;
+** If first four bytes match UTF-32 representation of [^<], encoding is assumed to be UTF-32 with the corresponding endianness;
+** If first four bytes match UTF-16 representation of [^<?], encoding is assumed to be UTF-16 with the corresponding endianness;
+** If first two bytes match UTF-16 representation of [^<], encoding is assumed to be UTF-16 with the corresponding endianness (this guess may yield incorrect result, but it's better than UTF-8);
+** Otherwise encoding is assumed to be UTF-8.
 * [anchor encoding_utf8] corresponds to UTF-8 encoding as defined in the Unicode standard; UTF-8 sequences with length equal to 5 or 6 are not standard and are rejected.
 * [anchor encoding_utf16_le] corresponds to little-endian UTF-16 encoding as defined in the Unicode standard; surrogate pairs are supported.
@@ -1553,7 +1553,7 @@ The failure conditions resemble those of `append_child`, `insert_child_before` a
 pugixml provides several ways to assemble an XML document from other XML documents. Assuming there is a set of document fragments, represented as in-memory buffers, the implementation choices are as follows:
 * Use a temporary document to parse the data from a string, then clone the nodes to a destination node. For example:
-
++
 [source]
 ----
 bool append_fragment(pugi::xml_node target, const char* buffer, size_t size)
@@ -1567,7 +1567,7 @@ bool append_fragment(pugi::xml_node target, const char* buffer, size_t size)
 ----
 * Cache the parsing step - instead of keeping in-memory buffers, keep document objects that already contain the parsed fragment:
-
++
 [source]
 ----
 bool append_fragment(pugi::xml_node target, const pugi::xml_document& cached_fragment)
@@ -1578,7 +1578,7 @@ bool append_fragment(pugi::xml_node target, const pugi::xml_document& cached_fra
 ----
 * Use `xml_node::append_buffer` directly:
-
++
 [source]
 ----
 xml_parse_result xml_node::append_buffer(const void* contents, size_t size, unsigned int options = parse_default, xml_encoding encoding = encoding_auto);
@@ -2140,341 +2140,359 @@ Because of the differences in document object models, performance considerations
 [[changes]]
 == Changelog
-[h5 15.04.2015 - version 1.6]
+:!numbered:
+
+[[v1.6]]
+=== v1.6 ^15.04.2015^
 Maintenance release. Changes:
 * Specification changes:
- # Attribute/text values now use more digits when printing floating point numbers to guarantee round-tripping.
- # Text nodes no longer get extra surrounding whitespace when pretty-printing nodes with mixed contents
+ . Attribute/text values now use more digits when printing floating point numbers to guarantee round-tripping.
+ . Text nodes no longer get extra surrounding whitespace when pretty-printing nodes with mixed contents
 * Bug fixes:
- # Fixed translate and normalize-space XPath functions to no longer return internal NUL characters
- # Fixed buffer overrun on malformed comments inside DOCTYPE sections
- # DOCTYPE parsing can no longer run out of stack space on malformed inputs (XML parsing is now using bounded stack space)
- # Adjusted processing instruction output to avoid malformed documents if the PI value contains "?>"
+ . Fixed translate and normalize-space XPath functions to no longer return internal NUL characters
+ . Fixed buffer overrun on malformed comments inside DOCTYPE sections
+ . DOCTYPE parsing can no longer run out of stack space on malformed inputs (XML parsing is now using bounded stack space)
+ . Adjusted processing instruction output to avoid malformed documents if the PI value contains "?>"
-[h5 27.11.2014 - version 1.5]
+[[v1.5]]
+=== v1.5 ^27.11.2014^
 Major release, featuring a lot of performance improvements and some new features.
 * Specification changes:
- # xml_document::load(const char_t*) was renamed to load_string; the old method is still available and will be deprecated in a future release
- # xml_node::select_single_node was renamed to select_node; the old method is still available and will be deprecated in a future release.
+ . xml_document::load(const char_t*) was renamed to load_string; the old method is still available and will be deprecated in a future release
+ . xml_node::select_single_node was renamed to select_node; the old method is still available and will be deprecated in a future release.
 * New features:
- # Added xml_node::append_move and other functions for moving nodes within a document
- # Added xpath_query::evaluate_node for evaluating queries with a single node as a result
+ . Added xml_node::append_move and other functions for moving nodes within a document
+ . Added xpath_query::evaluate_node for evaluating queries with a single node as a result
 * Performance improvements:
- # Optimized XML parsing (10-40% faster with clang/gcc, up to 10% faster with MSVC)
- # Optimized memory consumption when copying nodes in the same document (string contents is now shared)
- # Optimized node copying (10% faster for cross-document copies, 3x faster for inter-document copies; also it now consumes a constant amount of stack space)
- # Optimized node output (60% faster; also it now consumes a constant amount of stack space)
- # Optimized XPath allocation (query evaluation now results in fewer temporary allocations)
- # Optimized XPath sorting (node set sorting is 2-3x faster in some cases)
- # Optimized XPath evaluation (XPathMark suite is 100x faster; some commonly used queries are 3-4x faster)
+ . Optimized XML parsing (10-40% faster with clang/gcc, up to 10% faster with MSVC)
+ . Optimized memory consumption when copying nodes in the same document (string contents is now shared)
+ . Optimized node copying (10% faster for cross-document copies, 3x faster for inter-document copies; also it now consumes a constant amount of stack space)
+ . Optimized node output (60% faster; also it now consumes a constant amount of stack space)
+ . Optimized XPath allocation (query evaluation now results in fewer temporary allocations)
+ . Optimized XPath sorting (node set sorting is 2-3x faster in some cases)
+ . Optimized XPath evaluation (XPathMark suite is 100x faster; some commonly used queries are 3-4x faster)
 * Compatibility improvements:
- # Fixed xml_node::offset_debug for corner cases
- # Fixed undefined behavior while calling memcpy in some cases
- # Fixed MSVC 2015 compilation warnings
- # Fixed contrib/foreach.hpp for Boost 1.56.0
+ . Fixed xml_node::offset_debug for corner cases
+ . Fixed undefined behavior while calling memcpy in some cases
+ . Fixed MSVC 2015 compilation warnings
+ . Fixed contrib/foreach.hpp for Boost 1.56.0
 * Bug fixes
- # Adjusted comment output to avoid malformed documents if the comment value contains "--"
- # Fix XPath sorting for documents that were constructed using append_buffer
- # Fix load_file for wide-character paths with non-ASCII characters in MinGW with C{plus}{plus}11 mode enabled
+ . Adjusted comment output to avoid malformed documents if the comment value contains "--"
+ . Fix XPath sorting for documents that were constructed using append_buffer
+ . Fix load_file for wide-character paths with non-ASCII characters in MinGW with C{plus}{plus}11 mode enabled
-[h5 27.02.2014 - version 1.4]
+[[v1.4]]
+=== v1.4 ^27.02.2014^
 Major release, featuring various new features, bug fixes and compatibility improvements.
 * Specification changes:
- # Documents without element nodes are now rejected with status_no_document_element error, unless parse_fragment option is used
+ . Documents without element nodes are now rejected with status_no_document_element error, unless parse_fragment option is used
 * New features:
- # Added XML fragment parsing (parse_fragment flag)
- # Added PCDATA whitespace trimming (parse_trim_pcdata flag)
- # Added long long support for xml_attribute and xml_text (as_llong, as_ullong and set_value/set overloads)
- # Added hexadecimal integer parsing support for as_int/as_uint/as_llong/as_ullong
- # Added xml_node::append_buffer to improve performance of assembling documents from fragments
- # xml_named_node_iterator is now bidirectional
- # Reduced XPath stack consumption during compilation and evaluation (useful for embedded systems)
+ . Added XML fragment parsing (parse_fragment flag)
+ . Added PCDATA whitespace trimming (parse_trim_pcdata flag)
+ . Added long long support for xml_attribute and xml_text (as_llong, as_ullong and set_value/set overloads)
+ . Added hexadecimal integer parsing support for as_int/as_uint/as_llong/as_ullong
+ . Added xml_node::append_buffer to improve performance of assembling documents from fragments
+ . xml_named_node_iterator is now bidirectional
+ . Reduced XPath stack consumption during compilation and evaluation (useful for embedded systems)
 * Compatibility improvements:
- # Improved support for platforms without wchar_t support
- # Fixed several false positives in clang static analysis
- # Fixed several compilation warnings for various GCC versions
+ . Improved support for platforms without wchar_t support
+ . Fixed several false positives in clang static analysis
+ . Fixed several compilation warnings for various GCC versions
 * Bug fixes:
- # Fixed undefined pointer arithmetic in XPath implementation
- # Fixed non-seekable iostream support for certain stream types, i.e. boost file_source with pipe input
- # Fixed xpath_query::return_type() for some expressions
- # Fixed dllexport issues with xml_named_node_iterator
- # Fixed find_child_by_attribute assertion for attributes with null name/value
+ . Fixed undefined pointer arithmetic in XPath implementation
+ . Fixed non-seekable iostream support for certain stream types, i.e. boost file_source with pipe input
+ . Fixed xpath_query::return_type() for some expressions
+ . Fixed dllexport issues with xml_named_node_iterator
+ . Fixed find_child_by_attribute assertion for attributes with null name/value
-[h5 1.05.2012 - version 1.2]
+[[v1.2]]
+=== v1.2 ^1.05.2012^
 Major release, featuring header-only mode, various interface enhancements (i.e. PCDATA manipulation and C{plus}{plus}11 iteration), many other features and compatibility improvements.
 * New features:
- # Added xml_text helper class for working with PCDATA/CDATA contents of an element node
- # Added optional header-only mode (controlled by PUGIXML_HEADER_ONLY define)
- # Added xml_node::children() and xml_node::attributes() for C{plus}{plus}11 ranged for loop or BOOST_FOREACH
- # Added support for Latin-1 (ISO-8859-1) encoding conversion during loading and saving
- # Added custom default values for '''xml_attribute::as_*''' (they are returned if the attribute does not exist)
- # Added parse_ws_pcdata_single flag for preserving whitespace-only PCDATA in case it's the only child
- # Added format_save_file_text for xml_document::save_file to open files as text instead of binary (changes newlines on Windows)
- # Added format_no_escapes flag to disable special symbol escaping (complements ~parse_escapes)
- # Added support for loading document from streams that do not support seeking
- # Added '''PUGIXML_MEMORY_*''' constants for tweaking allocation behavior (useful for embedded systems)
- # Added PUGIXML_VERSION preprocessor define
+ . Added xml_text helper class for working with PCDATA/CDATA contents of an element node
+ . Added optional header-only mode (controlled by PUGIXML_HEADER_ONLY define)
+ . Added xml_node::children() and xml_node::attributes() for C{plus}{plus}11 ranged for loop or BOOST_FOREACH
+ . Added support for Latin-1 (ISO-8859-1) encoding conversion during loading and saving
+ . Added custom default values for '''xml_attribute::as_*''' (they are returned if the attribute does not exist)
+ . Added parse_ws_pcdata_single flag for preserving whitespace-only PCDATA in case it's the only child
+ . Added format_save_file_text for xml_document::save_file to open files as text instead of binary (changes newlines on Windows)
+ . Added format_no_escapes flag to disable special symbol escaping (complements ~parse_escapes)
+ . Added support for loading document from streams that do not support seeking
+ . Added '''PUGIXML_MEMORY_*''' constants for tweaking allocation behavior (useful for embedded systems)
+ . Added PUGIXML_VERSION preprocessor define
 * Compatibility improvements:
- # Parser does not require setjmp support (improves compatibility with some embedded platforms, enables clr:pure compilation)
- # STL forward declarations are no longer used (fixes SunCC/RWSTL compilation, fixes clang compilation in C{plus}{plus}11 mode)
- # Fixed AirPlay SDK, Android, Windows Mobile (WinCE) and C{plus}{plus}/CLI compilation
- # Fixed several compilation warnings for various GCC versions, Intel C{plus}{plus} compiler and Clang
+ . Parser does not require setjmp support (improves compatibility with some embedded platforms, enables clr:pure compilation)
+ . STL forward declarations are no longer used (fixes SunCC/RWSTL compilation, fixes clang compilation in C{plus}{plus}11 mode)
+ . Fixed AirPlay SDK, Android, Windows Mobile (WinCE) and C{plus}{plus}/CLI compilation
+ . Fixed several compilation warnings for various GCC versions, Intel C{plus}{plus} compiler and Clang
 * Bug fixes:
- # Fixed unsafe bool conversion to avoid problems on C{plus}{plus}/CLI
- # Iterator dereference operator is const now (fixes Boost filter_iterator support)
- # xml_document::save_file now checks for file I/O errors during saving
+ . Fixed unsafe bool conversion to avoid problems on C{plus}{plus}/CLI
+ . Iterator dereference operator is const now (fixes Boost filter_iterator support)
+ . xml_document::save_file now checks for file I/O errors during saving
-[h5 1.11.2010 - version 1.0]
+[[v1.0]]
+=== v1.0 ^1.11.2010^
 Major release, featuring many XPath enhancements, wide character filename support, miscellaneous performance improvements, bug fixes and more.
 * XPath:
- # XPath implementation is moved to pugixml.cpp (which is the only source file now); use PUGIXML_NO_XPATH if you want to disable XPath to reduce code size
- # XPath is now supported without exceptions (PUGIXML_NO_EXCEPTIONS); the error handling mechanism depends on the presence of exception support
- # XPath is now supported without STL (PUGIXML_NO_STL)
- # Introduced variable support
- # Introduced new xpath_query::evaluate_string, which works without STL
- # Introduced new xpath_node_set constructor (from an iterator range)
- # Evaluation function now accept attribute context nodes
- # All internal allocations use custom allocation functions
- # Improved error reporting; now a last parsed offset is returned together with the parsing error
+ . XPath implementation is moved to pugixml.cpp (which is the only source file now); use PUGIXML_NO_XPATH if you want to disable XPath to reduce code size
+ . XPath is now supported without exceptions (PUGIXML_NO_EXCEPTIONS); the error handling mechanism depends on the presence of exception support
+ . XPath is now supported without STL (PUGIXML_NO_STL)
+ . Introduced variable support
+ . Introduced new xpath_query::evaluate_string, which works without STL
+ . Introduced new xpath_node_set constructor (from an iterator range)
+ . Evaluation function now accept attribute context nodes
+ . All internal allocations use custom allocation functions
+ . Improved error reporting; now a last parsed offset is returned together with the parsing error
 * Bug fixes:
- # Fixed memory leak for loading from streams with stream exceptions turned on
- # Fixed custom deallocation function calling with null pointer in one case
- # Fixed missing attributes for iterator category functions; all functions/classes can now be DLL-exported
- # Worked around Digital Mars compiler bug, which lead to minor read overfetches in several functions
- # load_file now works with 2+ Gb files in MSVC/MinGW
- # XPath: fixed memory leaks for incorrect queries
- # XPath: fixed xpath_node() attribute constructor with empty attribute argument
- # XPath: fixed lang() function for non-ASCII arguments
+ . Fixed memory leak for loading from streams with stream exceptions turned on
+ . Fixed custom deallocation function calling with null pointer in one case
+ . Fixed missing attributes for iterator category functions; all functions/classes can now be DLL-exported
+ . Worked around Digital Mars compiler bug, which lead to minor read overfetches in several functions
+ . load_file now works with 2+ Gb files in MSVC/MinGW
+ . XPath: fixed memory leaks for incorrect queries
+ . XPath: fixed xpath_node() attribute constructor with empty attribute argument
+ . XPath: fixed lang() function for non-ASCII arguments
 * Specification changes:
- # CDATA nodes containing ]]> are printed as several nodes; while this changes the internal structure, this is the only way to escape CDATA contents
- # Memory allocation errors during parsing now preserve last parsed offset (to give an idea about parsing progress)
- # If an element node has the only child, and it is of CDATA type, then the extra indentation is omitted (previously this behavior only held for PCDATA children)
+ . CDATA nodes containing ]]> are printed as several nodes; while this changes the internal structure, this is the only way to escape CDATA contents
+ . Memory allocation errors during parsing now preserve last parsed offset (to give an idea about parsing progress)
+ . If an element node has the only child, and it is of CDATA type, then the extra indentation is omitted (previously this behavior only held for PCDATA children)
 * Additional functionality:
- # Added xml_parse_result default constructor
- # Added xml_document::load_file and xml_document::save_file with wide character paths
- # Added as_utf8 and as_wide overloads for std::wstring/std::string arguments
- # Added DOCTYPE node type (node_doctype) and a special parse flag, parse_doctype, to add such nodes to the document during parsing
- # Added parse_full parse flag mask, which extends parse_default with all node type parsing flags except parse_ws_pcdata
- # Added xml_node::hash_value() and xml_attribute::hash_value() functions for use in hash-based containers
- # Added internal_object() and additional constructor for both xml_node and xml_attribute for easier marshalling (useful for language bindings)
- # Added xml_document::document_element() function
- # Added xml_node::prepend_attribute, xml_node::prepend_child and xml_node::prepend_copy functions
- # Added xml_node::append_child, xml_node::prepend_child, xml_node::insert_child_before and xml_node::insert_child_after overloads for element nodes (with name instead of type)
- # Added xml_document::reset() function
+ . Added xml_parse_result default constructor
+ . Added xml_document::load_file and xml_document::save_file with wide character paths
+ . Added as_utf8 and as_wide overloads for std::wstring/std::string arguments
+ . Added DOCTYPE node type (node_doctype) and a special parse flag, parse_doctype, to add such nodes to the document during parsing
+ . Added parse_full parse flag mask, which extends parse_default with all node type parsing flags except parse_ws_pcdata
+ . Added xml_node::hash_value() and xml_attribute::hash_value() functions for use in hash-based containers
+ . Added internal_object() and additional constructor for both xml_node and xml_attribute for easier marshalling (useful for language bindings)
+ . Added xml_document::document_element() function
+ . Added xml_node::prepend_attribute, xml_node::prepend_child and xml_node::prepend_copy functions
+ . Added xml_node::append_child, xml_node::prepend_child, xml_node::insert_child_before and xml_node::insert_child_after overloads for element nodes (with name instead of type)
+ . Added xml_document::reset() function
 * Performance improvements:
- # xml_node::root() and xml_node::offset_debug() are now O(1) instead of O(logN)
- # Minor parsing optimizations
- # Minor memory optimization for strings in DOM tree (set_name/set_value)
- # Memory optimization for string memory reclaiming in DOM tree (set_name/set_value now reallocate the buffer if memory waste is too big)
- # XPath: optimized document order sorting
- # XPath: optimized child/attribute axis step
- # XPath: optimized number-to-string conversions in MSVC
- # XPath: optimized concat for many arguments
- # XPath: optimized evaluation allocation mechanism: constant and document strings are not heap-allocated
- # XPath: optimized evaluation allocation mechanism: all temporaries' allocations use fast stack-like allocator
+ . xml_node::root() and xml_node::offset_debug() are now O(1) instead of O(logN)
+ . Minor parsing optimizations
+ . Minor memory optimization for strings in DOM tree (set_name/set_value)
+ . Memory optimization for string memory reclaiming in DOM tree (set_name/set_value now reallocate the buffer if memory waste is too big)
+ . XPath: optimized document order sorting
+ . XPath: optimized child/attribute axis step
+ . XPath: optimized number-to-string conversions in MSVC
+ . XPath: optimized concat for many arguments
+ . XPath: optimized evaluation allocation mechanism: constant and document strings are not heap-allocated
+ . XPath: optimized evaluation allocation mechanism: all temporaries' allocations use fast stack-like allocator
 * Compatibility:
- # Removed wildcard functions (xml_node::child_w, xml_node::attribute_w, etc.)
- # Removed xml_node::all_elements_by_name
- # Removed xpath_type_t enumeration; use xpath_value_type instead
- # Removed format_write_bom_utf8 enumeration; use format_write_bom instead
- # Removed xml_document::precompute_document_order, xml_attribute::document_order and xml_node::document_order functions; document order sort optimization is now automatic
- # Removed xml_document::parse functions and transfer_ownership struct; use xml_document::load_buffer_inplace and xml_document::load_buffer_inplace_own instead
- # Removed as_utf16 function; use as_wide instead
+ . Removed wildcard functions (xml_node::child_w, xml_node::attribute_w, etc.)
+ . Removed xml_node::all_elements_by_name
+ . Removed xpath_type_t enumeration; use xpath_value_type instead
+ . Removed format_write_bom_utf8 enumeration; use format_write_bom instead
+ . Removed xml_document::precompute_document_order, xml_attribute::document_order and xml_node::document_order functions; document order sort optimization is now automatic
+ . Removed xml_document::parse functions and transfer_ownership struct; use xml_document::load_buffer_inplace and xml_document::load_buffer_inplace_own instead
+ . Removed as_utf16 function; use as_wide instead
-[h5 1.07.2010 - version 0.9]
+[[v0.9]]
+=== v0.9 ^1.07.2010^
 Major release, featuring extended and improved Unicode support, miscellaneous performance improvements, bug fixes and more.
 * Major Unicode improvements:
- # Introduced encoding support (automatic/manual encoding detection on load, manual encoding selection on save, conversion from/to UTF8, UTF16 LE/BE, UTF32 LE/BE)
- # Introduced wchar_t mode (you can set PUGIXML_WCHAR_MODE define to switch pugixml internal encoding from UTF8 to wchar_t; all functions are switched to their Unicode variants)
- # Load/save functions now support wide streams
+ . Introduced encoding support (automatic/manual encoding detection on load, manual encoding selection on save, conversion from/to UTF8, UTF16 LE/BE, UTF32 LE/BE)
+ . Introduced wchar_t mode (you can set PUGIXML_WCHAR_MODE define to switch pugixml internal encoding from UTF8 to wchar_t; all functions are switched to their Unicode variants)
+ . Load/save functions now support wide streams
 * Bug fixes:
- # Fixed document corruption on failed parsing bug
- # XPath string <-> number conversion improvements (increased precision, fixed crash for huge numbers)
- # Improved DOCTYPE parsing: now parser recognizes all well-formed DOCTYPE declarations
- # Fixed xml_attribute::as_uint() for large numbers (i.e. 2^32-1)
- # Fixed xml_node::first_element_by_path for path components that are prefixes of node names, but are not exactly equal to them.
+ . Fixed document corruption on failed parsing bug
+ . XPath string <-> number conversion improvements (increased precision, fixed crash for huge numbers)
+ . Improved DOCTYPE parsing: now parser recognizes all well-formed DOCTYPE declarations
+ . Fixed xml_attribute::as_uint() for large numbers (i.e. 2^32-1)
+ . Fixed xml_node::first_element_by_path for path components that are prefixes of node names, but are not exactly equal to them.
 * Specification changes:
- # parse() API changed to load_buffer/load_buffer_inplace/load_buffer_inplace_own; load_buffer APIs do not require zero-terminated strings.
- # Renamed as_utf16 to as_wide
- # Changed xml_node::offset_debug return type and xml_parse_result::offset type to ptrdiff_t
- # Nodes/attributes with empty names are now printed as :anonymous
+ . parse() API changed to load_buffer/load_buffer_inplace/load_buffer_inplace_own; load_buffer APIs do not require zero-terminated strings.
+ . Renamed as_utf16 to as_wide
+ . Changed xml_node::offset_debug return type and xml_parse_result::offset type to ptrdiff_t
+ . Nodes/attributes with empty names are now printed as :anonymous
 * Performance improvements:
- # Optimized document parsing and saving
- # Changed internal memory management: internal allocator is used for both metadata and name/value data; allocated pages are deleted if all allocations from them are deleted
- # Optimized memory consumption: sizeof(xml_node_struct) reduced from 40 bytes to 32 bytes on x86
- # Optimized debug mode parsing/saving by order of magnitude
+ . Optimized document parsing and saving
+ . Changed internal memory management: internal allocator is used for both metadata and name/value data; allocated pages are deleted if all allocations from them are deleted
+ . Optimized memory consumption: sizeof(xml_node_struct) reduced from 40 bytes to 32 bytes on x86
+ . Optimized debug mode parsing/saving by order of magnitude
 * Miscellaneous:
- # All STL includes except <exception> in pugixml.hpp are replaced with forward declarations
- # xml_node::remove_child and xml_node::remove_attribute now return the operation result
+ . All STL includes except <exception> in pugixml.hpp are replaced with forward declarations
+ . xml_node::remove_child and xml_node::remove_attribute now return the operation result
 * Compatibility:
- # parse() and as_utf16 are left for compatibility (these functions are deprecated and will be removed in version 1.0)
- # Wildcard functions, document_order/precompute_document_order functions, all_elements_by_name function and format_write_bom_utf8 flag are deprecated and will be removed in version 1.0
- # xpath_type_t enumeration was renamed to xpath_value_type; xpath_type_t is deprecated and will be removed in version 1.0
+ . parse() and as_utf16 are left for compatibility (these functions are deprecated and will be removed in version 1.0)
+ . Wildcard functions, document_order/precompute_document_order functions, all_elements_by_name function and format_write_bom_utf8 flag are deprecated and will be removed in version 1.0
+ . xpath_type_t enumeration was renamed to xpath_value_type; xpath_type_t is deprecated and will be removed in version 1.0
-[h5 8.11.2009 - version 0.5]
+[[v0.5]]
+=== v0.5 ^8.11.2009^
 Major bugfix release. Changes:
 * XPath bugfixes:
- # Fixed translate(), lang() and concat() functions (infinite loops/crashes)
- # Fixed compilation of queries with empty literal strings ("")
- # Fixed axis tests: they never add empty nodes/attributes to the resulting node set now
- # Fixed string-value evaluation for node-set (the result excluded some text descendants)
- # Fixed self:: axis (it behaved like ancestor-or-self::)
- # Fixed following:: and preceding:: axes (they included descendent and ancestor nodes, respectively)
- # Minor fix for namespace-uri() function (namespace declaration scope includes the parent element of namespace declaration attribute)
- # Some incorrect queries are no longer parsed now (i.e. foo: *)
- # Fixed text()/etc. node test parsing bug (i.e. foo[text()] failed to compile)
- # Fixed root step (/) - it now selects empty node set if query is evaluated on empty node
- # Fixed string to number conversion ("123 " converted to NaN, "123 .456" converted to 123.456 - now the results are 123 and NaN, respectively)
- # Node set copying now preserves sorted type; leads to better performance on some queries
+ . Fixed translate(), lang() and concat() functions (infinite loops/crashes)
+ . Fixed compilation of queries with empty literal strings ("")
+ . Fixed axis tests: they never add empty nodes/attributes to the resulting node set now
+ . Fixed string-value evaluation for node-set (the result excluded some text descendants)
+ . Fixed self:: axis (it behaved like ancestor-or-self::)
+ . Fixed following:: and preceding:: axes (they included descendent and ancestor nodes, respectively)
+ . Minor fix for namespace-uri() function (namespace declaration scope includes the parent element of namespace declaration attribute)
+ . Some incorrect queries are no longer parsed now (i.e. foo: *)
+ . Fixed text()/etc. node test parsing bug (i.e. foo[text()] failed to compile)
+ . Fixed root step (/) - it now selects empty node set if query is evaluated on empty node
+ . Fixed string to number conversion ("123 " converted to NaN, "123 .456" converted to 123.456 - now the results are 123 and NaN, respectively)
+ . Node set copying now preserves sorted type; leads to better performance on some queries
 * Miscellaneous bugfixes:
- # Fixed xml_node::offset_debug for PI nodes
- # Added empty attribute checks to xml_node::remove_attribute
- # Fixed node_pi and node_declaration copying
- # Const-correctness fixes
+ . Fixed xml_node::offset_debug for PI nodes
+ . Added empty attribute checks to xml_node::remove_attribute
+ . Fixed node_pi and node_declaration copying
+ . Const-correctness fixes
 * Specification changes:
- # xpath_node::select_nodes() and related functions now throw exception if expression return type is not node set (instead of assertion)
- # xml_node::traverse() now sets depth to -1 for both begin() and end() callbacks (was 0 at begin() and -1 at end())
- # In case of non-raw node printing a newline is output after PCDATA inside nodes if the PCDATA has siblings
- # UTF8 -> wchar_t conversion now considers 5-byte UTF8-like sequences as invalid
+ . xpath_node::select_nodes() and related functions now throw exception if expression return type is not node set (instead of assertion)
+ . xml_node::traverse() now sets depth to -1 for both begin() and end() callbacks (was 0 at begin() and -1 at end())
+ . In case of non-raw node printing a newline is output after PCDATA inside nodes if the PCDATA has siblings
+ . UTF8 -> wchar_t conversion now considers 5-byte UTF8-like sequences as invalid
 * New features:
- # Added xpath_node_set::operator[] for index-based iteration
- # Added xpath_query::return_type()
- # Added getter accessors for memory-management functions
+ . Added xpath_node_set::operator[] for index-based iteration
+ . Added xpath_query::return_type()
+ . Added getter accessors for memory-management functions
-[h5 17.09.2009 - version 0.42]
+[[v0.42]]
+=== v0.42 ^17.09.2009^
 Maintenance release. Changes:
 * Bug fixes:
- # Fixed deallocation in case of custom allocation functions or if delete[] / free are incompatible
- # XPath parser fixed for incorrect queries (i.e. incorrect XPath queries should now always fail to compile)
- # Const-correctness fixes for find_child_by_attribute
- # Improved compatibility (miscellaneous warning fixes, fixed cstring include dependency for GCC)
- # Fixed iterator begin/end and print function to work correctly for empty nodes
+ . Fixed deallocation in case of custom allocation functions or if delete[] / free are incompatible
+ . XPath parser fixed for incorrect queries (i.e.
incorrect XPath queries should now always fail to compile) + . Const-correctness fixes for find_child_by_attribute + . Improved compatibility (miscellaneous warning fixes, fixed cstring include dependency for GCC) + . Fixed iterator begin/end and print function to work correctly for empty nodes * New features: - # Added PUGIXML_API/PUGIXML_CLASS/PUGIXML_FUNCTION configuration macros to control class/function attributes - # Added xml_attribute::set_value overloads for different types + . Added PUGIXML_API/PUGIXML_CLASS/PUGIXML_FUNCTION configuration macros to control class/function attributes + . Added xml_attribute::set_value overloads for different types -[h5 8.02.2009 - version 0.41] +[[v0.41]] +=== v0.41 ^8.02.2009^ Maintenance release. Changes: * Bug fixes: - # Fixed bug with node printing (occasionally some content was not written to output stream) + . Fixed bug with node printing (occasionally some content was not written to output stream) -[h5 18.01.2009 - version 0.4] +[[v0.4]] +=== v0.4 ^18.01.2009^ Changes: * Bug fixes: - # Documentation fix in samples for parse() with manual lifetime control - # Fixed document order sorting in XPath (it caused wrong order of nodes after xpath_node_set::sort and wrong results of some XPath queries) + . Documentation fix in samples for parse() with manual lifetime control + . Fixed document order sorting in XPath (it caused wrong order of nodes after xpath_node_set::sort and wrong results of some XPath queries) * Node printing changes: - # Single quotes are no longer escaped when printing nodes - # Symbols in second half of ASCII table are no longer escaped when printing nodes; because of this, format_utf8 flag is deleted as it's no longer needed and format_write_bom is renamed to format_write_bom_utf8. - # Reworked node printing - now it works via xml_writer interface; implementations for FILE* and std::ostream are available. As a side-effect, xml_document::save_file now works without STL. + . 
Single quotes are no longer escaped when printing nodes + . Symbols in second half of ASCII table are no longer escaped when printing nodes; because of this, format_utf8 flag is deleted as it's no longer needed and format_write_bom is renamed to format_write_bom_utf8. + . Reworked node printing - now it works via xml_writer interface; implementations for FILE* and std::ostream are available. As a side-effect, xml_document::save_file now works without STL. * New features: - # Added unsigned integer support for attributes (xml_attribute::as_uint, xml_attribute::operator=) - # Now document declaration (<?xml ...?>) is parsed as node with type node_declaration when parse_declaration flag is specified (access to encoding/version is performed as if they were attributes, i.e. doc.child("xml").attribute("version").as_float()); corresponding flags for node printing were also added - # Added support for custom memory management (see set_memory_management_functions for details) - # Implemented node/attribute copying (see xml_node::insert_copy_* and xml_node::append_copy for details) - # Added find_child_by_attribute and find_child_by_attribute_w to simplify parsing code in some cases (i.e. COLLADA files) - # Added file offset information querying for debugging purposes (now you're able to determine exact location of any xml_node in parsed file, see xml_node::offset_debug for details) - # Improved error handling for parsing - now load(), load_file() and parse() return xml_parse_result, which contains error code and last parsed offset; this does not break old interface as xml_parse_result can be implicitly casted to bool. + . Added unsigned integer support for attributes (xml_attribute::as_uint, xml_attribute::operator=) + . Now document declaration (<?xml ...?>) is parsed as node with type node_declaration when parse_declaration flag is specified (access to encoding/version is performed as if they were attributes, i.e. 
doc.child("xml").attribute("version").as_float()); corresponding flags for node printing were also added + . Added support for custom memory management (see set_memory_management_functions for details) + . Implemented node/attribute copying (see xml_node::insert_copy_* and xml_node::append_copy for details) + . Added find_child_by_attribute and find_child_by_attribute_w to simplify parsing code in some cases (i.e. COLLADA files) + . Added file offset information querying for debugging purposes (now you're able to determine exact location of any xml_node in parsed file, see xml_node::offset_debug for details) + . Improved error handling for parsing - now load(), load_file() and parse() return xml_parse_result, which contains error code and last parsed offset; this does not break old interface as xml_parse_result can be implicitly casted to bool. -[h5 31.10.2007 - version 0.34] +[[v0.34]] +=== v0.34 ^31.10.2007^ Maintenance release. Changes: * Bug fixes: - # Fixed bug with loading from text-mode iostreams - # Fixed leak when transfer_ownership is true and parsing is failing - # Fixed bug in saving (\r and \n are now escaped in attribute values) - # Renamed free() to destroy() - some macro conflicts were reported + . Fixed bug with loading from text-mode iostreams + . Fixed leak when transfer_ownership is true and parsing is failing + . Fixed bug in saving (\r and \n are now escaped in attribute values) + . Renamed free() to destroy() - some macro conflicts were reported * New features: - # Improved compatibility (supported Digital Mars C{plus}{plus}, MSVC 6, CodeWarrior 8, PGI C{plus}{plus}, Comeau, supported PS3 and XBox360) - # PUGIXML_NO_EXCEPTION flag for platforms without exception handling + . Improved compatibility (supported Digital Mars C{plus}{plus}, MSVC 6, CodeWarrior 8, PGI C{plus}{plus}, Comeau, supported PS3 and XBox360) + . 
PUGIXML_NO_EXCEPTION flag for platforms without exception handling -[h5 21.02.2007 - version 0.3] +[[v0.3]] +=== v0.3 ^21.02.2007^ Refactored, reworked and improved version. Changes: * Interface: - # Added XPath - # Added tree modification functions - # Added no STL compilation mode - # Added saving document to file - # Refactored parsing flags - # Removed xml_parser class in favor of xml_document - # Added transfer ownership parsing mode - # Modified the way xml_tree_walker works - # Iterators are now non-constant + . Added XPath + . Added tree modification functions + . Added no STL compilation mode + . Added saving document to file + . Refactored parsing flags + . Removed xml_parser class in favor of xml_document + . Added transfer ownership parsing mode + . Modified the way xml_tree_walker works + . Iterators are now non-constant * Implementation: - # Support of several compilers and platforms - # Refactored and sped up parsing core - # Improved standard compliancy - # Added XPath implementation - # Fixed several bugs + . Support of several compilers and platforms + . Refactored and sped up parsing core + . Improved standard compliancy + . Added XPath implementation + . Fixed several bugs -[h5 6.11.2006 - version 0.2] +[[v0.2]] +=== v0.2 ^6.11.2006^ First public release. Changes: * Bug fixes: - # Fixed child_value() (for empty nodes) - # Fixed xml_parser_impl warning at W4 + . Fixed child_value() (for empty nodes) + . Fixed xml_parser_impl warning at W4 * New features: - # Introduced child_value(name) and child_value_w(name) - # parse_eol_pcdata and parse_eol_attribute flags + parse_minimal optimizations - # Optimizations of strconv_t + . Introduced child_value(name) and child_value_w(name) + . parse_eol_pcdata and parse_eol_attribute flags + parse_minimal optimizations + . Optimizations of strconv_t -[h5 15.07.2006 - version 0.1] +[[v0.1]] +=== v0.1 ^15.07.2006^ First private release for testing purposes +:numbered: + [[apiref]] == API Reference |