From 80a8a77af46d39872426356f311b27934284e80b Mon Sep 17 00:00:00 2001
From: Arseny Kapoulkine
Date: Tue, 24 Mar 2015 10:03:08 -0700
Subject: docs: Finishing touches

It's almost done; the only remaining issue is that some section titles are too long.
---
 docs/manual.adoc     | 81 ++++++++++++++++++++++++++--------------------------
 docs/quickstart.adoc |  9 +++---
 2 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/docs/manual.adoc b/docs/manual.adoc
index f8b226d..de78eec 100644
--- a/docs/manual.adoc
+++ b/docs/manual.adoc
@@ -88,12 +88,13 @@ pugixml is distributed in source form. You can either download a source distribu
 
 [[install.getting.source]]
 ==== Source distributions
 
-You can download the latest source distribution via one of the following links:
+You can download the latest source distribution as an archive:
 
-* https://github.com/zeux/pugixml/releases/download/v{version}/pugixml-{version}.zip
-* https://github.com/zeux/pugixml/releases/download/v{version}/pugixml-{version}.tar.gz
+https://github.com/zeux/pugixml/releases/download/v{version}/pugixml-{version}.zip[pugixml-{version}.zip] (Windows line endings)
+/
+https://github.com/zeux/pugixml/releases/download/v{version}/pugixml-{version}.tar.gz[pugixml-{version}.tar.gz] (Unix line endings)
 
-The distribution contains library source, documentation (the manual you're reading now and the quick start guide) and some code examples. After downloading the distribution, install pugixml by extracting all files from the compressed archive. The files have different line endings depending on the archive format - `.zip` archive has Windows line endings, `.tar.gz` archive has Unix line endings. Otherwise the files in both archives are identical.
+The distribution contains library source, documentation (the manual you're reading now and the quick start guide) and some code examples. After downloading the distribution, install pugixml by extracting all files from the compressed archive.
 
If you need an older version, you can download it from the https://github.com/zeux/pugixml/releases[version archive]. @@ -156,16 +157,16 @@ The correct way to resolve this is to disable precompiled headers for `pugixml.c [[install.building.static]] ==== Building pugixml as a standalone static library -It's possible to compile pugixml as a standalone static library. This process depends on the method of building your application; pugixml distribution comes with project files for several popular IDEs/build systems. There are project files for Apple XCode3, Code::Blocks, Codelite, Microsoft Visual Studio 2005, 2008, 2010, and configuration scripts for CMake and premake4. You're welcome to submit project files/build scripts for other software; see <>. +It's possible to compile pugixml as a standalone static library. This process depends on the method of building your application; pugixml distribution comes with project files for several popular IDEs/build systems. There are project files for Apple XCode, Code::Blocks, Codelite, Microsoft Visual Studio 2005, 2008, 2010+, and configuration scripts for CMake and premake4. You're welcome to submit project files/build scripts for other software; see <>. There are two projects for each version of Microsoft Visual Studio: one for dynamically linked CRT, which has a name like `pugixml_vs2008.vcproj`, and another one for statically linked CRT, which has a name like `pugixml_vs2008_static.vcproj`. You should select the version that matches the CRT used in your application; the default option for new projects created by Microsoft Visual Studio is dynamically linked CRT, so unless you changed the defaults, you should use the version with dynamic CRT (i.e. `pugixml_vs2008.vcproj` for Microsoft Visual Studio 2008). -In addition to adding pugixml project to your workspace, you'll have to make sure that your application links with pugixml library. 
If you're using Microsoft Visual Studio 2005/2008, you can add a dependency from your application project to pugixml one. If you're using Microsoft Visual Studio 2010, you'll have to add a reference to your application project instead. For other IDEs/systems, consult the relevant documentation. +In addition to adding pugixml project to your workspace, you'll have to make sure that your application links with pugixml library. If you're using Microsoft Visual Studio 2005/2008, you can add a dependency from your application project to pugixml one. If you're using Microsoft Visual Studio 2010+, you'll have to add a reference to your application project instead. For other IDEs/systems, consult the relevant documentation. [cols="4*a",frame=none,options=header] |=== 2+| Microsoft Visual Studio 2005/2008 -2+| Microsoft Visual Studio 2010 +2+| Microsoft Visual Studio 2010+ | image::vs2005_link1.png[link="images/vs2005_link1.png"] | image::vs2005_link2.png[link="images/vs2005_link2.png"] | image::vs2010_link1.png[link="images/vs2010_link1.png"] @@ -186,7 +187,7 @@ It's possible to compile pugixml as a standalone shared library. The process is #endif ---- -CAUTION: If you're using STL-related functions, you should use the shared runtime library to ensure that a single heap is used for STL allocations in your application and in pugixml; in MSVC, this means selecting the 'Multithreaded DLL' or 'Multithreaded Debug DLL' to 'Runtime library' property (/MD or /MDd linker switch). You should also make sure that your runtime library choice is consistent between different projects. +CAUTION: If you're using STL-related functions, you should use the shared runtime library to ensure that a single heap is used for STL allocations in your application and in pugixml; in MSVC, this means selecting the 'Multithreaded DLL' or 'Multithreaded Debug DLL' to 'Runtime library' property (`/MD` or `/MDd` linker switch). 
You should also make sure that your runtime library choice is consistent between different projects. [[install.building.header]] ==== Using pugixml in header-only mode @@ -227,7 +228,7 @@ NOTE: In that example `PUGIXML_API` is inconsistent between several source files [[PUGIXML_MEMORY_PAGE_SIZE]]`PUGIXML_MEMORY_PAGE_SIZE`, [[PUGIXML_MEMORY_OUTPUT_STACK]]`PUGIXML_MEMORY_OUTPUT_STACK` and [[PUGIXML_MEMORY_XPATH_PAGE_SIZE]]`PUGIXML_MEMORY_XPATH_PAGE_SIZE` can be used to customize certain important sizes to optimize memory usage for the application-specific patterns. For details see <>. -[[PUGIXML_HAS_LONG_LONG]]`PUGIXML_HAS_LONG_LONG` define enables support for `long long` type in pugixml. This define is automatically enabled if your platform is known to have `long long` support (i.e. has C{plus}{plus}-11 support or uses a reasonably modern version of a known compiler); if pugixml does not recognize that your platform supports `long long` but in fact it does, you can enable the define manually. +[[PUGIXML_HAS_LONG_LONG]]`PUGIXML_HAS_LONG_LONG` define enables support for `long long` type in pugixml. This define is automatically enabled if your platform is known to have `long long` support (i.e. has C{plus}{plus}11 support or uses a reasonably modern version of a known compiler); if pugixml does not recognize that your platform supports `long long` but in fact it does, you can enable the define manually. [[install.portability]] === Portability @@ -289,7 +290,7 @@ Here `"node"` element has three children, two of which are PCDATA nodes with val ---- + -CDATA nodes make it easy to include non-escaped <, & and > characters in plain text. CDATA value can not contain the character sequence ]]>, since it is used to determine the end of node contents. +CDATA nodes make it easy to include non-escaped `<`, `&` and `>` characters in plain text. CDATA value can not contain the character sequence `]]>`, since it is used to determine the end of node contents. 
* Comment nodes ([[node_comment]]`node_comment`) represent comments in XML. Comment nodes have a value, but do not have a name or children/attributes. The example XML representation of a comment node is as follows: + @@ -442,7 +443,7 @@ Most examples in this documentation assume char interface and therefore will not `xml_node node = doc.child("bookstore").find_child_by_attribute("book", "id", "12345");` -you'll have to do +you'll have to use `xml_node node = doc.child(L"bookstore").find_child_by_attribute(L"book", L"id", L"12345");` ==== @@ -533,7 +534,7 @@ Constructing a document object using the default constructor does not result in When the document is loaded from file/buffer, unless an inplace loading function is used (see <>), a complete copy of character stream is made; all names/values of nodes and attributes are allocated in this buffer. This buffer is allocated via a single large allocation and is only freed when document memory is reclaimed (i.e. if the <> object is destroyed or if another document is loaded in the same object). Also when loading from file or stream, an additional large allocation may be performed if encoding conversion is required; a temporary buffer is allocated, and it is freed before load function returns. -All additional memory, such as memory for document structure (node/attribute objects) and memory for node/attribute names/values is allocated in pages on the order of 32 kilobytes; actual objects are allocated inside the pages using a memory management scheme optimized for fast allocation/deallocation of many small objects. Because of the scheme specifics, the pages are only destroyed if all objects inside them are destroyed; also, generally destroying an object does not mean that subsequent object creation will reuse the same memory. 
This means that it is possible to devise a usage scheme which will lead to higher memory usage than expected; one example is adding a lot of nodes, and them removing all even numbered ones; not a single page is reclaimed in the process. However this is an example specifically crafted to produce unsatisfying behavior; in all practical usage scenarios the memory consumption is less than that of a general-purpose allocator because allocation meta-data is very small in size. +All additional memory, such as memory for document structure (node/attribute objects) and memory for node/attribute names/values is allocated in pages on the order of 32 Kb; actual objects are allocated inside the pages using a memory management scheme optimized for fast allocation/deallocation of many small objects. Because of the scheme specifics, the pages are only destroyed if all objects inside them are destroyed; also, generally destroying an object does not mean that subsequent object creation will reuse the same memory. This means that it is possible to devise a usage scheme which will lead to higher memory usage than expected; one example is adding a lot of nodes, and them removing all even numbered ones; not a single page is reclaimed in the process. However this is an example specifically crafted to produce unsatisfying behavior; in all practical usage scenarios the memory consumption is less than that of a general-purpose allocator because allocation meta-data is very small in size. [[loading]] == Loading document @@ -571,7 +572,7 @@ include::samples/load_file.cpp[tags=code] === Loading document from memory [[xml_document::load_buffer]][[xml_document::load_buffer_inplace]][[xml_document::load_buffer_inplace_own]] -Sometimes XML data should be loaded from some other source than a file, i.e. HTTP URL; also you may want to load XML data from file using non-standard functions, i.e. to use your virtual file system facilities or to load XML from gzip-compressed files. 
All these scenarios require loading document from memory. First you should prepare a contiguous memory block with all XML data; then you have to invoke one of buffer loading functions. These functions will handle the necessary encoding conversions, if any, and then will parse the data into the corresponding XML tree. There are several buffer loading functions, which differ in the behavior and thus in performance/memory usage: +Sometimes XML data should be loaded from some other source than a file, i.e. HTTP URL; also you may want to load XML data from file using non-standard functions, i.e. to use your virtual file system facilities or to load XML from GZip-compressed files. All these scenarios require loading document from memory. First you should prepare a contiguous memory block with all XML data; then you have to invoke one of buffer loading functions. These functions will handle the necessary encoding conversions, if any, and then will parse the data into the corresponding XML tree. There are several buffer loading functions, which differ in the behavior and thus in performance/memory usage: [source] ---- @@ -673,7 +674,7 @@ Parsing status is represented as the `xml_parse_status` enumeration and can be o * [[status_out_of_memory]]`status_out_of_memory` means that there was not enough memory during some allocation; any allocation failure during parsing results in this error. * [[status_internal_error]]`status_internal_error` means that something went horribly wrong; currently this error does not occur -* [[status_unrecognized_tag]]`status_unrecognized_tag` means that parsing stopped due to a tag with either an empty name or a name which starts with incorrect character, such as #. +* [[status_unrecognized_tag]]`status_unrecognized_tag` means that parsing stopped due to a tag with either an empty name or a name which starts with incorrect character, such as `#`. 
* [[status_bad_pi]]`status_bad_pi` means that parsing stopped due to incorrect document declaration/processing instruction * [[status_bad_comment]]`status_bad_comment`, [[status_bad_cdata]]`status_bad_cdata`, [[status_bad_doctype]]`status_bad_doctype` and [[status_bad_pcdata]]`status_bad_pcdata` mean that parsing stopped due to the invalid construct of the respective type * [[status_bad_start_element]]`status_bad_start_element` means that parsing stopped because starting tag either had no closing `>` symbol or contained some incorrect symbol @@ -736,7 +737,7 @@ CAUTION: Using in-place parsing (<> node), if the node handle is null, or if there is insufficient memory to handle the request. The provided string is copied into document managed memory and can be destroyed after the function returns (for example, you can safely pass stack-allocated buffers to these functions). The name/value content is not verified, so take care to use only valid XML names, or the document may become malformed. -There is no equivalent of <> function for modifying text children of the node. - This is an example of setting node name and value (link:samples/modify_base.cpp[]): [source,indent=0] @@ -1556,7 +1555,7 @@ Since `append_buffer` needs to append child nodes to the current node, it only w Often after creating a new document or loading the existing one and processing it, it is necessary to save the result back to file. Also it is occasionally useful to output the whole document or a subtree to some stream; use cases include debug printing, serialization via network or other text-oriented medium, etc. pugixml provides several functions to output any subtree of the document to a file, stream or another generic transport interface; these functions allow to customize the output format (see <>), and also perform necessary encoding conversions (see <>). This section documents the relevant functionality. 
-Before writing to the destination the node/attribute data is properly formatted according to the node type; all special XML symbols, such as < and &, are properly escaped (unless <> flag is set). In order to guard against forgotten node/attribute names, empty node/attribute names are printed as `":anonymous"`. For well-formed output, make sure all node and attribute names are set to meaningful values. +Before writing to the destination the node/attribute data is properly formatted according to the node type; all special XML symbols, such as `<` and `&`, are properly escaped (unless <> flag is set). In order to guard against forgotten node/attribute names, empty node/attribute names are printed as `":anonymous"`. For well-formed output, make sure all node and attribute names are set to meaningful values. CDATA sections with values that contain `"]]>"` are split into several sections as follows: section with value `"pre]]>post"` is written as `post]]>`. While this alters the structure of the document (if you load the document after saving it, there will be two CDATA sections instead of one), this is the only way to escape CDATA contents. @@ -1673,7 +1672,7 @@ These flags control the resulting tree contents: * [[format_raw]]`format_raw` switches between formatted and raw output. If this flag is on, the nodes are not indented in any way, and also no newlines that are not part of document text are printed. Raw mode can be used for serialization where the result is not intended to be read by humans; also it can be useful if the document was parsed with <> flag, to preserve the original document formatting as much as possible. This flag is *off* by default. -* [[format_no_escapes]]`format_no_escapes` disables output escaping for attribute values and PCDATA contents. If this flag is off, special symbols (', &, <, >) and all non-printable characters (those with codepoint values less than 32) are converted to XML escape sequences (i.e. &) during output. 
If this flag is on, no text processing is performed; therefore, output XML can be malformed if output contents contains invalid symbols (i.e. having a stray < in the PCDATA will make the output malformed). This flag is *off* by default. +* [[format_no_escapes]]`format_no_escapes` disables output escaping for attribute values and PCDATA contents. If this flag is off, special symbols (`"`, `&`, `<`, `>`) and all non-printable characters (those with codepoint values less than 32) are converted to XML escape sequences (i.e. `&amp;`) during output. If this flag is on, no text processing is performed; therefore, output XML can be malformed if output contents contains invalid symbols (i.e. having a stray `<` in the PCDATA will make the output malformed). This flag is *off* by default. These flags control the additional output information: @@ -2046,7 +2045,7 @@ If exceptions are disabled, then in the event of parsing failure the query is in const xpath_parse_result& xpath_query::result() const; ---- -Without exceptions, evaluating invalid query results in `false`, empty string, NaN or an empty node set, depending on the type; evaluating a query as a node set results in an empty node set if the return type is not node set. +Without exceptions, evaluating invalid query results in `false`, empty string, `NaN` or an empty node set, depending on the type; evaluating a query as a node set results in an empty node set if the return type is not node set. [[xpath_parse_result]] The information about parsing result is returned via `xpath_parse_result` object. It contains parsing status and the offset of last successfully parsed character from the beginning of the source stream: @@ -2089,7 +2088,7 @@ Because of the differences in document object models, performance considerations * Consecutive text nodes sharing the same parent are not merged, i.e. in `text1 text2` node should have one text node child, but instead has three. 
* Since the document type declaration is not used for parsing, `id()` function always returns an empty node set. -* Namespace nodes are not supported (affects namespace:: axis). +* Namespace nodes are not supported (affects `namespace::` axis). * Name tests are performed on QNames in XML document instead of expanded names; for ``, query `foo/ns1:*` will return only the first child, not both of them. Compliant XPath implementations can return both nodes if the user provides appropriate namespace declarations. * String functions consider a character to be either a single `char` value or a single `wchar_t` value, depending on the library configuration; this means that some string functions are not fully Unicode-aware. This affects `substring()`, `string-length()` and `translate()` functions. @@ -2577,7 +2576,7 @@ const unsigned int +++parse_wnorm_attribute [source,subs="+macros"] ---- -class +++xml_attribute+++ ++++class xml_attribute+++ +++xml_attribute+++(); bool +++empty+++() const; @@ -2626,7 +2625,7 @@ class +++xml_attribute+++ xml_attribute& +++operator=+++(long long rhs); xml_attribute& +++operator=+++(unsnigned long long rhs); -class +++xml_node+++ ++++class xml_node+++ +++xml_node+++(); bool +++empty+++() const; @@ -2738,7 +2737,7 @@ class +++xml_node+++ xpath_node_set +++select_nodes+++(const char_t* query, xpath_variable_set* variables = 0) const; xpath_node_set +++select_nodes+++(const xpath_query& query) const; -class +++xml_document+++ ++++class xml_document+++ +++xml_document+++(); ~+++xml_document+++(); @@ -2767,7 +2766,7 @@ class +++xml_document+++ xml_node +++document_element+++() const; -struct +++xml_parse_result+++ ++++struct xml_parse_result+++ xml_parse_status +++status+++; ptrdiff_t +++offset+++; xml_encoding +++encoding+++; @@ -2775,17 +2774,17 @@ struct +++xml_parse_result+++ operator +++bool+++() const; const char* +++description+++() const; -class +++xml_node_iterator+++ -class +++xml_attribute_iterator+++ ++++class xml_node_iterator+++ 
++++class xml_attribute_iterator+++ -class +++xml_tree_walker+++ ++++class xml_tree_walker+++ virtual bool +++begin+++(xml_node& node); virtual bool +++for_each+++(xml_node& node) = 0; virtual bool +++end+++(xml_node& node); int +++depth+++() const; -class +++xml_text+++ ++++class xml_text+++ bool +++empty+++() const; operator +++xml_text::unspecified_bool_type+++() const; @@ -2821,24 +2820,24 @@ class +++xml_text+++ xml_node +++data+++() const; -class +++xml_writer+++ ++++class xml_writer+++ virtual void +++write+++(const void* data, size_t size) = 0; -class +++xml_writer_file+++: public xml_writer ++++class xml_writer_file+++: public xml_writer +++xml_writer_file+++(void* file); -class +++xml_writer_stream+++: public xml_writer ++++class xml_writer_stream+++: public xml_writer +++xml_writer_stream+++(std::ostream& stream); +++xml_writer_stream+++(std::wostream& stream); -struct +++xpath_parse_result+++ ++++struct xpath_parse_result+++ const char* +++error+++; ptrdiff_t +++offset+++; operator +++bool+++() const; const char* +++description+++() const; -class +++xpath_query+++ ++++class xpath_query+++ explicit +++xpath_query+++(const char_t* query, xpath_variable_set* variables = 0); bool +++evaluate_boolean+++(const xpath_node& n) const; @@ -2853,12 +2852,12 @@ class +++xpath_query+++ const xpath_parse_result& +++result+++() const; operator +++unspecified_bool_type+++() const; -class +++xpath_exception+++: public std::exception ++++class xpath_exception+++: public std::exception virtual const char* +++what+++() const throw(); const xpath_parse_result& +++result+++() const; -class +++xpath_node+++ ++++class xpath_node+++ +++xpath_node+++(); +++xpath_node+++(const xml_node& node); +++xpath_node+++(const xml_attribute& attribute, const xml_node& parent); @@ -2871,7 +2870,7 @@ class +++xpath_node+++ bool +++operator==+++(const xpath_node& n) const; bool +++operator!=+++(const xpath_node& n) const; -class +++xpath_node_set+++ ++++class xpath_node_set+++ 
+++xpath_node_set+++(); +++xpath_node_set+++(const_iterator begin, const_iterator end, type_t type = type_unsorted); @@ -2889,7 +2888,7 @@ class +++xpath_node_set+++ type_t +++type+++() const; void +++sort+++(bool reverse = false); -class +++xpath_variable+++ ++++class xpath_variable+++ const char_t* +++name+++() const; xpath_value_type +++type+++() const; @@ -2903,7 +2902,7 @@ class +++xpath_variable+++ bool +++set+++(const char_t* value); bool +++set+++(const xpath_node_set& value); -class +++xpath_variable_set+++ ++++class xpath_variable_set+++ xpath_variable* +++add+++(const char_t* name, xpath_value_type type); bool +++set+++(const char_t* name, bool value); diff --git a/docs/quickstart.adoc b/docs/quickstart.adoc index 9084448..4807524 100644 --- a/docs/quickstart.adoc +++ b/docs/quickstart.adoc @@ -15,12 +15,13 @@ NOTE: No documentation is perfect; neither is this one. If you find errors or om [[install]] == Installation -pugixml is distributed in source form. You can download a source distribution via one of the following links: +You can download the latest source distribution as an archive: -* https://github.com/zeux/pugixml/releases/download/v{version}/pugixml-{version}.zip -* https://github.com/zeux/pugixml/releases/download/v{version}/pugixml-{version}.tar.gz +https://github.com/zeux/pugixml/releases/download/v{version}/pugixml-{version}.zip[pugixml-{version}.zip] (Windows line endings) +/ +https://github.com/zeux/pugixml/releases/download/v{version}/pugixml-{version}.tar.gz[pugixml-{version}.tar.gz] (Unix line endings) -The distribution contains library source, documentation (the guide you're reading now and the manual) and some code examples. After downloading the distribution, install pugixml by extracting all files from the compressed archive. The files have different line endings depending on the archive format - `.zip` archive has Windows line endings, `.tar.gz` archive has Unix line endings. Otherwise the files in both archives are identical. 
+The distribution contains library source, documentation (the guide you're reading now and the manual) and some code examples. After downloading the distribution, install pugixml by extracting all files from the compressed archive.
 
 The complete pugixml source consists of three files - one source file, `pugixml.cpp`, and two header files, `pugixml.hpp` and `pugiconfig.hpp`. `pugixml.hpp` is the primary header which you need to include in order to use pugixml classes/functions. The rest of this guide assumes that `pugixml.hpp` is either in the current directory or in one of include directories of your projects, so that `#include "pugixml.hpp"` can find the header; however you can also use relative path (i.e. `#include "../libs/pugixml/src/pugixml.hpp"`) or include directory-relative path (i.e. `#include `).
-- 
cgit v1.2.3