summaryrefslogtreecommitdiff
path: root/docs
diff options
context:
space:
mode:
authorArseny Kapoulkine <arseny.kapoulkine@gmail.com>2016-01-12 20:38:45 -0800
committerArseny Kapoulkine <arseny.kapoulkine@gmail.com>2016-01-12 20:38:45 -0800
commit85238132d3e3029d0454d5e16222a7d48b26e381 (patch)
treea901c39f63112851d0df1bad694f1fa770426ef7 /docs
parent71d3a797f4c1991070afb868e584f1ab3e8ad669 (diff)
docs: Add parse_embed_pcdata documentation
Diffstat (limited to 'docs')
-rw-r--r--docs/manual.adoc4
1 files changed, 4 insertions, 0 deletions
diff --git a/docs/manual.adoc b/docs/manual.adoc
index ccf3fe7..1d8d88a 100644
--- a/docs/manual.adoc
+++ b/docs/manual.adoc
@@ -746,6 +746,9 @@ These flags control the resulting tree contents:
* [[parse_ws_pcdata_single]]`parse_ws_pcdata_single` determines if whitespace-only PCDATA nodes that have no sibling nodes are to be put in DOM tree. In some cases application needs to parse the whitespace-only contents of nodes, i.e. `<node> </node>`, but is not interested in whitespace markup elsewhere. It is possible to use <<parse_ws_pcdata,parse_ws_pcdata>> flag in this case, but it results in excessive allocations and complicates document processing; this flag can be used to avoid that. As an example, after parsing XML string `<node> <a> </a> </node>` with `parse_ws_pcdata_single` flag set, `<node>` element will have one child `<a>`, and `<a>` element will have one child with type <<node_pcdata,node_pcdata>> and value `" "`. This flag has no effect if <<parse_ws_pcdata,parse_ws_pcdata>> is enabled. This flag is *off* by default.
+* [[parse_embed_pcdata]]`parse_embed_pcdata` determines if PCDATA contents is to be saved as element values. Normally element nodes have names but not values; this flag forces the parser to store the contents as a value if PCDATA is the first child of the element node (otherwise PCDATA node is created as usual). This can significantly reduce the memory required for documents with many PCDATA nodes. To retrieve the data you can use `xml_node::value()` on the element nodes or any of the higher-level functions like `child_value` or `text`. This flag is *off* by default.
+Since this flag significantly changes the DOM structure it is only recommended for parsing documents with many PCDATA nodes in memory-constrained environments. This flag is *off* by default.
+
* [[parse_fragment]]`parse_fragment` determines if document should be treated as a fragment of a valid XML. Parsing document as a fragment leads to top-level PCDATA content (i.e. text that is not located inside a node) to be added to a tree, and additionally treats documents without element nodes as valid. This flag is *off* by default.
CAUTION: Using in-place parsing (<<xml_document::load_buffer_inplace,load_buffer_inplace>>) with `parse_fragment` flag may result in the loss of the last character of the buffer if it is a part of PCDATA. Since PCDATA values are null-terminated strings, the only way to resolve this is to provide a null-terminated buffer as an input to `load_buffer_inplace` - i.e. `doc.load_buffer_inplace("test\0", 5, pugi::parse_default | pugi::parse_fragment)`.
@@ -2611,6 +2614,7 @@ const unsigned int +++<a href="#parse_pi">parse_pi</a>+++
const unsigned int +++<a href="#parse_trim_pcdata">parse_trim_pcdata</a>+++
const unsigned int +++<a href="#parse_ws_pcdata">parse_ws_pcdata</a>+++
const unsigned int +++<a href="#parse_ws_pcdata_single">parse_ws_pcdata_single</a>+++
+const unsigned int +++<a href="#parse_embed_pcdata">parse_embed_pcdata</a>+++
const unsigned int +++<a href="#parse_wconv_attribute">parse_wconv_attribute</a>+++
const unsigned int +++<a href="#parse_wnorm_attribute">parse_wnorm_attribute</a>+++
----