From 900a1cc94353b9202dcaee66b95d67e31331940e Mon Sep 17 00:00:00 2001
From: Arseny Kapoulkine
Date: Tue, 29 Aug 2017 20:46:30 -0700
Subject: docs: Clarify Unicode validation behavior

It has always been the case that pugixml does not perform Unicode validation or name/tag Unicode character class validation, but it wasn't very obvious from documentation.

Fixes #162
---
 docs/manual.html | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/docs/manual.html b/docs/manual.html
index 627f570..1bed481 100644
--- a/docs/manual.html
+++ b/docs/manual.html
@@ -1941,7 +1941,7 @@ The current behavior for Unicode conversion is to skip all invalid UTF sequences

Multiple attributes of the same node can have equal names.

-  • All non-ASCII characters are treated in the same way as symbols of English alphabet, so some invalid tag names are not rejected.
+  • Tag and attribute names are not fully validated for consisting of allowed characters, so some invalid tags are not rejected

  • Attribute values which contain < are not rejected.

@@ -1958,6 +1958,9 @@ The current behavior for Unicode conversion is to skip all invalid UTF sequences
  • Invalid document type declarations are silently ignored in some cases.

+  • Unicode validation is not performed so invalid UTF sequences are not rejected.
@@ -5672,7 +5675,7 @@ If exceptions are disabled, then in the event of parsing failure the query is in
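
Note on the clarified behavior: since the patch documents that invalid UTF sequences are not rejected, a caller that needs strict input may want to check the buffer before handing it to the parser. The following is only a minimal sketch under stated assumptions: the input is UTF-8 held in a `std::string`, and `is_valid_utf8` is a hypothetical helper written for this illustration, not a pugixml function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of pugixml): returns true if `data` is
// well-formed UTF-8. Rejects truncated sequences, stray continuation bytes,
// overlong encodings, UTF-16 surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& data)
{
    const unsigned char* s = reinterpret_cast<const unsigned char*>(data.data());
    const std::size_t n = data.size();
    std::size_t i = 0;

    while (i < n)
    {
        const unsigned char lead = s[i];
        std::size_t extra = 0;   // continuation bytes expected after the lead byte
        unsigned int cp = 0;     // code point being decoded
        unsigned int min_cp = 0; // smallest code point allowed for this length

        if (lead < 0x80)                { extra = 0; cp = lead;        min_cp = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; min_cp = 0x10000; }
        else return false; // stray continuation byte or invalid lead byte

        if (extra > n - i - 1) return false; // sequence runs past the end

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min_cp) return false;                  // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                // outside Unicode range

        i += extra + 1;
    }

    return true;
}
```

A caller could run this check on the raw buffer and refuse to parse when it returns false. It covers byte-level well-formedness only; it does not check tag or attribute names against the XML name character classes.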