author | Arseny Kapoulkine <arseny.kapoulkine@gmail.com> | 2015-05-03 11:42:19 -0700 |
---|---|---|
committer | Arseny Kapoulkine <arseny.kapoulkine@gmail.com> | 2015-05-03 11:42:19 -0700 |
commit | 873c8e50110348e3ccdb4627e994317522a47405 (patch) | |
tree | 05616c8bcb5cb5154216031d8c356608205e0822 /tests/data/utftest_utf32_le_bom.xml | |
parent | a6cc636a6b0d531686311b5666ea77225b10903e (diff) | |
parent | 9597265a122ce0ef8b2bb0099bb106ee85a74289 (diff) |
Merge pull request #42 from zeux/compact
Implement compact mode.
This introduces a new storage mode that dramatically reduces node size at some performance cost.
The mode is enabled by defining PUGIXML_COMPACT. This does not change API/ABI - all existing functionality still works.
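A minimal sketch of enabling the mode. The commit message only says the macro has to be defined; placing it in pugiconfig.hpp (or passing it on the compiler command line) is an assumption about a typical build, made so that the definition is visible when pugixml.cpp itself is compiled:

```cpp
// pugiconfig.hpp (sketch): enable compact storage for the whole build.
// Assumption: the same definition is seen by pugixml.cpp and by every file
// that includes pugixml.hpp. Passing -DPUGIXML_COMPACT to the compiler for
// all translation units should be an equivalent alternative.
#define PUGIXML_COMPACT
```

Since the API is unchanged, code that uses the library should not need any other modification.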
The pointers are stored as delta-encoded bytes, with some additional tricks to make the encoding more efficient for e.g. parent pointers and string pointers. Since the node has a fixed size, we have to fall back to a hash table if a pointer does not fit. Thus all DOM operations still have amortized constant complexity: a constant number of operations if the hash table is not needed, and amortized constant if it is.
Aside from some performance loss (which is inevitable since decoding takes time), the only other caveat is that we can't remove entries from the hash table, so in some edge cases with a lot of node removals the peak memory consumption can grow without bound. In theory entry removal can be implemented later; it's unclear whether it would be useful at this point.
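The encoding and its fallback can be illustrated with a small sketch. This is not the pugixml implementation: the class name `compact_ptr_sketch`, the `overflow_table` alias, the 4-byte step, and the choice of tag values are all assumptions made for illustration. A one-byte slot stores the distance to its target relative to the slot's own address; targets that are too far away (or misaligned) go into a side hash table keyed by the slot address, and entries are never erased, mirroring the caveat above.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical side table: slot address -> full pointer that did not fit.
using overflow_table = std::unordered_map<const void*, void*>;

template <typename T>
class compact_ptr_sketch
{
public:
    compact_ptr_sketch() = default;
    // The encoding is relative to the slot's own address, so slots are not copyable.
    compact_ptr_sketch(const compact_ptr_sketch&) = delete;
    compact_ptr_sketch& operator=(const compact_ptr_sketch&) = delete;

    void set(T* target, overflow_table& table)
    {
        if (!target) { data_ = null_tag; return; } // an old table entry, if any, stays behind

        const std::intptr_t diff =
            reinterpret_cast<std::intptr_t>(target) - reinterpret_cast<std::intptr_t>(this);
        const std::intptr_t steps = diff / unit;

        if (diff % unit == 0 && steps >= -126 && steps <= 127)
        {
            data_ = static_cast<std::uint8_t>(steps + bias); // stored value is in 1..254
        }
        else
        {
            table[this] = target; // insert or overwrite; this sketch never erases
            data_ = overflow_tag;
        }
    }

    T* get(const overflow_table& table) const
    {
        if (data_ == null_tag) return nullptr;

        if (data_ == overflow_tag)
        {
            auto it = table.find(this);
            assert(it != table.end());
            return static_cast<T*>(it->second);
        }

        const std::intptr_t self = reinterpret_cast<std::intptr_t>(this);
        return reinterpret_cast<T*>(self + (static_cast<std::intptr_t>(data_) - bias) * unit);
    }

private:
    static constexpr std::intptr_t unit = 4;         // assumed step size in bytes
    static constexpr std::intptr_t bias = 127;       // maps steps -126..127 to 1..254
    static constexpr std::uint8_t null_tag = 0;      // null pointer
    static constexpr std::uint8_t overflow_tag = 255; // "look in the hash table"

    std::uint8_t data_ = 0;
};

// Hypothetical element type used to exercise the slot.
struct item
{
    compact_ptr_sketch<item> next;
    int value;
};
```

With an array of `item`, linking neighbouring elements stays within the single byte, while linking the first element to a distant one goes through the table. Resetting that link afterwards leaves the table entry in place, which is the source of the memory growth described above.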
The resulting node/attribute sizes are as follows:
non-compact node: 28 bytes (32-bit), 56 bytes (64-bit)
compact node: 12 bytes (32-bit and 64-bit)
non-compact attribute: 20 bytes (32-bit), 40 bytes (64-bit)
compact attribute: 8 bytes (32-bit and 64-bit)
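To put the sizes above in perspective, here is a rough estimate of the memory taken by node and attribute structures alone. The per-object sizes come from the list above; the names `dom_sizes` and `structure_bytes` and the document shape are hypothetical, and the estimate ignores string data, allocator page overhead, and the fallback hash table.

```cpp
#include <cstddef>

// Per-object sizes in bytes, as quoted in the commit message.
struct dom_sizes
{
    std::size_t node_bytes;
    std::size_t attribute_bytes;
};

constexpr dom_sizes non_compact_32{28, 20};
constexpr dom_sizes non_compact_64{56, 40};
constexpr dom_sizes compact_any{12, 8};

// Bytes used by node and attribute structures only (no strings, no overhead).
constexpr std::size_t structure_bytes(dom_sizes s, std::size_t nodes, std::size_t attributes)
{
    return s.node_bytes * nodes + s.attribute_bytes * attributes;
}
```

For a hypothetical document with 1,000,000 nodes and 2,000,000 attributes, this gives 136,000,000 bytes in 64-bit non-compact mode against 28,000,000 bytes in compact mode, a reduction of roughly 4.9 times for the structures themselves.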
Diffstat (limited to 'tests/data/utftest_utf32_le_bom.xml')
0 files changed, 0 insertions, 0 deletions