Paper
21 February 2012
XML data compression in web publishing
Ruiheng Qiu, Wei Hu, Zhi Tang, Xiaoqing Lu, Lei Zhang
Proceedings Volume 8302, Imaging and Printing in a Web 2.0 World III; 83020I (2012) https://doi.org/10.1117/12.905400
Event: IS&T/SPIE Electronic Imaging, 2012, Burlingame, California, United States
Abstract
XML is widely used in document formats on the web, but it has drawbacks: documents take a long time to distribute over the web, and content jumping and rendering are slow, especially on mobile devices. We therefore propose XTrim, a schema-based, efficient, queryable XML compressor that significantly improves the compression ratio by exploiting optimized information in the XML Schema while still supporting efficient queries. First, XTrim extracts structure information from the XML document and its corresponding XML Schema. A novel technique then transforms the tree-like XML structure into a compact indexed form that supports efficient queries. At the same time, text values are extracted, and a language-based text trim method (LTT), which facilitates language-specific text compressors, is applied to reduce the size of text values in various languages. Within LTT, a word composition detection method is proposed to better process text in non-Latin languages. To evaluate XTrim, we implemented a prototype compressor and query engine. Extensive experiments show that XTrim outperforms XMill and existing queryable alternatives in both compression ratio and query efficiency. Applying XTrim to documents saves up to 30% of storage space and reduces the content jumping and rendering delay from 4 seconds to less than 100 ms.
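The general idea of separating structure from text and querying the indexed structure can be illustrated with a minimal Python sketch. This is not XTrim's actual algorithm, which the abstract does not specify: the function names, the parent-pointer index, and the per-tag text containers are all assumptions made for illustration. The sketch also ignores the XML Schema, attributes, and mixed content, and it uses zlib as a stand-in for the language-specific text compressors.

```python
import xml.etree.ElementTree as ET
import zlib


def split_structure_and_text(xml_text):
    """Split a document into a tag dictionary, a structure index, and text containers.

    Hypothetical layout (not taken from the paper):
      tag_ids    : tag name -> small integer code
      structure  : one (tag_id, parent_index) pair per element, in document order
      containers : tag_id -> list of (node_index, text) for elements with that tag
    """
    root = ET.fromstring(xml_text)
    tag_ids, structure, containers = {}, [], {}

    def visit(elem, parent_index):
        tag_id = tag_ids.setdefault(elem.tag, len(tag_ids))
        node_index = len(structure)
        structure.append((tag_id, parent_index))
        text = (elem.text or "").strip()
        if text:
            containers.setdefault(tag_id, []).append((node_index, text))
        for child in elem:
            visit(child, node_index)

    visit(root, -1)  # -1 marks "no parent" for the root element
    return tag_ids, structure, containers


def compress_containers(containers):
    """Compress each text container separately (zlib is only a placeholder codec)."""
    return {
        tag_id: zlib.compress("\n".join(text for _, text in items).encode("utf-8"))
        for tag_id, items in containers.items()
    }


def query_text(tag_ids, structure, containers, path):
    """Return text values of elements whose full root-to-node tag path equals `path`."""
    wanted = [tag_ids.get(name) for name in path.split("/")]
    if None in wanted:
        return []  # some tag in the path never occurs in the document
    results = []
    for node_index, text in containers.get(wanted[-1], []):
        current, matched = node_index, True
        # Follow parent pointers upward, comparing against the path from its last step.
        for tag_id in reversed(wanted):
            if current == -1 or structure[current][0] != tag_id:
                matched = False
                break
            current = structure[current][1]
        if matched and current == -1:  # the path must start at the root
            results.append(text)
    return results


if __name__ == "__main__":
    sample = (
        "<library><book><title>XML</title><author>Qiu</author></book>"
        "<book><title>Web</title></book></library>"
    )
    tag_ids, structure, containers = split_structure_and_text(sample)
    print(query_text(tag_ids, structure, containers, "library/book/title"))
```

In this sketch a path query only touches the integer structure index and the one container holding the requested tag, so the original document never needs to be re-parsed. That is the kind of property a queryable compressor aims for, although the real system may achieve it quite differently.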
© (2012) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ruiheng Qiu, Wei Hu, Zhi Tang, Xiaoqing Lu, and Lei Zhang "XML data compression in web publishing", Proc. SPIE 8302, Imaging and Printing in a Web 2.0 World III, 83020I (21 February 2012); https://doi.org/10.1117/12.905400
KEYWORDS
Data compression
Associative arrays
Prototyping
Computer science
Data processing
Mobile devices
Silicon