Why Off-The-Shelf RDBMSs are Better at XPath Than You Might Expect
Publication Details
Title
Why Off-The-Shelf RDBMSs are Better at XPath Than You Might Expect
Authors
Torsten Grust, Jan Rittinger, and Jens Teubner
Published
Proceedings of the 2007 ACM SIGMOD Conference on Management of Data (Industrial Track)
Download
paper (PDF), presentation slides (PDF)
Abstract
To compensate for the inherent impedance mismatch between the relational data model (tables of tuples) and XML (ordered, unranked trees), tree join algorithms have become the prevalent means to process XML data in relational databases, most notably the TwigStack, structural join, and staircase join algorithms. However, the addition of these algorithms to existing systems depends on a significant invasion of the underlying database kernel, an option intolerable for most database vendors.
Here, we demonstrate that we can achieve comparable XPath performance without touching the heart of the system. We carefully exploit existing database functionality and accelerate XPath navigation by purely relational means: partitioned B-trees bring access costs to secondary storage to a minimum, while aggregation functions avoid an expensive computation and removal of duplicate result nodes to comply with the XPath semantics. Experiments carried out on IBM DB2 confirm that our approach can turn off-the-shelf database systems into efficient XPath processors.
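As a rough illustration of what "purely relational means" can look like, the sketch below evaluates the XPath step //a//b in plain SQL. It is not taken from the paper. The pre/size node encoding, the table and column names, the toy document, and the use of SQLite instead of IBM DB2 are all assumptions made for this example. A composite index with a low-cardinality leading column (name) stands in for the partitioned B-tree idea, and a MAX aggregate is one possible way to return each result node once without a separate duplicate-removal pass.

```python
import sqlite3

# Hypothetical toy document: <root><a><b/><a><b/></a></a><b/></root>
# pre  = preorder rank of the node
# size = number of nodes in the subtree below it
# The descendants of node v are exactly the nodes with v.pre < pre <= v.pre + v.size.
ROWS = [
    (0, 5, "root"),
    (1, 3, "a"),
    (2, 0, "b"),
    (3, 1, "a"),
    (4, 0, "b"),
    (5, 0, "b"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc (pre INTEGER PRIMARY KEY, size INTEGER, name TEXT)")
conn.executemany("INSERT INTO doc VALUES (?, ?, ?)", ROWS)

# Assumption: a composite index led by the low-cardinality column `name`
# approximates a partitioned B-tree (one partition per element name).
conn.execute("CREATE INDEX doc_name_pre ON doc (name, pre, size)")

# Naive range join: node 4 lies below both <a> elements and is returned twice,
# so XPath semantics would require a DISTINCT on top of this query.
JOIN_SQL = """
SELECT n.pre
FROM doc AS c JOIN doc AS n
  ON n.pre > c.pre AND n.pre <= c.pre + c.size
WHERE c.name = 'a' AND n.name = 'b'
ORDER BY n.pre
"""

# Aggregate formulation: a <b> node qualifies if the furthest-reaching <a>
# subtree that starts before it still covers it. Each <b> is tested once,
# so no duplicates are produced in the first place.
AGG_SQL = """
SELECT n.pre
FROM doc AS n
WHERE n.name = 'b'
  AND (SELECT MAX(c.pre + c.size)
       FROM doc AS c
       WHERE c.name = 'a' AND c.pre < n.pre) >= n.pre
ORDER BY n.pre
"""

with_duplicates = [row[0] for row in conn.execute(JOIN_SQL)]
without_duplicates = [row[0] for row in conn.execute(AGG_SQL)]

print(with_duplicates)
print(without_duplicates)
```

On this toy document the join returns node 4 twice because the two a elements nest, while the aggregate query lists each matching b node once. How closely this mirrors the queries and index layout used in the paper's DB2 experiments is not something this page states.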
Publication Log
March 2007
camera-ready for SIGMOD 2007
November 2006
submission to SIGMOD 2007 (accepted)
- reviews (results: accept, neutral)
June 2006
submission to ICDE 2007 (rejected)
- reviews (results: weak reject, weak accept, weak accept)