Reviews for paper Recursion in XQuery: Put Your Distributivity Safety Belt On, submitted to EDBT 2009.

**Overall rating:** accept

Definitely relevant

One step forward

Good work

Well written

Probably accept

The paper describes in detail an extension of XQuery with an inflationary fixpoint operator. The paper describes the operator and its semantics, a naive evaluation method and a "Delta" evaluation method similar to semi-naive evaluation of Datalog. The authors point out that Delta is not correct in general, but then define an intuitive semantic condition, called distributivity, that ensures Delta is correct.

They also provide a sufficient linear time syntactic check for distributivity. They show how to implement the operator on top of a relational back-end, by integrating it, as well as a distributivity check, in MonetDB/XQuery. The paper also includes experiments showing the expected benefit of Seminaive evaluation for fixpoint operations.

The paper is written very well, with enough examples as well as a very clear writing style. The contribution is not particularlydeep but it is complete, correct and bound to be very useful to XQuery compilers.

Specific comments:

Proposition 3 appears incorrect. The problem case is of course when e($x) is distributive but NOT distributivity-safe. It is not clear how Rule FOR2 helps in this case.

The discussion of possible extensions in Section 8 is too cursory to be of any use. Specifically, claiming that the large body of work on Datalog evaluation is "directly" transferable or that SQL:1999 WITH RECURSIVE evaluation can benefit from distributivity checks reminds one (on a different scale) of Fermat's note on about having the proof for his last theorem.

Appropriate to the point

A pioneer work

Good work

Well written

Definitely accept

The paper proposes a new construct for expressing a restricted form of recursive queries in XQuery. Although XQuery already supports user-defined recursive operators, the way they are evaluated by the query processor is dictated by the way these operators are defined by the user. The new construct is a high-level template which allows to express most common forms of recursive queries (including transitive closures and all Regular Xpath), leaving to the query processor the optimization of the evaluation. The authors show a significant gain of performances in this direction.

The new operator has a fixed point semantics, known from relational query languages. Also the proposed algorithms for the evaluation of the fixed point are precisely the classical Naive and Delta algorithms, directly borrowed from the relational context. Nevertheless, the use of XQuery expressions in combination with these existing algorithms poses new interesting questions. In particular in order for the more efficient Delta algorithm to implement the correct semantics, queries have to satisfy a distributivity property, which guarantees that the fix point can be computed with less redundant computation.

The core of the paper concentrates on the problem of assessing the distributivity property for Xquery expressions. Although the distributivity property is shown undecidable, the authors first propose a syntactic safety restriction guaranteeing distributivity, and then most importantly show how to approximate the distributivity recognition directly on the relational translation of Xquery expressions, in linear time. This allows to embed the distributivity check in the query optimization process.

The paper is well written and pleasant to read. Results are interesting and experiments show that queries recognized as distributive are evaluated much more efficiently. Nevertheless the two approximations of the distributivity property remain somewhat unrelated. It would be interesting to see a deeper comparison between the two. For instance What does the algebraic distributivity check on a distributivly-safe query? does it always succeed? Are there queries in the distributively-safe syntactic fragment which are not algebraically distributive? and so on... Also, soundness of algebraic distributivity should be stated more explicitely.

Definitely relevant

One step forward

Good work

Excellently written

Probably accept

The paper proposes addition of an inflationary fixed point operator to XQuery, and discusses how to check whether the semi-naive evaluation algorithm can be applied to compute query results (i.e. whether the body expression distributes over union). Two variants of the check are presented: Firstly, inference rules that compute, for a sublanguage for XQuery, whether the body is distrbutive, and hence semi-naive evaluation is applicable. Secondly, the body expression is compiled into an algebraic plan with a union at the bottom , and using rewrite rules, the compiler tries to move the union to the top of the plan - success implies applicbility of the semi-naive strategy. Experimental results confirm that the semi-naive strategy improves performance.

While the paper does not provide any new insights into recursive query processing, it is a solid, nontrivial transfer of existing knowledge to the field of XQuery. It is also extremely well written (I can't think of anything to improve the presentation), and deserves acceptance.

- submission (PDF)
- final paper (PDF) — published at EDBT 2009
- earlier version (PDF) — short paper published at ICDE 2008
- earlier version (PDF) — technical report published at arXiv.org