Jump label

Service navigation

Main navigation

You are here:

Main content

Recursion in XQuery: Put Your Distributivity Safety Belt On — VLDB 2008 Reviews

Reviews for paper Recursion in XQuery: Put Your Distributivity Safety Belt On, submitted to VLDB 2008.

Overall rating: reject

Reviewer 1

Novelty

Medium

Technical Depth

Medium

Presentation

Poor

Overall Rating

Weak Reject

Summary of the paper's contribution and Impact (up to one paragraph)

The paper proposes a controlled form of recursion in the form an inflationary fix point operator in XQuery. They draw parallelism to the relational counterpart and apply the known naïve and delta implementations for implementing the fix point operator in XQuery. They argue that the delta algorithm cannot be applied in the general case, and describe a distributivity property, which guarantees that the delta algorithm produces the correct result. They provide syntactic and algebraic rules to check for distributivity, and provide experiments that showcase the benefits of the delta algorithm.

Three strong points of the paper (please number them S1, S2, S3 in order of importance; if none, wite "none"

S1. The paper proposes an extension to XQuery (with-seeded-recurse construct) for a controlled form of recursion in XQuery.
S2. They describe a necessary condition for using the more efficient delta algorithm.
S3. They provide syntactic and algebraic rules for checking distributivity.

Three weak points of the paper (please number them W1, W2, W3 in order of importance; if none, write "none"

W1. The proposed form of recursion is a direct counterpart of the relational case and they use the known algorithms proposed for the relational case.
W2. They do not provide enough evidence that the proposed form of recursion handles most common use cases in XQuery.
W2. It is not clear what subset of XQuery they address. They refer to several different subsets. The paper is very poorly organized and presented.

Detailed comments (please number them)

C1. The main contribution of the paper is the with-seeded-recurse construct. But, it has not been properly handled. It is not clear how it fits into the XQuery language. Is it another expression that can compose with any other expression? It is not even clear whether they handle XQuery. Every section of the paper seems to be talking about a different XQuery subset. The rules provided for the syntactic check in Section 5.1 handle LiXQuery, whereas in all the discussion about transitive closure they address regular XPath.

C2. The proposed form of recursion is linear recursion and there exists efficient algorithms in the relational database literature for implementing them. The main contribution of the paper is the distributive property, which guarantees that the delta algorithm can be applied. The delta algorithm does not apply the body of the recursion to items from previous steps, hence avoid repeated computation. This has been the state-of-the-art in RDBMS technology over a decade. The paper does not address properly what additional challenges XQuery brings.

C3. XQuery provides recursive functions, and the proposed form of recursion can be expressed in user-defined functions. XPath already provides features to realize the most common, and useful, form of transitive closure, namely the descendant axis. What are the additional use cases for other transitive closures?

C4. The paper refers to many different XQuery subsets, and it is very hard to follow the paper with just references. It would have been very helpful to recap those subsets and explicitly define what types of expressions can be used in the proposed construct. In the definitions, they say that the expressions produce node* as a result. However, they use examples like count($x) >= 1, which produces Boolean results. Given the union in definition 3 is the XQuery union, which is only defined on sequences of nodes, it does not even make sense to discuss the distributivity of count($x) >= 1 expression.

C5. In Section 5.1, rules FOR1 and FOR2 enforce that the recursion variable $x occurs in either the body or the binding expression, but not both. I think this is too restrictive, because as long as there are no expressions that require to consume $x as a sequence (such as fn:position or fn:count) in the body of the recursion, there should not be any problem with distributivity. For example, if the body of the recursion is the following for-construct, and $x is bound to a set of nodes, then $x is distributive over this for-construct: for $a in $x/d return $x/e.

C6. Proposition 3 is confusing. It is clear that they are expressions which are distributive but not distributive-safe according to their definition, So, how can they claim expressive completeness of their notion?

C7. At the end of section 5.1, they say that at the expense of slight reformulation, they can provide hints to an XQuery processor. Yet, they do not explain what type of hints, or what type of reformulation they talk about.

C8. It is not clear why they mark the union operator they are trying to pull up. Why do they need to distinguish it in the query plan?

C9. In Section 6.1, they need to define what structural equality is. Are they talking about the query plan structure, or the XML structure produced as a result of the query plan? It seems they are talking about the former, but this needs to be explicit.

Reviewer 2

Novelty

Low

Technical Depth

Low

Presentation

Good

Overall Rating

Weak Reject

Summary of the paper's contribution and Impact (up to one paragraph)

The paper suggests a new syntactic construct in XQuery that can used to express a limited form of recursion that creates the potential for greater optimization.

Three strong points of the paper (please number them S1, S2, S3 in order of importance; if none, wite "none"

S1. The paper builds the new operator on top of ideas from SQL 1999.
S2. The authors provide rules for distributive safety detection in XQuery that has its use in a lot of other optimization scenarios.
S3. Experimental analysis shows promise for the shown recursive queries.

Three weak points of the paper (please number them W1, W2, W3 in order of importance; if none, write "none"

W1. The idea for the paper is not novel. Its a rehash of concepts from the relational world with an XQuery spin.
W2. A large part of XQuery is node construction. By eliminating that from the analysis, the authors have severely restricted the applicability of the suggested proposal. Such a restriction also violates the principle of orthogonality that XQuery currently has.

Detailed comments (please number them)

In the second paragraph in section 2, do you use e1(e2) to mean the replacement of $x by e2 or by the value of e2? Since XQuery is not a pure functional language, they can result in different results. If you preclude e2 containing node constructors, then they are equivalent.

Reviewer 3

Novelty

Medium

Technical Depth

Medium

Presentation

Good

Overall Rating

Weak Reject

Summary of the paper's contribution and Impact (up to one paragraph)

This paper studies the optimization of recursive XQueries. While general user-defined recursive functions in XQuery are notoriously hard to optimize, the authors investigate a limited form of recursive queries. More specifically, inspired by the relational case, they introduce a limited form of recursion in XQuery that corresponds to an inflationary fixpoint operator (IFP). Again, borrowing from the relational context, two candidate IFP evaluation methods are proposed: naive and delta. The latter, being in general more efficient, would be the desired choice but unfortunately it only correctly computes the IFP under certain assumptions. This assumption, called the distributive property, is rather restrictive and it is an undecidable property of XQuery expressions. Instead, the authors identify a syntactic fragment of XQuery in which every query is distributive and moreover it takes only linear time to check whether a query is in this fragment. Furthermore, any distributive query is expressible as a query in this fragment. A second way of enforcing distributive is then provided through an encoding of the XQuery expression into some relational algebra dialect. It is experimentally validated that for distributive queries, the delta evaluation method outperforms the naive method for the computation of IFPs, as expected.

Three strong points of the paper (please number them S1, S2, S3 in order of importance; if none, wite "none"

S1. The paper proposed a nice and elegant way of bring a controlled version of recursion into XQuery.
S2. An elegant, albeit restrictive, property is identified under which the fixpoint construct can be evaluated efficiently.
S3. The two ways of establishing distributivity of expressions are described with great care, making the paper nice to read.

Three weak points of the paper (please number them W1, W2, W3 in order of importance; if none, write "none"

W1. It is not entirely clear how big the class of distributive queries is. The distributivity property is rather restrictive and is only needed for the purpose of running the delta algorithm. There could be other efficient algorithms that do not require this assumption. In other words, instead of focussing on making the delta algorithm work, you could have invested time in the search for other evaluation algorithms.
W2. You do not allow node constructors in your distributivity-safe fragment. The ability to generate nodes is (in my opinion) an important feature of XQuery. Again, this is a sacrifice you make just because you want the delta algorithm to work.
W3. The paper does not really emphasize enough in which scenario's to use the syntactic approximation vs the algebraic approximation of distributivity.

Detailed comments (please number them)

P1. Following up on W1, I thoroughly belief that the addition of a simpler recursive construct to XQuery is a good idea and this paper makes significant contributions to make this also feasible in practice. It is not entirely clear, however, whether the focus on the delta algorithm is a healthy approach. It is indeed interesting to see when this algorithm would work correctly, but ultimately one wants an efficient algorithm that works in almost all cases.

P2. I am slightly confused why the authors do not cite their ICDE paper (An Inflationary Fixed Point Operator in XQuery. Afanasiev, Loredana; Grust, Torsten; Marx, Maarten; Rittinger, Jan; Teubner, Jens, Page(s): 1504-1506). This paper, although being a stripped down version of the current submission, certainly takes away some of the novelty.

Related Information



Sub content

Contact

Prof. Dr. Jens Teubner
Tel.: 0231 755-6481