<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Hi Balz<br>
<br>
Thanks very much for your feedback. Please find some comments inline
below.<br>
<br>
On 3/29/11 11:06 AM, Balz Schreier wrote:
<blockquote
cite="mid:AANLkTimjmyE4qp20VPooWG15qhAWOiJV0CeHDXFyd98-@mail.gmail.com"
type="cite">Hi Michael,
<div><br>
</div>
<div>my observation with large data sets (<a
moz-do-not-send="true" href="http://zwischengas.com">zwischengas.com</a>)
is the following:</div>
<div>- usually you only want to retrieve a subset of all matching
documents</div>
<div>- therefore it is ok to include the "max documents" parameter
into the API too</div>
<div>- additionally you could also provide a method search(from,
max), which internally uses the Lucene method search(from+max)
and then just skips the documents before "from".</div>
</blockquote>
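A minimal sketch of that search(from, max) idea, assuming a hypothetical
TopNSearch backend that stands in for the underlying top-N search and
returns plain paths (neither name is part of Yarep or Lucene):<br>

```java
import java.util.Arrays;

/** Hypothetical sketch: page through ranked results by fetching from+max hits and skipping the first "from". */
class PagedSearch {

    /** Hypothetical stand-in for a top-N search; returns at most n paths, best first. */
    interface TopNSearch {
        String[] search(int n);
    }

    /** Returns at most max paths, starting at offset from. */
    static String[] search(TopNSearch backend, int from, int max) {
        if (from < 0 || max < 0) {
            throw new IllegalArgumentException("from and max must not be negative");
        }
        // Ask the backend for everything up to the end of the requested page.
        String[] top = backend.search(from + max);
        if (from >= top.length) {
            return new String[0]; // the requested page lies beyond the last hit
        }
        int to = Math.min(top.length, from + max);
        // Skip the entries before "from" and keep the rest.
        return Arrays.copyOfRange(top, from, to);
    }

    public static void main(String[] args) {
        String[] all = {"/a", "/b", "/c", "/d", "/e"};
        TopNSearch backend = n -> Arrays.copyOfRange(all, 0, Math.min(n, all.length));
        System.out.println(Arrays.toString(search(backend, 1, 2))); // prints [/b, /c]
    }
}
```

Note that the backend still has to collect from+max hits, so deep pages
cost more than early ones.<br>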
<blockquote
cite="mid:AANLkTimjmyE4qp20VPooWG15qhAWOiJV0CeHDXFyd98-@mail.gmail.com"
type="cite">
<div><br>
</div>
<div>I don't know where you want to provide this method </div>
</blockquote>
<br>
For example, for retrieving the revisions of a node. We have some
real-world situations with more than 30K revisions per node.<br>
<br>
<blockquote
cite="mid:AANLkTimjmyE4qp20VPooWG15qhAWOiJV0CeHDXFyd98-@mail.gmail.com"
type="cite">
<div>but be careful with creating YarepNodes for the results, I
would deal with just the Yarep Paths as long as you can,
otherwise performance goes down dramatically.</div>
</blockquote>
<br>
I think it depends on the implementation. For example, some
implementations read the properties during node init, which I
consider bad and think we should change.<br>
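As a rough sketch of what I mean, assuming a hypothetical LazyRevisionNode
class and PropertyLoader interface (neither exists in Yarep), a node could
defer reading its properties until they are first requested:<br>

```java
import java.util.Properties;

/** Hypothetical sketch: a node that is cheap to create because properties are read on first access only. */
class LazyRevisionNode {

    /** Hypothetical hook that reads the properties of a node from the repository. */
    interface PropertyLoader {
        Properties load(String path);
    }

    private final String path;
    private final PropertyLoader loader;
    private Properties properties; // stays null until a property is requested

    LazyRevisionNode(String path, PropertyLoader loader) {
        // Nothing is read from the repository during init.
        this.path = path;
        this.loader = loader;
    }

    /** Cheap: needs no repository access. */
    String getPath() {
        return path;
    }

    /** Loads the properties on the first call and reuses them afterwards. */
    synchronized String getProperty(String name) {
        if (properties == null) {
            properties = loader.load(path);
        }
        return properties.getProperty(name);
    }
}
```

With something like this, creating nodes for a result page costs little
more than handling the paths, as long as nobody asks for a property.<br>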
<br>
Thanks<br>
<br>
Michael<br>
<blockquote
cite="mid:AANLkTimjmyE4qp20VPooWG15qhAWOiJV0CeHDXFyd98-@mail.gmail.com"
type="cite">
<div><br>
</div>
<div>Cheers</div>
<div>Balz<br>
<br>
<div class="gmail_quote">On Tue, Mar 29, 2011 at 10:50 AM,
Michael Wechner <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:michael.wechner@wyona.com">michael.wechner@wyona.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">
<div bgcolor="#ffffff" text="#000000"> Hi<br>
<br>
I am currently thinking about introducing a new
VersionableV3 interface to access large sets of revisions
(e.g. 50K) and make it scale better. It would also be nice
to search revisions for particular tags.<br>
Hence I was looking at the search API of Lucene, because
it has similar scalability issues:<br>
<br>
<a moz-do-not-send="true"
href="http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/Searcher.html#search%28org.apache.lucene.search.Query,%20org.apache.lucene.search.Filter,%20int%29"
target="_blank">http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/Searcher.html#search%28org.apache.lucene.search.Query,%20org.apache.lucene.search.Filter,%20int%29</a><br>
<br>
<pre>public <a moz-do-not-send="true" href="http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/TopDocs.html" title="class in org.apache.lucene.search" target="_blank">TopDocs</a> <b>search</b>(<a moz-do-not-send="true" href="http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/Query.html" title="class in org.apache.lucene.search" target="_blank">Query</a> query,
<a moz-do-not-send="true" href="http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/Filter.html" title="class in org.apache.lucene.search" target="_blank">Filter</a> filter,
int n)
throws <a moz-do-not-send="true" href="http://java.sun.com/j2se/1.5/docs/api/java/io/IOException.html" title="class or interface in java.io" target="_blank">IOException</a>
<a moz-do-not-send="true" href="http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/TopDocs.html" target="_blank">http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/TopDocs.html</a>
</pre>
<br>
<font size="-1"> <code><a moz-do-not-send="true"
href="http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/ScoreDoc.html"
title="class in org.apache.lucene.search"
target="_blank">ScoreDoc</a>[]</code></font> <code><b><a
moz-do-not-send="true"
href="http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/TopDocs.html#scoreDocs"
target="_blank">scoreDocs</a></b></code> <br>
The top hits for the query.<br>
<font size="-1"> <code>int</code></font>
<code><b><a moz-do-not-send="true"
href="http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/TopDocs.html#totalHits"
target="_blank">totalHits</a></b></code> <br>
The total number of hits for the query.<br>
<br>
But also see, for example:<br>
<br>
<a moz-do-not-send="true"
href="http://docs.codehaus.org/display/GEOTOOLS/Random+Data+Access"
target="_blank">http://docs.codehaus.org/display/GEOTOOLS/Random+Data+Access</a><br>
<br>
I am currently playing with the various APIs, but any
suggestions are very welcome.<br>
<br>
Cheers<br>
<br>
Michael<br>
</div>
<br>
--<br>
Yanel-development mailing list <a moz-do-not-send="true"
href="mailto:Yanel-development@wyona.com">Yanel-development@wyona.com</a><br>
<a moz-do-not-send="true"
href="http://lists.wyona.org/cgi-bin/mailman/listinfo/yanel-development"
target="_blank">http://lists.wyona.org/cgi-bin/mailman/listinfo/yanel-development</a><br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>