[Yanel-dev] Fwd: Cluster documentation

Balz Schreier balz.schreier at gmail.com
Wed Feb 2 18:41:26 CET 2011


1) Regarding "Build a locking mechanism into Yarep": How would that really
help?
By a cluster we mean multiple JVMs, each running Yarep, all accessing a
shared file in the filesystem.

If you want to solve it by writing a lock file (like Lucene does), then
you need to be sure that the lock file gets written synchronously and in a
single atomic operation (from the JVM down to the hard drive, no buffers,
etc.).
NFS cannot guarantee that.
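
To make the lock-file idea concrete, here is a minimal Java sketch. The class
LockFileSketch and its methods are hypothetical and not part of Yarep or
Lucene. It relies on java.nio.file.Files.createFile, which checks for
existence and creates the file as one atomic operation on a local
filesystem. Whether that atomicity survives across NFS clients depends on
the NFS version and configuration, which is exactly the concern above.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Yarep: a lock-file guard in the style described above.
final class LockFileSketch {

    // Tries to take the lock by creating the lock file.
    // Returns true if this caller created it, false if it already existed.
    // Assumption: the underlying filesystem really performs "create if absent" atomically.
    static boolean tryAcquire(Path lockFile) throws IOException {
        try {
            Files.createFile(lockFile);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // another process (or JVM) holds the lock
        }
    }

    // Releases the lock by deleting the lock file, if it is still there.
    static void release(Path lockFile) throws IOException {
        Files.deleteIfExists(lockFile);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("yarep-lock-demo");
        Path lock = dir.resolve("node.lock");

        System.out.println(tryAcquire(lock)); // true: lock was free
        System.out.println(tryAcquire(lock)); // false: lock is already held
        release(lock);
        System.out.println(tryAcquire(lock)); // true: lock is free again

        release(lock);
        Files.delete(dir);
    }
}
```

Note that a crashed JVM would leave the lock file behind, so a real
mechanism would also need some way to detect and clear stale locks.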

What I read in the Lucene book (Lucene in Action, 2nd Edition, 2010) is the
following approach:
- The JVMs (and therefore Yarep too) do not write to the local file system
(reading is ok though); instead they call a central "write service". Of
course you then have a single point of failure (for writing; reading
continues to work fine). But you don't have to deal with locking at all.
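
A rough sketch of what such a central "write service" could look like, in
plain Java. Everything here (the class WriteServiceSketch, its method names,
the in-memory list standing in for the repository) is an assumption for
illustration and not an existing Yarep or Lucene API. The point is only that
a single writer thread applies all writes one after the other, so the
callers need no locking of their own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not a Yarep or Lucene API.
final class WriteServiceSketch implements AutoCloseable {

    // One thread applies all writes, so two writes can never overlap.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    // Stands in for the shared repository file; only the writer thread touches it.
    private final List<String> content = new ArrayList<>();

    // Called by any cluster node: queue a write and get a handle to wait on.
    Future<Integer> submitWrite(String element) {
        return writer.submit(() -> {
            content.add(element);
            return content.size();
        });
    }

    // In this in-memory sketch, reads go through the same queue to get a consistent copy.
    List<String> snapshot() throws Exception {
        Callable<List<String>> copy = () -> new ArrayList<>(content);
        return writer.submit(copy).get();
    }

    @Override
    public void close() {
        writer.shutdown();
    }

    public static void main(String[] args) throws Exception {
        try (WriteServiceSketch service = new WriteServiceSketch()) {
            // Two "nodes" write at the same time.
            Thread node1 = new Thread(() -> service.submitWrite("B"));
            Thread node2 = new Thread(() -> service.submitWrite("C"));
            node1.start();
            node2.start();
            node1.join();
            node2.join();
            System.out.println(service.snapshot()); // both elements, in arrival order
        }
    }
}
```

If the service is down, submitWrite has nowhere to go, which is the single
point of failure mentioned above.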

2) Regarding "Simply use a backend that can do concurrent writing": How do
you solve this then:

Assume you have two JVMs running your app, both with the ability to write
nodes.
Assume that at the same time, both JVMs read a file that contains a content
element A.
The first JVM adds B to the file (A,B), the second JVM adds C to the
file (A,C).
But "adding" means: read from Yarep, modify, write back.
So even if you have proper locking in place, what do you end up with on the
filesystem? A,C or A,B? What should be in the file is of course A,B,C or
A,C,B... and I think that this can only be done with a single "write"
service (equivalent to writing into the same database).
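
The A/B/C scenario above can be replayed in a few lines of Java. This is only
a sketch under stated assumptions: the class LostUpdateSketch is hypothetical,
an in-memory list stands in for the file, and the two "JVMs" are simulated by
two local copies in a fixed order so the outcome is reproducible. The first
method replays read, modify, write back; the second sends only the element to
add through one write path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an in-memory list stands in for the shared file.
final class LostUpdateSketch {

    private List<String> file = new ArrayList<>();

    LostUpdateSketch() {
        file.add("A"); // initial content of the file
    }

    // Each reader gets its own copy, like a JVM reading the file.
    List<String> read() {
        return new ArrayList<>(file);
    }

    // Replaces the whole file with the caller's version.
    void write(List<String> newContent) {
        file = new ArrayList<>(newContent);
    }

    // Single write path: the caller sends only the element to add.
    synchronized void append(String element) {
        file.add(element);
    }

    // Both JVMs read, modify their copy, and write the whole file back.
    static List<String> readModifyWrite() {
        LostUpdateSketch repo = new LostUpdateSketch();
        List<String> jvm1 = repo.read(); // [A]
        List<String> jvm2 = repo.read(); // [A]
        jvm1.add("B");
        jvm2.add("C");
        repo.write(jvm1); // file is now [A, B]
        repo.write(jvm2); // file is now [A, C], B is lost
        return repo.read();
    }

    // Both JVMs hand their element to the single write path instead.
    static List<String> appendViaSingleWriter() {
        LostUpdateSketch repo = new LostUpdateSketch();
        repo.append("B");
        repo.append("C");
        return repo.read();
    }

    public static void main(String[] args) {
        System.out.println(readModifyWrite());       // [A, C]
        System.out.println(appendViaSingleWriter()); // [A, B, C]
    }
}
```

Each individual write in the first method is perfectly well-formed; the
update is lost because the second writer worked from a stale copy.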

Cheers

On Wed, Feb 2, 2011 at 6:27 PM, Cedric Staub <cedric.staub at wyona.com> wrote:

> On Wed, Feb 02, 2011 at 06:21:33PM +0100, Michael Wechner wrote:
> > Hi Balz
> >
> > On 2/2/11 5:55 PM, Balz Schreier wrote:
> > > Hi Michael,
> > > interesting point is mentioned in the restrictions:
> > >
> > >     * Only one cluster node writes to the repository. This means that
> > >       the authoring environment is only accessible on one dedicated
> > >       cluster node, which is called the /master/ node.
> > >
> > >
> > > For the project I'm working for this is not a solution as each node
> > > must have write access to the repository as well as to the Lucene
> > > index.
> >
> > agreed and I don't think you need to worry
> >
> > > This is what I wanted to highlight in today's workshop that
> > > Yanel/Yarep should come up with a concept of a single "write" instance
> > > which gets used by all members in the cluster so that writing is also
> > > possible for other nodes... to be continued....
> >
> > wyona.com is running within a cluster and so far we didn't have any
> > problems.
> > Also please see Cedric's tests, whereas to be fair these tests were done
> > only with concurrent read access.
> > But we will also do concurrent write/read.
>
> I don't think doing concurrent writing with NFS will work though.
> We have to make sure only one process can write to a file at once.
>
> Some of my thoughts about possible solutions:
>
> * Build a locking mechanism into Yarep. This would solve the problem,
>  but I think it will be very difficult to build a good, solid locking
>  mechanism.
>
> * Simply use a backend that can do concurrent writing, e.g. a good
>  distributed filesystem or a database. It's easier, but not always
>  practical, e.g. what if someone has already deployed a given
>  filesystem somewhere and we can't change it etc.
>
> Just my thoughts,
> Cedric
> --
> Yanel-development mailing list Yanel-development at wyona.com
> http://lists.wyona.org/cgi-bin/mailman/listinfo/yanel-development
>