[Yanel-dev] New XMLDB repository

Josias Thöny josias.thoeny at wyona.com
Mon Feb 12 16:55:27 CET 2007


Andreas Wuest wrote:
> Hi Josias
> 
> On 12.2.2007 15:55 Uhr, Josias Thöny wrote:
> 
>> Andreas Wuest wrote:
>>> Hi
>>>
>>> I've finished and checked in a basic implementation of the XMLDB 
>>> repository, based on the XML:DB API.
>>
>> Cool :)
>>
>>>
>>> Unfortunately, Yarep is really badly documented, so I couldn't find out 
>>> what the exact contracts for the various methods are. For example, 
>>> should getSize() or delete() throw a repository exception if the 
>>> resource does not exist, or return 0 or false, etc.
>>>
>>> I've extensively documented the XMLDBStorage class, so you can see 
>>> what it does at first glance.
>>>
>>> The Reader/Writer and InputStream/OutputStream are implemented using 
>>> aggregation. Don't know if it would be more desirable to e.g. 
>>> subclass StringReader and override the close() method instead.
>>>
>>> Also, there are some other API related problems: Yanel always seems 
>>> to call getInputStream to directly read from the repo. Now, this is 
>>> all fine and dandy on a file based repo, but the XML database stores 
>>> XML documents as character data, and returns them as strings. In 
>>> other words, in order for the InputStream to work, we have to 
>>> convert the string to bytes, which, of course, involves character 
>>> encoding. I just use UTF-8 to encode and decode, but if you really want 
>>> to read an XML resource, the getReader method should be used.
>>>
>>> The same goes for writing, but with some additional complication. You 
>>> should NEVER use getOutputStream to write an XML document. 
>>> getOutputStream creates a binary resource in the database. Use 
>>> getWriter instead to write character data, which creates an XML 
>>> resource.
>>
>> Well, I didn't realize that some repository implementations might 
>> handle binary data differently than text data. But I guess it makes 
>> sense.
>> So probably we should change yanel to use the reader/writer methods 
>> for text data, and add reader/writer methods to the node-based api, too.
>> Would that help?
> 
> That would help for sure. Although I don't know how Yanel can find out 
> which method to call for reading, because it does not know in the first 
> place if a requested resource is character-based or binary.

Yeah, I had some doubts about that also.
Maybe we could simply say that a FileResource is always treated as 
binary, and an XMLResource is always text. Would that be too simple?
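
To make the rule concrete, here is a minimal sketch in Java. The class, 
the enum and both method names are made up for illustration and are not 
part of Yanel or Yarep; the enum only stands in for the two resource 
types named above.

```java
// Hypothetical sketch of the "simple" rule; none of these names are Yanel/Yarep API.
final class ResourceTypeRule {

    /** Stand-ins for the FileResource and XMLResource types mentioned above. */
    enum ResourceType { FILE, XML }

    /** The rule itself: an XML resource is text, a file resource is binary. */
    static boolean isText(ResourceType type) {
        return type == ResourceType.XML;
    }

    /** Name of the repository read method that would be called under this rule. */
    static String readMethodFor(ResourceType type) {
        return isText(type) ? "getReader" : "getInputStream";
    }
}
```

Under this rule an XML document that happens to be served through a 
FileResource would still be read as bytes, which may be exactly the case 
where it is too simple.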


> 
> One possible way would be for the repository implementation to guide 
> Yanel, because the repository should generally know what type of 
> resource is being requested (at least, XMLDB knows; we may see other 
> back-ends in the future that do not even know that, though). If 
> Yanel uses getInputStream(), and the repo decides that this is not a 
> binary resource, it could throw an exception, and Yanel would then try 
> getReader(), or vice versa. We could also introduce a flag on those two 
> methods, e.g. forceRead, which would prevent the repo impl from throwing 
> if the resource to be read is of the wrong type, but read anyway.
> 

If we say that the repo "knows" about the type of a resource, it could 
provide a method isBinary() or something like that, so yanel could know 
which method to call (getReader/getInputStream). I normally prefer to 
"ask first" instead of handling an error.
When someone calls a reading method which does not match the type, a 
best-effort conversion could be applied.
I'm not entirely sure though how the repo would know the type 
(text/binary). Should it assume that it's binary when it was written by 
getOutputStream, and text otherwise?
WDYT?
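
A rough sketch of the "ask first" idea, in Java. Everything here is an 
assumption made for illustration: AskFirstRepo is a toy in-memory class, 
not the Yarep interface; writeBinary/writeText stand in for whatever 
getOutputStream/getWriter would create; isBinary() does not exist yet; 
UTF-8 is used for the best-effort conversion; and throwing on a missing 
resource is an arbitrary choice, since that contract is still undefined.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical in-memory repository; not the Yarep API.
final class AskFirstRepo {
    private final Map<String, byte[]> binary = new HashMap<>();
    private final Map<String, String> text = new HashMap<>();

    // The type is remembered from the way the resource was last written.
    void writeBinary(String path, byte[] data) { text.remove(path); binary.put(path, data.clone()); }
    void writeText(String path, String data) { binary.remove(path); text.put(path, data); }

    /** Lets the caller ask for the type before choosing a read method. */
    boolean isBinary(String path) {
        if (binary.containsKey(path)) return true;
        if (text.containsKey(path)) return false;
        throw new NoSuchElementException(path); // assumed contract for missing resources
    }

    /** Best effort: a text resource is encoded as UTF-8 when read as bytes. */
    InputStream getInputStream(String path) {
        byte[] bytes = isBinary(path)
                ? binary.get(path)
                : text.get(path).getBytes(StandardCharsets.UTF_8);
        return new ByteArrayInputStream(bytes);
    }

    /** Best effort: a binary resource is decoded as UTF-8 when read as text. */
    Reader getReader(String path) {
        String chars = isBinary(path)
                ? new String(binary.get(path), StandardCharsets.UTF_8)
                : text.get(path);
        return new StringReader(chars);
    }

    /** Caller side: ask first, then use the matching read method. */
    static String readAsString(AskFirstRepo repo, String path) {
        try {
            if (repo.isBinary(path)) {
                try (InputStream in = repo.getInputStream(path)) {
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            }
            try (Reader reader = repo.getReader(path)) {
                StringWriter out = new StringWriter();
                reader.transferTo(out); // needs Java 10 or later
                return out.toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this, a caller never has to catch a "wrong type" error: it checks 
isBinary() and picks the method, and a mismatched call still returns 
something usable via the UTF-8 conversion.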

josias

> For writing, there should basically be no problem, since Yanel can 
> decide based on the MIME-type if it is going to write a character-based 
> or a binary resource.
> 



