[Yanel-dev] Boost access log file format
Michael Wechner
michael.wechner at wyona.com
Tue Jun 29 10:09:36 CEST 2010
Cedric Staub wrote:
> Hello there
>
>
>> Cedric pointed out to me that it might make sense to have a different
>> format for the boost access log. It currently reads
>>
>>     Date, log-level, URL, realm-id, boost-cookie, user (if available),
>>     Referer, User-Agent, E-Mail (if available)
>>
>
> A single log entry currently looks like this:
> ---
> http://www.example.org/index.html r:example c:YA-1 ref:null
> ua:Mozilla/4.0 (example)
> ---
>
> While the field:value pairs are actually quite useful and make parsing
> a log entry easier, the problem is that the values in the fields (e.g.
> the user agent) can also contain colons/spaces, making it in certain
> cases impossible to tell where a field starts or ends.
>
> I suggest we escape the colons and spaces, e.g. by using url encoding.
> This can easily be done using java.net.URLEncoder on the logging side
> and java.net.URLDecoder on the parsing side.
>
> I would also suggest that we add "url:" to the front of the URL in
> order to make parsing more robust. Right now the parser just assumes
> that the URL is the first field. Then we could change the format later
> (e.g. moving the URL to another position) without breaking the parser.
>
sounds good
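
To make the proposal concrete, here is a rough sketch of how the encoding
and parsing could look. This is not the actual Yanel code: the class name
BoostLogFormat and its methods are made up for illustration, and it
assumes that every field (including the URL) is written as key:value,
that keys themselves never contain colons or spaces, and that values are
never null.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Yanel: writes and reads log entries
// made of space-separated key:value pairs with URL-encoded values.
final class BoostLogFormat {

    private BoostLogFormat() {}

    // URL-encode a value so that it contains neither spaces nor colons.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    // Reverse of encode().
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    // Build one log entry; field order follows the map's iteration order.
    static String format(Map<String, String> fields) {
        StringBuilder entry = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (entry.length() > 0) {
                entry.append(' ');
            }
            entry.append(field.getKey()).append(':').append(encode(field.getValue()));
        }
        return entry.toString();
    }

    // Parse one log entry; the position of a field no longer matters.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        for (String token : line.split(" ")) {
            int colon = token.indexOf(':');
            if (colon < 0) {
                continue; // not a key:value pair, skip it
            }
            fields.put(token.substring(0, colon), decode(token.substring(colon + 1)));
        }
        return fields;
    }
}
```

Because an encoded value can no longer contain a space or a colon, the
parser can safely split on spaces first and then on the first colon of
each token. Looking fields up by key (e.g. parse(line).get("url"))
instead of by position is what would allow reordering the fields later.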
> Then we'd just have to make sure any module that appends data actually
> uses url encoding and doesn't just print it verbatim. Maybe we can add
> a convenience function somewhere to make this easier?
>
You mean for something like email:michi at wyona.org added by the contact
form resource?
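
If so, such a convenience function could be as small as the sketch
below. The name LogFieldAppender and the rule for valid keys are only
assumptions for the sake of the example, nothing like this exists in
Yanel yet, and the address used in the usage note is a placeholder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of Yanel: the single place through which
// modules append a field, so values are never printed verbatim.
final class LogFieldAppender {

    private LogFieldAppender() {}

    // Appends " key:encodedValue" to the entry and returns the same builder.
    static StringBuilder appendField(StringBuilder entry, String key, String value) {
        // Assumed rule: keys are plain tokens without colons or spaces.
        if (key == null || !key.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("Invalid field name: " + key);
        }
        String safeValue = (value == null) ? "" : value;
        try {
            if (entry.length() > 0) {
                entry.append(' ');
            }
            return entry.append(key).append(':').append(URLEncoder.encode(safeValue, "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }
}
```

The contact form resource would then call something like
LogFieldAppender.appendField(entry, "email", "user@example.org")
instead of concatenating the string itself.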
> Anyway, I will try to reply with a patch implementing this shortly.
>
Looking forward to your patch :-)
Btw, you might want to consider encryption and decryption of the log
file. The reason I am saying this is privacy: the log can contain user
names and e-mail addresses.
Thanks
Michi
> Greetings
> Cedric
>