[arin-tech-discuss] bulkwhois libxml2
andy at arin.net
Wed Sep 22 09:51:39 EDT 2010
On Sep 21, 2010, at 3:37 PM, Wes Young wrote:
> Figured this was a good place to start...
> I'm debugging an issue (or some "functionality") with libxml2 and perl (using the XML::LibXML::Reader interface), where it seems to be stumbling on the series of: </asn><asn>
> If I insert a line break (</asn>\n<asn>\n), the libxml2 reader function rips through it no problem; if there's no line break, it views the <asn> as a blank element and then reads the rest of the file as garbage data (I'm trying to do this stream-like instead of DOM-like).
> I'm assuming there isn't anything wrong with the way it's output (guessing most people are just java-nuts and it works like that), but I'm curious if anyone has gotten around this issue with libxml2 (or the like) by setting some sort of parsing flag, etc.
> I've tested it a few times with the first few sets of asn elements you'll find in the data, and the line break pretty much makes it reproducible... just wondering if anyone has worked around that from the perl side...
The line break shouldn't be necessary. If you have validation turned on, perhaps that is causing some sort of problem. Also check other parsing options.
I just ran last night's bulk whois through xmllint with the --stream option and it worked fine. xmllint uses libxml2, so the library itself can stream the file as generated.
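For a quick sanity check that is independent of the Perl binding, here is a minimal sketch in Python. Note that it uses the standard library's expat-based parser, not libxml2, so it only illustrates that adjacent </asn><asn> tags are well-formed XML and that a streaming read should see the same records with or without the line break. The <bulkwhois> wrapper, the <handle> child, and the sample values are made up for illustration and are not the real bulk whois schema.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample records; the real bulk whois data has a different, larger schema.
COMPACT = (
    "<bulkwhois>"
    "<asn><handle>AS1</handle></asn><asn><handle>AS2</handle></asn>"
    "</bulkwhois>"
)
# Same data with the line breaks described in the original report.
SPACED = COMPACT.replace("</asn><asn>", "</asn>\n<asn>\n")


def stream_handles(xml_text):
    """Stream through xml_text and return the <handle> text of each <asn> record."""
    handles = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "asn":
            handles.append(elem.findtext("handle"))
            elem.clear()  # drop the record's children once it has been processed
    return handles


if __name__ == "__main__":
    print(stream_handles(COMPACT))
    print(stream_handles(SPACED))
```

If both calls return the same list, the data is fine either way, which would point at the reader loop or the parser options on the Perl side rather than at the file.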
Sorry I can't be of further help here.