LibXML2 namespace bug

April 3, 2008

Problem: we have an XML document with multiple namespaces, one of which is the default namespace (declared without a prefix), so plain XPath expressions such as //record match nothing:

<collection xmlns="" xmlns:xsi="" xsi:schemaLocation="">
 <record>
  <controlfield tag="001">714400</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
   <subfield code="a">Crete</subfield>
   <subfield code="h">[electronic resource] /</subfield>
   <subfield code="c">by D.M. Davin.</subfield>
  </datafield>
 </record>
</collection>

Answer: Use XML::LibXML::XPathContext, registering a prefix for the default namespace twice, once on the document-level context and once on each per-node context:

use XML::LibXML;
use XML::LibXML::XPathContext;

my $parserTitles = XML::LibXML->new;
my $structAuthors = $parserTitles->parse_file( 'NZETC_marc.exp.200706211556.xml' );

# Document-level XPath context; the prefixes are registered here
my $rootTitles = XML::LibXML::XPathContext->new($structAuthors);
$rootTitles->registerNs('xsi', '');
$rootTitles->registerNs('m21', '');	# prefix for the document's default (unprefixed) namespace

# Scalar context: findnodes returns an XML::LibXML::NodeList
my $titleNodes = $rootTitles->findnodes("//m21:record/m21:datafield[attribute::tag='245']");

if ($titleNodes)
{
 foreach my $titleNode ($titleNodes->get_nodelist)
 {
 	# Wrap each node in its own context and register the prefix a second time
 	$titleNode = XML::LibXML::XPathContext->new( $titleNode );
 	$titleNode->registerNs('m21', '');

 	my $titlesControlFieldNode = ($titleNode->findnodes("../m21:controlfield[attribute::tag='001']"))[0];
 	my $bbid = $titlesControlFieldNode->findvalue('.');

 	my $titlesRecordNode = ($titleNode->findnodes("ancestor::m21:record"))[0];
 	my $titlesTitle = $titleNode->findvalue("m21:subfield[attribute::code='a']/.");

 	print "$titlesTitle [$bbid]\n";
 }
}

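Unfortunately this nasty hack also seems to be necessary when the document has only a single namespace, if that namespace is declared without a prefix: in XPath 1.0 an unprefixed name always refers to no namespace, so the default namespace has to be given a prefix before it can be matched.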
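
The second registration may be avoidable. The query methods of XML::LibXML::XPathContext accept an optional context node as their second argument, so one context can be reused for the relative lookups. What follows is a minimal sketch, not a drop-in replacement for the script above: the namespace URI urn:example:marc and the inline document are assumptions made up for the illustration, and the real file and URIs are not reproduced.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical namespace URI, used only for this illustration.
my $ns = 'urn:example:marc';

# Small inline document with a default (unprefixed) namespace.
my $xml = <<"XML";
<collection xmlns="$ns">
  <record>
    <controlfield tag="001">714400</controlfield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Crete</subfield>
    </datafield>
  </record>
</collection>
XML

my $doc = XML::LibXML->new->parse_string($xml);

# One context for the whole document; the prefix is registered once.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs('m21', $ns);

# An unprefixed name only matches elements in no namespace,
# so this query finds nothing in a document with a default namespace.
my @unprefixed = $xpc->findnodes('//record');

my @results;
for my $field ($xpc->findnodes('//m21:record/m21:datafield[@tag="245"]')) {
    # The second argument is the node the expression is evaluated against.
    my $bbid  = $xpc->findvalue('../m21:controlfield[@tag="001"]', $field);
    my $title = $xpc->findvalue('m21:subfield[@code="a"]', $field);
    push @results, "$title [$bbid]";
}

print scalar(@unprefixed), " unprefixed matches\n";
print "$_\n" for @results;
```

With the inline document above, the unprefixed query matches nothing and the loop prints Crete [714400]. The prefix m21 is arbitrary: only the URI passed to registerNs has to agree with the one the document declares.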

