There Are Days When I Hate XML
…and days when I really hate XML.
In this case, I have an XML document like
<?xml version="1.0" ?> <foo xmlns="http://some.org/foo"> <thing id="first"/> <thing id="second"/> <thing id="third"/> </foo>
and I want to get things out of it with XPath.
But I couldn’t manage to select things exactly, such as the <foo> element at the top. You’d think “/foo” would do it, but it didn’t.
Eventually I found out that the problem is the xmlns=”...” attribute. It looks perfectly normal, saying that if you have <thing> without a prefix (like “<ns:thing>”), then it’s in the “http://some.org/foo” namespace.
However, in XPath, if you specify “ns:thing”, it means “a thing element in whichever namespace the ns prefix corresponds to”. BUT “thing” means “a thing element that’s not in a namespace”.
So how do you specify an element that’s the empty-string namespace, as above? The obvious way would be to select “:thing”, but that doesn’t work. Too simple, I suppose. Maybe that gets confused with CSS pseudo-selectors or something.
No, apparently the thing you need to do is to invent a prefix for the standard elements of the file you’re parsing. That is, add “ns” as another prefix that maps onto “http://some.org/foo” and then select “ns:thing”. There are different ways of doing this, depending which library you’re using to make XPath queries, but still, it seems like a giant pain in the ass.