Xerces DOMParser funny…

Working with the Xerces DOMParser, I noticed a weird exception being reported, seemingly at random, for certain instance documents. The exception message is: The processing instruction target matching "[xX][mM][lL]" is not allowed. Looking at the XML source file, it does have the normal <?xml version="1.0" encoding="UTF-8"?> prefix, so I was initially unsure why this exception was firing every now and then.

It turns out that with the Xerces (JDK default) DOMParser implementation, the parser throws this exception ONLY if there are blank lines in the XML document before the <?xml version="1.0" encoding="UTF-8"?> declaration.
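Here is a rough sketch of how the behaviour might be reproduced. It goes through the standard javax.xml.parsers.DocumentBuilder API rather than the DOMParser class directly, on the assumption that the JDK's default implementation is the bundled Xerces. The class name LeadingWhitespaceDemo and the helper parseError are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Xerces or the JDK.
public class LeadingWhitespaceDemo {

    /** Returns null if the text parses, otherwise the parser's error message. */
    static String parseError(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        String decl = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";

        // Declaration at the very start of the document: expected to parse.
        System.out.println("clean:       " + parseError(decl + "\n<root/>"));

        // Two blank lines before the declaration: expected to fail.
        System.out.println("blank lines: " + parseError("\n\n" + decl + "\n<root/>"));
    }
}
```

The exact wording of the message in the second case may vary with the JDK version and locale.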

Removing these blank lines (NOT the XML declaration header) allows the parser to complete successfully. I don’t profess to understand the science bit here, but this did knock me off track for a while. I'd be interested to hear what relevance the blank lines have in a document where whitespace is supposedly irrelevant.
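For what it is worth, the likely explanation is that the XML 1.0 specification only permits the XML declaration as the very first thing in the document. Once anything precedes it, even whitespace, the parser reads <?xml as an ordinary processing instruction, and "xml" (in any letter case) is a reserved target name, which matches the wording of the message. If the files cannot be fixed at the source, one possible workaround is to strip leading whitespace before parsing. The sketch below assumes the document is already available as a String; the class name StripLeadingWhitespace and the method parseLenient are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Xerces or the JDK.
public class StripLeadingWhitespace {

    /** Parses XML text after dropping any whitespace that precedes the first character. */
    static Document parseLenient(String xml) throws Exception {
        // Remove only the leading run of whitespace; the rest of the text is untouched.
        String trimmed = xml.replaceFirst("^\\s+", "");
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(trimmed)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "\n\n<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root/>";
        Document doc = parseLenient(xml);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Because the input here is a String that has already been decoded, the parser does not use the encoding named in the declaration. For files read as raw bytes, the same idea would need to be applied to the byte stream instead.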

