Parsing XML with XMLStarlet

This week I learned to use a command-line XML toolkit called XMLStarlet. It’s pretty easy to use, especially compared with my attempt to build the same process in Java using XOM.

I’m a bit familiar with SQL SELECT statements and wanted something similar. XMLStarlet can use XPath expressions, which aren’t as powerful as SQL queries but were sufficient for my needs. It was a bit of a stretch, but after an hour or so at w3schools.com I had figured out the queries I needed.

Besides the XPath expressions, the general syntax is pretty simple.

I used the sel command to run a query. My XML data came from a series of web queries, so I used the --net global option, which allows XMLStarlet to fetch entities over the network.

I wanted the output to be a CSV file, so I opted for the -T global option, which changes the output to plain text. I added the commas between fields with -o , (the -o option prints a literal string).

XPath queries seem to work by selecting a set of nodes and then reading the value of an element within each one. The selecting was done with the -m option for “match”, and the values were printed with the -v option for “print value”.

Here is what my command line looked like:

xml sel --net -T -N wp="http://api.whitepages.com/schema/" -t -m /wp:wp/wp:listings/wp:listing -v wp:displayname -o , -v child::wp:phonenumbers/child::wp:phone/child::wp:fullphone -o , -v child::wp:address/child::wp:fullstreet -n http://api.whitepages.com/enter_api_query_here

I ended up using some of my Java code and XMLStarlet together to meet the needs of this project. My next task is to incorporate my newfound knowledge of XPath into the Java code I wrote. Let’s see what I learn next.
