Wednesday 11 December 2013

Pentaho ETL: Reading an XML file in PDI

This post describes how to read all the values from an XML file.

1. Below is the sample XML file from which I want to read all the attributes and all the values of the child tags.












2. Drag and drop a "Get data from XML" step into your transformation in PDI.
    Edit this step and browse for the XML file that you want to parse.

3. Select the Content tab and click on Get XPath Nodes to list all the XPaths in the file. Select the XPath from which you want to parse the XML; in my case I selected /table, as shown below:

















4. Select the Fields tab and click on Get Fields. Now we have to edit all the fields as shown below:

   

 













Because the columns tag contains multiple child column tags, I have referred to those child tags by position: columns/column[1], columns/column[2], and so on.
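
To see what positional paths such as columns/column[1] select, here is a minimal Python sketch outside PDI, using the standard xml.etree.ElementTree module. The sample XML is hypothetical: only the table, columns and column tag names come from this post, and the values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only the <table>, <columns> and <column> tag names
# come from the post; the values are invented for illustration.
SAMPLE = """<table>
  <columns>
    <column>id</column>
    <column>name</column>
    <column>salary</column>
  </columns>
</table>"""

# The root element is <table>, which plays the role of the /table loop node.
root = ET.fromstring(SAMPLE)

# XPath positions are 1-based: column[1] is the first <column> child.
first = root.findtext("columns/column[1]")
second = root.findtext("columns/column[2]")
third = root.findtext("columns/column[3]")

# A position past the last child matches nothing, so findtext returns None.
missing = root.findtext("columns/column[4]")

print(first, second, third, missing)
```

Note that the index starts at 1, not 0, and that each path is relative to the node selected as the loop XPath.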

To read an attribute, specify the attribute name prefixed with the @ symbol.
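
As a rough illustration of attribute access, the Python sketch below reads attributes with xml.etree.ElementTree. ElementTree does not accept a path that ends in @attribute, so the sketch uses the element's get() method as the equivalent. The attribute names (name, type) and the values are assumptions; they are not taken from the original sample file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the attribute names 'name' and 'type' are assumptions.
SAMPLE = """<table name="employee">
  <columns>
    <column type="int">id</column>
    <column type="string">name</column>
  </columns>
</table>"""

root = ET.fromstring(SAMPLE)  # the <table> element

# Equivalent of the XPath @name evaluated on the <table> node.
table_name = root.get("name")

# Equivalent of the XPath columns/column[2]/@type.
second_type = root.find("columns/column[2]").get("type")

# An attribute that does not exist yields None rather than an error.
missing = root.get("owner")

print(table_name, second_type, missing)
```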

Finally, click on Preview rows to get the data from the XML.
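
The steps above can also be sketched in Python to show how a loop node plus a list of field XPaths turns into rows. This is only an approximation of the idea, not how PDI is implemented. The sample XML, the FIELDS mapping and the helper names read_field and preview_rows are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; attribute names and values are assumptions.
SAMPLE = """<table name="employee">
  <columns>
    <column type="int">id</column>
    <column type="string">name</column>
  </columns>
</table>"""

# Hypothetical field list: output field name -> XPath relative to the loop node.
FIELDS = {
    "table_name": "@name",
    "col1": "columns/column[1]",
    "col1_type": "columns/column[1]/@type",
    "col2": "columns/column[2]",
}


def read_field(node, xpath):
    """Resolve a node or attribute path relative to node; None if absent.

    Only plain paths and paths ending in @attribute are handled; predicates
    that themselves contain '@' are outside the scope of this sketch.
    """
    path, sep, attr = xpath.rpartition("@")
    if sep:  # attribute field, e.g. "@name" or "a/b/@type"
        path = path.rstrip("/")
        target = node.find(path) if path else node
        return None if target is None else target.get(attr)
    return node.findtext(xpath)


def preview_rows(xml_text, loop_tag, fields):
    """Build one dict per element named loop_tag (the root or any descendant)."""
    root = ET.fromstring(xml_text)
    return [
        {name: read_field(node, xpath) for name, xpath in fields.items()}
        for node in root.iter(loop_tag)
    ]


rows = preview_rows(SAMPLE, "table", FIELDS)
for row in rows:
    print(row)
```

Each element matched by the loop tag produces one row, which is why choosing /table here gives a single row containing all the column values.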

I hope you have enjoyed the post :)


Thanks, Giri