Parse XML using Rhino's included E4X. Much easier than using SAX or DOM.

Clement Wong
edited February 11, 2022 in Analytics #1
It’s very simple to parse XML in BIRT scripting using E4X (https://developer.mozilla.org/en-US/docs/Archive/Web/E4X/Processing_XML_with_E4X), an extension included with Rhino. With E4X, XML is a native data type that is easy to access, so there’s no need for the Java SAX and DOM packages.
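
To give a feel for what that looks like, here is a minimal sketch. It assumes an engine with E4X, such as the Rhino bundled with BIRT. The sample document and the readTitles helper are made up for illustration and are not part of the attached report.

```javascript
// Hypothetical sample document; the element names mimic an RSS feed.
var sampleText =
    "<rss><channel>" +
    "<item><title>First</title><link>http://example.com/1</link></item>" +
    "<item><title>Second</title><link>http://example.com/2</link></item>" +
    "</channel></rss>";

// Collects the text of every <title> under <channel>/<item>.
// Needs E4X: the global XML constructor does not exist in other engines.
function readTitles(xmlText) {
    var doc = new XML(xmlText);      // parse the string; doc is the <rss> root
    var items = doc.channel.item;    // dot notation yields an XMLList of <item>
    var titles = [];
    for (var i = 0; i < items.length(); i++) {   // length() is a method in E4X
        titles.push(items[i].title.toString());
    }
    return titles;
}
```

Calling readTitles(sampleText) inside a BIRT script should return the item titles as an array of strings, with no parser setup or node walking.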


The attached example retrieves ESPN’s Top Headlines RSS feed using a Scripted Data Source. Here are a few highlights. In the fetch event, note that XML elements are accessed via dot notation.

beforeOpen

importPackage(Packages.java.io);
importPackage(Packages.java.net);

var inStream = new URL("http://sports.espn.go.com/espn/rss/news").openStream();
var inStreamReader = new InputStreamReader(inStream);
var bufferedReader = new BufferedReader(inStreamReader);
var line;
var result = "";
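while ((line = bufferedReader.readLine()) != null)
       result += line + "\n";   // readLine() strips the line break, so put it back
bufferedReader.close();         // also closes the underlying stream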


//FOR DEBUG, THIS WORKS ONLY IN COMMERCIAL BIRT
//logger = java.util.logging.Logger.getLogger("birt.report.logger");
//logger.warning (result);

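//Remove the XML declaration (<?xml version="1.0" ?>); the E4X XML constructor rejects input that contains it.
//Note that this pattern strips every processing instruction, not only the declaration.
result = result.replace(/<\?[^>]*>/g,"");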

rss = new XML(result);
totalItems = rss.channel.item.length();
currentrow = 0;   // row cursor read by fetch; set here so this excerpt is self-contained
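
The replace call above removes every processing instruction in the feed. If you would rather remove only a leading XML declaration, a narrower pattern can be used. This is a sketch only: stripXmlDeclaration is a hypothetical helper name, not something from the attached report.

```javascript
// Hypothetical helper: removes a leading <?xml ... ?> declaration and
// leaves any other processing instructions in the document untouched.
function stripXmlDeclaration(text) {
    // String(...) coerces a java.lang.String to a JavaScript string in Rhino.
    return String(text).replace(/^\s*<\?xml[^>]*\?>\s*/, "");
}

var sample = '<?xml version="1.0" encoding="UTF-8"?><rss><channel/></rss>';
var cleaned = stripXmlDeclaration(sample);   // "<rss><channel/></rss>"
```

The cleaned string can then be passed to new XML(...) exactly as in the script above.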

fetch

if( currentrow >= totalItems ){
       return false;
}

row["title"] =  rss.channel.item[currentrow].title.toString();
row["description"] = rss.channel.item[currentrow].description.toString();
row["link"] = rss.channel.item[currentrow].link.toString();
row["pubDate"] = rss.channel.item[currentrow].pubDate.toString();

currentrow += 1;
return true;
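
BIRT calls fetch once per row and stops when it returns false. The sketch below imitates that contract outside BIRT with a plain array standing in for rss.channel.item. The names fetchRow, items, and rows are assumptions for illustration. In a real report, BIRT supplies the row object and drives the loop itself.

```javascript
// Stand-in data: in the report these would be the E4X <item> elements.
var items = [ { title: "A" }, { title: "B" } ];
var currentrow = 0;
var totalItems = items.length;

// Same shape as the fetch script: fill one row, advance the cursor,
// and return false once every item has been handed out.
function fetchRow(row) {
    if (currentrow >= totalItems) {
        return false;
    }
    row["title"] = items[currentrow].title;
    currentrow += 1;
    return true;
}

// Driver loop imitating what BIRT does with the fetch event.
var rows = [];
var row = {};
while (fetchRow(row)) {
    rows.push(row);
    row = {};
}
```

After the loop, rows holds one object per item, and any further call to fetchRow returns false.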


This example was tested in commercial BIRT iHub 3.1 and OS BIRT 4.4.2, and it should work in earlier versions of BIRT. E4X has since been deprecated and removed from Firefox's JavaScript engine, but these versions of BIRT bundle a Mozilla Rhino release that still supports it.
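
If you are unsure whether the Rhino in your BIRT version has E4X, a quick check is to test for the global XML constructor before using it. This is a sketch, and hasE4X is a hypothetical helper name.

```javascript
// Returns true when the engine exposes the E4X XML constructor.
// typeof on an undeclared name does not throw, so this is safe everywhere.
function hasE4X() {
    return typeof XML !== "undefined";
}

var e4xAvailable = hasE4X();
```

When e4xAvailable is false, the script would have to fall back to the Java DOM or SAX packages.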
