Discussions
Categories
Groups
Community Home
Categories
INTERNAL ENABLEMENT
POPULAR
THRUST SERVICES & TOOLS
CLOUD EDITIONS
Quick Links
MY LINKS
HELPFUL TIPS
Back to website
Home
Intelligence (Analytics)
XLS file is too large to open. Other options? CSV?
gauglerb
Greetings,
I am using BIRT Report Engine 4.2 API to invoke the engine and generate reports from my web application. PDFs and XLS generate fine, however, the XLS file sizes are too large to open. (20-50MB) The reports can contain up to 60,000 rows of data, and I am assuming because of formatting- the resulting XLS are large, and therefore hard to open.
My first attempted fix was to simply create a plain grid, insert data without any formatting on the designer, and generate, but the resulting XLS files are still the same size.
Is there a way to configure the report or emitter to drop all formatting for XLS types? I am not truly sure why the XLS are so large anyways. I open them up in notepad and it contains serious amounts of hex values.
Is there a replacement emitter for 4.2 to generate cleaner and more efficent XLS files? (I haven't been able to implement any 3rd party emitters in 4.2 for using in the report engine API, perhaps there are better instructions out there I can't find?)
Ideally, I could simply install a CSV emitter and just be done with it, however I cannot seem to get the CSV emitter plugin working either for 4.2. The state of this contest project is unknown anyways.
What is the best option and approach to take? I also saw in BIRT there are CSV data extraction methods but that appears to only extract the data of a result set and not the entire contents of the report. Is that true, is this a possibility?
What is the best route to take my report and obtain a useable XLS or CSV file?
Thanks, Brian
Find more posts tagged with
Comments
kclark
You could try with the <a class='bbc_url' href='
http://www.birt-exchange.com/be/marketplace/app-showcase/?app=102'>SpudSoft
xls emitter</a> instead of the default. I think you should have better results with it, let me know if this helps with your issue.
gauglerb
<blockquote class='ipsBlockquote' data-author="'kclark'" data-cid="112342" data-time="1355252915" data-date="11 December 2012 - 12:08 PM"><p>
You could try with the <a class='bbc_url' href='
http://www.birt-exchange.com/be/marketplace/app-showcase/?app=102'>SpudSoft
xls emitter</a> instead of the default. I think you should have better results with it, let me know if this helps with your issue.<br /></p></blockquote>
<br />
Thanks, I had not yet come across this one in my research. I will try to monkey around with it and see if it helps. Not really any options or tricks within BIRT framework? Thanks again
Yaytay
<blockquote class='ipsBlockquote' data-author="'gauglerb'" data-cid="112347" data-time="1355257819" data-date="11 December 2012 - 01:30 PM"><p>
Thanks, I had not yet come across this one in my research. I will try to monkey around with it and see if it helps. Not really any options or tricks within BIRT framework? Thanks again<br /></p></blockquote>
<br />
I wrote it, and I use it with the report engine in a web service application, so it shouldn't be difficult to get working.<br />
<br />
I use maven, and as long as I declare the POI dependencies, then the Excel emitter dependency and then the BIRT dependency it works just fine.<br />
Note that it's not possible to remove the default emitter if you use BIRT via maven so be ure to specify the emitter ID when you render (actually it might be possible to remove the default emitter using exclusions, but I don't bother).<br />
<br />
Let me know if you have any trouble.<br />
<br />
Incidentally, I doubt that it's the file size that's causing you problems, but rather the hard limit of ~65000 rows that Excel versions before 2007 have.<br />
If you want >650000 rows you need to be sing XLSX files.<br />
<br />
Jim
gauglerb
<blockquote class='ipsBlockquote' data-author="'Yaytay'" data-cid="112376" data-time="1355340100" data-date="12 December 2012 - 12:21 PM"><p>
I wrote it, and I use it with the report engine in a web service application, so it shouldn't be difficult to get working.<br />
<br />
I use maven, and as long as I declare the POI dependencies, then the Excel emitter dependency and then the BIRT dependency it works just fine.<br />
Note that it's not possible to remove the default emitter if you use BIRT via maven so be ure to specify the emitter ID when you render (actually it might be possible to remove the default emitter using exclusions, but I don't bother).<br />
<br />
Let me know if you have any trouble.<br />
<br />
Incidentally, I doubt that it's the file size that's causing you problems, but rather the hard limit of ~65000 rows that Excel versions before 2007 have.<br />
If you want >650000 rows you need to be sing XLSX files.<br />
<br />
Jim<br /></p></blockquote>
<br />
OK, thanks! I got it working no problem at all, also in a web service. Just set the emitterID in the code and worked like a charm. <br />
<br />
The real issue was the default BIRT Excel Emitter. The XLS files that are created are clogged with something that makes the file sizes extremely large. And I don't have to tell you that, because you made the alternative XLS emitter.<br />
<br />
For my 20,000 row result set:<br />
Default BIRT XLS ExcelEmitter: 26MB file<br />
Open in Mic. Excel. (Takes 20 minutes to load, but it finally does)<br />
Without changing anything, Save As .xls and replace current existing file.<br />
Resulting file size is 9MB without any noticeable difference looking at the spreadsheet.<br />
<br />
Thanks for the plugin.
Yaytay
<blockquote class='ipsBlockquote' data-author="'gauglerb'" data-cid="112380" data-time="1355344147" data-date="12 December 2012 - 01:29 PM"><p>
The real issue was the default BIRT Excel Emitter. The XLS files that are created are clogged with something that makes the file sizes extremely large. And I don't have to tell you that, because you made the alternative XLS emitter.<br /></p></blockquote>
<br />
The default emitter doesn't actually generate XLS files at all - it generates files that match an XML format specified by Microsoft for data interchange.<br />
Unfortunately Microsoft didn't see fit to allocate a specific extension to the format, so XLS is used to tell Windows to open Excel - but recent versions of Excel then get uppity about the format not matching the extension (and recent versions of PowerPoint don't open the files at all!).<br />
I haven't looked into it in depth, but I think there are a few reasons for the files being large:<br />
1. I don't think the interchange format uses styles, so format information is repeated a lot (I may be wrong on this one).<br />
2. The xml format is designed to be semi-human readable, unlike the XML formats used in XLSX files which are very terse (and XLS files are binary, so inherently smaller).<br />
3. The files aren't compressed, unlike XLSX files.<br />
<br />
Proper XLS[X] files have an internal set of styles and each cell is formatted according to a style.<br />
<br />
All just FYI.<br />
<br />
Jim
gauglerb
<blockquote class='ipsBlockquote' data-author="'Yaytay'" data-cid="112381" data-time="1355344755" data-date="12 December 2012 - 01:39 PM"><p>
The default emitter doesn't actually generate XLS files at all - it generates files that match an XML format specified by Microsoft for data interchange.<br />
Unfortunately Microsoft didn't see fit to allocate a specific extension to the format, so XLS is used to tell Windows to open Excel - but recent versions of Excel then get uppity about the format not matching the extension (and recent versions of PowerPoint don't open the files at all!).<br />
I haven't looked into it in depth, but I think there are a few reasons for the files being large:<br />
1. I don't think the interchange format uses styles, so format information is repeated a lot (I may be wrong on this one).<br />
2. The xml format is designed to be semi-human readable, unlike the XML formats used in XLSX files which are very terse (and XLS files are binary, so inherently smaller).<br />
3. The files aren't compressed, unlike XLSX files.<br />
<br />
Proper XLS[X] files have an internal set of styles and each cell is formatted according to a style.<br />
<br />
All just FYI.<br />
<br />
Jim<br /></p></blockquote>
<br />
<br />
That all makes a lot more sense now. Thank you for the information, and I suspect you are correct about all of your assumptions based on what I'm seeing in the resulting output files. BIRT really should incorporate your plugin into the base code as an alternative emitter. A CSV emitter too! Thanks fellas