Intelligence (Analytics)
Importing Flat Data sources
katanga
Hi, I'm trying to generate a BIRT report from several log files. I want to create a data source from these logs, but my problem is the folder structure. My log folders look something like this:
[html]
<my_directory_root_logs>/<yyyyddmm_subdir1>/<file1_log>.csv
                                           /<file2_log>.csv
                                           /<file3_log>.csv
                                           /.........
                                           /.........
                                           /<fileN_log>.csv
<my_directory_root_logs>/<yyyyddmm_subdir2>/<file1_log>.csv
                                           /<file2_log>.csv
                                           /.........
                                           /.........
                                           /<fileM_log>.csv
[/html]
Each day I need to import the new CSV files, and if possible I want to do this dynamically from the several subdirectories, rather than creating a new data set manually every day.
Thanks in advance
Comments
bhanley
Once you create a flat file data source, you can modify the path dynamically in the scripting layer. <br />
<br />
1) Click on the Data Source in the Data Explorer <br />
<br />
2) Then select the "Scripts" tab of the edit canvas. <br />
<br />
3) Make sure you are modifying the "beforeOpen" script. In this event you can re-build the path to the CSV directory.<br />
<br />
<pre class='_prettyXprint _lang-auto _linenums:0'>
// beforeOpen: repoint the flat file data source before it opens
var folder = "Some path including a date";
this.setExtensionProperty("HOME", folder);
</pre>
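The "path including a date" can be built from the current date. A minimal sketch in plain JavaScript, assuming the `yyyyddmm` directory naming shown in the question's example tree (the root path and function name here are illustrative, not from BIRT):

```javascript
// Sketch: build a "<root>/<yyyyddmm>" folder path for a given date.
// The yyyyddmm layout (day before month) is an assumption taken from
// the question's example tree; adjust to your real naming convention.
function dailyLogFolder(root, date) {
    var yyyy = date.getFullYear();
    // getMonth() is zero-based, so add 1; pad month and day to two digits
    var mm = ("0" + (date.getMonth() + 1)).slice(-2);
    var dd = ("0" + date.getDate()).slice(-2);
    return root + "/" + yyyy + dd + mm;
}

var folder = dailyLogFolder("/my_directory_root_logs", new Date(2024, 0, 5));
// folder is "/my_directory_root_logs/20240501" (Jan 5 as yyyyddmm)
```

Inside the beforeOpen event you would then pass the result to `this.setExtensionProperty("HOME", folder)` as above.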
<br />
Good Luck!
katanga
Thanks a lot. I've tried it and it works. Unfortunately, it solves only the first problem (changing the directory dynamically).
The second problem I'd like to solve is this: how can I read the several CSV files in each subdirectory and load them into one data source (or several data sources)? Not manually, but dynamically. Below is a pseudo-algorithm for what I would like to get:
[html]
for each subfolder in root_folder
    for each csv_file in subfolder
        data_source.append(csv_file.data)       ----> GLOBAL DATA SOURCE
        or
        data_sources(i) = new data_source
        data_sources(i) = load(csv_file.data)   ----> ARRAY OF DATA SOURCES
    endfor
endfor
[/html]
Thanks for your help!!!
bhanley
The specific CSV is not bound to the data source; it is bound to the data set (notice that you never actually select a file when you set up the FF data source). So once you update the path for the data set, you can tweak which file(s) are processed as a result.<br />
<br />
If you can consolidate your logs into a single file, then a single data set can simply consume that file (no tweaking of the data set required). If you have multiple files each time, the data sets map to the files 1-to-1. If the number of files is different each day that makes things even more complex (clearly).<br />
<br />
Here is what I would do:<br />
<br />
If you can consolidate all logs into a single file (you reference <em class='bbc'>csv_file.data</em>) then I would use the Flat File data source, modify the path as I laid out earlier and then point the data set to the consolidated file. This should work well.<br />
<br />
If there must be multiple files (and even more so if the file count changes) then I would consider using a Scripted Data Source. This leverages a POJO to implement the processing logic against the files and a bit of JavaScript on the report to fire off that POJO. Given the inherent complexity, it will be a lot easier to manage the business logic this way.<br />
<br />
If you need to go with a scripted data source, let me know and I can get you started.<br />
<br />
Good Luck!
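<br />
For orientation, a BIRT scripted data set fires its open event once and its fetch event once per row. The shape of that logic, sketched in plain JavaScript (the row fields and the POJO call are illustrative assumptions; in the real report, open/fetch live in the data set's script events and fetch assigns to row columns and returns true/false):

```javascript
// Sketch of the scripted-data-set pattern: open builds the row list
// (here an in-memory array standing in for the POJO's output), and
// fetch hands back one row per call until the data is exhausted.
var rows;      // filled in by open
var rowIndex;  // cursor advanced by fetch

function open() {
    // In the real report this would invoke the POJO that walks the
    // log folders and returns the combined records.
    rows = [
        { file: "file1_log.csv", value: 1 },
        { file: "file2_log.csv", value: 2 }
    ];
    rowIndex = 0;
}

// Returns the next row, or null when there is no more data
// (BIRT's fetch event returns false instead of null).
function fetch() {
    if (rowIndex >= rows.length) return null;
    return rows[rowIndex++];
}

open();
var first = fetch(); // { file: "file1_log.csv", value: 1 }
```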
katanga
First of all, thanks again for your fast and clear answer, and sorry for my English. Now, some comments.
The first solution is the "easiest" way to solve my problem, and it will surely be the implemented one. On the other hand, I've been thinking about future, more complex applications (for example, generating different reports from several different log sources and formats), and I would like to see how to implement the scripted solution. Help is always welcome; I'd be grateful if you could show me some examples.
One last doubt: when you say "the data sets map to the files 1-to-1", does that mean that if I get a single log file for each day, I could "create" one data set (not data source) dynamically for each file?
Thank you very much
weasley
<blockquote class='ipsBlockquote' data-author="bhanley"><p>
If there must be multiple files (and even more so if the file count changes) then I would consider using a Scripted Data Source. This leverages a POJO to implement the processing logic against the files and a bit of JavaScript on the report to fire off that POJO. Given the inherent complexity, it will be a lot easier to manage the business logic this way.</p></blockquote>
<br />
I need to report on a few CSV files, and I made a POJO that combines them into a Vector. My problem is that when the CSVs together are around 6MB, the report doesn't work and the server runs out of memory. I have one question: how would you send the data from your POJO over to the report as needed, rather than as one object?