Looping through a data set to construct a new data set?

deejThomson
edited February 11, 2022 in Analytics #1

I am hella new to BIRT and my google-fu is failing me... so I am hoping the fine people here might be able to point me in the right direction. I have a data set of time-series data that I need to iterate over to find all the groups of values over a certain constant. I need to know two things: 1) when the data crosses the constant and 2) for how long the data stays greater than the constant. I then need to load those two things (time and duration) into a second data set, which is what is displayed in my report. BIRT is the tool I need to use and I am unclear how best to solve this. If I were doing this entirely programmatically it would be something like this (forgive the pseudo-Python code below; there may be logic errors in it, but it should hopefully be clear enough what I am trying to do):

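    from datetime import timedelta

    # dataSet: time-sorted rows with .value, .timestamp and .tagName
    MAX_GAP = timedelta(minutes=5)
    outData = []
    i = 0
    while i < len(dataSet):
        x = dataSet[i]
        if x.value >= constant:
            # start a new group at the first reading over the constant
            startPoint = endPoint = x.timestamp
            i += 1
            # extend the group while readings stay over the constant
            # and each one arrives within 5 minutes of the previous one
            while (i < len(dataSet) and dataSet[i].value >= constant
                   and dataSet[i].timestamp <= endPoint + MAX_GAP):
                endPoint = dataSet[i].timestamp
                i += 1
            outData.append((x.tagName, startPoint, endPoint - startPoint))
        else:
            i += 1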

I know one option would be to do this data crunching on the database side and then just feed the cleaned data to a BIRT report. Ideally, though, I would like to feed BIRT the raw data and have the report do the cleaning; with my known constraints, that currently seems the most straightforward approach.

Anyway, any direction, help, or suggestions are greatly appreciated. I just started really looking at BIRT a few days ago, so my understanding of what it can do is neophyte at best and comical at worst.

Comments

  • In the attached report, I added a computed column to show the date when the threshold is exceeded and a second computed column to show the duration when the value drops below the threshold. There is a filter on the table to filter out all other rows. I used hidden report parameters as global variables because they are easy to insert into computed columns.

    Your duration calculation will differ from the one in the example, since your data is different, but the method will be similar. Two notes on the duration calculation:

    1. The duration calculation in this example computes the difference between the first date over the threshold and the last date over the threshold. Since the last date over the threshold is not known until the row value drops below the threshold, the duration calculation does not trigger until the row after the last date above the threshold. Therefore, the code saves the current row date to a parameter so it can be used in the following row's calculation.

    2. A dummy row was added to the query to be able to compute the duration when the last value in the result set is above the threshold as explained above.
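The row-by-row logic described in the comment above can be sketched outside BIRT. The following is a plain-Python illustration, not the code from the attached report: the function name `find_runs`, the `state` dictionary (standing in for the hidden report parameters used as global variables), and the sample threshold are all assumptions made for the example. It also leaves out the 5-minute gap rule from the original question.

```python
from datetime import datetime, timedelta


def find_runs(rows, threshold):
    """Return (start, duration) for each run of values >= threshold.

    rows: iterable of (timestamp, value) pairs, sorted by timestamp.
    """
    # State that survives from one row to the next. In the report this
    # role is played by hidden report parameters (an assumption here).
    state = {"run_start": None, "prev_ts": None}

    # Dummy row below the threshold, so a run that reaches the end of
    # the data is still closed and gets a duration.
    sentinel = (None, float("-inf"))

    out = []
    for ts, value in list(rows) + [sentinel]:
        start_col = None     # "computed column" 1: start of the run
        duration_col = None  # "computed column" 2: length of the run

        if value >= threshold:
            if state["run_start"] is None:
                state["run_start"] = ts  # first row over the threshold
        elif state["run_start"] is not None:
            # First row after the run: the previous row was the last one
            # over the threshold, so the duration can be computed now.
            start_col = state["run_start"]
            duration_col = state["prev_ts"] - state["run_start"]
            state["run_start"] = None

        # Save the current row's date for the next row's calculation.
        state["prev_ts"] = ts

        # Equivalent of the table filter: keep only rows with a duration.
        if duration_col is not None:
            out.append((start_col, duration_col))
    return out


if __name__ == "__main__":
    base = datetime(2022, 2, 11, 12, 0)
    sample = [(base + timedelta(minutes=m), v)
              for m, v in [(0, 5), (1, 12), (2, 15), (3, 3), (4, 11)]]
    for start, duration in find_runs(sample, threshold=10):
        print(start, duration)
```

In this sketch the sample data ends on a value over the threshold, so the final run is only reported because of the dummy row. A run that contains a single reading gets a duration of zero, which may or may not be what a real report should show.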

  • Amazing! This is more than what I needed, so thank you. Now off to apply tweaks and whatnot. I was deep into the rabbit hole of scripted data sets and global variables.