Filtering in datasets to return only most recent row with a given ID?

jhalper
edited February 11, 2022 in Analytics #1
I have two datasets, both sourced from an XML document, which are participating in a joint data set (LEFT outerjoin). The right-side table in the outerjoin is the one that needs filtering.

The dataset rows have a String ID and a Date field (among other fields), and I would like the dataset feeding into the joint dataset to return only the most recent row for each distinct String ID. I've played with using Total.isTopN() in a filter expression using the Date field as the TopN value, and "row" the optional grouping, but haven't gotten anywhere. I haven't seen documentation that grouping-type capability would work in a dataset, but I had to give it a try.

Is this type of more advanced dataset filtering possible? The examples I've seen are rather basic.

Using BIRT 2.1.2.

Thanks,
Jonathan

Comments

  • Virgil Dodson
    Virgil Dodson E admin
    edited December 31, 1969 #2
    Hi Jonathan,

    Can you add a simple XML structure example to this post? Is it one XML file that has two data sets or two XML files?

    Virgil
    Warning No formatter is installed for the format ipb
  • jhalper
    edited December 31, 1969 #3
    <blockquote class='ipsBlockquote' data-author="vdodson"><p>Hi Jonathan,<br />
    <br />
    Can you add a simple XML structure example to this post? Is it one XML file that has two data sets or two XML files?<br />
    <br />
    Virgil</p></blockquote>
    <br />
    I've attached a sample report and data. It's one XML data file with two data sets. The data point names and data is all mocked up (there are several more data points in the real datasets), but it's pretty straightforward. I need all data from dataset 1, but filter down to only the most recent record for dataset 2. In this case, I would need the Apr 1 (no foolin') entry for AAA, as well as the BBB entry to join to dataset 1. <br />
    <br />
    In the real dataset I'm working with, once the data is joined this way, I need to calculate a few computed columns, then sort based on one of those computed columns, presenting a TopN filtered view of the sorted results.<br />
    <br />
    Thanks again.
  • Virgil Dodson
    Virgil Dodson E admin
    edited December 31, 1969 #4
    Hi Jonathan,

    The ideal situation is if you can filter the data prior to getting the XML file. That way you are not pulling lots of extra information that won't be used on the report. If that part is out of your control, then you can use filtering at several levels as you have already figured out. You can filter you original data sets, or your joint data set, or you can apply filtering on the table and within group levels.

    To make this example work like you expected, I first sorted the data by ID acsending, and date decending so the newest date for each ID would move to the top. This sorting can happen within your Data Sets or I just did the sorting on the table. I added a group level to the table on the ID field and then moved all the fields out of the detail section and up to the group. Then just hide the detail row with the visibility property. The result is that I only get one row for each unique ID and it is the row with the newest date.

    The finished example is enclosed but I used BIRT 2.2 to modify it so you might not be able to open it with BIRT 2.1.2 which is why I put the details above.
    Warning No formatter is installed for the format ipb