LiveReport for Duplicate documents from a folder and its subfolders
Khalid_Omar
The default OT report runs against the entire repository and times out because it covers the whole database. I tried another report from the OT KB, but that one returns every version of each document as well.

I want to select a folder and get a report of duplicate documents across its subfolders, showing the document name, dataid, username, and parent folder name. I would also like a modified query that selects a user and lists all of his duplicates. We have users who copy documents to several locations, which creates version/update problems.

Thanks...
Wilson
Comments
Tim_Hunter
How do you define a duplicate? Just the document name? Name + owner? Something else?
Jim_Coursey
MIME type should be one of the criteria, and you'd want the sizes to be pretty close; I don't think all "duplicates" will have exactly the same size. I'd like the index info to be "identical" too, but that's asking a lot from a comparison engine.
Issues relating to duplicates are difficult to resolve, and impossible to resolve with a purely technical solution. A good part of the whole solution is covering duplicates in the overall leadership, governance, and training: leadership to let users know that it isn't acceptable to add undesired duplicates to the system, governance to define what a duplicate is and when (and if) duplicates are desirable, and training so that all users know the rules and why they should be followed. Leadership, governance, and training should each obviously cover a lot of other things as well.
Khalid_Omar
Tim/Jim,

I agree with Jim's statement, and we do have governance, training, etc., but at times we face tough customers who have their lead's support. At this point we are tasked with showing a report of the duplicates created by this user. I found a report query in the OT KB and it works partially. I will copy the query and its output below.

    select dataid, name, parentid, datasize
    from dtree, dversdata
    where name in (select name
                   from dtree
                   connect by parentid = prior dataid
                   start with dataid = %1
                   group by name
                   having count(name) > 1)
      and dtree.dataid = dversdata.docid
      and %2
      and name != 'customview.html'
    order by name

The report runs from a selected folder with filter permissions applied, and the following result comes up.

    DATAID   NAME                          PARENTID  DATASIZE
    5984389  0 Final Insp Report List.xls  1716355   1715200
    6933357  0 Final Insp Report List.xls  6933229   505344
    5984389  0 Final Insp Report List.xls  1716355   1719296
    5984389  0 Final Insp Report List.xls  1716355   1720832
    5984389  0 Final Insp Report List.xls  1716355   1709568

I would like the report streamlined to return only the current version of each document, not all versions as shown above. If it also had a field showing the parent folder name instead of the parentid, that would tackle my problem to a great extent. If the report could be further filtered to a particular user, that would help us prove our case.

Thanks...
W
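One possible way to keep the datasize column while returning only the current version is to match each document to the single dversdata row whose version number equals the document's current version. The sketch below is untested and rests on an assumption about the schema: that dtree carries the current version number in a versionnum column and that dversdata numbers each stored version in a version column. Check both names against your own database before relying on it. %1 and %2 are used exactly as in the KB query.

```sql
-- Sketch only, not verified against a live schema.
-- Assumption: dtree.versionnum = current version number,
--             dversdata.version = number of each stored version.
select dataid, name, parentid, datasize
from dtree, dversdata
where name in (select name
               from dtree
               connect by parentid = prior dataid
               start with dataid = %1
               group by name
               having count(name) > 1)
  and dtree.dataid = dversdata.docid
  and dtree.versionnum = dversdata.version  -- keep only the current version's row
  and %2
  and name != 'customview.html'
order by name
```

With that extra join condition, a document with five versions should contribute one row instead of five, so the repeated 5984389 rows in the output above would be expected to collapse to one.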
Jim_Coursey
I sympathize! Those of us who haven't been in the same place really haven't lived yet. We have to remember "The customer is always right, even when he's wrong or she's stubborn." As long as they put their documents in Livelink, everyone wins to a certain degree. Plus, it sets the stage for improved content management over time.
Tim_Hunter
If you don't want to see all versions, take out the join to dversdata; dtree should contain everything you need.
Khalid_Omar
Thanks Tim. I removed the dversdata table, as I can live without the datasize that comes from it, and modified the query to add the parent folder name using a connect by clause.

"Duplicate names" are not an easy thing, as the same name is possible in different containers. So the report has to be analysed further to find the actual duplicates, and I will leave that to the DRM team. Also, the report should be run from a lower-level folder, or it can crawl the server.

Thanks to you both...
Wilson
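The final SQL is not posted in this thread. As a rough illustration of the kind of query described here (dversdata dropped, parent folder name and username added), the sketch below is one possibility and is not the query Wilson actually ran. It looks up the parent folder name with a correlated subquery instead of the connect by approach he mentions. It assumes that dtree.userid holds the owning user's ID and that user names live in a kuaf table with id and name columns; treat both as assumptions to verify.

```sql
-- Sketch only, untested; not the poster's final query.
-- Assumptions: dtree.userid = owner's user ID; kuaf(id, name) holds user names.
select dataid,
       name,
       (select p.name from dtree p where p.dataid = dtree.parentid) as parentname,
       (select k.name from kuaf k where k.id = dtree.userid) as username
from dtree
where name in (select name
               from dtree
               connect by parentid = prior dataid
               start with dataid = %1
               group by name
               having count(name) > 1)
  and %2
  and name != 'customview.html'
order by name
```

To report on a single user, one option would be to add a condition such as `and userid = %3`, where %3 is a hypothetical third report parameter supplying that user's ID. Because the subquery walks every descendant of the folder passed as %1, starting from a lower-level folder keeps the traversal small, as noted above.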
Sheaffe_Monteith
Hello, I was wondering, could I have a copy of the final SQL that you ran? It sounds like something I am trying to do now.

Thanks,
Sheaffe