RETURN_TOP limit

paul_wms
edited February 27, 2021 in Documentum #1

I'm using IDfQuery.execute() to run a DQL query that is expected to return ~14m results. This is on a D6.7 docbase with an Oracle DB. I'm trying it with "ENABLE(RETURN_TOP limit)" hint in order to try to improve performance. When I set the limit to 999,999, the query returns within a few minutes without a problem. But when I increase the limit to 19,999,999 in order to try to get the complete result set, I immediately get DM_SESSION_E_DEADLOCK exception every time. Am I exceeding some upper limit for the RETURN_TOP hint?

Tagged:

Comments

  • I would think this is dependent on Oracle DB. Curiously, what are you doing that you need to bring back 1M records? Are you searching on fulltext or generic attribute values?

    There are other ways to improve performance instead of returning TOP 1M rows.

  • This is preparation for a content migration effort Johnny. Was trying to get counts of objects. Not a user function, so performance is not critical. I had heard that the return_top hint could speed up the query which was exceeding my VPN timeout, but when I tried it at 19m, I got the deadlock error which didn't seem like an accurate error message. I've since used trial and error to discover that (at least for this configuration) 8.9m executes without error, but 9m errors out right away. Curious!

  • Ah - what's your query? I have never had problems running select count(*) from ...

  • I'm not using count(*) because I want a count of each object type in the object model for each cabinet. In handling the result set, my tool does the counting as it loops through the collection. Here is the simple query that fails with the DEADLOCK exception:

    SELECT d.r_object_id, d.i_chronicle_id, d.r_object_type, d.r_is_virtual_doc 
    FROM my_document d WHERE FOLDER('/[cabinet name]', descend) 
    enable(return_top 9000000)
    go
    

    I can use a different strategy to generate the info I need, but I was hoping to find the reason why this happens when some magic number is exceeded in that return_top hint for an Oracle repo.

  • Return_top is implemented differently for different databases, so its not standard SQL. Instead of trying to understand why, I just figure out what needs to get the job done and do it sometimes ?

  • Yeah, me too. But sometimes an unanswered question nags at me incessantly and I have to see if I can figure it out in the background while I'm pursuing workarounds. Know what I mean? ? Thanks for the replies.

  • Have you submitted ticket to OpenText support? They can probably tell you why - albeit, I couldnt find anything in the DQL documentation.

  • crittedv
    edited March 2, 2021 #9

    I'm not using count(*) because I want a count of each object type in the object model for each cabinet.

    I'm probably oversimplifying your requirement but what about a group by?

    select r_object_type, count(*) from my_doc where folder('/wherever',descend) group by r_object_type
    

    I'm only suggesting this based on your query - you're only going to get one entry (r_object_id) per i_chronicle_id as you're not searching on all versions "my_doc (ALL), or ENABLE(ROW_BASED)", so you really do just seem interested in the counts of types & virtual docs...

    Pretty sure you could even do:

    select r_object_type, r_is_virtrual_doc, count(*) from my_doc where folder('/wherever',descend) group by r_object_type, r_is_virtual_doc
    

    And if you wanted to speed this up (your reason for using RETURN_TOP), make an index on the attributes & type you're querying.

  • I like the idea of group by condition as it can do the calculations at the source instead of processing through custom code.

    Can you try below query and see if it meets your expectations. This will count your object types per cabinet

    SELECT 

    d.i_cabinet_id, d.r_object_type, d.r_is_virtual_doc 

    FROM my_document d 

    group by 

    d.i_cabinet_id, d.r_object_type, d.r_is_virtual_doc

    order by 

    d.i_cabinet_id

  • Apologies for not getting back to this sooner. My initial post was to try to understand what the upper limit was for the return_top hint in an Oracle docbase. I was able to complete my investigation of the docbase's content profile in other ways. So, it was never a major road block for me. But thanks for the replies and suggestions. Counting in groups (suggested by @crittedv ) is a great approach that I have used successfully.