Hello Documentum/DFC experts!
I am troubleshooting some performance problems in a Documentum 5.3 SP4 application, and I’m looking for ideas and recommendations about what could be going on, and what logs, tracing, or other lines of inquiry would be most useful.
The basic architecture:
- two content servers, one repository
- four Weblogic nodes
- thousands of concurrent users making perhaps dozens or hundreds of repository requests at any one time
Our logging output shows that the application suffers long delays when acquiring a DFC session, which severely impacts usability across the application. The problem occurs in all four web application instances. (I am working today to determine whether it also occurs in command-line applications that use the same code, or only within Weblogic.)
2009-06-18 18:34:57,249 Constructing session manager for user **** and storing on threadlocal for use by the current thread.
2009-06-18 18:34:57,249 Attempting to initialize DFC session manager for user **** on repository YYY
2009-06-18 18:34:57,249 Successfully initialized DFC session manager for user **** on repository YYY
2009-06-18 18:34:57,249 Setting a new thread-local session manager...
2009-06-18 18:35:20,437 Acquired session s2
2009-06-18 18:35:20,905 0.47 DFC getObjectByQualification dm_webc_config where object_name = '...'
2009-06-18 18:35:20,921 0.02 DFC select a_last_completion, a_current_status from dm_job where any method_arguments like '%0801bc828000b133%' order by a_last_completion DESC
2009-06-18 18:35:20,921 Cleaning up session s2
2009-06-18 18:35:20,921 Clearing current thread-local session manager...
In this example, acquiring the session took about 23 seconds (18:34:57,249 to 18:35:20,437). I’ve seen cases in the logs that took more than 60 seconds. Notice that once the session is acquired, the queries execute quickly (sub-second).
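For anyone who wants to measure these gaps across a larger log, here is a minimal Java sketch that computes the elapsed time between two log lines. It assumes every line starts with a 23-character timestamp in the same format as the excerpt above; the class and method names (LogGap, gapMillis) are made up for illustration and are not part of our application.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class LogGap {
    // Assumption: each log line begins with a timestamp like "2009-06-18 18:34:57,249".
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");
    private static final int TS_LENGTH = 23;

    // Returns the number of milliseconds between the timestamps of two log lines.
    static long gapMillis(String earlierLine, String laterLine) {
        LocalDateTime start = LocalDateTime.parse(earlierLine.substring(0, TS_LENGTH), TS);
        LocalDateTime end = LocalDateTime.parse(laterLine.substring(0, TS_LENGTH), TS);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        String before = "2009-06-18 18:34:57,249 Setting a new thread-local session manager...";
        String after = "2009-06-18 18:35:20,437 Acquired session s2";
        System.out.println(gapMillis(before, after) + " ms"); // prints "23188 ms"
    }
}
```

Running this over consecutive "Setting a new thread-local session manager" / "Acquired session" pairs would show how the delay is distributed, rather than relying on a few hand-picked examples.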
Why should it ever take 23 seconds to acquire a session?
Full Disclosure: I am not certain yet whether the delay is truly within the getSession call, or if it’s really context-switching or otherwise application/Weblogic related... I’ll be working today to add additional logging to the code to get at some of this deeper information.
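The kind of extra logging I have in mind is roughly the following: a small timing wrapper that records the thread name and the elapsed time immediately around a single call, so the getSession time can be separated from whatever happens before and after it. This is only a sketch. TimedCall and Result are hypothetical names, the call being timed is a stand-in, and it is written in current Java syntax rather than what our 2009-era JVM supports.

```java
import java.util.concurrent.Callable;

class TimedCall {
    // Holds the value returned by the timed call plus how long the call took.
    static final class Result<T> {
        final T value;
        final long elapsedMillis;

        Result(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    // Runs the call, measures only the time spent inside it, and logs the thread name
    // so slow calls can be correlated with specific Weblogic execute threads.
    static <T> Result<T> time(String label, Callable<T> call) throws Exception {
        long start = System.nanoTime();
        T value = call.call();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        System.out.println(Thread.currentThread().getName() + " " + label
                + " took " + elapsedMillis + " ms");
        return new Result<>(value, elapsedMillis);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real call. In the application this would wrap something like
        // sessionManager.getSession(repositoryName) (assumed variable names).
        Result<String> r = time("getSession", () -> {
            Thread.sleep(50);
            return "s2";
        });
        System.out.println("Acquired session " + r.value);
    }
}
```

If the time reported by the wrapper is small while the gap between log lines is still large, the delay is outside getSession (thread scheduling, locking in our own code, and so on) rather than in DFC or the content server.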
The session manager the application uses has typically been created within the previous few seconds. In some cases it may have been pulled from a pool of longer-lived session managers, but it is never more than a few minutes old.
My first thought was that we were hitting some connection/session pool threshold, so I dug into server and client settings…
concurrent_sessions
In Production, the concurrent_sessions property of server.ini on both content servers is currently set to 100, which suggests the application could simply be opening too many sessions against Documentum. But the Sessions listing in DA never shows more than about 60-70 sessions. This could mean we aren’t hitting the limit of 100, or that DA is not showing all the sessions, or that Documentum begins throttling before it reaches the maximum. I need to study up on this one.
Tonight we’ll be bumping that value up to 1000. I have hopes, but given what I’m seeing in DA I’m not so confident this is going to help.
max_session_count
In Production, the max_session_count property of dmcl.ini/dfc.properties on all the web application servers is set to 1000. We chose that value at launch because our performance tests showed that 10 and 100 were too few.
As I said, I’m working today to add more logging code so I can get a more granular view of what’s happening. I won’t get to see the results of those changes until Tuesday at the earliest.
Also, it’s unlikely I’m going to have an opportunity to do any serious profiling in Production, although I’m not ruling it out if it looks like our best chance of figuring things out soon.
Could the multiple content servers be adding some overhead to session creation? Seems like a session manager would always be bound to a specific content server, but I really don’t know much about this.
I’m looking for ideas and recommendations about what could be going on here, and what logs, tracing, or other lines of inquiry would be most useful. I’m not very well-versed in content server tuning and tracing and logs, so any pointers would be appreciated!!
Thanks!
-Jason
__________________
Jason Duke
Senior Consultant
Blue Fish Development Group