Home
TeamSite
Workflow Successor Tasks and Performance
dootndo2
My question is regarding performance on a TeamSite box. We are currently using TS6.1 SP2 on Windows 2000. I wrote a dynamic workflow that returns to the start task and can repeat this cycle as many times as the user wants, until they decide to end the workflow or submit. To accomplish this, the workflow builds all of the possible paths ahead of time when it constructs the XML document. Depending on how it is set up, there can be a large number of dynamically generated tasks (successors and predecessors). Is there a maximum number of allowed tasks and paths, or a recommended amount?
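For a rough idea of what I mean, here is a very stripped-down sketch of the kind of spec the Perl builds at instantiation time (the task and element names are illustrative rather than the exact DTD, and the real spec is far larger):

    #!/usr/bin/perl -w
    use strict;

    # Stripped-down sketch only: task and element names below are
    # illustrative, not the exact job-spec DTD for any given release.
    my @paths = ("notify_groupA", "notify_groupB", "notify_groupC");

    my $spec = qq{<workflow name="review_loop">\n <cgitask name="start">\n  <successors>\n};
    $spec .= qq{   <succ v="$_"/>\n} for (@paths, "submit", "end");
    $spec .= qq{  </successors>\n </cgitask>\n};

    # every optional path exists up front, and each one loops back to the start task
    for my $p (@paths) {
        $spec .= qq{ <externaltask name="$p"><successors><succ v="start"/></successors></externaltask>\n};
    }
    $spec .= qq{ <submittask name="submit"/>\n <endtask name="end"/>\n</workflow>\n};
    print $spec;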
My question stems from the fact that I have been told our workflow will crash the TeamSite server. I wanted to know whether that sounds plausible.
Thanks for looking at my post.
NOTE: I have no access to the server for administration and am not an administrator on the box. I have no materials regarding these issues, hence my question.
Comments
iwovGraduate
Large number of tasks -- how many tasks are you talking about here?
5? 10? 100? 1000? More?
Adam Stoller
Here are some things to keep in mind (corrections welcome):
The number of tasks in a job, once instantiated, is static.
Best practice: try to keep the number of tasks per job to something on the order of 25 or fewer. This is not a hard rule, but a workflow with 100 tasks is generally unwieldy and can lead to performance issues (*).
The number of times you can transition into/out of a given task during the lifespan of the job is dynamic.
There's nothing specific about this that would affect performance, other than having jobs sit around in the queue for very long periods of time (*).
The number of tasks represented in the workflow backing store is dynamic, based upon the number of jobs that are active and the number of tasks defined for each of those jobs.
[*] I don't know what the exact numbers are, but I believe it used to be (5.5.2) that something like 2000 (or 20,000?) tasks within the workflow backing store would tend to lead to performance degradation of the server. So if you think of a job with 50 tasks and imagine 40 such jobs active at the same time, you'll have 2000 tasks defined within the workflow backing store; alternatively, if you have a job with 10 tasks but it's kept around for years (**) and you eventually have 200 of these jobs lying around, that too would get you up to 2000 tasks defined within the workflow backing store.
[**] The notion of keeping jobs alive for extended periods of time is generally "bad" for several reasons, but perhaps the simplest is that the behavior of workflow processes across server shutdowns, power outages, and crashes is inherently indeterminate, so depending on very long life-cycles for jobs is generally unrealistic. Also, any files associated with those jobs will lead to potential issues if attempts are made to add them to other jobs, or if they are modified/locked outside the workflow context while still associated with an existing job.
I don't think there is anything that categorically says you *cannot* do this - just "common sense" saying that you *should not* do this.
While I believe that performance degradation occurs around the 2000-task level, I don't believe it should cause the server to crash until much higher, if at all. So unless you're thinking of associating each of your 1,000,000+ assets with its own job and having that job be perpetually active, I'd think it fairly unlikely for you to reach this critical-mass point - but it still makes sense to try to avoid the scenario even at lower task-level thresholds.
--fish
Senior Consultant, Quotient Inc.
http://www.quotient-inc.com
james1
> The number of times you can transition into/out of a given task
> during the lifespan of the job is dynamic.
I don't know what you mean by "dynamic".
A single workflow *job* can have at most 1000 task transitions occur. For this reason, the original poster should not have a potentially neverending job.
Also, I am not so sure about your 2000 object threshold for performance degradation. I believe that it is currently higher than that by multiple orders of magnitude. I do not have specific numbers, nor do I know which release these improvements were made in.
-- James
--
James H Koh
Interwoven Engineering
Adam Stoller
> The number of times you can transition into/out of a given task during the lifespan of the job is dynamic.
> I don't know what you mean by "dynamic".
> A single workflow *job* can have at most 1000 task transitions occur. For this reason, the original poster should not have a potentially neverending job.
Hmm - by "dynamic" I meant that it's not something pre-determined at instantiation time.
Can you provide any kind of detail behind the 1000 task transition limitation? Like what aspect of the workflow system imposes this limitation and/or the rationale behind it?
(mind you, I think that's a pretty large number and not something that should be run into very often, but I am curious...)
> Also, I am not so sure about your 2000 object threshold for performance degradation. I believe that it is currently higher than that by multiple orders of magnitude. I do not have specific numbers, nor do I know which release these improvements were made in.
Hmm - well, looking through the forums, I can unfortunately only point to my own posts regarding this (and in at least one of them I also wasn't sure about the number of 0's at the end and listed it as possibly being between 20,000 and 50,000). I believe this number came from Evers at some point, but I no longer have my email archives from that far back to find the actual source.
I'm certainly happy to believe the number is larger than 2000 (chances are the improvements were made first in 5.5.2 [SP2b] and then perhaps more improvements in later SPs). However, since we're talking about it ... perhaps you can find someone who's done some of the benchmarking in the past and see if you can dig up the real numbers?
(I still think it's a bad idea to have large quantities of jobs in the system over an extended period of time - but perhaps it's more of a theoretical issue than a technical one...)
Thanks for the corrections!
--fish
Senior Consultant, Quotient Inc.
http://www.quotient-inc.com
Migrateduser
I too would like to know more about this limit on the maximum number of transitions in a single instantiated workflow job. 1000 seems like a small number to me; although I don't think I have had a job with anywhere near that number in 5 years, I do believe it would be possible, especially with external tasks and looping. Would you please elaborate on this limit?
-------------------------------
E-Mail :
farnsaw@stonedoor.com
james1
I haven't found any hard "threshold" numbers that I can share, but I can say that *significant* workflow store performance improvements were made in the TS6.1 and TS6.5 releases, compared to TS5.5.2 performance.
Thanks
-- James
--
James H Koh
Interwoven Engineering
dootndo2
Thanks to everyone for their input.
The specific scenario we are using (granted, it may not be optimised or follow best practices) has a CGITask with multiple successors. In some cases (depending on how we set it up) it can have 20 successor tasks, and for a full implementation it has around 580. These tasks are generated dynamically (through Perl code) upon instantiation of the workflow.
On the CGITask page (the start task), we return the callback depending on what values the user selects. The user can select from multiple dropdown values to determine who to send email notifications to. The email external task is the successor in all cases; it can send to 1, 2, or 3 groups or individuals. The successors are uniquely identified based upon the full path chosen in the CGITask. There are two additional callbacks: one ends the workflow without submitting, and one submits and ends the workflow.
Example possible flows:
CallBack[0] = CGITask1 -> SubmitTask -> EndTask
CallBack[1] = CGITask1 -> EndTask
CallBack[2] = CGITask1 -> ExternalTask1 -> GroupTask1 -> Back to CGITask1
CallBack[3] = CGITask1 -> ExternalTask1 -> GroupTask1 -> ExternalTask2 -> GroupTask2 -> Back to CGITask1
CallBack[4] = CGITask1 -> ExternalTask1 -> GroupTask1 -> ExternalTask2 -> GroupTask2 -> ExternalTask3 -> GroupTask3 -> Back to CGITask1
CallBack[...] = ... many more combinations, depending on the number of users and groups to send notifications to ...
It always returns to the same CGITask. The user then picks the path again and the content is sent through once more.
The point of this is that we need to define all of the paths at instantiation time, because after the emails are sent out and the chosen path is followed to completion, control returns to the CGITask. Hence, the CGITask acts like the controller for the workflow (like a dashboard). The user can decide to send the content through the workflow multiple times or not at all. Once the user is happy with it, they end the workflow through submit or end.
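To make the combinatorics concrete, here is a rough sketch of the kind of enumeration our instantiation Perl does (the recipient pool and names are made up for illustration; the point is just to show why the successor list grows into the hundreds):

    #!/usr/bin/perl -w
    use strict;

    # Illustrative numbers only -- not our production recipient list.
    my @recipients = map { "group$_" } 1 .. 8;   # pretend pool of groups/individuals
    my @callbacks  = ("submit_and_end", "end_only");

    sub enumerate {
        my ($sofar, $left) = @_;
        if ($left == 0) {
            # one unique callback (and task chain) per ordered pick of recipients
            push @callbacks, join("_then_", @{$sofar});
            return;
        }
        for my $r (@recipients) {
            next if grep { $_ eq $r } @{$sofar};   # no repeats within a path
            enumerate([ @{$sofar}, $r ], $left - 1);
        }
    }

    enumerate([], $_) for 1 .. 3;   # paths notify 1, 2, or 3 recipients
    printf "callbacks to wire up at instantiation: %d\n", scalar @callbacks;

With 8 possible recipients and paths of 1, 2, or 3, this already produces about 400 distinct callbacks, each needing its own chain of external/group tasks in the spec.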
I was told that having a large number of successor tasks would cause degradation on the server and cause it to crash. I didn't necessarily believe it, but was looking for more information. I would definitely be interested in knowing at what point a workflow would crash a server. In my case, the workflow has a potential of ~600 paths. Our volume would only be 2-4 workflows per week, with only about 13 total users.
Much thanks to all who have taken the time to view my posting.
dootndo2
Adam Stoller
> I can say that *significant* workflow store performance improvements were made in the TS6.1 and TS6.5 releases, compared to TS5.5.2 performance.
That sounds good - and it might also explain my recollection of 2000[0] tasks, as I'm pretty sure that was somewhere in the 4.5 to 5.5.2 time-frame.
Can you elaborate a bit on the 1000 transition limitation?
This sounds like something that *could* affect dootndo2 with their potential 581 successors and re-cycling through the workflow any number of times.
Is it 1000 transitions for the entire workflow? or 1000 transitions for a single task in a workflow?
What happens when the limit is hit? (I doubt it crashes the server)
If you can put together some good prose on this - you might also ship it to the TechPubs folks for inclusion in the next update to the WF manual.
Thanks.
--fish
Senior Consultant, Quotient Inc.
http://www.quotient-inc.com
james1
As I previously noted in this thread, and as was noted two years ago ;-), there is a hard limit of 1000 task transitions per job. Therefore, if your job always looped around your 5th possible flow, which seems to have 7 transitions, plus 1 to get back to the beginning, then you would only be able to run this loop 1000 / 8 = 125 times.
If your intent is seriously to let this loop run indefinitely, this alone should make you re-examine the design.
As mentioned by ghoti already, it is inefficient to have hundreds of superfluous tasks in the workflow store. Workflow query performance (i.e., the "My Tasks" list) is dependent on the number of objects in the workflow store. So if you instantiate 580 tasks but are in a loop utilizing only 8 of them, then you have 572 dead-weight tasks. Just 572 tasks isn't anything to worry about. But if, for some reason, you invoke this workflow frequently, then those 572s will eventually add up.
There is no particular "point" at which any particular task layout will cause your server to crash! I know of no limit to the number of transitions out of any one task. I know of no limit to the parallelism (or other kind of complexity) of a job's task network, outside of that which the job instantiator already enforces (i.e., a task cannot transition to itself).
If you share the reasoning behind this design decision to have 600 paths that all loop back to the beginning, then perhaps someone on the forum can suggest alternatives that will suit your needs.
I hope this information helps.
Thanks
-- James
--
James H Koh
Interwoven Engineering
Adam Stoller
If you have a nested workflow - does the 1000 transitions apply to each individual job or to the combined parent/child job chain?
I'm thinking that dootndo2 might be able to have the cgitask spawn a nested job for the review process - in which the nested job would be instantiated with only the tasks necessary for the selected review process, and then when it hands control back to the parent they can allow the succeeding task to transition back to the cgitask for the next review process (or to proceed through submit, etc within the parent job).
If the transition count is per job and not per "job set" - this should (a) greatly reduce the chance of hitting the 1000 transitions limit and (b) reduce the overhead of unused tasks within any instance of the child workflow being instantiated - yes?
If the transition count is per "job set" this would still help reduce the overhead of unused tasks - but the chances of hitting the 1000 transitions limit would still be present.
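Very roughly - and with spawn_child_job() standing in for whatever job-creation mechanism your release exposes (a CLT or SDK call; the helper and element names below are purely illustrative, not an actual Interwoven API) - the cgitask callback could build a child spec containing only the tasks for the path the user actually picked:

    #!/usr/bin/perl -w
    use strict;

    # spawn_child_job() is a stand-in for whatever instantiates a job from a
    # spec on your release (CLT or SDK call); it is not a real Interwoven API.
    sub spawn_child_job {
        my ($spec_xml) = @_;
        print "child job would be instantiated from:\n$spec_xml";
    }

    # Only the groups the user actually chose on the cgitask form.
    my @picked = ("groupA", "groupC");

    my $spec = qq{<workflow name="one_notification_pass">\n};
    for my $g (@picked) {
        $spec .= qq{ <externaltask name="notify_$g"/>\n};   # send the email
        $spec .= qq{ <grouptask name="review_$g"/>\n};      # wait on the group
    }
    $spec .= qq{</workflow>\n};

    # When this child job ends, control returns to the parent's cgitask,
    # which can offer the dropdowns again or proceed to submit/end.
    spawn_child_job($spec);

Either way, the child job only ever contains the handful of tasks actually needed for that pass, so the unused-task overhead disappears regardless of how the transition count is scoped.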
--fish
Senior Consultant, Quotient Inc.
http://www.quotient-inc.com
james1
The 1000-transition limit should apply to individual jobs, regardless of parent/child relationships.
Personally, I still consider this putting off the inevitable, but it certainly is a step in the right direction.
-- James
--
James H Koh
Interwoven Engineering
dootndo2
Thanks again for all your replies. I think we need to distinguish between two main issues.
1. There is a number of task transitions (which seems to have a limit of 1000)
2. There is also a number of transition paths.
Here is my newbie understanding of this:
The task transition is the physical hop that the workflow takes from one task to the next as the job runs.
The transition path is the path defined on "Submit" that defines the successors.
In my case, I was told that I have too many optional paths. I don't anticipate that the workflow will ever need to transition over 1000 times.
We have a small team of folks that will be managing the workflow. We have significantly reduced the number of paths down to under 20 to keep from "crashing" our server. I just wish that I could have variables in tasks that could change at runtime rather than needing to create subworkflows to accomplish this.
Is there a way to manually edit the XML doc created (wft) so that we can change it during runtime? Or can we request this as a feature?
Thanks to all again.
dootndo2
Adam Stoller
Try to describe in more detail what you want to do.
Within a fixed set of defined tasks it is possible to alter many of the attributes of those tasks via perl code / CLTs - e.g. task owner, task area, files attached to the task, etc.
It is not possible to dynamically add/remove tasks from the process flow once the workflow has been instantiated - but it is possible to programmatically generate the wft *while* it is getting instantiated, based on information provided in the instantiation form (e.g. configurable workflows, where you specify that you have N reviewers and thus N review tasks, potentially paired with N notification tasks, etc., are created *during* instantiation).
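As a trimmed-down sketch of that second pattern (how the reviewer count reaches your generation code depends on your instantiation mechanism, and the element names here are illustrative only):

    #!/usr/bin/perl -w
    use strict;

    # Pretend this value arrived from the instantiation form; how it gets
    # here depends on your instantiation mechanism.
    my $num_reviewers = 3;

    my $spec = qq{<workflow name="configurable_review">\n};
    for my $i (1 .. $num_reviewers) {
        # chain each review task to the next, and the last one to submit
        my $next = $i < $num_reviewers ? "review_" . ($i + 1) : "submit";
        $spec .= qq{ <usertask name="review_$i">\n};
        $spec .= qq{  <successors><succ v="$next"/></successors>\n};
        $spec .= qq{ </usertask>\n};
    }
    $spec .= qq{ <submittask name="submit"/>\n</workflow>\n};
    print $spec;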
Between those two mechanisms you can accomplish quite a bit.
Nested workflows add yet another layer of flexibility to the mix. From your initial description, it sounded like a reasonable proposal; from your revised description (albeit sparse in detail), it may not be necessary.
--fish
Senior Consultant, Quotient Inc.
http://www.quotient-inc.com