Conditional Actions and Timers Failing

Hi all,

This issue has been discussed before, but I didn't get the information I needed to resolve it.

Basically, we have a couple of processes with a system stage that periodically (for example every 5 hours, using Timed Event: When Folder Last Updated) checks whether a value in another database is available. If it is available, the folder moves on to the next stage; if not, the folder stays there, waits, and repeats the process. If the information has not been retrieved after a certain number of attempts, the process informs the users.

We have the processes working, but every now and then (I would say 2 out of 50 folders) a folder stays stuck in the stage. Obviously these folders will not notify users, move to the following stage, or write the record to the ewait table.

From previous posts I've read that the engine sometimes fails and the record is not created in the ewait table, so the folders stay stuck in these stages and we have to move them manually with a user action.

So here are my primary questions:

Why does the engine fail to write these timed or conditional actions? Is there a way to prevent it?
On the other hand, is there a way to set a second timer (based on the Last Update Time) in case the first one fails?

Thanks in advance for any given suggestions!

Comments

Richard Burt

We have had a similar situation. A folder is due to be updated 5 minutes after the 'folder last updated' time. Occasionally, when this action fires, the folder is involved in a deadlock on the eFolder table in the database. The action then rolls back and the folder just stops where it is.

Our solution was to create a duplicate of the action, but timed for 10 minutes after 'folder last updated'. If the first action fails to commit, the folder last updated time remains the same, so the second action fires a few minutes later (and it can only be reached if the first one fails).
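To make the reasoning concrete, here is a minimal Python sketch of that timing logic. It is not Metastorm code: the function name and the two delay constants are hypothetical, and it only assumes what is described above, namely that a committed action updates the 'folder last updated' time and a rolled-back action leaves it unchanged.

```python
from datetime import datetime, timedelta

# Hypothetical delays matching the description: 5 and 10 minutes
# after the 'folder last updated' time.
PRIMARY_DELAY = timedelta(minutes=5)
FALLBACK_DELAY = timedelta(minutes=10)


def simulate_timers(last_updated, primary_commits):
    """Return (name of the action that moved the folder, time it fired).

    Assumption: a committed action updates the folder, which resets the
    'last updated' time; a rolled-back action leaves it untouched.
    """
    primary_due = last_updated + PRIMARY_DELAY
    if primary_commits:
        # The folder was updated at primary_due, so the fallback never
        # becomes due against the old timestamp.
        return "primary", primary_due
    # Rollback (for example a deadlock): 'last updated' is unchanged,
    # so the fallback becomes due at the original time plus 10 minutes.
    return "fallback", last_updated + FALLBACK_DELAY
```

With a folder last updated at 12:00, the fallback only comes into play at 12:10, and only when the 12:05 action did not commit.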

So far this seems to have done the trick.

Rick.

David Conrads

We have had steady issues with this too, always deadlock related. I'll echo what Rick said: we also put a second timer on system stages where we don't expect the folder to stop and hang out. We have them firing every 15 minutes, and for the most part that works. However, we had to go a step further for a couple of reasons. First, the secondary timers are also susceptible to deadlocks and can fail in the same way. Second, there is a way to fix the deadlock issue (I'll explain), but we weren't willing to accept the performance drop. So we ended up building a nightly ECL batch process that queries for these stuck folders and fires a simple loopback action (with the rebuild to-do list option) on them. This closes the gap completely.
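The selection step of such a batch job can be sketched as follows. This is an illustration in Python rather than ECL, and the `Folder` record, the stage names, and the idle threshold are all hypothetical; a real job would read this information from the Metastorm database and then fire the loopback action on each folder it finds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Folder:
    # Hypothetical in-memory record; not the real eFolder schema.
    folder_id: str
    stage: str
    last_updated: datetime


def find_stuck_folders(folders, system_stages, max_idle, now):
    """Return IDs of folders sitting in a system stage longer than max_idle."""
    return [
        f.folder_id
        for f in folders
        if f.stage in system_stages and now - f.last_updated > max_idle
    ]
```

Choosing `max_idle` comfortably longer than the slowest timer on the stage avoids touching folders that are simply waiting for their next scheduled check.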

Assuming you're using SQL Server and deadlocks are your underlying issue (check elog for one of the affected folders), you can do two things to help with that. First, you can turn on the READ_COMMITTED_SNAPSHOT option for the Metastorm database. We did this and it cut down the number of deadlocks by about 90%. You can also set MAXDOP=1 on your SQL Server instance, which should completely resolve the deadlocks, but it also limits each query to a single CPU (no parallel execution plans). We didn't do that because we'd rather deal with the deadlocks than degrade performance.
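For reference, the two settings correspond to the T-SQL statements built below. This is only a sketch: the helper functions are hypothetical, the database name is whatever yours is called, and the statements should be reviewed by a DBA before being run. Note that switching READ_COMMITTED_SNAPSHOT on needs exclusive access to the database, so it is normally done in a maintenance window.

```python
def read_committed_snapshot_sql(database_name):
    """Build the ALTER DATABASE statement that turns the option on."""
    # Bracket-quote the identifier, doubling any closing bracket.
    quoted = "[" + database_name.replace("]", "]]") + "]"
    return f"ALTER DATABASE {quoted} SET READ_COMMITTED_SNAPSHOT ON;"


def maxdop_sql(degree=1):
    """Build the sp_configure batch for the instance-wide MAXDOP setting."""
    return [
        # 'max degree of parallelism' is an advanced option.
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'max degree of parallelism', {int(degree)};",
        "RECONFIGURE;",
    ]
```

For example, `read_committed_snapshot_sql("Metastorm")` produces the statement for a database named Metastorm (an assumed name).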