3 eevents row after processing timed action

Andrea Cammarata · April 26, 2011

has anyone ever encountered this problem?

Jerome_Pearce · April 26, 2011

We know this to be an issue if the database and the Metastorm server are not at identical times. There seems to be a problem because one is used to set the time, and the other to execute it, so one thinks it has not yet passed and sets the timer again.

More details >here<

Andrea Cammarata · April 26, 2011

Yeah Jerome, I was aware of that issue, but this situation seemed to me a little bit different, first of all because times on oracle db server and on 2 engines are identical and also because engines write always 3 records on eevents table.... why 3 records?

Thanks

Jerome_Pearce · April 26, 2011

I would suspect the multi-engine environment then. Try with one engine turned off and see what happens.

achopra · April 26, 2011

I have seen this exact same issue on various VM's where there was only 1 Engine and 1 Database, all on the same machine. In fact I just saw this issue last week when using v9 SR3 HF2. I was using a very short timeframe (.1 minutes after entry time)...adjusting it to .3 minutes seemed to make the issue dis-appear. Not sure why and didn't have time to investigate it further at the time.

Jerome, any ideas?

Jerome_Pearce · April 26, 2011

I've not seen it myself on single machine environments. I'll set up a test and keep it running in a few places and see what we get.

Andrea Cammarata · April 28, 2011

It's Funny!! At the beginning I agreed with Jerome:

Same customer

1) dev environment 1 Engine (VM) + 1 Oracle Db (Unix) v9 SR3 HF3 : 1 row in eEvent table

2) test environment 1 Engine (VM) + 1 Oracle Db (Unix) v9 SR3 HF3 : 1 row in eEvent table

3) pre live environment 2 Engines (VM) + 1 Oracle Db (Unix) v9 SR3 HF2 : 3 rows in eEvent table

4) Live environment 2 Engines (phisical) + 1 Oracle Db (Unix) v9 SR3 HF1 : 3 rows in eEvent table

As suggested I've tried turning off one engine on pre live environment and result is 1 row in eEvent

BUT:

Another customer

1) test environment 1 Engine (VM) + 1 Oracle Db (Unix) v9 SR3 HF3 : 3 rows in eEvent table

.....!!!

This issue happens not only with short but also with long (days after entry time) timeframe....

.....

Jerome_Pearce · April 28, 2011

It's a rounding error.

My test (attached) gives you varying results. Oddly enough, if you look in SQL Server, where the times are shown to the 1000th of a second, the LAST time the loopback timer occurs is x.998 of a second, eg:

2011-04-29 12:54:38.990, or 2011-04-29 12:53:44.980

This is when the problem happens. When it does not, the fraction of a second could be anything. For example I have one where the two identical actions (the second is spurious) are at: 2011-04-29 12:56:38.733 and 2011-04-29 12:56:38.873

In all cases where more than one action occurs, however, the seconds are identical, just not the fraction.

Timed loop-backs should not repeat if the time is in the past, in order to prevent infinite loop-backs. I suspect the engine is looking at a time of 01:01:15.33 and checking if it is later than 01:01:15 (ignoring the fractions of a second), and coming back with the result "yes, it is in the future' and setting the timed action again.

If the engine is under any kind of load, the event manager will not look at it for a few milliseconds, and the time is passed, so the fault is both random (depending on the current fraction of a second and dependant on load (how many system actions are queued up).

achopra · April 29, 2011

Wow, I would have never figured that out! Sooooo...given these findings, would you consider this to be a bug then? If so has it been reported?

Jerome_Pearce · April 29, 2011

'Tis one of my areas of exertise, figuring out these bugs in the product. Has been for 14 years now. Metastorm still do not give us a way to report issues however (we have repeatedly asked), so no, it has not been reported.

achopra · April 29, 2011

I don't understand. You can submit support requests simply by emailing helpdesk@metastorm.com, or using the online form, right?

http://ework.metastorm.com/web/support/online_support.asp

Jerome_Pearce · April 30, 2011

If we had an account to do so, yes. So far, none has been providfed, despite several requests through the formal channels.

Andrea Cammarata · May 2, 2011

Ok! I've just submitted a support request.

In your opinion, what about switching the event manager by setting a pool interval in the registry key? It could be useful?

Andrea

Jerome_Pearce · May 2, 2011

I'm not sure, sorry.

Brian Mellert · February 14, 2013

I know this is old, and I raised it on the other forums as well, but with the 9.2.1.0 upgrade, I have timed loop back actions running hundreds of times. That's bad enough, but each is sending an email message. Thankfully, we haven't implemented 9.2.1 in production yet it is still 9.1.2.2.

A quick check yesterday showed my engine server was less than 30 seconds ahead of the database server, so that may be the factor .. but it shouldn't be. Going to be raising a ticket. Looking further, the fact that 2 of the datetime types have a time zone adjustment while the system doesn't has presented "date difference" issues in the past, but that's another topic.

Jacob_Eastman · March 24, 2013

This is a known issue if there is a time difference between the engine and database. I'd recommend always syncing your servers to the same time server such as time.nist.gov.

Brian Mellert · March 25, 2013

They do get synced on a regular basis, but time creep can still happen between syncs (in particualr in development enviroments). It still doesn't make sence to me that the engine uses two different sources to decide whether to run something.

To get around this issue since time creep can happen, I used Jerome's suggestion of creating a variable that gets set in the first loop so it won't run repeat [potentially hundreds] due to this "feature". I haven't had to worry about clearing said variable, quite yet, so things do run normally as they should when the timers do run properly.

Brian Mellert · September 16, 2013

I know we've beaten this one down ... but since we've been burned by this again in production recently (resulting it way too many emails) I thought I'd bring up again.

Obviously, in this solution I didn't put in the "I ran" flag. Guess what I'm doing today? (In addition to contacting the server admins about the clocks not being resynced.)

3 eevents row after processing timed action

Comments

Categories