Load Balancing Process Engine v9.2

<?xml version="1.0" encoding="UTF-8"?>
<ServiceDirectorConfig Name="Metastorm Engine Service List">
    <PreAuthenticatedRaiseFlag>true</PreAuthenticatedRaiseFlag>
    <!-- It is only possible to support one HTTP channel formatter at a time.
         In order to change formatter, the client must be stopped and restarted.
         If HTTPUseSOAP is true (case sensitive), the channel formatter will be SOAP,
         otherwise binary. -->
    <HTTPUseSOAP>false</HTTPUseSOAP>
    <ServiceList Type="Engine">
        <Service Name="Metastorm BPM Server" Description="Engines">
            <Engine Name="Engine1">
                <Transport Type="Remoting">
                    <Server>tcp://SERVER1:4001/ECL</Server>
                </Transport>
            </Engine>
            <Engine Name="Engine2">
                <Transport Type="Remoting">
                    <Server>tcp://SERVER2:4001/ECL</Server>
                </Transport>
            </Engine>
        </Service>
    </ServiceList>
</ServiceDirectorConfig>

 Above is our EngineService.config file for the process engine. 

 

We are struggling with our attempt at load balancing the engine across two machines.

In theory it is working: if one engine goes down, the other takes over. While both are up, we can only assume that it is using round robin correctly.

The issue is that once one of the services goes down, we start seeing an average of 10-15 seconds for the ECL to respond to our requests. Once both are up again, the response time falls back to 500 ms.
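One way to narrow this down is to time a plain TCP connect to each engine endpoint from the client machine, to see whether a comparable delay shows up at connection level when one engine is down. The sketch below is only a diagnostic aid: the `probe` helper and its 3-second default timeout are hypothetical, it does not speak the Remoting protocol, and whether the 10-15 second delay really comes from connection attempts is an assumption to be tested, not a known cause.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Attempt a plain TCP connect and return (reachable, seconds_taken).

    Hypothetical helper: it only checks that the port accepts connections,
    not that the engine behind it is healthy.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts and DNS
        # failures) when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        reachable = False
    return reachable, time.monotonic() - start
```

Calling `probe("SERVER1", 4001)` and `probe("SERVER2", 4001)`, with the host names and port taken from the `<Server>` entries in the config, while one engine is stopped shows how long a failed connect takes compared with a successful one.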

We read the logic:

 

Duration of time that the Engine is ignored = FailuresCount * 2

 

So we assumed that, following the rule "If a connection could not be created on one of the engines, the machine is considered as unavailable for a specified period of time (2 minutes)", the failure count would go up by 1 and the machine would be ignored, bringing us back to a 500 ms response time.

Not so lucky.
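To make the expectation concrete, here is a minimal model of the quoted rule combined with round-robin selection. It is a sketch of the behaviour we assumed, not the product's actual implementation. The class and function names are hypothetical, the unit is assumed to be minutes (based on the "2 minutes" in the quoted text), and resetting the count after a successful call is also an assumption.

```python
def ignore_minutes(failures_count):
    """Quoted rule: duration the engine is ignored = FailuresCount * 2.

    The unit (minutes) is an assumption based on the quoted "2 minutes".
    """
    return failures_count * 2


class EngineState:
    """Hypothetical per-engine bookkeeping for the model."""

    def __init__(self, name):
        self.name = name
        self.failures = 0
        self.ignored_until = 0.0  # minutes on an arbitrary clock

    def record_failure(self, now):
        # Each failed connection raises the count and extends the ignore window.
        self.failures += 1
        self.ignored_until = now + ignore_minutes(self.failures)

    def record_success(self):
        # Assumption: a successful call clears the failure history.
        self.failures = 0
        self.ignored_until = 0.0

    def available(self, now):
        return now >= self.ignored_until


def pick_engine(engines, start, now):
    """Round robin from index `start`, skipping engines that are being ignored."""
    for offset in range(len(engines)):
        candidate = engines[(start + offset) % len(engines)]
        if candidate.available(now):
            return candidate
    return None  # every engine is currently ignored
```

Under this model, after a single failure on Engine1 every request for the next 2 minutes would go straight to Engine2 without touching the dead machine, which is why we expected the response time to return to 500 ms.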

 

Does anyone have experience with setting up their environment in such a way?

 

Thanks,

 

...aaron

Comments

  • I know this is an old thread but some useful information can be had. 

     

    Is this a single web server with two process engines, or two servers each running both the web server and the process engine?

  • I believe this was an issue with the ECL specifically not handling engine failovers as well as the standard web client. If you run into this issue I would suggest opening a ticket with BPM Support so we can look into it in more depth.


  • Jeastman wrote:

    I believe this was an issue with the ECL specifically not handling engine failovers as well as the standard web client. If you run into this issue I would suggest opening a ticket with BPM Support so we can look into it in more depth.


    If you are saying that load balancing works as it should with the web client, then yes, the problem applies only to the ECL.

    I am running two servers with two engines. Neither is using the web client.