When portfolio_stat reports the status of a portfolio as Pending, portfolio_mon starts a counter and increments it each time portfolio_stat returns Pending again. If that counter exceeds the STALL_TOLERANCE parameter (see eop_run Configuration File for more information), an email is sent notifying the operator that the portfolio has Stalled. Some stalls are innocuous; others indicate problems that may require operator intervention.
The stall counter is then set back to zero and STALL_TOLERANCE is doubled, so that the alerts do not become a nuisance but monitoring does not give up either. When the doubled STALL_TOLERANCE is exceeded in turn, another email is sent. This process repeats indefinitely.
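
The escalation described above can be sketched as a small counter object. This is only an illustration under stated assumptions, not the portfolio_mon implementation: the class name StallTracker and its method are hypothetical, and "exceeds" is read here as "reaches", so that a tolerance of 5 triggers on the 5th Pending result.

```python
class StallTracker:
    """Hypothetical helper illustrating the stall counter; not part of portfolio_mon."""

    def __init__(self, stall_tolerance):
        self.tolerance = stall_tolerance  # current threshold; doubles after each alert
        self.pending_count = 0            # Pending results seen since the last reset

    def observe_pending(self):
        """Record one Pending status; return True when a stall email would be due."""
        self.pending_count += 1
        # Assumption: the alert fires when the counter reaches the tolerance.
        if self.pending_count >= self.tolerance:
            self.pending_count = 0   # counter is set back to zero
            self.tolerance *= 2      # tolerance is doubled for the next round
            return True
        return False


tracker = StallTracker(stall_tolerance=5)
# Feed 35 Pending results and note the cycles on which an alert fires.
alerts = [cycle for cycle in range(1, 36) if tracker.observe_pending()]
print(alerts)  # [5, 15, 35]
```

Under these assumptions the alerts are spaced 5, 10, and 20 Pending results apart, each gap counted from the previous reset, and the tolerance stands at 40 after the third alert.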
When a Stalled portfolio begins to run again, its status goes back to Running. If STALL_TOLERANCE had been exceeded (and an email alert of the stall was sent), then an email is sent alerting the operator that the portfolio is Running. STALL_TOLERANCE is then reset to the average of its last value and its initial value, making it somewhat more tolerant of subsequent Stalled periods.
For example, if STALL_TOLERANCE is given as 5 in the config file, then email will be sent on the 5th Pending result, then on the 10th after the counter is reset, and then on the 20th after the next reset. If the portfolio begins to run again, then STALL_TOLERANCE is set to (20 + 5) / 2 = 12 (rounded down from 12.5). The next time the portfolio is Pending for more than 12 cycles, it will be considered Stalled.
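
The recovery arithmetic can be checked with a short sketch. The function name tolerance_after_recovery is hypothetical, and the use of integer (floor) division is an assumption inferred from the result of 12 in the example above.

```python
def tolerance_after_recovery(last_tolerance, initial_tolerance):
    """Average the last and initial tolerance, rounded down to whole cycles."""
    # Assumption: floor division, which reproduces the 12 in the example above.
    return (last_tolerance + initial_tolerance) // 2


initial = 5  # STALL_TOLERANCE as given in the config file
print(tolerance_after_recovery(20, initial))  # 12
print(tolerance_after_recovery(40, initial))  # 22
```

The second call shows what the result would be if the "last value" were the already doubled tolerance (40) rather than the threshold most recently exceeded (20). The text does not say which is meant; the example's result of 12 corresponds to 20.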
The scope of STALL_TOLERANCE is restricted to one portfolio and one invocation of portfolio_mon. If portfolio_mon has to be restarted, then STALL_TOLERANCE will be set as indicated in the config file.