[Info-vax] How to deal with FPG process state :-(

Sat Jan 31 06:40:40 EST 2009

On Jan 30, 11:41 pm, JF Mezei <jfmezei.spam... at vaxination.ca> wrote:
> Found my alpha to become unresponsive. But from another node, was able
> to do SHOW SYS to find a number of processes in the never before seen
> state of FPG.
>
> Console had the terrible "page file fragmented attempting to continue"
> message.
>
> I tried to STOP/ID the offending IMAP processes to no avail. Tried to
> STOP/ID any other non critical process, but it just put them into FPG
> state (except for a couple that stayed in COMO).
>
> Question, with an unresponsive system where only services that don't
> require any memory work (SHOW SYSTEM worked, and PINGING the node
> worked), is there anything that can be done to recover that system
> without the big fat HALT command from >>> prompt ?
>
> Is this a sign that one of my sysgen parameters needs to be adjusted
> (freelim etc ?) to ensure there is always enough memory to be able to
> kill a process, or is this a hopeless case when you have runaway software
> like TCPIP Services which started to create processes left and right
> that each go nuts with memory and page file ?
>
>
>
> > 0200136 TCPIP$MOUNTD_1  LEF     10     2042   0 00:00:00.67       953     25  N
> > 20200137 TCPIP$NTP_1     FPG     10  8787983   0 00:00:54.06      1849     38  N
> > 20200138 TCPIP$POP_1     HIB     10    12120   0 00:00:03.41      2974     25  N
> > 2020013B SYSLOGD_1       FPG      6   726954   0 00:04:04.32      2243     35  N
> > 2020013D WWW server 80   FPG      6 11734083   0 01:02:05.22      4818     39  N
> > 20205D47 TCPIP$IMAP_370  HIB     10   929515   0 00:01:54.73     82116     90  N
> > 20205F53 DECW$TE_5F53    FPG      6    22937   0 00:00:05.41      1424     28
> > 20205F54 _FTA28:         LEFO     4       --  swapped  out  --             32
> > 2020035A SYMBIONT_86     FPG      6     4805   0 00:00:08.61      2473     39
> > 202049A0 TCPIP$TFT_BG199 LEF     10    14139   0 00:00:01.47       581     29  N
> > 20205BA3 DNFS1ACP        FPG     10      231   0 00:00:00.03       187     25
> > 20205FA8 _FTA29:         LEFO     6       --  swapped  out  --             30
> > 202060AC TCPIP$IMAP_371  FPG     10    33933   0 00:00:12.78     74376   3594  N
> > 202060AD TCPIP$IMAP_372  FPG     10    21073   0 00:00:06.79     46001   3190  N
> > 202060AE TCPIP$IMAP_373  FPG     10    15720   0 00:00:04.92     32795   3308  N
> > 202060AF TCPIP$IMAP_374  FPG     10     9531   0 00:00:03.38     20218   3243  N
> > 202060B0 TCPIP$IMAP_375  FPG     10     5719   0 00:00:02.10     13923   3194  N
> > 20205AB1 TCPIP$IMAP_376  FPG     10     4015   0 00:00:01.52      5249    849  N
> > 20205AB2 TCPIP$IMAP_377  FPG     10     3049   0 00:00:01.21      3407   1401  N
> > 202060B3 TCPIP$IMAP_378  FPG     10     1798   0 00:00:00.56      1860    233  N
> > 202042CE WWW_SERVE_1     LEF      6       94   0 00:00:00.04       493     25  S
> > 202047D4 SYSTEM          COMO    11       --  swapped  out  --             31
> > 20201BDE SERVER_0005     FPG      6    15077   0 00:00:09.67     22941     22  N
> > 20201ADF SERVER_0004     FPG      6     1334   0 00:00:00.67      2963     22  N
>
> Does anyone know if the limit for those IMAP processes would be controlled
> by the UAF parameter
>          /maxjobs ?
>          /maxacctjobs ?
>          /maxdetach ?
>
> (Or which parameter would be suggested as best way to prevent IMAP from
> creating processes without the old one going away first) ?
>
> Since these processes went nuts with memory and page file, I have to
> assume that they were not subprocesses of a single job, so the
> /PRCLM parameter (subprocesses) wouldn't seem to be the limiting factor.

Dear JF Mezei:

When several processes are in the FPG state and the console is
telling you "page file fragmented attempting to continue", you need to
increase the page file size or add another page file on another disk
volume. A very good book on the subject is VMS Performance Management
by James W. Colburn, pages 82-83.

A general rule of thumb is that the page file should be enlarged
whenever total page file consumption is more than 50% on average. I
would recommend multiple page files on shadowed disks. How large
should the page file be? Some people recommend a total page file size
of 1.5 times the total memory. In other words, if you have 2 GB of
memory, then your page files should initially total 3 GB.
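
The arithmetic behind that rule of thumb can be sketched in a few
lines of Python. This is only an illustration: the function name and
the default factor of 1.5 are assumptions taken from the rule quoted
above, not a VMS utility.

```python
def recommended_pagefile_gb(memory_gb, factor=1.5):
    """Starting total page file size under the '1.5 x memory' rule of thumb.

    Hypothetical helper for illustration; 'factor' is the multiplier
    some people recommend, not a value mandated by VMS.
    """
    if memory_gb <= 0:
        raise ValueError("memory_gb must be positive")
    return memory_gb * factor


if __name__ == "__main__":
    # 2 GB of memory gives a 3 GB starting point.
    print(recommended_pagefile_gb(2))
```

Treat the result as a starting point only; the 50% consumption figure
is what tells you whether the page files need to grow further.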

Once the page file is consumed, the VMS system can become
unresponsive, and you are ultimately left with a system shutdown. You
need to write DCL programs to keep track of page file consumption and
tell you when more than 50% of the page files has been consumed. You
also need to find out how many IMAP processes are being generated.
IMAP processes are memory hogs!
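
The 50% check itself is simple. Here is a minimal sketch in Python
rather than DCL, assuming you already have the total and free page
file sizes in the same unit (on VMS these might come from the
F$GETSYI items PAGEFILE_PAGE and PAGEFILE_FREE, if your version
provides them). The function names and the threshold default are
hypothetical.

```python
def pagefile_used_percent(total, free):
    """Percentage of page file space in use.

    'total' and 'free' must be in the same unit (e.g. blocks).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= free <= total:
        raise ValueError("free must be between 0 and total")
    return 100.0 * (total - free) / total


def needs_attention(total, free, threshold=50.0):
    """True when consumption exceeds the rule-of-thumb threshold."""
    return pagefile_used_percent(total, free) > threshold


if __name__ == "__main__":
    # Hypothetical numbers: 6,000,000 blocks total, 2,400,000 free.
    if needs_attention(6_000_000, 2_400_000):
        print("Page file usage above 50%: enlarge or add a page file")
```

Run something like this on a schedule and have it send mail or an
operator message instead of printing, so you hear about it before the
system stops responding.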

The email system that I worked on was a VMS 8.3 GS80 where the IMAP
process count was controlled by the email software. However, too many
IMAP processes will consume process slots, RAM, and page file if your
sysgen parameters aren’t set properly. Make sure you have set the
proper size for WSquota and WSextent. If you set WSquota too high,
increased consumption of the page file can occur.

What you need to do is get the system configured and tuned properly!

I hope this helps!

Regards,
Daryl Jones