[Info-vax] What choices are available to the OpenVMS Process Scheduler once all the CPUs are in use?

Bob Gezelter gezelter at rlgsc.com
Tue Feb 9 23:17:24 EST 2016


On Tuesday, February 9, 2016 at 8:45:25 PM UTC-5, se... at obanion.us wrote:
> We had an unusual beginning of Feb., where our typical heavy month end (of Jan.) pushed us to an unexpected 32 CUR processes (25 or so are typical for month end) for about 7 hours during the day, resulting in unusual performance. With a peak of about 120 COM processes (measured from PAWZ from PerfCap, and a typical value for month end), CPU utilization averaged below 20% (on Monitor System, CPU Busy reported 500-600% out of 3200). We observed that higher Priority processes, including internal connected users and external networked companies/services all expecting responses in 2-4 seconds had no complaints, but many reports in batch took 8 to 10 times as long as typical to complete.
> Additionally, disk I/O rates and response times were good, paging rates were low and there was substantial memory on the free page list, and virtual IO cache (approximately 32 GB) appeared to be fully used.
>  
> We are on OpenVMS 8.4, patches to Update 10, BL870 (16 cores with Hyper threading enabled = 32 CPUs seen, as recommended by our Cache application vendor), 64GB memory, Fibre Channel to P9500 storage array, Cache 2010 database and applications.
>  
> We have engaged PerfCap, HP support, and our application vendor to understand what we measured and observed, including loading T4 onto a production scale test system to validate what PAWZ is saying and to try to replicate some of the issues.
>  
> But my question is:
>  
> What choices are available to the OpenVMS Process Scheduler once all the CPUs are in use?
> And if (as I suspect) the Scheduler was struggling to choose what processes to execute, how would its efforts (CPU cycles) be reported/measured?
> 
> 
> Sean

Sean,

I would not be surprised that the BATCH jobs were delayed in such a situation. What priorities are set for the interactive users and the batch queues?

If the batch jobs are at a lower base priority than the interactive processes, one can easily experience a situation where the batch jobs end up waiting for either CPU or disk. 
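To make the point about base priorities concrete, here is a toy sketch in Python. It is not the OpenVMS scheduler, which also applies priority boosts and quantum expiry; it only models strict priority selection on a fixed number of CPUs. The function names, the process names, the two-CPU configuration, and the priority values 4 and 3 are all assumptions chosen for illustration, not values taken from your system.

```python
from collections import Counter


def simulate(num_cpus, procs, ticks):
    """Toy strict-priority scheduler (illustration only, not OpenVMS).

    procs: list of dicts with 'name', 'priority', and 'wants_cpu',
           a callable taking the tick number and returning True when
           the process is computable on that tick.
    Returns a Counter of how many ticks each process actually ran.
    """
    ran = Counter()
    for tick in range(ticks):
        computable = [p for p in procs if p["wants_cpu"](tick)]
        # Highest priority first. The sort is stable, so processes of
        # equal priority keep their list order (no round-robin here).
        computable.sort(key=lambda p: -p["priority"])
        for p in computable[:num_cpus]:
            ran[p["name"]] += 1
    return ran


def always(tick):
    """A CPU-bound job: computable on every tick."""
    return True


def busy_except(skip):
    """Computable on every tick except when tick % 4 == skip."""
    return lambda tick: tick % 4 != skip


def example_procs():
    # Hypothetical workload: two interactive processes at priority 4,
    # two CPU-bound batch jobs at priority 3.
    return [
        {"name": "inter_0", "priority": 4, "wants_cpu": busy_except(0)},
        {"name": "inter_1", "priority": 4, "wants_cpu": busy_except(1)},
        {"name": "batch_0", "priority": 3, "wants_cpu": always},
        {"name": "batch_1", "priority": 3, "wants_cpu": always},
    ]


if __name__ == "__main__":
    result = simulate(num_cpus=2, procs=example_procs(), ticks=8)
    for proc in example_procs():
        print(proc["name"], result[proc["name"]])
```

In this toy run the interactive processes get a CPU every time they ask for one, while the batch jobs only get whatever is left over, and one of them gets nothing at all. That is the general shape of the symptom: good interactive response alongside badly stretched batch elapsed times.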

A note about IO counts: Are you looking at actual disk operations, or operations presented through the cache? 

A peak of 120 COM processes also, regrettably, does not speak to the question of whether that was a sustained peak or a momentary phenomenon.
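As a rough sketch of why the peak alone is not enough, the Python fragment below compares two series of COM counts that share the same peak. The sample values, the threshold of 100, and the function name are hypothetical and purely illustrative; real values would come from your own monitoring data at whatever sampling interval was used.

```python
def longest_run_at_or_above(samples, threshold):
    """Length of the longest run of consecutive samples >= threshold."""
    longest = 0
    current = 0
    for value in samples:
        if value >= threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


# Hypothetical COM-queue samples, one per sampling interval.
momentary = [20, 25, 120, 22, 18, 24]
sustained = [20, 100, 110, 120, 115, 105]

if __name__ == "__main__":
    for label, series in (("momentary", momentary), ("sustained", sustained)):
        print(label,
              "peak:", max(series),
              "longest run >= 100:", longest_run_at_or_above(series, 100))
```

Both series report the same peak of 120, yet only the second shows the queue staying deep across consecutive samples, which is the case that would matter for batch throughput.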

Full T4 sampling of the system under these conditions is often illuminating. One of my standard recommendations is to always run T4 data gathering. At worst, one ends up writing the resulting data files to archival, offline storage.

- Bob Gezelter, http://www.rlgsc.com


