[Info-vax] What choices are available to the OpenVMS Process Scheduler once all the CPUs are in use?

Stephen Hoffman seaohveh at hoffmanlabs.invalid
Wed Feb 10 10:48:59 EST 2016


On 2016-02-10 01:45:22 +0000, sean at obanion.us said:

Lots of interesting data, but no specific model of Itanium processor 
involved.   Given the version, I'm going to guess i2 Tukwila?

> What choices are available to the OpenVMS Process Scheduler once all 
> the CPUs are in use?

There's a threshold set by a system parameter that triggers an email 
message to HPE and the system manager, with specifications for the 
recommended server upgrade.

But seriously, the scheduler preempts whatever is running as soon as a 
higher-priority process becomes computable, or once quantum has expired 
if an equal-priority process is computable, and starts up the next 
computable process or next pthread.  
There's a whole lot of information on scheduling in the IDSM for what 
happens in general, though I don't know that the internals of the 
pthread and of the hyperthread work (pthreads and hyperthreads are 
not the same) were ever particularly documented.   The basics are the 
same, however.
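
As a rough illustration of that preemption rule, here is a toy sketch in 
Python.  It is not the OpenVMS scheduler; the function name pick_next, 
the (priority, name) tuples, and the per-priority queues are all made up 
for the example.  It only assumes that a larger number means a higher 
priority and that equal-priority processes take turns at quantum end.

```python
from collections import deque

def pick_next(current, quantum_expired, queues):
    """Decide what one CPU should run next (toy model, hypothetical names).

    current         -- (priority, name) of the running process, or None
    quantum_expired -- True if the running process has used up its quantum
    queues          -- dict: priority -> deque of computable process names
    """
    ready = [prio for prio, q in queues.items() if q]
    if not ready:
        return current                      # nothing else is computable
    top = max(ready)                        # highest computable priority
    preempt = (
        current is None
        or top > current[0]                         # higher priority: at once
        or (top == current[0] and quantum_expired)  # equal: at quantum end
    )
    if not preempt:
        return current
    chosen = (top, queues[top].popleft())
    if current is not None:
        # The preempted process goes to the tail of its own priority queue.
        queues.setdefault(current[0], deque()).append(current[1])
    return chosen
```

With every CPU busy, a lower-priority process only gets picked when 
nothing at a higher priority is computable, which is why batch work 
starves on a saturated box.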

> And if (I suspect) the Scheduler was struggling to choose what 
> processes to execute, how would its efforts (CPU cycles) be 
> reported/measured?

You have 16 cores.  Depending on the specific Itanium processor 
involved, hyperthreading is either a fast(er) process context switch, or 
something rather less than a full core.   (If each of the threads in 
a core had access to a completely parallel set of processor resources, 
they'd be called cores and not threads.)     If you had anything 
approaching 32 current processes, your system was saturated.  That 
would preempt most of the lower-priority activities; batch processes 
would get very little processor time.
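
The arithmetic behind that 32, as a small Python sketch.  The function 
name and the default of two hyperthreads per core are assumptions for 
illustration; counting each hyperthread as a whole CPU is the optimistic 
case, per the caveat above.

```python
def cpu_demand_ratio(computable, cores=16, threads_per_core=2):
    """Computable processes per logical CPU; 1.0 or more means saturated.

    threads_per_core=2 is an assumption, and it optimistically treats
    each hyperthread as a full CPU.
    """
    logical_cpus = cores * threads_per_core
    return computable / logical_cpus
```

For example, cpu_demand_ratio(32) gives 1.0 with the defaults, and the 
same 32 processes give 2.0 if you only count the 16 real cores 
(threads_per_core=1).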

Also consider that the data you're getting might not be entirely 
accurate.  The mechanisms used for locking down and reading through the 
scheduler queues to collect the data have been known to sometimes skew 
what's reported, either due to contention on the database, or due to 
the poller and the processes all happening to be computable together on 
the same interval.

Rather than looking at the scheduler queues, look at the run times of 
your processes, and the trends over the runs.  See when you're going to 
exit your window, and work to change that through tuning, workload 
sharding, and/or removing the bottlenecks (which could be other 
processes, of course), and/or plan for a processor upgrade.
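
One way to watch that trend, sketched in Python.  This is only an 
illustration: it assumes you have logged the elapsed time of each run 
yourself, fits a plain least-squares line through those times, and 
projects which run would first overrun the window.  The function name 
and the straight-line growth model are my assumptions; real workloads 
may not grow linearly.

```python
import math

def first_run_over_window(durations, window):
    """Project the 0-based index of the first run exceeding the window.

    durations -- elapsed times of past runs, oldest first (same units
                 as window)
    window    -- the time available for the run
    Returns None when the fitted trend is flat or shrinking.
    """
    n = len(durations)
    if n < 2:
        raise ValueError("need at least two runs to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(durations) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(durations))
    slope = sxy / sxx                  # growth per run
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None                    # no upward trend to extrapolate
    crossing = (window - intercept) / slope
    # First whole run index strictly beyond the crossing point; a run
    # that already overran shows up as index 0 or a past index.
    return max(0, math.floor(crossing) + 1)
```

Feed it the last few weeks of run times and compare the returned index 
with the number of runs so far to see how much room is left before the 
window is blown.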

-- 
Pure Personal Opinion | HoffmanLabs LLC 
