In the early days of CDC operating systems, much of the operating system was coded on PPs. Probably, some motivating factors were:
Most CDC machines had only a single CPU, but some models, like the 6500, had two. The I/O frame contained seven PPs, ten PPs (initially on the Control Data 6400 at TNO), 14 PPs (which we paid for later) or 20 PPs. Note that some PPs were pre-allocated or continuously occupied most of the time:
For the OS "kernel" itself (though that word was never used), both CPU and PP components were done entirely in assembly language. PPs did not have an RA register to bias addresses, and CPU OS code always ran with RA=0. As I recall, there was only a limited amount of overlaying done with CPU OS code. The exceptions were Fortran-based overlay programs and additional system 'utilities' such as TNO's Single User Editor (SUEDI/SUEDA). But if you like overlays, PPs are the place for you.
All PP programs reserved locations 0 - 77B as direct cells. For PP programs that used STL, STL was located in locations 100B - 777B, with the program itself starting at 1000B. Other programs simply started at 100B.
Because memory was so limited in PPs, many PP programs were written using overlays. By convention, overlays were written to load at a multiple of 1000 octal. PP programs were given names with three characters, and by convention the first character was a digit representing where the program should load (the address divided by 1000B). Main overlays typically loaded at 1000B, so many programs had names that started with 1. For instance, 1AJ (Advance Job) was called when a command in a job was completed and the next control card needed to be read, parsed, and executed. Child overlays loaded at higher locations, so their names started with bigger digits, such as 4.
Another reason for using a number at the start of a PP name was security. Only PP programs whose names started with an alphabetic character could be called by users, and then only when the access level of the PP was set within the user's authorization bounds.
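The two naming conventions above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, the names "4AB" and "ABC" are hypothetical, treating only the digits 0-7 as load-address digits is an assumption, and the access-level check mentioned above is not modeled.

```python
from typing import Optional


def pp_load_address(name: str) -> Optional[int]:
    """Return the conventional load address implied by a PP program name.

    By convention the first character, when it is a digit, is the load
    address divided by 1000B (octal). Returns None when the name does not
    start with such a digit.
    """
    if len(name) != 3:
        raise ValueError("PP program names have exactly three characters")
    first = name[0]
    if first in "01234567":  # assumption: only octal digits carry this meaning
        return int(first) * 0o1000
    return None


def user_callable_by_name(name: str) -> bool:
    """True when the name alone would allow a user call.

    Only names starting with an alphabetic character qualify; the
    additional access-level check is deliberately left out of this sketch.
    """
    return name[0].isalpha()


if __name__ == "__main__":
    for pp_name in ("1AJ", "4AB", "ABC"):
        address = pp_load_address(pp_name)
        shown = "none implied" if address is None else oct(address)
        print(pp_name, shown, user_callable_by_name(pp_name))
```

For 1AJ this yields a load address of 1000B and marks the program as not callable by users, which matches the description of main overlays above.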
There was one important PP routine that was a hybrid: 1SP (later 1SQ), the Stack Processor. 1SP was responsible for the actual disk I/O. It processed a list of disk I/O requests that were organized in priority lists, the so-called stacks. The stack processor tried to optimize head movements and sector selections to obtain the highest overall throughput and to minimize waiting times. Responsive disk I/O was very important to system performance, of course, so the system made sure that a copy of 1SP was always loaded into at least one PP, even if there were no outstanding disk I/O requests. In fact, since there were multiple disk controllers and disk units, the system could do true simultaneous disk I/O, and therefore tried to keep multiple copies of 1SP loaded to allow this to happen. The system dynamically adjusted the number of copies of 1SP/1SQ in PPs. If there was a lot of disk I/O on multiple units for a while, more copies of 1SP would be loaded. However, you wouldn't want to tie up too many PPs with idle copies of 1SP, so the number would be allowed to dwindle when the I/O load decreased.
Most PP routines were stored on disk, but the master copy of 1SP was kept in central memory, as was the code of some other PP programs and DSD overlays. That code had to reside in the expensive main memory because, for example, it was needed to handle disk error situations or monitored tasks.
CDC operating systems implemented an unusual system call mechanism. System requests - referred to as PP requests even if no PP program was involved - were made by placing a specially-formatted word at address 1 of a program's field length (i.e., RA+1). This location was scanned periodically by MTR (or CPUMTR). When the system noticed that a job's RA+1 was non-zero, it would zero the location and start servicing the request. By convention, applications would loop, waiting for RA+1 to zero both before and after issuing a request. It certainly was necessary for an application to ensure that RA+1 was zero before issuing a request, lest a previously-issued but as yet unserviced request be overwritten. But this could have been done by consistently checking either before or after each request.
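The RA+1 handshake described above can be illustrated with a small single-threaded Python model. It is a sketch under stated assumptions, not a reconstruction of MTR or CPUMTR: the class and function names are invented for this example, the request words are arbitrary integers, and the "periodic scan" is simulated by calling the monitor from inside the application's wait loop.

```python
class Job:
    """A job's field length; memory[1] plays the role of RA+1."""

    def __init__(self, field_length=8):
        self.memory = [0] * field_length


def make_monitor(serviced):
    """Build a stand-in for the monitor's periodic scan of RA+1.

    Each scan that finds a non-zero RA+1 zeroes the location and records
    the request word in the `serviced` list (a stand-in for servicing it).
    """
    def scan(job):
        word = job.memory[1]
        if word != 0:
            job.memory[1] = 0
            serviced.append(word)
    return scan


def issue_request(job, request_word, monitor_scan):
    """Issue one request the way applications conventionally did."""
    # Wait for RA+1 to be zero first, so that an earlier request that has
    # not yet been picked up is not overwritten.
    while job.memory[1] != 0:
        monitor_scan(job)  # simulation only: lets the "system" run
    job.memory[1] = request_word
    # Wait again until the system has noticed and cleared the request.
    while job.memory[1] != 0:
        monitor_scan(job)


if __name__ == "__main__":
    serviced = []
    scan = make_monitor(serviced)
    job = Job()
    issue_request(job, 0o1234, scan)
    print([oct(word) for word in serviced], job.memory[1])
```

As the text notes, either of the two wait loops alone would protect a pending request, provided it is used consistently; the model keeps both because that was the convention.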
A PP could also request the CPU monitor to reserve a resource. As this was done asynchronously from the CPU execution, the request had to be handled in such a way that it could not be interrupted. CPU monitor (CPUMTR) would test whether the requested resource was available. If not, the PP would wait a couple of milliseconds before reissuing the request. If the resource was available, the CPUMTR would lock the resource and inform the PP. To lock the resource, CPUMTR Program Mode coding made use of the fact that the CPU is interrupted at word-boundaries only. Thus a read, test, set and rewrite were all performed in executing one word to prevent an XJ-interrupt of the process.
* B2 contains the address of the word with the bit to be tested and set (if free)
* X6 contains the bit mask with the bit to be tested & set
+        SA1    B2        read the word with the resource to obtain
* the + indicates force-upper for Compass .. start at word boundary
         BX3    X1*X6     extract the resource bit's current state
         BX7    X1+X6     OR the bit in and move the result to X7
         SA7    A1        rewrite interlock word; if resource was set, nothing changed
* if the resource was free, it is locked now
* CPUMTR can now test X3. If X3 is non-zero, the resource is occupied,
* which needs to be signaled to the PP in order to reissue the request
* later. If X3 is zero, the PP locked the resource.
When the CPU was in Monitor Mode (on a machine with CEJ/MEJ hardware), the monitor could not be interrupted by subsequent exchange requests. Thus, no semaphores were necessary to change the status of resources (e.g. lock bits in tables). Note that the PP had to verify whether its request was honored or ignored. On dual-CPU systems, the hardware ensured that only one CPU could be in Monitor Mode at a time.
In the early days, a significant amount of the system's CPU time (probably 5-10%) was spent by applications looping, waiting for the system to notice their RA+1 requests. An optional instruction, the Central Processor Exchange Jump, was available to allow an application to transfer control to the OS and have it notice the request. This XJ instruction was kind of like a software interrupt.
By the way, channel requests were issued by PP MTR (PP0). A PP program wanting to request a channel placed its request in its output register, a special reserved word in CPU memory. PP MTR scanned all PP output registers at a regular interval. When it found a request, it granted channel access if the resource was free. Each PP program, after obtaining access to a channel, was responsible for returning the resource in time. The only exception was a deliberate hang by a PP when it determined some inconsistency in the system. By going into a hung state, the PP could avoid further corruption of the system.
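The polling scheme described above can be sketched roughly as follows. Everything about the data layout here is an assumption made for illustration: real output registers held encoded request words, whereas this model stores just a channel number (or None) per PP, and the function names are invented.

```python
def scan_output_registers(output_registers, channel_owner):
    """One pass of a hypothetical MTR scan over all PP output registers.

    output_registers maps a PP number to the channel it requests (or None).
    channel_owner maps a channel number to the PP holding it (or None).
    Returns the list of (pp, channel) grants made during this pass.
    """
    granted = []
    for pp in sorted(output_registers):
        channel = output_registers[pp]
        if channel is None:
            continue  # this PP has no outstanding request
        if channel_owner.get(channel) is None:
            channel_owner[channel] = pp  # channel is free: grant it
            output_registers[pp] = None  # clear the request
            granted.append((pp, channel))
        # Otherwise the request stays in place and is seen on a later pass.
    return granted


def release_channel(channel_owner, pp, channel):
    """The PP returns a channel it holds, as each PP was responsible for doing."""
    if channel_owner.get(channel) != pp:
        raise ValueError("PP does not hold this channel")
    channel_owner[channel] = None


if __name__ == "__main__":
    registers = {1: 5, 2: 5, 3: None}
    owners = {}
    print(scan_output_registers(registers, owners))  # PP 1 gets channel 5
    release_channel(owners, 1, 5)
    print(scan_output_registers(registers, owners))  # now PP 2 gets it
```

A PP that never calls the release step keeps the channel indefinitely in this model, which is why the text stresses that returning the resource in time was the PP program's own responsibility.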
(with special thanks to Mark Riordan who provided the basis for this page)