Monday, December 15, 2014

Message flow and processes on FTAs and classic E2E

I've received a request from a customer for information about the flow of messages (events) in Tivoli Workload Scheduler for z/OS classic End-to-End.
By messages I mean the information exchanged between TWS agents and servers in order to start and track job execution, submit new workload, modify existing workload, and so on.
I won't use the word "event", which is sometimes used in this context, to avoid confusion with the events of Event Driven Workload Automation (EDWA).

This is a very specific and technical topic; however, understanding this flow was the first thing I did when I started working on the TWSd code to begin the port to z/OS and the integration with TWSz (OPC at that time) for the first release of classic E2E. It was the year 2000 and TWS development was still in Santa Clara, while OPC was already here in Rome. I started creating the diagrams that I'll use in this article, and they were on the wall in front of me for several months.
Even though this information is also available in the manuals, I think it is useful to have it on this blog as well.


The picture above represents the basic message flows for a Fault Tolerant Agent (FTA).

In TWS every message file can have multiple writers, but only a single process reads from it: the file works as the input queue for that process.
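
To make the "many writers, one reader" rule concrete, here is a minimal Python sketch. It is only a toy model of the idea, not how TWS implements its .msg files: the `MessageFile` class, its method names and the message payloads are all hypothetical and exist only for illustration.

```python
import queue


class MessageFile:
    """Toy stand-in for a .msg file: many writers, exactly one reader."""

    def __init__(self, name):
        self.name = name
        self._queue = queue.Queue()  # FIFO that tolerates concurrent writers
        self._reader = None          # name of the single consuming process

    def write(self, writer, payload):
        # Any process may append a message to the file.
        self._queue.put((writer, payload))

    def claim_reader(self, process):
        # Only one process may ever consume from this file.
        if self._reader not in (None, process):
            raise RuntimeError(f"{self.name} is already read by {self._reader}")
        self._reader = process

    def read(self, process):
        # Reject any process other than the registered reader.
        if process != self._reader:
            raise PermissionError(f"{process} is not the reader of {self.name}")
        return self._queue.get_nowait()


# Hypothetical usage: two writers feed the queue that mailman consumes.
mailbox = MessageFile("Mailbox.msg")
mailbox.claim_reader("mailman")
mailbox.write("writer", "status update from a remote agent")
mailbox.write("conman", "user command")
first = mailbox.read("mailman")
```

The point of the model is that ordering and ownership are both properties of the file: whoever writes, the messages reach the one reader in arrival order.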

The netman process is the main TWS daemon, a kind of inetd on UNIX. Its role is to start TWS services according to the requests coming from the network on the FTA port, or locally via the NetReq.msg file.
One of the processes started by netman is writer. When an agent links to another agent, it asks the remote netman to start a writer process dedicated to the agent that made the request. The writer's role is to receive messages from that remote agent and write them to the Mailbox.msg file. There is a separate writer process for each remote node sending messages to the agent.
Mailman is the consumer of Mailbox.msg. It is started by netman on the start request, and its role is to route messages to the right agents. Depending on the role of the local node and on the message, mailman sends the message to the master or domain manager, to the agents below, and/or to the local Intercom.msg.
All the messages that need to be processed locally are sent to Intercom.msg, which is consumed by batchman. Batchman is started by mailman and is the only process that can update the local Symphony file (which contains the current plan).
Batchman is also responsible for selecting the jobs to start once all their dependencies are satisfied, and for allocating logical resources and limits. When a job needs to start, batchman writes a message to Courier.msg.
Jobman is started by batchman and consumes the launch job requests from the Courier.msg file, actually starting the jobs and monitoring their execution.
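
The chain Mailbox.msg, Intercom.msg, Courier.msg can be sketched as three queues drained in order. This is a simplified model under stated assumptions, not the real TWS logic: the workstation name `FTA1`, the message fields (`to`, `job`, `status`) and the `READY` and `HOLD` labels are hypothetical, and real routing also depends on the role of the node.

```python
from collections import deque

LOCAL = "FTA1"  # hypothetical name of the local workstation


def run_fta(incoming):
    """Push a batch of messages through a toy model of the FTA queues."""
    mailbox, intercom, courier = deque(), deque(), deque()
    forwarded, symphony, launched = [], {}, []

    # writer: stores what arrives from linked agents in Mailbox.msg
    for msg in incoming:
        mailbox.append(msg)

    # mailman: sole reader of Mailbox.msg, routes each message
    while mailbox:
        msg = mailbox.popleft()
        if msg["to"] == LOCAL:
            intercom.append(msg)   # to be processed locally
        else:
            forwarded.append(msg)  # to be sent to another node

    # batchman: sole reader of Intercom.msg, only updater of the Symphony
    while intercom:
        msg = intercom.popleft()
        symphony[msg["job"]] = msg["status"]
        if msg["status"] == "READY":  # assumed: dependencies satisfied
            courier.append({"job": msg["job"]})

    # jobman: sole reader of Courier.msg, launches the jobs
    while courier:
        launched.append(courier.popleft()["job"])

    return forwarded, symphony, launched


forwarded, symphony, launched = run_fta([
    {"to": "FTA1", "job": "JOB_A", "status": "READY"},
    {"to": "FTA2", "job": "JOB_B", "status": "READY"},
    {"to": "FTA1", "job": "JOB_C", "status": "HOLD"},
])
```

In this run only `JOB_A` reaches jobman: `JOB_B` is addressed to another node, so mailman forwards it, and `JOB_C` is recorded in the plan but is not ready to start.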

Mailbox.msg is also used to queue any user command to TWS, e.g. from conman or TDWC. Jobman, batchman and other TWS processes also write messages to Mailbox.msg every time they need to send a message to another node or to update the Symphony file.

Most of this flow is also present in TWS z/OS classic End-to-End running on z/OS USS (UNIX System Services).
On USS, most of the FTA processes run inside the server address space, with some additions.
Mailman also duplicates most of the messages to the tomaster.msg file, which in E2E is used as the input queue for the Input Translator thread.
The Input Translator converts each message to TWSz format and writes it to the input dataset (EQQTWSIN).
The receiver subtask in the controller reads the messages from EQQTWSIN and sends them to the Event Manager via an in-memory queue. The Event Manager finally applies the message to the z/OS Current Plan.
In the opposite direction, the sender subtask in the controller receives messages from General Services, the Normal Mode Manager and the Event Manager on an in-memory queue and writes them to the output dataset (EQQTWSOU).
In the server, the Output Translator thread consumes messages from EQQTWSOU, converts them to TWSd format and writes the result to Mailbox.msg on USS, where they continue to flow as on an FTA.
A few requests are handled differently: for job log retrievals or downloads of centralized scripts, for example, the Output Translator starts a specific thread that handles the request by connecting to the remote agent.
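
The two directions of the E2E flow can be summarized with the same kind of toy model. Everything here is an assumption made for illustration: the real TWSd and TWSz record layouts are not shown in this article, so the `fmt` and `body` fields and the two conversion helpers are hypothetical placeholders, and the datasets are modeled as plain in-memory queues.

```python
from collections import deque


def to_twsz(msg):
    # Placeholder for the Input Translator conversion (real format not shown).
    return {"fmt": "TWSz", "body": msg["body"]}


def to_twsd(msg):
    # Placeholder for the Output Translator conversion (real format not shown).
    return {"fmt": "TWSd", "body": msg["body"]}


def inbound(tomaster):
    """tomaster.msg -> EQQTWSIN -> receiver -> Event Manager -> Current Plan."""
    eqqtwsin, event_queue, current_plan = deque(), deque(), []
    while tomaster:      # Input Translator thread in the server
        eqqtwsin.append(to_twsz(tomaster.popleft()))
    while eqqtwsin:      # receiver subtask in the controller
        event_queue.append(eqqtwsin.popleft())
    while event_queue:   # Event Manager applies the message to the plan
        current_plan.append(event_queue.popleft())
    return current_plan


def outbound(sender_queue):
    """Sender in-memory queue -> EQQTWSOU -> Output Translator -> Mailbox.msg."""
    eqqtwsou, mailbox = deque(), deque()
    while sender_queue:  # sender subtask in the controller
        eqqtwsou.append(sender_queue.popleft())
    while eqqtwsou:      # Output Translator thread in the server
        mailbox.append(to_twsd(eqqtwsou.popleft()))
    return mailbox


plan = inbound(deque([{"fmt": "TWSd", "body": "job completed"}]))
mailbox = outbound(deque([{"fmt": "TWSz", "body": "submit job"}]))
```

Each stage has exactly one consumer, which is the same single-reader rule that applies to the message files on an FTA.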


If you like this article and you find it useful, please share it on social media so other people may take advantage of it.

