dubfi.linalg.mpi_worker_main

Main module for MPI worker processes.

Added in version 0.1.0: (initial release)

Program structure

The MPI parallelization is designed to have one parent (root) process and multiple worker processes. Each MPI worker process executes this module. The process will enter main() and await instructions from the parent process.

Instructions from the parent process can create, modify, and delete variables on the workers. Variables are created and modified by sending variable names and calling methods on those variables. The parent must therefore know the variable names and types, as well as when a worker process expects to receive data from the parent and when it will send data to the parent.

Caution

Parent and worker processes must agree on when data is sent, in which direction, and of which type. For example, when the parent instructs the workers to compute the scalar product of two vectors, it must then receive the result. Any inconsistency leaves both sides blocked, each waiting for the other.

Example

Use export LOG_LEVEL=DEBUG to show debugging information from parent and child. Log messages from the two worker processes carry the label “(0)” or “(1)”, which identifies the worker; messages from the parent carry no label. Some debug messages have been removed for clarity:

>>> # Example: compute the squared norm of a vector.
>>> import numpy as np
>>> from dubfi.linalg.mpi_parent import CTX, MpiVector, simple_segments
>>> # Start child processes
>>> CTX.init()
14:21:09 INFO Parent setup: root=-4, offset=0, workers=2
14:21:09 (0) INFO Child 0 created
14:21:09 (1) INFO Child 1 created
>>>
>>> n = 20  # size of the vector space
>>> # Segments define how the vector space is distributed to the worker processes.
>>> seg = simple_segments(n, 4)
>>> # Create vector object from array. The array will be sent to the worker processes.
>>> vec = MpiVector.fromarray(seg, np.arange(n))
14:21:09 (0) DEBUG ['new', 'MpiVectorWorker', 'MpiVector001']
14:21:09 (1) DEBUG ['new', 'MpiVectorWorker', 'MpiVector001']
>>> # Now, each child owns a vector called "MpiVector001". The name
>>> # is chosen automatically.
>>> # Compute the scalar product (vec, vec) to compute the squared norm:
>>> norm2 = vec @ vec
14:21:09 DEBUG Send: call MpiVector001 apply MpiVector001 from -4
14:21:09 DEBUG Collecting Reduce: shape ()
14:21:09 (0) DEBUG ['call', 'MpiVector001', 'apply', 'MpiVector001']
14:21:09 (1) DEBUG ['call', 'MpiVector001', 'apply', 'MpiVector001']
14:21:09 (0) DEBUG Sending Reduce to parent: shape (1,) from <class 'dubfi.linalg.mpi_worker.MpiVectorWorker'>.apply
14:21:09 (1) DEBUG Sending Reduce to parent: shape (1,) from <class 'dubfi.linalg.mpi_worker.MpiVectorWorker'>.apply
>>> # The child processes executed MpiVector001.apply(MpiVector001).
>>> norm2
array(2470.)
>>> del vec
14:21:09 (0) DEBUG ['del', 'MpiVector001']
14:21:09 (1) DEBUG ['del', 'MpiVector001']

Following the instructions from the parent, each worker process executes the equivalent of the following steps inside the main() loop:

>>> variables = {}
>>> # Receive instruction: "new MpiVectorWorker MpiVector001"
>>> variables["MpiVector001"] = MpiVectorWorker.fromparent()
>>> # MpiVectorWorker.fromparent will wait to receive the vector
>>> # shape and entries from the parent.
>>> # Receive instruction: "MpiVector001.apply(MpiVector001)"
>>> variables["MpiVector001"].apply(variables["MpiVector001"])
>>> # MpiVector001.apply will send data to parent.
>>> # Receive instruction: "del MpiVector001"
>>> variables.pop("MpiVector001")
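
The reduction in the example can be mimicked without MPI to see where the value 2470 comes from. The following is a minimal sketch in plain Python, assuming two workers that each own one contiguous half of the vector and assuming that the "Reduce" in the log is a sum. The actual partitioning is defined by simple_segments and may differ.

```python
n = 20
vec = list(range(n))  # same entries as np.arange(n) in the example

# Assumption: two workers, each owning one contiguous half of the vector.
# The real partitioning is defined by simple_segments and may differ.
chunks = [vec[: n // 2], vec[n // 2 :]]

# Each worker computes the scalar product restricted to its own entries ...
partial = [sum(x * x for x in chunk) for chunk in chunks]

# ... and the reduction on the parent sums the partial results.
norm2 = sum(partial)
print(partial, norm2)  # [285, 2185] 2470
```

The parent only ever sees the reduced scalar, which is why it must be ready to collect one value after sending the "call" instruction.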

Functions

main()

Await and follow instructions from parent.

Module Contents

dubfi.linalg.mpi_worker_main.main()

Await and follow instructions from parent.

Run an infinite loop waiting for instructions from the parent. Each instruction is broadcast to the children as a buffer of 512 bytes that encodes a string of words and is padded with null characters.
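
The buffer format described above can be illustrated as follows. This is a sketch, not the module's implementation: the helper names encode_instruction and decode_instruction are hypothetical, and it assumes that words are separated by single spaces and encoded as ASCII.

```python
BUFFER_SIZE = 512  # bytes per instruction, as stated above


def encode_instruction(*words: str) -> bytes:
    """Join words with spaces and pad with null bytes to BUFFER_SIZE.

    Assumption: space-separated ASCII words; the real encoding may differ.
    """
    raw = " ".join(words).encode("ascii")
    if len(raw) > BUFFER_SIZE:
        raise ValueError(f"instruction exceeds {BUFFER_SIZE} bytes")
    return raw.ljust(BUFFER_SIZE, b"\x00")


def decode_instruction(buffer: bytes) -> list[str]:
    """Strip the null padding and split the instruction into words."""
    return buffer.rstrip(b"\x00").decode("ascii").split()


buf = encode_instruction("call", "MpiVector001", "apply", "MpiVector001")
print(len(buf))                 # 512
print(decode_instruction(buf))  # ['call', 'MpiVector001', 'apply', 'MpiVector001']
```

A fixed buffer size lets every worker post the same receive without first negotiating a message length. The cost is a hard limit on the combined length of command and variable names.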

The following commands are available:

new VAR1 TYPE [METHOD=fromparent]

Create a new distributed object of given type by calling TYPE.METHOD and assign it to variable name VAR1. METHOD must be a classmethod and defaults to “fromparent”. If VAR1 exists already, it will be overwritten. Typically, METHOD expects that the parent sends some data to initialize VAR1.

del VAR1

Delete variable name VAR1.

call VAR1 METHOD [VAR2]

Call method of variable name VAR1, optionally with argument VAR2. This calls VAR1.METHOD() or VAR1.METHOD(VAR2). The result of the function call will be discarded. The called method may expect to receive data from parent or send data to parent.

assign VAR3 VAR1 METHOD [VAR2]

Call method of variable name VAR1 and assign the result to a new variable VAR3. This differs from call only in that the result of the function call is kept instead of discarded. Usually, this result should be a shared MPI object.

quit

Quit the worker process.

In the above commands, variable names can be any alphanumeric strings.
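
The command set above can be sketched as a small dispatcher over a dictionary of variables. This is an illustration under assumptions, not the code of main(): ToyVector, TYPES, and handle are hypothetical names, no MPI communication takes place, and the word order of each command follows the syntax listed above. The debug log in the example prints the type name before the variable name for "new", so the order used in the broadcast should be checked against the source.

```python
class ToyVector:
    """Hypothetical stand-in for a distributed type (no MPI involved)."""

    def __init__(self, data):
        self.data = list(data)

    @classmethod
    def fromparent(cls):
        # A real worker would receive shape and entries from the parent here.
        return cls([1.0, 2.0, 3.0])

    def apply(self, other):
        # A real worker would send this partial result to the parent.
        return sum(a * b for a, b in zip(self.data, other.data))

    def copy(self):
        return ToyVector(self.data)


TYPES = {"ToyVector": ToyVector}  # hypothetical registry of known types


def handle(words, variables):
    """Execute one instruction; return False when the loop should stop."""
    cmd, *args = words
    if cmd == "quit":
        return False
    if cmd == "new":
        name, type_name, *rest = args
        method = rest[0] if rest else "fromparent"
        variables[name] = getattr(TYPES[type_name], method)()
    elif cmd == "del":
        del variables[args[0]]
    elif cmd in ("call", "assign"):
        # "assign" carries one extra leading word: the target variable name.
        target = args.pop(0) if cmd == "assign" else None
        name, method, *rest = args
        result = getattr(variables[name], method)(*(variables[r] for r in rest))
        if target is not None:
            variables[target] = result  # "call" discards the result instead
    else:
        raise ValueError(f"unknown command: {cmd}")
    return True


variables = {}
for line in ["new v ToyVector", "assign w v copy", "call v apply w", "del v", "quit"]:
    if not handle(line.split(), variables):
        break
print(sorted(variables))  # ['w']
```

Looking methods up by name with getattr keeps the loop independent of the concrete types, which matches the requirement that the parent, not the worker, knows which variables and methods exist.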