2022 03 28 webex joint ftwg

Howard Pritchard edited this page Apr 11, 2022 · 1 revision

# 03/28/22 meeting notes for joint FT/Sessions WGs meeting

Attending: Howard Pritchard, Brian Smith, Trupeshkumar Patel, Aurelien Bouteiller, Martin Schulz, Dan Holmes, Isaias Urena, Thomas Hines

Agenda items

  • Continue discussion concerning agreement (see notes at the bottom of the Miro document's note section: https://miro.com/app/board/o9J_l_Rxe9Q=/). In particular, we wanted to hear from Martin Schreiber about their asymmetric use of process set names.
  • Topics from the FT WG (maybe reinit + sessions)?

Notes

Martin Schreiber reviews at a high level how Jan Fecht's and Dominik Huber's research was using process sets. Where would a consensus mechanism be used in this model? Dan asks how others are supposed to use the string broadcast from the lead process set. This approach implies that the MPI state of the other MPI processes knows about this process set. Dan thinks the consensus has to be done inside MPI. However, Aurelien was envisioning some fencing mechanism invoked by the application processes, with the fencing not being an explicit collective operation. If we require a collective type of operation to query for process sets, how would processes know whom to synchronize with?

Martin Schulz asks whether we could defer the collective behavior until communicator creation time. Dan sees the intermediate group-from-pset stage as problematic here.

The runtime must expose available process sets to MPI processes in a self-consistent manner. Discussion of blocking while waiting for the runtime to provide updates of the available process sets, to help achieve this. The blocking would be local in an MPI sense. Isaias notes that at least static process sets would be set up prior to creation of the MPI processes.

Mutable process sets: back to versioning again. Again consider relying on the runtime to provide a consistent view of process sets with versioning. Example of a user-level consensus; this could be difficult to implement. Dan thinks there will need to be a comm_agree-like behavior within MPI_Comm_create_from_group. Aurelien points out that if you don't know who will be participating, then consensus (comm_agree) behavior is not possible. Example of comm from group where the process set names used to create the group are different. Dan argues that this kind of "deadlock" would be detectable.

The skipping of versions is what introduces complexity here. Some processes, for example, are using version 3 of a process set name to get to a communicator, but other processes were busy doing something else, then query the runtime and get version 5. Maybe mandate that process sets using versions will be created in order 1, 2, ...? Example of a version 3, then a version 4 with fewer processes, then a version 5 with more processes. Do all processes have to use each of these process sets? Dan is pretty convinced this is necessary to avoid many problems. Discussed the notion of some kind of create-from-group call that does not actually create a communicator, a bit like the ULFM revoke method.
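The version-skipping scenario above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `PsetRegistry`, `publish`, `latest`, and `versions_to_visit` are hypothetical names invented for illustration and do not correspond to any MPI or runtime API. It models the "created in order 1, 2, ..." idea and shows which versions a lagging process would have to step through if skipping were disallowed.

```python
from dataclasses import dataclass, field


@dataclass
class PsetRegistry:
    """Hypothetical stand-in for the runtime's view of one mutable process set."""
    versions: dict = field(default_factory=dict)  # version number -> frozenset of ranks

    def publish(self, version, members):
        # Enforce the "versions are created in order 1, 2, ..." rule from the discussion.
        expected = len(self.versions) + 1
        if version != expected:
            raise ValueError(f"expected version {expected}, got {version}")
        self.versions[version] = frozenset(members)

    def latest(self):
        # Newest version the runtime currently knows about.
        return max(self.versions)


def versions_to_visit(last_seen, registry):
    """Versions a process must step through, in order, to catch up without skipping."""
    return list(range(last_seen + 1, registry.latest() + 1))


registry = PsetRegistry()
registry.publish(1, {0, 1, 2, 3})
registry.publish(2, {0, 1, 2, 3})
registry.publish(3, {0, 1, 2, 3})
registry.publish(4, {0, 1})              # version 4 has fewer processes
registry.publish(5, {0, 1, 2, 3, 4, 5})  # version 5 has more processes

# Process A is already at version 3; process B was busy and last saw version 2.
# If B simply asked for the latest version it would jump to 5 while A is still
# working with 3, which is the mismatch described in the notes.
catch_up_a = versions_to_visit(3, registry)
catch_up_b = versions_to_visit(2, registry)
```

In this toy model every process walks the same ordered list of versions, so two processes never try to build a communicator from different versions unless one of them is strictly behind. Whether a real runtime could afford that cost was left open in the discussion.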

Aurelien's idea of revoking process sets, initiated by an MPI process. This would cause calls by other processes that are currently creating communicators from the versions being revoked to return an error/fail.
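A minimal sketch of the revocation idea, again as a plain simulation rather than MPI code. `VersionedPset`, `PsetRevokedError`, `revoke`, and `create_comm` are hypothetical names chosen for illustration, and the pset name `app://workers` is made up. The sketch also simplifies the timing: it only checks the revoked state at the moment of creation, whereas the discussion concerned calls that are already in progress.

```python
class PsetRevokedError(Exception):
    """Raised when a communicator is requested from a revoked process set version."""


class VersionedPset:
    """Hypothetical model of a named process set with revocable versions."""

    def __init__(self, name):
        self.name = name
        self._members = {}     # version -> frozenset of ranks
        self._revoked = set()  # versions that some process has revoked

    def add_version(self, version, ranks):
        self._members[version] = frozenset(ranks)

    def revoke(self, version):
        # Initiated by any single process; no collective call is modelled here.
        self._revoked.add(version)

    def create_comm(self, version):
        # Stand-in for communicator creation: fail if the version was revoked.
        if version in self._revoked:
            raise PsetRevokedError(f"{self.name} version {version} was revoked")
        return sorted(self._members[version])


pset = VersionedPset("app://workers")
pset.add_version(3, [0, 1, 2, 3])
pset.add_version(4, [0, 1])
pset.revoke(3)

try:
    pset.create_comm(3)
    outcome = "created"
except PsetRevokedError:
    outcome = "failed"

comm_v4 = pset.create_comm(4)  # a non-revoked version is still usable
```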
