bprun

bprun [2010/04/15 21:18]
bprun [2010/04/15 21:18] (current)
 +
 +===BPSH===
 +
 +All non-MPI jobs on slave nodes are executed through __bpsh__. Although **bpsh** is really a simple wrapper for the bproc system, its use and functionality are documented separately here, since the functionality of bproc itself falls more within the scope of advanced programming than everyday use.
 +
 +
 +===CWD and bproc===
 +
 +The bproc system does its best to preserve the running environment of the user's shell on the slave node. This includes the CWD. Bproc sets the CWD of a migrated process to the same pathname as the CWD on the master, if that path exists on the slave. Note that this is an equivalent NAME only, so it may or may not actually be the same filesystem. Bproc does not (and really cannot) take the possibility of an alternate mount point for an NFS filesystem into account. Depending on whether the user's CWD is on a shared filesystem, and whether an equivalent (perhaps shared) structure exists on the slave node, the CWD on the slave node will be in one of three states:
 +
 +    * If on a shared filesystem (e.g. /home/user where slaves NFS mount /home as /home), the CWD will be the very same directory.
 +    * If on a parallel but unshared structure (e.g. /scratch/user exists on both master and slave but is NOT shared through NFS or similar), the CWD on the slave will have the same **NAME** but will be a separate, unshared directory.
 +    * Should no equivalent path exist on the slave, the CWD will be /.
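The name-only rule above can be sketched as a small local simulation. This is not bproc code: `resolve_slave_cwd` is a hypothetical helper, and its second argument is an assumed directory that merely stands in for a slave node's root filesystem. The sketch also shows why the first two states look identical by name: only the pathname is checked, never whether the directory is shared.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of bproc: simulates the name-only CWD rule.
#   $1 = CWD on the master (absolute path)
#   $2 = directory standing in for the slave's root filesystem (assumption)
resolve_slave_cwd() {
    local master_cwd=$1 slave_root=$2
    if [ -d "${slave_root%/}${master_cwd}" ]; then
        # The same pathname exists on the slave: it is used, shared or not.
        printf '%s\n' "$master_cwd"
    else
        # No equivalent path on the slave: the process starts in /.
        printf '/\n'
    fi
}

# Demo with a throwaway directory tree standing in for a slave node.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/scratch/user"
resolve_slave_cwd /scratch/user "$demo_root"   # prints /scratch/user
resolve_slave_cwd /home/user "$demo_root"      # prints /
rm -rf "$demo_root"
```

On a real cluster, running `bpsh 4 pwd` from the directory in question should show which pathname the migrated process ended up in, though it cannot tell a shared directory from an unshared one of the same name.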
 +
 +**How it works**
 +
 +A primary point of the bproc system is to allow the master to contain all configuration, libraries, and program binaries, yet allow processes to be run seamlessly on slave nodes. To that end, it implements process migration. As a wrapper for **bproc**, **bpsh** uses that migration capability. For example, when you run the command below, **bpsh** does the following:
 +
 +//bpsh 4 myprog <mydata//
 +
 +   1. Loads the program into memory, much as any shell would.
 +   2. Migrates the process to node 4, in the same CWD if possible.
 +   3. Arranges through **bpmaster** and **bpslave** for stdin, stdout, and stderr to be redirected back to the master.
 +   4. Transfers control to the program.
 +   5. The contents of mydata are then available to the process on stdin as usual.
 +
 +
 +===Environment===
 +
 +
 +All environment variables are transferred to the slave node by virtue of the process migration. That is, since the process's memory image is transferred, so too are the environment variables it had access to.
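A quick way to check this is to export a variable on the master and print it from a node. The sketch below assumes only the `bpsh NODE COMMAND` form used elsewhere on this page. `run_on_node` is a hypothetical wrapper and `MYPROG_MODE` is a made-up variable name; the local fallback exists only so the sketch can be tried on a machine without bproc.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run a command on a node through bpsh when bpsh is
# installed; otherwise run it locally so the sketch can be tried off-cluster.
run_on_node() {
    local node=$1
    shift
    if command -v bpsh >/dev/null 2>&1; then
        bpsh "$node" "$@"
    else
        "$@"   # local fallback (assumption, for illustration only)
    fi
}

# MYPROG_MODE is a made-up variable name used only for this demonstration.
export MYPROG_MODE=fast

# The variable travels with the process image, so the command sees it.
run_on_node 4 printenv MYPROG_MODE
```

If the variable were not exported, it would not be part of the environment of the process that gets migrated, and `printenv` would print nothing.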
 +
 +
 +===Redirections===
 +
 +stdin, stdout, and stderr are redirected back to the master. Thus, any command-line redirection (<, >, |, and so on) refers to files on the master node, regardless of whether files of the same name exist on the slave node. In general, this is **//The Right Thing//**. On occasion it is not, such as when you want output from a long-running process to be held in scratch space on the slave and copied back later. In these rare cases, it is necessary to place the binary in a shared directory and wrap its invocation inside a shell, such as:
 +
 +<code bash>
 +bpsh 4 bash -c '/shared_space/myprog >/scratch/myout1'
 +</code>
 +
 +Future versions of __bpsh__ may address this limitation if interest is sufficient.
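The quoting in the workaround is what keeps the redirection away from the shell on the master: the whole inner command has to reach the shell on the slave as a single argument. A minimal sketch of building such a command line safely is shown below. `slave_redirect_cmd` is a hypothetical helper, not part of bproc; it assumes the `bpsh NODE COMMAND` form seen in the `bpsh 4 myprog` example, and it only prints the command rather than running it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print (not run) a bpsh command line whose output
# redirection is evaluated by a shell on the slave node.
#   $1 = node number
#   $2 = program path as seen on the slave (e.g. in a shared directory)
#   $3 = output file on the slave
slave_redirect_cmd() {
    local node=$1 prog=$2 out=$3 inner
    # %q quotes each path so the slave-side shell sees it as one word.
    printf -v inner '%q >%q' "$prog" "$out"
    # Quote the whole inner command again so the master's shell passes it
    # to "bash -c" untouched instead of performing the redirection itself.
    printf 'bpsh %s bash -c %q\n' "$node" "$inner"
}

slave_redirect_cmd 4 /shared_space/myprog /scratch/myout1
```

The printed line can be reviewed and then pasted into a shell on the master. The two levels of quoting matter mostly when a path contains spaces or shell metacharacters.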
 +
 +===PIDs===
 +
 +**Bproc** unifies the PID space of the cluster. Thus, a process running on a slave node will see every process it has permission to see, anywhere in the cluster. It may freely signal any such process from the slave node, and the signal will have the same effect as if the sender were running on the master.
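One way to probe this is to send signal 0, which performs the existence and permission checks without delivering anything. The sketch below uses only the standard `kill` builtin; `can_signal` is a hypothetical helper name. Given the unified PID space described above, the same check would be expected to give the same answer on the master and on a slave node.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if the caller is allowed to signal the PID.
# Signal 0 checks existence and permission but delivers no signal.
can_signal() {
    kill -0 "$1" 2>/dev/null
}

# The current shell can always signal itself.
if can_signal "$$"; then
    echo "pid $$ is visible and can be signalled"
fi
```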
  
bprun.txt · Last modified: 2010/04/15 21:18 (external edit)