LBOS User Manual

Bob Walton

Wed Apr 19 13:25:37 EDT 2000

FIRST DRAFT


1. Introduction

LBOS, the Limited Batch Operating System, runs batch tasks that
can be CPU intensive but must NOT be IO or memory intensive.
LBOS is designed to run these tasks on desktop computers whose
users are willing to permit such tasks in background.

An LBOS task consists of a program that inputs and outputs
files. An LBOS task is submitted in a form that permits it to
run under different operating systems.

Owners of desktop computers can easily plug into the LBOS system
to make their computers available for LBOS tasks. Owners can
place some restrictions on use of their computers, such as
time-of-day restrictions, though usually restrictions are
unnecessary since LBOS tasks are configured to not interfere
with normal computer use. In a pinch, computer owners can kick
LBOS or individual LBOS tasks off their computers.


2. System Integrity

LBOS is a broker that matches LBOS users, who have tasks to
execute, with LBOS providers, who provide computers that execute
tasks.

Unintentional misbehavior of users or providers will happen
often enough to be observed by most serious users and most
serious providers. Sometimes intentional misbehavior of users or
providers will occur.

On the part of providers, the most common unintentional
misbehavior will be to crash tasks that should not crash,
because the provider computer is running out of some resource or
has a flaw that is acting up. The other unintentional provider
misbehavior may be to produce incorrect output. This may be more
common than with normal server computers, because most personal
computers do not have memory parity, so a random RAM memory
error in a computational run is more likely to be observed only
as corrupted output.

On the part of users, the most common unintentional misbehavior
will be to write a task that does too much IO or takes too much
disk or RAM space.
Most operating systems that a provider will have can place
processor usage of a task at very low priority, and prevent the
task from interfering with the provider's use of the computer by
taking too much processor time. But most operating systems do
not have similar ability to restrict a task's IO, so the
assertion that a task should not do much IO requires honor
system enforcement. Some operating systems may be able to
enforce limits on RAM and disk space usage, and some may require
honor system enforcement of these limits too.

Intentional misbehavior, while rare, can be quite damaging. A
user could write a task that attempted to install a virus on the
provider's computer. For this reason, users belong to
certification groups. An example of a certification group would
be users who are accessing LBOS under the auspices of a
university professor, and for whom the professor vouches. Also,
users who provide tasks are required to provide source code to
the LBOS system, to ensure that any miscreants will leave behind
a damning audit trail.

A provider can intentionally try to corrupt the output of a
task, and can violate the privacy of the task. For this reason,
users are discouraged from running tasks that require privacy.
Users are also encouraged to use the LBOS feature in which a
task is run on several nominally independent providers and the
results compared to ensure they have integrity.

To help keep watch on this situation, there is a task oriented
complaint system, in which users or providers can complain about
the way a task executed. The LBOS manager catalogs complaints
and permits administrators to take corrective action. LBOS also
maintains an extensive audit trail that permits administrators
to examine source code and input files and rerun tasks that are
suspected of being problematic.


2. Terminology and Control Files

An `LBOS manager' is a server process to which `LBOS users' can
submit LBOS tasks to be executed, and `LBOS providers' can
submit statements of willingness to execute tasks. An LBOS
manager has a `manager name' that is of the form

    host-name[:port]

where the host-name and optional port are a normal internet host
name and port number.

An `LBOS provider' is a computer that has been made available by
the computer's owner to an LBOS manager for the purpose of
executing LBOS tasks. The owner of this computer is also
referred to as an `LBOS provider'.

An `LBOS task' has a `task directory' on its `home computer'
that contains input, output, and control files for the task.
When an LBOS task is executing, the execution, known as a `task
run', has a `task run directory' on the provider computer in
which all program, input, and output files are stored.

An `LBOS task' is assigned a `task ID' by the LBOS manager when
the task is submitted. This task ID is any string of characters,
and is typically a long random number. The task ID is stored in
a control file in the task and task run directories, so it
usually does not have to be directly known by LBOS users or
providers.

The LBOS manager maintains an `LBOS task queue' which records
all the tasks submitted to the manager and their status. LBOS
tasks that were finished or aborted a very long time (say 2
weeks) ago are pruned from this task queue.

The directory of an LBOS task contains two files that have
special function. The `LBOS environment file', which is always
named `LBOSenv.txt', contains parameters that control the
management of the task. The `LBOS transcript' file, named
`LBOSscript.txt', contains a record of all the LBOS manager
actions performed for the task.

At the user's request, a task can be run several times on
nominally different providers, and the results can be compared
as a means of ensuring integrity of the providers. The various
task runs are numbered 1, 2, 3, etc.
If no request is made for more than one run, then there will be
only one run numbered 1.

Each task run is executed in a task run directory on a
provider's computer. This directory also contains the
LBOSscript.txt file mentioned above, though this file in a task
run directory does not have information about other runs of the
same task.

Files in the task directory that are not `LBOS*.*' files or
program source files are input files. The output files from task
run R are placed in the subdirectory `outR'. When a task is
submitted to an LBOS manager, any existing `out[1-9]*'
subdirectories of the task directory are deleted, and when task
run R delivers its output files, the associated `outR'
subdirectory that holds these files is created. An input file
that is modified by the task execution is also an output file.

LBOS users actually submit tasks to LBOS accounts on LBOS
managers. These accounts have names of the form

    xxx-yyy-zzz

which signify that xxx is under the auspices of yyy who is under
the auspices of zzz. Should xxx-yyy-zzz misbehave, yyy-zzz will
be notified, and should yyy-zzz misbehave, zzz will be notified.
Similarly LBOS providers define themselves to LBOS managers by
using LBOS accounts.

Each LBOS account has a public and a private key. LBOS managers
know the public keys; LBOS users and providers know their
private keys. Only a holder of an account's private key can
access an LBOS manager under the auspices of that account.

The keys for an account are generated by the ssh-keygen program
and nominally stored in the files:

    ~/.ssh/ACCOUNTNAME-key
    ~/.ssh/ACCOUNTNAME-key.pub

The first file is the private key and the second the public key.
The account name is also stored as the ssh key `comment'. The
private key may be encrypted by a password.


3. The LBOS Environment File (LBOSenv.txt)

This file controls the environment of the job. It consists of
the following `command' lines, in any order.
There may be spaces or tabs at the beginning of the line before
the first non-blank character, and lines can be continued to the
next line by putting a \ character just before the line end.

Capitalized words are to be replaced by appropriate values.
Expressions of the form [...] mean ... is optional, expressions
of the form {...|...} mean one of ... or ... is to be chosen,
and expressions of the form [...|...] mean one of ... or ... is
to be optionally chosen.

    # ...
        A comment line.

    manager = MANAGERNAME
        This records the manager to which the task is to be
        submitted. The default is to use the manager named in
        the operating system LBOS_MANAGER environment variable.

    account = ACCOUNTNAME
        This records the account to which the task is to be
        submitted. The default is to use the account named in
        the operating system LBOS_ACCOUNT environment variable.

    system {big-endian|little-endian|nt|unix}
        Specifies limits on the provider that runs the task. If
        the task uses binary data, it may need to restrict the
        endian type of provider. All providers are assumed to
        use IEEE floating point formats. Normally the type of
        operating system on the provider need not be specified.
        The default is to place no limits on the provider.

    program = {gcc|g++|g77} ...
        Specifies the GNU gcc, g++, or g77 command that compiles
        and links the program to be executed. The line may
        contain .c, .cc, or .f files, but not preprocessed,
        assembly, object files, or user provided library files.
        All the source files must be in (or symbolically linked
        into) the current directory.

        Note that the LBOS manager caches compiled programs, so
        specifying the same compilation in many tasks generally
        does not cause the compilation to be done more than
        once. Also note that the LBOS manager may archive the
        source files as part of its audit trail in case someone
        accuses a program of behaving adversarially.

        Programs are compiled using GNU compilers installed on
        appropriate Unix or Windows NT systems.
        The computers that do the compilation are under the
        control of the LBOS manager, and are not LBOS providers:
        that is, compilation is a service of the manager and NOT
        an LBOS task.

        The programs can open, create, close, read, write, and
        modify files using the standard libraries and commands
        of the source language. Only files in the current
        directory may be named and accessed by the program. All
        other IO is forbidden. Other systems operations, such as
        creating subprocesses, are forbidden.

        The programs may allocate and deallocate RAM memory
        using the standard libraries and commands of the source
        language.

        Files in the current directory that are not program
        source files are input files, except for the LBOS*.*
        files. Input files are copied to the provider current
        directory before the task is executed. Any of these
        files that change during the task execution, and any new
        files made during that execution, are output files, and
        are moved back to an `out*' subdirectory of the task
        directory at the end of execution.

        Often the same file is an input file or code file in
        many task directories. This can be accomplished by
        placing symbolic links to the actual file in all these
        task directories.

    arguments = ...
        Specifies arguments for the program when it is run. For
        UNIX a line is formed consisting of the name of the
        program binary followed by all the characters after
        `arguments =', and this line is passed to the UNIX sh
        shell to execute the program. Therefore the arguments
        may be quoted as for a UNIX shell.

    ram = NNNN mb
        Specifies the maximum number of megabytes of random
        access memory (RAM) that the program will need during
        execution. The default is 2 mb.

    disk = NNNN mb
        Specifies the maximum number of megabytes of disk file
        space that the program will need during execution. This
        includes all input and output files and program binary
        files. The default is 10 mb.
    cpu = NNNN sec
        Specifies the maximum number of central processor unit
        (CPU) seconds that the program will need to execute. The
        default is 3600 sec.

    runs = N
        Specifies the number of nominally independent runs of
        the task that are requested. Usually no more than 3 runs
        may be requested. The default is N = 1.


4. The LBOS Transcript File (LBOSscript.txt)

The LBOS transcript file consists of one-line entries. A line
can be continued by placing a \ just before the end of the line.
In the following, R is the run number, 1, 2, 3, etc., to which
an entry applies.

    #...
        A comment line. Used for extra information.

    manager = MANAGERNAME
        This records the manager the task was submitted to.

    account = ACCOUNTNAME
        If the LBOSscript.txt file is in a task directory, this
        records the account the task was submitted to. If the
        LBOSscript.txt file is in a task run directory, this
        records the account the provider was attached to when
        the provider acquired the task.

    task id = TASKID
        This records the special id number assigned to the task.
        This id is a character string, usually a large random
        number, that can be used to identify the task when
        making complaints, etc. Since the task ID is recorded in
        this file, programs can use a directory name as the name
        of the task, and get the task ID from this file in the
        directory.

    time SSSS DDDD
        Date and time, to the nearest second (at least). Every
        action is preceded and followed by a `time' entry to
        indicate the start and stop time of the action. If one
        action immediately follows another, the stop time of the
        first and the start time of the second are given by the
        same `time' entry. If two actions are very quick, the
        `time' entry between them may be omitted.

        SSSS is the time in seconds from the start of the task.
        So the first time entry in the file has SSSS equal to 0.
        DDDD is the date and time in appropriate format, such
        as:

            Tue Apr 11 16:52:39 EDT 2000

    status R = \
        {waiting|started|finished|done|killed|suspended}
        Records any change in status of a task run in the
        manager's task queue. The suspended state means that the
        task will not start; once started a task cannot be
        suspended. Finished means a task run has finished
        executing, and done means in addition that all run
        status and run output files have been copied back into
        the task directory.

    inputting FILENAME
    outputting FILENAME
        Records file copying between the user's or provider's
        computer and the LBOS manager. FILENAME is the name of a
        file relative to the task or task run directory. Outputs
        for the task directory have FILENAMEs beginning with
        `outR/' that identify the run R that produced the file.

    ram R = NNNN mb
    disk R = NNNN mb
    cpu R = NNNN sec
        Records the amount of RAM in megabytes, disk space in
        megabytes, and cpu time in seconds actually used by the
        task run execution. These amounts are stored in the task
        queue entry of a `finished' task run.

    system R {big-endian|little-endian|nt|unix}
        Records facts about the provider's system for a run.
        This information is also stored in the run task queue
        entry when the run is started.


5. LBOS Accounts and Account Commands

LBOS commands that a user or provider uses to access an LBOS
manager require that the user or provider have an account and
the private key of that account. There are three ways to manage
this:

    (1) Best. For UNIX put the line:

            if ( ! $?SSH_AUTH_SOCK ) eval `ssh-agent`

        in your .login file, protect your private keys with a
        password, and put the private key for account ACC in the
        file named `~/.ssh/ACC-key'. Then when the private key
        is first needed after you have initially logged into the
        computer you are using you will be asked for the
        password you used to protect the private key, and
        thereafter you will not be asked again for the password.
        The .login line ensures you are running an ssh-agent
        process, and that process remembers all the private keys
        that have been unencrypted using their associated
        passwords. The unencrypted keys are never stored in a
        file.

    (2) Least secure. Put your private key for account ACC in
        the file named `~/.ssh/ACC-key' but do NOT protect the
        key with a password. You will never be asked for a
        password.

    (3) Like (1) but put nothing in your .login file so no
        ssh-agent process is running. Then EVERY time a key is
        needed you will be asked to type the password of the
        key. Annoying if you do very much.

The following commands can be executed by EITHER an LBOS user or
an LBOS provider to create and inspect LBOS accounts.

    lbos make key ACCOUNTNAME [FILENAME]
        This command makes a new private/public key pair for the
        given account. The command asks you to type a password
        twice and creates:

            ~/.ssh/ACCOUNTNAME-key
            ~/.ssh/ACCOUNTNAME-key.pub

        if you gave no FILENAME. If you gave a FILENAME it
        creates `FILENAME' and `FILENAME.pub' instead to hold
        the private and public key.

        To create a private key without a password use an
        `empty' (zero length) password.

        It is recommended that passwords be made by finding a
        memorable sentence of at least 10 words, and taking the
        first letter of each word. You may also use symbols to
        abbreviate some words, like `4' for `for' or `&' for
        `and'.

        This command without a FILENAME is the same as:

            ssh-keygen -f ~/.ssh/ACCOUNTNAME-key \
                       -C ACCOUNTNAME

    lbos change password ACCOUNTNAME [FILENAME]
        This command changes the password for an existing
        private key. You will be asked to type a new password
        twice. The private key is stored in the file:

            ~/.ssh/ACCOUNTNAME-key

        if no FILENAME is given, or in the file named `FILENAME'
        otherwise.

        To remove a password from a private key give the key the
        `empty' (zero length) password.
        This command without a FILENAME is the same as:

            ssh-keygen -f ~/.ssh/ACCOUNTNAME-key -p

    lbos account new ACCOUNTNAME[@MANAGERNAME] \
                     EMAIL-ADDRESS [FILENAME.pub]
        Creates a new account on the given manager. If the
        manager is not given, then the manager named in the
        operating system LBOS_MANAGER environment variable is
        used.

        A public key must be provided. This is stored by default
        in the file

            ~/.ssh/ACCOUNTNAME-key.pub

        However, it can be stored in a file designated as
        FILENAME.pub in the command line.

        An email address must be given that permits the LBOS
        manager to contact the people responsible for the
        account. The `EMAIL-ADDRESS' parameter can in fact
        contain several email addresses: use commas to separate
        these and either use no spaces or "" the entire
        argument. E.g.:

            foo.oops.edu,fum.oops.edu
            "foo.oops.edu, fum.oops.edu"

        However, it is normally not necessary to include email
        addresses of the parent account among the email
        addresses of a child account.

        If this command has access to the parent of the new
        account (i.e., has the private key for that parent
        account), then the new account is created. If this
        command does not have access to the parent of the new
        account, then email is sent to the people responsible
        for the parent account asking them to establish the new
        account. You will be asked to type a message using your
        favorite email editor.

        The parent of top level accounts is a special top level
        account named `manager'.

    lbos account key ACCOUNTNAME[@MANAGERNAME] \
                     [FILENAME.pub]
    lbos account email ACCOUNTNAME[@MANAGERNAME] \
                       EMAIL-ADDRESS
    lbos account disable ACCOUNTNAME[@MANAGERNAME]
    lbos account enable ACCOUNTNAME[@MANAGERNAME]
    lbos account list ACCOUNTNAME[@MANAGERNAME]
        These commands may be used to change the public key or
        email address[es] of an existing account, or disable or
        enable an account, or list the status (key, email
        addresses, enabledness) of the account.
        If one of these commands other than `list' does not have
        access to the existing account or its parent, email is
        sent to the person responsible for the parent account.
        These commands have parameters similar to those of `lbos
        account new'.

        No command other than these can access a disabled
        account until it is reenabled. The command to reenable
        an account must have access to the parent of the
        account.

    lbos account fix FILENAME
        This performs commands sent by email using the LBOS
        `account new', `account key', `account email', `account
        disable', or `account enable' commands above. The person
        who receives the email can store the email in the
        FILENAME file and execute this command to execute the
        LBOS commands encoded in the email. The encoding of the
        LBOS commands in the email is obvious, so these commands
        can also be edited first.

        It is rare to execute `account {key,email,disable,
        enable}' requests this way, so the `account fix' command
        asks for confirmation that these should really be done.


6. LBOS User Commands

The following commands are executed by the LBOS user on the
computer containing the task directories. In these commands, a
TASK-DIRECTORY-LIST is a space-separated list of task directory
names, that defaults to `.', the current directory.

The account and manager used for a task are determined by the
task's LBOSenv.txt file when the task is started, or by the
LBOS_MANAGER and LBOS_ACCOUNT operating system environment
variables if the LBOSenv.txt file has no manager or account
entry when the task is started.

Note that output files and updates of the LBOSscript.txt file in
a task directory are only written when an `lbos update' command
for the task directory is executed.

    lbos submit [TASK-DIRECTORY-LIST]
        Submit the named tasks to the LBOS manager. This places
        tasks on the manager's task queues.

        If any task directory has a LBOSscript.txt file, this is
        deleted and reinitialized at this time. New task IDs are
        assigned and written to the reinitialized LBOSscript.txt
        file.
        The manager and account of a task are also remembered in
        the LBOSscript.txt file.

        If any task directory has `out[1-9]*' subdirectories,
        these are deleted at this time.

        This command finishes by printing the status of the
        tasks in their queues.

    lbos list [TASK-DIRECTORY-LIST]
        Print the status of the tasks on their queues.

    lbos update [TASK-DIRECTORY-LIST]
        Updates any files in the task directories, writing
        output files and LBOSscript.txt files as necessary. Also
        prints status of tasks.

    lbos {kill|suspend|resume} [R] [TASK-DIRECTORY-LIST]
        Change the status of the run R to `killed', `suspended',
        or `waiting'. It is an error to change the status of an
        already `finished', `done', or `killed' task run and
        nothing will be done. It is a similar error to suspend
        or resume a `started' task run. If the run number R is
        not given, all runs of the task will be processed.

    lbos gripe R [TASK-DIRECTORY]
        Complain about run R of the task. The default
        TASK-DIRECTORY is `.', the current directory. You will
        be asked for a message.


7. LBOS Provider Commands

To become an LBOS provider, find the appropriate web page and
download the designated software. Then the following commands
can be executed.

Note that the names of these commands are distinct from the
names of commands used by an LBOS user. E.g., to kill a task a
user uses the `kill' command, and to kill a task run the
provider uses the `murder' command. This permits the same person
to be both a user and a provider.

In the following commands ACCOUNTNAME and MANAGERNAME default to
the values of the operating system LBOS_ACCOUNT and LBOS_MANAGER
environment variables.

    lbos attach [ACCOUNTNAME[@MANAGERNAME]]
        Attach the current computer as a provider for the named
        manager, thus making the current computer available to
        execute tasks for that manager. A computer can be a
        provider to several managers at one time, but a provider
        will run only one task at a time.
        There is no provision for being attached to two accounts
        at the same manager at the same time.

    lbos detach [ACCOUNTNAME[@MANAGERNAME]]
        Detach the provider from the manager. New tasks will not
        start, but if there is a current task, it may finish.

    lbos peek [N]
        List the IDs, status, statistics, etc. of the last N
        tasks executed or being executed by the provider. N
        defaults to 1. Also lists the managers and accounts to
        which the provider is currently attached.

    lbos murder
        Kill the current task. You will be asked for a message:
        please use it to explain the reason for killing the
        task.

    lbos complain [T]
        Complain about task T, where T is the `task number'
        given to the task by the last `lbos peek' command (and
        is typically the number of the task since the first
        `lbos attach' command). If T is missing, the last task
        murdered is complained about. You will be asked for a
        message.


8. LBOS Account Housekeeping Commands

The following commands permit a person who has access to an
account ACC to do housekeeping on the task runs of accounts that
are children of ACC.

In the following commands ACCOUNTNAME and MANAGERNAME default to
the values of the LBOS_ACCOUNT and LBOS_MANAGER environment
variables.

The following commands require access to the account specified
or to one of its ancestors.

    lbos account {task|children} peek [N|D days] \
                 [ACCOUNTNAME[@MANAGERNAME]]
        List the same information as `lbos list' for the tasks
        run by a given account (task peek) or the children of a
        given account (children peek). All tasks are listed
        unless the list is restricted to the latest N tasks or
        all tasks in the last D days. The time assigned to each
        task for these purposes is the time of the last action
        taken by or status change of the task.

    lbos account task list TASK-LIST
    lbos account task {kill|suspend|resume} [R] [TASK-LIST]
    lbos account task gripe R TASK
        Execute the `list', `kill', `suspend', `resume', or
        `gripe' commands for the task indicated by the TASK-LIST
        or TASK.
        In TASK-LIST and TASK, tasks are identified by `task
        numbers' given in the last execution of an `account task
        peek' or `account children peek' command.

    lbos account audit TASK [FILENAME]
        Copy the file with the given FILENAME, relative to the
        task directory, for the given task, into the current
        directory (with FILENAME as its name relative to the
        current directory). TASK is a `task number' given by the
        last execution of `account {task|children} peek'. If
        FILENAME is omitted, all the task files are copied.

        Source code, input files, and output files may be copied
        for inspection. This is primarily to catch miscreants.

    lbos account retain {task|source|account} D days \
                 [ACCOUNTNAME[@MANAGERNAME]]
        Set the retention time of the account for task
        information, source code and input files, or disabled
        accounts.

        Information is retained if the account or any ancestor
        of the account wants it retained. Thus if the `root'
        account wants disabled accounts retained 365 days, all
        disabled accounts will be retained at least 365 days.

        If D is given as zero, all information of the indicated
        kind that in normal operation need never be accessed
        again is deleted. Thus source files of tasks all of
        whose runs are finished will be deleted.

        The time assigned to task information and task source
        code and input files is the time the task information
        was last updated. The time assigned to disabled accounts
        is the time the account was disabled.
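
The retention rule above (information is kept if the account or
any of its ancestors wants it kept) amounts to taking the
largest retention time requested along the account's chain of
ancestors. The following Python sketch illustrates this under
stated assumptions; it is not part of LBOS. The function names
are hypothetical, the chain of ancestors is derived from the
xxx-yyy-zzz account naming form, the top level account is
assumed to be the special account named `manager', and an
account with no retention request is assumed to ask for 0 days.

```python
def ancestors(account, top="manager"):
    """List the ancestors of an account named like 'xxx-yyy-zzz'.

    Assumption: the parent of 'xxx-yyy-zzz' is 'yyy-zzz', whose parent
    is 'zzz', whose parent is the special top level account `top`.
    """
    parts = account.split("-")
    chain = ["-".join(parts[i:]) for i in range(1, len(parts))]
    return chain + [top]


def effective_retention_days(account, requested, top="manager"):
    """Return the number of days information is actually retained.

    `requested` maps account names to the D given in a retain request.
    Assumption: an account without a request counts as 0 days.
    """
    names = [account] + ancestors(account, top)
    # Information is kept if the account OR any ancestor wants it kept,
    # so the longest requested time along the chain wins.
    return max(requested.get(name, 0) for name in names)


if __name__ == "__main__":
    # Hypothetical account names and retention requests.
    requested = {"manager": 365, "prof-univ": 30, "student-prof-univ": 7}
    print(ancestors("student-prof-univ"))
    print(effective_retention_days("student-prof-univ", requested))
```

Under these assumptions a child account can lengthen its own
retention time but cannot shorten a time requested by any of its
ancestors.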