
1.5 Current Limitations

Limitations on Jobs which can be Checkpointed
Although Condor can schedule and run any type of process, Condor does have some limitations on jobs that it can transparently checkpoint and migrate:
1. On some platforms, specifically HPUX and Digital Unix (OSF/1), shared libraries are not supported; on these platforms, applications must be statically linked. (Note: shared library checkpoint support is available on IRIX, Solaris, and Linux.)
2. Only single-process jobs are supported; that is, fork(2), exec(2), system(3), and similar calls are not implemented.
3. Signals and signal handlers are supported, but Condor reserves the SIGUSR2 and SIGTSTP signals and does not permit their use by user code.
4. Many interprocess communication (IPC) calls are not supported; for example, socket(2), send(2), recv(2), and similar calls are not implemented.
5. All file operations must be idempotent: read-only and write-only file accesses work correctly, but programs that both read from and write to the same file may not.
6. Each checkpointed Condor job has an associated checkpoint file that is approximately the size of the process's address space. Disk space must be available to store the checkpoint file on the submitting machine (or on the optional Checkpoint Server module).
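
The idempotence requirement in item 5 can be illustrated with a small simulation. This is only a sketch, written in Python for brevity, and it rests on an assumption not spelled out above: that restarting from a checkpoint re-executes the work done since that checkpoint, while the contents of files on disk stay as the earlier run left them. The helper names (run_with_rollback, write_only_step, read_modify_write_step) are hypothetical and are not part of Condor.

```python
import os
import tempfile

def run_with_rollback(step, path, initial):
    """Run `step` twice on the same file, imitating a job that is rolled
    back to a checkpoint taken just before the step and then re-executes it.
    Assumption: process state is rolled back, file contents are not."""
    with open(path, "w") as f:
        f.write(initial)
    step(path)  # first execution, forgotten by the process after rollback
    step(path)  # re-execution after restart from the checkpoint
    with open(path) as f:
        return f.read()

def write_only_step(path):
    # Idempotent: the file is overwritten with a value that does not
    # depend on what the file held before.
    with open(path, "w") as f:
        f.write("result=42\n")

def read_modify_write_step(path):
    # Not idempotent: the new contents depend on the old contents.
    with open(path) as f:
        count = int(f.read())
    with open(path, "w") as f:
        f.write(str(count + 1))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.txt")
    write_only_result = run_with_rollback(write_only_step, path, "")
    counter_result = run_with_rollback(read_modify_write_step, path, "0")

print(repr(write_only_result))  # same as after a single execution
print(counter_result)           # "2", although the step logically ran once
```

Under that assumption, the write-only step leaves the file exactly as a single execution would, while the read-modify-write step increments the counter twice. Programs that read and write the same file run this risk.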

Note: these limitations only apply to jobs which Condor has been asked to transparently checkpoint. If job checkpointing is not desired, the limitations above do not apply.

Security Implications
Condor does a significant amount of work to prevent security hazards, but loopholes are known to exist. Condor can be instructed to run user programs only as user "nobody", a login that traditionally has very restricted access. Even when running solely as nobody, however, a sufficiently malicious individual could fill up /tmp (which is world-writable) or read world-readable files (the only files user nobody can access). Furthermore, where the security of machines in the pool is a high concern, only machines whose "root" user can be trusted should be admitted into the pool. (Note: Condor provides the administrator with IP-based security mechanisms to enforce this.)
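
An administrator who wants to see what a job running as nobody could reach can inspect the "other" permission bits of a file. The following is a minimal sketch in Python, not a Condor facility; the function name world_access and the example file names are hypothetical. It looks only at the file's own mode bits and ignores parent-directory permissions and any ACLs, so it should be read as a first approximation.

```python
import os
import stat
import tempfile

def world_access(path):
    """Return (readable, writable) for the 'other' permission class,
    which is the class that applies to an unprivileged user such as nobody
    when it neither owns the file nor belongs to the file's group."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH), bool(mode & stat.S_IWOTH)

with tempfile.TemporaryDirectory() as tmp:
    private = os.path.join(tmp, "private.txt")
    public = os.path.join(tmp, "public.txt")
    for p in (private, public):
        with open(p, "w") as f:
            f.write("x")
    os.chmod(private, 0o600)  # owner read/write only
    os.chmod(public, 0o644)   # world-readable, not world-writable
    private_access = world_access(private)
    public_access = world_access(public)

print(private_access)
print(public_access)
```

Files that report a readable "other" bit are the ones described above as accessible to user nobody; sensitive data on pool machines should not carry that bit.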

Jobs need to be relinked to get Checkpointing and Remote System Calls
Although few or no source code changes are typically required, Condor requires that jobs be relinked with the Condor libraries in order to provide checkpointing and remote system calls. This often precludes commercial software binaries from taking advantage of these services, because commercial packages rarely make their object code available. However, commercial packages can still be submitted and run under Condor, and can still take advantage of Condor's other services.


condor-admin@cs.wisc.edu