3.4 Configuring Condor
This section describes how to configure all parts of the Condor
system. First, we describe some general information about the config
files, their syntax, etc. Then, we describe settings that affect all
Condor daemons and tools. Finally, we have a section describing the
settings for each part of Condor. The only exceptions are the
settings that control the policy under which Condor will start,
suspend, resume, vacate, or kill jobs. These settings (and other
important concepts from the condor_startd) are described in
section 3.5 on ``Configuring Condor's Job
Execution Policy''.
3.4.1 Introduction to Config Files
The Condor configuration files are used to customize how Condor
operates at a given site. The basic configuration as shipped with
Condor should work well for most sites, with a few exceptions of
things that might need special customization. Please see the
Installation section of this manual for details on where Condor's
config files are found.
Each condor program will, as part of its initialization process,
``configure'' itself by calling a library routine which parses the
various config files that might be used including pool-wide,
platform-specific, machine-specific, and root-owned config files. The
result is a list of constants and expressions which the program may
evaluate as needed at run time.
Definitions in the configuration file come in two flavors, macros and
expressions. Macros provide string valued constants which remain
static throughout the execution of the program. Expressions can be
arithmetic, boolean, or string valued, and can be evaluated
dynamically at run time.
The order in which Macros and Expressions are defined is important,
since you cannot define anything in terms of something else that
hasn't been defined yet. This is particularly important if you break
up your config files using the LOCAL_CONFIG_FILE setting
described in sections 3.4.2
and 3.9.2 below.
3.4.1.1 Config File Macros
Macro definitions are of the form:
<macro_name> = <macro_definition>
NOTE: You must have whitespace between the macro name, the
``='' sign, and the macro definition. Macro invocations are of the
form:
$(macro_name)
Macro definitions may contain references to previously defined
macros. Nothing in a config file can reference Macros which have
not yet been defined. Thus,
A = xxx
C = $(A)
is a legal set of macro definitions, and the resulting value of ``C'' is
``xxx''. Note that ``C'' is actually bound to ``$(A)'', not its value, thus
A = xxx
C = $(A)
A = yyy
is also a legal set of macro definitions and the resulting value of
``C'' is ``yyy''. However,
A = $(C)
C = xxx
is not legal, and will result in the Condor daemons and tools exiting
when they try to parse their config files.
3.4.1.2 Config File Expressions
Expression definitions are of the form:
<expression_name> : <expression>
NOTE: You must have whitespace between the expression name,
the ``:'' sign, and the expression definition. Expressions may
contain constants, operators, and other expressions. Macros may also
be used to aid in writing expressions. Constants may be booleans,
denoted by ``true'' (or ``t'') or ``false'' (or ``f''), signed
integers, floating point values, or strings enclosed in double quotes
("). All config file expressions are simply inserted into various
ClassAds. Please see the appendix on ClassAds for details about
ClassAd expression operators, and how ClassAd expressions are
evaluated.
Note that expressions which contain references to other expressions
are bound to the expression's definition, not its current value, while
expressions which contain macro invocations are bound to the current
value of the macro. Thus
X : "xxx"
Y : X
X : "yyy"
will result in ``Y'' being evaluated as ``yyy'' at run time, but
X = "xxx"
Y : $(X)
X = "yyy"
will result in ``Y'' having a run time value of ``xxx''.
3.4.1.3 Other Syntax
Other than macros and expressions, a Condor config file can contain
comments or continuations. A comment is any line beginning with a
``#''. A continuation is any entry (either macro or expression) that
continues across multiple lines. This is accomplished with the
``\'' sign at the end of any line that you wish to continue onto
another. For example,
START : (KeyboardIdle > 15 * $(MINUTE)) && \
((LoadAvg - CondorLoadAvg) <= 0.3)
or,
ADMIN_MACHINES = condor.cs.wisc.edu, raven.cs.wisc.edu, \
stork.cs.wisc.edu, ostrich.cs.wisc.edu, \
bigbird.cs.wisc.edu
HOSTALLOW_ADMIN = $(ADMIN_MACHINES)
3.4.1.4 Pre-Defined Macros and Expressions
Condor provides a number of pre-defined macros and expressions that
help you configure Condor. Pre-defined macros are listed as
$(macro_name), while pre-defined expressions are just listed as
expression_name, to denote how they should be referenced in
other macros or expressions.
The first set are special entries whose values are determined at
runtime and cannot be overridden. These are inserted automatically by
the library routine which parses the config files.
- CurrentTime
- This expression
provides the current result of the system call time(2). This
is an integer containing the number of seconds since an arbitrary
date defined by UNIX as the ``beginning of time'', hereafter
referred to as the UNIX date.
- $(FULL_HOSTNAME)
- This is the
fully qualified hostname of the local machine (hostname plus domain
name).
- $(HOSTNAME)
- This is just the
hostname of the local machine (no domain name).
- $(TILDE)
- This is the full path to the
home directory of user ``condor'', if such a user exists on the
local machine.
- $(SUBSYSTEM)
- This is the ``subsystem''
name of the daemon or tool that is evaluating the macro. The
different subsystem names are described in
section 3.4.1.5 below.
The final set are entries whose default values are determined
automatically at runtime but which can be overridden.
- $(ARCH)
- This setting defines the string
used to identify the architecture of the local machine to Condor.
The condor_startd will advertise itself with this attribute so
that users can submit binaries compiled for a given platform and
force them to run on the correct machines. condor_submit will
automatically append a requirement to the job ClassAd that it must
run on the same ARCH and OPSYS of the machine where
it was submitted, unless the user specifies ARCH and/or
OPSYS explicitly in their submit file. See the
condor_submit(1) man page for details.
- $(OPSYS)
- This setting defines the
string used to identify the operating system of the local machine to
Condor. See the entry on ARCH above for more information.
If this setting is not defined in the config file, Condor will
automatically insert the operating system of this machine as
determined by uname.
- $(FILESYSTEM_DOMAIN)
- This parameter defaults to the fully
qualified hostname of the machine it is evaluated on. See
section 3.4.5 on ``Shared
Filesystem Config File Entries'' below for the full description of
its use and under what conditions you would want to override it.
- $(UID_DOMAIN)
- This parameter defaults to the fully
qualified hostname of the machine it is evaluated on. See
section 3.4.5 on ``Shared
Filesystem Config File Entries'' below for the full description of
its use and under what conditions you would want to override it.
Since ARCH and OPSYS will automatically be set to the
correct values, we recommend that you do not override them yourself.
Only do so if you know what you are doing.
3.4.1.5 Condor Subsystem Names
IMPORTANT NOTE: Many of the entries in the config file will
be named with the subsystem of the various Condor daemons.
This is a unique string which identifies a given daemon within the
Condor system. The possible subsystem names are:
- STARTD
- SCHEDD
- MASTER
- COLLECTOR
- NEGOTIATOR
- KBDD
- SHADOW
- STARTER
- CKPT_SERVER
- SUBMIT
In the description of the actual config file entries, ``SUBSYS'' will
stand for one of these possible subsystem names.
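To sketch how this substitution works, an entry documented here as
SUBSYS_LOG would appear once per daemon in an actual config file,
with the subsystem name filled in (the log file names shown are only
illustrative):
STARTD_LOG = $(LOG)/StartLog
SCHEDD_LOG = $(LOG)/SchedLog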
3.4.2 Condor-wide Config File Entries
This section describes settings which affect all parts of the Condor
system.
- CONDOR_HOST
- This macro is
just used to define the NEGOTIATOR_HOST and
COLLECTOR_HOST macros. Normally, the condor_collector
and condor_negotiator would run on the same machine. If for some
reason they weren't, CONDOR_HOST would not be needed. Some
of the host-based security macros use CONDOR_HOST by
default. See section 3.7 on ``Setting up
IP/host-based security in Condor'' for details.
- COLLECTOR_HOST
- The
hostname of the machine where the condor_collector is running for
your pool. Normally it would just be defined with the
CONDOR_HOST macro described above.
- NEGOTIATOR_HOST
- The
hostname of the machine where the condor_negotiator is running for
your pool. Normally it would just be defined with the
CONDOR_HOST macro described above.
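For example, if your central manager were a (hypothetical) machine
named cm.example.edu, you could define all three entries as:
CONDOR_HOST = cm.example.edu
COLLECTOR_HOST = $(CONDOR_HOST)
NEGOTIATOR_HOST = $(CONDOR_HOST)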
- RELEASE_DIR
- The full path to
the Condor release directory, which holds the bin, etc, lib, and
sbin directories. Other macros are defined relative to this one.
- BIN
- This directory points to the
Condor bin directory, where user-level programs are installed. It
is usually just defined relative to the RELEASE_DIR macro.
- LIB
- This directory points to the
Condor lib directory, where libraries used to link jobs for Condor's
standard universe are stored. The condor_compile program uses
this macro to find these libraries, so it must be defined.
LIB is usually just defined relative to the
RELEASE_DIR macro.
- SBIN
- This directory points to the
Condor sbin directory, where Condor's system binaries (such as the
binaries for the Condor daemons) and administrative tools are
installed. Whatever directory SBIN points to should
probably be in the PATH of anyone who is acting as a Condor
administrator.
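As a sketch, assuming a (hypothetical) installation under
/usr/local/condor, these directory macros are typically defined
relative to RELEASE_DIR:
RELEASE_DIR = /usr/local/condor
BIN = $(RELEASE_DIR)/bin
LIB = $(RELEASE_DIR)/lib
SBIN = $(RELEASE_DIR)/sbin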
- LOCAL_DIR
- The location of the
local Condor directory on each machine in your pool. One common
option is to use the condor user's home directory which you could
specify with $(tilde). For example:
LOCAL_DIR = $(tilde)
On machines with a shared filesystem, where either the
$(tilde) directory or another directory you want to use is
shared among all machines in your pool, you might use the
$(hostname) macro and have a directory with many
subdirectories, one for each machine in your pool, each named by
hostnames. For example:
LOCAL_DIR = $(tilde)/hosts/$(hostname)
or:
LOCAL_DIR = $(release_dir)/hosts/$(hostname)
- LOG
- This entry is used to specify the
directory where each Condor daemon writes its log files. The names
of the log files themselves are defined with other macros, which use
the LOG macro by default. The log directory also acts as
the current working directory of the Condor daemons as they run, so
if one of them should drop a core file for any reason, it would wind
up in the directory defined by this macro. Normally, LOG is
just defined in terms of $(LOCAL_DIR).
- SPOOL
- The spool directory is where
certain files used by the condor_schedd are stored, such as the
job queue file, and the initial executables of any jobs that have
been submitted. In addition, if you are not using a checkpoint
server, all the checkpoint files from jobs that have been submitted
from a given machine will be stored in that machine's spool
directory. Therefore, you will want to ensure that the spool
directory is located on a partition with enough disk space. If a
given machine is only set up to execute Condor jobs and not submit
them, it would not need a spool directory (or this macro defined).
Normally, SPOOL is just defined in terms of
$(LOCAL_DIR).
- EXECUTE
- This directory acts as
the current working directory of any Condor job that is executing on
the local machine. If a given machine is only set up to submit
jobs and not execute them, it would not need an execute directory
(or this macro defined). Normally, EXECUTE is just defined
in terms of $(LOCAL_DIR).
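The LOG, SPOOL, and EXECUTE entries described above are commonly
defined together in terms of $(LOCAL_DIR), for example:
LOG = $(LOCAL_DIR)/log
SPOOL = $(LOCAL_DIR)/spool
EXECUTE = $(LOCAL_DIR)/execute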
- LOCAL_CONFIG_FILE
- The
location of the local, machine-specific config file for each machine
in your pool. The two most common options would be putting this
file in the $(LOCAL_DIR) you just defined, or putting all
local config files for your pool in a shared directory, each one
named by hostname. For example:
LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local
or:
LOCAL_CONFIG_FILE = $(release_dir)/etc/$(hostname).local
or, not using your release directory:
LOCAL_CONFIG_FILE = /full/path/to/configs/$(hostname).local
Beginning with Condor version 6.0.1, the LOCAL_CONFIG_FILE
is treated as a list of files, not a single file. So, you can use
either a comma or space separated list of files as its value. This
allows you to specify multiple files as the ``local config file''
and each one will be processed in order (with parameters set in
later files overriding values from previous files). This allows you
to use one global config file for multiple platforms in your pool,
define a platform-specific config file for each platform, and
finally use a local config file for each machine. For more
information on this, see section 3.9.2 on
``Configuring Condor for Multiple Platforms''.
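For example, a (hypothetical) setup with one platform-specific file
and one machine-specific file could be written as:
LOCAL_CONFIG_FILE = $(RELEASE_DIR)/etc/$(OPSYS).config, \
                    $(RELEASE_DIR)/etc/$(HOSTNAME).local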
- CONDOR_ADMIN
- This is the email
address that Condor will send mail to when something goes wrong in
your pool. For example, if a daemon crashes, the condor_master
can send an obituary to this address with the last few lines
of that daemon's log file and a brief message that describes what
signal or exit status that daemon exited with.
- MAIL
- This is the full path to a mail
sending program that understands that ``-s'' means you wish to
specify a subject to the message you're sending. On all platforms,
the default shipped with Condor should work. Only if you have
installed things in a non-standard location on your system would you
need to change this setting.
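For example (the address and path here are purely illustrative):
CONDOR_ADMIN = condor-admin@example.edu
MAIL = /usr/bin/mail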
- RESERVED_SWAP
- This setting
determines how much swap space you want to reserve for your own
machine. Condor will not start up more condor_shadow processes if
the amount of free swap space on your machine falls below this
level.
- RESERVED_DISK
- This setting
determines how much disk space you want to reserve for your own
machine. When Condor is reporting the amount of free disk space in
a given partition on your machine, it will always subtract this
amount. For example, the condor_startd advertises the amount of
free space in the EXECUTE directory described above.
- LOCK
- Condor needs to create a few
lock files to synchronize access to various log files. Because of
problems we've had with network filesystems and file locking over
the years, we highly recommend that you put these lock
files on a local partition on each machine. If you don't have your
LOCAL_DIR on a local partition, be sure to change this
entry. Whatever user (or group) condor is running as needs to have
write access to this directory. If you're not running as root, this
is whatever user you started up the condor_master as. If you are
running as root, and there's a condor account, it's probably condor.
Otherwise, it's whatever you've set in the CONDOR_IDS
environment variable. See section 3.10.2 on ``UIDs in
Condor'' for details on this.
- HISTORY
- This entry defines the
location of the Condor history file, which stores information about
all Condor jobs that have completed on a given machine. This entry
is used by both the condor_schedd which appends the information,
and condor_history, the user-level program that is used to view
the history file.
- DEFAULT_DOMAIN_NAME
- If you don't use a fully qualified name in your /etc/hosts
file (or NIS, etc.) for either your official hostname or as an
alias, Condor wouldn't normally be able to use fully qualified names
in places that it'd like to. You can set this parameter to the
domain you'd like appended to your hostname, if changing your host
information isn't a good option. This parameter must be set in the
global config file (not the LOCAL_CONFIG_FILE specified
above). The reason for this is that the FULL_HOSTNAME
special macro is used by the config file code in Condor which needs
to know the full hostname. So, for DEFAULT_DOMAIN_NAME to
take effect, Condor must already have read in its value. However,
Condor must set the FULL_HOSTNAME special macro since you
might use that to define where your local config file is. So, after
reading the global config file, Condor figures out the right values
for HOSTNAME and FULL_HOSTNAME and inserts them
into its configuration table.
- CREATE_CORE_FILES
- Condor can be told whether or not you want the Condor daemons to
create a core file if something really bad happens. This just sets
the resource limit for the size of a core file. By default, we
don't do anything, and leave in place whatever limit was in effect
when you started the Condor daemons (normally the condor_master).
If this parameter is set to ``True'', we increase the limit to as
large as it gets. If it's set to ``False'', we set the limit at 0
(which means that no core files are even created). Core files
greatly help the Condor developers debug any problems you might be
having. By using the parameter, you don't have to worry about
tracking down where in your boot scripts you need to set the core
limit before starting Condor, etc. You can just set the parameter
to whatever behavior you want Condor to enforce. This parameter has
no default value, and is commented out in the default config file.
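For example, to have the daemons raise the core file size limit as
high as possible, you would uncomment the parameter and set:
CREATE_CORE_FILES = True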
3.4.3 Daemon Logging Config File Entries
These entries control how and where the Condor daemons write their log
files. All of these entries are named with the subsystem (as
described in section 3.4.1.5 above) of the daemon
you wish to control logging for.
- SUBSYS_LOG
- This is the name of
the log file for the given subsystem. For example,
STARTD_LOG gives the location of the log file for the
condor_startd. These entries are defined relative to the
LOG macro described above. The actual names of the files
are also used in the VALID_LOG_FILES entry used by
condor_preen, which is described below. If you change one of the
filenames with this setting, be sure to change the
VALID_LOG_FILES entry as well, or condor_preen will
delete your newly named log files.
- MAX_SUBSYS_LOG
- This
setting controls the maximum length in bytes to which the various
logs will be allowed to grow. Each log file will grow to the
specified length, then be saved to a ``.old'' file. The ``.old''
files are overwritten each time the log is saved, thus the maximum
space devoted to logging for any one program will be twice the
maximum length of its log file.
- TRUNC_SUBSYS_LOG_ON_OPEN
- If this macro is defined and set
to ``True'' the affected log will be truncated and started from an
empty file with each invocation of the program. Otherwise, new
invocations of the program will simply append to the previous log
file. By default this setting is turned off for all daemons.
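For example, the condor_startd's logging could be controlled with
entries like the following (the file name and the 640000 byte limit
are only illustrative):
STARTD_LOG = $(LOG)/StartLog
MAX_STARTD_LOG = 640000
TRUNC_STARTD_LOG_ON_OPEN = False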
- SUBSYS_DEBUG
- All of the
Condor daemons can produce different levels of output depending on
how much information you want to see. The various levels of
verbosity for a given daemon are determined by this entry. All
daemons have a default level, D_ALWAYS, and log messages for
that level will be printed to the daemon's log, regardless of what
you have set here. The other possible debug levels are:
- D_FULLDEBUG
- Generally, turning on
this setting provides very verbose output in the log files.
- D_DAEMONCORE
- This provides log
file entries for things that are specific to DaemonCore, such as
timers the daemons have set, the commands that are registered, and
so on. If both D_FULLDEBUG and D_DAEMONCORE are set,
you get VERY verbose output.
- D_PRIV
- This flag turns on log
messages about the privilege state switching that the daemons
do. See section 3.10.2 on UIDs in Condor for more details.
- D_COMMAND
- With this flag set, any
daemon that uses DaemonCore will print out a log message
whenever a command comes in. The name and integer of the command
are printed, whether the command was sent via UDP or TCP, and where
the command was sent from. Because the condor_kbdd works by
sending UDP commands to the condor_startd whenever there is
activity on the X server, we don't recommend turning on
D_COMMAND logging in the condor_startd, since you will get so
many messages that the log file will be fairly useless to you. On
platforms that use the condor_kbdd, this is turned off in the
condor_startd by default.
- D_LOAD
- The condor_startd keeps track
of the load average on the machine where it is running. Both the
general system load average, and the load average being generated by
Condor's activity there. With this flag set, the condor_startd
will print out a log message with the current state of both of these
load averages whenever it computes them. This flag only affects the
condor_startd.
- D_JOB
- When this flag is set, the
condor_startd will dump out to its log file the contents of any
job ClassAd that the condor_schedd sends to claim the
condor_startd for its use. This flag only affects the
condor_startd.
- D_MACHINE
- When this flag is set,
the condor_startd will dump out to its log file the contents of
its resource ClassAd when the condor_schedd tries to claim the
condor_startd for its use. This flag only affects the
condor_startd.
- D_SYSCALLS
- This flag is used to
make the condor_shadow log remote syscall requests and return
values. This can help track down problems a user is having with a
particular job since you can see what system calls the job is
performing, which, if any, are failing, and what the reason for the
failure is. The condor_schedd also uses this flag for the server
portion of the queue management code. So, with D_SYSCALLS
defined in SCHEDD_DEBUG you will see verbose logging of all
queue management operations the condor_schedd performs.
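For example, to get verbose condor_shadow logs that include remote
system call tracing, you might set:
SHADOW_DEBUG = D_FULLDEBUG D_SYSCALLS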
3.4.4 DaemonCore Config File Entries
Please read section 3.6 on ``DaemonCore'' for details
about what DaemonCore is. There are certain config file settings that
DaemonCore uses which affect all Condor daemons (except the checkpoint
server, shadow, and starter, none of which use DaemonCore yet).
- HOSTALLOW...
- All of the
settings that begin with either HOSTALLOW or
HOSTDENY are settings for Condor's host-based security.
Please see section 3.7 on ``Setting up
IP/host-based security in Condor'' for details on all of these
settings and how to configure them.
- SHUTDOWN_GRACEFUL_TIMEOUT
- This entry determines how long
you are willing to let daemons try their graceful shutdown methods
before they do a hard shutdown. It is defined in terms of seconds.
The default is 1800 (30 minutes).
- SUBSYS_ADDRESS_FILE
- Every Condor daemon that uses
DaemonCore has a command port where commands can be sent. The
IP/port of the daemon is put in that daemon's ClassAd so that other
machines in the pool can query the condor_collector (which listens
on a well-known port) to find the address of a given daemon on a
given machine. However, tools and daemons executing on the same
machine they wish to communicate with don't have to query the
collector, they can simply look in a file on the local disk to find
the IP/port. Setting this entry will cause daemons to write the
IP/port of their command socket to the file you specify. This way,
local tools will continue to operate, even if the machine running
the condor_collector crashes. Using this file will also generate
slightly less network traffic in your pool (since condor_q,
condor_rm, etc won't have to send any messages over the network to
locate the condor_schedd, for example). This entry is not needed
for the collector or negotiator, since their command sockets are at
well-known ports anyway.
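For example (the file names here are only a convention, not a
requirement):
SCHEDD_ADDRESS_FILE = $(LOG)/.schedd_address
STARTD_ADDRESS_FILE = $(LOG)/.startd_address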
- SUBSYS_EXPRS
- This entry
allows you to have any DaemonCore daemon advertise arbitrary
expressions from the config file in its ClassAd. Give the
comma-separated list of entries from the config file you want in the
given daemon's ClassAd.
NOTE: The condor_negotiator and condor_kbdd do not send
ClassAds now, so this entry does not affect them at all. The
condor_startd, condor_schedd, condor_master, and
condor_collector do send ClassAds, so those would be valid
subsystems to set this entry for.
OTHER NOTE: Setting SUBMIT_EXPRS has the slightly
different effect of having the named expressions inserted into all
the job ClassAds that condor_submit creates. This is equivalent
to the ``+'' syntax in submit files. See the
condor_submit(1) man page for details.
OTHER NOTE: Because of the different syntax of the config
file and ClassAds, you might have to do a little extra work to get a
given entry into the ClassAd. In particular, ClassAds require quote
marks (") around your strings. Numeric values can go in directly,
as can expressions or boolean macros. For example, if you wanted
the startd to advertise a macro that was a string, a numeric macro,
and a boolean expression, you'd have to do something like the
following:
STRING_MACRO = This is a string macro
NUMBER_MACRO = 666
BOOL_MACRO = True
EXPR : CurrentTime >= $(NUMBER_MACRO) || $(BOOL_MACRO)
MY_STRING_MACRO = "$(STRING_MACRO)"
STARTD_EXPRS = MY_STRING_MACRO, NUMBER_MACRO, BOOL_MACRO, EXPR
3.4.5 Shared Filesystem Config File Entries
These entries control how Condor interacts with various shared and
network filesystems. If you are using AFS as your shared filesystem,
be sure to read section 3.9.1 on ``Using Condor with
AFS''.
- UID_DOMAIN
- Often times,
especially if all the machines in the pool are administered by the
same organization, all the machines to be added into a Condor pool
share the same login account information. Specifically, does user X
have UID Y on all machines within a given Internet/DNS domain? This
is usually the case if a central authority creates user logins and
maintains a common /etc/passwd file on all machines (perhaps via
NIS/Yellow Pages, distributing the passwd file, etc). If this is the
case, then set this macro to the name of the Internet/DNS domain
where this is true. For instance, if all the machines in this Condor
pool within the Internet/DNS zone ``cs.wisc.edu'' have a common
passwd file, UID_DOMAIN would be set to ``cs.wisc.edu''. If
this is not the case you can comment out the entry and Condor will
automatically use the fully qualified hostname of each machine. If
you put in a ``*'', it is treated as a wildcard matching all domains,
and therefore all UIDs are honored, which is a dangerous idea.
Condor uses this information to determine if it should run a given
Condor job on the remote execute machine with the UID of whomever
submitted the job or with the UID of user ``nobody''. If you set
this to ``none'' or don't set it at all, then Condor jobs will
always execute with the access permissions of user ``nobody''. For
security purposes, it is not a bad idea to have Condor jobs that
migrate around on machines across an entire organization to run as
user ``nobody'', which by convention has very restricted access to
the disk files of a machine. Standard Universe Condor jobs are
perfectly happy to run as user nobody since all I/O is redirected
back via remote system calls to a shadow process running on the
submit machine which is authenticated as the user. If you only plan
on running Standard Universe jobs, then it is a good idea to simply
set this to ``none'' or don't define it. Vanilla Universe jobs,
however, cannot take advantage of Condor's remote system calls.
Vanilla Universe jobs are dependent upon NFS, RFS, AFS, or some
shared filesystem setup to read/write files as they bounce around
from machine to machine. If you want to run Vanilla jobs and your
shared filesystems are via AFS, then you can safely leave this as
``none'' as well. But if you wish to use Vanilla jobs with Condor
and you have shared filesystems via NFS or RFS, then you should
enter in a legitimate domain name where all your UIDs match (you
should be doing this with NFS anyway!) on all machines in the pool,
or else users in your pool who submit Vanilla jobs will have to make
their files world read/write (so that user nobody can access them).
Some gritty details for folks who want to know: If the submitting
machine and the remote machine about to execute the job both have
the same login name in the passwd file for a given UID, and the
UID_DOMAIN claimed by the submit machine is indeed found to
be a subset of what an inverse lookup to a DNS (domain name server)
or NIS reports as the fully qualified domain name for the submit
machine's IP address (this security measure safeguards against the
submit machine simply lying), THEN the job will run
with the same UID as the user who submitted the job. Otherwise it
will run as user ``nobody''.
Note: the UID_DOMAIN parameter is also used when Condor
sends email back to the user about a completed job; the address
Job-Owner@UID_DOMAIN is used, unless UID_DOMAIN
is ``none'', in which case Job-Owner@submit-machine is
used.
- SOFT_UID_DOMAIN
- This
setting is used in conjunction with the UID_DOMAIN setting
described above. If the UID_DOMAIN settings match on both
the execute and submit machines, but the uid of the user who
submitted the job isn't in the passwd file (or password info if NIS
is being used) of the execute machine, the condor_starter will
normally exit with an error. If you set SOFT_UID_DOMAIN
to be ``True'', Condor will simply start the job with the specified
uid, even if it's not in the passwd file.
- FILESYSTEM_DOMAIN
- This
setting is similar in concept to UID_DOMAIN, but here we
need the Internet/DNS domain name where all the machines within that
domain can access the same set of NFS file servers.
Often times, especially if all the machines in the pool are
administered by the same organization, all the machines to be added
into a Condor pool can mount the same set of NFS fileservers onto
the same place in the directory tree. Specifically, do all the
machines in the pool within a specific Internet/DNS domain mount the
same set of NFS file servers onto the same path mount-points? If
this is the case, then set this macro to the name of the
Internet/DNS domain where this is true. For instance, if all the
machines in the Condor pool within the Internet/DNS zone
``cs.wisc.edu'' have a common passwd file and mount the same volumes
from the same NFS servers, set FILESYSTEM_DOMAIN to
``cs.wisc.edu''. If this is not the case you can comment out the
entry, and Condor will automatically set it to the fully qualified
hostname of the local machine.
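Using the cs.wisc.edu example from the text, a pool with a common
passwd file and common NFS mounts could set both domains together:
UID_DOMAIN = cs.wisc.edu
FILESYSTEM_DOMAIN = cs.wisc.edu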
- HAS_AFS
- Set this to ``True'' if
all the machines you plan on adding in your pool all can access a
common set of AFS fileservers. Otherwise, set it to ``False''.
- FS_PATHNAME
- If you're using
AFS, Condor needs to know where the AFS ``fs'' command is located so
that it can verify the AFS cell-names of machines in the pool. The
default value of /usr/afsws/bin/fs is also the default that
AFS uses.
- VOS_PATHNAME
- If you're using
AFS, Condor needs to know where the AFS ``vos'' command is located
so that it can compare fileserver names of volumes. The default
value of /usr/afsws/etc/vos is also the default that AFS
uses.
- RESERVE_AFS_CACHE
- If
your machine is running AFS and the AFS cache lives on the same
partition as the other Condor directories, and you want Condor to
reserve the space that your AFS cache is configured to use, set this
entry to ``True''. It defaults to ``False''.
- USE_NFS
- This setting influences
how Condor jobs running in the Standard Universe will access their
files. Condor will normally always redirect the file I/O requests
of Standard Universe jobs back to be executed on the machine which
submitted the job. Because of this, as a Condor job migrates around
the network, the filesystem always appears to be identical to the
filesystem where the job was submitted. However, consider the case
where a user's data files are sitting on an NFS server. The machine
running the user's program will send all I/O over the network to the
machine which submitted the job, which in turn sends all the I/O
over the network a second time back to the NFS file server. Thus,
all of the program's I/O is being sent over the network twice.
If you set this macro to ``True'', then Condor will attempt to
read/write files without redirecting them back to the submitting
machine if both the submitting machine and the machine running the
job are both accessing the same NFS servers (if they're both in the
same FILESYSTEM_DOMAIN, as described above). The result is
I/O performed by Condor Standard Universe programs is only sent over
the network once.
While sending all file operations over the network twice might sound
really bad, practical experience reveals that, unless you are
operating over networks where bandwidth is at a very high premium,
this scheme offers very little real performance gain. There are also
some (fairly rare) situations where this scheme can break down.
Setting USE_NFS to ``False'' is always safe. It may result
in slightly more network traffic, but Condor jobs are ideally heavy
on CPU and light on I/O anyway. It also ensures that a remote
Standard Universe Condor job will always use Condor's remote system
calls mechanism to reroute I/O and therefore see the exact same
filesystem that the user sees on the machine where she/he submitted
the job.
Some gritty details for folks who want to know: if you set
USE_NFS to ``True'', and the FILESYSTEM_DOMAIN of
both the submitting machine and the remote machine about to execute
the job match, and the FILESYSTEM_DOMAIN claimed by the
submit machine is indeed found to be a subset of what an inverse
DNS (domain name server) lookup reports as the fully qualified
domain name for the submit machine's IP address (this security
measure guards against the submit machine simply lying),
THEN the job will access files via a local system call,
without redirecting them to the submitting machine (a.k.a. with
NFS). Otherwise, the system call will get routed back to the
submitting machine via Condor's remote system call mechanism.
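For example, at a site like the one described above, where all the
machines share the same NFS servers in the ``cs.wisc.edu'' domain
(that domain name is just the example used earlier), the relevant
entries might look like:
FILESYSTEM_DOMAIN = cs.wisc.edu
USE_NFS = True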
- USE_AFS
- If your machines have AFS
and the submit and execute machines are in the same AFS cell, this
setting determines whether Condor will use remote system calls for
Standard Universe jobs to send I/O requests to the submit machine,
or if it should use local file access on the execute machine (which
will then use AFS to get to the submitter's files). Read the
setting above on USE_NFS for a discussion of why you might
want to use AFS access instead of remote system calls.
One important difference between USE_NFS and
USE_AFS is the AFS cache. With USE_AFS set to
``True'', the remote Condor job executing on some machine will start
messing with the AFS cache, possibly evicting the machine owner's
files from the cache to make room for its own. Generally speaking,
since we try to minimize the impact of having a Condor job run on a
given machine, we don't recommend using this setting.
Setting USE_AFS to ``False'' is always safe. It may result
in slightly more network traffic, but Condor jobs are ideally heavy
on CPU and light on I/O anyway. ``False'' also ensures that a remote
Standard Universe Condor job will always see the exact same
filesystem that the user sees on the machine where he/she
submitted the job. Plus, it ensures that the machine where the
job executes doesn't have its AFS cache disturbed as a result of
the Condor job being there.
However, things may be different at your site, which is why the
setting is there.
3.4.6 Checkpoint Server Config File Entries
These entries control whether or not Condor uses a checkpoint server.
In addition, if you are using a checkpoint server, this section
describes the settings that the checkpoint server itself needs to have
defined. If you decide to use a checkpoint server, you must install
it separately (it is not included in the main Condor binary
distribution or installation procedure). See
section 3.3.5 on ``Installing a Checkpoint Server''
for details on installing and running a checkpoint server for your
pool.
NOTE: If you are setting up a machine to join the UW-Madison CS
Department Condor pool, you should configure the machine to
use a checkpoint server, and use ``condor-ckpt.cs.wisc.edu'' as the
checkpoint server host (see below).
- USE_CKPT_SERVER
- A boolean
which determines if you want a given machine to use the
checkpoint server for your pool.
- CKPT_SERVER_HOST
- The
hostname of the checkpoint server for your pool.
- CKPT_SERVER_DIR
- The
checkpoint server needs this macro defined to the full path of the
directory the server should use to store checkpoint files.
Depending on the size of your pool and the size of the jobs your
users are submitting, this directory (and its subdirectories) might
need to store many megabytes of data.
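Putting these three entries together, a pool using the UW-Madison
checkpoint server mentioned in the note above might have entries like
the following (the directory pathname here is purely illustrative):
USE_CKPT_SERVER = True
CKPT_SERVER_HOST = condor-ckpt.cs.wisc.edu
CKPT_SERVER_DIR = /scratch/ckpt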
3.4.7 condor_master Config File Entries
These settings control the condor_master.
- DAEMON_LIST
- This macro
determines what daemons the condor_master will start and keep its
watchful eyes on. The list is a comma or space separated list of
subsystem names (described above in
section 3.4.1). For example,
DAEMON_LIST = MASTER, STARTD, SCHEDD
NOTE: On your central manager, your DAEMON_LIST
will be different from your regular pool, since it will include
entries for the condor_collector and condor_negotiator.
NOTE: On machines running Digital Unix or IRIX, your
DAEMON_LIST will also include ``KBDD'', for the
condor_kbdd, which is a special daemon that runs to monitor
keyboard and mouse activity on the console. It is only with this
special daemon that we can acquire this information on those
platforms.
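For example, on a central manager that also submits and runs jobs, the
list might look like the following (exactly which entries you include
depends on what roles the machine plays in your pool):
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD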
- SUBSYS
- Once you have defined which
subsystems you want the condor_master to start, you must provide
it with the full path to each of these binaries. For example:
MASTER = $(SBIN)/condor_master
STARTD = $(SBIN)/condor_startd
SCHEDD = $(SBIN)/condor_schedd
Generally speaking, these would be defined relative to the
SBIN macro.
- PREEN
- In addition to the daemons
defined in DAEMON_LIST, the condor_master also starts up
a special process, condor_preen to clean out junk files that have
been left lying around by Condor. This macro determines where the
condor_master finds the condor_preen binary. It also controls how
condor_preen behaves via the command-line arguments you pass to
it: ``-m'' means you want email about files condor_preen finds that it
thinks it should remove; ``-r'' means you want condor_preen to
actually remove these files. If you don't want condor_preen to run at
all, just comment out this setting.
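For example, to have condor_preen send email about the files it finds
and actually remove them, you might define it like this (assuming
condor_preen is installed under your SBIN directory):
PREEN = $(SBIN)/condor_preen -m -r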
- PREEN_INTERVAL
- This macro
determines how often condor_preen should be started. It is
defined in terms of seconds and defaults to 86400 (once a day).
- PUBLISH_OBITUARIES
- When a daemon crashes, the condor_master can send email to the
address specified by CONDOR_ADMIN with an obituary letting
the administrator know that the daemon died, what its cause of
death was (which signal or exit status it exited with), and
(optionally) the last few entries from that daemon's log file. If
you want these obituaries, set this entry to ``True''.
- OBITUARY_LOG_LENGTH
- If you're getting obituaries, this setting controls how many lines
of the log file you want to see.
- START_MASTER
- If this setting
is defined and set to ``False'' when the master starts up, the first
thing it will do is exit. This may seem strange, but perhaps you
just don't want Condor to run on certain machines in your pool, yet
the boot scripts for your entire pool are handled by a centralized
system that starts up the condor_master automatically. This is
certainly an entry you'd most likely find in a local config file,
not your global config file.
- START_DAEMONS
- This setting
is similar to the START_MASTER macro described above.
However, the condor_master doesn't exit, it just doesn't start any
of the daemons listed in the DAEMON_LIST. This way, you
could start up the daemons at some later time with a condor_on
command if you wished.
- MASTER_UPDATE_INTERVAL
- This entry determines how often
the condor_master sends a ClassAd update to the
condor_collector. It is defined in seconds and defaults to 300
(every 5 minutes).
- MASTER_CHECK_NEW_EXEC_INTERVAL
- This
setting controls how often the condor_master checks the timestamps
of the daemons it's running. If any daemons have been modified, the
master restarts them. It is defined in seconds and defaults to 300
(every 5 minutes).
- MASTER_NEW_BINARY_DELAY
- Once the condor_master has
discovered a new binary, this macro controls how long it waits
before attempting to execute the new binary. This delay is here
because the condor_master might notice a new binary while you're
in the process of copying in new binaries and the entire file might
not be there yet (in which case trying to execute it could yield
unpredictable results). The entry is defined in seconds and
defaults to 120 (2 minutes).
- SHUTDOWN_FAST_TIMEOUT
- This macro determines the maximum
amount of time you're willing to give the daemons to perform their
fast shutdown procedure before the condor_master just kills them
outright. It is defined in seconds and defaults to 120 (2 minutes).
- MASTER_BACKOFF_FACTOR
- If a daemon keeps crashing, we
use exponential backoff so we wait longer and longer before
restarting it. At the end of this section, there is an example that
shows how all these settings work. This setting is the base of the
exponent used to determine how long to wait before starting the
daemon again. It defaults to 2.
- MASTER_BACKOFF_CEILING
- This entry determines the maximum
amount of time you want the master to wait between attempts to start
a given daemon. (With 2.0 as the MASTER_BACKOFF_FACTOR,
you'd hit 1 hour in 12 restarts). This is defined in terms of
seconds and defaults to 3600 (1 hour).
- MASTER_RECOVER_FACTOR
- How long should a daemon run
without crashing before we consider it recovered. Once a
daemon has recovered, we reset the number of restarts so the
exponential backoff stuff goes back to normal. This is defined in
terms of seconds and defaults to 300 (5 minutes).
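Written out as config entries, the defaults described above are:
MASTER_BACKOFF_FACTOR = 2.0
MASTER_BACKOFF_CEILING = 3600
MASTER_RECOVER_FACTOR = 300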
Just for clarity, here's a little example of how all these exponential
backoff settings work. The example is worked out in terms of the
default settings.
When a daemon crashes, it is restarted in 10 seconds. If it keeps
crashing, we wait longer and longer before restarting it based on how
many times it's been restarted. We take the number of times the
daemon has restarted, take the MASTER_BACKOFF_FACTOR
(defaults to 2) to that power, and add 9. Sounds complicated, but
here's how it works:
1st crash: restarts == 0, so, 9 + 2^0 = 9 + 1 = 10 seconds
2nd crash: restarts == 1, so, 9 + 2^1 = 9 + 2 = 11 seconds
3rd crash: restarts == 2, so, 9 + 2^2 = 9 + 4 = 13 seconds
...
6th crash: restarts == 5, so, 9 + 2^5 = 9 + 32 = 41 seconds
...
9th crash: restarts == 8, so, 9 + 2^8 = 9 + 256 = 265 seconds
If the daemon kept dying and restarting, after the 13th crash, you'd
have:
13th crash: restarts == 12, so, 9 + 2^12 = 9 + 4096 = 4105 seconds
This is bigger than the MASTER_BACKOFF_CEILING, which
defaults to 3600, so the daemon would really be restarted after only
3600 seconds, not 4105. Assume a few hours went by like this, with
the condor_master trying again every hour (the computed delays would
keep growing, but would always be capped by the ceiling).
Eventually, imagine that daemon finally started and didn't crash (for
example, after the email you got about the daemon crashing, you
realized that you had accidentally deleted its binary so you
reinstalled it). If it stayed alive for
MASTER_RECOVER_FACTOR seconds (defaults to 5 minutes), the
count of how many restarts this daemon has performed would be reset.
So, if it died again 15 minutes later, it would be restarted in 10
seconds, not 1 hour.
The moral of the story is that this is some relatively complicated
stuff. The defaults we have work quite well, and you probably
won't want to change them for any reason.
- MASTER_EXPRS
- This setting is
described above in section 3.4.4 as
SUBSYS_EXPRS.
- MASTER_DEBUG
- This setting
(and other settings related to debug logging in the master) is
described above in section 3.4.3 as
SUBSYS_DEBUG.
- MASTER_ADDRESS_FILE
- This setting is described above in
section 3.4.4 as
SUBSYS_ADDRESS_FILE
3.4.8 condor_startd Config File Entries
These settings control general operation of the condor_startd.
Information on how to configure the condor_startd to start, suspend,
resume, vacate and kill remote Condor jobs can be found in a separate
top-level section, section 3.5 on
``Configuring The Startd Policy''. In there, you will find
information on the startd's states and activities. If
you see entries in the config file that are not described here, it is
because they control state or activity transitions within the
condor_startd and are described in
section 3.5.
- STARTER
- This macro holds the full
path to the regular condor_starter binary the startd should
spawn. It is normally defined relative to $(SBIN).
- ALTERNATE_STARTER_1
- This macro holds the full path to the special condor_starter.pvm
binary the startd spawns to service PVM jobs. It is normally
defined relative to $(SBIN), since by default,
condor_starter.pvm is installed in the regular Condor release
directory.
- POLLING_INTERVAL
- When a
startd is claimed, this setting determines how often we should poll
the state of the machine to see if we need to suspend, resume,
vacate or kill the job. Defined in terms of seconds and defaults to
5.
- UPDATE_INTERVAL
- This
entry determines how often the startd should send a ClassAd update
to the condor_collector. The startd also sends an update on any
state or activity change, or if the value of its START expression
changes. See section 3.5.5 on ``condor_startd
States'', section 3.5.6 on ``condor_startd
Activities'', and section 3.5.3 on ``condor_startd
START expression'' for details on states, activities, and the
START expression respectively. This entry is defined in
terms of seconds and defaults to 300 (5 minutes).
- STARTD_HAS_BAD_UTMP
- Normally, when the startd is computing the idle time of all the
users of the machine (both local and remote), it checks the
utmp file to find all the currently active ttys, and only
checks access time of the devices associated with active logins.
Unfortunately, on some systems, utmp is unreliable, and the
startd might miss keyboard activity by doing this. So, if your
utmp is unreliable, set this setting to ``True'' and the
startd will check the access time on all tty and pty devices.
- CONSOLE_DEVICES
- This
macro allows the startd to monitor console (keyboard and mouse)
activity by checking the access times on special files in
/dev. Activity on these files shows up as ``ConsoleIdle''
time in the startd's ClassAd. Just give a comma-separated list of
the names of devices you want considered the console, without the
``/dev/'' portion of the pathname. The defaults vary from
platform to platform, and are usually correct.
One possible exception to this is that on Linux, we use ``mouse'' as
one of the entries here. Normally, Linux installations put in a
soft link from /dev/mouse that points to the appropriate
device (for example, /dev/psaux for a PS/2 bus mouse, or
/dev/tty00 for a serial mouse connected to com1). However,
if your installation doesn't have this soft link, you will either
need to put it in (which you'll be glad you did), or change this
setting to point to the right device.
Unfortunately, there are no such devices on Digital Unix or IRIX
(don't be fooled by /dev/keyboard0, etc, the kernel does not
update the access times on these devices) so this entry is not
useful there, and we must use the condor_kbdd to get this
information by connecting to the X server.
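For example, on a Linux machine with the /dev/mouse link described
above, a reasonable setting might be:
CONSOLE_DEVICES = mouse, console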
- STARTD_JOB_EXPRS
- When
the startd is claimed by a remote user, it can also advertise
arbitrary attributes from the ClassAd of the job it is working on.
Just list the attribute names you want advertised. Note: since
these are already ClassAd expressions, you don't have to do anything
funny with strings, etc.
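For example, to advertise the image size and universe of the job the
startd is running (these particular attribute names are just
illustrations; list whichever job attributes you care about):
STARTD_JOB_EXPRS = ImageSize, JobUniverse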
- STARTD_EXPRS
- This setting is
described above in section 3.4.4 as
SUBSYS_EXPRS.
- STARTD_DEBUG
- This setting
(and other settings related to debug logging in the startd) is
described above in section 3.4.3 as
SUBSYS_DEBUG.
- STARTD_ADDRESS_FILE
- This setting is described above in
section 3.4.4 as
SUBSYS_ADDRESS_FILE
3.4.9 condor_schedd Config File Entries
These settings control the condor_schedd.
- SHADOW
- This macro determines the
full path of the condor_shadow binary that the condor_schedd
spawns. It is normally defined in terms of $(SBIN).
- SHADOW_PVM
- This macro
determines the full path of the special condor_shadow.pvm binary
used for supporting PVM jobs that the condor_schedd spawns. It is
normally defined in terms of $(SBIN).
- MAX_JOBS_RUNNING
- This
setting controls the maximum number of condor_shadow processes
you're willing to let a given condor_schedd spawn. The actual
number of condor_shadow processes might be less than this if you
have reached your RESERVED_SWAP limit.
- MAX_SHADOW_EXCEPTIONS
- This setting controls the maximum
number of times that a condor_shadow process can have a fatal
error (exception) before the condor_schedd will simply relinquish
the match associated with the dying shadow. Defaults to 5.
- SCHEDD_INTERVAL
- This
entry determines how often the condor_schedd should send a ClassAd
update to the condor_collector. It is defined in terms of seconds
and defaults to 300 (every 5 minutes).
- JOB_START_DELAY
- When the
condor_schedd has finished negotiating and has a lot of new
condor_startd's that it has claimed, the condor_schedd can wait
a certain delay before starting up a condor_shadow for each job
it's going to run. This prevents a sudden, large load on the submit
machine as it spawns many shadows simultaneously, and having to deal
with their startup activity all at once. This macro determines
how long the condor_schedd should wait in between spawning each
condor_shadow. Defined in terms of seconds and defaults to 2.
- ALIVE_INTERVAL
- This
setting determines how often the schedd should send a keep alive
message to any startd it has claimed. When the schedd claims a
startd, it tells the startd how often it's going to send these
messages. If the startd doesn't get one of these messages after 3
of these intervals have passed, the startd releases the claim, and
the schedd is no longer paying for the resource (in terms of
priority in the system). The macro is defined in terms of seconds
and defaults to 300 (every 5 minutes).
- SHADOW_SIZE_ESTIMATE
- This entry is the estimated virtual memory size of each
condor_shadow process. Specified in kilobytes. The default
varies from platform to platform.
- SHADOW_RENICE_INCREMENT
- When the schedd spawns a new
condor_shadow, it can do so with a nice-level. This is a
mechanism in UNIX where you can assign your own processes a lower
priority so that they don't interfere with interactive use of the
machine. This is very handy for keeping a submit machine that is
running lots of shadows still responsive to the owner of the machine.
The entry can be any integer between 1 and 19. It defaults to 10.
- QUEUE_CLEAN_INTERVAL
- The schedd maintains the job queue on a given machine. It does so
in a persistent way such that if the schedd crashes, it can recover
a valid state of the job queue. The mechanism it uses is a
transaction-based log file (this is the job_queue.log file,
not the SchedLog file). This file contains some initial
state of the job queue, and a series of transactions that were
performed on the queue (such as new jobs submitted, jobs completing,
checkpointing, whatever). Periodically, the schedd will go through
this log, truncate all the transactions and create a new file with
just the new initial state of the log. This is a somewhat expensive
operation, but it speeds up schedd restarts, since there are
fewer transactions to replay to figure out what state the job
queue is really in. This macro determines how often the schedd
should do this ``queue cleaning''. It is defined in terms of
seconds and defaults to 86400 (once a day).
- ALLOW_REMOTE_SUBMIT
- Starting with Condor Version 6.0, users can run condor_submit on
one machine and actually submit jobs to another machine in the
pool. This is called a remote submit. Jobs submitted in
this way are entered into the job queue owned by user ``nobody''.
This entry determines whether you want to allow such a thing to
happen to a given schedd. It defaults to ``False''.
- QUEUE_SUPER_USERS
- This
macro determines what usernames on a given machine have
super-user access to your job queue, meaning that they can
modify or delete the job ClassAds of other users. (Normally, you
can only modify or delete ClassAds that you own from the job queue).
Whatever username corresponds with the UID that Condor is running as
(usually ``condor'') will automatically get included in this list
because that is needed for Condor's proper functioning. See
section 3.10.2 on ``UIDs in Condor'' for more details on
this. By default, we give ``root'' the ability to remove other
user's jobs, in addition to user ``condor''.
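The default described above corresponds to:
QUEUE_SUPER_USERS = root, condor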
- SCHEDD_LOCK
- This entry
specifies what lock file should be used for access to the
SchedLog file. It must be a separate file from the
SchedLog, since the SchedLog may be rotated and you
want to be able to synchronize access across log file rotations.
This macro is defined relative to the LOCK macro described
above. If, for some strange reason, you decide to change this
setting, be sure to change the VALID_LOG_FILES entry that
condor_preen uses as well.
- SCHEDD_EXPRS
- This setting is
described above in section 3.4.4 as
SUBSYS_EXPRS.
- SCHEDD_DEBUG
- This setting
(and other settings related to debug logging in the schedd) is
described above in section 3.4.3 as
SUBSYS_DEBUG.
- SCHEDD_ADDRESS_FILE
- This setting is described above in
section 3.4.4 as
SUBSYS_ADDRESS_FILE
3.4.10 condor_shadow Config File Entries
These settings affect the condor_shadow.
- MAX_DISCARDED_RUN_TIME
- If the shadow is unable to read a
checkpoint file from the checkpoint server, it keeps trying only if
the job has accumulated more than this many seconds of CPU usage.
Otherwise, the job is started from scratch. Defaults to 3600 (1
hour). This setting is only used if USE_CKPT_SERVER is
True.
- SHADOW_LOCK
- This entry
specifies what lock file should be used for access to the
ShadowLog file. It must be a separate file from the
ShadowLog, since the ShadowLog may be rotated and you
want to be able to synchronize access across log file rotations.
This macro is defined relative to the LOCK macro described
above. If, for some strange reason, you decide to change this
setting, be sure to change the VALID_LOG_FILES entry that
condor_preen uses as well.
- SHADOW_DEBUG
- This setting
(and other settings related to debug logging in the shadow) is
described above in section 3.4.3 as
SUBSYS_DEBUG.
3.4.11 condor_shadow.pvm Config File Entries
These settings control the condor_shadow.pvm, the special shadow
that supports PVM jobs inside Condor. See the section on
``Installing PVM Support in Condor'' for details.
- PVMD
- This macro holds the full path
to the special condor_pvmd, the Condor PVM Daemon. This daemon is
installed in the regular Condor release directory by default, so the
macro is usually defined in terms of $(SBIN).
- PVMGS
- This macro holds the full
path to the special condor_pvmgs, the Condor PVM Group Server
Daemon, which is needed to support PVM groups. This daemon is
installed in the regular Condor release directory by default, so the
macro is usually defined in terms of $(SBIN).
- SHADOW_DEBUG
- This setting
(and other settings related to debug logging in the shadow) is
described above in section 3.4.3 as
SUBSYS_DEBUG.
3.4.12 condor_starter Config File Entries
These settings affect the condor_starter.
- JOB_RENICE_INCREMENT
- When the starter spawns a Condor job, it can do so with a
nice-level. This is a mechanism in UNIX where you can assign
your own processes a lower priority so that they don't interfere
with interactive use of the machine. If you have machines with lots
of real memory and swap space so the only scarce resource is CPU
time, you could use this setting in conjunction with a policy that
always allowed Condor to start jobs on your machines so that Condor
jobs would always run, but interactive response on your machines
would never suffer. You probably wouldn't even notice Condor was
running jobs. See section 3.5 on
``Configuring The Startd Policy'' for full details on setting up a
policy for starting and stopping jobs on a given machine. The entry
can be any integer between 1 and 19. It is commented out by
default.
- STARTER_LOCAL_LOGGING
- This macro determines whether the
starter should do local logging to its own log file, or send debug
information back to the condor_shadow where it will end up in the
ShadowLog. It defaults to ``True''.
- STARTER_DEBUG
- This setting
(and other settings related to debug logging in the starter) is
described above in section 3.4.3 as
SUBSYS_DEBUG.
3.4.13 condor_submit Config File Entries
If you want condor_submit to automatically append an expression to
the Requirements expression or Rank expression of jobs at your site,
use the following entries:
- APPEND_REQ_VANILLA
- Expression to append to vanilla job requirements.
- APPEND_REQ_STANDARD
- Expression to append to standard job requirements.
- APPEND_RANK_STANDARD
- Expression to append to standard job rank.
- APPEND_RANK_VANILLA
- Expression to append to vanilla job rank.
IMPORTANT NOTE: The APPEND_RANK_STANDARD and
APPEND_RANK_VANILLA macros were called
``APPEND_PREF_STANDARD'' and
``APPEND_PREF_VANILLA'' in previous versions of Condor.
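For example, to keep vanilla jobs off one particular machine, a site
might append a clause like the following (the hostname here is purely
made up for illustration):
APPEND_REQ_VANILLA = (Machine != "bigbird.example.edu")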
In addition, you can provide default Rank expressions if your users
don't specify their own:
- DEFAULT_RANK_VANILLA
- Default Rank for vanilla jobs.
- DEFAULT_RANK_STANDARD
- Default Rank for standard jobs.
Both of these macros default to the jobs preferring machines where
there is more main memory than the image size of the job, expressed
as:
((Memory*1024) > Imagesize)
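In config file form, those defaults are equivalent to:
DEFAULT_RANK_VANILLA = ((Memory*1024) > Imagesize)
DEFAULT_RANK_STANDARD = ((Memory*1024) > Imagesize)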
3.4.14 condor_preen Config File Entries
These settings control condor_preen.
- PREEN_ADMIN
- This entry
determines what email address condor_preen will send email to (if
it's configured to send email at all... see the entry for
PREEN above). Defaults to $(CONDOR_ADMIN).
- VALID_SPOOL_FILES
- This
entry contains a (comma or space separated) list of files that
condor_preen considers valid files to find in the SPOOL
directory. Defaults to all the files that are valid. If you change
the HISTORY setting above, you'll want to change this
setting as well.
- VALID_LOG_FILES
- This
entry contains a (comma or space separated) list of files that
condor_preen considers valid files to find in the LOG
directory. Defaults to all the files that are valid. If you change
the names of any of the log files above, you'll want to change this
setting as well. In addition the defaults for the
SUBSYS_ADDRESS_FILE are listed here, so if you change
those, you'll need to change this entry, too.
3.4.15 condor_collector Config File Entries
These settings control the condor_collector.
- CLASSAD_LIFETIME
- This
macro determines how long a ClassAd can remain in the collector
before it is discarded as stale information. The ClassAds sent to
the collector might also have an attribute that says how long the
lifetime should be for that specific ad. If that attribute is
present the collector will either use it or the
CLASSAD_LIFETIME, whichever is greater. The macro is
defined in terms of seconds, and defaults to 900 (15 minutes).
- MASTER_CHECK_INTERVAL
- This setting defines how often the
collector should check for machines that have ClassAds from some
daemons, but not from the condor_master (orphaned daemons),
and send email about it. Defined in seconds, defaults to 10800 (3
hours).
- CLIENT_TIMEOUT
- Network
timeout when talking to daemons that are sending an update. Defined
in seconds, defaults to 30.
- QUERY_TIMEOUT
- Network
timeout when talking to anyone doing a query. Defined in seconds,
defaults to 60.
- CONDOR_DEVELOPERS
- Condor will send email once per week to this address with the output
of the condor_status command, which simply lists how many machines
are in the pool and how many are running jobs. Use the default
value of ``condor-admin@cs.wisc.edu''. This default will send the
weekly status message to the Condor Team at University of
Wisconsin-Madison, the developers of Condor. The Condor Team uses
these weekly status messages in order to have some idea as to how
many Condor pools exist in the world. We would really appreciate
getting the reports as this is one way we can convince funding
agencies that Condor is being used in the ``real world''. If you do
not wish this information to be sent to the Condor Team, you could
enter ``NONE'' which disables this feature, or put in some other
address that you want the weekly status report sent to.
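The default described above is simply:
CONDOR_DEVELOPERS = condor-admin@cs.wisc.edu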
- COLLECTOR_NAME
- This parameter is used to specify a short description of your pool.
It should be about 20 characters long. For example, the name of the
UW-Madison Computer Science Condor Pool is ``UW-Madison CS''.
- CONDOR_DEVELOPERS_COLLECTOR
- By default, every pool sends
periodic updates to a central condor_collector at UW-Madison with
basic information about the status of your pool. This includes only
the number of total machines, the number of jobs submitted, the
number of machines running jobs, the hostname of your central
manager, and the COLLECTOR_NAME specified above. These
updates help us see how Condor is being used around the world. By
default, they will be sent to condor.cs.wisc.edu. If you don't want
these updates to be sent from your pool, set this entry to
``NONE''.
- COLLECTOR_DEBUG
- This setting
(and other settings related to debug logging in the collector) is
described above in section 3.4.3 as
SUBSYS_DEBUG.
3.4.16 condor_negotiator Config File Entries
These settings control the condor_negotiator.
- NEGOTIATOR_INTERVAL
- How often should the negotiator start a negotiation cycle? Defined
in seconds, defaults to 300 (5 minutes).
- NEGOTIATOR_TIMEOUT
- What timeout should the negotiator use on its network connections
to the schedds and startds? Defined in seconds, defaults to 30.
- PRIORITY_HALFLIFE
- This
entry defines the half-life of the user priorities. See
section 2.8.2
on User Priorities for more details. Defined in seconds, defaults
to 86400 (1 day).
- PREEMPTION_HOLD
- If the
PREEMPTION_HOLD expression evaluates to true, the
condor_negotiator won't preempt the job running on a given machine
even if a user with a higher priority has jobs they want to run.
This helps prevent thrashing. The default is to wait 2 hours
before preempting any job.
- NEGOTIATOR_DEBUG
- This setting
(and other settings related to debug logging in the negotiator) is
described above in section 3.4.3 as
SUBSYS_DEBUG.