 Our IBM x3550 and iDataplex dx360 servers are equipped with 2
 quad-core Penryn or Nehalem CPUs. This means that we have
 8 functional 'cpu-units' (cores) per node.

 This will necessitate some changes to the job scripts compared
 to the 'old' days with 1 node = 1 cpu = 1 core.

 We have created a web page, updated every 10 minutes, that
 shows the idle core time per node per (PBS) job. Please consult
 this page to gauge how well your job is using its allocated
 resources:

  Horseshoe5 jobs
  Horseshoe6 jobs

 Since every job is different, we can only make some general
 recommendations on how to optimize the use of the nodes:

 1) If running NAMD, please use +p#N#,

    where #N# is 8 times the number of nodes requested
    (for example, +p16 for 2 nodes).

 2) If running MPI jobs - just use '-np #N#', where
    #N# is 8 times the number of nodes requested.

 3) If using other types of inter-node communication
    software, please find a way to over-subscribe the
    nodes by a factor of 8, i.e. to start 8 processes per node.

 4) If you have scalar jobs, which is the most difficult
    challenge, a simple method is to "pack" multiple
    individual jobs into one PBS job script:

      job_script1 &
      job_script2 &
      job_script3 &
      job_script4 &
      job_script5 &
      job_script6 &
      job_script7 &
      job_script8 &
      wait

    This will start 8 processes (or scripts) in the background,
    and the 'wait' command will not return until the last of
    the 8 processes has finished.

    This is not optimal unless the 8 processes have about the same
    runtime, because the cores of processes that finish early sit
    idle until the last one ends - but consider it a simple first step.
    *BEWARE* of usage in /scratch - the 8 processes should
    probably use 8 different sub-directories in /scratch.
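
 The packing method in 4), combined with one scratch sub-directory per
 process, could look roughly like the sketch below. It is only an
 illustration under stated assumptions: the names NTASKS, JOBTAG, JOBDIR
 and run_task, as well as the directory layout, are hypothetical and not
 part of our setup. The fallback to a temporary directory is there only
 so that the script can be tried outside the cluster.

```shell
#!/bin/sh
# Sketch of a packed PBS job: 8 scalar tasks, each in its own scratch
# sub-directory. All names below are hypothetical; adapt them to your job.

NTASKS=8                              # one task per core on an 8-core node
JOBTAG="${PBS_JOBID:-local.$$}"       # PBS sets PBS_JOBID inside a job

# Use /scratch when it is writable, otherwise fall back to a temp directory
# (assumption made only so the sketch also runs on a workstation).
if [ -d /scratch ] && [ -w /scratch ]; then
    BASE=/scratch
else
    BASE="${TMPDIR:-/tmp}"
fi
JOBDIR="$BASE/packed.$JOBTAG"

run_task() {
    # Placeholder for the real work; $1 is the task number.
    # Replace this line with a call to your own program or script.
    echo "task $1 finished" > result.txt
}

i=1
while [ "$i" -le "$NTASKS" ]; do
    dir="$JOBDIR/task$i"
    mkdir -p "$dir"
    # The subshell keeps the 'cd' local to this task; '&' runs it in the
    # background so that all tasks execute at the same time.
    ( cd "$dir" && run_task "$i" ) &
    i=$((i + 1))
done

# Block until every background task has exited. Without this the job
# script would exit right away while the tasks are still running.
wait
```

 Each task writes only inside its own directory (task1 ... task8), so
 output files with identical names do not overwrite each other.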