Bulletin 133 - 2005 Feb 17

  1. HPCCC SX-6 scheduling, new Bureau queues, and a new qsub wrapper
  2. HPCCC NQS II option processing
  3. Use of TX7 utilities - changes to path
  4. Flushing of file systems
  5. HPCCC Web Site
  6. CSIRO External Services Network (ESN)
  7. cherax updates
  8. cherax - better performance through processor-memory association
  9. CSIRO Portal/farrer service

1. HPCCC SX-6 scheduling, new Bureau queues, and a new qsub wrapper

On Wednesday 16th February, a new option was invoked in the Enhanced Resource Scheduler, which will make the scheduler less likely to over-commit the processors on an SX-6 node.

So that this feature can be used to best effect, we have enhanced the qsub wrapper script to support the parameter "-l cpunum_job=ncpus", where ncpus is the maximum number of processors to be used simultaneously in a job. Users should also continue to use the parameter "-l cpunum_prc=ncpus", which specifies the number of processors to be used simultaneously by each executable in a job.

The new qsub wrapper is temporarily located in /usr/local/bin/qsubnew on the SX-6/TX7 system, and can be accessed with sxqsubnew on the cross environments. On or soon after Wednesday 23rd February, the new version will replace the current qsub wrapper.
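
As an illustration only, the following sketch builds a submission command that passes both parameters to the new wrapper. The processor count (4) and the job file name (myjob.sh) are hypothetical, and we assume here that both -l parameters are given together on the command line. The command is printed rather than run, so the sketch can be tried on any system.

```shell
#!/bin/sh
# Sketch only: build the submission command and print it instead of running it.
ncpus=4                            # hypothetical: processors used simultaneously
jobfile=myjob.sh                   # hypothetical job script name
qsub_cmd=/usr/local/bin/qsubnew    # temporary location of the new wrapper

# cpunum_job limits the whole job; cpunum_prc limits each executable in it.
submit_line="$qsub_cmd -l cpunum_job=$ncpus -l cpunum_prc=$ncpus $jobfile"
echo "$submit_line"
```

On the cross environments, sxqsubnew would take the place of /usr/local/bin/qsubnew.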

All users of the new Bureau nodes and new Bureau queues (bm2, bmml2, bmmn2) are encouraged to use the new parameter on their jobs and use the new qsub wrapper immediately, so that the enhanced scheduling can be used.

All other users are encouraged to begin using the new feature once it becomes available through the default qsub wrapper.

This new NQS II parameter simulates a like-named new NEC feature available in the next release of SUPER-UX. When SUPER-UX is upgraded, specifying this parameter on all multi-cpu jobs will be required. Adding this parameter now will give your jobs better service from the system today, and will ease your move to the next SUPER-UX level.


2. HPCCC NQS II option processing

Note that for NQS II qsub, both with and without the above-mentioned wrapper script, when you specify one -l option=value parameter on the command line, any other -l option=value parameters embedded in the script are ignored, even if they are for different sub-options.

So, for example, if you use the command

 qsub -l memsz_job=100MB jobfile

to submit the job

#!/bin/csh
#PBS -l cputim_prc=70
#PBS -l cputim_job=80

then the time limits are not set.

(This does not apply to torque on cherax and the CSIRO cluster systems.)
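
One way to avoid losing the embedded limits in the example above is to repeat them on the command line whenever any -l parameter is given there. The sketch below shows the idea. The helper name embedded_l_options is hypothetical, and the final qsub command is printed rather than run.

```shell
#!/bin/sh
# Hypothetical helper: collect every "#PBS -l option=value" line embedded in a
# job script and print the values as command-line "-l" options on one line.
embedded_l_options() {
    sed -n 's/^#PBS[[:space:]]\{1,\}-l[[:space:]]\{1,\}\([^[:space:]]\{1,\}\).*$/-l \1/p' "$1" |
        paste -s -d ' ' -
}

# Recreate the job file from the example above in a temporary location.
jobfile=$(mktemp)
cat > "$jobfile" <<'EOF'
#!/bin/csh
#PBS -l cputim_prc=70
#PBS -l cputim_job=80
EOF

# Build (but do not run) a command that states all the limits explicitly.
limits=$(embedded_l_options "$jobfile")
submit_line="qsub -l memsz_job=100MB $limits jobfile"
echo "$submit_line"
rm -f "$jobfile"
```

The simpler alternative is to keep all -l parameters in the script and give none on the command line.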


3. Use of TX7 utilities - changes to path

The proposed changes to the TX7 path to include the tuned utilities (See HPCbull 132.8) will occur on or soon after Wednesday 23rd February.


4. Flushing of file systems

As the next step in automating flushing on the SX-6/TX7 $WORKDIR file systems, the flushing process will be started on the /cs/flush1 and /cs/flush2 areas on the SX-6/TX7 system on or soon after Wednesday 1st March.

Given the current levels of usage, we don't expect any files to be flushed in the early stages. Remember, however, that while files newer than 7 days are exempt from flushing, any older files in $WORKDIR may be flushed.

Please check that any important files in $WORKDIR have copies elsewhere (remember too that there is no backup of $WORKDIR or $DATADIR).
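
To see which files might be at risk, something like the following sketch may help. It assumes that age is judged by modification time, which this item does not state. The function name list_flush_candidates is hypothetical, and the demonstration uses a throwaway directory so that it runs anywhere.

```shell
#!/bin/sh
# List regular files under a directory whose modification time is old enough to
# match "-mtime +7". find counts age in whole 24-hour periods, so this matches
# files at least 8 full days old.
# Assumption: the flushing judges age by modification time.
list_flush_candidates() {
    find "$1" -type f -mtime +7 -print
}

# Demonstration on a temporary directory with one new and one back-dated file.
demo=$(mktemp -d)
touch "$demo/new_result.dat"
touch -t 200501010000 "$demo/old_result.dat"   # back-dated to 1 Jan 2005
candidates=$(list_flush_candidates "$demo")
echo "$candidates"
rm -rf "$demo"
```

On the SX-6/TX7 system the call would be list_flush_candidates "$WORKDIR".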

Reports on the flushing can be found in the file flush.status at the top level of each file system where automatic flushing has been invoked.


5. HPCCC Web Site

All users are asked to note that www.hpccc.gov.au continues to be expanded regularly. It now has a user application form, various policy statements, links to and information about the REQ system, SX-6 performance metrics, and other useful information. Please contact P.Tannenbaum@bom.gov.au with any suggestions for making the web site more valuable.


6. CSIRO External Services Network (ESN)

Further to HPCbull item 123.7, we wish to advise that from 18th February, users will have difficulty accessing the HPCCC and CSIRO HPSC systems through the CSIRO network from non-CSIRO locations unless they make special arrangements.

This includes CSIRO users who are travelling and logging in through ISPs, etc.

These changes are required to satisfy CSIRO ITS commitments to Management and to the CSIRO Board Audit Committee.

If you need continued access from outside, please contact HPSC staff to receive a CD and guide to enable continued logins using secured channels.


7. cherax updates

The patch to fix various problems on cherax was installed on Wednesday morning 2nd February.

Two crashes occurred on Tuesday 15th February: one at about 10:25, with service restored at about 11:40, and another at about 16:15, with service restored at about 16:25. The crashes are related to the use of the Comprehensive System Accounting (CSA) package, and its use has been disabled. Further crashes occurred at around 06:15 on Wednesday 16th, with a restart at 08:59, and at about 16:30 on Thursday 17th, with a restart at 16:37.

There will be a re-boot on Wednesday morning, 23rd February, to allow the installation of a new version of the batch system, torque. Unfortunately, all running and queued jobs will be lost at the upgrade. Please defer submitting any long jobs (without internal checkpoint capabilities) until after that.

The local userguide on the CSIRO Data Store at http://intra.hpsc.csiro.au/userguides/ds/ has been updated to improve the material on backups, flushing and quotas, and to provide more information on the new strategies we have implemented for improving the performance. In particular, the new caching strategy is providing a far higher read hit rate, and we have implemented changes so that:

  • all files 64 kbyte or smaller will always have a copy on the primary disc area
  • all files bigger than 64 kbyte and smaller than 2 Mbyte will always have a copy on the cache
  • all files recalled to the primary disc will also be copied into the cache area.

(Many old small migrated files are being recalled: this process is nearly complete.)

Recently, there were 1.9 million files 64 kbyte or smaller, and 0.8 million files between 64 kbyte and 2 Mbyte - this means around 80% of the 3.3 million files in the Data Store are always on disc, and will never need to be recalled from tape.
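
The "around 80%" figure can be checked from the rounded counts quoted above. A quick sketch, which gives about 82%:

```shell
#!/bin/sh
# Share of Data Store files that always have a copy on disc, using the rounded
# counts quoted above (in millions of files).
small=1.9      # files of 64 kbyte or smaller
medium=0.8     # files between 64 kbyte and 2 Mbyte
total=3.3      # all files in the Data Store
on_disc_pct=$(awk -v s="$small" -v m="$medium" -v t="$total" \
    'BEGIN { printf "%.0f", (s + m) / t * 100 }')
echo "${on_disc_pct}% of files are always on disc"
```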

The Data Store guide has material about getting the most from the migrating file system - how to use dmget commands effectively.

The system remains under heavy load, with an average of about 0.3 Tbyte ingested each day. On one day in January, 0.5 Tbyte was stored and 1.6 Tbyte recalled. Recently, nearly 1 Tbyte from a portable disc system was copied to the system in about a day. To give you an idea of the churn rate, about 5.5 Tbyte of the 69 Tbyte in the store has been deleted by users, and is awaiting the final hard delete (carried out after 35 days).

Over the last weekend, one user recalled about 130,000 files, all of which were on the disc cache. About 150 retrievals per minute were achieved.

Users can see a graph of the total usage at
http://intra.hpsc.csiro.au/user/usage/ds/DMF.pri.png

and a report (updated monthly) of their holdings in the Data Store at http://intra.hpsc.csiro.au/user/usage/ds/
or by following the "System Statistics" link at http://www.hpccc.gov.au/

These latter reports are currently the only reports on total holdings available to users.


8. cherax - better performance through processor-memory association

On cherax, individual processes will in general perform better if they stay on one processor (so the processor cache is used well) and use memory that is located nearby. SGI provides the cpusets facility, including the dplace command, to bind processes to free cpus.

Gareth has done some investigation and found that dplace should be called in the following ways to avoid binding the non-cpu-intensive threads of parallel processes to cpus.

For a single-CPU program:

 dplace my_serial_exe

For an OpenMP program:

 dplace -x2 my_intel_openmp_exe

For an MPI program:

 mpirun -np $n dplace -x1 my_sgi_mpi_exe

where $n is the number of processes requested.

There is significant potential for performance improvement for each of these types of serial and parallel processes, especially for cache-unfriendly processes. Several user applications have already demonstrated significant performance improvement using this facility.

In the future we will incorporate the cpusets facility into the batch system so that each batch job will have exclusive access to a collection of cpus. Until then, all users would benefit if everyone explicitly used dplace in their batch jobs for cpu-intensive commands and processes. Even when the cpusets facility is incorporated into the batch system, using dplace with the mask option for parallel executables may still be useful, to help a parallel job place its threads well within the assigned cpuset.
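
The three invocations above could be gathered into a small helper for use in batch scripts. The following is a sketch only. The function name dplace_command is hypothetical, and it prints the command rather than running it, so it can be tried on a system without dplace.

```shell
#!/bin/sh
# Hypothetical helper that prints the dplace invocation recommended above for
# each kind of program. It only prints the command; nothing is executed.
dplace_command() {
    kind=$1
    exe=$2
    nprocs=${3:-1}          # only used for MPI programs
    case "$kind" in
        serial) echo "dplace $exe" ;;
        openmp) echo "dplace -x2 $exe" ;;
        mpi)    echo "mpirun -np $nprocs dplace -x1 $exe" ;;
        *)      echo "unknown program type: $kind" >&2; return 1 ;;
    esac
}

dplace_command serial my_serial_exe
dplace_command openmp my_intel_openmp_exe
dplace_command mpi my_sgi_mpi_exe 8     # 8 processes is an arbitrary example
```

In a real batch job the printed command would be run directly in place of the plain executable name.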

Gareth will notify current intensive cherax users individually.

It is possible that using dplace will have a bad effect on some jobs with the current cherax configuration. Please look for significant changes in performance when you start using dplace and send feedback to hpchelp.


9. CSIRO Portal/farrer service

The machine underlying the Portal/farrer service had a disc failure recently. We restarted the service on another system, but with a different Linux distribution, so there are differences, and we did not attempt to re-install all the software packages. Please let us know if you encounter problems, or need software installed.



BoM Solar Help:

CSIRO ASC Help:

For urgent help at all times:
  • CSIRO users 0428 108 333
  • Bureau out of hours emergencies are managed through internal policy
HPCCC WWW Site: http://www.hpccc.gov.au/
CSIRO External ASC Site: http://www.hpsc.csiro.au/
CSIRO ASC Users' Site: http://intra.hpsc.csiro.au/


© Copyright 2010, CSIRO Australia