Bulletin 171 - 2007 August 13

  1. Upgrade schedule for Super-UX 17.1
  2. Multi-node SX jobs - requesting CPU time
  3. NQSII and ERSII upgrades
  4. CSIRO workshop on Advanced Scientific Computing: 14-15 Aug
  5. New disc on cherax
  6. Staff changes
  7. Parallel programming courses at VPAC

Note: "CSIRO" items can apply to BoM users of cherax and burnet


1. Upgrade schedule for Super-UX 17.1

The SX-6 nodes will be upgraded to Super-UX 17.1 in six phases, on the dates shown below.

A System Change Notice (SCN) will be issued this week with more details, including the routing queues for submission to the upgraded nodes. A Release 17.1 upgrade web page will be accessible from the "What's New" section of the HPCCC home page (www.hpccc.gov.au).

Schedule for phased upgrade of SX6 nodes to Release 17.1

Tue 21 August SX600, SX601
Thu 23 August SX605, SX606

Tue 28 August SX622 through SX627
Wed 29 August SX602 through SX604

Tue 4 September SX607 through SX613
(One week for Operational testing)

Tue 11 September SX614 through SX621


2. Multi-node SX jobs - requesting CPU time

To request CPU time for an SX-6 job you can use the NQSII option -l cputim_job=<time_limit>. Please be aware that for multi-node jobs this limit applies to each node, where the number of nodes is specified by the NQSII option -b <nodes>. Effectively, this is a per-node CPU time limit. There is no option for specifying an overall CPU time limit for multi-node jobs.
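
Because the limit is per node, an overall CPU-time estimate has to be divided by the node count before it is passed to -l cputim_job. The following Python sketch illustrates the arithmetic only. It assumes the work is spread evenly across the nodes and that times are expressed in seconds; the function names are hypothetical and are not part of NQSII.

```python
def per_node_cpu_limit(total_cpu_seconds: int, nodes: int) -> int:
    """Per-node CPU limit (seconds) for a job with a known overall CPU estimate.

    Assumes the work is evenly balanced across nodes (an assumption of this
    sketch). Rounds up so that the per-node limits together cover the estimate.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if total_cpu_seconds < 0:
        raise ValueError("total_cpu_seconds must be non-negative")
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_cpu_seconds // nodes)


def max_total_cpu(per_node_limit_seconds: int, nodes: int) -> int:
    """Upper bound on CPU time a multi-node job can use under a per-node limit."""
    return per_node_limit_seconds * nodes


if __name__ == "__main__":
    # Example: an estimated 10 CPU-hours in total, run on 4 nodes.
    limit = per_node_cpu_limit(10 * 3600, 4)
    print("per-node limit (s):", limit)
    print("worst-case total (s):", max_total_cpu(limit, 4))
```

If the load is not evenly balanced, the busiest node determines the limit that is needed, so the per-node value should be based on that node's expected CPU time rather than on the average.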


3. NQSII and ERSII upgrades

NQSII and ERSII on the SX-6/TX7 and front-ends were upgraded on the morning of Tuesday 7th August.

The upgrade brings fixes for several problems, plus upgraded features. See the change notice at http://www.hpccc.gov.au/hpccc/user_news_advice/

In particular, there is now an erstatj option (-N) to give node information on the same line as the other information about a job.

With the upgrade to NQSII, memory limits are now being enforced on TX7 jobs.

(Often there is very little diagnostic output when a job hits a memory limit, because there is no memory left to report that there is no memory left!)


4. CSIRO workshop on Advanced Scientific Computing: 14-15 Aug

CSIRO is holding a planning workshop on Advanced Scientific Computing. More information about this can be found at http://intranet.csiro.au/intranet/it/communication/BriefUpdate/UpdateJuly07.pdf


5. New disc on cherax

The new disc was brought into operation over the weekend of 4th-5th August.

Unfortunately, the downtime extended longer than we had planned, because the restore process of the /cs/datastore file system to the new disc slowed down as it progressed, and took 28.5 hours. The main bottleneck was in the loading of 10 million inodes.

The main changes for users are:

  • /cs/datastore expanded from 2.6 to 6.6 Tbyte

  • the data store cache expanded to 12.5 Tbyte

  • the $WORKDIR area expanded from 400 Gbyte to 2.7 Tbyte

  • /cs/datastore moved to a new disc array, with peak speed increased to 800 Mbyte/s

The expansion of /cs/datastore will allow more data to be stored online, and reduce contention for tape drives -- resulting in fewer recalls, and faster recalls.

The expansion of the cache increases the amount of data that can be recalled almost instantly, rather than requiring a tape to be mounted. So this will also lead to fewer recalls from tape, and faster recalls, but in a different way.

We plan to make a $DATADIR area available shortly. This area will be subject to quotas, but will have no backup or flushing.


6. Staff changes

Phil Tannenbaum, the HPCCC Manager, is acting as Assistant Director Computing for the Bureau. Richard Oxbrow is acting as HPCCC Manager.

Jeroen van den Muyzenberg has indicated that he will be leaving CSIRO on 17th August.

Jeroen has been with CSIRO HPSC and the HPCCC since 1999, and has been the chief guardian of the CSIRO Data Store at the HPCCC.

We thank Jeroen for the enormous amount of work that he has put in, particularly during evenings, over weekends, and sometimes in the dead of night to maintain the system, and keep making improvements.

The Data Store primary copies have grown from about 4 Tbyte to 375 Tbyte during this period.

We wish him well for the future.


7. Parallel programming courses at VPAC

VPAC will be running introductory courses on the systems it provides, followed by introductory courses on parallel computing, on the dates listed below. Interested people should contact HPCCC and/or VPAC.

For more information, see http://www.vpac.org

  • Intro to VPAC Course Wed 29th Aug 2007
  • HPC & Parallel Programming Thu 30th Aug 2007
  • Intermediate MPI Course Fri 31st Aug 2007
  • Intro to VPAC Course Wed 5th Dec 2007
  • HPC & Parallel Programming Thu 6th Dec 2007




For urgent help at all times:
  • CSIRO users 0428 108 333
  • Bureau out of hours emergencies are managed through internal policy
HPCCC WWW Site: http://www.hpccc.gov.au/
CSIRO External ASC Site: http://www.hpsc.csiro.au/
CSIRO ASC Users' Site: http://intra.hpsc.csiro.au/
