Pictured above: SGI Altix 4700
Pictured above: CSIRO ASC's Sun/StorageTek SL8500 tape library
CSIRO ASC SGI Altix 4700 NUMA - cherax
This machine is a large shared-memory multiprocessor with 128 1.67 GHz Itanium (ia64)
processor cores and 512 Gbyte of memory as of August 2008. For
full specifications see the SGI web site, where it is also called the Infinite Storage Data
Lifecycle Management Server (DLM Server).
The Altix is tightly coupled with the CSIRO Data Store and provides significant processing
capacity for working with the Data Store holdings.
CSIRO ASC Data Store
The Altix machine has 38 terabytes of disk and hosts a hierarchical data store, with data on 6 terabytes
of high-performance disk being staged to or from cache disk and magnetic tape cartridges in automatic tape
libraries (Sun/StorageTek SL8500) as required. The tape libraries have capacities in excess of 5 petabytes.
The Data Store is used as a central data repository for users of the ASC systems. It offers live rather
than archive access and provides virtually 'infinite' storage capacity.
The data holdings reached 1 petabyte in October 2009.
Two or more copies of all files are kept, with copies of small files
being kept off-site.
CSIRO ASC Compute Cluster - burnet
CSIRO has an IBM
eServer Cluster 1350 system of about 123 nodes.
The cluster is available for general purpose use by
all CSIRO ASC registered users. Specific research groups that have co-invested with ASC have priority access
to portions of the cluster.
The cluster consists of nodes with mostly Intel Xeon processors in rack-mounted HS20
"blades".
There are four main hardware configurations of general nodes: 41 nodes have 2GB
of memory; 28 nodes have 4GB of memory; 28 nodes have 2GB of memory, Intel64 processors,
more (and faster) disk and InfiniBand interconnect; and 26 nodes have 8GB
of memory, more disk and faster Intel64 processors. Formal specifications are
available from the IBM 1350 cluster web site.
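As a quick consistency check, the four general-node configurations listed above can be tallied against the stated total of about 123 nodes. This is only an illustrative sketch; the dictionary keys are descriptive labels invented here, not official configuration names.

```python
# Node counts per configuration, taken from the description above.
# The labels are hypothetical names chosen for this sketch.
general_nodes = {
    "2GB": 41,
    "4GB": 28,
    "2GB, Intel64, InfiniBand": 28,
    "8GB, faster Intel64": 26,
}

total_nodes = sum(general_nodes.values())
print(total_nodes)
```

The tally agrees with the figure of about 123 nodes quoted for the cluster.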
CSIRO GPU Cluster - linuxgpu
The new CSIRO high-performance computing cluster will deliver more than 256 teraflops of computing performance and consists of the following components:
- 128 Dual Xeon E5462 Compute Nodes (i.e. a total of 1024 2.8GHz compute cores) with 16 GB / 32 GB of RAM, 500 GB SATA storage and DDR InfiniBand interconnect
- 64 Tesla S1070 (256 GPUs with a total of 61 440 streaming processor cores)
- 144 port DDR InfiniBand Switch
- 80 Terabyte Hitachi NAS file system.
The cluster is supplied by Xenon Systems of Melbourne and is located in Canberra, Australia.
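The core and GPU totals in the component list above follow from simple multiplication. The sketch below reproduces them, assuming two quad-core Xeon E5462 processors per compute node, four GPUs per Tesla S1070 unit, and 240 streaming processor cores per GPU. These per-unit figures are inferred from the totals quoted above rather than stated explicitly in the list.

```python
# Assumed per-unit figures (inferred from the totals in the list above).
XEONS_PER_NODE = 2        # "Dual Xeon" compute nodes
CORES_PER_XEON = 4        # assumption: quad-core E5462
GPUS_PER_S1070 = 4        # assumption: 256 GPUs / 64 units
SP_CORES_PER_GPU = 240    # assumption: 61 440 cores / 256 GPUs

compute_nodes = 128
s1070_units = 64

cpu_cores = compute_nodes * XEONS_PER_NODE * CORES_PER_XEON
gpus = s1070_units * GPUS_PER_S1070
sp_cores = gpus * SP_CORES_PER_GPU

print(cpu_cores, gpus, sp_cores)
```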
Condor Cycle Harvesting
Condor has been set up to take advantage of the large number of Windows desktop PCs across CSIRO. The majority of these machines are idle for many hours each day. To utilise this spare computing capacity, state-based central manager computers have been configured to manage a pool of nominated PCs within each state. Jobs submitted in one state will preferentially run in that pool but can migrate (or "flock") to other pools if necessary.
Jobs are generally only allowed to run overnight, i.e. between 6:00pm and 8:00am, although a small class of "shortjobs" can run at any time. Individual desktop owners always have priority use of their machines. If a job is running on a desktop and any owner activity is detected (keyboard, mouse, CPU) then that job is terminated and sent elsewhere to run.
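The run policy described above can be sketched as a small decision function. This is a minimal illustration in Python, not the actual Condor configuration used at CSIRO. The function and constant names are hypothetical, and the treatment of the exact boundary times (6:00pm included, 8:00am excluded) is an assumption.

```python
from datetime import time

# Overnight window from the text; boundary handling is an assumption.
NIGHT_START = time(18, 0)  # 6:00pm
NIGHT_END = time(8, 0)     # 8:00am


def in_overnight_window(t: time) -> bool:
    """True if t falls in the window that wraps around midnight."""
    return t >= NIGHT_START or t < NIGHT_END


def may_run(t: time, is_short_job: bool, owner_active: bool) -> bool:
    """Hypothetical policy check for one desktop at time of day t."""
    # The desktop owner always has priority: any activity blocks the job.
    if owner_active:
        return False
    # "shortjobs" may run at any time; other jobs only overnight.
    return is_short_job or in_overnight_window(t)
```

For example, `may_run(time(12, 0), False, False)` is false because an ordinary job falls outside the overnight window at midday, whereas the same call with `is_short_job=True` is true.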
More detailed information about Condor in CSIRO can be found in the ASC User Documentation, and about Condor generally at the University of Wisconsin Condor website, where Condor was developed.
Queensland Facility for Advanced Bioinformatics - octopus
The Queensland Facility for Advanced Bioinformatics (QFAB) provides a compute cluster
and a variety of bioinformatics tools and databases for all interested CSIRO scientists.
For more information about the facility please visit the QFAB/CSIRO Research Computing Cluster portal.