Opteron Cluster

Server

 

Supported by a SCREMS grant from the National Science Foundation
NSF grant 0421846 PI: Renate Mittelmann

 

Overview

Opteron is a Linux cluster built with the Rocks Cluster Distribution, an award-winning open-source high-performance Linux cluster solution. The machine is managed by the computing support group of the School of Mathematical and Statistical Sciences, Arizona State University.

Technical Description

Opteron has four components:

  • System server (front-end)
  • Compute nodes
  • RAID disk
  • Switch

There are 17 compute nodes managed by one server, Opteron. Each machine has two CPUs, so the 18 machines together can provide up to 36 CPUs for computation at a time. All of these machines are interconnected through a switch.

System Server

The server machine is the gateway between the external network and the internal compute nodes. The server node has:

  • Dual AMD Opteron® model 250 processors
  • 2.40 GHz processor internal clock speed
  • 1MB internal L2 cache size
  • 8 x 1GB DDR-333 PC2700 RT-ECC memory
  • Tyan Thunder server motherboard
  • LSI Logic/Symbios Logic 53c1010 Ultra3 SCSI adapter
  • 3ware Inc 3ware 7000-series ATA-RAID
  • 2 x 250GB 7200 RPM SATA-150 hard disks (8MB cache each)


Compute Nodes

Each node has:

  • Dual AMD Opteron® model 250 processors
  • 2.40 GHz processor internal clock speed
  • 1MB internal L2 cache size
  • 8 x 1GB DDR-333 PC2700 RT-ECC memory
  • AMD 8113 server motherboard
  • 36.4GB 10000 RPM U320 SCSI hard disk with 8MB cache (one node has an ATA disk instead)
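
The per-machine figures above can be added up to check the cluster totals. The following is only a small sketch of that arithmetic: the dictionary layout and the names MACHINES and cluster_totals are made up for illustration, and the numbers are the ones listed on this page (one system server and 17 compute nodes, each with two CPUs and 8 x 1GB of memory).

```python
# Per-machine figures taken from the lists above; the structure is illustrative.
MACHINES = {
    "system server": {"count": 1, "cpus": 2, "memory_gb": 8},
    "compute node": {"count": 17, "cpus": 2, "memory_gb": 8},
}


def cluster_totals(machines):
    """Return (total CPUs, total memory in GB) across all machine types."""
    total_cpus = sum(m["count"] * m["cpus"] for m in machines.values())
    total_memory_gb = sum(m["count"] * m["memory_gb"] for m in machines.values())
    return total_cpus, total_memory_gb


if __name__ == "__main__":
    cpus, memory_gb = cluster_totals(MACHINES)
    print("CPUs:", cpus)
    print("Memory (GB):", memory_gb)
```

Counting the system server together with the compute nodes gives the 36 CPUs quoted in the technical description; the compute nodes alone account for 34 of them.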


RAID

A RAID system is used to store user files. It has five 250GB SATA-150 hard disks in an Aries 12-bay U320 SATA system. It is connected to the system server through an LSI Logic/Symbios Logic 53c1010 Ultra3 SCSI adapter.

Switch

A 24-port Cisco Catalyst 2970 switch connects the compute nodes and the system server.

Architecture

The architecture of the cluster platform (Rocks):

Graphical view of the CLUSTER architecture platform

Software

 

  • Useful commands
    • $ cluster-fork ps -U$USER
      • To execute "ps" on all the nodes and check the processes for USER
      • e.g.: cluster-fork ps -Usting
    • $ cluster-ps PATTERN
      • To execute "ps -aux | grep PATTERN" on all the nodes
      • e.g.: cluster-ps MATLAB
    • $ cluster-fork [cmd]
      • Normally, you can run any command on all the nodes through cluster-fork
      • e.g.: cluster-fork cp -r MatLabToolBox/ tmp/
    • $ cluster-fork --query="select name from nodes where name like 'scompute-0-%'" [cmd]
      • To execute [cmd] on specific nodes selected by an SQL query. The result of the query is a list of nodes
      • e.g.: cluster-fork --query="select name from nodes where name like 'scompute-0-1%'" ps -Usting
    • $ cluster-fork --nodes="scompute-0-%d:1-4 scompute-0-%d:7,9,11,15-16" [cmd]
      • To execute [cmd] on specific nodes listed explicitly with "--nodes". Each entry is a name template followed by ":" and a list of numbers or ranges
      • e.g.: cluster-fork --nodes="scompute-0-%d:1-4 scompute-0-%d:7,9,11,15-16" ps
  • Distributed Matlab
  • Torque Queue Information
  • Matlab using TORQUE Scheduler
  • Rocks Cluster
  • g95 Fortran 95 Compiler
  • GNU Compiler: gcc, g++, g77
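
The "--nodes" and "--query" forms of cluster-fork listed under Useful commands both boil down to a list of host names. The sketch below is not the Rocks implementation; it only illustrates which hosts the two example selections would pick. The helper names expand_nodes and query_nodes are hypothetical, the in-memory SQLite table stands in for the real cluster database, and the node naming scompute-0-0 through scompute-0-16 is an assumption based on the examples above.

```python
import sqlite3


def expand_nodes(spec):
    """Expand a --nodes style spec into the host names it describes."""
    hosts = []
    for group in spec.split():
        # Each group is "<printf-style template>:<comma-separated ranges>".
        template, _, ranges = group.partition(":")
        for part in ranges.split(","):
            low, _, high = part.partition("-")
            high = high or low  # a single number such as "7" is a range of one
            for index in range(int(low), int(high) + 1):
                hosts.append(template % index)
    return hosts


def query_nodes(all_hosts, like_pattern):
    """Return the hosts matched by: select name from nodes where name like ?"""
    db = sqlite3.connect(":memory:")  # throwaway stand-in for the cluster database
    db.execute("create table nodes (name text)")
    db.executemany("insert into nodes values (?)", [(h,) for h in all_hosts])
    rows = db.execute(
        "select name from nodes where name like ? order by rowid",
        (like_pattern,),
    ).fetchall()
    db.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Assumption: the 17 compute nodes are named scompute-0-0 ... scompute-0-16.
    all_hosts = ["scompute-0-%d" % i for i in range(17)]
    print(expand_nodes("scompute-0-%d:1-4 scompute-0-%d:7,9,11,15-16"))
    print(query_nodes(all_hosts, "scompute-0-1%"))
```

Under these assumptions, the pattern 'scompute-0-1%' selects scompute-0-1 as well as scompute-0-10 through scompute-0-16, because "%" matches any run of characters, including none. To target a single node, the "--nodes" form with an explicit number is the less surprising choice.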