Research Facilities: Linux Cluster

Linux Cluster Performance


Hardware Configuration

The facility is a high-performance Linux cluster built with Red Hat Linux and Open Source Cluster Application Resources (OSCAR).


As of 06/17/2003, the kernel in use is 2.4.21.

Head Node:
CPU: Intel Pentium 4 - 1.70GHz
L1 Cache: 8KB data + 12K micro-op Execution Trace Cache
L2 Cache: 256KB
Memory: 1024MB RDRAM @ 400MHz (100MHz * 4)
Front Side Bus: 400 MHz
Storage: 1358735 MB, RAID 5 (120GB * 12) on a 3ware Escalade 7500 Parallel ATA RAID controller
Network: Gigabit Ethernet (copper)

Compute Nodes (1-26):
CPU: Intel Pentium 4 - 1.70GHz
L1 Cache: 8KB data + 12K micro-op Execution Trace Cache
L2 Cache: 256KB
Memory: 1024MB RDRAM @ 400MHz (100MHz * 4)
Storage: 80GB ATA100
Network: Gigabit Ethernet (copper)

Compute Nodes (27-42):
CPU: Intel Pentium 4 - 1.70GHz
L1 Cache: 8KB data + 12K micro-op Execution Trace Cache
L2 Cache: 256KB
Memory: 2048MB RDRAM @ 400MHz (100MHz * 4)
Storage: 40GB ATA100
Network: Gigabit Ethernet (copper)

The head node and the 42 compute nodes are currently linked by a full-duplex Gigabit Ethernet interconnect.
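
As a quick cross-check of the listing above, the node counts and installed memory can be totalled in a few lines. This is only a sketch: the `NODE_GROUPS` table and the helper names are illustrative, with the values copied from the configuration listed above.

```python
# Node groups as listed above: (label, node count, RAM per node in MB).
# The table layout and the helper names are illustrative only.
NODE_GROUPS = [
    ("head", 1, 1024),
    ("compute 1-26", 26, 1024),
    ("compute 27-42", 16, 2048),
]


def total_nodes(groups):
    """Number of nodes across the given groups."""
    return sum(count for _, count, _ in groups)


def total_memory_mb(groups):
    """Installed RAM in MB across the given groups."""
    return sum(count * mem_mb for _, count, mem_mb in groups)


compute_groups = NODE_GROUPS[1:]  # everything except the head node

print("compute nodes:", total_nodes(compute_groups))
print("compute RAM (MB):", total_memory_mb(compute_groups))
print("total RAM incl. head (MB):", total_memory_mb(NODE_GROUPS))
```

With the figures listed, the compute nodes hold 59392 MB (58 GB) of RDRAM in total, or 60416 MB (59 GB) when the head node is included.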


Performance Characteristics of P4

The Intel Pentium 4 1.7GHz processor used in the cluster is a 7th generation x86 processor with additional features to improve performance. These features form the basis of the Intel NetBurst microarchitecture. The following features make the Pentium 4 well suited to computational work:
1. Hyper-Pipelined Technology - The pipeline depth is doubled from the 10 stages of the P6 microarchitecture to 20 stages. The deeper pipeline allows the design to scale easily to high clock frequencies.
2. 400-MHz System Bus - A quad-pumped 400MHz system bus (100MHz * 4) allows up to 3.2GB/sec of bandwidth between the processor and memory. This is a large improvement over the 1.06GB/sec of the P6 133MHz system bus.
3. Level 1 Execution Trace Cache and Advanced Dynamic Execution - These deliver a high volume of instructions to the processor's execution units and reduce the recovery time needed after mispredicted branches. Branch mispredictions can be reduced by up to 33% compared with the P6 architecture.
4. Rapid Execution Engine - The Arithmetic Logic Units (ALUs) operate at twice the core frequency, which allows basic integer instructions to execute in one-half of a clock cycle. On a 1.7GHz processor, the ALUs therefore run at an effective 3.4GHz.
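
The bandwidth and ALU figures quoted in items 2 and 4 follow from simple arithmetic, sketched below. The 8-byte (64-bit) data bus width is an assumption not stated in the text above, and the function and variable names are illustrative.

```python
# Assumption: both system buses carry 8 bytes (64 bits) per transfer.
# This width is not stated in the text above.
BUS_WIDTH_BYTES = 8


def peak_bandwidth_gb_per_s(transfers_mhz, width_bytes=BUS_WIDTH_BYTES):
    """Peak bus bandwidth in GB/s (10^9 bytes/s) at the given transfer rate."""
    return transfers_mhz * 1e6 * width_bytes / 1e9


# Pentium 4: 100MHz base clock, quad pumped -> 400 million transfers/s.
p4_bandwidth = peak_bandwidth_gb_per_s(100 * 4)

# P6: 133MHz system bus, one transfer per clock.
p6_bandwidth = peak_bandwidth_gb_per_s(133)

# Rapid Execution Engine: ALUs run at twice the 1.7GHz core clock.
alu_ghz = 2 * 1.7

print(f"Pentium 4 bus: {p4_bandwidth:.2f} GB/s")
print(f"P6 bus:        {p6_bandwidth:.2f} GB/s")
print(f"ALU clock:     {alu_ghz:.1f} GHz")
```

These are peak figures; sustained memory bandwidth in real applications will be lower.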

Additional information on the Pentium 4 can be found at: http://developer.intel.com/design/Pentium4/prodbref/

Despite a lower Instructions Per Clock (IPC) rate than the P6 core (a consequence of the deeper pipeline), the Pentium 4 1.7GHz is capable of up to 2 GFLOPS because of its high clock speed. This is shown by the SiSoft Sandra 2001 CPU synthetic benchmark, which combines the Dhrystone benchmark for ALU performance (in MIPS) and the Whetstone benchmark for FPU performance (in MFLOPS).

Source: http://www.activewin.com/reviews/hardware/processors/intel/p417ghz/bench.shtml
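
To put the 2 GFLOPS figure in context, the sketch below converts it to floating-point operations per clock cycle and scales it to the 42 compute nodes. It assumes one processor per node, as the hardware listing suggests, and perfect scaling with no communication overhead, so the aggregate is an optimistic upper bound rather than a measured result. All names are illustrative.

```python
CLOCK_GHZ = 1.7             # core clock of each Pentium 4
PEAK_GFLOPS_PER_CPU = 2.0   # the "up to" figure quoted above
COMPUTE_NODES = 42          # assumption: one processor per compute node


def flops_per_cycle(gflops, clock_ghz):
    """Average floating-point operations completed per clock cycle."""
    return gflops / clock_ghz


def aggregate_gflops(per_cpu_gflops, cpus):
    """Upper bound on combined throughput, assuming perfect scaling."""
    return per_cpu_gflops * cpus


per_cycle = flops_per_cycle(PEAK_GFLOPS_PER_CPU, CLOCK_GHZ)
cluster_bound = aggregate_gflops(PEAK_GFLOPS_PER_CPU, COMPUTE_NODES)

print(f"FLOPs per cycle per CPU: {per_cycle:.2f}")
print(f"Upper bound for {COMPUTE_NODES} nodes: {cluster_bound:.0f} GFLOPS")
```

Real parallel applications will fall short of this bound because of memory and network limits.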
HPCFD Group, Virginia Tech, 2002 Department Of Mechanical Engineering at VT