Comparison of cluster software


The following tables compare general and technical information for notable computer cluster software. This software can be broadly divided into four categories: job schedulers, node management, node installation, and integrated stacks (all of the above).

General information

| Software | Maintainer | Category | Development status | Architecture | High-Performance/High-Throughput Computing | License | Platforms supported | Cost | Paid support available |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Accelerator | Altair | Job Scheduler | actively developed | Master/worker distributed | HPC/HTC | Proprietary | Linux, Windows | Cost | Yes |
| Amoeba | | | No active development | | | MIT | | | |
| Base One Foundation Component Library | | | | | | Proprietary | | | |
| DIET | INRIA, SysFera, Open Source | All in one | | GridRPC, SPMD, Hierarchical and distributed architecture, CORBA | HTC/HPC | CeCILL | Unix-like, Mac OS X, AIX | Free | |
| Enduro/X | Mavimax, Ltd. | Job/Data Scheduler | actively developed | SOA Grid | HTC/HPC/HA | GPLv2 or Commercial | Linux, FreeBSD, MacOS, Solaris, AIX | Free / Cost | Yes |
| Ganglia | | Monitoring | actively developed | | | BSD | Unix, Linux, Windows NT/XP/2000/2003/2008, FreeBSD, NetBSD, OpenBSD, DragonflyBSD, Mac OS X, Solaris, AIX, IRIX, Tru64, HP-UX | Free | |
| Globus Toolkit | Globus Alliance, Argonne National Laboratory | Job/Data Scheduler | actively developed | SOA Grid | | | Linux | Free | |
| Grid MP | Univa (formerly United Devices) | Job Scheduler | no active development | Distributed master/worker | HTC/HPC | Proprietary | Windows, Linux, Mac OS X, Solaris | Cost | |
| Apache Mesos | Apache | | actively developed | | | Apache license v2.0 | Linux | Free | Yes |
| Moab Cluster Suite | Adaptive Computing | Job Scheduler | actively developed | | HPC | Proprietary | Linux, Mac OS X, Windows, AIX, OSF/Tru-64, Solaris, HP-UX, IRIX, FreeBSD & other UNIX platforms | Cost | Yes |
| NetworkComputer | Runtime Design Automation | | actively developed | | HTC/HPC | Proprietary | Unix-like, Windows | Cost | |
| OpenHPC | OpenHPC project | All in one | actively developed | | HPC | | Linux (CentOS) | Free | No |
| OpenLava | Teraproc | Job Scheduler | actively developed | Master/Worker, multiple admin/submit nodes | HTC/HPC | GPL | Linux | Free | Yes |
| PBS Pro | Altair | Job Scheduler | actively developed | Master/worker distributed with fail-over | HPC/HTC | AGPL or Proprietary | Linux, Windows | Free or Cost | Yes |
| Proxmox Virtual Environment | Proxmox Server Solutions | Complete | actively developed | | | Open-source AGPLv3 | Linux, Windows, other operating systems are known to work and are community supported | Free | Yes |
| Rocks Cluster Distribution | Open Source/NSF grant | All in one | actively developed | | HTC/HPC | OpenSource | CentOS | Free | |
| Popular Power | | | | | | | | | |
| ProActive | INRIA, Open Source | All in one | actively developed | Master/Worker, SPMD, Distributed Component Model, Skeletons | HTC/HPC | GPL | Unix-like, Windows, Mac OS X | Free | |
| RPyC | Tomer Filiba | | actively developed | | | MIT License | *nix/Windows | Free | |
| SLURM | SchedMD | Job Scheduler | actively developed | | HPC/HTC | GPL | Linux/*nix | Free | Yes |
| Spectrum LSF | IBM | Job Scheduler | actively developed | Master node with failover/exec clients, multiple admin/submit nodes, Suite addOns | HPC/HTC | Proprietary | Unix, Linux, Windows | Cost and Academic - model - Academic, Express, Standard, Advanced and Suites | Yes |
| Oracle Grid Engine (Sun Grid Engine, SGE) | Altair | Job Scheduler | active; development moved to Altair Grid Engine | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Proprietary | *nix/Windows | Cost | |
| Son of Grid Engine | daimh | Job Scheduler | actively developed (stable/maintenance) | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Open-source SISSL | *nix | Free | No |
| SynfiniWay | Fujitsu | | actively developed | | HPC/HTC | ? | Unix, Linux, Windows | Cost | |
| TORQUE Resource Manager | Adaptive Computing | Job Scheduler | actively developed | | | Proprietary | Linux, *nix | Cost | Yes |
| UniCluster | Univa | All in one | Functionality and development moved to UniCloud (see above) | | | | | Free | Yes |
| UNICORE | | | | | | | | | |
| | Altair | Job Scheduler | actively developed | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Proprietary | *nix/Windows | Cost | |
| Xgrid | Apple Computer | | | | | | | | |

Table explanation

  • Software: The name of the application that is described

Technical information

| Software | Implementation Language | Authentication | Encryption | Integrity | Global File System | Global File System + Kerberos | Heterogeneous/Homogeneous exec node | Jobs priority | Group priority | Queue type | SMP aware | Max exec node | Max job submitted | CPU scavenging | Parallel job | Job checkpointing |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Enduro/X | C/C++ | OS Authentication | GPG, AES-128, SHA1 | None | Any cluster Posix FS (gfs, gpfs, ocfs, etc.) | Any cluster Posix FS (gfs, gpfs, ocfs, etc.) | Heterogeneous | OS Nice level | OS Nice level | SOA Queues, FIFO | Yes | OS Limits | OS Limits | Yes | Yes | No |
| HTCondor | C++ | GSI, SSL, Kerberos, Password, File System, Remote File System, Windows, Claim To Be, Anonymous | None, Triple DES, BLOWFISH | None, MD5 | None, NFS, AFS | Not official, hack with ACL and NFS4 | Heterogeneous | Yes | Yes | Fair-share with some programmability | basic (hard separation into different node) | tested ~10000? | tested ~100000? | Yes | MPI, OpenMP, PVM | Yes |
| PBS Pro | C/Python | OS Authentication, Munge | | | Any, e.g., NFS, Lustre, GPFS, AFS | Limited availability | Heterogeneous | Yes | Yes | Fully configurable | Yes | tested ~50,000 | Millions | Yes | MPI, OpenMP | Yes |
| OpenLava | C/C++ | OS authentication | None | | NFS | | Heterogeneous Linux | Yes | Yes | Configurable | Yes | | | Yes, supports preemption based on priority | Yes | Yes |
| Slurm | C | Munge, None, Kerberos | | | | | Heterogeneous | Yes | Yes | Multifactor Fair-share | Yes | tested 120k | tested 100k | No | Yes | Yes |
| Spectrum LSF | C/C++ | Multiple - OS Authentication/Kerberos | Optional | Optional | Any - GPFS/Spectrum Scale, NFS, SMB | Any - GPFS/Spectrum Scale, NFS, SMB | Heterogeneous - HW and OS agnostic (AIX, Linux or Windows) | Policy based - no queue to compute node binding | Policy based - no queue to compute group binding | Batch, interactive, checkpointing, parallel and combinations | Yes, and GPU aware (GPU license free) | > 9,000 compute hosts | > 4 million jobs a day | Yes, supports preemption based on priority, supports checkpointing/resume | Yes, e.g. parallel submissions for job collaboration over e.g. MPI | Yes, with support for user, kernel or library level checkpointing environments |
| Torque | C | SSH, munge | | | None, any | | Heterogeneous | Yes | Yes | Programmable | Yes | tested | tested | Yes | Yes | Yes |
| | C | OS Authentication/Kerberos/Oauth2 | Certificate Based | Integrity | Arbitrary, e.g. NFS, Lustre, HDFS, AFS | AFS | Fully heterogeneous | Yes; automatically policy controlled (e.g. fair-share, deadline, resource dependent) or manual | Yes; can be dependent on user groups as well as projects and is governed by policies | Batch, interactive, checkpointing, parallel and combinations | Yes, with core binding, GPU and Intel Xeon Phi support | commercial deployments with many tens of thousands of hosts | >300K tested in commercial deployments | Yes; can suspend job on interactive usage | Yes, with support of arbitrary parallel environments such as OpenMPI, MPICH 1/2, MVAPICH 1/2, LAM, etc. | Yes, with support for user, kernel or library level checkpointing environments |

Table explanation

  • Software: The name of the application that is described
  • SMP aware:
    • basic: hard split into multiple virtual hosts
    • basic+: hard split into multiple virtual hosts, with some minimal or incomplete communication between virtual hosts on the same computer
    • dynamic: split the resources of the computer (CPU/RAM) on demand
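
The difference between the "basic" and "dynamic" modes can be illustrated with a small sketch. The Python below is a hypothetical toy model and does not reflect the implementation of any scheduler listed in the table: `split_basic` carves one machine into fixed virtual hosts, while `DynamicHost` hands out CPU and RAM from a shared pool on demand. All names and numbers are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """A fixed share of one machine, exposed to the scheduler as a virtual host."""
    cpus: int
    ram_gb: int


def split_basic(cpus: int, ram_gb: int, parts: int) -> list[Slot]:
    """'basic' mode (toy model): hard-split a machine into equal virtual hosts."""
    return [Slot(cpus // parts, ram_gb // parts) for _ in range(parts)]


class DynamicHost:
    """'dynamic' mode (toy model): allocate CPU and RAM from one pool on demand."""

    def __init__(self, cpus: int, ram_gb: int) -> None:
        self.free_cpus = cpus
        self.free_ram_gb = ram_gb

    def allocate(self, cpus: int, ram_gb: int) -> bool:
        """Reserve resources for a job; return False if they do not fit."""
        if cpus > self.free_cpus or ram_gb > self.free_ram_gb:
            return False
        self.free_cpus -= cpus
        self.free_ram_gb -= ram_gb
        return True


# A 6-CPU, 16 GB job on an 8-CPU, 32 GB machine:
slots = split_basic(8, 32, parts=4)           # four 2-CPU, 8 GB virtual hosts
fits_basic = any(s.cpus >= 6 for s in slots)  # no single fixed slot is large enough
host = DynamicHost(8, 32)
fits_dynamic = host.allocate(6, 16)           # carved out of the whole machine
```

In this model the hard split cannot place a job larger than one virtual host even when the machine as a whole has room, which is the limitation the "dynamic" mode avoids.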

History and adoption
