Comparison of cluster software
The following tables compare general and technical information for notable computer cluster software. This software can be broadly separated into four categories: job scheduler, node management, node installation, and integrated stack (all of the above).
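To make the "job scheduler" category concrete, the sketch below submits a batch job to Slurm (one of the schedulers compared here) from Python through its standard `sbatch` command line. The job name, task count, and time limit are assumptions chosen purely for illustration, not values taken from the tables.

```python
# Minimal sketch: handing a batch job to a Slurm-style job scheduler.
# The resource requests and the script body are illustrative assumptions.
import subprocess

job_script = """#!/bin/bash
#SBATCH --job-name=example
#SBATCH --ntasks=4
#SBATCH --time=00:10:00
srun hostname
"""

# sbatch accepts the job script on stdin and prints the assigned job ID,
# e.g. "Submitted batch job 12345". The scheduler then queues, prioritizes
# and dispatches the job to execution nodes.
result = subprocess.run(
    ["sbatch"], input=job_script, capture_output=True, text=True, check=True
)
print(result.stdout.strip())
```

Other job schedulers in the tables (PBS Pro, Spectrum LSF, Grid Engine, TORQUE) follow the same submit-queue-dispatch pattern with their own commands (`qsub`, `bsub`, and so on).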
General information
Software | Maintainer | Category | Development status | Architecture | High-Performance/ High-Throughput Computing | License | Platforms supported | Cost | Paid support available |
---|---|---|---|---|---|---|---|---|---|
Accelerator | Altair | Job Scheduler | actively developed | Master/worker distributed | HPC/HTC | Proprietary | Linux, Windows | Cost | Yes |
Amoeba | | | No active development | | | MIT | | | |
Base One Foundation Component Library | | | | | | Proprietary | | | |
DIET | INRIA, SysFera, Open Source | All in one | | GridRPC, SPMD, Hierarchical and distributed architecture, CORBA | HTC/HPC | CeCILL | Unix-like, Mac OS X, AIX | Free | |
Enduro/X | Mavimax, Ltd. | Job/Data Scheduler | actively developed | SOA Grid | HTC/HPC/HA | GPLv2 or Commercial | Linux, FreeBSD, MacOS, Solaris, AIX | Free / Cost | Yes |
Ganglia | | Monitoring | actively developed | | | BSD | Unix, Linux, Windows NT/XP/2000/2003/2008, FreeBSD, NetBSD, OpenBSD, DragonflyBSD, Mac OS X, Solaris, AIX, IRIX, Tru64, HP-UX | Free | |
Globus Toolkit | Globus Alliance, Argonne National Laboratory | Job/Data Scheduler | actively developed | SOA Grid | | | Linux | Free | |
Grid MP | Univa (formerly United Devices) | Job Scheduler | no active development | Distributed master/worker | HTC/HPC | Proprietary | Windows, Linux, Mac OS X, Solaris | Cost | |
Apache Mesos | Apache | | actively developed | | | Apache license v2.0 | Linux | Free | Yes |
Moab Cluster Suite | Adaptive Computing | Job Scheduler | actively developed | | HPC | Proprietary | Linux, Mac OS X, Windows, AIX, OSF/Tru-64, Solaris, HP-UX, IRIX, FreeBSD & other UNIX platforms | Cost | Yes |
NetworkComputer | Runtime Design Automation | | actively developed | | HTC/HPC | Proprietary | Unix-like, Windows | Cost | |
OpenHPC | OpenHPC project | All in one | actively developed | | HPC | | Linux (CentOS) | Free | No |
OpenLava | Teraproc | Job Scheduler | actively developed | Master/Worker, multiple admin/submit nodes | HTC/HPC | GPL | Linux | Free | Yes |
PBS Pro | Altair | Job Scheduler | actively developed | Master/worker distributed with fail-over | HPC/HTC | AGPL or Proprietary | Linux, Windows | Free or Cost | Yes |
Proxmox Virtual Environment | Proxmox Server Solutions | Complete | actively developed | | | Open-source AGPLv3 | Linux, Windows, other operating systems are known to work and are community supported | Free | Yes |
Rocks Cluster Distribution | Open Source/NSF grant | All in one | actively developed | | HTC/HPC | Open source | CentOS | Free | |
Popular Power | |||||||||
ProActive | INRIA, Open Source | All in one | actively developed | Master/Worker, SPMD, Distributed Component Model, Skeletons | HTC/HPC | GPL | Unix-like, Windows, Mac OS X | Free | |
RPyC | Tomer Filiba | | actively developed | | | MIT License | *nix/Windows | Free | |
SLURM | SchedMD | Job Scheduler | actively developed | | HPC/HTC | GPL | Linux/*nix | Free | Yes |
Spectrum LSF | IBM | Job Scheduler | actively developed | Master node with failover/exec clients, multiple admin/submit nodes, Suite add-ons | HPC/HTC | Proprietary | Unix, Linux, Windows | Cost; academic and commercial editions (Academic, Express, Standard, Advanced and Suites) | Yes |
Oracle Grid Engine (Sun Grid Engine, SGE) | Altair | Job Scheduler | development moved to Altair Grid Engine | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Proprietary | *nix/Windows | Cost | |
Son of Grid Engine | daimh | Job Scheduler | actively developed (stable/maintenance) | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Open-source SISSL | *nix | Free | No |
SynfiniWay | Fujitsu | | actively developed | | HPC/HTC | ? | Unix, Linux, Windows | Cost | |
TORQUE Resource Manager | Adaptive Computing | Job Scheduler | actively developed | | | Proprietary | Linux, *nix | Cost | Yes |
UniCluster | Univa | All in one | functionality and development moved to UniCloud | | | | | Free | Yes |
UNICORE | |||||||||
Altair Grid Engine | Altair | Job Scheduler | actively developed | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Proprietary | *nix/Windows | Cost | |
Xgrid | Apple Computer | ||||||||
Software | Maintainer | Category | Development status | Architecture | High-Performance/ High-Throughput Computing | License | Platforms supported | Cost | Paid support available |
Table explanation
- Software: The name of the application that is described
Technical information
Software | Implementation Language | Authentication | Encryption | Integrity | Global File System | Global File System + Kerberos | Heterogeneous/ Homogeneous exec node | Jobs priority | Group priority | Queue type | SMP aware | Max exec node | Max job submitted | CPU scavenging | Parallel job | Job checkpointing |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Enduro/X | C/C++ | OS Authentication | GPG, AES-128, SHA1 | None | Any cluster Posix FS (gfs, gpfs, ocfs, etc.) | Any cluster Posix FS (gfs, gpfs, ocfs, etc.) | Heterogeneous | OS Nice level | OS Nice level | SOA Queues, FIFO | Yes | OS Limits | OS Limits | Yes | Yes | No |
HTCondor | C++ | GSI, SSL, Kerberos, Password, File System, Remote File System, Windows, Claim To Be, Anonymous | None, Triple DES, Blowfish | None, MD5 | None, NFS, AFS | Not official; workaround using ACLs and NFSv4 | Heterogeneous | Yes | Yes | Fair-share with some programmability | basic (hard separation into different nodes) | tested ~10000? | tested ~100000? | Yes | MPI, OpenMP, PVM | Yes |
PBS Pro | C/Python | OS Authentication, Munge | | | Any, e.g., NFS, Lustre, GPFS, AFS | Limited availability | Heterogeneous | Yes | Yes | Fully configurable | Yes | tested ~50,000 | Millions | Yes | MPI, OpenMP | Yes |
OpenLava | C/C++ | OS authentication | None | | NFS | | Heterogeneous Linux | Yes | Yes | Configurable | Yes | | | Yes, supports preemption based on priority | Yes | Yes |
Slurm | C | Munge, None, Kerberos | | | | | Heterogeneous | Yes | Yes | Multifactor Fair-share | Yes | tested 120k | tested 100k | No | Yes | Yes |
Spectrum LSF | C/C++ | Multiple - OS Authentication/Kerberos | Optional | Optional | Any - GPFS/Spectrum Scale, NFS, SMB | Any - GPFS/Spectrum Scale, NFS, SMB | Heterogeneous - HW and OS agnostic (AIX, Linux or Windows) | Policy based - no queue to compute node binding | Policy based - no queue to compute group binding | Batch, interactive, checkpointing, parallel and combinations | Yes, GPU aware (GPU license free) | > 9,000 compute hosts | > 4 million jobs a day | Yes, supports preemption based on priority, supports checkpointing/resume | Yes, e.g. parallel submissions for job collaboration over MPI | Yes, with support for user, kernel or library level checkpointing environments |
Torque | C | SSH, munge | | | None, any | | Heterogeneous | Yes | Yes | Programmable | Yes | tested | tested | Yes | Yes | Yes |
Altair Grid Engine | C | OS Authentication/Kerberos/OAuth2 | Certificate Based | Integrity | Arbitrary, e.g. NFS, Lustre, HDFS, AFS | AFS | Fully heterogeneous | Yes; automatically policy controlled (e.g. fair-share, deadline, resource dependent) or manual | Yes; can be dependent on user groups as well as projects and is governed by policies | Batch, interactive, checkpointing, parallel and combinations | Yes, with core binding, GPU and Intel Xeon Phi support | commercial deployments with many tens of thousands of hosts | >300K tested in commercial deployments | Yes; can suspend job on interactive usage | Yes, with support of arbitrary parallel environments such as OpenMPI, MPICH 1/2, MVAPICH 1/2, LAM, etc. | Yes, with support for user, kernel or library level checkpointing environments |
Software | Implementation Language | Authentication | Encryption | Integrity | Global File System | Global File System + Kerberos | Heterogeneous/ Homogeneous exec node | Jobs priority | Group priority | Queue type | SMP aware | Max exec node | Max job submitted | CPU scavenging | Parallel job | Job checkpointing |
Table explanation
- Software: The name of the application that is described
- SMP aware:
  - basic: hard split into multiple virtual hosts
  - basic+: hard split into multiple virtual hosts with some minimal/incomplete communication between virtual hosts on the same computer
  - dynamic: splits the computer's resources (CPU/RAM) on demand
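The "Parallel job" column records which parallel programming models (MPI, OpenMP, PVM) a scheduler can launch. As a hedged illustration only, the sketch below is a minimal MPI program using the mpi4py bindings; a scheduler with parallel-job support starts one copy per allocated task (for example via `srun` or `mpirun` inside a batch script). The program itself is an assumption for illustration, not taken from any entry above.

```python
# Minimal MPI sketch (mpi4py): each task reports its rank, illustrating the
# kind of parallel job the "Parallel job" column refers to. The scheduler
# (or mpirun) decides how many copies run and on which nodes.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()      # this task's index within the parallel job
size = comm.Get_size()      # total number of tasks started by the launcher

# Gather all ranks on task 0 just to show inter-task communication.
ranks = comm.gather(rank, root=0)
if rank == 0:
    print(f"parallel job ran with {size} tasks: {ranks}")
```

Under Slurm, for instance, such a program would typically be launched with `srun -n 4 python mpi_hello.py`; other schedulers in the table use their own parallel-environment integration to achieve the same thing.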
History and adoption
This section is empty. (July 2010)
See also
- List of distributed computing projects
- List of cluster management software
- Computer cluster
- Grid computing
- World Community Grid
- Distributed computing
- Distributed resource management
- High-Throughput Computing
- Job Processing Cycle
- Batch processing
- Fallacies of Distributed Computing