Parallel Python
Overview
Parallel Python is a Python module that provides a mechanism for parallel execution of Python code on SMP systems (machines with multiple processors or cores) and on clusters (computers connected via a network).
It is lightweight, easy to install, and easy to integrate with other Python software.
Parallel Python is an open source, cross-platform module written in pure Python.
Features
- Parallel execution of Python code on SMP and clusters
- Job-based parallelization technique that is easy to understand and implement (serial applications are easy to convert to parallel ones)
- Automatic detection of the optimal configuration (by default the number of worker processes is set to the number of effective processors)
- Dynamic processor allocation (the number of worker processes can be changed at runtime)
- Low overhead for subsequent jobs with the same function (transparent caching is implemented to decrease the overhead)
- Dynamic load balancing (jobs are distributed between processors at runtime)
- Fault tolerance (if one of the nodes fails, its tasks are rescheduled on others)
- Auto-discovery of computational resources
- Dynamic allocation of computational resources (consequence of auto-discovery and fault-tolerance)
- SHA-based authentication for network connections
- Cross-platform portability and interoperability (Windows, Linux, Unix, Mac OS X)
- Cross-architecture portability and interoperability (x86, x86-64, etc.)
- Open source
Motivation
Software written in Python is now used in a broad range of areas, including business logic, data analysis, and scientific computing. Together with the wide availability of SMP computers (multi-processor or multi-core) and clusters (computers connected via a network), this creates demand for parallel execution of Python code.
The simplest and most common way to write parallel applications for SMP computers is to use threads. However, if the application is computation-bound, the 'thread' and 'threading' Python modules do not run Python byte-code in parallel. The reason is that the Python interpreter uses the GIL (Global Interpreter Lock) for internal bookkeeping. This lock allows only one Python byte-code instruction to execute at a time, even on an SMP computer.
The Parallel Python module overcomes this limitation and provides a simple way to write parallel Python applications. Internally, ppsmp uses processes and IPC (Inter-Process Communication) to organize parallel computations. The module handles all the details and complexity of IPC, so your application only submits jobs and retrieves their results, which is the easiest way to write parallel applications.
Better still, software written with Parallel Python also runs in parallel across many computers connected via a local network or the Internet. Cross-platform portability and dynamic load balancing allow Parallel Python to parallelize computations efficiently even on heterogeneous, multi-platform clusters.
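The submit-and-retrieve workflow described above can be sketched as follows. This is a minimal illustration, not the project's reference example: it assumes the pp module is installed for the Python version in use, and the helper names isprime, sum_primes and run_parallel are chosen here for illustration only. The pp import is placed inside run_parallel so that the job functions can also be used serially.

```python
import math


def isprime(n):
    """Return True if n is a prime number."""
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    # Only odd divisors up to sqrt(n) need to be checked.
    limit = int(math.sqrt(n)) + 1
    for divisor in range(3, limit, 2):
        if n % divisor == 0:
            return False
    return True


def sum_primes(n):
    """Return the sum of all primes below n (the body of one job)."""
    return sum(x for x in range(2, n) if isprime(x))


def run_parallel(inputs):
    """Submit one job per input value and collect the results in order."""
    import pp  # assumption: Parallel Python is installed

    # With no arguments the number of workers is detected automatically.
    job_server = pp.Server()
    jobs = [
        # Arguments: function, its args, functions it depends on,
        # and modules the worker processes must import.
        job_server.submit(sum_primes, (n,), (isprime,), ("math",))
        for n in inputs
    ]
    # Calling a job object blocks until that job's result is available.
    results = [job() for job in jobs]
    job_server.destroy()
    return results
```

Calling run_parallel([100000, 100100, 100200]) would submit three independent jobs and return their sums in the same order. The serial equivalent is simply [sum_primes(n) for n in inputs], which shows how little the calling code changes.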
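As a rough sketch of how the same job-based approach might extend to a cluster, the example below splits a range of work into chunks and submits one job per chunk to a server that also knows about remote nodes. It assumes pp is installed, that each node runs ppserver.py with a matching secret, and that the node addresses are passed in by the caller. The helper names split_range, sum_squares and run_on_cluster are hypothetical and exist only for this illustration.

```python
def split_range(start, stop, parts):
    """Split [start, stop) into at most `parts` contiguous (lo, hi) chunks."""
    size, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        hi = lo + size + (1 if i < extra else 0)
        if hi > lo:  # skip empty chunks when parts exceeds the range length
            chunks.append((lo, hi))
        lo = hi
    return chunks


def sum_squares(lo, hi):
    """Job body: sum of the squares of the integers in [lo, hi)."""
    return sum(i * i for i in range(lo, hi))


def run_on_cluster(start, stop, parts, nodes, secret):
    """Distribute sum_squares over local workers and the given nodes."""
    import pp  # assumption: Parallel Python is installed

    # `nodes` is a tuple of node addresses; `secret` must match the
    # secret used by the ppserver.py instances on those nodes.
    job_server = pp.Server(ppservers=nodes, secret=secret)
    jobs = [
        job_server.submit(sum_squares, chunk)
        for chunk in split_range(start, stop, parts)
    ]
    total = sum(job() for job in jobs)
    job_server.destroy()
    return total
```

A call such as run_on_cluster(0, 10**7, 16, ("node1.example:60000",), "shared-secret") uses a made-up host name and secret. Submitting more chunks than there are workers gives the dynamic load balancer room to hand extra jobs to the faster nodes.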
Installation
Any platform: download the module archive, extract it to a local directory, and run the setup script from that directory: python setup.py install
Windows: download and run the Windows installer binary.
Documentation
Module API
Quick start guide, SMP
Quick start guide, clusters
Advanced guide, clusters
Command line options, ppserver.py
Parallel Python FAQ
Examples
Parallel Python usage examples
Downloads
© 2024 parallelpython.com All rights reserved