Overview:
PP is a Python module that provides a mechanism for parallel execution of Python code on SMP systems (systems with multiple processors or cores) and clusters (computers connected via a network). It is lightweight, easy to install, and easy to integrate with other Python software. PP is an open source, cross-platform module written in pure Python.

Features:
- Parallel execution of python code on SMP and clusters
- Job-based parallelization technique that is easy to understand and implement (serial applications are easy to convert to parallel ones)
- Automatic detection of the optimal configuration (by default the number of worker processes is set to the number of effective processors)
- Dynamic processor allocation (the number of worker processes can be changed at runtime)
- Low overhead for subsequent jobs with the same function (transparent caching is implemented to decrease the overhead)
- Dynamic load balancing (jobs are distributed between processors at runtime)
- Fault tolerance (if one of the nodes fails, its tasks are rescheduled on other nodes)
- Auto-discovery of computational resources
- Dynamic allocation of computational resources (consequence of auto-discovery and fault-tolerance)
- SHA-based authentication for network connections
- Cross-platform portability and interoperability (Windows, Linux, Unix, Mac OS X)
- Cross-architecture portability and interoperability (x86, x86-64, etc.)
- Open source
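
Several of the features above correspond to arguments of `pp.Server` and methods on the server object. The following is a minimal sketch, assuming the `pp` package is installed and that `pp.Server` accepts `ncpus`, `ppservers`, and `secret` as described in the PP module API documentation. The helper names `cluster_server_options` and `start_cluster_server` and the placeholder secret are hypothetical and used only for illustration.

```python
def cluster_server_options(nodes=("*",), secret="change-me", ncpus="autodetect"):
    """Build keyword arguments for pp.Server (hypothetical helper).

    nodes  -- remote ppserver addresses; "*" is assumed to request
              auto-discovery of servers on the local network
    secret -- shared passphrase used to authenticate network connections
    ncpus  -- number of local worker processes, or "autodetect"
    """
    return {"ncpus": ncpus, "ppservers": tuple(nodes), "secret": secret}


def start_cluster_server(**overrides):
    """Create a pp.Server from the options above (requires the pp package)."""
    import pp  # third-party module; imported lazily so the helper above works without it

    return pp.Server(**cluster_server_options(**overrides))
```

With PP installed, `server = start_cluster_server()` would start the local workers. `server.set_ncpus(2)` then changes the number of local worker processes at runtime, `server.get_ncpus()` reports it, and `server.print_stats()` prints job statistics. Auto-discovery is assumed to need `ppserver.py` running on the remote machines with a matching secret.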
Motivation:
Software written in Python is now used in a broad range of areas, including business logic, data analysis, and scientific computing. Together with the wide availability of SMP computers (multi-processor or multi-core) and clusters (computers connected via a network), this creates demand for parallel execution of Python code.
The simplest and most common way to write parallel applications for SMP computers is to use threads. However, if the application is computation-bound, the 'thread' and 'threading' Python modules do not run Python byte-code in parallel. The reason is that the Python interpreter uses the GIL (Global Interpreter Lock) for internal bookkeeping. This lock allows only one thread to execute Python byte-code at a time, even on an SMP computer.

The PP module overcomes this limitation and provides a simple way to write parallel Python applications. Internally, ppsmp uses processes and IPC (Inter-Process Communication) to organize parallel computations. PP handles all the details and complexity of the IPC, so your application only submits jobs and retrieves their results, which is the easiest way to write parallel applications. In addition, software written with PP also runs in parallel across many computers connected via a local network or the Internet. Cross-platform portability and dynamic load balancing allow PP to parallelize computations efficiently even on heterogeneous, multi-platform clusters.
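
The submit-and-retrieve workflow described above can be sketched roughly as follows. This is a minimal sketch, assuming the `pp` package is installed and that `pp.Server`, `submit`, and `destroy` behave as described in the PP module API documentation. The function names `is_prime`, `sum_primes`, and `run_jobs` are hypothetical and chosen only for this example.

```python
import math


def is_prime(n):
    """Return True if n is a prime number."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    limit = int(math.sqrt(n)) + 1
    for divisor in range(3, limit, 2):
        if n % divisor == 0:
            return False
    return True


def sum_primes(n):
    """Return the sum of all primes strictly below n (CPU-bound work)."""
    return sum(x for x in range(2, n) if is_prime(x))


def run_jobs(inputs):
    """Submit one PP job per input and collect the results in order."""
    import pp  # third-party module; assumed to be installed

    # With no arguments the server starts local worker processes only;
    # by default their number is the detected number of processors.
    job_server = pp.Server()
    try:
        # submit(function, args, depfuncs, modules): the worker needs to be
        # told which helper functions and modules the job depends on.
        jobs = [
            job_server.submit(sum_primes, (n,), (is_prime,), ("math",))
            for n in inputs
        ]
        # Calling a job object blocks until its result is available.
        return [job() for job in jobs]
    finally:
        job_server.destroy()
```

With PP installed, `run_jobs([100000, 100100, 100200])` would run the three calculations in separate worker processes, so they are not serialized by the GIL of the submitting process.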

Installation:
Any platform: download a module archive and extract it to a local directory. Run the setup script:
    python setup.py install
Windows: download and execute the Windows installer binary.

Documentation:
- Module API
- Quick start guide, SMP
- Quick start guide, clusters
- Advanced guide, clusters
- Command line options, ppserver.py
- PP FAQ

Examples: Parallel Python usage examples

Download: Parallel Python downloads

Support forums: The Parallel Python forums provide help from the Parallel Python community.

Please help us spread the word by linking to us: <a href="http://www.parallelpython.com">Parallel Python</a>