Existing Libraries
Among the most popular approaches is the one taken by Cython [BBC+11]. Cython is not merely a library but an entire programming language, one that is a strict superset of Python: all Python programs are valid Cython programs, but the converse is not true. Cython adds optional type annotations to the Python language, so that each variable can be assigned a static type. This extra information allows the source code to be translated directly into C, compiled and linked against Python, and then imported as a module. While this approach is quite powerful, it has its drawbacks. First, it seems to be geared mainly towards speeding up existing Python applications by adding type annotations to the critical parts, not towards wrapping existing C/C++ libraries. Interfacing with external libraries is cumbersome, as one has to copy all of the header file declarations into a form that Cython can parse. This violates the “don’t repeat yourself” (DRY) principle, which states that every piece of knowledge must have a single unambiguous representation within the system; with Cython, declarations must exist both in the C/C++ header files and in the Cython files. Additionally, just to bridge Python and C/C++, one has to learn a third programming language with its own syntax and semantics.
Another approach for speeding up Python is to write the low-level glue code manually in C/C++. This requires intimate knowledge of the CPython API as well as a large amount of boilerplate code for manual data type conversion and memory management. While this gives the programmer a lot of control, it also limits productivity, because programming at such a low level is time-consuming. Additionally, this approach depends on the internals of the original Python implementation, CPython, and is not portable to other implementations of the language such as PyPy, Jython, or IronPython.
A third approach is to use the ctypes module, which is included in the Python standard library. The ctypes module is a foreign function library for Python that provides C-compatible data types and can call functions directly from compiled DLL or shared library files. It works at the application binary interface (ABI) level rather than the application programming interface (API) level. One still needs to describe in Python the exact signatures of the functions that are being exported. In fact, one has to be quite careful: at the ABI level, any mismatch can lead to segmentation faults and data corruption, whereas the worst that can happen at the API level is that the program fails to compile. Again, like the approach taken by Cython, this violates the DRY principle, since declarations must be copied into Python.
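To make the duplication concrete, the following is a minimal sketch of calling the C standard library's `strlen` through ctypes. It assumes a POSIX system where the C library can be located or is already loaded into the process; the wrapper name `c_strlen` is a hypothetical helper introduced only for illustration.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. find_library("c") returns the C library's
# name; if it returns None, CDLL(None) exposes the symbols already
# loaded into the current process, which include the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# The prototype `size_t strlen(const char *s)` has to be restated by
# hand here, because ctypes cannot read it from <string.h>.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Return the length of a NUL-terminated byte string via C's strlen."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"hello"))  # prints 5
```

With `argtypes` set, passing an argument of the wrong Python type, such as `libc.strlen(42)`, raises `ctypes.ArgumentError` before the call is made. Nothing, however, checks that the restated prototype agrees with the real one in the header, which is exactly where the ABI-level failures described above originate.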