Standard library extension modules
In this section, we explain how to configure and compile the CPython project with a C extension module. We do not explain how to write a C extension module; refer to the existing documentation on that topic instead.
Some modules in the standard library, such as datetime or pickle, have identical implementations in C and Python; the C implementation, when available, is expected to improve performance (such extension modules are commonly referred to as accelerator modules).
Other modules mainly implemented in Python may import a C helper extension providing implementation details (for instance, the csv module uses the internal _csv module defined in Modules/_csv.c).
Classifying extension modules
Extension modules can be classified into two categories:
A built-in extension module is a module built and shipped with the Python interpreter. A built-in module is statically linked into the interpreter and therefore lacks a __file__ attribute.
See also
sys.builtin_module_names — names of built-in modules.
Built-in modules are built with the Py_BUILD_CORE_BUILTIN macro defined.
A shared (or dynamic) extension module is built as a shared library (.so or .dll file; on Windows, extension modules conventionally use the .pyd suffix for such a DLL) and is dynamically linked into the interpreter.
In particular, the module’s __file__ attribute contains the path to the .so or .dll file.
Shared modules are built with the Py_BUILD_CORE_MODULE macro defined. Using the Py_BUILD_CORE_BUILTIN macro instead causes an ImportError when importing the module.
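As an illustration of the two categories, the following minimal sketch classifies a module by checking sys.builtin_module_names and the suffix of its __file__ attribute. The classify helper is hypothetical (it is not part of CPython), and using importlib.machinery.EXTENSION_SUFFIXES to recognize shared extension modules is an assumption of this sketch.

```python
import importlib
import importlib.machinery
import sys


def classify(name):
    """Return 'built-in', 'shared' or 'other' for an importable module name.

    Hypothetical helper, for illustration only.
    """
    # Built-in modules are statically linked and listed by the interpreter.
    if name in sys.builtin_module_names:
        return "built-in"
    module = importlib.import_module(name)
    origin = getattr(module, "__file__", None)
    # Shared extension modules expose the path of their shared library.
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    if origin and origin.endswith(suffixes):
        return "shared"
    # Anything else (for instance, a pure Python module).
    return "other"


if __name__ == "__main__":
    for candidate in ("sys", "json"):
        print(candidate, classify(candidate))
```

Whether a given accelerator module is reported as built-in or shared depends on how the interpreter was configured, so the result may differ between builds.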
Note
Informally, built-in extension modules can be regarded as required while shared extension modules are optional in the sense that they might be supplied, overridden or disabled externally.
Usually, accelerator modules are built as shared extension modules, especially if they already have a pure Python implementation.
According to PEP 399, new extension modules MUST provide a working and tested pure Python implementation, unless a special dispensation from the Steering Council is given.
Adding an extension module to CPython
Assume that the standard library contains a pure Python module foo that defines a foo.greet() function.
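A minimal sketch of such a module could look as follows. The body of greet(), and in particular the returned string, is an illustrative assumption rather than something prescribed by this guide.

```python
# Lib/foo.py (hypothetical pure Python module used throughout this example)

def greet():
    """Return a greeting; the exact string is an assumption of this sketch."""
    return "Hello World!"


if __name__ == "__main__":
    print(greet())
```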
Instead of using the Python implementation of foo.greet(), we want to use its corresponding C extension implementation exposed in the _foo module. Ideally, Lib/foo.py imports greet from _foo when that module is available and falls back to the pure Python implementation otherwise.
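One common way to express this is a guarded import with a pure Python fallback. The sketch below assumes that _foo exports a greet function with the same behavior as the Python one; the fallback body is the same illustrative assumption as before.

```python
# Lib/foo.py (hypothetical), preferring the accelerator when it is available

try:
    # Use the C implementation if the _foo extension module was built.
    from _foo import greet
except ImportError:
    # Fall back to the pure Python implementation otherwise.
    def greet():
        return "Hello World!"
```

With this layout, callers keep using foo.greet() and never need to know which implementation is active.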
Note
Accelerator modules should never be imported directly. The convention is to mark them as private implementation details with the underscore prefix (namely, _foo in this example).
In order to incorporate the accelerator module, we need to determine:
where to update the CPython project tree with the extension module source code,
which files to modify to configure and compile the CPython project, and
which Makefile rules to invoke at the end.
Updating the CPython project tree
Usually, accelerator modules are added to the Modules directory of the CPython project. If more than one file is needed for the extension module, it is more convenient to create a sub-directory in Modules.
In the simplest example where the extension module consists of one file, it may be placed in Modules as Modules/_foomodule.c. For a non-trivial example of the extension module _foo, we consider the following working tree:
Modules/_foo/_foomodule.c — the extension module implementation.
Modules/_foo/helper.h — the extension helpers declarations.
Modules/_foo/helper.c — the extension helpers implementations.
By convention, the source file containing the extension module implementation is called <NAME>module.c, where <NAME> is the name of the module that will be later imported (in our case _foo). In addition, the directory containing the implementation should also be named similarly.
Tip
Functions or data that do not need to be shared across different C source files should be declared static to avoid exporting their symbols from libpython.
If symbols need to be exported, their names must start with Py or _Py. This can be verified by make smelly. For more details, please refer to the section on Changing Python’s C API.
Tip
Recall that the name of the PyInit_<NAME> function must end with the module name <NAME> used in import statements (here _foo), which usually coincides with PyModuleDef.m_name.
Other identifiers such as those used in Argument Clinic inputs do not have such naming requirements.
Configuring the CPython project
Now that we have added our extension module to the CPython source tree, we need to update some configuration files in order to compile the CPython project on different platforms.
Updating Modules/Setup.{bootstrap,stdlib}.in
Depending on whether the extension module is required to get a functioning interpreter or not, we update Modules/Setup.bootstrap.in or Modules/Setup.stdlib.in. In the former case, the extension module is necessarily built as a built-in extension module.
Tip
For accelerator modules, Modules/Setup.stdlib.in should be preferred over Modules/Setup.bootstrap.in.
For built-in extension modules, update Modules/Setup.bootstrap.in by adding a line for the module after the *static* marker.
The syntax is <NAME> <SOURCES> where <NAME> is the name of the module used in import statements and <SOURCES> is the list of space-separated source files.
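To make the syntax concrete, the following sketch formats such an entry for the _foo example. The setup_bootstrap_line helper is hypothetical, and writing the source paths relative to the Modules directory is an assumption of this sketch.

```python
def setup_bootstrap_line(name, sources):
    """Format a '<NAME> <SOURCES>' entry (hypothetical helper)."""
    # The module name comes first, followed by space-separated source files.
    return f"{name} {' '.join(sources)}"


if __name__ == "__main__":
    # Source paths are assumed to be relative to the Modules directory.
    print(setup_bootstrap_line("_foo", ["_foo/_foomodule.c", "_foo/helper.c"]))
```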
For other extension modules, update Modules/Setup.stdlib.in by adding a line starting with the @MODULE_<NAME_UPPER>_TRUE@<NAME> marker after the *@MODULE_BUILDTYPE@* marker but before the *shared* marker.
The @MODULE_<NAME_UPPER>_TRUE@<NAME> marker expects <NAME_UPPER> to be the upper-cased form of <NAME>, where <NAME> has the same meaning as before (in our case, <NAME_UPPER> and <NAME> are _FOO and _foo respectively). The marker is followed by the list of source files.
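The marker naming rule can be sketched as a small formatter. The setup_stdlib_line helper is hypothetical, and the source paths are again assumed to be relative to the Modules directory.

```python
def setup_stdlib_line(name, sources):
    """Format a Modules/Setup.stdlib.in entry (hypothetical helper)."""
    # <NAME_UPPER> is the upper-cased module name, e.g. '_foo' -> '_FOO'.
    marker = f"@MODULE_{name.upper()}_TRUE@"
    # The marker is immediately followed by the module name, then the sources.
    return f"{marker}{name} {' '.join(sources)}"


if __name__ == "__main__":
    print(setup_stdlib_line("_foo", ["_foo/_foomodule.c", "_foo/helper.c"]))
```

Note the double underscore in the result: it comes from the leading underscore of _foo being kept when the name is upper-cased.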
If the extension module must be built as a shared module, put the @MODULE__FOO_TRUE@_foo line after the *shared* marker instead.
Updating configure.ac
Locate the SRCDIRS variable and add the following line:
```
AC_SUBST([SRCDIRS])
SRCDIRS="\
...
Modules/_foo \
..."
```

Note
This step is only needed when adding new source directories to the CPython project.
Find the section containing PY_STDLIB_MOD and PY_STDLIB_MOD_SIMPLE usages and add the following line:
```
dnl always enabled extension modules
...
PY_STDLIB_MOD_SIMPLE([_foo], [-I\$(srcdir)/Modules/_foo], [])
...
```

The PY_STDLIB_MOD_SIMPLE macro takes as arguments:
the module name <NAME> used in import statements,
the compiler flags (CFLAGS), and
the linker flags (LDFLAGS).
If the extension module may not be enabled or supported depending on the host configuration, use the PY_STDLIB_MOD macro instead, which takes as arguments:
the module name <NAME> used in import statements,
a boolean indicating whether the extension is enabled or not,
a boolean indicating whether the extension is supported or not,
the compiler flags (CFLAGS), and
the linker flags (LDFLAGS).
For instance, enabling the _foo extension on Linux platforms, but only supporting it on 32-bit architectures, is achieved as follows:
```
PY_STDLIB_MOD([_foo],
              [test "$ac_sys_system" = "Linux"],
              [test "$ARCH_RUN_32BIT" = "true"],
              [-I\$(srcdir)/Modules/_foo], [])
```

More generally, the host’s configuration status of the extension is determined as follows:
| Enabled | Supported     | Status   |
|---------|---------------|----------|
| true    | true          | yes      |
| true    | false         | missing  |
| false   | true or false | disabled |
The extension status is n/a if the extension is marked unavailable by the PY_STDLIB_MOD_SET_NA macro. To mark an extension as unavailable, find the usages of PY_STDLIB_MOD_SET_NA in configure.ac and add the following line:
```
dnl Modules that are not available on some platforms
AS_CASE([$ac_sys_system],
    ...
    [PLATFORM_NAME], [PY_STDLIB_MOD_SET_NA([_foo])],
    ...
)
```
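The status rules described above (the table and the n/a case) can be summarized as a small function. This is only a model of the decision logic, not code taken from configure.ac; the extension_status name and its parameters are hypothetical, and letting the n/a marking take precedence over the other two flags is an assumption of this sketch.

```python
def extension_status(enabled, supported, not_available=False):
    """Model the configuration status of an extension module.

    Hypothetical helper mirroring the status table; `not_available`
    stands for a module marked with PY_STDLIB_MOD_SET_NA.
    """
    if not_available:
        # Assumed to take precedence over the enabled/supported checks.
        return "n/a"
    if not enabled:
        # A disabled extension is reported as such, supported or not.
        return "disabled"
    # Enabled: the status depends on whether the host supports the module.
    return "yes" if supported else "missing"


if __name__ == "__main__":
    for flags in [(True, True), (True, False), (False, True), (False, False)]:
        print(flags, extension_status(*flags))
```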
Tip
Consider reading the comments and configurations for existing modules in configure.ac for guidance on adding new external build dependencies for extension modules that need them.
Updating Makefile.pre.in
If needed, add a MODULE__FOO_DEPS line listing the files that the module depends on to the section for module dependencies.
The MODULE_<NAME_UPPER>_DEPS variable follows the same naming requirements as the @MODULE_<NAME_UPPER>_TRUE@<NAME> marker.
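As a sketch of the naming rule, the following formatter builds such a dependency line for the _foo example. The module_deps_line helper is hypothetical, and both the $(srcdir) prefix and the choice of helper.h as the only dependency are assumptions of this sketch.

```python
def module_deps_line(name, dependencies):
    """Format a MODULE_<NAME_UPPER>_DEPS assignment (hypothetical helper)."""
    # Same upper-casing rule as for the @MODULE_<NAME_UPPER>_TRUE@ marker.
    variable = f"MODULE_{name.upper()}_DEPS"
    # Each dependency is assumed to be given relative to the source root.
    paths = " ".join(f"$(srcdir)/{path}" for path in dependencies)
    return f"{variable}={paths}"


if __name__ == "__main__":
    print(module_deps_line("_foo", ["Modules/_foo/helper.h"]))
```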
Updating MSVC project files
We describe the minimal steps for compiling on Windows using MSVC.
Update PC/config.c:
```c
...
// add the entry point prototype
extern PyObject* PyInit__foo(void);
...
// update the entry points table
struct _inittab _PyImport_Inittab[] = {
    ...
    {"_foo", PyInit__foo},
    ...
    {0, 0}
};
...
```

Each item in _PyImport_Inittab consists of the module name to import, here _foo, with the corresponding PyInit_* function correctly suffixed.
Update PCbuild/pythoncore.vcxproj:
```xml
<!-- group with header files ..\Modules\<MODULE>.h -->
<ItemGroup>
  ...
  <ClInclude Include="..\Modules\_foo\helper.h" />
  ...
</ItemGroup>

<!-- group with source files ..\Modules\<MODULE>.c -->
<ItemGroup>
  ...
  <ClCompile Include="..\Modules\_foo\_foomodule.c" />
  <ClCompile Include="..\Modules\_foo\helper.c" />
  ...
</ItemGroup>
```

Update PCbuild/pythoncore.vcxproj.filters:
```xml
<!-- group with header files ..\Modules\<MODULE>.h -->
<ItemGroup>
  ...
  <ClInclude Include="..\Modules\_foo\helper.h">
    <Filter>Modules\_foo</Filter>
  </ClInclude>
  ...
</ItemGroup>

<!-- group with source files ..\Modules\<MODULE>.c -->
<ItemGroup>
  ...
  <ClCompile Include="..\Modules\_foo\_foomodule.c">
    <Filter>Modules\_foo</Filter>
  </ClCompile>
  <ClCompile Include="..\Modules\_foo\helper.c">
    <Filter>Modules\_foo</Filter>
  </ClCompile>
  ...
</ItemGroup>
```
Tip
Header files use <ClInclude> tags, whereas source files use <ClCompile> tags.
Compiling the CPython project
Now that the configuration is in place, it remains to compile the project using the make rules described below.
Tip
Use make -jN to speed up compilation, where N is the number of CPU cores you want to spare (and have memory for). Be careful when using make -j with no argument, as this puts no limit on the number of jobs, and compilation can sometimes use up a lot of memory (for instance, when building with LTO).
make regen-configure updates the configure script.
The configure script must be generated using a specific version of autoconf. To that end, the Tools/build/regen-configure.sh script, on which the regen-configure rule is based, requires either Docker or Podman, the latter being assumed by default.
Tip
We recommend installing Podman instead of Docker since the former does not require a background service and avoids creating files owned by the root user in some cases.
make regen-all is responsible for regenerating header files and invoking other scripts, such as Argument Clinic. Execute this rule if you do not know which files should be updated.
make regen-stdlib-module-names updates the standard module names, making _foo discoverable and importable via import _foo.
A final plain make invocation is generally not needed since the previous make invocations may completely rebuild the project, but it could be needed in some specific cases.
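The sequence of rules described above can be sketched as an ordered list of commands. The ./configure step between regenerating the configure script and the other rules is an assumption of this sketch, as are the BUILD_STEPS, render_steps and run_steps names; nothing is executed unless run_steps is called explicitly from the root of a CPython checkout.

```python
import subprocess

# Ordered build steps; './configure' is an assumed intermediate step.
BUILD_STEPS = [
    ["make", "regen-configure"],            # update the configure script
    ["./configure"],                        # refresh the Makefile
    ["make", "regen-all"],                  # regenerate headers, Argument Clinic, ...
    ["make", "regen-stdlib-module-names"],  # update the standard module names
    ["make"],                               # final build (often not needed)
]


def render_steps(steps=BUILD_STEPS):
    """Return the steps as shell-like command strings, for display only."""
    return [" ".join(step) for step in steps]


def run_steps(cwd, steps=BUILD_STEPS):
    """Run each step in order from `cwd`, stopping at the first failure."""
    for step in steps:
        subprocess.run(step, cwd=cwd, check=True)


if __name__ == "__main__":
    print("\n".join(render_steps()))
```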
Troubleshooting
This section addresses common issues that you may face when following this example of adding an extension module.
No rule to make target regen-configure
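This usually happens after running make distclean (which removes the Makefile). The solution is to run ./configure to recreate the Makefile, then make regen-configure to update the configure script, and finally ./configure again to refresh the Makefile.
If the configure script itself is missing, it can be regenerated by executing Tools/build/regen-configure.sh, after which running ./configure recreates the Makefile.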
make regen-configure and missing permissions with Docker
If Docker complains about missing permissions, this Stack Overflow post could be useful in solving the issue: How to fix docker: permission denied. Alternatively, you may try using Podman.
Missing Py_BUILD_CORE define when using internal headers
By default, #include "Python.h" exposes the public CPython C API. In some cases, this may be insufficient and internal headers from Include/internal are needed; those headers require the Py_BUILD_CORE macro to be defined.
To that end, one should define the Py_BUILD_CORE_BUILTIN or the Py_BUILD_CORE_MODULE macro depending on whether the extension module is built-in or shared. Using either of the two macros implies Py_BUILD_CORE and gives access to CPython internals.
Tips
In this section, we give some tips for improving the quality of extension modules meant to be included in the standard library.
Restricting to the Limited API
In order for non-CPython implementations to benefit from new extension modules, it is recommended to use the Limited API. Instead of exposing the entire public C API, define the Py_LIMITED_API macro before the #include "Python.h" directive.
This makes the extension module friendlier to non-CPython implementations by removing its dependencies on CPython internals.