Array API¶
Array structure and data access¶
These macros all access the :ctype:`PyArrayObject` structure members. The input argument, arr, can be any :ctype:`PyObject *` that is directly interpretable as a :ctype:`PyArrayObject *` (any instance of the :cdata:`PyArray_Type` and its sub-types).
Data access¶
These functions and macros provide easy access to elements of the ndarray from C. These work for all arrays. You may need to take care when accessing the data in the array, however, if it is not in machine byte-order, misaligned, or not writeable. In other words, be sure to respect the state of the flags unless you know what you are doing, or have previously guaranteed an array that is writeable, aligned, and in machine byte-order using :cfunc:`PyArray_FromAny`. If you wish to handle all types of arrays, the copyswap function for each type is useful for handling misbehaved arrays. Some platforms (e.g. Solaris) do not like misaligned data and will crash if you de-reference a misaligned pointer. Other platforms (e.g. x86 Linux) will just work more slowly with misaligned data.
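The byte-order caveat is easiest to see from Python, where the same state the C macros report is exposed on the dtype and flags. A minimal sketch (the variable names are illustrative):

```python
import numpy as np

# A byte-swapped copy holds the same values but is not in machine byte
# order, so C code that dereferences its data naively would misread it.
a = np.arange(4, dtype=np.dtype('<u4'))        # explicit little-endian
swapped = a.astype(a.dtype.newbyteorder('>'))  # big-endian copy
native = a.dtype.isnative
swapped_native = swapped.dtype.isnative        # exactly one of these is True
```

NumPy's comparison machinery handles the swap transparently, which is why the values still compare equal even though the raw bytes differ.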
Creating arrays¶
From scratch¶
Warning
If data is passed to :cfunc:`PyArray_NewFromDescr` or :cfunc:`PyArray_New`, this memory must not be deallocated until the new array is deleted. If this data came from another Python object, this can be accomplished using :cfunc:`Py_INCREF` on that object and setting the base member of the new array to point to that object. If strides are passed in they must be consistent with the dimensions, the itemsize, and the data of the array.
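The Python-level counterpart of setting the base member after :cfunc:`Py_INCREF` can be seen with ``np.frombuffer``: an array that borrows memory from another object keeps that object alive and does not own its data. A small sketch:

```python
import numpy as np

buf = bytearray(b'\x01\x02\x03\x04')   # some external, writable buffer
a = np.frombuffer(buf, dtype=np.uint8)
owns_data = a.flags['OWNDATA']         # False: the memory belongs to buf
buf[0] = 9
first = int(a[0])                      # 9: no copy was made
```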
From other objects¶
Dealing with types¶
General check of Python Type¶
Data-type checking¶
For the typenum macros, the argument is an integer representing an enumerated array data type. For the array type checking macros the argument must be a :ctype:`PyObject *` that can be directly interpreted as a :ctype:`PyArrayObject *`.
Converting data types¶
New data types¶
Special functions for NPY_OBJECT¶
Array flags¶
The ``flags`` attribute of the ``PyArrayObject`` structure contains important information about the memory used by the array (pointed to by the data member). This flag information must be kept accurate, or strange results and even segfaults may result.

There are 6 (binary) flags that describe the memory area used by the data buffer. These constants are defined in ``arrayobject.h`` and determine the bit-position of the flag. Python exposes a nice attribute-based interface as well as a dictionary-like interface for getting (and, if appropriate, setting) these flags.
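Both of those Python interfaces can be sketched briefly:

```python
import numpy as np

a = np.zeros((2, 3))
by_key = a.flags['C_CONTIGUOUS']   # dictionary-like access
by_attr = a.flags.c_contiguous     # attribute-based access

a.flags.writeable = False          # setting a flag, where permitted
try:
    a[0, 0] = 1.0
    write_failed = False
except ValueError:
    write_failed = True            # writes now raise ValueError
```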
Memory areas of all kinds can be pointed to by an ndarray, necessitating these flags. If you get an arbitrary ``PyArrayObject`` in C-code, you need to be aware of the flags that are set. If you need to guarantee a certain kind of array (like :cdata:`NPY_ARRAY_C_CONTIGUOUS` and :cdata:`NPY_ARRAY_BEHAVED`), then pass these requirements into the :cfunc:`PyArray_FromAny` function.
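From Python, ``np.require`` plays the role of passing requirement flags to :cfunc:`PyArray_FromAny`: it returns an array guaranteed to satisfy the stated requirements, copying only when necessary. A minimal sketch:

```python
import numpy as np

f = np.asfortranarray(np.ones((3, 2)))   # Fortran-ordered input
c = np.require(f, requirements=['C_CONTIGUOUS', 'ALIGNED', 'WRITEABLE'])
# c is now guaranteed C-contiguous, aligned, and writeable.
```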
Basic Array Flags¶
An ndarray can have a data segment that is not a simple contiguous chunk of well-behaved memory you can manipulate. It may not be aligned with word boundaries (very important on some platforms). It might have its data in a different byte-order than the machine recognizes. It might not be writeable. It might be in Fortran-contiguous order. The array flags are used to indicate what can be said about data associated with an array.
In versions 1.6 and earlier of NumPy, the following flags did not have the _ARRAY_ macro namespace in them. That form of the constant names is deprecated in 1.7.
Note
Arrays can be both C-style and Fortran-style contiguous simultaneously. This is clear for 1-dimensional arrays, but can also be true for higher-dimensional arrays.

Even for contiguous arrays, a stride for a given dimension ``arr.strides[dim]`` may be arbitrary if ``arr.shape[dim] == 1`` or the array has no elements. It does not generally hold that ``self.strides[-1] == self.itemsize`` for C-style contiguous arrays or that ``self.strides[0] == self.itemsize`` for Fortran-style contiguous arrays. The correct way to access the itemsize of an array from the C API is ``PyArray_ITEMSIZE(arr)``.
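These points about contiguity and strides can be observed directly from Python (the C macro ``PyArray_ITEMSIZE`` corresponds to the ``itemsize`` attribute here):

```python
import numpy as np

# A 1-d contiguous array carries both contiguity flags at once.
a = np.arange(4, dtype=np.int32)
both_contiguous = a.flags['C_CONTIGUOUS'] and a.flags['F_CONTIGUOUS']
itemsize = a.itemsize              # 4 for int32, read from the dtype

# A length-1 leading dimension: the array is C-contiguous even though
# strides[0] is unconstrained by the data layout.
b = np.ones((1, 5))
b_contiguous = b.flags['C_CONTIGUOUS']
```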
See also
:cfunc:`PyArray_UpdateFlags` (obj, flags) will update the ``obj->flags`` for ``flags`` which can be any of :cdata:`NPY_ARRAY_C_CONTIGUOUS`, :cdata:`NPY_ARRAY_F_CONTIGUOUS`, :cdata:`NPY_ARRAY_ALIGNED`, or :cdata:`NPY_ARRAY_WRITEABLE`.
Combinations of array flags¶
Flag-like constants¶
These constants are used in :cfunc:`PyArray_FromAny` (and its macro forms) to specify desired properties of the new array.
Flag checking¶
For all of these macros arr must be an instance of a (subclass of) :cdata:`PyArray_Type`, but no checking is done.
Warning
It is important to keep the flags updated (using :cfunc:`PyArray_UpdateFlags` can help) whenever a manipulation with an array is performed that might cause them to change. Later calculations in NumPy that rely on the state of these flags do not repeat the calculation to update them.
Array method alternative API¶
Conversion¶
Shape Manipulation¶
Warning
matrix objects are always 2-dimensional. Therefore, :cfunc:`PyArray_Squeeze` has no effect on arrays of matrix sub-class.
Item selection and manipulation¶
Calculation¶
Tip
Pass in :cdata:`NPY_MAXDIMS` for axis in order to achieve the same effect that is obtained by passing in ``axis = None`` in Python (treating the array as a 1-d array).
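The equivalence is easy to check at the Python level:

```python
import numpy as np

# axis=None (NPY_MAXDIMS in C) reduces the array as if it were flat.
a = np.arange(6).reshape(2, 3)
total = a.sum(axis=None)       # 15, same as a.sum()
per_col = a.sum(axis=0)        # column sums: array([3, 5, 7])
```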
Note
The out argument specifies where to place the result. If out is NULL, then the output array is created; otherwise, the output is placed in out, which must be of the correct size and type. A new reference to the output array is always returned even when out is not NULL. The caller of the routine has the responsibility to DECREF out if not NULL or a memory-leak will occur.
Note
The rtype argument specifies the data-type the reduction should take place over. This is important if the data-type of the array is not “large” enough to handle the output. By default, all integer data-types are made at least as large as :cdata:`NPY_LONG` for the “add” and “multiply” ufuncs (which form the basis for mean, sum, cumsum, prod, and cumprod functions).
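The C-level rtype corresponds to the ``dtype`` argument of the reduction methods in Python, where the default widening is visible:

```python
import numpy as np

a = np.array([200, 200], dtype=np.uint8)
wide = a.sum()                  # 400: accumulated in at least a long
narrow = a.sum(dtype=np.uint8)  # 144: forcing uint8 wraps 400 mod 256
```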
Functions¶
Array Functions¶
Note
The simulation of a C-style array is not complete for 2-d and 3-d arrays. For example, the simulated arrays of pointers cannot be passed to subroutines expecting specific, statically-defined 2-d and 3-d arrays. To pass to functions requiring those kinds of inputs, you must statically define the required array and copy data.
Other functions¶
Auxiliary Data With Object Semantics¶
New in version 1.7.0.
When working with more complex dtypes which are composed of other dtypes, such as the struct dtype, creating inner loops that manipulate the dtypes requires carrying along additional data. NumPy supports this idea through a struct :ctype:`NpyAuxData`, mandating a few conventions so that it is possible to do this.
Defining an :ctype:`NpyAuxData` is similar to defining a class in C++, but the object semantics have to be tracked manually since the API is in C. Here's an example for a function which doubles up an element using an element copier function as a primitive::

    typedef struct {
        NpyAuxData base;
        ElementCopier_Func *func;
        NpyAuxData *funcdata;
    } eldoubler_aux_data;

    void free_element_doubler_aux_data(NpyAuxData *data)
    {
        eldoubler_aux_data *d = (eldoubler_aux_data *)data;
        /* Free the memory owned by this auxdata */
        NPY_AUXDATA_FREE(d->funcdata);
        PyArray_free(d);
    }

    NpyAuxData *clone_element_doubler_aux_data(NpyAuxData *data)
    {
        eldoubler_aux_data *ret = PyArray_malloc(sizeof(eldoubler_aux_data));
        if (ret == NULL) {
            return NULL;
        }

        /* Raw copy of all data */
        memcpy(ret, data, sizeof(eldoubler_aux_data));

        /* Fix up the owned auxdata so we have our own copy */
        ret->funcdata = NPY_AUXDATA_CLONE(ret->funcdata);
        if (ret->funcdata == NULL) {
            PyArray_free(ret);
            return NULL;
        }

        return (NpyAuxData *)ret;
    }

    NpyAuxData *create_element_doubler_aux_data(
                    ElementCopier_Func *func,
                    NpyAuxData *funcdata)
    {
        eldoubler_aux_data *ret = PyArray_malloc(sizeof(eldoubler_aux_data));
        if (ret == NULL) {
            PyErr_NoMemory();
            return NULL;
        }
        memset(ret, 0, sizeof(eldoubler_aux_data));
        ret->base.free = &free_element_doubler_aux_data;
        ret->base.clone = &clone_element_doubler_aux_data;
        ret->func = func;
        ret->funcdata = funcdata;

        return (NpyAuxData *)ret;
    }
Array Iterators¶
As of NumPy 1.6, these array iterators are superseded by the new array iterator, :ctype:`NpyIter`.
An array iterator is a simple way to access the elements of an N-dimensional array quickly and efficiently. Section 2 provides more description and examples of this useful approach to looping over an array.
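The new iterator is exposed to Python as ``np.nditer``, which gives a quick feel for the element-by-element access the C iterators provide:

```python
import numpy as np

# Walk every element of a 2-d array in a single flat loop.
a = np.arange(6).reshape(2, 3)
total = 0
for x in np.nditer(a):
    total += int(x)
# total is now 15, the sum of all elements
```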
Broadcasting (multi-iterators)¶
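The multi-iterator is exposed at the Python level as ``np.broadcast``, which iterates over several arrays broadcast to a common shape. A minimal sketch:

```python
import numpy as np

# Shapes (3,) and (2, 1) broadcast to the common shape (2, 3).
b = np.broadcast(np.arange(3), np.arange(2)[:, None])
shape = b.shape            # (2, 3)
pairs = list(b)            # 6 (x, y) element tuples in C order
```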
Neighborhood iterator¶
New in version 1.4.0.
Neighborhood iterators are subclasses of the iterator object, and can be used to iterate over a neighborhood of a point. For example, you may want to iterate over every voxel of a 3d image, and for every such voxel, iterate over a hypercube. Neighborhood iterators automatically handle boundaries, thus making this kind of code much easier to write than manual boundary handling, at the cost of a slight overhead.
Array Scalars¶
Data-type descriptors¶
Warning
Data-type objects must be reference counted so be aware of the action on the data-type reference of different C-API calls. The standard rule is that when a data-type object is returned it is a new reference. Functions that take :ctype:`PyArray_Descr *` objects and return arrays steal references to the data-type of their inputs unless otherwise noted. Therefore, you must own a reference to any data-type object used as input to such a function.
Conversion Utilities¶
For use with :cfunc:`PyArg_ParseTuple`¶
All of these functions can be used in :cfunc:`PyArg_ParseTuple` (...) with the "O&" format specifier to automatically convert any Python object to the required C-object. All of these functions return :cdata:`NPY_SUCCEED` if successful and :cdata:`NPY_FAIL` if not. The first argument to all of these functions is a Python object. The second argument is the address of the C-type to convert the Python object to.
Warning
Be sure to understand what steps you should take to manage the memory when using these conversion functions. These functions can require freeing memory, and/or altering the reference counts of specific objects based on your use.
Other conversions¶
Miscellaneous¶
Importing the API¶
In order to make use of the C-API from another extension module, the ``import_array`` () command must be used. If the extension module is self-contained in a single .c file, then that is all that needs to be done. If, however, the extension module involves multiple files where the C-API is needed then some additional steps must be taken.
Checking the API Version¶
Because Python extensions are not used in the same way as usual libraries on most platforms, some errors cannot be automatically detected at build time or even runtime. For example, if you build an extension using a function available only for numpy >= 1.3.0, and you import the extension later with numpy 1.2, you will not get an import error (but almost certainly a segmentation fault when calling the function). That's why several functions are provided to check for numpy versions. The macros :cdata:`NPY_VERSION` and :cdata:`NPY_FEATURE_VERSION` correspond to the numpy version used to build the extension, whereas the versions returned by the functions PyArray_GetNDArrayCVersion and PyArray_GetNDArrayCFeatureVersion correspond to the runtime numpy's version.
The rules for ABI and API compatibilities can be summarized as follows:
- Whenever :cdata:`NPY_VERSION` != PyArray_GetNDArrayCVersion, the extension has to be recompiled (ABI incompatibility).
- :cdata:`NPY_VERSION` == PyArray_GetNDArrayCVersion and :cdata:`NPY_FEATURE_VERSION` <= PyArray_GetNDArrayCFeatureVersion means backward compatible changes.
ABI incompatibility is automatically detected in every numpy version. API incompatibility detection was added in numpy 1.4.0. If you want to support many different numpy versions with one extension binary, you have to build your extension with the lowest :cdata:`NPY_FEATURE_VERSION` possible.
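A hedged sketch of the same idea from Python: the running NumPy version is available as a string, so a wrapper can refuse to operate against a NumPy that is too old. The minimum version chosen here is illustrative, and the parsing is deliberately defensive about suffixes like release candidates:

```python
import numpy as np

required = (1, 3, 0)   # illustrative minimum, matching the example above
parts = np.__version__.split('.')[:3]
running = tuple(int(''.join(ch for ch in p if ch.isdigit()) or 0)
                for p in parts)
new_enough = running >= required
```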
Internal Flexibility¶
Memory management¶
Threading support¶
These macros are only meaningful if :cdata:`NPY_ALLOW_THREADS` evaluates True during compilation of the extension module. Otherwise, these macros are equivalent to whitespace. Python uses a single Global Interpreter Lock (GIL) for each Python process so that only a single thread may execute at a time (even on multi-cpu machines). When calling out to a compiled function that may take time to compute (and does not have side-effects for other threads like updated global variables), the GIL should be released so that other Python threads can run while the time-consuming calculations are performed. This can be accomplished using two groups of macros. Typically, if one macro in a group is used in a code block, all of them must be used in the same code block. Currently, :cdata:`NPY_ALLOW_THREADS` is defined to the python-defined :cdata:`WITH_THREADS` constant unless the environment variable :cdata:`NPY_NOSMP` is set in which case :cdata:`NPY_ALLOW_THREADS` is defined to be 0.
Group 1¶
This group is used to call code that may take some time but does not use any Python C-API calls. Thus, the GIL should be released during its calculation.
Group 2¶
This group is used to re-acquire the Python GIL after it has been released. For example, suppose the GIL has been released (using the previous calls), and then some path in the code (perhaps in a different subroutine) requires use of the Python C-API, then these macros are useful to acquire the GIL. These macros accomplish essentially a reverse of the previous three (acquire the LOCK saving what state it had) and then re-release it with the saved state.
Tip
Never use semicolons after the threading support macros.