Cython memoryview transpose: TypeError

2016-05-12T06:07:10

I'm trying to develop a small Convolutional Neural Network framework in Python. The code for the convolutional node already works (slowly) and I would like to speed it up. The hotspots are the loops where the convolutional filter is moved across the image, so I chose Cython to speed those loops up.

The obvious small annotations (cdef for all local variables, disabling boundscheck) shaved barely 10% off my runtime. That seemed strange to me: from what I read online, Cython should already be able to work its magic on loops like these.
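To illustrate what I mean by annotations, here is a toy version of such a filter loop. This is not my actual code, just the pattern of typing all locals and disabling bounds checks:

    import numpy as np
    cimport cython

    @cython.boundscheck(False)
    @cython.wraparound(False)
    def conv2d_valid(double[:, :] img, double[:, :] filt):
        # every local typed with cdef, bounds checks disabled
        cdef int oh = img.shape[0] - filt.shape[0] + 1
        cdef int ow = img.shape[1] - filt.shape[1] + 1
        cdef int i, j, u, v
        cdef double acc
        out = np.zeros((oh, ow))
        cdef double[:, :] out_view = out
        for i in range(oh):
            for j in range(ow):
                acc = 0.0
                for u in range(filt.shape[0]):
                    for v in range(filt.shape[1]):
                        acc += img[i + u, j + v] * filt[u, v]
                out_view[i, j] = acc
        return out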

Unfortunately the code lives inside a class and relies heavily on the attributes of that class, so I decided to convert it into a cdef class. This means that all class attributes have to be declared with cdef. Apparently Cython doesn't support buffer-typed numpy arrays as attributes of a cdef class, so I declared all the numpy arrays as typed memoryviews, double[:,:,...].
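The attribute declarations now look roughly like this (the names are taken from the forward method below; the exact shapes of W and b are guesses on my part):

    cdef class ConvNode:
        # typed memoryviews instead of np.ndarray attributes
        cdef double[:, :, :, :] x, y
        cdef double[:, :] W, b, in_cols
        cdef int batch_size, in_colors, in_width, in_height
        cdef int out_colors, out_width, out_height
        cdef int filter_size, stride, padding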

So far the code had worked fine, with all unit tests passing. The compilation to a .pyd (I'm working under Windows) still works, but running the code now raises a TypeError:

    TypeError: only length-1 arrays can be converted to Python scalars
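For what it's worth, that is the message NumPy produces whenever a whole array is forced into a scalar context, for example:

    >>> import numpy as np
    >>> float(np.array([1.0, 2.0]))
    TypeError: only length-1 arrays can be converted to Python scalars

So apparently something in my code tries to convert an entire array into a single value, but I don't see where.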

Here is some code. This is the entire forward method of my convolutional node, which might be too much and not easily readable. You probably only need the very last line; that's where the error happens:

    @cython.boundscheck(False)
    @cython.nonecheck(False)
    def forward(self):

        # im2col: x -> in_cols
        # padding
        cdef np.ndarray[DTYPE_t, ndim=4] x_padded = np.zeros((self.batch_size, self.in_colors, self.in_width + self.padding*2, self.in_height + self.padding*2))
        if self.padding > 0:
            x_padded[:, :, self.padding:self.in_width+self.padding, self.padding:self.in_height+self.padding] = self.x
        else:
            x_padded[:] = self.x

        # allocating new field
        cdef np.ndarray[DTYPE_t, ndim=4] rec_fields = np.empty((self.filter_size**2 * self.in_colors, self.batch_size, self.out_width, self.out_height))

        # copying receptive fields
        cdef int w, h
        for w, h in np.ndindex((self.out_width, self.out_height)):
            rec_fields[:, :, w, h] = x_padded[:, :, w*self.stride:w*self.stride + self.filter_size, h*self.stride:h*self.stride + self.filter_size] \
                .reshape((self.batch_size, self.filter_size**2 * self.in_colors)) \
                .T

        self.in_cols = rec_fields.reshape((self.filter_size**2 * self.in_colors, self.batch_size * self.out_width * self.out_height))

        # linear node: in_cols -> out_cols
        cdef np.ndarray[DTYPE_t, ndim=2] out_cols = np.dot(self.W, self.in_cols) + self.b

        # col2im: out_cols -> out_image -> y
        cdef np.ndarray[DTYPE_t, ndim=4] out_image = out_cols.reshape((self.out_colors, self.batch_size, self.out_width, self.out_height))
        self.y[:] = out_image.transpose(1, 0, 2, 3)

This last call to transpose is the line marked in the exception, and I can't explain why. Do memoryviews behave differently when they are transposed?
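Stripped of everything else, the failing pattern boils down to something like this (class and shapes invented for illustration; I haven't isolated it into a standalone test case yet):

    import numpy as np
    cimport numpy as np

    ctypedef np.float64_t DTYPE_t

    cdef class Node:
        cdef double[:, :, :, :] y

        def __init__(self):
            self.y = np.zeros((2, 3, 4, 5))

        def forward(self):
            cdef np.ndarray[DTYPE_t, ndim=4] out_image = np.zeros((3, 2, 4, 5))
            # write a transposed ndarray into the memoryview, as above
            self.y[:] = out_image.transpose(1, 0, 2, 3)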

UPDATE:

I'm sure that the dimensions are defined correctly. A dimension mismatch produces a different runtime error; I can't check right now, but it was something like "got 4-dim, expected 2-dim". I have to say that I'm extremely impressed by Cython's type system: this kind of runtime type information in a Python exception is rather useful. Sadly, it doesn't explain why the transpose above fails.

UPDATE:

There's a complication with the arrays: they must never be replaced (rebound), only written to in place, because other nodes hold references to them.

This is a little difficult to explain. At the core of the neural network is a loop which calls the method forward() on all nodes in the network consecutively:

    for node in self.nodes:
        node.forward()

In this method, a node looks at its input data, does some computation, and writes to its output. It relies on the fact that the input already contains the correct data.

For the setup of my network, I store the nodes in the right order and connect them manually:

    node2.x = node1.y

Now if I write

    self.y[:] = data

in the forward method of node1, node2 automatically has the correct input. This requires careful programming: the forward methods must be called in the right order, and the output must never be rebound, only written into.
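In plain NumPy terms, the sharing mechanism works like this:

    import numpy as np

    out = np.zeros(3)          # node1.y
    inp = out                  # node2.x = node1.y: same buffer, no copy
    out[:] = [1.0, 2.0, 3.0]   # in-place write: inp sees the new values
    assert inp[1] == 2.0
    out = np.ones(3)           # rebinding instead would break the link,
    assert inp[1] == 2.0       # inp still references the old buffer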

The alternative would be a huge structure in which I store the output of each node and pass this data around. That would create lots of boilerplate code and mess up the forward and backward passes.

UPDATE:

The last few lines in forward now look like this:

    cdef np.ndarray[DTYPE_t, ndim=4] out_image = out_cols.reshape((self.out_colors, self.batch_size, self.out_width, self.out_height))
    cdef double[:,:,:,:] temp
    temp = out_image.transpose(1, 0, 2, 3)
    self.y[...] = temp

The assignment to temp fails with the same TypeError message.
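To narrow this down, checking the intermediate array right before the failing line should at least confirm its dtype, contiguity, and shape (untested, but something like this):

    print(type(out_image), out_image.dtype, out_image.flags['C_CONTIGUOUS'])
    print(out_image.transpose(1, 0, 2, 3).shape, np.asarray(self.y).shape)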

Copyright license:
Author: lhk. Reproduced under the CC BY-SA 4.0 license, with link to the original source:
https://stackoverflow.com/questions/37174074/cython-memoryview-transpose-typeerror
