Let's take a look at a similar example. Create two multiarrays as follows:
>>> z
array([1, 2])
>>> z.shape
(2,)
>>> w
array([3, 4, 5])
>>> w.shape
(3,)
By all the rules we know so far, if we were to add these two objects, an exception would be raised. To accomplish the operation and kick in broadcasting, the w multiarray needs to be altered. We can change the shape of w with the reshape function to generate the v multiarray:
>>> v=reshape(w,(3,1))
>>> v
array([[3],
       [4],
       [5]])
>>> v.shape
(3, 1)
Now the operation can proceed with the aid of broadcasting.
To support operations of this type without relying on the reshape function, which can clutter up the code, NumPy's implementers invented NewAxis.
NewAxis is a pseudo-index that allows the temporary addition of an axis into a multiarray. If you were to try the previous example as a single line of code, you might perform the following:
>>> z = array([1,2])
>>> w = array([3,4,5])
>>> z+reshape(w,(3,1))
array([[4, 5],
       [5, 6],
       [6, 7]])
With NewAxis, the reshape function need not be called and the code is a bit more streamlined:
>>> z+w[:,NewAxis]
array([[4, 5],
       [5, 6],
       [6, 7]])
These two examples do the same thing but differ in implementation. The first reshapes the array to the desired layout; the second uses the slicing operator to recreate the array while adding in an axis using NewAxis. For more examples of slicing, refer to last month's article.
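For readers on a current NumPy release, the two approaches can be compared side by side. This is only a sketch: it assumes the modern `numpy` package, where NewAxis is spelled `np.newaxis`, and the names `via_reshape` and `via_newaxis` are purely illustrative.

```python
import numpy as np

z = np.array([1, 2])
w = np.array([3, 4, 5])

# Approach 1: reshape w into a (3, 1) column, then add.
via_reshape = z + np.reshape(w, (3, 1))

# Approach 2: insert a temporary axis while slicing.
# np.newaxis is assumed here as the modern spelling of NewAxis.
via_newaxis = z + w[:, np.newaxis]

print(via_reshape.shape)                          # (3, 2)
print(np.array_equal(via_reshape, via_newaxis))   # True
```

Both expressions broadcast a (2,) multiarray against a (3, 1) multiarray, so the results are identical.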
Here is a more interesting example of slicing with NewAxis:
>>> a=zeros((3,4,5,6))
>>> b=zeros((4,6))
>>> c=a+b[:,NewAxis,:]
In this case we inserted a temporary axis (supporting broadcasting) between the first and second axes of b. Note that since the resulting b multiarray was only of rank three and a of rank four, broadcasting also occurred at the left-most index!
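Printing the intermediate shapes makes the two broadcasting steps easier to follow. The sketch below assumes the modern `numpy` package and its `np.newaxis` spelling; the name `b3` is just an illustrative temporary.

```python
import numpy as np

a = np.zeros((3, 4, 5, 6))
b = np.zeros((4, 6))

# Insert a temporary axis between b's first and second axes.
b3 = b[:, np.newaxis, :]

# b3 has rank three, so it is additionally treated as if it had a
# leading axis of length 1 when combined with the rank-four a.
c = a + b3

print(b3.shape)   # (4, 1, 6)
print(c.shape)    # (3, 4, 5, 6)
```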
Do you see the pattern in the examples? Broadcasting happens when the shapes of the two multiarrays in question do not match, most obviously when their ranks are not equal. When this happens, a set of rules comes into play whereby the axes are compared for length, starting from the right-most axis, and adjustments are made. A missing axis can be filled in to make the operation work; an axis of length 1 can also be stretched to match the other multiarray.
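The two rules can be written out in a few lines of plain Python. The function `broadcast_shape` below is a hypothetical helper written for this illustration, not part of NumPy; it only computes the shape a broadcast operation would produce, or raises an exception when the shapes cannot be reconciled.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the shape produced by broadcasting two shapes together.

    Hypothetical helper for illustration only; not a NumPy function.
    """
    # Rule 1: a missing axis is filled in on the left with length 1.
    rank = max(len(shape_a), len(shape_b))
    padded_a = (1,) * (rank - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (rank - len(shape_b)) + tuple(shape_b)

    result = []
    for len_a, len_b in zip(padded_a, padded_b):
        if len_a == len_b:
            result.append(len_a)
        elif len_a == 1:
            # Rule 2: an axis of length 1 is stretched to match.
            result.append(len_b)
        elif len_b == 1:
            result.append(len_a)
        else:
            raise ValueError(
                "shapes %r and %r cannot be broadcast" % (shape_a, shape_b)
            )
    return tuple(result)


print(broadcast_shape((2,), (3, 1)))             # (3, 2)
print(broadcast_shape((3, 4, 5, 6), (4, 1, 6)))  # (3, 4, 5, 6)
```

Calling `broadcast_shape((2,), (3,))` raises `ValueError`, which mirrors the exception raised when z and w are added without reshaping.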
So what good is broadcasting?
Broadcasting is so inherent in NumPy operations that living without it would be hard. Consider adding 1 to each value in a multiarray, or scaling all the values by 2. Both of these operations rely on broadcasting to succeed. It is hard (if not impossible) to imagine how to accomplish these operations without broadcasting.
Consider creating a rank 3 multiarray with each element set to 5 with broadcasting:
>>> a = ones((2,3,4)) * 5
Without broadcasting this becomes UGLY!
>>> a = ones((2,3,4))
>>> tmp = a.shape
>>> for i in range(tmp[0]):
...     for j in range(tmp[1]):
...         for k in range(tmp[2]):
...             a[i,j,k] = a[i,j,k] * 5
...
In more complex cases, having to multiply each row of a matrix by a vector without broadcasting would require that the vector be replicated first into a matrix; then the operation could be performed. With broadcasting, the replication stage can be avoided. In cases where large multiarrays are in play, this can be a significant memory- and time-saving feature.
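The row-by-vector case can be sketched as follows, assuming the modern `numpy` package. The names `matrix`, `vector`, `replicated`, `explicit`, and `broadcasted` are illustrative, and `np.tile` stands in for whatever replication step you would otherwise write by hand.

```python
import numpy as np

matrix = np.arange(12).reshape(3, 4)    # 3 rows, 4 columns
vector = np.array([1, 10, 100, 1000])   # one scale factor per column

# Without broadcasting: replicate the vector into a full (3, 4)
# matrix first, then multiply element by element.
replicated = np.tile(vector, (3, 1))
explicit = matrix * replicated

# With broadcasting: the (4,) vector is matched against every row,
# so the (3, 4) copy never has to be built by hand.
broadcasted = matrix * vector

print(np.array_equal(explicit, broadcasted))   # True
```

For a 3 x 4 matrix the replicated copy is trivial, but its size grows with the matrix, which is where the memory and time savings come from.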
Bottom line: broadcasting allows the programmer to avoid the step of creating intermediate, dimension-matched arrays. But be careful with your knowledge. The next time you use a complex form of broadcasting, help yourself and those who follow by giving them an insight into your thinking and the operation at hand. Yes, comments are helpful!
This month we looked at examples of the broadcasting supported by NumPy. The broadcasting rules provide ways for multiarrays to interact when their ranks are not equal. The rules can be a bit confusing, but exploiting them will help you make full use of the capabilities of NumPy.
Next month we will put together a larger scale application, bringing into play many of the features we have learned in this series of articles. I'll show how you can integrate NumPy with the multimedia capabilities of your computer to generate a hands-on application that employs NumPy (including broadcasting ;) ), the FFT module, and the DISLIN plotting package! See you next month!
Eric Hagemann specializes in fast algorithms for crunching numbers on all varieties of computers from embedded to mainframe.