NumPy: ndarray.flatten() function | NumPy.matrix.flatten

Introduction

Scientific computing in Python can be complicated, but NumPy offers a solution. NumPy is a powerful library that adds a new data structure, the n-dimensional array or ndarray, which lets you work with multi-dimensional arrays in Python. In addition, NumPy provides a wide variety of functions that perform fast operations on these ndarrays.

In this article, we will explain what "flatten" means and how you can flatten ndarrays or matrices with the flatten method in NumPy.

What are ndarrays in NumPy?

Let's first practice how you can create ndarrays with NumPy. First, you have to import the numpy library:

>>> import numpy as np

Note that numpy is usually imported under the alias np for convenience.

NumPy provides various functions to quickly build large multi-dimensional arrays. However, the simplest way to create an ndarray is to pass a Python list to the np.array() function.

>>> myNdArray = np.array([1, 2, 3])
>>> myNdArray
array([1, 2, 3])
>>> type(myNdArray)
<class 'numpy.ndarray'>
>>> 

Here, you can see that myNdArray is of the type numpy.ndarray. And you can create multi-dimensional arrays as well:

>>> twoDArray = np.array([[1, 2, 3],
...                       [4, 5, 6]])
>>> 
>>> twoDArray
array([[1, 2, 3],
       [4, 5, 6]])
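
As a small sketch of the other array-building functions mentioned earlier, the snippet below creates 2-dimensional arrays without typing out every element. The variable names are ours, chosen for illustration; np.zeros, np.arange, and reshape are standard NumPy calls.

```python
import numpy as np

# A 2x3 array filled with zeros (floats by default).
zerosArray = np.zeros((2, 3))

# The integers 1 through 6, reshaped into 2 rows and 3 columns.
countedArray = np.arange(1, 7).reshape(2, 3)

print(zerosArray.shape)       # (2, 3)
print(countedArray.ndim)      # 2
print(countedArray.tolist())  # [[1, 2, 3], [4, 5, 6]]
```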

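Note that the innermost lists must all be the same length. If they are not, older NumPy versions (such as the one used in the session below) emit a VisibleDeprecationWarning, while NumPy 1.24 and later raise a ValueError instead: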

>>> twoDArray = np.array([[1, 2, 3, 4],
...                       [5, 6, 7]])
<stdin>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
>>> twoDArray
array([list([1, 2, 3, 4]), list([5, 6, 7])], dtype=object)

The result is not a true 2-dimensional array: it is a 1-dimensional ndarray whose elements are Python lists, so most NumPy features will not work on it as expected.
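
If you really do need rows of different lengths, one option is to request an object array explicitly, as the warning message suggests. The following is a minimal sketch (the name raggedArray is ours). It shows that the result has only one dimension, so flatten() cannot reach inside the lists.

```python
import numpy as np

# dtype=object tells NumPy that storing the lists themselves is intentional.
raggedArray = np.array([[1, 2, 3, 4], [5, 6, 7]], dtype=object)

print(raggedArray.shape)           # (2,)
print(raggedArray.dtype)           # object
print(len(raggedArray.flatten()))  # 2 (still two list elements, not seven numbers)
```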

What is flatten in numpy?

To flatten an array in numpy means to convert a multi-dimensional array into a 1-dimensional array. For example, a 2-dimensional array such as:

[[1, 2, 3], [4, 5, 6]]

would be flattened to:

[1, 2, 3, 4, 5, 6]

In NumPy, the flatten() method flattens an ndarray and returns a flattened copy of the original array. Note that this method is only available on ndarrays, not on regular Python lists. Let's look at an example:

>>> myArray =  [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> myNdArray = np.array(myArray)
>>> flattenArray = myNdArray.flatten()
>>> flattenArray
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> type(flattenArray)
<class 'numpy.ndarray'>
>>> myNdArray
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Here, we start with the list myArray, which is used to create the ndarray myNdArray. The flatten() method then flattens myNdArray, and its output is stored in flattenArray. You can see that the flattened array is also of type ndarray.

Notice that the method did not affect myNdArray; it returned a new, flattened copy of the array. If we try to use flatten() on a regular list, an error is raised:

>>> myArray.flatten()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'flatten'
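
If your data starts out as a nested list, one approach is to convert it to an ndarray, flatten it, and convert back with tolist() when a plain list is needed. A short sketch, with variable names chosen for illustration:

```python
import numpy as np

nestedList = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# list -> ndarray -> flattened ndarray -> list
flatList = np.array(nestedList).flatten().tolist()

print(flatList)        # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(type(flatList))  # <class 'list'>
```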

Of course, flattening a 1-dimensional array returns an array with the same contents:

>>> oneDArray = np.array([1, 2, 3, 4])
>>> oneDArray.flatten()
array([1, 2, 3, 4])
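
Even in the 1-dimensional case, the result is a separate copy rather than the same object. The sketch below checks this with np.shares_memory; the name flatCopy is ours.

```python
import numpy as np

oneDArray = np.array([1, 2, 3, 4])
flatCopy = oneDArray.flatten()

print(np.array_equal(flatCopy, oneDArray))    # True: same contents
print(np.shares_memory(flatCopy, oneDArray))  # False: separate memory

# Changing the copy leaves the original untouched.
flatCopy[0] = 99
print(oneDArray.tolist())  # [1, 2, 3, 4]
```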

The flatten() method has one optional parameter "order", which you will look at in the next section.

Various ways to flatten an array using the order parameter

The "order" parameter accepts one of four values: 'C', 'F', 'A', or 'K'. Each option determines how the items of the resulting flattened array are ordered. Different orders can be useful for different types of algorithms or routines. Here, you will see examples for two of the options: 'C' and 'F'.

'C' is the default option, used when no argument is passed to the flatten() method. It flattens the array in row-major order, which means it first adds the items of the first row, in order, to the resulting array, followed by the next row, and so on. You can see this in the 2-dimensional example in the previous section.

>>> threeDArray = np.array([[[1, 2, 3], 
...                          [4, 5, 6], 
...                          [6, 7, 8]],
...                         [[1, 2, 3], 
...                          [4, 5, 6], 
...                          [6, 7, 8]]])
>>> threeDArray.flatten('C')
array([1, 2, 3, 4, 5, 6, 6, 7, 8, 1, 2, 3, 4, 5, 6, 6, 7, 8])

In the 3-dimensional array above, you can see that this still applies: the items are added from left to right, row by row, one 2D block after another.
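
One way to picture row-major order is that the last index changes fastest. The sketch below rebuilds the 'C' result by visiting every index tuple with np.ndindex, which iterates in exactly that way. The example array and the name manualOrder are ours.

```python
import numpy as np

threeDArray = np.arange(12).reshape(2, 2, 3)

# np.ndindex yields (0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), ...
# so the last index moves fastest, as in row-major order.
manualOrder = [int(threeDArray[idx]) for idx in np.ndindex(threeDArray.shape)]

print(manualOrder == threeDArray.flatten('C').tolist())  # True
```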

The 'F' option flattens the array in column-major order, which means it adds the items from top to bottom, column-by-column. For example:

>>> twoDarray = np.array([[1, 2, 3], 
...                       [4, 5, 6], 
...                       [7, 8, 9]])
>>> 
>>> twoDarray.flatten('F')
array([1, 4, 7, 2, 5, 8, 3, 6, 9])

For 3-dimensional arrays, the result can look surprising at first. In column-major order the first index changes fastest, so flatten() takes the top item of the first column from each 2D block in turn, then the next item down the first column from each block, and so on. Only after the first column is finished does it move to the second column.

>>> threeDArray = np.array([[[1, 2, 3], 
...                          [4, 5, 6], 
...                          [6, 7, 8]],
...                         [[1, 2, 3], 
...                          [4, 5, 6], 
...                          [6, 7, 8]]])
>>> threeDArray.flatten('F')
array([1, 1, 4, 4, 6, 6, 2, 2, 5, 5, 7, 7, 3, 3, 6, 6, 8, 8])
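
A useful way to reason about 'F' order is that it matches the 'C' order of the transposed array, because transposing reverses the axes. The sketch below checks this on an example array of our own choosing.

```python
import numpy as np

exampleArray = np.arange(24).reshape(2, 3, 4)

columnMajor = exampleArray.flatten('F')
viaTranspose = exampleArray.T.flatten('C')

print(np.array_equal(columnMajor, viaTranspose))  # True

# The first index moves fastest: [0,0,0], [1,0,0], [0,1,0], [1,1,0], ...
print(columnMajor[:4].tolist())  # [0, 12, 4, 16]
```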

The 'K' option orders the items in the order in which they occur in memory.

The 'A' option flattens in column-major order if the array is Fortran-contiguous in memory (to use the wording of the NumPy docs), and in row-major order otherwise.
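
The difference between these options only shows up when the memory layout differs from the logical layout. A transposed array is a convenient test case, because it is a column-major (Fortran-contiguous) view of the same memory. The following sketch uses names of our own choosing:

```python
import numpy as np

rowMajor = np.arange(6).reshape(2, 3)  # C-contiguous, memory holds 0..5
transposed = rowMajor.T                # shape (3, 2), F-contiguous view

print(transposed.flags['F_CONTIGUOUS'])  # True

print(transposed.flatten('C').tolist())  # [0, 3, 1, 4, 2, 5]  row by row
print(transposed.flatten('A').tolist())  # [0, 1, 2, 3, 4, 5]  F order, since F-contiguous
print(transposed.flatten('K').tolist())  # [0, 1, 2, 3, 4, 5]  memory order

# For a C-contiguous array, 'A' falls back to row-major order.
print(rowMajor.flatten('A').tolist())    # [0, 1, 2, 3, 4, 5]
```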

How does flatten work on a matrix?

The main difference between a matrix and an ndarray in NumPy is that matrices are strictly 2-dimensional, while an n-dimensional array can be of varying dimensions, including 2-dimensional. If you try to build a 3D array with the np.matrix() function you will get an error:

>>> threeDArray = np.matrix([[[1, 2, 3], 
...                           [4, 5, 6], 
...                           [6, 7, 8]],
...                          [[1, 2, 3], 
...                           [4, 5, 6], 
...                           [6, 7, 8]]])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/gauravk/opt/anaconda3/lib/python3.9/site-packages/numpy/matrixlib/defmatrix.py", line 149, in __new__
    raise ValueError("matrix must be 2-dimensional")
ValueError: matrix must be 2-dimensional

ndarray and matrix flatten in almost the same way. The major difference is that a matrix must stay 2-dimensional, so matrix.flatten() returns a matrix with a single row (shape (1, N)) instead of a 1-dimensional array. For example:

>>> twoDArray = np.array([[1, 2, 3],
...                       [4, 5, 6]])
>>> twoDMatrix = np.matrix([[1, 2, 3],
...                         [4, 5, 6]])
>>> 
>>> twoDArray
array([[1, 2, 3],
       [4, 5, 6]])
>>> twoDMatrix
matrix([[1, 2, 3],
        [4, 5, 6]])
>>>
>>> type(twoDArray)
<class 'numpy.ndarray'>
>>> type(twoDMatrix)
<class 'numpy.matrix'>
>>>
>>> twoDArray.flatten()
array([1, 2, 3, 4, 5, 6])
>>> twoDMatrix.flatten()
matrix([[1, 2, 3, 4, 5, 6]])
>>> 

Here, you can see how the outputs of the two differ slightly.
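
If you need a truly 1-dimensional result from a matrix, two options are to convert it to an ndarray first or to use the matrix's A1 attribute. A brief sketch, with variable names of our own:

```python
import numpy as np

twoDMatrix = np.matrix([[1, 2, 3],
                        [4, 5, 6]])

# flatten() on a matrix keeps two dimensions: one row, six columns.
print(twoDMatrix.flatten().shape)  # (1, 6)

# Option 1: convert to a plain ndarray, then flatten.
asArray = np.asarray(twoDMatrix).flatten()
# Option 2: the A1 attribute returns the matrix as a flattened ndarray.
viaA1 = twoDMatrix.A1

print(asArray.shape)   # (6,)
print(viaA1.tolist())  # [1, 2, 3, 4, 5, 6]
```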

NumPy flatten vs ravel vs flat

The flatten method, ravel function, and attribute flat can all be used to flatten an ndarray in NumPy, but in different use cases.

The flat attribute returns a numpy.flatiter over the array, which visits the items as if the array were 1-dimensional. It behaves like a regular Python iterator, so you can loop over it or call next() on it, and unlike most iterators it also supports indexing.

>>> twoDArray
array([[1, 2, 3],
       [4, 5, 6]])
>>> twoDArray.flat
<numpy.flatiter object at 0x7fbcb9016600>
>>> twoDArray.flat[1]
2
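
The sketch below shows a few more things you can do with flat: stepping with next(), looping, and assigning through an index. Note that assigning through flat writes into the original array, since no copy is made. Variable names are ours.

```python
import numpy as np

twoDArray = np.array([[1, 2, 3],
                      [4, 5, 6]])

# Step through the items one at a time.
flatIter = twoDArray.flat
first = next(flatIter)
second = next(flatIter)
print(first, second)  # 1 2

# Each access to .flat gives a fresh iterator, so this loop sees every item.
allItems = [int(item) for item in twoDArray.flat]
print(allItems)  # [1, 2, 3, 4, 5, 6]

# Assignment through .flat modifies the original array in place.
twoDArray.flat[0] = 99
print(twoDArray.tolist())  # [[99, 2, 3], [4, 5, 6]]
```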

The flatten method and the ravel function might seem the same at first but are different.

>>> twoDArray
array([[1, 2, 3],
       [4, 5, 6]])
>>> twoDArray.flatten()
array([1, 2, 3, 4, 5, 6])
>>> arrayFlatten = twoDArray.flatten()
>>> arrayFlatten[0] = 99
>>> arrayFlatten
array([99,  2,  3,  4,  5,  6])
>>> twoDArray
array([[1, 2, 3],
       [4, 5, 6]])
>>> 
>>> arrayRavel = np.ravel(twoDArray)
>>> arrayRavel
array([1, 2, 3, 4, 5, 6])
>>> arrayRavel[0] = 199
>>> arrayRavel
array([199,   2,   3,   4,   5,   6])
>>> twoDArray
array([[199,   2,   3],
       [  4,   5,   6]])
>>> 

In this example, we flatten the array twoDArray with flatten to get arrayFlatten and with ravel to get arrayRavel. Changing the value at index 0 in arrayFlatten changes arrayFlatten but does not affect twoDArray. This shows that the flatten method creates a complete copy of the original array.

However, changing index 0 in arrayRavel changes the corresponding item in twoDArray as well, which shows that the ravel function here returns a view of the original array that shares its data.

Also note that ravel returns a view only when it can. It makes a copy when one is needed, for example when the requested order does not match the array's memory layout. Like flatten, ravel accepts the order parameter.
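
You can check whether a given call returned a view or a copy with np.shares_memory. The sketch below covers the cases discussed above; the transposed array is our own example of a layout for which ravel has to copy.

```python
import numpy as np

twoDArray = np.array([[1, 2, 3],
                      [4, 5, 6]])

# Contiguous input: ravel returns a view, flatten always copies.
ravelIsView = np.shares_memory(twoDArray, np.ravel(twoDArray))
flattenIsView = np.shares_memory(twoDArray, twoDArray.flatten())
print(ravelIsView)    # True
print(flattenIsView)  # False

# The transpose is not laid out row by row in memory, so ravel must copy.
transposedIsView = np.shares_memory(twoDArray, np.ravel(twoDArray.T))
print(transposedIsView)  # False

# ravel accepts the same order parameter as flatten.
print(np.ravel(twoDArray, order='F').tolist())  # [1, 4, 2, 5, 3, 6]
```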

Summary

In this article, you learned what flattening an array means, how you can flatten an ndarray or a matrix using the flatten method, and what the alternatives are.

First, you were briefly introduced to NumPy and to creating ndarrays. Next, you learned what flattening an array means and how you can flatten arrays of different dimensions using the flatten method. You then learned what each option of the flatten method's order parameter does. After that, you learned the difference between a matrix and an ndarray and how the output of the flatten method differs for each. Finally, you learned about alternatives to the flatten method: the flat attribute and the ravel function.

Next Steps

To learn more about the basics of Python, coding, and software development, check out our Coding Essentials Guidebook for Developers, where we cover the essential languages, concepts, and tools that you'll need to become a professional developer.

Thanks and happy coding! We hope you enjoyed this article. If you have a question or any comments, feel free to reach out to jacob@initialcommit.io.

References

  1. What is NumPy? - https://numpy.org/doc/stable/user/whatisnumpy.html
  2. NumPy Docs ndarray.flatten - https://numpy.org/doc/stable/reference/generated/numpy.ndarray.flatten.html
  3. NumPy Docs matrix.flatten - https://numpy.org/doc/stable/reference/generated/numpy.matrix.flatten.html
  4. NumPy Docs numpy.ravel - https://numpy.org/doc/stable/reference/generated/numpy.ravel.html#numpy.ravel
  5. NumPy Docs ndarray.flat - https://numpy.org/doc/stable/reference/generated/numpy.ndarray.flat.html#numpy.ndarray.flat
