NumPy Data Types (dtypes)
What is a dtype?
A dtype (data type) tells NumPy what kind of values an array contains, such as:
- integers (int32, int64)
- floats (float32, float64)
- booleans (bool)
- strings (<U...) and bytes (|S...)
Because NumPy uses a single dtype for the entire array, it can store values efficiently and run fast computations.
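The single-dtype rule can be seen directly: every element of an array occupies the same number of bytes, which the `itemsize` attribute reports. The following is a minimal sketch; the variable names are illustrative only.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)
floats = np.array([1.0, 2.0, 3.0], dtype=np.float64)

# Every element of an array has the same fixed size in bytes.
print(ints.dtype, ints.itemsize)      # int32 4
print(floats.dtype, floats.itemsize)  # float64 8

# The total buffer size is element count times element size.
print(ints.nbytes == ints.size * ints.itemsize)  # True

# Mixing ints and floats in one array promotes everything to a single dtype.
mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64
```

Because the element size is fixed, NumPy can find any element by simple arithmetic on its position, which is part of why array operations are fast.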
Checking dtype
import numpy as np
arr = np.array([1, 2, 3])
print(arr.dtype)
Common numeric dtypes
Integers
import numpy as np
a = np.array([1, 2, 3], dtype=np.int32)
b = np.array([1, 2, 3], dtype=np.int64)
print(a.dtype, b.dtype)
Floats
import numpy as np
a = np.array([1.5, 2.0, 3.25], dtype=np.float32)
b = np.array([1.5, 2.0, 3.25], dtype=np.float64)
print(a.dtype, b.dtype)
Memory usage and dtype
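Smaller dtypes use less memory: a float32 element takes 4 bytes and a float64 element takes 8, so the same array is half the size in float32.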
import numpy as np
arr32 = np.ones(1_000_000, dtype=np.float32)
arr64 = np.ones(1_000_000, dtype=np.float64)
print("float32 bytes:", arr32.nbytes)
print("float64 bytes:", arr64.nbytes)
Type conversion
Using .astype()
import numpy as np
arr = np.array([1, 2, 3])
arr_f = arr.astype(np.float64)
print(arr_f, arr_f.dtype)
Safe conversion (avoid overflow)
Converting large values into a smaller dtype can overflow.
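One way to guard against this is to compare the data against the target dtype's range before casting. The sketch below assumes integer data and uses `np.iinfo` to look up the limits; `fits_in` is a hypothetical helper written for this example, not a NumPy function.

```python
import numpy as np

def fits_in(values, target_dtype):
    """Hypothetical helper: True if every integer in `values` fits in `target_dtype`."""
    info = np.iinfo(target_dtype)  # integer dtypes only
    values = np.asarray(values)
    # Convert to Python ints so the comparison itself cannot overflow.
    return int(values.min()) >= info.min and int(values.max()) <= info.max

arr = np.array([10, 200, 300], dtype=np.int16)

print(np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)  # 0 255
print(fits_in(arr, np.uint8))      # False, because 300 > 255
print(fits_in(arr[:2], np.uint8))  # True
```

Without a check like this, the cast goes ahead silently, as the next example shows.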
import numpy as np
arr = np.array([300], dtype=np.int16)
print(arr.astype(np.uint8))  # wraps around in many cases
Dtype pitfalls in data analytics
1) Missing values
NumPy integer arrays can't store NaN, because NaN is a floating-point value.
import numpy as np
# This will upcast to float automatically because of np.nan
arr = np.array([1, 2, np.nan])
print(arr)
print(arr.dtype)
2) Mixed types
If you mix strings and numbers, NumPy converts every element to one common dtype: usually a Unicode string dtype (<U...), or object if you request it explicitly.
import numpy as np
arr = np.array([1, "two", 3])
print(arr)
print(arr.dtype)
Arrays with dtype=object are slower for numerical operations.
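The slowdown comes from object arrays holding ordinary Python objects rather than fixed-size numbers, so each operation is dispatched per element. A minimal sketch, assuming every value in the object array is actually numeric, of inspecting such an array and casting it back to a native dtype:

```python
import numpy as np

obj = np.array([1, 2, 3], dtype=object)
print(obj.dtype)  # object

# Each element is a regular Python object, not a fixed-size number.
print(type(obj[0]))  # <class 'int'>

# If every value is numeric, casting restores a native numeric dtype.
nums = obj.astype(np.int64)
print(nums.dtype)  # int64
print(nums.sum())  # 6
```

If any element cannot be converted (for example the string "two"), `astype` raises an error, so the data needs cleaning first.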
Next
Continue to: Indexing and Slicing Arrays to learn how to select, filter, and extract parts of arrays.
