12.3.10.4.10. Selection of data
import numpy as np
import pandas as pd
dates = pd.date_range("20220501", periods=6)
dataFrame = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))
Getting data
dataFrame["A"]
2022-05-01 0.973799
2022-05-02 0.155131
2022-05-03 0.445839
2022-05-04 -0.613454
2022-05-05 -0.776714
2022-05-06 -0.600595
Freq: D, Name: A, dtype: float64
dataFrame[0:3]
dataFrame["20220501":"20220502"]
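The two row slices above treat their end points differently: an integer slice follows Python's half-open convention, while a label slice on the index includes both end labels. A minimal sketch of that difference, assuming a small deterministic frame (the name `frame` and its values are illustrative, not the random data used in this example):

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20220501", periods=6)
# Deterministic values 0..23 so the result is reproducible (illustrative only).
frame = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

# Integer slice: rows at positions 0, 1 and 2; position 3 is excluded.
by_position = frame[0:3]

# Label slice: both 2022-05-01 and 2022-05-02 are included.
by_label = frame["20220501":"20220502"]

print(len(by_position), len(by_label))
```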
Selection by label
dataFrame.loc[dates[0]]
A 0.973799
B -1.856910
C 1.102405
D 1.448620
Name: 2022-05-01 00:00:00, dtype: float64
dataFrame.loc[:, ["A", "B"]]
dataFrame.loc["20220501":"20220502", ["A", "B"]]
dataFrame.loc["20220501", ["A", "B"]]
A 0.973799
B -1.856910
Name: 2022-05-01 00:00:00, dtype: float64
dataFrame.loc[dates[0], "A"]
0.9737991449933912
dataFrame.at[dates[0], "A"]
0.9737991449933912
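The label-based calls above return different kinds of objects depending on what is passed: a single row label gives a Series, slices or lists give a DataFrame, and a row label plus a column label gives a scalar. A rough sketch, again assuming an illustrative deterministic `frame` rather than the random data above:

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20220501", periods=6)
# Illustrative data: floats 0.0..23.0 laid out row by row.
frame = pd.DataFrame(np.arange(24.0).reshape(6, 4), index=dates, columns=list("ABCD"))

row = frame.loc[dates[0]]                              # one row label: Series
block = frame.loc["20220501":"20220502", ["A", "B"]]   # slice + list: DataFrame
scalar_loc = frame.loc[dates[0], "A"]                  # row + column label: scalar
scalar_at = frame.at[dates[0], "A"]                    # same value; .at accepts only scalar labels

print(type(row).__name__, block.shape, scalar_loc == scalar_at)
```

`.at` is limited to a single row and column label, which is why it suits scalar reads and writes.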
Selection by position
dataFrame.iloc[3]
A -0.613454
B -1.217677
C 0.144906
D 0.719465
Name: 2022-05-04 00:00:00, dtype: float64
dataFrame.iloc[3:5, 0:2]
dataFrame.iloc[[1, 2, 4], [0, 2]]
dataFrame.iloc[1:3, :]
dataFrame.iloc[:, 1:3]
dataFrame.iloc[1, 1]
-0.5821550971365247
dataFrame.iat[1, 1]
-0.5821550971365247
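Positional selection uses integer offsets only, and its slices are half-open like ordinary Python slices. The sketch below assumes an illustrative deterministic `frame` so that the selected values can be checked by hand:

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20220501", periods=6)
# Illustrative data: row i holds the values 4*i .. 4*i + 3.
frame = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

window = frame.iloc[3:5, 0:2]              # rows 3 and 4, columns 0 and 1
picked = frame.iloc[[1, 2, 4], [0, 2]]     # explicit lists of positions
cell_iloc = frame.iloc[1, 1]               # scalar at row 1, column 1
cell_iat = frame.iat[1, 1]                 # same scalar via .iat

print(window.to_numpy().tolist())
print(picked.to_numpy().tolist())
print(cell_iloc, cell_iat)
```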
Boolean indexing
dataFrame[dataFrame["A"] > 0]
dataFrame[dataFrame > 0]
dataFrame2 = dataFrame.copy()
dataFrame2["E"] = ["one", "one", "two", "three", "four", "three"]
dataFrame2[dataFrame2["E"].isin(["two", "four"])]
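The three boolean selections above differ in shape: a mask built from one column drops whole rows, a frame-wide mask keeps the original shape and leaves NaN where the condition is false, and `isin` builds a row mask from membership in a list. A small sketch with hand-picked values (the frame and its contents are illustrative assumptions):

```python
import pandas as pd

frame = pd.DataFrame({"A": [1.0, -2.0, 3.0], "B": [-4.0, 5.0, -6.0]})

# Column mask: keeps only the rows where A is positive.
rows = frame[frame["A"] > 0]

# Frame-wide mask: same shape, NaN wherever the value is not positive.
masked = frame[frame > 0]

# Membership mask on a string column added to a copy.
labelled = frame.copy()
labelled["E"] = ["one", "two", "four"]
members = labelled[labelled["E"].isin(["two", "four"])]

print(rows.index.tolist())
print(int(masked.isna().sum().sum()))
print(members.index.tolist())
```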
Setting data
series = pd.Series([1, 2, 3, 4, 5, 6], index=pd.date_range("20220502", periods=6))
dataFrame["F"] = series
dataFrame.at[dates[0], "A"] = 0
dataFrame.iat[0, 1] = 0
dataFrame.loc[:, "D"] = np.array([5] * len(dataFrame))
dataFrame2 = dataFrame.copy()
dataFrame2[dataFrame2 > 0] = -dataFrame2
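Assigning a Series as a new column aligns it on index labels rather than on position, so rows whose label is missing from the Series receive NaN. The sketch below illustrates this together with the masked assignment that negates positive values. The names `frame`, `shifted` and `flipped` are illustrative assumptions, not part of the example above:

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20220501", periods=6)
frame = pd.DataFrame(np.zeros((6, 2)), index=dates, columns=["A", "B"])

# Series index starts one day later than the frame index.
shifted = pd.Series([1, 2, 3, 4, 5, 6], index=pd.date_range("20220502", periods=6))

# Alignment by label: 2022-05-01 has no match and becomes NaN;
# the value 6 (labelled 2022-05-07) is dropped.
frame["F"] = shifted

# Masked assignment: every positive value is replaced by its negation.
flipped = frame.copy()
flipped[flipped > 0] = -flipped

print(frame["F"].tolist())
print(flipped["F"].tolist())
```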