12.3.10.4.1. Categoricals

import pandas as pd


dataFrame = pd.DataFrame(
    {"id": [1, 2, 3, 4, 5, 6], "raw_grade": ["a", "b", "b", "a", "a", "e"]}
)
# Convert the raw grades to a categorical column; the categories are "a", "b", "e".
dataFrame["grade"] = dataFrame["raw_grade"].astype("category")
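# Note: rename_categories returns a new Series. The result is not assigned here,
# so the categories of "grade" remain "a", "b" and "e".
dataFrame["grade"].cat.rename_categories(["very good", "good", "very bad"])
# None of the values "a", "b", "e" is in the new category list, so
# set_categories replaces every entry with NaN.
dataFrame["grade"] = dataFrame["grade"].cat.set_categories(
    ["very bad", "bad", "medium", "good", "very good"]
)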
dataFrame["grade"]
0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
Name: grade, dtype: category
Categories (5, object): ['very bad', 'bad', 'medium', 'good', 'very good']
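
The all-NaN column above follows from the `rename_categories` result not being assigned back before `set_categories` is called. As a minimal sketch of the variant with assignment (the name `df` and the helper variables `grades`, `missing` and `sorted_ids` are illustrative and not part of the demo script), the values then survive the change of categories and sort in category order:

```python
import pandas as pd

# Same data as the demo; `df` is an illustrative name, not from the script.
df = pd.DataFrame(
    {"id": [1, 2, 3, 4, 5, 6], "raw_grade": ["a", "b", "b", "a", "a", "e"]}
)
df["grade"] = df["raw_grade"].astype("category")

# rename_categories returns a new Series, so assign the result back.
# The new names map positionally onto the existing categories a, b, e.
df["grade"] = df["grade"].cat.rename_categories(["very good", "good", "very bad"])

# Every value is now contained in the new category list, so nothing becomes NaN.
df["grade"] = df["grade"].cat.set_categories(
    ["very bad", "bad", "medium", "good", "very good"]
)

grades = df["grade"].tolist()
missing = int(df["grade"].isna().sum())

# Sorting follows the category order, not the alphabetical order of the labels.
# A stable sort keeps the original row order within each grade.
sorted_ids = df.sort_values(by="grade", kind="mergesort")["id"].tolist()
print(grades, missing, sorted_ids)
```
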
dataFrame.sort_values(by="grade")
   id raw_grade grade
0   1         a   NaN
1   2         b   NaN
2   3         b   NaN
3   4         a   NaN
4   5         a   NaN
5   6         e   NaN


# Pass observed explicitly to avoid the FutureWarning about its changing default.
# observed=False keeps categories that do not occur in the data.
dataFrame.groupby("grade", observed=False).size()
grade
very bad     0
bad          0
medium       0
good         0
very good    0
dtype: int64
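
The `observed` parameter of `groupby` decides whether categories without any rows appear in the result. The following sketch uses a small made-up frame (the names `frame`, `all_categories` and `used_categories` are assumptions for illustration only) to show both settings side by side:

```python
import pandas as pd

# Hypothetical example data: "medium" is a valid category but never occurs.
grade_dtype = pd.CategoricalDtype(["bad", "medium", "good"])
frame = pd.DataFrame({"grade": pd.Series(["good", "good", "bad"], dtype=grade_dtype)})

# observed=False reports every category, including empty ones with size 0.
all_categories = frame.groupby("grade", observed=False).size()

# observed=True reports only the categories that actually occur in the data.
used_categories = frame.groupby("grade", observed=True).size()

print(all_categories)
print(used_categories)
```

Passing the argument explicitly keeps the result independent of the pandas version's default.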
