12.3.10.4.2. Create and view an object

import pandas as pd
import numpy as np

Create an object

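# A Series is a one-dimensional labeled array; np.nan marks a missing value.
series = pd.Series([1, 3, 5, np.nan, 6, 8])
# Six consecutive daily timestamps starting on 2022-05-01.
dates = pd.date_range("20220501", periods=6)
# A 6x4 frame of standard-normal samples; no seed is set, so values differ on every run.
dataFrame = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))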
dataFrame2 = pd.DataFrame(
    {
        "A": 1.0,
        "B": pd.Timestamp("20220501"),
        "C": pd.Series(1, index=list(range(4)), dtype="float32"),
        "D": np.array([3] * 4, dtype="int32"),
        "E": pd.Categorical(["test", "train", "test", "train"]),
        "F": "foo",
    }
)
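
When a dict mixes scalars with array-like values, as in dataFrame2 above, the scalars are repeated to match the length of the array-like values. A minimal sketch of that behavior, using a hypothetical frame named `small` that is not part of the original script:

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration; only column "C" carries an explicit index.
small = pd.DataFrame(
    {
        "A": 1.0,  # scalar, repeated on every row
        "C": pd.Series(1, index=list(range(4)), dtype="float32"),
        "D": np.array([3] * 4, dtype="int32"),  # length-4 array
        "F": "foo",  # scalar, repeated on every row
    }
)

print(small.shape)          # number of rows and columns
print(small["A"].tolist())  # the scalar 1.0 appears once per row
```

The row count comes from the array-like columns; the scalar columns adapt to it.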

View an object

dataFrame2.head()
     A          B    C  D      E    F
0  1.0 2022-05-01  1.0  3   test  foo
1  1.0 2022-05-01  1.0  3  train  foo
2  1.0 2022-05-01  1.0  3   test  foo
3  1.0 2022-05-01  1.0  3  train  foo


dataFrame2.tail()
     A          B    C  D      E    F
0  1.0 2022-05-01  1.0  3   test  foo
1  1.0 2022-05-01  1.0  3  train  foo
2  1.0 2022-05-01  1.0  3   test  foo
3  1.0 2022-05-01  1.0  3  train  foo
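
head() and tail() return the same four rows here because dataFrame2 has only four rows and both methods default to five. The sketch below uses a hypothetical ten-row frame, `numbers`, to show the default and an explicit row count:

```python
import pandas as pd

# Hypothetical ten-row frame, so that head() and tail() return different rows.
numbers = pd.DataFrame({"value": range(10)})

first_rows = numbers.head()   # default n=5: the first five rows
last_rows = numbers.tail(3)   # explicit n=3: the last three rows

print(first_rows.index.tolist())
print(last_rows.index.tolist())
```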


dataFrame2.dtypes
A          float64
B    datetime64[s]
C          float32
D            int32
E         category
F           object
dtype: object

dataFrame2.index
Index([0, 1, 2, 3], dtype='int64')

dataFrame2.columns
Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')

dataFrame2.describe()
         A                    B    C    D
count  4.0                    4  4.0  4.0
mean   1.0  2022-05-01 00:00:00  1.0  3.0
min    1.0  2022-05-01 00:00:00  1.0  3.0
25%    1.0  2022-05-01 00:00:00  1.0  3.0
50%    1.0  2022-05-01 00:00:00  1.0  3.0
75%    1.0  2022-05-01 00:00:00  1.0  3.0
max    1.0  2022-05-01 00:00:00  1.0  3.0
std    0.0                  NaN  0.0  0.0
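
The describe() output above omits the categorical column E and the string column F, because by default only numeric-like columns are summarized. As a rough sketch, passing include="all" also summarizes the remaining columns. The frame `labels` below is a hypothetical reduced version of dataFrame2:

```python
import pandas as pd

# Hypothetical two-column frame: one numeric column and one categorical column.
labels = pd.DataFrame(
    {
        "D": [3, 3, 3, 3],
        "E": pd.Categorical(["test", "train", "test", "train"]),
    }
)

numeric_summary = labels.describe()            # numeric columns only
full_summary = labels.describe(include="all")  # every column

print(numeric_summary.columns.tolist())
print(full_summary.loc["unique", "E"])  # number of distinct categories in E
```

With include="all", statistics that do not apply to a column are reported as NaN.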


Convert to NumPy

dataFrame.to_numpy()
array([[-0.48070149,  0.69640839,  0.91964576, -0.16073026],
       [-0.34969799,  1.00341675, -0.82853341,  0.78270195],
       [-1.79253563,  0.25743326,  0.58993783, -0.33844128],
       [ 1.79722667, -1.10740684,  0.48154367,  1.27495664],
       [ 0.38022496, -0.12789082, -0.85380203,  2.81773155],
       [ 0.25135383,  1.00941959,  1.31962369, -0.1835926 ]])
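
The array above has a single float dtype because every column of dataFrame holds floats. A NumPy array has one dtype for the whole array, so a frame with mixed column types is converted to a common type. A small sketch with two hypothetical frames, `floats` and `mixed`:

```python
import numpy as np
import pandas as pd

# Hypothetical frames: one with float columns only, one mixing floats and strings.
floats = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})
mixed = pd.DataFrame({"A": [1.0, 2.0], "F": ["foo", "bar"]})

float_array = floats.to_numpy()
mixed_array = mixed.to_numpy()

print(float_array.dtype)  # one float dtype shared by all values
print(mixed_array.dtype)  # generic object dtype that can hold both kinds of value
```

Note that to_numpy() drops the index and column labels; only the values remain.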

Transpose and sort data

dataFrame.T
   2022-05-01  2022-05-02  2022-05-03  2022-05-04  2022-05-05  2022-05-06
A   -0.480701   -0.349698   -1.792536    1.797227    0.380225    0.251354
B    0.696408    1.003417    0.257433   -1.107407   -0.127891    1.009420
C    0.919646   -0.828533    0.589938    0.481544   -0.853802    1.319624
D   -0.160730    0.782702   -0.338441    1.274957    2.817732   -0.183593


dataFrame.sort_index(axis=1, ascending=False)
                   D         C         B         A
2022-05-01 -0.160730  0.919646  0.696408 -0.480701
2022-05-02  0.782702 -0.828533  1.003417 -0.349698
2022-05-03 -0.338441  0.589938  0.257433 -1.792536
2022-05-04  1.274957  0.481544 -1.107407  1.797227
2022-05-05  2.817732 -0.853802 -0.127891  0.380225
2022-05-06 -0.183593  1.319624  1.009420  0.251354


dataFrame.sort_values(by="B")
                   A         B         C         D
2022-05-04  1.797227 -1.107407  0.481544  1.274957
2022-05-05  0.380225 -0.127891 -0.853802  2.817732
2022-05-03 -1.792536  0.257433  0.589938 -0.338441
2022-05-01 -0.480701  0.696408  0.919646 -0.160730
2022-05-02 -0.349698  1.003417 -0.828533  0.782702
2022-05-06  0.251354  1.009420  1.319624 -0.183593
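
sort_values sorts in ascending order by default and returns a new frame rather than changing the original. The following sketch, on a hypothetical frame named `scores`, shows a descending sort and a sort on two columns:

```python
import pandas as pd

# Hypothetical frame with string row labels, so the reordering is easy to follow.
scores = pd.DataFrame(
    {"group": ["b", "a", "b", "a"], "score": [2.0, 1.0, 3.0, 4.0]},
    index=["w", "x", "y", "z"],
)

# Largest score first.
descending = scores.sort_values(by="score", ascending=False)

# Sort by group, then by score within each group.
by_group_then_score = scores.sort_values(by=["group", "score"])

print(descending.index.tolist())
print(by_group_then_score.index.tolist())
print(scores.index.tolist())  # the original frame keeps its row order
```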

