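The "pandas Tutorial" series is updated continuously and can serve as an introductory-to-advanced pandas course, a Chinese-language pandas manual, and a usage reference, with worked examples and a cheat sheet. For suggestions, corrections, or update requests, add the author on WeChat: gairuo123 (note: pandas教程) or follow the official account 「盖若」 (ID: gairuo). The author also runs a Python data analysis training course.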
The author of this tutorial has published the book 《深入浅出Pandas:利用Python进行数据处理与分析》 (ISBN: 9787111685456) with China Machine Press; it is available on the major e-commerce platforms.
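How do missing values behave when they take part in calculations? This article describes the logic pandas applies to them in different kinds of operations.
Below is an addition between two datasets that both contain missing values: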
a
'''
        one       two
a       NaN -0.282863
c       NaN  1.212112
e  0.119209 -1.044236
f -2.104569 -0.494929
h -2.104569 -0.706771
'''
b
'''
        one       two     three
a       NaN -0.282863 -1.509059
c       NaN  1.212112 -0.173215
e  0.119209 -1.044236 -0.861849
f -2.104569 -0.494929  1.071804
h       NaN -0.706771 -1.039575
'''
a + b
'''
        one  three       two
a       NaN    NaN -0.565727
c       NaN    NaN  2.424224
e  0.238417    NaN -2.088472
f -4.209138    NaN -0.989859
h       NaN    NaN -1.413542
'''
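The addition above can be reproduced on a smaller scale. The following is a minimal sketch using made-up frames (`left` and `right` are hypothetical names, not the `a` and `b` shown above): arithmetic between DataFrames aligns on row and column labels, a missing value in either operand yields NaN for that cell, and a column that exists in only one operand is entirely NaN in the result.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration, not the values from the tables above.
left = pd.DataFrame(
    {"one": [np.nan, 1.0, 2.0], "two": [10.0, 20.0, 30.0]},
    index=["a", "b", "c"],
)
right = pd.DataFrame(
    {"one": [np.nan, 1.0, np.nan],
     "two": [10.0, 20.0, 30.0],
     "three": [5.0, 6.0, 7.0]},
    index=["a", "b", "c"],
)

total = left + right  # aligned on index labels and column labels

# "three" exists only in `right`, so the whole column is NaN in the result.
print(total["three"].isna().all())  # True
# A NaN in either operand gives NaN for that cell.
print(total.loc["c", "one"])        # nan
# Cells present in both frames are added normally.
print(total.loc["b", "two"])        # 40.0
```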
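The calculation logic is as follows: when summing, missing values are treated as 0; when averaging, they are left out of both the total and the count; cumulative methods such as cumsum() and cumprod() skip missing values by default but keep them in their original positions in the result, and skipna=False makes them propagate instead; the sum of an empty or all-NaN Series is 0 and its product is 1.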
df
'''
        one       two     three
a       NaN -0.282863 -1.509059
c       NaN  1.212112 -0.173215
e  0.119209 -1.044236 -0.861849
f -2.104569 -0.494929  1.071804
h       NaN -0.706771 -1.039575
'''
df['one'].sum()  # NaN values are skipped
# -1.9853605075978744
df.mean(axis=1)  # row-wise mean over the non-missing values only
'''
a   -0.895961
c    0.519449
e   -0.595625
f   -0.509232
h   -0.873173
dtype: float64
'''
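To make the skipping behaviour of `sum()` and `mean()` easier to verify by hand, here is a small sketch on a made-up Series (the values are an assumption chosen for easy arithmetic). It also shows `skipna=False`, which makes the missing value propagate into the result.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1.0, 3.0])  # hypothetical values

print(s.sum())              # 4.0, the NaN is skipped
print(s.mean())             # 2.0, averaged over the 2 non-missing values
print(s.count())            # 2, the number of non-missing values
print(s.sum(skipna=False))  # nan, the NaN propagates
```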
df.cumsum()
'''
        one       two     three
a       NaN -0.282863 -1.509059
c       NaN  0.929249 -1.682273
e  0.119209 -0.114987 -2.544122
f -1.985361 -0.609917 -1.472318
h       NaN -1.316688 -2.511893
'''
df.cumsum(skipna=False)
'''
   one       two     three
a  NaN -0.282863 -1.509059
c  NaN  0.929249 -1.682273
e  NaN -0.114987 -2.544122
f  NaN -0.609917 -1.472318
h  NaN -1.316688 -2.511893
'''
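The difference between the two cumulative sums above is easier to see on a single short Series. This is a minimal sketch with made-up values: by default the NaN is skipped but stays in its position, while `skipna=False` turns every result from the first NaN onward into NaN.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 2.0, 3.0])  # hypothetical values

default = s.cumsum()             # 1.0, NaN, 3.0, 6.0
strict = s.cumsum(skipna=False)  # 1.0, NaN, NaN, NaN

print(default.tolist())
print(strict.tolist())
```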
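import numpy as np
import pandas as pd

# The sum of an all-NaN or empty Series is 0
pd.Series([np.nan]).sum()
# 0.0
pd.Series([], dtype="float64").sum()
# 0.0
# The product of an all-NaN or empty Series is 1
pd.Series([np.nan]).prod()
# 1.0
pd.Series([], dtype="float64").prod()
# 1.0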
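If a result of 0 or 1 for all-missing data is not what you want, the `min_count` parameter of `sum()` and `prod()` may help. The sketch below uses made-up Series (the variable names are hypothetical) and requires at least one non-missing value before a number is returned; otherwise the result is NaN.

```python
import numpy as np
import pandas as pd

empty = pd.Series([], dtype="float64")
all_nan = pd.Series([np.nan])
partly_nan = pd.Series([np.nan, 2.0])

print(all_nan.sum())                # 0.0, the default behaviour
print(all_nan.sum(min_count=1))     # nan, no valid value is available
print(empty.prod(min_count=1))      # nan
print(partly_nan.sum(min_count=1))  # 2.0, one valid value is enough
```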
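If the column used as the grouping key contains missing values, those rows are automatically excluded from the groups (as if they did not exist):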
df
'''
        one       two     three
a       NaN -0.282863 -1.509059
c       NaN  1.212112 -0.173215
e  0.119209 -1.044236 -0.861849
f -2.104569 -0.494929  1.071804
h       NaN -0.706771 -1.039575
'''
df.groupby('one').mean()
'''
                two     three
one
-2.104569 -0.494929  1.071804
 0.119209 -1.044236 -0.861849
'''
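When the rows with a missing key should be kept as a group of their own, `groupby` accepts `dropna=False`. The following is a minimal sketch, assuming pandas 1.1 or later and a made-up frame (the column names `key` and `value` are hypothetical).

```python
import numpy as np
import pandas as pd

# Hypothetical data: two rows have a missing grouping key.
df = pd.DataFrame(
    {"key": ["x", np.nan, "x", np.nan], "value": [1.0, 2.0, 3.0, 4.0]}
)

# Default: rows whose key is NaN are left out of the result.
dropped = df.groupby("key")["value"].sum()

# dropna=False: the NaN key forms its own group.
kept = df.groupby("key", dropna=False)["value"].sum()

print(dropped)
print(kept)
```

With the default setting only the `x` group appears; with `dropna=False` a second group labelled NaN collects the remaining rows.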