python – Pandas DataFrame with MultiIndex: group by the year of a DateTime level's values


I have a pandas DataFrame with a MultiIndex that looks like this:

    # -*- coding: utf-8 -*-
    import numpy as np
    import pandas as pd

    # multi-indexed dataframe: three years of hourly random values
    df = pd.DataFrame(np.random.randn(8760 * 3, 3))
    df['concept'] = "some_value"
    df['datetime'] = pd.date_range(start='2016', periods=len(df), freq='60min')
    df.set_index(['concept', 'datetime'], inplace=True)
    df.sort_index(inplace=True)

Console output:

    df.head()
    Out[23]:
                     0         1         2
    datetime
    2016      0.458802  0.413004  0.091056
    2016     -0.051840 -1.780310 -0.304122
    2016     -1.119973  0.954591  0.279049
    2016     -0.691850 -0.489335  0.554272
    2016     -1.278834 -1.292012 -0.637931

    df.tail()
    Out[24]:
                     0         1         2
    datetime
    2018     -1.872155  0.434520 -0.526520
    2018      0.345213  0.989475 -0.892028
    2018     -0.162491  0.908121 -0.993499
    2018     -1.094727  0.307312  0.515041
    2018     -0.880608 -1.065203 -1.438645

Now I want to compute yearly sums over the 'datetime' level.

My first attempt was the following, but it does not work:

    # sum along years (failed attempt)
    years = df.index.get_level_values('datetime').year.tolist()
    df.index.set_levels([years], level=['datetime'], inplace=True)
    df = df.groupby(level=['datetime']).sum()

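It also seems rather heavy-handed to me for a task that should be easy to do.

So here is my question: how can I get yearly sums over the 'datetime' level? Is there a simple way to do this by applying a function to the DateTime level values?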

Solution

You can use groupby with the year of the second level of the MultiIndex:

    # -*- coding: utf-8 -*-
    import numpy as np
    import pandas as pd

    # multi-indexed dataframe
    df = pd.DataFrame(np.random.randn(8760 * 3, 3))
    df['concept'] = "some_value"
    df['datetime'] = pd.date_range(start='2016', periods=len(df), freq='60min')
    df.set_index(['concept', 'datetime'], inplace=True)
    df.sort_index(inplace=True)
    print(df.head())

                                            0         1         2
    concept    datetime
    some_value 2016-01-01 00:00:00  1.973437  0.101535 -0.693360
               2016-01-01 01:00:00  1.221657 -1.983806 -0.075609
               2016-01-01 02:00:00 -0.208122 -2.203801  1.254084
               2016-01-01 03:00:00  0.694332 -0.235864  0.538468
               2016-01-01 04:00:00 -0.928815 -1.417445  1.534218

    # sum along years
    #years = df.index.get_level_values('datetime').year.tolist()
    #df.index.set_levels([years], inplace=True)
    # note: levels[1] holds the unique 'datetime' values; it lines up with the
    # rows here only because the frame has a single 'concept'
    print(df.index.levels[1].year)

    [2016 2016 2016 ..., 2018 2018 2018]

    df = df.groupby(df.index.levels[1].year).sum()
    print(df.head())

                   0           1          2
    2016  -93.901914  -32.205514 -22.460965
    2017  205.681817   67.701669 -33.960801
    2018   67.438355  150.954614 -21.381809
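One caveat about `df.index.levels[1].year` as a grouping key may be worth illustrating. `MultiIndex.levels` stores each level's unique values, not one value per row, so its length matches the frame only when every timestamp occurs once. The sketch below uses made-up data (two hypothetical concepts, `"a"` and `"b"`, sharing three timestamps) to show the mismatch; it is an illustration under those assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two concepts that share the same three daily timestamps.
idx = pd.MultiIndex.from_product(
    [["a", "b"], pd.date_range("2016-12-31", periods=3, freq="D")],
    names=["concept", "datetime"],
)
df = pd.DataFrame({"value": np.arange(6.0)}, index=idx)

# One entry per unique timestamp stored in the level.
unique_years = df.index.levels[1].year
# One entry per row of the frame.
row_years = df.index.get_level_values("datetime").year

print(len(df), len(unique_years), len(row_years))  # 6 3 6
```

With more than one concept, the key built from `levels[1]` is shorter than the frame, so it cannot serve as a row-wise grouping key, whereas the one built from `get_level_values` always has one entry per row.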

Or you can use get_level_values and year:

    df = df.groupby(df.index.get_level_values('datetime').year).sum()
    print(df.head())

                   0           1          2
    2016  -93.901914  -32.205514 -22.460965
    2017  205.681817   67.701669 -33.960801
    2018   67.438355  150.954614 -21.381809
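Grouping by the year alone drops the 'concept' level from the result. If the frame held several concepts and yearly sums per concept were wanted, one possible extension is to pass both level values as grouping keys. The sketch below assumes made-up data (concepts `"a"` and `"b"`, all values equal to 1 so the sums are just hour counts) and a hypothetical key name `year`; none of these come from the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two concepts, each with hourly values for 2016 and 2017.
hours = pd.date_range("2016-01-01", "2017-12-31 23:00", freq="60min")
idx = pd.MultiIndex.from_product([["a", "b"], hours], names=["concept", "datetime"])
df = pd.DataFrame(np.ones((len(idx), 3)), index=idx)

# Row-wise grouping keys taken from the two index levels.
concept = df.index.get_level_values("concept")
year = df.index.get_level_values("datetime").year.rename("year")

# Yearly sums per concept; the result is indexed by (concept, year).
yearly = df.groupby([concept, year]).sum()
print(yearly)
```

Because every value is 1, each sum equals the number of hours in that year: 8784 for the leap year 2016 and 8760 for 2017.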
Original source: https://www.outofmemory.cn/langs/1193510.html
