五木齋

  • 博客
  • 留言
  • 关于
  • EN
  1. 首页
  2. Python
  3. 正文

Pandas 时间序列常用处理

2022.09.07 1439点热度 0人点赞 0条评论

1. pandas.Grouper

按年/月/日汇总

# importing modules
import pandas as pd

# creating a dataframe df
df = pd.DataFrame(
    {
        "Date": [
            pd.Timestamp("2000-11-02"),
            pd.Timestamp("2000-01-02"),
            pd.Timestamp("2000-01-09"),
            pd.Timestamp("2000-03-11"),
            pd.Timestamp("2000-01-26"),
            pd.Timestamp("2000-02-16")
        ],
        "ID": [1, 2, 3, 4, 5, 6],
        "Price": [140, 120, 230, 40, 100, 450]
    }
)

# applying the groupby function on df

# by month
df.groupby(pd.Grouper(key='Date', axis=0, freq='M')).sum()

# by day
df.groupby(pd.Grouper(key='Date', axis=0, freq='2D', sort=True)).sum()

# by year
df.groupby(pd.Grouper(key='Date', freq='2Y')).sum()

按时/分/秒汇总

# importing module
import pandas as pd

# create an array of 5 dates starting
# at '2015-02-24', one per minute
dates = pd.date_range('2015-02-24', periods=10, freq='T')

# creating dataframe with above array
# of dates
df = pd.DataFrame(
    {
        "Date": dates,
        "ID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 
        "Price": [140, 120, 230, 40, 100, 450, 234, 785, 12, 42]
    }
)

# display dataframe
display(df)

# applied groupby function
# by sec.
df.groupby(pd.Grouper(key='Date', freq='30S')).sum()
# by min.
df.groupby(pd.Grouper(key='Date', freq='2T')).sum()
df.groupby(pd.Grouper(key='Date', freq='2min')).sum()
# by hour
df.groupby(pd.Grouper(key='Date', freq='1H')).sum()

2. Dataframe.resample()

构造样例数据:

d = {'price': [10, 11, 9, 13, 14, 18, 17, 19],
      'volume': [50, 60, 40, 100, 50, 100, 40, 50]}
df = pd.DataFrame(d)
df['week_starting'] = pd.date_range('01/01/2018', periods=8, freq='W')
print(df)

输出:

   price  volume week_starting
0     10      50    2018-01-07
1     11      60    2018-01-14
2      9      40    2018-01-21
3     13     100    2018-01-28
4     14      50    2018-02-04
5     18     100    2018-02-11
6     17      40    2018-02-18
7     19      50    2018-02-25

按月汇总:

print(df.resample('M', on='week_starting').mean())

结果:

                price  volume
week_starting
2018-01-31     10.75    62.5
2018-02-28     17.00    60.0

更多参考:

  1. How to Group Pandas DataFrame By Date and Time ?
  2. pd.Grouper
  3. pandas.DataFrame.resample
  4. Offset aliases
本作品采用 知识共享署名 4.0 国际许可协议 进行许可
标签: 暂无
最后更新:2023.09.08

伏星宇

· 数据分析/摄影/游戏 · 致力于瞎折腾的INFJ

点赞

文章评论

razz evil exclaim smile redface biggrin eek confused idea lol mad twisted rolleyes wink cool arrow neutral cry mrgreen drooling persevering
取消回复

分类
  • Docker
  • Python
  • 日常
  • 游戏
  • 运维

归档

  • 2023 年 9 月
  • 2023 年 5 月
  • 2023 年 1 月
  • 2022 年 12 月
  • 2022 年 9 月
  • 2022 年 8 月
  • 2022 年 5 月

分类

  • Docker
  • Python
  • 日常
  • 游戏
  • 运维

COPYRIGHT © 2023 Fu Xingyu. ALL RIGHTS RESERVED. POWERED BY WordPress.

Theme Kratos Made By Seaton Jiang

沪ICP备2023001945号-1

沪公网安备31010702008173号