2016-04-09

Pandas: Average over the same hour

I want a result like this:

Hour      Average 
13:00     1009.9795825 
14:00     1009.748204 
15:00     .... 

I have a CSV like this:

YYYY-MO-DD HH-MI-SS_SSS    ATMOSPHERIC PRESSURE (hPa) mean 
2/24/2016 13:00       1011.937618 
2/24/2016 14:00       1011.721583 
2/24/2016 15:00       1011.348064 
2/24/2016 16:00       1011.30785 
2/24/2016 17:00       1011.3198 
2/24/2016 18:00       1011.403372 
2/24/2016 19:00       1011.485108 
2/24/2016 20:00       1011.270083 
2/24/2016 21:00       1010.936331 
2/24/2016 22:00       1010.920958 
2/24/2016 23:00       1010.816478 
2/25/2016 00:00       1010.899142 
2/25/2016 01:00       1010.209392 
2/25/2016 02:00       1009.700625 
2/25/2016 03:00       1009.457683 
2/25/2016 04:00       1009.268081 
2/25/2016 05:00       1009.718639 
2/25/2016 06:00       1010.745444 
2/25/2016 07:00       1011.062028 
2/25/2016 08:00       1011.168117 
2/25/2016 09:00       1010.771281 
2/25/2016 10:00       1010.138053 
2/25/2016 11:00       1009.509119 
2/25/2016 12:00       1008.703811 
2/25/2016 13:00       1008.021547 
2/25/2016 14:00       1007.774825 
    .....          ..... 

I want to create a new DataFrame that holds, for each hour of the day, the average of the readings taken at that hour across all days. Is there an easy way to do this?

Thanks!

Answer

Once you parse the dates into a pandas datetime-like Series, you can access the hour of each timestamp through the dt accessor:

df['YYYY-MO-DD HH-MI-SS_SSS'] = pd.to_datetime(df['YYYY-MO-DD HH-MI-SS_SSS']) 
hour = pd.to_timedelta(df['YYYY-MO-DD HH-MI-SS_SSS'].dt.hour, unit='h') 

Then you can group by hour and compute the mean for each group (numeric_only=True keeps the timestamp column out of the average):

df.groupby(hour).mean(numeric_only=True) 
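As a variation on the grouping step, the integer hour returned by the dt accessor can be used as the group key directly, without converting it to a timedelta. The following is a minimal sketch under that assumption; the column names `timestamp` and `pressure` and the four round-number readings are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import pandas as pd

# Hypothetical sample: two days, two readings per day.
df = pd.DataFrame({
    "timestamp": ["2/24/2016 13:00", "2/24/2016 14:00",
                  "2/25/2016 13:00", "2/25/2016 14:00"],
    "pressure": [1012.0, 1011.0, 1008.0, 1007.0],
})
# Parse month/day/year strings into datetime64 values.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%m/%d/%Y %H:%M")

# Group on the integer hour (0-23) and average only the pressure column.
hourly_mean = df.groupby(df["timestamp"].dt.hour)["pressure"].mean()
hourly_mean.index.name = "Hour"
print(hourly_mean)
```

Selecting the pressure column before calling mean() returns a Series indexed by hour, so the timestamp column never enters the aggregation.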

import pandas as pd 
df = pd.DataFrame(
    {'ATMOSPHERIC PRESSURE (hPa) mean': 
    [1011.937618, 1011.721583, 1011.348064, 1011.30785, 1011.3198, 1011.403372, 
     1011.485108, 1011.270083, 1010.936331, 1010.920958, 1010.816478, 1010.899142, 
     1010.209392, 1009.700625, 1009.457683, 1009.268081, 1009.718639, 1010.745444, 
     1011.062028, 1011.168117, 1010.771281, 1010.138053, 1009.509119, 1008.703811, 
     1008.021547, 1007.774825], 
    'YYYY-MO-DD HH-MI-SS_SSS': 
    ['2/24/2016 13:00', '2/24/2016 14:00', '2/24/2016 15:00', '2/24/2016 16:00', 
     '2/24/2016 17:00', '2/24/2016 18:00', '2/24/2016 19:00', '2/24/2016 20:00', 
     '2/24/2016 21:00', '2/24/2016 22:00', '2/24/2016 23:00', '2/25/2016 00:00', 
     '2/25/2016 01:00', '2/25/2016 02:00', '2/25/2016 03:00', '2/25/2016 04:00', 
     '2/25/2016 05:00', '2/25/2016 06:00', '2/25/2016 07:00', '2/25/2016 08:00', 
     '2/25/2016 09:00', '2/25/2016 10:00', '2/25/2016 11:00', '2/25/2016 12:00', 
     '2/25/2016 13:00', '2/25/2016 14:00']}) 
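# Parse the timestamp strings into datetime64 values.
df['YYYY-MO-DD HH-MI-SS_SSS'] = pd.to_datetime(df['YYYY-MO-DD HH-MI-SS_SSS']) 

# Hour of day as a timedelta; lowercase 'h' avoids the deprecated 'H' unit alias.
hour = pd.to_timedelta(df['YYYY-MO-DD HH-MI-SS_SSS'].dt.hour, unit='h') 
hour.name = 'Hour' 
# numeric_only=True averages only the pressure column, not the timestamps.
result = df.groupby(hour).mean(numeric_only=True) 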

yields

      ATMOSPHERIC PRESSURE (hPa) mean 
Hour         
00:00:00          1010.899142 
01:00:00          1010.209392 
02:00:00          1009.700625 
03:00:00          1009.457683 
04:00:00          1009.268081 
05:00:00          1009.718639 
...
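To get the two-column Hour / Average layout the question asks for, one possible approach is to group on an "HH:MM" label and reset the index. This is a sketch, not part of the original answer: it uses only the four readings from the question that fall at 13:00 and 14:00, and it assumes every reading sits exactly on the hour, so that grouping by the formatted label is equivalent to grouping by hour.

```python
import pandas as pd

# The four readings from the question that fall at 13:00 and 14:00.
df = pd.DataFrame({
    "YYYY-MO-DD HH-MI-SS_SSS": ["2/24/2016 13:00", "2/24/2016 14:00",
                                "2/25/2016 13:00", "2/25/2016 14:00"],
    "ATMOSPHERIC PRESSURE (hPa) mean": [1011.937618, 1011.721583,
                                        1008.021547, 1007.774825],
})
ts = pd.to_datetime(df["YYYY-MO-DD HH-MI-SS_SSS"], format="%m/%d/%Y %H:%M")

# Format each timestamp as an "HH:MM" label, then average per label.
labels = ts.dt.strftime("%H:%M").rename("Hour")
table = (
    df.groupby(labels)["ATMOSPHERIC PRESSURE (hPa) mean"]
    .mean()
    .rename("Average")
    .reset_index()  # turn the Hour index into an ordinary column
)
print(table)
```

With these four rows, the averages for 13:00 and 14:00 agree with the values shown in the question's desired output.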