How do I filter Excel data by row? Say my data is loaded like this:
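import glob
import pandas as pd

# Read every matching workbook and stack them into one DataFrame
frames = []
for f in glob.glob('/Users/seb/PycharmProjects/data-transform/data/[I..O]*.xlsx'):
    frames.append(pd.read_excel(f))
all_data = pd.concat(frames, ignore_index=True)
print(all_data.describe())
print(all_data.head())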
But I'm not sure how to filter for rows containing a certain string, so that this input:
Date Time Variable value
08/01/2017 06:00:05 Occupancy_T4CHK_zoneA_1_Live 83
08/01/2017 06:00:05 Occupancy_T4CHK_totalConcourse_1_Live 510
08/01/2017 06:00:05 ProjQueueTime_T4CHK_zoneA_1_Live 919.048
08/01/2017 06:00:05 Flow_T4CHK_zoneA_1_Live 0
08/01/2017 06:00:05 ExpQueueTime_T4CHK_zoneA_1_Live 357.688
08/01/2017 06:00:05 Flow_T4CHK_zoneB_1_Live 0
08/01/2017 06:00:05 ProjQueueTime_T4CHK_zoneC_1_Live 114.911
08/01/2017 06:00:05 Flow_T4CHK_zoneC_1_Live 0
08/01/2017 06:00:05 ExpQueueTime_T4CHK_zoneC_1_Live 355.909
08/01/2017 06:00:05 Flow_T4CHK_zoneD_1_Live 0
08/01/2017 06:00:05 Flow_T4CHK_zoneE_1_Live 0
08/01/2017 06:00:05 Flow_T4CHK_zoneF_1_Live 0
08/01/2017 06:00:05 Flow_T4CHK_zoneG_1_Live 0
08/01/2017 06:00:05 Flow_T4CHK_zoneH_1_Live 0
produces this output:
08/01/2017 06:00:05 Flow_T4CHK_zoneD_1_Live 0
08/01/2017 06:00:05 Flow_T4CHK_zoneE_1_Live 0
08/01/2017 06:00:05 Flow_T4CHK_zoneF_1_Live 0
08/01/2017 06:00:05 Flow_T4CHK_zoneG_1_Live 0
08/01/2017 06:00:05 Flow_T4CHK_zoneH_1_Live 0
08/01/2017 06:00:10 Flow_T4CHK_zoneD_1_Live 0
08/01/2017 06:00:10 Flow_T4CHK_zoneE_1_Live 0
08/01/2017 06:00:10 Flow_T4CHK_zoneF_1_Live 0
08/01/2017 06:00:10 Flow_T4CHK_zoneG_1_Live 0
08/01/2017 06:00:10 Flow_T4CHK_zoneH_1_Live 0
Comment From: jreback
You probably want something like:
df.loc[df.value.str.startswith('foo')]
Comment From: scheung38
But how do you specify which column to search for 'foo'?
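For illustration, here is a minimal sketch of how the column gets chosen: it is whichever column is selected before the `.str` accessor, so `df.value` searches a column named `value` and `df["Variable"]` searches `Variable`. The small frame below is a hypothetical stand-in for the data in the question, and the `Flow` prefix and zone D to H pattern are assumptions about the intended filter, not something stated in the thread.

```python
import pandas as pd

# Hypothetical sample mirroring a few rows from the question.
all_data = pd.DataFrame({
    "Date": ["08/01/2017"] * 4,
    "Time": ["06:00:05"] * 4,
    "Variable": [
        "Occupancy_T4CHK_zoneA_1_Live",
        "Flow_T4CHK_zoneA_1_Live",
        "Flow_T4CHK_zoneD_1_Live",
        "Flow_T4CHK_zoneH_1_Live",
    ],
    "value": [83, 0, 0, 0],
})

# The column searched is the one selected before `.str`.
flow_rows = all_data.loc[all_data["Variable"].str.startswith("Flow")]

# Regex variant (assumed goal): keep only Flow rows for zones D through H.
flow_d_to_h = all_data.loc[
    all_data["Variable"].str.contains(r"^Flow_T4CHK_zone[D-H]_")
]

print(flow_rows)
print(flow_d_to_h)
```

Both expressions build a boolean mask from one column and pass it to `.loc`, so swapping in a different column name or pattern is the only change needed for other filters.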
Comment From: jorisvandenbossche
@scheung38 This issue tracker is for bugs and enhancement requests for pandas. For more general usage questions, you will get more help on e.g. StackOverflow.