`import pandas as pd
import datetime
from pandas_datareader import data, wb
import csv
import json

declaring libraries for the analysis

out = open("testfile.csv", "rt")  # read the .csv file
data = csv.reader(out)
# this part is mainly to concatenate the City and Country columns
data = [[row[0], row[1] + "_" + row[2], row[3] + "_" + row[4], row[5], row[6]] for row in data]
out.close()

out = open("data.csv", "wt")
output = csv.writer(out)
for row in data:
    output.writerow(row)
out.close()

<------------ End of concatenation ------------->

The result is written out and saved in data.csv (I don't want to overwrite the original source file)

# using pandas and a DataFrame, I can read in the data and loop through it between the Arrival Date and Departure Date
df = pd.read_csv('data.csv')

df.DateDpt = pd.to_datetime(df.DateDpt)
df.DateAr = pd.to_datetime(df.DateAr)
df = df.set_index('DateAr')
new_df = pd.DataFrame()
for i, data in df.iterrows():
    data = data.to_frame().transpose()
    data = data.reindex(pd.date_range(start=data.index[0], end=data.DateDpt[0])).fillna(method='ffill').reset_index().rename(columns={'index': 'DateAr'})
    new_df = pd.concat([new_df, data])

new_df = new_df[['AuthorID', 'ArCity_ArCountry', 'DptCity_DptCountry', 'DateAr', 'DateDpt']]

print (new_df)

<---- End of iteration ---->

Json file : <------ JSON ----------------->

json_dict = {}

for arrival_date, data in new_df.groupby('DateAr'):
    matching_dates = data[data.DateDpt==arrival_date]
    not_matching_dates = data[data.DateDpt!=arrival_date]
    json_dict[arrival_date.strftime('%Y-%m-%d')] = {}
    if not matching_dates.empty:
        for city, flights in matching_dates.groupby('ArCity_ArCountry'):
            json_dict[arrival_date.strftime('%Y-%m-%d')][city] = [str(v) for v in flights.AuthorID.to_dict().values()]
    if not not_matching_dates.empty:
        for city, flights in not_matching_dates.groupby('DptCity_DptCountry'):
            json_dict[arrival_date.strftime('%Y-%m-%d')][city] = [str(v) for v in flights.AuthorID.to_dict().values()]

with open('json_dict.json', 'w') as f:
    json.dump(json_dict, f, indent=4, sort_keys=True)

<------- End of JSON ---------------->`
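
The concatenation step at the top of the script could also be done in pandas itself, without the intermediate data.csv. This is only a sketch: the rows below are illustrative, and the underscore separator is an assumption based on the column names `ArCity_ArCountry` and `DptCity_DptCountry` used later in the script.

```python
import pandas as pd

# Illustrative rows only; the real data would come from testfile.csv.
df = pd.DataFrame({
    "AuthorID": ["AAA", "CCC"],
    "ArCity": ["Paris", "Paris"],
    "ArCountry": ["France", "France"],
    "DptCity": ["NewYork", "Beijing"],
    "DptCountry": ["UnitedState", "Japan"],
})

# Element-wise string concatenation of two columns (separator assumed to be "_").
df["ArCity_ArCountry"] = df["ArCity"] + "_" + df["ArCountry"]
df["DptCity_DptCountry"] = df["DptCity"] + "_" + df["DptCountry"]

print(df[["AuthorID", "ArCity_ArCountry", "DptCity_DptCountry"]])
```

Building the combined columns this way leaves the source file untouched, which was the reason given for writing data.csv.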

CSV File (testfile.csv)

AuthorID  ArCity  ArCountry  DptCity  DptCountry   DateDpt     DateAr
AAA       Paris   France     NewYork  UnitedState  2008-03-10  2001-02-02
BBB       Paris   France     NewYork  UnitedState  2008-03-10  2001-02-02
CCC       Paris   France     Beijing  Japan        2008-03-10  2001-02-02
DDD       Paris   France     NewYork  UnitedState  2008-03-10  2001-02-02
EEE       Paris   France     London   UK           2008-03-10  2001-02-02
FFF       Paris   France     NewYork  UnitedState  2008-03-10  2001-02-02
GGG       Paris   France     Beijing  Japan        2008-03-10  2001-02-02

Expected output:

{
    "2001-02-02": {
        "Beijing _Japan": [
            "CCC",
            "GGG"
        ],
        "London_UK": [
            "EEE"
        ],
        "NewYork_UnitedState": [
            "AAA",
            "BBB",
            "DDD",
            "FFF"
        ],
        "Paris_France": [
            "AAA",
            "BBB",
            "CCC",
            "DDD",
            "EEE",
            "FFF"
        ]

    },

    .
    .
    .
  }

But the current output is 
`{
    "2001-02-02": {
        "Beijing _Japan": [
            "GGG"
        ],
        "London_UK": [
            "EEE"
        ],
        "NewYork_UnitedState": [
            "FFF"
        ]
    },
    "2001-02-03": {
        "Beijing _Japan": [
            "GGG"
        ],
        "London_UK": [
            "EEE"
        ],
        "NewYork_UnitedState": [
            "FFF"
        ]
    },
    .
    .
    .
  }`
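
One possible reason that only the last ID survives in each list, offered as a sketch under assumptions rather than a confirmed diagnosis: after `reset_index()` every per-row frame is numbered from 0 again, so `pd.concat` yields repeated index labels, and `Series.to_dict()` keeps a single value per label. The two small frames below are illustrative stand-ins for what the loop produces.

```python
import pandas as pd

# Two one-row frames that both carry index label 0, similar to what
# reset_index() leaves behind inside the loop (values are illustrative).
part_a = pd.DataFrame({"AuthorID": ["AAA"], "City": ["NewYork_UnitedState"]})
part_b = pd.DataFrame({"AuthorID": ["BBB"], "City": ["NewYork_UnitedState"]})

combined = pd.concat([part_a, part_b])  # index is now [0, 0]

# to_dict() is keyed by index label, so the second row overwrites the first.
collapsed = [str(v) for v in combined.AuthorID.to_dict().values()]

# tolist() ignores the index and keeps every row.
kept = combined.AuthorID.tolist()

print(collapsed)  # ['BBB']
print(kept)       # ['AAA', 'BBB']
```

If this is what is happening, either `tolist()` in the list comprehension or `pd.concat(..., ignore_index=True)` when building `new_df` would avoid the collapse.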

Comment From: jreback

You will get more help on the mailing list and/or Stack Overflow.

Comment From: jreback

This is not a library issue, but rather a question of how to structure your code.

Comment From: jreback

You will get better responses with a much shorter, copy-pastable example.

You are reading from a CSV file which only you have.
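
As an illustration of what a copy-pastable example could look like, the sample rows can be embedded in the script so nobody needs the file. This is a sketch only: it assumes the data is whitespace-separated, and the rows are a subset of the sample posted in this issue.

```python
import io

import pandas as pd

# A few sample rows embedded as text (whitespace-separated layout is assumed).
SAMPLE = """AuthorID ArCity ArCountry DptCity DptCountry DateDpt DateAr
AAA Paris France NewYork UnitedState 2008-03-10 2001-02-02
CCC Paris France Beijing Japan 2008-03-10 2001-02-02
EEE Paris France London UK 2008-03-10 2001-02-02
"""

# StringIO lets read_csv treat the string like a file.
df = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+", parse_dates=["DateDpt", "DateAr"])

print(df.shape)  # (3, 7)
```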

Comment From: abdojulari

I have attached the sample csv file

AuthorID  ArCity  ArCountry  DptCity  DptCountry   DateDpt     DateAr
AAA       Paris   France     NewYork  UnitedState  2008-03-10  2001-02-02
BBB       Paris   France     NewYork  UnitedState  2008-03-10  2001-02-02
CCC       Paris   France     Beijing  Japan        2008-03-10  2001-02-02
DDD       Paris   France     NewYork  UnitedState  2008-03-10  2001-02-02
EEE       Paris   France     London   UK           2008-03-10  2001-02-02
FFF       Paris   France     NewYork  UnitedState  2008-03-10  2001-02-02
GGG       Paris   France     Beijing  Japan        2008-03-10  2001-02-02

I have posted it on Stack Overflow but got no response, and I am now behind schedule. Please, sir! This is the sample CSV file named testfile.cs

Comment From: abdojulari

Please assist Sir

Thank you so much for your time, sir. I only seek assistance to fix the code.

I have a CSV file with several IDs, each of which has a place of arrival, a place of departure, and a range of dates. I group by date and then look for any IDs that, on a particular date, share the same place of arrival and/or departure with any other IDs.

import pandas as pd
import datetime
import numpy as np
from pandas_datareader import data, wb
import csv
import json

df = pd.read_csv('testfile.csv')
df.DepartureDate = pd.to_datetime(df.DepartureDate)
df.ArrivalDate = pd.to_datetime(df.ArrivalDate)
df = df.set_index('ArrivalDate')

new_df = pd.DataFrame()
for i, data in df.iterrows():
    data = data.to_frame().transpose()
    data = data.reindex(pd.date_range(start=data.index[0], end=data.DepartureDate[0])).fillna(method='ffill').reset_index().rename(columns={'index': 'ArrivalDate'})
    new_df = pd.concat([new_df, data])

new_df = new_df[['ID', 'Arrival', 'Departure', 'ArrivalDate', 'DepartureDate']]
print (new_df)

json_dict = {}
for arrival_date, data in new_df.groupby('ArrivalDate'):
    matching_dates = data[data.DepartureDate==arrival_date]
    not_matching_dates = data[data.DepartureDate!=arrival_date]
    json_dict[arrival_date.strftime('%Y-%m-%d')] = {}
    if not matching_dates.empty:
        for city, flights in matching_dates.groupby('Arrival'):
            json_dict[arrival_date.strftime('%Y-%m-%d')][city] = [str(v) for v in flights.ID.to_dict().values()]
    if not not_matching_dates.empty:
        for city, flights in not_matching_dates.groupby('Departure'):
            json_dict[arrival_date.strftime('%Y-%m-%d')][city] = [str(v) for v in flights.ID.to_dict().values()]

with open('json_dict.json', 'w') as f:
    json.dump(json_dict, f, indent=4, sort_keys=True)

The desired output:

{
    "2001-02-02": {
        "Japan": [ "CCC", "GGG" ],
        "UK": [ "EEE" ],
        "UnitedState": [ "AAA", "BBB", "DDD", "FFF" ],
        "France": [ "AAA", "BBB", "CCC", "DDD", "EEE", "FFF" ]
    },
    .
    .
    .
}

The current output

{
    "2001-02-02": {
        "Japan": [ "GGG" ],
        "UK": [ "EEE" ],
        "UnitedState": [ "FFF" ]
    },
    "2001-02-03": {
        "Japan": [ "GGG" ],
        "UK": [ "EEE" ],
        "UnitedState": [ "FFF" ]
    },
    .
    .
    .
}
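
The grouping described in this comment (per date, which IDs share a place) could be sketched without the reindex step at all. This is a minimal sketch under one reading of the desired output, namely that each ID is listed under both its arrival and its departure place on every day from ArrivalDate to DepartureDate. The rows and the `by_day` name are illustrative; the column names follow the code in this comment.

```python
import json
from collections import defaultdict

import pandas as pd

# Illustrative rows only; column names follow the code in this comment.
df = pd.DataFrame({
    "ID": ["AAA", "BBB", "CCC"],
    "Arrival": ["France", "France", "France"],
    "Departure": ["UnitedState", "UnitedState", "Japan"],
    "ArrivalDate": pd.to_datetime(["2001-02-02"] * 3),
    "DepartureDate": pd.to_datetime(["2001-02-03"] * 3),
})

# Maps a date string to a place, and each place to a list of IDs.
by_day = defaultdict(lambda: defaultdict(list))
for row in df.itertuples(index=False):
    # Visit every day in the stay, both endpoints included.
    for day in pd.date_range(row.ArrivalDate, row.DepartureDate):
        key = day.strftime("%Y-%m-%d")
        by_day[key][row.Arrival].append(row.ID)
        by_day[key][row.Departure].append(row.ID)

print(json.dumps(by_day, indent=4, sort_keys=True))
```

Appending to plain lists sidesteps the DataFrame index entirely, so every ID is kept regardless of how the rows are numbered.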

Date: Fri, 29 Jan 2016 18:30:01 -0800 From: notifications@github.com To: pandas@noreply.github.com CC: tollycoast@hotmail.com Subject: Re: [pandas] Please I have serious issue with the CSV, Pandas, Dataframe and JSON, I need someone to assist. (#12183)


Comment From: jorisvandenbossche

@tollycoast As Jeff already said, could you please ask this question elsewhere, for example on Stack Overflow (http://stackoverflow.com/) or the mailing list (https://groups.google.com/forum/?fromgroups#!forum/pydata)? I personally recommend Stack Overflow.

It is not that we don't want to help you (we can answer on those other platforms), but we like to keep this issue tracker for bugs/enhancement requests for pandas.