How do I build a CSV with data returned from multiple API calls in Python?
I'd like to build a CSV file combining data from multiple API calls. I'm okay with basic Python, and I can call an API, extract the JSON data, and write that data to CSV. What I need is a way of efficiently merging the data so I can write the CSV out once, after all data extraction is finished.
This is what the data looks like straight from an API request:
{u'datetime': u'2011-03-28', u'value': u'2298'},
{u'datetime': u'2011-03-29', u'value': u'2322'},
{u'datetime': u'2011-03-30', u'value': u'2309'},
{u'datetime': u'2011-03-31', u'value': u'2224'},
{u'datetime': u'2011-04-01', u'value': u'2763'},
{u'datetime': u'2011-04-02', u'value': u'3543'},
So I'd be looking at merging lots of these together:
>apicall1
2011-03-28,2298
2011-03-29,2322
2011-03-30,2309
>apicall2
2011-03-28,432
2011-03-29,0
2011-03-30,444
Each API call result looks pretty much the same: a date and a value. The dates are formatted identically, so they are our common element.
For a given date, there may be no value returned, or a 0, so I need to be able to account for the case where there is no data.
The ideal output would be this:
2011-03-28,2298,432,23952,765,31
2011-03-29,2322,0,432353,766,31
2011-03-30,2309,444,2343923,0,32
2011-03-31,2224,489,3495,765,33
I have 15 calls to make, and each returns a response containing approximately 800 rows of data (800 days, essentially, growing by one row per day in the future). I need to run this a few times per day, so I'm concerned about efficiency to the degree that it grows larger. Unfortunately, the historical data can change, so I need to rebuild the whole list every time I run the command. However, the historical data changes infrequently and by a small percentage, so if there is efficiency to be had in updating the data incrementally, I'm open to that.
One option I know would work is writing the CSV file after the first API call, then re-opening the file and writing more data to it after each subsequent call (i.e., 15 separate reads and writes of the CSV per program execution). That doesn't sound very efficient to me.
Should I use SQLite in memory to build the data set and dump it out to CSV at the end? Would a list of lists be better? I'm not strong on SQL, although I know enough to be dangerous if that's the right way to go.
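For comparison with the SQLite idea, the merge can also be done with a plain dictionary keyed on date, filling a default of '0' for calls that have no row for that date. A rough sketch, using made-up stand-in data for the call results:

```python
import csv
import io

# Stand-in for the real API responses (made-up data); note the second
# call has no row for 2011-03-29 and the first has none for 2011-03-30.
calls = [
    [{'datetime': '2011-03-28', 'value': '2298'},
     {'datetime': '2011-03-29', 'value': '2322'}],
    [{'datetime': '2011-03-28', 'value': '432'},
     {'datetime': '2011-03-30', 'value': '444'}],
]

merged = {}  # date -> one value slot per call, defaulting to '0'
for i, call in enumerate(calls):
    for row in call:
        merged.setdefault(row['datetime'], ['0'] * len(calls))[i] = row['value']

# Write everything out in one pass once merging is done.
out = io.StringIO()
writer = csv.writer(out)
for date in sorted(merged):
    writer.writerow([date] + merged[date])
print(out.getvalue())
```

This keeps one pass per call and one write at the end, without bringing in SQL at all.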
You can use the pandas library for that.
import sys
import pandas as pd

# Simulation of the return values of the calls
calls = [
    [
        {u'datetime': u'2011-03-28', u'value': u'2298'},
        {u'datetime': u'2011-03-29', u'value': u'2322'},
        {u'datetime': u'2011-03-30', u'value': u'2309'},
    ],
    [
        {u'datetime': u'2011-03-28', u'value': u'28'},
        {u'datetime': u'2011-03-29', u'value': u'22'},
        {u'datetime': u'2011-03-30', u'value': u'09'},
    ]
]

# Create an initial empty data frame
df = pd.DataFrame()

# Make the consecutive calls
for i, call in enumerate(calls):
    # Create a new DataFrame from the data we got
    df_new = pd.DataFrame(call).set_index('datetime')
    # Rename the column to avoid a collision
    df_new.rename(columns={'value': 'value_%s' % i}, inplace=True)
    # Merge it into the current data frame
    df = pd.concat([df, df_new], axis=1)

# Save the data to a file (I'm using sys.stdout here,
# which prints to the console).
df.to_csv(sys.stdout, header=None)
Result:

2011-03-28,2298,28
2011-03-29,2322,22
2011-03-30,2309,09
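Regarding the dates with no data: pd.concat aligns on the index, so where one call has no row for a date it simply leaves NaN in that cell, and fillna(0) turns those into the zeroes you want. A quick sketch with made-up data where the two calls cover different date ranges:

```python
import pandas as pd

# Two calls covering different date ranges (made-up data)
a = pd.DataFrame([{'datetime': '2011-03-28', 'value': '2298'},
                  {'datetime': '2011-03-29', 'value': '2322'}]).set_index('datetime')
b = pd.DataFrame([{'datetime': '2011-03-29', 'value': '22'},
                  {'datetime': '2011-03-30', 'value': '44'}]).set_index('datetime')
a.columns = ['value_0']
b.columns = ['value_1']

# concat takes the union of the dates; fillna(0) handles the gaps
df = pd.concat([a, b], axis=1).fillna(0)
```

After this, 2011-03-28 has a 0 in value_1 and 2011-03-30 has a 0 in value_0, which matches the "no value or 0 returned" requirement.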