python - Pandas DataFrame Advanced Slicing
I am an R user and have found myself struggling a bit moving to Python, especially with the indexing capabilities of pandas.
household_id is the second column. I sorted the dataframe based on this column, then ran the following instructions, which return various results (that I would expect to be the same). Are these expressions the same? If so, why do I see different results?
In [63]: ground_truth.columns
Out[63]: Index([timestamp, household_id, ... (continues)

In [59]: ground_truth.ix[1107177,'household_id']
Out[59]: 2

In [60]: ground_truth.ix[1107177,1]
Out[60]: 2.0

In [61]: ground_truth.iloc[1107177,1]
Out[61]: 4.0

In [62]: ground_truth['household_id'][1107177]
Out[62]: 2
PS: I can't post the data, unfortunately (too big).
Note: when you sort by a column, you'll rearrange the index, and assuming it wasn't sorted that way to begin with, you'll have integer labels that don't equal their linear index in the array.
First, ix will first try integers as labels and then as indices, so it is immediate that 59 and 62 are the same. Second, if your index is not 0:n - 1, then 1107177 is a label, not an integer index, hence the difference between 60 and 61. As far as the float casting goes, it looks like you might be using an older version of pandas. This doesn't happen in git master.
Here are the docs on ix.
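To make the label versus position distinction concrete, here is a minimal sketch using .loc and .iloc, which spell out the two behaviours that ix mixed together (ix itself was later removed from pandas). The column names mirror the question, but the values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: column names follow the question, values are invented.
ground_truth = pd.DataFrame(
    {
        "timestamp": [10, 20, 30, 40],
        "household_id": [4, 2, 3, 1],
    }
)

# Sorting reorders the rows but each row keeps its original integer label,
# so the index is now [3, 1, 2, 0] instead of [0, 1, 2, 3].
ground_truth = ground_truth.sort_values("household_id")

label = 0

# Label-based lookups: both find the row whose index label is 0.
by_label = ground_truth.loc[label, "household_id"]
by_column_then_label = ground_truth["household_id"][label]

# Position-based lookup: the first row of the sorted frame, second column.
by_position = ground_truth.iloc[0, 1]

print(by_label, by_column_then_label, by_position)
```

The two label-based lookups agree with each other, while the positional one returns a different row, which is the same pattern as outputs 59/62 versus 61 in the question.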
Here's an example with a toy dataframe:
In [1]: df = DataFrame(randn(10, 3), columns=list('abc'))
        print df
        print
        print df.sort('a')

      a     b     c
0 -1.80 -0.28 -1.10
1 -0.58  1.00 -0.48
2 -2.50  1.59 -1.42
3 -1.00 -0.12 -0.93
4 -0.65  1.41  1.20
5  0.51  0.96  1.28
6 -0.28  0.13  1.59
7  1.28 -0.84  0.51
8  0.77 -1.26 -0.50
9 -0.59 -1.34 -1.06

      a     b     c
2 -2.50  1.59 -1.42
0 -1.80 -0.28 -1.10
3 -1.00 -0.12 -0.93
4 -0.65  1.41  1.20
9 -0.59 -1.34 -1.06
1 -0.58  1.00 -0.48
6 -0.28  0.13  1.59
5  0.51  0.96  1.28
8  0.77 -1.26 -0.50
7  1.28 -0.84  0.51
Notice that the sorted row indices are integers and they don't map to their locations.
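If you need to go from a label to its location in the sorted frame, something along these lines should work in current pandas. This is only a sketch: it uses sort_values (the replacement for the older df.sort), the seed and the variable names are arbitrary, and reset_index is just one way to make labels and positions line up again.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
df = pd.DataFrame(rng.standard_normal((10, 3)), columns=list("abc"))

# Modern spelling of df.sort('a'); rows move, labels travel with them.
sorted_df = df.sort_values("a")

# Label of the row that ended up first, and the position of that label.
label = sorted_df.index[0]
position = sorted_df.index.get_loc(label)

# The same cell, reached once by label and once by position.
same_cell = sorted_df.loc[label, "a"] == sorted_df.iloc[position, 0]

# Dropping the old labels renumbers the rows 0..n-1 in sorted order,
# so label and position agree again.
renumbered = sorted_df.reset_index(drop=True)

print(label, position, same_cell)
print(renumbered.index.tolist())
```

Whether to keep the original labels or renumber depends on whether they carry meaning; if they are just leftover row numbers, renumbering avoids the confusion in the question.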