Pandas DataFrames: Add Row Column Name (openPMD#1501)
By default, the row index (!= particle index) of a pandas
DataFrame has no name. This can be cumbersome for
exports, e.g., to CSV, where the index header field would
be empty.

This PR now names the index "row", because it is not a
(macro) particle id property.
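
The effect on a CSV export can be sketched with plain pandas. This is a minimal illustration, not code from the PR: the column names and values below are made up, and only `pd.DataFrame`, `DataFrame.to_csv`, and `Index.name` are used.

```python
import io

import pandas as pd

# Hypothetical particle data; column names and values are illustrative only.
columns = {
    "position_x": [0.1, 0.2, 0.3],
    "momentum_x": [1.0, 2.0, 3.0],
}
df = pd.DataFrame(columns)

# Without an index name, the first CSV header field is empty.
unnamed = io.StringIO()
df.to_csv(unnamed)
header_unnamed = unnamed.getvalue().splitlines()[0]

# Name the row index, as this commit does in particles_to_dataframe.
df.index.name = "row"
named = io.StringIO()
df.to_csv(named)
header_named = named.getvalue().splitlines()[0]

print(repr(header_unnamed))  # ',position_x,momentum_x'
print(repr(header_named))    # 'row,position_x,momentum_x'
```

The "row" column holds only the positional index of each row in the frame; as the commit message notes, it is not a particle id.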
ax3l authored Aug 9, 2023
1 parent 3c063e8 commit 358e0ab
Showing 2 changed files with 14 additions and 1 deletion.
7 changes: 7 additions & 0 deletions src/binding/python/openpmd_api/DaskDataFrame.py
@@ -87,4 +87,11 @@ def particles_to_daskdataframe(particle_species):
     ]
     df = dd.from_delayed(dfs)

+    # set a header for the first column (row index)
+    # note: this is NOT the particle id
+    # TODO both these do not work:
+    # https://github.com/dask/dask/issues/10440
+    # df.index.name = "row"
+    # df.index = df.index.rename("row")
+
     return df
8 changes: 7 additions & 1 deletion src/binding/python/openpmd_api/DataFrame.py
@@ -67,4 +67,10 @@ def particles_to_dataframe(particle_species, slice=None):
                 columns[column_name] = np.multiply(
                     columns[column_name], rc.unit_SI)

-    return pd.DataFrame(columns)
+    df = pd.DataFrame(columns)
+
+    # set a header for the first column (row index)
+    # note: this is NOT the particle id
+    df.index.name = "row"
+
+    return df
