This repo contains a decorator factory, memoize, that manages a local file cache of function results.
The cache is stored as a JSON file.
Install using pip:
python3 -m pip install git+https://github.com/ethho/memoize.git
By default, memoize stores its cache in /tmp/memoize/<date>.json, but this can be overridden by passing optional kwargs to the decorator factory.
import logging
from functools import lru_cache

from memoize import memoize

logger = logging.getLogger(__name__)

@lru_cache()  # Optionally, use with LRU cache to also cache in RAM
# All are optional kwargs
@memoize(stub='my_cache',  # file stub override
         cache_dir='/tmp/my_cache_dir',  # cache directory override
         log_func=logger.info,  # logging function override, print by default
         ignore_invalid=True)  # ignore cache if not JSON serializable
def my_func(s: str, b: bool = True, opt=None):
    return {"s": s, "b": b, "opt": opt}
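To make the caching behavior concrete, here is a minimal sketch of how a JSON file-backed decorator factory could work. It is not the code in this repo: the name json_file_cache, the key format, and the file layout are assumptions made for illustration only.

```python
import functools
import json
import tempfile
from datetime import date
from pathlib import Path
from typing import Optional


def json_file_cache(cache_dir: str, stub: Optional[str] = None):
    """Hypothetical decorator factory; NOT the memoize library's implementation."""
    def decorator(func):
        # One JSON file (date-stamped unless a stub is given) holds every call's result.
        name = stub or date.today().strftime("%Y%m%d")
        cache_fp = Path(cache_dir) / f"{name}.json"

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Build a string key from the function name and its arguments (assumed format).
            key = json.dumps([func.__name__, args, kwargs], sort_keys=True)
            # Read the whole cache file, or start empty if it does not exist yet.
            cache = json.loads(cache_fp.read_text()) if cache_fp.exists() else {}
            if key not in cache:
                cache[key] = func(*args, **kwargs)
                cache_fp.parent.mkdir(parents=True, exist_ok=True)
                # Write the whole cache file back to disk.
                cache_fp.write_text(json.dumps(cache))
            return cache[key]
        return wrapper
    return decorator


calls = []  # records each time the wrapped function body actually runs


@json_file_cache(cache_dir=tempfile.mkdtemp(), stub="demo")
def my_func(s: str, b: bool = True):
    calls.append(s)
    return {"s": s, "b": b}


first = my_func("a")
second = my_func("a")  # served from the JSON file; the body does not run again
```

Because every call in this sketch reads and rewrites the full file, the cost of a call grows with the size of the cache.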
The memoize_df decorator caches the pandas.DataFrame returned from a function to a CSV file.
The pandas module must be installed to use this feature:
python3 -m pip install pandas
The memoize_df decorator factory can be used for any function that returns a pandas.DataFrame.
While memoize stores the results of many calls in one cache file, memoize_df writes a separate cache file for each unique call.
Also note that the DataFrame index is written to the CSV cache if and only if the index has a non-null name attribute.
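The index rule can be illustrated with plain pandas. This is a sketch of the described behavior, not code taken from memoize_df; the helper to_csv_text is hypothetical.

```python
import pandas as pd


def to_csv_text(df: pd.DataFrame) -> str:
    # Assumed rule: include the index column only when the index has a name.
    return df.to_csv(index=df.index.name is not None)


named = pd.DataFrame({"my_column": [3, 2]})
named.index.name = "my_index"
unnamed = pd.DataFrame({"my_column": [3, 2]})

# First line of each CSV shows whether the index column was written.
named_header = to_csv_text(named).splitlines()[0]
unnamed_header = to_csv_text(unnamed).splitlines()[0]
```

Here named_header contains both my_index and my_column, while unnamed_header contains only my_column.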
import pandas as pd
from memoize.dataframe import memoize_df

@memoize_df(cache_dir='/tmp/memoize')
def make_dataframe(foo: int):
    df = pd.DataFrame(data=reversed(range(foo)), index=range(foo), columns=['my_column'])
    df.index.name = 'my_index'
    return df

print(make_dataframe(4))
# Using cache fp='/tmp/memoize/make_dataframe_44566a0_20230120.csv' to write results of function make_dataframe
#           my_column
# my_index
# 0                 3
# 1                 2
# 2                 1
# 3                 0

print(make_dataframe(3))
# Using cache fp='/tmp/memoize/make_dataframe_3c15101_20230120.csv' to write results of function make_dataframe
#           my_column
# my_index
# 0                 2
# 1                 1
# 2                 0

print(make_dataframe(4))
# Using cache fp='/tmp/memoize/make_dataframe_44566a0_20230120.csv' to write results of function make_dataframe
# Using cached call from /tmp/memoize/make_dataframe_44566a0_20230120.csv
#    my_index  my_column
# 0         0          3
# 1         1          2
# 2         2          1
# 3         3          0
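The file names in the log lines above appear to combine the function name, a short hash, and a date stamp. The sketch below shows one way such per-call names could be derived. The hashing scheme (SHA-1 of the JSON-encoded arguments) and the helper cache_file_name are assumptions, so the digests will not match the ones memoize_df produces.

```python
import hashlib
import json
from datetime import date


def cache_file_name(func_name: str, args: tuple, kwargs: dict, day: date) -> str:
    # Hash the JSON-encoded arguments; the real library's scheme may differ.
    payload = json.dumps([args, kwargs], sort_keys=True).encode()
    digest = hashlib.sha1(payload).hexdigest()[:7]
    return f"{func_name}_{digest}_{day:%Y%m%d}.csv"


day = date(2023, 1, 20)
name_4 = cache_file_name("make_dataframe", (4,), {}, day)
name_3 = cache_file_name("make_dataframe", (3,), {}, day)
name_4_again = cache_file_name("make_dataframe", (4,), {}, day)
```

Identical arguments on the same day map to the same file, while different arguments map to different files, which is what lets each unique call have its own CSV.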
License: MIT
Args, kwargs, and the function return value must be JSON-serializable when using the memoize decorator.
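A quick way to check whether a value meets this requirement before decorating a function is to try encoding it with the standard json module. The helper is_json_serializable below is a hypothetical convenience, not part of this package.

```python
import json


def is_json_serializable(value) -> bool:
    # json.dumps raises TypeError for unsupported types (e.g. sets, custom objects)
    # and ValueError for circular references.
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


ok = is_json_serializable({"s": "a", "b": True, "opt": None})
bad_set = is_json_serializable({1, 2})
bad_obj = is_json_serializable(object())
```

Dicts, lists, strings, numbers, booleans, and None pass this check; sets and arbitrary objects do not.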
The return value of the wrapped function must be a pandas.DataFrame when using the memoize_df decorator.
The entire contents of the date-stamped cache file are read and written on every function call, which may pose I/O challenges.