Feature Request
Problem
When inserting into a table that has a field result : attach@minio, the insert table method expects a file path. Similarly, fetch stores a file and returns a file path. This is often inconvenient, because (i) the data saved in the file is required as an object in the Python script being executed, and (ii) the saved / downloaded files remain on local storage even after the script has terminated.
Requirements
Possible solution: Introduce a parameter to insert that automatically saves the data to be inserted to a file, inserts it into the table, and then removes that file. Similarly, fetch could save the file and return the file's contents, loaded as an object, to the Python script.
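The save, insert, remove sequence described above can be sketched with a small context manager. This is only an illustration under assumptions: pickled_tempfile is a hypothetical helper, not part of DataJoint, and the commented insert1 call stands in for an insert into a table with an attach attribute.

```python
import os
import pickle
import tempfile
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def pickled_tempfile(obj: Any, suffix: str = ".pkl") -> Iterator[str]:
    """Pickle obj into a temporary file and yield its path.

    Hypothetical helper (not a DataJoint API): the file lives in a
    temporary directory that is deleted when the block exits.
    """
    with tempfile.TemporaryDirectory() as temp_dir:
        path = os.path.join(temp_dir, "payload" + suffix)
        with open(path, "wb") as f:
            pickle.dump(obj, f)
        yield path


result = {"weights": [0.1, 0.2, 0.3], "epochs": 5}
with pickled_tempfile(result) as path:
    # In real use, the path would be handed to the table here, e.g.
    # MyTable().insert1({"id": 1, "result": path})  # assumed table and key
    with open(path, "rb") as f:
        restored = pickle.load(f)
    existed_inside = os.path.isfile(path)

# The temporary directory, and the file in it, are gone at this point.
exists_after = os.path.isfile(path)
```

The insert has to happen inside the with block, because the file is removed as soon as the block exits.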
Justification
See problem section
Alternative Considerations
Currently I am using an AttachMixin as a workaround, i.e. my table would be defined as class MyTable(AttachMixin, dj.Computed). The mixin could be the code basis for the feature I suggested, although it would need a little bit of improvement.
    import os
    import pickle
    import tempfile
    import uuid
    from itertools import product
    from typing import Any, Dict, Iterable, List, Optional, Union

    import datajoint as dj
    import numpy as np


    class AttachMixin:
        def attach_insert(self, keys: Iterable[Dict[str, Any]], attach_keys: Iterable[str]) -> None:
            if not isinstance(attach_keys, list):
                raise ValueError("attach_keys must be a list")
            keys = list(keys)  # rows are indexed below, so materialize the iterable
            with tempfile.TemporaryDirectory(dir=os.environ.get("TMP", ".")) as temp_dir:
                for (i, key), ak in product(enumerate(keys), attach_keys):
                    # uuid4 stands in for the create_random_str() helper, which is not shown in this post
                    path = os.path.join(temp_dir, uuid.uuid4().hex + ".pkl")
                    with open(path, "wb") as f:
                        pickle.dump(key[ak], f)
                    keys[i][ak] = path  # replace the object with the path of its pickle
                # insert while the temporary files still exist
                self.insert(keys)

        def attach_insert1(self, key: Dict[str, Any], attach_keys: Iterable[str]) -> None:
            self.attach_insert([key], attach_keys)

        def attach_fetch(
            self,
            *attrs: str,
            key: Optional[Dict[str, Any]] = None,
            **kwargs,
        ) -> Union[Dict[str, Any], List]:
            key = key or {}
            with tempfile.TemporaryDirectory(dir=os.environ.get("TMP", ".")) as temp_dir:
                ret = (self & key).fetch(*attrs, download_path=temp_dir, **kwargs)  # array, list[dict]
                if isinstance(ret, dict):
                    ret = self._load_from_dict(ret)
                elif self._is_pkl_path(ret):
                    # checked before the Iterable branch, because a str is also an Iterable
                    with open(ret, "rb") as f:
                        ret = pickle.load(f)
                elif isinstance(ret, Iterable):
                    # object dtype so that loaded objects are not coerced to strings
                    ret = np.array(ret, dtype=object)
                    for i, value in enumerate(ret):
                        if isinstance(value, dict):
                            ret[i] = self._load_from_dict(value)
                        elif self._is_pkl_path(value):
                            with open(value, "rb") as f:
                                ret[i] = pickle.load(f)
                        else:
                            raise NotImplementedError(f"Value {value} is not a dict or a pkl path")
                else:
                    raise NotImplementedError(f"Return value {ret} is not a dict, Iterable, or a pkl path")
            return ret

        def attach_fetch1(
            self,
            *attrs: str,
            key: Optional[Dict[str, Any]] = None,
            **kwargs,
        ) -> Union[Dict[str, Any], List]:
            ret = self.attach_fetch(*attrs, key=key, **kwargs)
            if len(ret) > 1:
                raise dj.DataJointError(f"fetch1 should only return one tuple. {len(ret)} tuples were found")
            return ret[0]

        def _load_from_dict(self, d: Dict[str, str]) -> Dict[str, Any]:
            # replace every pickle path in the row with the object it contains
            for key, value in d.items():
                if self._is_pkl_path(value):
                    with open(value, "rb") as f:
                        d[key] = pickle.load(f)
            return d

        def _is_pkl_path(self, value):
            return (
                isinstance(value, str) and value.endswith(".pkl") and os.path.isfile(value)
            )
Related
These issues might be (loosely) related: #1109 #1099
If you think such a feature would be a helpful addition to DataJoint, I would be happy to help implement it.
I think you're suggesting some sort of user-provided functions that run on insert and on fetch for the attach type.
This is very much the idea of DataJoint's AttributeAdapter feature (see the DataJoint documentation).
With that feature, you can define a new DataJoint datatype (e.g. attach_pkl or something like that).
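A rough sketch of what such an adapter could look like follows. It is written as a plain class so that it runs without DataJoint installed; the assumption is that a real version would subclass dj.AttributeAdapter, which exposes an attribute_type together with put (called on insert) and get (called on fetch). The class name, the staging directory, and the cleanup strategy are illustrative choices, not something prescribed by DataJoint.

```python
import os
import pickle
import tempfile
import uuid
from typing import Any


class PickleAttachAdapter:
    """Sketch of an adapter that maps Python objects to pickle attachments.

    Assumption: in a real schema this class would subclass
    dj.AttributeAdapter. It is a plain class here so the example
    runs without DataJoint.
    """

    # Underlying DataJoint type; the store name "minio" is taken from the issue.
    attribute_type = "attach@minio"

    def __init__(self, staging_dir: str) -> None:
        # Directory for the intermediate files; the caller owns its cleanup.
        self.staging_dir = staging_dir

    def put(self, obj: Any) -> str:
        """Pickle obj to a file and return the path expected by the attach type."""
        path = os.path.join(self.staging_dir, uuid.uuid4().hex + ".pkl")
        with open(path, "wb") as f:
            pickle.dump(obj, f)
        return path

    def get(self, path: str) -> Any:
        """Load the object from a downloaded attachment, then delete the file."""
        with open(path, "rb") as f:
            obj = pickle.load(f)
        os.remove(path)
        return obj


# Round trip without a database: put() writes the file, get() loads and removes it.
with tempfile.TemporaryDirectory() as staging:
    adapter = PickleAttachAdapter(staging)
    stored_path = adapter.put({"score": 0.93})
    loaded = adapter.get(stored_path)
    removed = not os.path.exists(stored_path)
```

With DataJoint, an adapter instance named attach_pkl would presumably be referenced in the table definition as result : <attach_pkl>. Whether adapted types have to be enabled explicitly depends on the DataJoint version, so that is worth checking in the documentation. Note that put cannot delete the file itself, because the upload happens after it returns; that is why the sketch leaves cleanup of the staging directory to the caller.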