When we write widgets, we usually assume that their inputs do not change unless the widget receives a new input. If one widget modifies a table that is also an input to some other widget, we have a problem.
Orange data Tables store, among others, domain, X, Y, Z, weights, ids. All these are directly writable. We never prevent changing them, so we rely only on programmer discipline to avoid changing tables. Even within our group, we sometimes start changing tables when we should not. Who knows how many bugs are in Orange because of this.
Multithreading makes changes to Tables even more dangerous: even if some widget changes its input table only for computation and restores it when done, the table is invalid in between. We just saw a bug like this today with Vesna, and luckily we saw it before the merge.
So what could we do? We could keep relying on conventions. But now that the Table seems to be getting a pandas interface #5189, whose setters set both the domain and value arrays, we could see bugs even more often.
At the other extreme, we could go nuclear and hide values behind properties and not implement any setters. OK, we would implement them and immediately deprecate them for backward compatibility. But changing tables can sometimes be genuinely useful. I do not see a problem if someone changes a Table while it is still private, before it is sent out. For example, I often expand a table with .transform() and then fill it in afterward. Also, if we want to help users write code (either as widgets or as Python scripts), we should make creating tables friendlier, not harder.
We could go for something in the middle. We could put all elements behind properties and have some kind of a lock on setters. So setting elements could work at first, and then we could lock it. I do not know precisely how this locking should work, but just an idea: perhaps every time a table is sent into some other widget, its setters would get locked (of course, first with deprecation warnings). It would help us write more reliable code and would also warn Python Script users, who tend to write code like `in_data.X = X+42`.
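To make the lock idea concrete, here is a minimal sketch. It is not Orange's actual Table: the class `LockableTable`, its `lock()` method and the plain-list storage are all hypothetical, and the assumption is that something (for instance the signal manager) calls `lock()` when the table is sent to another widget. During the transition, a locked setter only warns instead of raising.

```python
import warnings


class LockableTable:
    """Hypothetical stand-in for a table; not Orange's real Table class."""

    def __init__(self, X):
        self._X = X
        self._locked = False

    def lock(self):
        # Assumption: called once the table has been sent to another widget.
        self._locked = True

    @property
    def X(self):
        return self._X

    @X.setter
    def X(self, value):
        if self._locked:
            # Transitional behaviour: warn now, raise in some later release.
            warnings.warn(
                "setting X on a table that has already been sent out",
                DeprecationWarning,
                stacklevel=2,
            )
        self._X = value


table = LockableTable([1.0, 2.0])
table.X = [3.0, 4.0]  # fine: the table is still private

table.lock()  # the table has now been "sent out"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    table.X = [x + 42 for x in table.X]  # still works, but warns
```

A setter lock like this only catches rebinding of the attribute. In-place changes such as `table.X[0] = 1` bypass the property entirely; for NumPy arrays that case would need something extra, for example clearing the array's `writeable` flag when the table is locked.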
What do you think we should do?