Table has public attributes we assume won't change (but can) #5303

markotoplak · 2021-03-01T19:29:19Z

When we write widgets, we usually assume that their inputs do not change unless the widget receives a new input. If one widget modifies the table that is also an input to some other widget, we would have problems.

Orange data Tables store, among other things, domain, X, Y, Z, weights, ids. All of these are directly writable. We never prevent changing them, so we rely only on programmer discipline to avoid changing tables. Even within our group, we sometimes start changing tables when we should not. Who knows how many bugs there are in Orange because of this.

Multithreading makes changes to Tables even more dangerous: even if some widget changes its input table only for computation and restores it when done, the table is invalid in between. We just saw a bug like this today with Vesna, and luckily we saw it before the merge.
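To make the "invalid in between" window concrete, here is a minimal sketch. It does not use Orange: the `shared` dict is a hypothetical stand-in for a table that two widgets hold, and the two `threading.Event` objects exist only to force the unlucky interleaving deterministically (in real code it would depend on timing).

```python
import threading

# Hypothetical stand-in for a table shared by two widgets (not Orange's Table).
shared = {"X": [1.0, 2.0, 3.0]}
modified = threading.Event()
observed = threading.Event()


def temporarily_scale(table):
    """Change the table for a computation, then restore it when done."""
    original = table["X"]
    table["X"] = [x * 10 for x in original]  # temporary change to shared data
    modified.set()
    observed.wait(timeout=5)  # the "computation" would run here
    table["X"] = original  # restored, but too late for a concurrent reader


worker = threading.Thread(target=temporarily_scale, args=(shared,))
worker.start()
modified.wait(timeout=5)
seen_in_between = list(shared["X"])  # what another widget reads meanwhile
observed.set()
worker.join()
seen_after = list(shared["X"])
```

The second reader gets the scaled values even though the table looks untouched once the worker finishes, which is why restoring the table afterward is not enough.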

So what could we do? We could keep relying on conventions. But now that the Table seems to be getting a pandas interface #5189, whose setters set both the domain and value arrays, we could see bugs even more often.

On the other extreme, we could go nuclear and hide values behind properties and not implement any setters. Ok, we would implement them and immediately deprecate them for backward compatibility. But changing tables can sometimes be genuinely useful. I do not see a problem if someone is changing a Table when it is still private, before it is sent out. For example, I often expand a table with .transform() and then fill it in afterward. Also, if we want to facilitate users in writing code (either as widgets or as Python scripts), we should make creating tables friendlier, not harder.

We could go for something in the middle. We could put all elements behind properties and have some kind of a lock on setters. So setting elements could work at first, and then we could lock it. I do not know precisely how this locking should work, but just an idea: perhaps every time a table is sent into some other widget, its setters would get locked (of course, first with deprecation warnings). It would help us write more reliable code and would also warn Python Script users, who tend to write code like in_data.X = X+42.
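A rough sketch of the "properties plus a lock on setters" idea, just to have something concrete to discuss. `LockableTable` and its `lock()` method are hypothetical names invented for this illustration; nothing here is Orange's actual API, and a real version would start with deprecation warnings rather than raising.

```python
class LockableTable:
    """Hypothetical stand-in for a table; not Orange's actual Table class."""

    def __init__(self, X):
        self._X = X
        self._locked = False

    @property
    def X(self):
        return self._X

    @X.setter
    def X(self, value):
        # Setting works while the table is private and fails once it is locked.
        if self._locked:
            raise RuntimeError("table is locked; copy it before modifying")
        self._X = value

    def lock(self):
        """Would be called, for example, when the table is sent to a widget."""
        self._locked = True


table = LockableTable([[1.0, 2.0], [3.0, 4.0]])
table.X = [[0.0, 0.0]]  # still private: allowed
table.lock()  # the table has been sent out
try:
    table.X = [[42.0]]  # the in_data.X = X+42 pattern
    rejected = False
except RuntimeError:
    rejected = True
```

Note that this guards only assignment to the attribute. In-place changes such as `table.X[0][0] = 1` would still go through; for numpy arrays, clearing the array's `writeable` flag is one way to cover that case as well.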

What do you think we should do?

janezd · 2021-04-02T07:43:26Z

@janezd will try to implement a context manager that unlocks a table.
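One possible shape for such a context manager, as a sketch only: the table is locked by default and `unlocked()` lifts the lock for the duration of a `with` block. The class name `SketchTable` and the method name `unlocked` are assumptions made for this illustration; the implementation that eventually lands may look different.

```python
from contextlib import contextmanager


class SketchTable:
    """Hypothetical table that is read-only unless explicitly unlocked."""

    def __init__(self, X):
        self._X = X
        self._locked = True

    @property
    def X(self):
        return self._X

    @X.setter
    def X(self, value):
        if self._locked:
            raise RuntimeError("table is locked; use 'with table.unlocked():'")
        self._X = value

    @contextmanager
    def unlocked(self):
        # Remember the previous state so that nested blocks restore correctly.
        previous = self._locked
        self._locked = False
        try:
            yield self
        finally:
            self._locked = previous


table = SketchTable([1.0, 2.0])
with table.unlocked():
    table.X = [3.0, 4.0]  # allowed inside the block

try:
    table.X = [5.0]  # outside the block the lock is back
    rejected = False
except RuntimeError:
    rejected = True
```

The `try`/`finally` ensures the lock is restored even if the code inside the block raises.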

janezd · 2021-10-08T19:15:51Z

Fixed via #5381.

markotoplak added the "bug report" and "needs discussion" labels and removed the "bug report" label Mar 1, 2021

markotoplak mentioned this issue Mar 1, 2021 in Better pandas integration #5189 (merged)

janezd removed the "needs discussion" label Apr 2, 2021

janezd self-assigned this Apr 2, 2021

janezd mentioned this issue Apr 2, 2021 in [ENH] Table lock: tests run with tables that are read-only by default #5381 (merged)

janezd closed this as completed Oct 8, 2021
