Can we remove `Variable.make` and solve the problems it is supposed to be solving in another way?
We were discussing with @janezd that we could try to automatically match variables when that is needed instead of reusing and changing existing objects.
For example, building a model on training data and then using it on new test data would be problematic without `make`. But the model needs to transform new data into its own domain (that of the training data) anyway, and could do that in a better way: map only variables with the same name, type and `compute_value` (so that a variable age with values 1-100 can't map to a normalized version with values 0-1). Discrete variables need special care: some `values` can be missing from the train or test data, in which case we need to remap the values.
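To make the proposal concrete, here is a rough sketch of the matching and remapping described above. It does not use Orange's classes: `Var`, `match_key`, `match_variables` and `remap_values` are hypothetical names invented for illustration, and `compute_value` is assumed to be hashable and comparable by equality.

```python
from dataclasses import dataclass
from typing import Hashable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Var:
    # Hypothetical stand-in for a variable descriptor, not Orange's Variable.
    name: str
    kind: str                               # e.g. "continuous" or "discrete"
    compute_value: Optional[Hashable] = None
    values: Tuple[str, ...] = ()            # only meaningful for discrete vars


def match_key(var: Var):
    # Variables match on name, type and compute_value. The list of values is
    # deliberately left out so that discrete variables can still be matched
    # when train and test data saw different values.
    return (var.name, var.kind, var.compute_value)


def match_variables(target: Sequence[Var],
                    source: Sequence[Var]) -> List[Optional[int]]:
    """For each target variable, return the index of the matching source
    variable, or None if the source has no match."""
    by_key = {match_key(v): i for i, v in enumerate(source)}
    return [by_key.get(match_key(v)) for v in target]


def remap_values(target: Var, source: Var) -> List[Optional[int]]:
    """Map each source value index to the index of the same value in target,
    or None if the target variable does not know that value."""
    position = {value: i for i, value in enumerate(target.values)}
    return [position.get(value) for value in source.values]


train = [Var("age", "continuous"), Var("sex", "discrete", values=("f", "m"))]
test = [
    Var("sex", "discrete", values=("m", "f", "x")),
    Var("age", "continuous", compute_value="normalize"),  # must not match
    Var("age", "continuous"),
]

columns = match_variables(train, test)
sex_mapping = remap_values(train[1], test[0])
```

With this key, the normalized `age` is skipped because its `compute_value` differs, and a value seen only in the test data maps to `None`, which a real implementation would presumably treat as missing.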
Before starting work on this, we need a better estimate of how many things would need to be changed. The main one is `Table.from_table(domain2, data1)`, which also covers the use of `transform`. Then there is `Table.get_column_view()` and the underlying `Domain` operations (`index`, `__getitem__`, etc).
We also need to take care that transformations (through their use of `compute_value`) still work.
See also the following (and other) issues:
#2500, #2346, #2943, #3521