[FIX] table_from_frame: replace nan with String.Unknown for string variable #5795
Issue
When a pandas data frame is transformed to a table, string columns that contain nan values
keep them after the transformation, but Orange uses an empty string for the unknown value in StringVariable.
For example:
df = pd.DataFrame(
[["a", "b"], ["c", "d"], ["e", "f"], [np.nan, np.nan]],
)
will be transformed to a table with two string variables, and the nan values will be kept even though
String.Unknown = ""
When a column is recognized as a string column and has an object dtype, it can still contain values that are not strings, so the column is cast to string.
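The intended handling of such a column can be sketched with plain pandas, outside of Orange. This is only an illustration under stated assumptions: `STRING_UNKNOWN` is a stand-in constant for the empty-string unknown mentioned above, and `column_to_strings` is a hypothetical helper, not the code in `table_from_frame`.

```python
import numpy as np
import pandas as pd

# Assumption: stand-in for the empty-string unknown described above;
# it is defined here and not imported from Orange.
STRING_UNKNOWN = ""


def column_to_strings(column: pd.Series) -> np.ndarray:
    """Hypothetical helper: cast every value to str, mapping missing values to STRING_UNKNOWN."""
    return np.array(
        [STRING_UNKNOWN if pd.isna(value) else str(value) for value in column],
        dtype=object,
    )


# Same shape as the example above, with one non-string value (1) mixed in.
df = pd.DataFrame(
    [["a", "b"], ["c", 1], ["e", "f"], [np.nan, np.nan]],
)
converted = {name: column_to_strings(df[name]) for name in df.columns}
```

With this sketch, the nan entries in both columns become empty strings and the integer `1` becomes the string `"1"`.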
Description of changes
nan values are now transformed to
String.Unknown
for columns that will be transformed to a string variable, and the remaining values are cast to strings.
Includes