Fix merged cells text getting duplicated when linearizing as plaintext
Belval committed May 23, 2024
1 parent 9df5d26 commit 6116503
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions textractor/entities/table.py
@@ -933,6 +933,7 @@ def get_text_and_words(
for cell in sorted(row, key=lambda c: c.col_index):
    # Siblings includes the current cell
    if cell.siblings:
        children = []
        first_row, first_col, last_row, last_col = cell._get_merged_cell_range()
        if (cell.col_index == first_col and cell.row_index == first_row) or config.table_duplicate_text_in_merged_cells:
            for sib in cell.siblings:
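
The hunk above is truncated and does not mark which line was added. Assuming the addition is the `children = []` reset, the following is a minimal, self-contained sketch of how a missing per-cell reset could duplicate merged-cell text. `Cell`, `linearize_row`, and `reset_per_cell` are hypothetical stand-ins for illustration, not textractor's API, and the merged range is approximated from the siblings rather than taken from `_get_merged_cell_range()`.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Cell:
    """Hypothetical stand-in for a table cell (not textractor's class)."""
    row_index: int
    col_index: int
    words: list
    siblings: list = field(default_factory=list)


def linearize_row(row, reset_per_cell=True):
    """Return one text string per cell in the row, ordered by column."""
    out = []
    children = []
    for cell in sorted(row, key=lambda c: c.col_index):
        if cell.siblings:
            if reset_per_cell:
                # Assumed fix: start from an empty list for every merged cell
                children = []
            # Approximate the top-left corner of the merged range
            first_row = min(s.row_index for s in cell.siblings)
            first_col = min(s.col_index for s in cell.siblings)
            if cell.col_index == first_col and cell.row_index == first_row:
                for sib in cell.siblings:
                    children.extend(sib.words)
            out.append(" ".join(children))
        else:
            out.append(" ".join(cell.words))
    return out


# Two cells merged horizontally; only the first one carries text.
a = Cell(1, 1, ["Total"])
b = Cell(1, 2, [])
a.siblings = b.siblings = [a, b]

print(linearize_row([a, b]))                        # ['Total', '']
print(linearize_row([a, b], reset_per_cell=False))  # ['Total', 'Total']
```

Without the reset, the second merged cell reuses the words collected for the first one, which matches the duplication described in the commit title.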
