I am trying to preprocess a huge (non-English) text dataset using the code in preprocess.ipynb provided in the repo. To do so, I split the large dataset into smaller chunks of roughly 1.26 GB each and then ran preprocessing on them. However, I am hitting errors (segmentation faults, etc.) and cannot complete preprocessing for all the chunks. Can anyone suggest anything regarding this?
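Not a fix for the segfault itself, but a pattern that often sidesteps memory-related crashes on large files: stream each chunk line by line instead of loading a full 1.26 GB chunk into memory at once. This is a minimal sketch, not the code from preprocess.ipynb; the function name `preprocess_stream` and the `transform` callback are illustrative placeholders for whatever per-line cleaning the notebook does.

```python
def preprocess_stream(in_path, out_path, transform, encoding="utf-8"):
    """Apply `transform` to each line of a large text file, writing results
    incrementally so only one line is held in memory at a time."""
    with open(in_path, encoding=encoding, errors="replace") as src, \
         open(out_path, "w", encoding=encoding) as dst:
        for line in src:  # lazy iteration: the file is never fully loaded
            dst.write(transform(line))


if __name__ == "__main__":
    import os
    import tempfile

    # Tiny demo: lowercase each line of a sample file.
    src = tempfile.NamedTemporaryFile(
        "w", delete=False, suffix=".txt", encoding="utf-8"
    )
    src.write("Hello World\nSECOND Line\n")
    src.close()
    out_path = src.name + ".out"

    preprocess_stream(src.name, out_path, str.lower)

    with open(out_path, encoding="utf-8") as f:
        print(f.read())

    os.unlink(src.name)
    os.unlink(out_path)
```

If the segfault comes from a native-extension tokenizer rather than memory pressure, splitting chunks further (or isolating the failing line by bisecting a chunk with a streaming loop like this) can at least localize the offending input.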