How do we build and use the slackarchive-import tool? #10

Open
paulwilton opened this issue May 6, 2018 · 13 comments

@paulwilton

Hi Remco, how do we build and use the archive importer?

@nl5887
Member

nl5887 commented May 6, 2018

I'll try to put together a Dockerfile for it, so that using the importer is just a matter of running the Docker image.

@nl5887
Member

nl5887 commented May 7, 2018

Added a Dockerfile to https://github.com/dutchcoders/slackarchive-import, including how to use it.

@paulwilton
Author

paulwilton commented May 7, 2018

Hi Remco, great stuff.
I have tried running it against an export. A couple of things to note:

  1. The tool imports from ./data/{team}, so I had to unzip the export into a ./data/{team} subdirectory.
  2. Using Docker, I needed to set the MongoDB and Elasticsearch endpoints in config.yaml to the slackarchive network names:
    http://{user}:{passwd}@mongodb:27017/slackarchive
    http://elasticsearch:9200

But I am getting this error on import:
2018/05/07 14:42:12 importChannel: json: cannot unmarshal string into Go struct field Channel.created of type slack.JSONTime

@noisymime

Same error as Paul here. The Team and Users load in, but it crashes out on the Channel.created field when trying to do the channels.

@paulwilton
Author

I can see in the code that the Channel.created field is commented out in the Channel model, though I'm not sure whether that is the issue:
https://github.com/dutchcoders/slackarchive-import/blob/af4814269fa1c1e9b7fb774c4a5fccd6058a4089/models/channel.go#L10

@paulwilton
Author

paulwilton commented May 9, 2018

Okay, I have debugged this (I am learning Go on the go :)
The problem is that the https://github.com/nlopes/slack library, which unmarshals channels.json into the Channel data structure, uses the slack.JSONTime type internally, and the JSON unmarshaller for that type expects the input value to be an integer, not a string.
However, the Slack export writes the Channel.created property as a string (a quoted integer Unix timestamp) instead of a literal integer.

e.g.
channels.json contains "created": "1457081364"
but the unmarshaller for slack.JSONTime expects "created": 1457081364
The same goes for the topic.last_set field: it also needs to be a literal integer in the JSON.

I don't know how to fix this in the code without writing a custom unmarshaller, as I think the error is in the nlopes/slack library.
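
For reference, a custom unmarshaller of the kind mentioned above could look roughly like the sketch below. It is not the code in nlopes/slack or in the importer: FlexTime, channel, and parseChannels are hypothetical names used only for illustration, and the struct covers just two fields. The idea is to accept the timestamp whether it arrives as a JSON number or as a quoted string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"time"
)

// FlexTime is a hypothetical stand-in for slack.JSONTime. It accepts a Unix
// timestamp written either as a JSON number or as a quoted string.
type FlexTime int64

// UnmarshalJSON implements json.Unmarshaler.
func (t *FlexTime) UnmarshalJSON(b []byte) error {
	s := string(b)
	if s == "null" {
		return nil // leave the zero value, as encoding/json does by convention
	}
	// If the exporter wrote the value as a string, strip the quotes.
	// strconv.Unquote fails on a bare number, in which case s is kept as-is.
	if unquoted, err := strconv.Unquote(s); err == nil {
		s = unquoted
	}
	v, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return fmt.Errorf("FlexTime: %w", err)
	}
	*t = FlexTime(v)
	return nil
}

// Time converts the timestamp to a time.Time.
func (t FlexTime) Time() time.Time { return time.Unix(int64(t), 0) }

// channel is a hypothetical, cut-down model of one entry in channels.json.
type channel struct {
	Name    string   `json:"name"`
	Created FlexTime `json:"created"`
}

// parseChannels decodes a channels.json payload.
func parseChannels(data []byte) ([]channel, error) {
	var chans []channel
	err := json.Unmarshal(data, &chans)
	return chans, err
}

func main() {
	// One entry as written by the export (quoted), one as the API returns it.
	data := []byte(`[{"name":"general","created":"1457081364"},{"name":"random","created":1457081364}]`)
	chans, err := parseChannels(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range chans {
		fmt.Println(c.Name, int64(c.Created), c.Created.Time().UTC().Format(time.RFC3339))
	}
}
```

Both entries decode to the same integer value, so the quoted form no longer triggers the "cannot unmarshal string" error for a field of this type.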

If I run a regex on the channels.json to replace
"created": "([0-9]+)" with "created": $1
and
"last_set": "([0-9]+)" with "last_set": $1
and then re-run the import, it all works.
I am not a Go expert; maybe @nl5887 can shed some light or has an easier fix.

@paulwilton
Author

Just to clarify, in case it wasn't clear, there is a workaround:

  1. run a regex on channels.json to replace
    "created": "([0-9]+)" with "created": $1
    and
    "last_set": "([0-9]+)" with "last_set": $1

  2. run the import command as documented
    I was able to import about 1m messages fairly quickly with no errors after this.

@noisymime

@paulwilton That's awesome, thanks for that! regex replace worked a treat for me on those 2 fields and my import has just finished without error :)

@paulwilton
Author

I guess the workaround could simply be coded into the importer: pre-process the channels.json file with those regexes.
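
A minimal sketch of that pre-processing step in Go, assuming the two regexes from the workaround are all that is needed. The function name unquoteTimestamps and the command-line handling are hypothetical, and this is not code from slackarchive-import. Run without arguments it prints a converted sample; given a path, it rewrites that file in place.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// quotedTimestamp matches "created": "123" and "last_set": "123",
// capturing the key name and the digits.
var quotedTimestamp = regexp.MustCompile(`"(created|last_set)":\s*"([0-9]+)"`)

// unquoteTimestamps rewrites quoted integer timestamps as literal integers.
func unquoteTimestamps(data []byte) []byte {
	return quotedTimestamp.ReplaceAll(data, []byte(`"$1": $2`))
}

func main() {
	if len(os.Args) < 2 {
		// No file given: show the effect on a small in-memory sample.
		sample := []byte(`{"created": "1457081364", "topic": {"last_set": "0"}}`)
		fmt.Println(string(unquoteTimestamps(sample)))
		return
	}
	// Rewrite the given file (for example data/{team}/channels.json) in place.
	path := os.Args[1]
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read:", err)
		os.Exit(1)
	}
	if err := os.WriteFile(path, unquoteTimestamps(data), 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "write:", err)
		os.Exit(1)
	}
}
```

Because the file is overwritten, it is worth keeping a copy of the original export before running something like this.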

@nl5887
Member

nl5887 commented May 10, 2018

I think this is something where there are differences between the export files and the models being used in the APIs. Just verified that dependencies are up to date, so that's not the issue. Let me see how I can fix this. Thanks for the thorough debugging and information, that helps a lot.

@nl5887
Member

nl5887 commented May 10, 2018

Created a new PR for this: nlopes/slack#314. I'll wait for that PR to be accepted, and then update the dependency in this repo.

@arduanov

@nl5887 Please fix PR

@aeron7

aeron7 commented Jul 23, 2018

@paulwilton @nl5887

I have installed slackarchive and it is running at http://51.15.224.203:8080

Then I cloned slackarchive-import, so the current folder structure looks like this:
/root/slackarchive-docker
/root/slackarchive-import

The Slack domain is unofficed.

So I extracted all the contents of the zip file into /root/slackarchive-import/data/unofficed/

I updated the token, then ran:

docker run --mount type=bind,source=$(pwd)/config.yaml,target="/config/config.yaml" --mount type=bind,source=$(pwd)/data/,target=$(pwd)/data1 --network slackarchive dutchcoders/slackarchive-import -- xoxb-token $(pwd)/data1

After a lot of struggle, I got this command to run. It prints a series of "user already exists" messages and then this error:
2018/07/23 19:02:29 importChannel: json: cannot unmarshal string into Go struct field Channel.created of type slack.JSONTime

But the JSON files are exactly as supplied by Slack. What did I do wrong? Has something changed in the Slack exports?
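
One way to check whether an export is affected by the quoted-timestamp problem described earlier in the thread is to look at the raw JSON type of each channel's created value. The sketch below assumes channels.json is a JSON array of channel objects with name and created keys; quotedCreated is a hypothetical helper, not part of the importer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// quotedCreated returns the names of channels whose "created" value is a
// JSON string rather than a JSON number.
func quotedCreated(data []byte) ([]string, error) {
	var chans []map[string]json.RawMessage
	if err := json.Unmarshal(data, &chans); err != nil {
		return nil, err
	}
	var names []string
	for _, c := range chans {
		raw := c["created"]
		// A raw JSON string value starts with a double quote.
		if len(raw) > 0 && raw[0] == '"' {
			var name string
			_ = json.Unmarshal(c["name"], &name) // best effort; empty if missing
			names = append(names, name)
		}
	}
	return names, nil
}

func main() {
	data := []byte(`[{"name":"general","created":"1457081364"},{"name":"random","created":1457081364}]`)
	names, err := quotedCreated(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("channels with quoted created:", names)
}
```

If any channel is reported, the export contains quoted timestamps and the regex workaround from earlier in the thread applies.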
