tarsnap: Pathname in pax header can't be converted to current locale. #366

Open
yazinsai opened this issue Sep 10, 2019 · 9 comments

@yazinsai

When I try to restore a file that I had previously backed up on Ubuntu 16.04 (running on Windows Subsystem for Linux) on my mac, I get the following error:

tarsnap: Pathname in pax header can't be converted to current locale.

The command I'm running is:

tarsnap -x -f archive filepath

--
System Version: macOS 10.14.6 (18G95)
Kernel Version: Darwin 18.7.0

@gperciva
Member

If you run

locale

what do you see? (My first guess is that your current locale doesn't support UTF-8, so the solution would be to add that. But let's see what your system says first.)
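One quick way to test that guess is to ask the locale utility for its character map instead of reading the variable names by eye. The following is a minimal sketch, not part of tarsnap: `charmap_is_utf8` is a hypothetical helper name, and the check assumes that `locale charmap` is available (it is a standard keyword of the POSIX `locale` utility).

```shell
# Hypothetical helper: succeeds when the given charmap name means UTF-8.
charmap_is_utf8() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask the system which character map the current locale uses.
# "|| true" keeps the script going if the locale utility is missing.
current=$(locale charmap 2>/dev/null || true)

if charmap_is_utf8 "$current"; then
    echo "current charmap is UTF-8"
else
    echo "current charmap is '${current:-unknown}', not UTF-8"
fi
```

A C or POSIX locale typically reports an ASCII charmap here, which cannot represent non-ASCII pathnames.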

@yazinsai
Author

Thanks for the quick response, output below:

~> locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

@gperciva
Member

Huh. Do you remember what locale (or what language) you used in Ubuntu? For example, was it set to ar_BH?

I'm still looking into this. For the record, tarsnap uses libarchive (which is also used by bsdtar or FreeBSD's tar), so solutions to "bsdtar Pathname in pax header" would plausibly be a solution to this.

@gperciva
Member

Oh, one more thing to try:

locale-gen

(or maybe with sudo, and maybe after checking /etc/locale.gen)

I'd hope that OSX would have run that automatically during installation / upgrade, but you never know...

@yazinsai
Author

I was able to run locale on Ubuntu:

> locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"       
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"       
LC_MONETARY="en_US.UTF-8"      
LC_MESSAGES="en_US.UTF-8"      
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"       
LC_TELEPHONE="en_US.UTF-8"     
LC_MEASUREMENT="en_US.UTF-8"   
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Back on my Mac, I tried running locale-gen but the command doesn't exist. sudo doesn't change things, and /etc/locale.gen doesn't exist either.
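Where locale-gen is not available, `locale -a` (a standard option of the `locale` utility) lists the locales that are installed, which is another way to confirm that a UTF-8 locale exists. The sketch below is only an illustration: `pick_utf8_locale` is a hypothetical helper that reads locale names on standard input and prints the first one whose name mentions UTF-8.

```shell
# Hypothetical helper: read locale names on stdin (for example from
# `locale -a`) and print the first one whose name mentions UTF-8.
pick_utf8_locale() {
    grep -i -E 'utf-?8' | head -n 1
}

# List the installed locales and report whether a UTF-8 one is present.
found=$(locale -a 2>/dev/null | pick_utf8_locale)

if [ -n "$found" ]; then
    echo "UTF-8 locale available: $found"
else
    echo "no UTF-8 locale found in 'locale -a' output"
fi
```

The name match is a heuristic; it says that a UTF-8 locale is installed, not that the current shell is using it.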

@gperciva
Member

Next idea: download the archive, then investigate with other tar tools.

I made an archive called kana (which contains a file with some Japanese characters), so substitute that for your own archive name. Then try:

$ ./tarsnap -rf kana > foo.tar
$ file foo.tar 
foo.tar: POSIX tar archive
$ tar -tf foo.tar 
kana/
tar: Ignoring unknown extended header keyword 'SCHILY.dev'
tar: Ignoring unknown extended header keyword 'SCHILY.ino'
tar: Ignoring unknown extended header keyword 'SCHILY.nlink'
kana/カナ
$

I expect the tar -tf foo.tar line to fail with the "Pathname in pax header" error, but the earlier lines should work. If by any chance it does work, please don't feel that you should post the actual filenames (in case they contain private information).

The "Ignoring unknown extended..." is a BSD tar vs. GNU tar thing, and can be safely ignored. (if you're curious, it's discussed here: https://superuser.com/questions/318809/linux-os-x-tar-incompatibility-tarballs-created-on-os-x-give-errors-when-unt )

The main thing I'm hoping to find out is whether you can get a tarball or not, because that should help to zoom in on the problem.

@yazinsai
Author

Strange, I tried restoring the file this morning after shutting down and restarting my machine, and it worked 🤔

I suspect I know the cause of the issue. I had previously set LC_CTYPE=C in my .bash_profile to resolve a separate issue I had with getting the tr -cd command to work on my Mac. I had set this generally (i.e. LC_CTYPE=C) and then later modified it to run only on the relevant command (LC_CTYPE=C tr -cd ..).

I thought that closing the shell and reopening would be enough to ensure that the LC_CTYPE was not overwritten (and indeed, the locale command I ran showed the default value of UTF8).
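The difference between the two forms can be sketched as follows. This is an illustration only, assuming the general setting was exported so that child processes inherit it: a prefix assignment such as `LC_CTYPE=C command` affects just that one command, while an exported variable is seen by every later command started from the shell (including tarsnap).

```shell
# Start from a known state for the demonstration.
unset LC_CTYPE

# Per-command form: only this one child process sees LC_CTYPE=C.
per_command=$(LC_CTYPE=C sh -c 'printf "%s" "${LC_CTYPE:-unset}"')

# The next command is unaffected by the prefix assignment above.
after=$(sh -c 'printf "%s" "${LC_CTYPE:-unset}"')

# General form: once exported, every later child process sees it.
export LC_CTYPE=C
exported=$(sh -c 'printf "%s" "${LC_CTYPE:-unset}"')

echo "per-command: $per_command, afterwards: $after, exported: $exported"

# Clean up so the demonstration does not leave LC_CTYPE=C behind.
unset LC_CTYPE
```

A shell that was started while the exported setting was still in place keeps that value in its environment until the variable is unset or the shell is restarted.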

Thank you for the help @gperciva -- if you'd like me to do any other tests to find the definitive cause (and possibly help future users), please let me know!

@gperciva
Member

Thanks for letting me know, and I'm relieved that you can access your files! :) I was starting to think of really far-fetched possibilities.

For the record, I can somewhat reproduce this problem on Linux:

$ ls kana
カナ  another
$ tarsnap -cf kana kana/
$ rm -rf kana
$ LANG=C tarsnap -xf kana
tarsnap: Pathname in pax header can't be converted to current locale.
tarsnap: Error exit delayed from previous errors.
$ ls kana
カナ  another
$

I appear to have all the data, but it definitely gave me a scary error message.

@yazinsai: if you try it again with LANG=C, do you also get the data despite the message?

I'll think about it a bit more, and either:

  • edit the error message to suggest checking the locale (or check the locale directly and warn if it doesn't include UTF-8)
  • add something to the website

One way or another, there should definitely be more info available to users about this situation.
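The "check the locale directly and warn" idea could look roughly like the sketch below. It is not tarsnap code: `effective_ctype` and `warn_if_not_utf8` are hypothetical names, the warning text is invented for illustration, and the UTF-8 test is a simple match on the locale name. The lookup order (LC_ALL, then LC_CTYPE, then LANG, ignoring empty values) follows the POSIX rules for locale environment variables.

```shell
# Hypothetical helper: print the value that decides the character-type
# locale. Arguments: $1 = LC_ALL, $2 = LC_CTYPE, $3 = LANG ("" if unset).
effective_ctype() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"
    else
        # Assumption: with nothing set, systems typically fall back to C/POSIX.
        printf 'C\n'
    fi
}

# Hypothetical helper: warn on stderr (and return 1) when the effective
# locale name does not mention UTF-8.
warn_if_not_utf8() {
    ctype=$(effective_ctype "$1" "$2" "$3")
    case "$ctype" in
        *[Uu][Tt][Ff]-8*|*[Uu][Tt][Ff]8*) return 0 ;;
    esac
    printf 'warning: locale "%s" does not look like UTF-8; ' "$ctype" >&2
    printf 'non-ASCII pathnames may fail to convert\n' >&2
    return 1
}

# Check the current environment before running an extraction.
warn_if_not_utf8 "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}" || true
```

With the LANG=C reproduction above, this check would print the warning; with LANG=en_US.UTF-8 and nothing else set, it stays silent.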

@gperciva
Member

Technical notes (mainly for myself) that might be useful at some point:
