function WriteTo return 'file does not exist' #510

Open
shimeng930 opened this issue Jun 1, 2022 · 14 comments

Comments

@shimeng930

```go
if f, err = s.client.Open(path.Join(remotePath, fileName)); err != nil {
	return err
}

if err = os.MkdirAll(localPath, os.ModePerm); err != nil {
	return err
}
if localFile, err = os.Create(path.Join(localPath, fileName)); err != nil {
	return err
}
if _, err := f.WriteTo(localFile); err != nil {
	return err
}
```

Opening the file succeeds, and I am sure the SFTP server has the file,
but the WriteTo call returns an error whose message is file does not exist.

@shimeng930
Author

shimeng930 commented Jun 1, 2022

Whether I use fstat or stat,
the error message I get is "pThe message [path/filename] is not extractable".

@puellanivis
Collaborator

Can you use fmt.Errorf("<unique prefix per return>: %w", err) on each return, to isolate which call produces the error?

These errors are also not coming from this package but from the server. You may be attempting to do something incompatible with the way the server works.

@shimeng930
Author

shimeng930 commented Jun 1, 2022

Yep, the error pThe message [path/filename] is not extractable comes from the server, and the sftp package just relays it.
Anyway, I have asked the third-party server's operators.
In the meantime, I want to know if anyone has met this kind of issue and knows what configuration I should set in code to fix it.

@puellanivis
Collaborator

puellanivis commented Jun 1, 2022

The other possibility is that the file is being deleted before the transfer. The WriteTo can sometimes trigger an attempt to determine the size of the file, which can make some remote systems think that the file has been completely read, after which they delete it. Check if https://pkg.go.dev/github.com/pkg/sftp#UseFstat helps you.

If it does (or doesn’t), another option is to implement the io.WriterTo behavior yourself with a simple Read into a buffer followed by a Write. It won’t be particularly fast, since each operation must complete a round-trip before the next Read is issued, but it has higher compatibility with these sorts of corner cases.

@sogyals429

Hi, I also seem to be facing the same issue. Oddly, if I connect to the SFTP server using the CLI, I can download the file locally without any issues. One thing I noticed: if I download and re-upload the same file to the server, the package works completely fine. P.S. No permissions were changed on the file.

@puellanivis
Collaborator

As I noted, IBM Sterling SFTP servers (and some other “high-security” SFTP server configurations) are known to have compatibility issues with our trying to figure out how big a file is before downloading it concurrently.

I’m starting to think that maybe we should build a “WriteTo” alternative which is built for maximum compatibility. 🤔

@shimeng930
Author

> The other possibility is that the file is being deleted before the transfer. The WriteTo can sometimes trigger an attempt to determine the size of the file, which can make some remote systems think that the file has been completely read, after which they delete it. Check if https://pkg.go.dev/github.com/pkg/sftp#UseFstat helps you.
>
> If it does (or doesn’t), another option is to implement the io.WriterTo behavior yourself with a simple Read into a buffer followed by a Write. It won’t be particularly fast, since each operation must complete a round-trip before the next Read is issued, but it has higher compatibility with these sorts of corner cases.

Yep.

  1. First, UseFstat did not help.
  2. Then I used a buffered read-and-write loop, and the download succeeded.

@shimeng930
Author

> As I noted, IBM Sterling SFTP servers (and some other “high-security” SFTP server configurations) are known to have compatibility issues with our trying to figure out how big a file is before downloading it concurrently.
>
> I’m starting to think that maybe we should build a “WriteTo” alternative which is built for maximum compatibility. 🤔

I think a “WriteTo” alternative is necessary.

@puellanivis
Collaborator

> I think a “WriteTo” alternative is necessary.

I mean, remember this won’t be automatically used by anything like io.Copy, so it’s not actually going to solve many of the most common issues. I’ve been putting off doing the client rewrite, but I think this is something I’ll want to keep in mind: we might want a much more compatible WriteTo process, even if it means we over-provision goroutines for smaller files…

@theGuen

theGuen commented Jul 11, 2022

Hi,
I guess I had the same problem with the disappearing files.
There are about 20 files with the same name, one for every day.
The fstat always deletes the newest, so you keep reading older ones until
only one is left... a total mess...

I have changed the WriteTo to preserve the concurrency.
More of an ugly solution... but it seems to work for my case:

```go
func (f *File) WriteTo(w io.Writer, fileInfo os.FileInfo) (written int64, err error) {
	f.mu.Lock()
	defer f.mu.Unlock()

	if f.c.disableConcurrentReads {
		return f.writeToSequential(w)
	}

	// For concurrency, we want to guess how many concurrent workers we should use.
	fileSize := uint64(fileInfo.Size())
	mode := uint32(fileInfo.Mode())

	if fileSize <= uint64(f.c.maxPacket) || !isRegular(mode) {
...
```

I take the fileInfo from ReadDir:

```go
fis, err := sftpClient.ReadDir("/Out")
var fileInfo os.FileInfo
for i, v := range fis {
	fmt.Println("LIST", i, v.Name(), v.ModTime(), v.Size())
	// filter the right fileName
	// in my case there are 20 files with the same name ???
	fileInfo = v
}
...
```

@puellanivis
Collaborator

The WriteTo function is part of an interface that must be complied with: https://pkg.go.dev/io#WriterTo. So we cannot change the function signature.

However, in my planned client re-rewrite, I think I should just ditch the whole “guess the best number of concurrent requests” approach entirely. Even if this causes small-file inefficiencies, it will be less of a pain than the regular occurrence of people’s frustrations at these only-read-once-for-security designs.

@theGuen

theGuen commented Jul 11, 2022

OK, that's right...
As I said, it was just a quick hack to make things work with this stupid server...
The concurrency should stay...
I just tested again:

```
sftp.UseConcurrentReads(true)
➜  testSFTP time go run testftp.go
go run testftp.go  0,39s user 0,53s system 19% cpu 4,613 total

sftp.UseConcurrentReads(false)
➜  testSFTP time go run testftp.go
go run testftp.go  0,46s user 0,66s system 4% cpu 24,885 total
```

And that was just an 18M file.

Maybe something like:

```go
// returns the first matched fileInfo
func (c *Client) ReadDirWithFilter(path string, fileFilter func(string) bool) (fi fs.FileInfo, found bool)
```

and

```go
func (c *Client) OpenWithFileInfo(path string, fileInfo fs.FileInfo) (*File, error)
```

with a File like:

```go
// File represents a remote file.
type File struct {
	c      *Client
	path   string
	handle string

	mu     sync.Mutex
	offset int64 // current offset within remote file

	fromFileInfo bool // indicates that the size is known
	size         uint64
	mode         fs.FileMode
}
```

path should maybe be the path without the filename for these functions...
I guess this would be optional to use and breaks nothing.

@theGuen

theGuen commented Jul 12, 2022

Just in case of any misunderstanding...
With "this stupid server" I was referring to the server that deletes files on an fstat and has the same filename more than once, so if you miss a file you have to read the newest first, then the older ones, to import them in reverse order... I would never have opened an issue here just to deal with a server like that...

This library is just great and has served me well over the last year. Thanks for that!

Since I have to deal with that server, I have now implemented FilterDir and OpenWithFileInfo in a minimally invasive way.
If you like, I can create a pull request... On the other hand, if you are rewriting the client, it's maybe not needed...

@puellanivis
Collaborator

Oh, yeah, I totally understood what you were referring to, and yeah, these servers are typically poorly designed to integrate with typical engineering practice… I was wanting to put in “security theater” but decided to avoid it. 😆

I’d probably say, I would prefer us to not pick up FilterDir, as with a proper look towards the future, we should target compatibility with io/fs.WalkDir.

If we add an OpenWithFileInfo exported function, then we’re going to be stuck with it indefinitely. We already have a complete duplicate set of exported error names, because we’re kind of stuck with what we’ve provided before. The central idea is “do not introduce any breaking changes without a major version number increase”.
