Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memfile stuff #561

Open
wants to merge 30 commits into
base: master
Choose a base branch
from
Open

Memfile stuff #561

wants to merge 30 commits into from

Conversation

zelch
Copy link

@zelch zelch commented Oct 31, 2023

There are roughly three sets of changes in this PR, all in the memfile backend.

The first set is using errors.Is to check error types, and this is primarily to make the linter happy. (What can I say, if I'm working on a file, it's much easier to just make it lint cleanly before I do anything else.)

The second is making Setstat more functional: Support for setting the mode and ctime, which involves tracking the file mode.

And the third is making it practical to use Setstat on a memfile backend directly, for the purpose of letting test frameworks do things like set the mode and ModTime on test files.

The commit message on 7bf7a25 spells out why I'm not especially happy with the solution I came up with.

It works, it's minimally invasive, and it just feels wrong.

I'm very open to alternate solutions here. A function to construct a Request object with attributes also feels somewhat messy, especially without defining a better structure format to use at some point.

Zeph / Liz Loss-Cutler-Hull added 5 commits October 30, 2023 17:11
This correctly handles wrapped errors as well.
This makes it possible for memfile to handle mode changes, ctime
changes, in addition to the already existing file size changes.
I'm really not especially happy with this.

It works, it's minimally invasive, and it feels like the wrong solution.

I'm pretty sure that the right solution is a breaking change of the
Request structure, to just flat out contain FileAttrFlags and FileStat,
having requestFromPacket do the parsing into those structures.

But that's an API change, and I doubt that we want a major version bump
for this.

As an alternative, we could have a new Request structure type, with that
layout, and have public functions for converting between them.

That would at least make it possible for someone to construct the
Request structure for a Setstat command.

The use case of this is being able to populate a memfile backed sftp
server with test data that has specific timestamps, as part of the test
framework of an internal program.

I'm definitely open to other ideas on how to solve this problem.
This would replace the SetStat public function from the last commit.

Instead, this makes it practical to construct a Setstat request.

From some quick testing, this appears to do the job correctly, and it
feels like a lot less of hack.

It still doesn't feel entirely right, but it's closer.
@zelch
Copy link
Author

zelch commented Oct 31, 2023

And I went ahead and wrote up an alternate option, with the badly named SetAttributes allowing us to construct a valid SetAttr Request object.

Again, suggestions on how this could be improved are more than welcome!

Copy link
Collaborator

@puellanivis puellanivis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As well as the comments inline, there are a few other cases where err is compared directly against os.ErrNotExist. It seems your linter is only catching err != os.ErrNotExist and not err == os.ErrNotExist?

Otherwise, this seems pretty good so far.

request-attrs.go Outdated Show resolved Hide resolved
request-attrs.go Outdated Show resolved Hide resolved
@@ -110,6 +110,7 @@ func (fs *root) openfile(pathname string, flags uint32) (*memFile, error) {

file := &memFile{
modtime: time.Now(),
mode: 0644,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not be defaulting this value. Please, pass it as a parameter, and document in the godoc that it is ignored if the creation flag is not set.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I had defaulted to the value that we previously hard coded. I'll take a crack at getting it from the initial request.

request-example.go Outdated Show resolved Hide resolved
request-example.go Outdated Show resolved Hide resolved
@@ -569,7 +582,7 @@ func (f *memFile) Mode() os.FileMode {
if f.symlink != "" {
return os.FileMode(0777) | os.ModeSymlink
}
return os.FileMode(0644)
return os.FileMode(f.mode)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function also now needs to that directories might have non-default modes.

request-interfaces.go Outdated Show resolved Hide resolved
request-attrs.go Outdated Show resolved Hide resolved
request-example.go Outdated Show resolved Hide resolved
request-example.go Outdated Show resolved Hide resolved
@zelch
Copy link
Author

zelch commented Oct 31, 2023

As well as the comments inline, there are a few other cases where err is compared directly against os.ErrNotExist. It seems your linter is only catching err != os.ErrNotExist and not err == os.ErrNotExist?

Otherwise, this seems pretty good so far.

Good catch, and while I'm at it, I should probably tackle all of the other cases of this as well, not just those in request-example.go.

Zeph / Liz Loss-Cutler-Hull added 6 commits October 31, 2023 20:08
This is in favor of an alternate approach, we can always pull it back
out of git history if we want to.
It will be replaced with an alternate method.
This should cover every single case where we were using == or != on an
err.

There may be other cases to address, but this covers a big one.
With us no longer offering an interface with SetStat, there is no reason
for this to be it's own function anymore.

This also cleaned up how we handle truncation errors.
For the sake of sanity, we really should be offering _some_ way for
users of memfile to get the file mode right.  These two are what we use
internally, so they make perfect sense to expose.
@zelch
Copy link
Author

zelch commented Nov 1, 2023

Hi @puellanivis, based on your feedback I redesigned the entire process. (And did a few other things.)

I'm still not entirely satisfied by the design, but I'm not sure that it's going to be possible to do better without causing an API break.

Zeph / Liz Loss-Cutler-Hull added 4 commits October 31, 2023 21:02
We now check to ensure that the permissions on a given file allow the
user to perform the requested operation.

For Filereader, the file must be readable.

For Filewriter, the file must be writable.

For OpenFile, the file must be readable and writable.
Missed this in the first pass.
request-example.go Outdated Show resolved Hide resolved
request-example.go Outdated Show resolved Hide resolved
request-example.go Outdated Show resolved Hide resolved
return err
}

// Note: fs.openfile does not support opening a directory.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Memfile is intended to be a stripped-feature POSIX compatible filesystem.

It turns out, this behavior is a bug. It just wasn’t apparent, because I wasn’t implementing permissions on the filesystem, and we do not stress test this package for POSIX compatibility.

Any feature expansion to support permission should comply with POSIX compatibility of permissions, and this includes directories having permissions, and the behavior of those permissions.

When files don’t have permissions, it does not violate the principle of least astonishment that directories do not have permissions as well. But if files have permissions, it does violate the principle of least astonishment that directories do not have permissions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is awkward, and I think that we might be better off documenting where we are not POSIX compliant.

Otherwise we're left checking the permissions of every directory under a given file. That wouldn't necessarily be the end of the world, but it definitely makes all the file open cases more complicated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fortunately, opening both files and directories goes through either fetch or lfetch.

And yeah. :( POSIX is really hard, but our testing code kind of relies upon it being as POSIX-compliant that we can get. Last I touched this, I basically had to do a complete overhaul to correctly support open flags properly. … 🤔 I think I might have avoided working on permissions and ownership precisely because things were getting out-of-hand, and it would require an even further restructure/refactor to more properly emulate inodes in their entirety.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fortunately, this is all still passing the testing infrastructure, so I'm still very inclined to just keep permissions on directories as 'unimplemented' for right this moment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s passing the testing infrastructure merely because we’re not sufficiently testing it. :blobsweat:

I think if we’re going to start implementing permissions for files, but not for directories, then we’re going to need to start documenting what is covered and what is not. Right now attempting to change permissions on anything won’t do anything, and users are likely to say, “oh… I guess no permissions?” (Somewhat typical situation because of Windows.) Or changing ownership yields, “oh, I guess no ownership.”

But “wait, but file permissions work, and dir permissions don’t?!” is going to lead to questions that we can answer with documentation. If you’re not willing to put in the effort of :blobsweat: implementing the full permissions system. (I don’t blame you. I clearly stopped short.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with implementing directory permissions is that I can't even remember the esoteric rules on what happens when one of read or execute is set without the other set, let alone the special bits. Especially since we don't even have a chdir at that level, just get of specific full paths. And I definitely recall that it gets weird when you have that be the case for an intermediate directory in the path, but you have full permissions for a subdirectory.

Only, now you've made me look at the POSIX specification with a migraine, ugh.

Writing test cases for this is going to be painful, but I'll take a look in my free time. Sadly, having gotten stuff working 'well enough' for the work application, it's harder to spend work time on the remaining bits.

request.go Outdated Show resolved Hide resolved
client.go Outdated Show resolved Hide resolved
request-example.go Outdated Show resolved Hide resolved
request-example.go Outdated Show resolved Hide resolved
request-example.go Outdated Show resolved Hide resolved
request.go Outdated Show resolved Hide resolved
Zeph / Liz Loss-Cutler-Hull and others added 10 commits November 1, 2023 18:23
FileStat.MarshalTo takes in flags, and generates the []byte that goes into
Request.Attr

FileAttrFlags.ForRequest() generates the uint32 bitmap that goes into
Request.Flags
As best as I can tell, under POSIX chmod will follow symlinks.

However I am definitely not sure if that is the case for common sftp
servers on POSIX filesystems, we should consider checking...  Some day.
Zeph / Liz Loss-Cutler-Hull added 3 commits November 1, 2023 19:22
This avoids logic duplication.
That is, 0oNNN instead of 0oNNNN, this is because we don't have any
current cases where we're using the 4th digit, and so things turned into
0o0200.
@zelch
Copy link
Author

zelch commented Nov 2, 2023

Alright, I think that this version is a lot cleaner, if not perfect.

As a matter of preference, do you want a PR without all of the history of the various different designs that were discarded, or would you prefer to leave it as a record of things not to do?

@urko-b
Copy link
Contributor

urko-b commented Nov 2, 2023

Hi, I don't know if this stuff I want to share is out of scope but I did an implementation of a in memory map[string]custom.Bytes struct which handles storing files in memory and uploading them to some storage/local etc...

THen I did another version which is writting files into temp directory. I used sftp.NewRequestServer()

func newWriteOnlyHandler(tempDir string) *writeOnlyHandler {
	return &writeOnlyHandler{
		tempDir: tempDir,
		files:   make(map[string]string, 0),
	}
}

type writeOnlyHandler struct {
	tempDir string
	files   map[string]string
}

func (ch *writeOnlyHandler) Filewrite(r *sftp.Request) (io.WriterAt, error) {
	localPath, ok := ch.files[r.Filepath]
	if !ok {
		localPath = filepath.Join(ch.tempDir, filepath.Base(r.Filepath))
		ch.files[r.Filepath] = localPath
	}
	file, err := os.OpenFile(localPath, os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0o666)
	if err != nil {
		return nil, err
	}
	// Do not need to close as we delegate this to sftp server
	return file, nil
}

var errOperationNotAllowed = errors.New("this operation is not allowed")

// Deny read requests since we only want to allow write (put) operations.
func (ch *writeOnlyHandler) Fileread(r *sftp.Request) (io.ReaderAt, error) {
	return nil, errOperationNotAllowed
}

// Deny read requests since we only want to allow write (put) operations.
func (ch *writeOnlyHandler) Filecmd(r *sftp.Request) error {
	return errOperationNotAllowed
}

// Deny read requests since we only want to allow write (put) operations.
func (ch *writeOnlyHandler) Filelist(r *sftp.Request) (sftp.ListerAt, error) {
	return nil, errOperationNotAllowed
}

Then I used it like this:

handlers := sftp.Handlers{
			FileGet:  s.writeonlyHdl,
			FilePut:  s.writeonlyHdl,
			FileCmd:  s.writeonlyHdl,
			FileList: s.writeonlyHdl,
		}
		reqServer := sftp.NewRequestServer(channel, handlers, sftp.WithStartDirectory(tempdir))

@puellanivis
Copy link
Collaborator

Thanks for your input @urko-b . I think our earlier versions of InMemoryHandler approached things in a very similar manner. The problem we ran into is that it was acting unpredictably from what expectations of how a POSIX filesystem should work, and things… uh… snowballed. As I think you can imagine.

Copy link
Collaborator

@puellanivis puellanivis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😭 Now, I think I know why I stopped before implementing permissions.

attrs.go Outdated Show resolved Hide resolved
b = marshalUint32(b, fs.Mtime)
}

if len(fs.Extended) > 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And… 🤦‍♀️ we run into another bug of design.

The problem here is that we’re going to need to Marshal flags itself based on if ls(fs.Extended) is set or not. 😐

This split implementation is really going to drive me nuts, and might just drive me mad enough to go ahead and just roll out v2…

@@ -54,31 +54,10 @@ func marshalFileInfo(b []byte, fi os.FileInfo) []byte {
// so that number of pairs equals extended_count

flags, fileStat := fileStatFromInfo(fi)
f := newFileAttrFlags(flags)

b = marshalUint32(b, flags)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See? Think about what if flags & sshFileXferAttrExtended == sshFileXferAttrExtended and len(fileStat.Extended) == 0.

We’re going to encode a corrupted packet. 🤦‍♀️

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thankfully, that's not actually possible, but why it's not possible is subtle:

    // if fi implements FileInfoExtendedData, retrieve extended data from it
    if fiExt, ok := fi.(FileInfoExtendedData); ok {
        fileStat.Extended = fiExt.Extended()
        if len(fileStat.Extended) > 0 {
            flags |= sshFileXferAttrExtended
        }
    }

That's from fileStatFromInfo.

So it's not possible to have sshFileXferAttrExtended set while len(fileStat.Extended) == 0.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for marshalFileInfo this might be true, but since FileStat.MarshalTo would be exported, more people can call it than just this code here, and the RequestServer code as well.

@urko-b
Copy link
Contributor

urko-b commented Nov 2, 2023

Thanks for your input @urko-b . I think our earlier versions of InMemoryHandler approached things in a very similar manner. The problem we ran into is that it was acting unpredictably from what expectations of how a POSIX filesystem should work, and things… uh… snowballed. As I think you can imagine.

OK, I'm not an expert about the POSIX, even have quite knowledge :D BTW this example I've shared is working good on ubuntu VPS server. So If you want me to add an example that is working I could prepare a PR

@zelch
Copy link
Author

zelch commented Nov 3, 2023

func (ch *writeOnlyHandler) Filewrite(r *sftp.Request) (io.WriterAt, error) {
	localPath, ok := ch.files[r.Filepath]
	if !ok {
		localPath = filepath.Join(ch.tempDir, filepath.Base(r.Filepath))
		ch.files[r.Filepath] = localPath
	}
	file, err := os.OpenFile(localPath, os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0o666)
	if err != nil {
		return nil, err
	}

Then I used it like this:

handlers := sftp.Handlers{
			FileGet:  s.writeonlyHdl,
			FilePut:  s.writeonlyHdl,
			FileCmd:  s.writeonlyHdl,
			FileList: s.writeonlyHdl,
		}
		reqServer := sftp.NewRequestServer(channel, handlers, sftp.WithStartDirectory(tempdir))

Hm, how are you avoiding directory traversal attacks?

That is, assuming that ch.tempDir is /home/sftp_user/tmp, and tempdir is /, what's to stop someone from doing a put to /../.bash_profile?

@puellanivis
Copy link
Collaborator

Again, thanks for the input @urko-b but there’s like so many caveats and gotchas that come into play here. Subtle loopholes of the standards, etc.

I would really rather not introduce yet another simplified implementation of an in-memory filesystem. We’re already struggling to cover the existing one, as can be demoed by the length of this PR.

@puellanivis
Copy link
Collaborator

Just checking back in. 🤔 I think the only hang up here was on if we wanted to also develop directory permissions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants