how to handle Btrfs multiple devices on the desktop #802

Open
cmurf opened this issue Sep 28, 2020 · 15 comments · May be fixed by #838

Comments

@cmurf

cmurf commented Sep 28, 2020

Background on this issue:
https://gitlab.gnome.org/GNOME/gvfs/-/issues/519
https://bugs.kde.org/show_bug.cgi?id=427092

Nautilus and Dolphin show a disk icon for each Btrfs member device, which confuses both users and udisks. Desktop environment consumers may not need physical device information at all, and may be better off not being aware of it. When the user clicks on the various device icons, each click creates another mount point, which is unintended and confusing.

udisksdump.txt

Instead, they need a way to handle subvolumes, perhaps as virtual device 'children'. (This may not be entirely different from LVM thin pool or Stratis pool as the parent, and its filesystems as children - if this metaphor holds - except in the level of detail.)

Filing this bug to facilitate awareness of the competing issues.

Related
#768
#88
libblockdev#244

@cmurf changed the title from "how to handle Btrfs multiple devices" to "how to handle Btrfs multiple devices on the desktop" on Sep 28, 2020
@vojtechtrefny
Member

Instead, they need a way to handle subvolumes

We have a separate btrfs plugin with "advanced" btrfs functionality:
http://storaged.org/doc/udisks2-api/latest/gdbus-org.freedesktop.UDisks2.Manager.BTRFS.html
http://storaged.org/doc/udisks2-api/latest/gdbus-org.freedesktop.UDisks2.Filesystem.BTRFS.html

@cmurf
Author

cmurf commented Sep 29, 2020

We have a separate btrfs plugin with "advanced" btrfs functionality:

OK cool!

@tbzatek
Member

tbzatek commented Sep 29, 2020

@cmurf, can you please attach the output of udevadm info --export-db too? I'm wondering whether there are any udev properties specific to btrfs multidisk volumes.

@tbzatek
Member

tbzatek commented Sep 29, 2020

(This may not be entirely different from LVM thin pool or Stratis pool as the parent, and its filesystems as children - if this metaphor holds - except in the level of detail.)

For the record, the root cause of these issues is that members of such a btrfs multidisk volume are detected as IdUsage: filesystem and are thus displayed in the GUI and offered for mounting. This is btrfs-specific behavior, and it creates confusion not only for the upper local storage management layers, but possibly also for sysadmins who work with CLI tools and are not fully aware of these specifics.

@vojtechtrefny
Member

vojtechtrefny commented Sep 29, 2020

UDev info for "multidisk" and "singledisk" volumes is the same. AFAICT the only way we can tell that two btrfs devices are part of the same volume is that they carry the same UUID.

$ udevadm info /dev/sde1
P: /devices/pci0000:00/0000:00:07.0/host9/target9:0:1/9:0:1:0/block/sde/sde1
N: sde1
L: 0
S: disk/by-path/pci-0000:00:07.0-scsi-0:0:1:0-part1
S: disk/by-uuid/d986fd44-ec55-4744-b0c0-4306dcc97cb0
S: disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi1-0-1-part1
S: disk/by-partuuid/0ef75796-01
E: DEVPATH=/devices/pci0000:00/0000:00:07.0/host9/target9:0:1/9:0:1:0/block/sde/sde1
E: DEVNAME=/dev/sde1
E: DEVTYPE=partition
E: PARTN=1
E: MAJOR=8
E: MINOR=65
E: SUBSYSTEM=block
E: USEC_INITIALIZED=9525266
E: ID_SCSI=1
E: ID_VENDOR=QEMU
E: ID_VENDOR_ENC=QEMU\x20\x20\x20\x20
E: ID_MODEL=QEMU_HARDDISK
E: ID_MODEL_ENC=QEMU\x20HARDDISK\x20\x20\x20
E: ID_REVISION=2.5+
E: ID_TYPE=disk
E: ID_SERIAL=0QEMU_QEMU_HARDDISK_drive-scsi1-0-1
E: ID_SERIAL_SHORT=drive-scsi1-0-1
E: ID_BUS=scsi
E: ID_PATH=pci-0000:00:07.0-scsi-0:0:1:0
E: ID_PATH_TAG=pci-0000_00_07_0-scsi-0_0_1_0
E: ID_PART_TABLE_UUID=0ef75796
E: ID_PART_TABLE_TYPE=dos
E: ID_FS_UUID=d986fd44-ec55-4744-b0c0-4306dcc97cb0
E: ID_FS_UUID_ENC=d986fd44-ec55-4744-b0c0-4306dcc97cb0
E: ID_FS_UUID_SUB=9d86ffc7-a2d4-4b6a-8763-545efb08b295
E: ID_FS_UUID_SUB_ENC=9d86ffc7-a2d4-4b6a-8763-545efb08b295
E: ID_FS_TYPE=btrfs
E: ID_FS_USAGE=filesystem
E: ID_PART_ENTRY_SCHEME=dos
E: ID_PART_ENTRY_UUID=0ef75796-01
E: ID_PART_ENTRY_TYPE=0x83
E: ID_PART_ENTRY_NUMBER=1
E: ID_PART_ENTRY_OFFSET=2048
E: ID_PART_ENTRY_SIZE=2095104
E: ID_PART_ENTRY_DISK=8:64
E: DM_MULTIPATH_DEVICE_PATH=0
E: ID_BTRFS_READY=1
E: DEVLINKS=/dev/disk/by-path/pci-0000:00:07.0-scsi-0:0:1:0-part1 /dev/disk/by-uuid/d986fd44-ec55-4744-b0c0-4306dcc97cb0 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi1-0-1-part1 /dev/disk/by-partuuid/0ef75796-01
E: TAGS=:systemd:
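
To illustrate the "same UUID" point, here is a rough Python sketch (not UDisks code) that groups btrfs devices by ID_FS_UUID from `udevadm info --export-db`-style text. The function names are made up for this example, and the second device (/dev/sdf1) and its ID_FS_UUID_SUB are invented; only the sde1 values come from the output above.

```python
from collections import defaultdict


def parse_udev_records(text):
    """Split export-db style text into one dict of E: properties per device."""
    records = []
    for block in text.strip().split("\n\n"):
        props = {}
        for line in block.splitlines():
            if line.startswith("E: ") and "=" in line:
                key, value = line[3:].split("=", 1)
                props[key] = value
        if props:
            records.append(props)
    return records


def group_btrfs_members(records):
    """Map each btrfs ID_FS_UUID to the sorted list of device nodes carrying it."""
    volumes = defaultdict(list)
    for props in records:
        if props.get("ID_FS_TYPE") == "btrfs" and "ID_FS_UUID" in props:
            volumes[props["ID_FS_UUID"]].append(props.get("DEVNAME", "?"))
    return {fs_uuid: sorted(devs) for fs_uuid, devs in volumes.items()}


# sde1 is trimmed from the output above; sdf1 is a hypothetical second member.
SAMPLE = """\
E: DEVNAME=/dev/sde1
E: ID_FS_UUID=d986fd44-ec55-4744-b0c0-4306dcc97cb0
E: ID_FS_UUID_SUB=9d86ffc7-a2d4-4b6a-8763-545efb08b295
E: ID_FS_TYPE=btrfs

E: DEVNAME=/dev/sdf1
E: ID_FS_UUID=d986fd44-ec55-4744-b0c0-4306dcc97cb0
E: ID_FS_UUID_SUB=00000000-0000-0000-0000-000000000001
E: ID_FS_TYPE=btrfs
"""

for fs_uuid, devs in group_btrfs_members(parse_udev_records(SAMPLE)).items():
    print(fs_uuid, devs)
```

Any UUID that maps to more than one device node is a multi-device volume; a single-device volume looks identical in udev until a second member shows up.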

@vojtechtrefny
Member

We can add some additional functions and/or properties to the btrfs plugin, but I don't see how we could add something helpful to the "core" UDisks API.

@vojtechtrefny
Member

It would be really helpful to have more information in the UDev database. btrfs-progs already ships a very simple UDev rule, so adding a btrfs filesystem show call to it and setting some btrfs-specific properties could be an option? @cmurf

@tbzatek
Member

tbzatek commented Sep 29, 2020

Opened kdave/btrfs-progs#302 requesting that at least some information be published in the udev db. I believe this kind of information should be provided at the right place first, as local storage management is a layered model. Only then can an upper layer like UDisks make use of it, to the benefit of all the layers built on top of it.

@cmurf
Author

cmurf commented Sep 29, 2020

udevadminfo.txt

@cmurf
Author

cmurf commented Oct 2, 2020

I mentioned this in gvfs#519 but forgot to mention it here: it seems that udisksd is being asked to mount by /dev node rather than by fs UUID. At least from the man page, I don't see a way to reference the fs UUID with udisksctl. The mount command can do it by label or uuid for any file system. I wonder if the most generic approach for mounting is to just always use label or uuid, no matter the file system.

Most interactions with btrfs file systems are mounting and post-mount operations. The only thing that really needs to understand the details of all the devices is udisksd itself, on behalf of a handful of sophisticated programs like partitioning agents. Maybe it'd be better if most of the time the majority of user agents are kept oblivious of the details, and just interact with either uuid/label and mount point?

@tbzatek
Member

tbzatek commented Oct 5, 2020

It's more complicated than that. The kernel and udev operate on major:minor block device nodes, and the /dev/disk/ symlinks are just different representations of the same object. Similarly, any reference to a filesystem via LABEL= or UUID= resolves to a device node. The new kernel mount API could possibly take slightly different approach, however this needs to be reflected in libmount public API.
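
As a rough illustration of "resolves to a device node", the Python sketch below follows the /dev/disk/ symlinks by hand. This is not how libmount does it internally (it has its own tag resolution and cache); `resolve_tag` and `TAG_DIRS` are names invented for this example, and label encoding (e.g. escaped spaces under by-label) is ignored.

```python
import os

# Assumed mapping from tag name to the udev symlink directory under /dev/disk.
TAG_DIRS = {"UUID": "by-uuid", "LABEL": "by-label", "PARTUUID": "by-partuuid"}


def resolve_tag(spec, disk_dir="/dev/disk"):
    """Resolve 'UUID=...' or 'LABEL=...' to a device node path.

    Returns the spec unchanged if it is not a known tag (e.g. a plain
    /dev path), or None if no matching symlink exists.
    """
    tag, sep, value = spec.partition("=")
    if not sep or tag not in TAG_DIRS:
        return spec
    link = os.path.join(disk_dir, TAG_DIRS[tag], value)
    if not os.path.exists(link):
        return None
    # The symlink points at exactly one node, whichever udev wrote last.
    return os.path.realpath(link)
```

With a multi-device btrfs volume, several nodes compete for the same by-uuid link, so a lookup like this returns whichever member currently owns the symlink.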

That said having duplicate filesystem identifiers present on different block devices is just wrong. Even for multipath, a single device is created (btrfs over multipath anyone?). The "universally unique identifier (UUID)" is immediately not unique anymore, causing udev to randomly overwrite the symlinks in /dev/disk/ that some libraries and tools rely on. When matching against the udev db, either the first or a random occurrence will get used, certainly not in a persistent order. That's why exposing more insight into the filesystem structure in the udev db is the crucial thing to solve first.

As a first step on UDisks side it will need to be made aware of duplicate filesystem identifiers and handle them gracefully to e.g. prevent multiple mounts, mount point cleanup conflicts, etc. Perhaps by just taking the first occurrence from a sorted list, which is reasonably stable within the daemon's lifespan, as described in https://gitlab.gnome.org/GNOME/gvfs/-/issues/519#note_921832. That will not fix the multiple object representation for the moment.
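
A minimal Python sketch of the "first occurrence from a sorted list" idea, under the assumption that the daemon already knows each device's filesystem UUID and the current mounts. This is not the actual UDisks implementation; the function names and the sample devices and mount point are hypothetical, and subvolume mounts are deliberately ignored here.

```python
def canonical_device(devices):
    """Pick one representative per filesystem UUID: the first name in sorted order.

    `devices` maps device node -> filesystem UUID.
    """
    chosen = {}
    for dev, fs_uuid in sorted(devices.items()):
        chosen.setdefault(fs_uuid, dev)
    return chosen


def existing_mount(device, devices, mounts):
    """Return a mount point already used by any member sharing `device`'s UUID.

    `mounts` maps device node -> mount point. Returns None if no member
    of the same filesystem is mounted.
    """
    fs_uuid = devices[device]
    for member, member_uuid in sorted(devices.items()):
        if member_uuid == fs_uuid and member in mounts:
            return mounts[member]
    return None


# Hypothetical state: sdb and sdc are members of one volume, sdd1 is separate.
devices = {"/dev/sdc": "uuid-A", "/dev/sdb": "uuid-A", "/dev/sdd1": "uuid-B"}
mounts = {"/dev/sdb": "/run/media/user/data"}

print(canonical_device(devices))
print(existing_mount("/dev/sdc", devices, mounts))
```

A mount request for /dev/sdc would then reuse the existing mount point instead of creating a second one. Plain string sorting is used for simplicity, so /dev/sda10 sorts before /dev/sda2; that only matters if a natural ordering is wanted.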

@cmurf
Author

cmurf commented Oct 5, 2020

The new kernel mount API could possibly take slightly different approach, however this needs to be reflected in libmount public API.

I was thinking of the clients, e.g. gvfs, file managers, open/save dialogs, udisksctl. Even GNOME Disks doesn't need to interact with literal block devices most of the time, such as when mounting the file system.

That said having duplicate filesystem identifiers present on different block devices is just wrong.

Why? It's the same for mdadm multiple devices:

/dev/vda3: UUID="05c30b48-4f9f-e3da-9489-5a6703287405" UUID_SUB="18ebc747-9949-489c-f896-a47a9cdced7c" LABEL="localhost-live:root" TYPE="linux_raid_member" PARTUUID="5ce570aa-cb25-4ee6-9f5c-3fc22d54b7af"
/dev/vdb1: UUID="05c30b48-4f9f-e3da-9489-5a6703287405" UUID_SUB="a372c360-1157-e88c-a1ca-3c0be19f4ddf" LABEL="localhost-live:root" TYPE="linux_raid_member" PARTUUID="cee7279a-7c63-4c6c-8c32-1194bd16e926"

RFC 4122 doesn't require that a UUID exist only once, only that it be unique at the time of creation. A collision only occurs if the same UUID is used for different referents; in both the mdadm and Btrfs cases, there's one referent. The same UUID with different UUID_SUB values seems to clearly indicate each individual constituent part of a whole.

In the mdadm case, udev seems to export udisks specific info.

E: UDISKS_MD_MEMBER_LEVEL=raid0
E: UDISKS_MD_MEMBER_DEVICES=2

Btrfs does have the number of devices in each device's superblock. That's easy for udev to get and expose to udisks, if that's what's needed. Member devices aren't per se raid; that isn't how it works on Btrfs. Instead, the 'raid level' is referred to as a 'profile', the profile applies per block group, and different block groups can have different profiles. This information isn't part of the superblock, but is stored in a btree.
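
For reference, a hedged Python sketch of pulling the device count out of a superblock. The offsets below (primary superblock at 64 KiB, fsid at 0x20, magic at 0x40, num_devices at 0x88, little-endian) are my reading of the btrfs on-disk format and should be checked against struct btrfs_super_block before relying on them; checksum validation is skipped and `read_btrfs_super` is an invented name.

```python
import struct
import uuid

SUPERBLOCK_OFFSET = 0x10000  # primary superblock at 64 KiB (assumed)
MAGIC = b"_BHRfS_M"


def read_btrfs_super(image):
    """Return (fsid, num_devices) from a bytes-like device image.

    Returns None if the image is too short or the magic does not match.
    The checksum is not verified in this sketch.
    """
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 0x90]
    if len(sb) < 0x90 or bytes(sb[0x40:0x48]) != MAGIC:
        return None
    fsid = uuid.UUID(bytes=bytes(sb[0x20:0x30]))
    (num_devices,) = struct.unpack_from("<Q", sb, 0x88)
    return fsid, num_devices
```

On a real system the input would be the first `SUPERBLOCK_OFFSET + 0x90` bytes read from the block device (which needs read permission on the node). A udev rule could publish the count as a property; the profile information would still need a tree lookup, as noted above.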

As a first step on UDisks side it will need to be made aware of duplicate filesystem identifiers and handle them gracefully to e.g. prevent multiple mounts, mount point cleanup conflicts, etc.

Allowing multiple mounts of the file system is needed to support explicitly mounting subvolumes. Such a layout has been used by Fedora for ~10 years, and is the default starting with Fedora 33, where subvol=home is mounted at /home and subvol=root is mounted at /. It's effectively a bind mount, except that the path can be resolved without it first being visible.

The thing to probably avoid is mounting the same subvolume multiple times, but this is something of an artifact or side effect of multiple /dev nodes being exposed in the GUI rather than one filesystem volume icon. Each icon is currently a /dev node, and we get a new mount every time the user clicks on one of the seemingly unmounted ones, even though the file system is already mounted. A related problem happens in GNOME Disks, where it shows 1 of 3 Btrfs devices as mounted and the other two as not mounted, even though they are all part of the same mounted filesystem.
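
To make the distinction concrete (different subvolumes mounted in several places is fine, the same subvolume mounted twice is the thing to avoid), here is a rough Python sketch that reads /proc/self/mountinfo-style text. It assumes the fourth field ("root") carries the subvolume path for btrfs mounts; note that bind mounts also change that field. The sample lines, device names and function names are invented.

```python
def parse_mountinfo(text):
    """Yield (root, mount_point, fstype, source) for each mountinfo line."""
    for line in text.splitlines():
        fields = line.split()
        try:
            # Optional fields end at a lone "-" separator after field 6.
            sep = fields.index("-", 6)
        except ValueError:
            continue
        yield fields[3], fields[4], fields[sep + 1], fields[sep + 2]


def subvolume_mounts(text, member_devices):
    """Map each mounted btrfs subvolume path to its mount points.

    Only mounts whose source is one of `member_devices` are considered.
    """
    result = {}
    for root, mount_point, fstype, source in parse_mountinfo(text):
        if fstype == "btrfs" and source in member_devices:
            result.setdefault(root, []).append(mount_point)
    return result


# Hypothetical mount table: /home is mounted twice, once via a GUI click.
SAMPLE = """\
30 1 0:27 /root / rw,relatime shared:1 - btrfs /dev/sda3 rw,subvol=/root
35 30 0:27 /home /home rw,relatime shared:2 - btrfs /dev/sda3 rw,subvol=/home
40 30 0:27 /home /run/media/user/x rw,relatime shared:3 - btrfs /dev/sda3 rw,subvol=/home
"""

for subvol, points in subvolume_mounts(SAMPLE, {"/dev/sda3", "/dev/sdb1"}).items():
    print(subvol, points)
```

Any subvolume with more than one mount point in the result is a candidate for the accidental duplicate described above.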

@tbzatek
Member

tbzatek commented Jan 26, 2021

Why? It's the same for mdadm multiple devices:

/dev/vda3: UUID="05c30b48-4f9f-e3da-9489-5a6703287405" UUID_SUB="18ebc747-9949-489c-f896-a47a9cdced7c" LABEL="localhost-live:root" TYPE="linux_raid_member" PARTUUID="5ce570aa-cb25-4ee6-9f5c-3fc22d54b7af"
/dev/vdb1: UUID="05c30b48-4f9f-e3da-9489-5a6703287405" UUID_SUB="a372c360-1157-e88c-a1ca-3c0be19f4ddf" LABEL="localhost-live:root" TYPE="linux_raid_member" PARTUUID="cee7279a-7c63-4c6c-8c32-1194bd16e926"

RFC 4122 doesn't require that a UUID exist only once, only that it be unique at the time of creation. A collision only occurs if the same UUID is used for different referents; in both the mdadm and Btrfs cases, there's one referent. The same UUID with different UUID_SUB values seems to clearly indicate each individual constituent part of a whole.

Yes, however the mdraid components carry the ID_FS_USAGE=raid udev attribute (even for legacy mdraid superblock versions) in contrast to btrfs multidisk volumes that carry ID_FS_USAGE=filesystem. It's the combination of the filesystem usability flag and duplicate UUID that causes the problem.
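
A small Python sketch of why the combination matters, assuming a consumer that simply offers every device with ID_FS_USAGE=filesystem for mounting. The property values for vda3, vdb1 and sde1 follow the outputs quoted in this thread; sdf1 and sdg1 are invented, as are the function names.

```python
from collections import Counter


def naive_mountable(devices):
    """Devices a consumer would offer if it only checks ID_FS_USAGE."""
    return sorted(
        name for name, props in devices.items()
        if props.get("ID_FS_USAGE") == "filesystem"
    )


def ambiguous_members(devices):
    """Mountable devices whose ID_FS_UUID is shared with another mountable device."""
    names = naive_mountable(devices)
    counts = Counter(devices[n].get("ID_FS_UUID") for n in names)
    return [n for n in names if counts[devices[n].get("ID_FS_UUID")] > 1]


MD_UUID = "05c30b48-4f9f-e3da-9489-5a6703287405"
BTRFS_UUID = "d986fd44-ec55-4744-b0c0-4306dcc97cb0"

devices = {
    "/dev/vda3": {"ID_FS_USAGE": "raid", "ID_FS_UUID": MD_UUID},
    "/dev/vdb1": {"ID_FS_USAGE": "raid", "ID_FS_UUID": MD_UUID},
    "/dev/sde1": {"ID_FS_USAGE": "filesystem", "ID_FS_UUID": BTRFS_UUID},
    "/dev/sdf1": {"ID_FS_USAGE": "filesystem", "ID_FS_UUID": BTRFS_UUID},  # invented member
    "/dev/sdg1": {"ID_FS_USAGE": "filesystem", "ID_FS_UUID": "11111111-2222-3333-4444-555555555555"},  # invented
}

print(naive_mountable(devices))
print(ambiguous_members(devices))
```

The mdraid members share a UUID too, but the usage check already filters them out; only the btrfs members pass the check and collide.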

In the mdadm case, udev seems to export udisks specific info.

E: UDISKS_MD_MEMBER_LEVEL=raid0
E: UDISKS_MD_MEMBER_DEVICES=2

These are our own rules that we ship. The right place for them would be the respective upstream projects, and that's what kdave/btrfs-progs#302 should be about for btrfs (I still need to follow up on that).

@tbzatek
Member

tbzatek commented Jan 26, 2021

Anyway, the basic support for multiple devices to avoid creating duplicate mounts is the #838 PR.

Let's deal with btrfs subvolumes in #768.

@cmurf
Author

cmurf commented Jan 26, 2021

It's the combination of the filesystem usability flag and duplicate UUID that causes the problem.

Would it help to have ID_FS_USAGE=btrfs? Or does that just make things more complicated? Never mind, this was answered in kdave/btrfs-progs#302.
