Improve vgscan behavior #2209

vgscan is run during node unstage to make sure datacache partitions are in a good state: https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/master/pkg/gce-pd-csi-driver/node.go#L672

However, vgscan can hang if certain devices are offline (e.g., a filesystem mounted from a loopback device backed by network storage).

It's possible that vgscan should only be run if the volume being unstaged is a datacache volume. I'm not sure, though: especially if we do something like time out the vgscan call, we may want to make sure it runs on the next unstage regardless of the volume type.

It may also be a good idea to time out the vgscan call. Since vgscan hangs in these cases, we'd have to run it in a goroutine and stop waiting after a timeout. These goroutines may accumulate, but hung vgscan processes are already accumulating in the system as well, so this probably isn't a big deal.

A third thing to consider is whether we can limit the devices that vgscan looks at, e.g., only /dev/sd* and maybe RAID devices (/dev/md*).
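One possible way to restrict the scan is LVM's `devices/filter` setting, passed through the common `--config` option. The sketch below only builds the arguments; `vgscanFilterArgs` is a hypothetical helper, the patterns are illustrative, and it assumes the node's LVM honors `devices/filter` (for instance, that a devices file is not in use). Whether a filter actually avoids the hang described above has not been verified.

```go
package main

import "strings"

// vgscanFilterArgs builds vgscan arguments that accept only devices matching
// the given regex patterns and reject everything else.
func vgscanFilterArgs(acceptPatterns []string) []string {
	entries := make([]string, 0, len(acceptPatterns)+1)
	for _, p := range acceptPatterns {
		// "a|<regex>|" marks an accept rule in LVM filter syntax.
		entries = append(entries, `"a|`+p+`|"`)
	}
	// The final reject-all rule drops any device not accepted above.
	entries = append(entries, `"r|.*|"`)

	cfg := "devices { filter = [ " + strings.Join(entries, ", ") + " ] }"
	return []string{"--config", cfg}
}
```

Calling `vgscanFilterArgs([]string{"^/dev/sd.*", "^/dev/md.*"})` yields arguments that could be passed to `exec.Command("vgscan", args...)`.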
