libmetal: lib: linux: improve Linux UIO-backed device open and test coverage#365
libmetal: lib: linux: improve Linux UIO-backed device open and test coverage#365bentheredonethat wants to merge 6 commits into
Conversation
6dbafca to
758f084
Compare
| void *output, int len); | ||
| int metal_linux_uio_validate_offset(const char *dev_name, | ||
| unsigned int index, | ||
| unsigned long offset); |
There was a problem hiding this comment.
doxygen documentation header
|
|
||
| int metal_device_open(const char *bus_name, const char *dev_name, | ||
| struct metal_device **device) | ||
| int metal_device_open_from_bus(const char *bus_name, const char *dev_name, |
There was a problem hiding this comment.
I do not understand the need to create a new API. It looks to me like this introduces confusion by duplicating APIs instead of making things more obvious.
Having a single system-agnostic API makes more sense to me.
|
Hi @arnopo Thanks for the review. I agree the Linux change is too large as posted, and I can split it into smaller commits before respinning. The main problem I am trying to solve is narrower than the current series makes it look. Linux userspace libmetal already opens platform/PCI devices through the existing Linux UIO-backed path. For demos and portable userspace applications, we would like to use a logical name such as "demo-ipi". On systems using UIO, that logical name can be exposed by the kernel as the UIO class name, The Linux libmetal backend can then resolve that UIO class name back to the real bus device: and continue through the existing Linux bus open/bind/map/IRQ/close flow. So the intended direction is not to make UIO mandatory for OpenAMP generally, and not to introduce a new Linux device model. It is only to let the existing Linux UIO backend accept the UIO class I also agree with your comment about the new public API. The UIO-name use case does not require adding The Linux backend would interpret the device argument as either:
Internally I plan to keep both identities clear, e.g. requested name / resolved bus device name / UIO name. For minimum behavior change, the returned For the respin, does the following direction sound acceptable?
If this direction makes sense, I will rework the series around that. |
What about using symbolic link for that, as proposed by @tnmysh in OpenAMP/openamp-system-reference#101.
Sound good. Thanks
|
|
Thanks @arnopo, this is a good point. Using a udev-created symlink such as:
For upstream libmetal, I would prefer to treat this as optional platform integration rather than a hard dependency, because not all deployments guarantee custom udev rules. So my plan is:
That keeps behavior portable out of the box while allowing integrators to use the symlink approach for faster/cleaner lookup. If you agree, I will document this in the commit message as “optional udev optimization, generic fallback preserved”. |
I would prefer that we handle this in the same way we manage /dev/rpmsgX or /sys/class/remoteproc/remoteprocX devices, rather than adding it to libmetal. I propose adding this PR to the agenda for the next OpenAMP meeting so we can discuss it further. |
758f084 to
e4bde50
Compare
|
Hi @arnopo @wmamills @tnmysh i have some updates here: Update: I pushed the revised libmetal changes to this PR. The current branch now fleshes out the Linux I also kept the fallback behavior in place: native platform-bus open remains the primary path for existing users, and the UIO class-name path is only used when callers explicitly request the I investigated the symlink-based lookup option as requested. I do not think symlinks are needed for this PR. The kernel already exposes the stable lookup key we need through So the PR now takes this approach:
|
e4bde50 to
68c9e7f
Compare
arnopo
left a comment
There was a problem hiding this comment.
@bentheredonethat
Sorry for the delay, please find some comment.
I need to past time on "lib: linux: add UIO bus open by class name" to understand your work. adding more comment would help me
| unsigned long offset, | ||
| metal_phys_addr_t *phys, | ||
| size_t *map_len, | ||
| size_t *region_size) |
There was a problem hiding this comment.
Define structures to avoid passing too many parameters.
| irqs[irq].arg = NULL; | ||
| metal_mutex_release(&irq_lock); | ||
|
|
||
| metal_linux_irq_notify(); |
There was a problem hiding this comment.
code above should not be done here you should only have a check
if (metal_linux_irq_is_enabled(irq)
return EINVAL;
=> the standard metal_irq_disable should be called before by the IRQ consumer
There was a problem hiding this comment.
ok will fix.
metal_linux_irq_unregister_dev() will only detach the Linux fd-to-device bookkeeping after the consumer has disabled the IRQ.
|
|
||
| dir = opendir(METAL_UIO_CLASS_PATH); | ||
| if (!dir) | ||
| return -errno; |
There was a problem hiding this comment.
Addressed by mapping missing /sys/class/uio to -ENODEV, while preserving other opendir() failures as -errno. This keeps “UIO class unavailable” consistent with the rest of Linux bus probing.
| } | ||
|
|
||
| int metal_linux_uio_validate_offset(const char *dev_name, | ||
| unsigned int index, |
There was a problem hiding this comment.
index seems useless as only use to print the error message,
There was a problem hiding this comment.
will remove
| error = ldrv->dev_open(lbus, ldev); | ||
| if (error) { | ||
| if (open_error == -ENODEV || error != -ENODEV) | ||
| open_error = error; |
There was a problem hiding this comment.
seems to me that it is not the good strategy .
- if ldrv->dev_open return an error, the device is probably not opened so no need to close it
- if the open fails, we should keep the error and close all devices aready opened
for_each_linux_driver(lbus, ldrv) {
/* Check if we have a viable driver. */
if (!ldrv->sdrv || !ldrv->dev_open)
continue;
/* Reset device data. */
memset(ldev, 0, sizeof(*ldev));
strncpy(ldev->dev_name, dev_name, sizeof(ldev->dev_name) - 1);
ldev->fd = -1;
ldev->ldrv = ldrv;
ldev->device.bus = bus;
/* Try and open the device. */
error = ldrv->dev_open(lbus, ldev);
if (error) {
goto close_dev;
}
*device = &ldev->device;
(*device)->name = ldev->dev_name;
metal_list_add_tail(&bus->devices, &(*device)->node);
return 0;
}
close_dev:
for_each_linux_driver(lbus, ldrv) {
ldev->ldrv = ldrv;
ldrv->dev_close(lbus, ldev);
metal_list_del(&ldev->device.node);
}
free(ldev);
return error;
There was a problem hiding this comment.
ok will do this - thanks
68c9e7f to
026d616
Compare
arnopo
left a comment
There was a problem hiding this comment.
Some of the commits are still not easy to review. Reverse engineering is required to understand it.
Adding more details on the algorithm you try to apply in the commit message could help
| void *raw, *virt; | ||
| int irq_info; | ||
|
|
||
| i = 0; |
There was a problem hiding this comment.
can be initialized when declared
| phys = &ldev->region_phys[ldev->device.num_regions]; | ||
| result = metal_uio_read_map_attr(ldev, i, "offset", &offset); | ||
| if (result) | ||
| break; |
There was a problem hiding this comment.
Why do you break for this one?
There was a problem hiding this comment.
will add comment
| } | ||
| result = metal_open(ldev->dev_path, 0); | ||
| if (result < 0) { | ||
| metal_log(METAL_LOG_ERROR, "failed to open device %s\n", |
There was a problem hiding this comment.
| metal_log(METAL_LOG_ERROR, "failed to open device %s\n", | |
| metal_log(METAL_LOG_ERROR, "failed to open device %s: %s\n", |
| /* | ||
| * /sys/class/uio is a class, not a bus. Register the synthetic bus only | ||
| * when the UIO class exists and skip normal bus/driver probing. | ||
| */ |
There was a problem hiding this comment.
Comment not clear to me . the sentence "/sys/class/uio is a class, not a bus" is confusing.
suggestion
/*
* Register the synthetic bus only when the /sys/class/uio
*class exists and skip normal bus/driver probing.
*/
| phys = &ldev->region_phys[ldev->device.num_regions]; | ||
| result = metal_uio_read_map_attr(ldev, i, "offset", &offset); | ||
| if (result) | ||
| break; |
There was a problem hiding this comment.
Error case to manage to properly leave this function here and below , closing /unregistering, freeing, ... thinks
| *newline = '\0'; | ||
|
|
||
| return 0; | ||
| } |
There was a problem hiding this comment.
suggestion ( can also be applyed for metal_uio_read_str_attr())
static int metal_linux_read_first_line(const char *path, char *output,
size_t output_len)
{
FILE *fp;
char *newline;
err = 0;
if (!path || !output || output_len < 2)
return -EINVAL;
fp = fopen(path, "r");
if (!fp)
return -errno;
if (!fgets(output, output_len, fp)) {
int err = ferror(fp) ? -errno : -ENODATA;
goto close_file;
}
newline = strchr(output, '\n');
if (newline)
*newline = '\0';
close_file:
fclose(fp);
return err;
}
There was a problem hiding this comment.
ok will fix
| } | ||
| found = true; | ||
|
|
||
| result = snprintf(ldev->cls_path, sizeof(ldev->cls_path), |
There was a problem hiding this comment.
not obvious what is cls_path , pleas clarify by comments or by describing ldev fields
|
|
||
| result = metal_uio_dev_bind(ldev, ldrv); | ||
| if (result) | ||
| return result; |
There was a problem hiding this comment.
error case to manage to free resources here and below
026d616 to
49932fa
Compare
|
@arnopo i split further the UIO rework so there are more comments, split the commits for easier review, and detail the algorithm used as well as the nits |
The Linux bus open path may try more than one backend driver for a device. When a backend finds the device but fails while opening it, the common open loop currently discards that errno and returns -ENODEV after all drivers have been tried. Keep the first useful backend open error, preferring non-ENODEV failures over a plain miss. This preserves the existing not-found result while letting callers see real failures such as UIO map population errors. Signed-off-by: Ben Levinsky <ben.levinsky@amd.com>
UIO map offsets identify the usable resource start inside the page-aligned mapping exposed by sysfs. The Linux backend previously exposed and unmapped the adjusted virtual address directly. Keep the raw mmap base and length for close, expose the usable virtual address as raw mapping plus offset, and derive the libmetal physical base and size from the usable portion of the UIO map. Use the sysfs map size as the mmap length. For an unaligned resource, UIO already reports a page-aligned address and a full mmap length, so adding the offset to that length can over-map the resource and fail. Reject offsets outside the system page size, reject offsets beyond the map size, and report overflow before attempting to mmap the region. Signed-off-by: Ben Levinsky <ben.levinsky@amd.com>
A UIO-backed device registers its file descriptor with the Linux IRQ controller so interrupt handling can find the owning metal device. Closing the device must clear that association before closing the fd. Add an internal unregister helper that detaches the device pointer after the IRQ consumer has disabled the IRQ. Keep IRQ handler and enable-state teardown owned by the standard IRQ disable and unregister paths. Signed-off-by: Ben Levinsky <ben.levinsky@amd.com>
Split the UIO open flow into two stages. The parent-bus path still opens the platform or PCI sysfs device, binds it to the selected UIO driver, finds the child UIO class device, and records the resolved class and /dev paths. Move the common stage into metal_uio_populate(). That helper waits for the /dev/uioX node, opens it, reads each UIO map, maps the full mmap extent, exposes the usable region after the sysfs offset, and registers IRQ bookkeeping when the UIO fd supports interrupts. Keep close-time cleanup unchanged by storing the raw mmap address and length alongside the adjusted libmetal I/O region. On populate failure, unmap any regions mapped so far and close the UIO fd locally before the generic open path releases parent sysfs and driver override state. Also make local error paths close the temporary UIO child list before returning. Signed-off-by: Ben Levinsky <ben.levinsky@amd.com>
Add the resolver used by the synthetic uio bus. It scans every /sys/class/uio/uioX/name file, compares the first line against the requested libmetal device name, and rejects duplicate matches because they cannot be opened deterministically. When a unique match is found, fill the same linux_device fields that the parent-bus UIO path fills: cls_path points at the UIO sysfs class directory, dev_path points at /dev/uioX, and the UIO name and device node name are saved for diagnostics and future callers. The class-name open callback then reuses metal_uio_populate(), so UIO class opens and parent-bus UIO opens share mmap setup, IRQ registration, DMA handling, and close-time cleanup. Signed-off-by: Ben Levinsky <ben.levinsky@amd.com>
Register a synthetic Linux uio bus so callers can use the existing
metal_device_open("uio", name, ...) API shape to open UIO devices by
the value exported in /sys/class/uio/uioX/name.
This bus is not backed by a sysfs bus directory or a probed kernel
driver handle. During Linux bus initialization, register it only when
/sys/class/uio exists, and skip the normal sysfs bus and driver probing
that platform and PCI devices require.
During device open, allow the synthetic uio driver to run its class-name
open callback without an sdrv handle. The callback resolves the UIO class
device and then uses the shared populate path added earlier, so the new
bus preserves the same mmap, IRQ, DMA, and close semantics as existing
UIO-backed platform and PCI opens.
Also make bus close tolerate the missing sysfs bus handle and copy the
requested device name with snprintf() so oversized names fail cleanly.
Signed-off-by: Ben Levinsky <ben.levinsky@amd.com>
49932fa to
187b01a
Compare
This series improves the Linux UIO-backed device-open flow in libmetal and
adds test coverage for the new API and Linux-specific helper paths.
The immediate motivation is to support Linux userspace applications that open
UIO-exposed devices through libmetal while keeping the existing bus/device
contract intact. In the current Linux implementation, the basic UIO path is
already present, but the backend is tightly coupled to bus probing, does not
cleanly separate resolved Linux device identities, does not unregister Linux
IRQ device state on close, and does not correctly retain the raw mapping
needed when a UIO map uses a non-zero offset.
This series addresses those gaps in three steps:
With these changes, downstream Linux host applications can continue to use
libmetal's device-open and IRQ registration model while relying on the
improved Linux UIO device handling in the library.