optoe: Add CMIS Bank support for transceivers with >8 lanes #473
ishidawataru wants to merge 8 commits into sonic-net:master
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
@prgeor PTAL
| - The default bank size is set to 4, and can be modified via the 'bank_size' sysfs entry. | ||
| - For 'optoe3', the 'write_max' value is updated to 2 to comply with CMIS requirements, | ||
| which mandate that both bank and page values be updated in a single WRITE operation. | ||
It’d be great if you could also add a paragraph to the commit message on how to verify/test this.
b380e95 to 28fa045
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
28fa045 to 4f10614
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
@prgeor I updated the patch based on our discussion. PTAL.
@ishidawataru Ack. Reviewing.
@ishidawataru can you define what you mean by
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
For example, if I also added a commit that explains this in the comment. |
| /* fundamental unit of addressing for EEPROM */ | ||
| #define OPTOE_PAGE_SIZE 128 | ||
| + | ||
| +#define OPTOE_DEFAULT_BANK_SIZE 0 |
@ishidawataru the default bank size is 0. Does that mean that for modules that support 8 lanes, the bank size = 1?
@ishidawataru This can probably be removed if we implement the full range of linear addresses for all bank pages. See my comments below.
@ishidawataru I am a bit confused by the _SIZE suffix on these names. From how the code uses them, it looks like they should be _COUNT?
If these constants are still necessary after the discussion below, I’ll rename them as you suggested.
| /* 0x80 places the offset in the top half, offset is last 7 bits */ | ||
| *offset = OPTOE_PAGE_SIZE + (*offset & 0x7f); | ||
| - | ||
| - return page; /* note also returning client and offset */ |
@ishidawataru The computation does not consider the full range of linear addresses for all banks. It's OK to use all page addresses for bank 0, i.e. assume that pages 0x10, 0x11, 0x12, ... 0xFF exist for Bank 0 as well.
So getaddr() can be implemented as follows:
def getaddr(self, page, offset, bank=0, page_size=128):
    return (bank * PAGES_PER_BANK + page) * page_size + offset
In your case you are not wasting the address space for OPTOE_NON_BANKED_PAGE_SIZE, but it's OK to lose that much address range to keep the user space implementation of getaddr() simple.
Here is a simple decoding of the above linear address that optoe needs to do:
static uint8_t optoe_translate_offset(struct optoe_data *optoe,
		loff_t *offset, struct i2c_client **client, uint8_t *bank)
{
	/* keep the incoming linear address; *offset is overwritten below */
	loff_t linear_address = *offset;
	unsigned int page = 0;

	*bank = linear_address / (PAGES_PER_BANK * PAGE_SIZE);
	page = ((linear_address / PAGE_SIZE) % PAGES_PER_BANK) - 1;
	*offset = PAGE_SIZE + linear_address % PAGE_SIZE;
	return page;
}
The assumption here is that the user application will provide only valid pages, and that the module will return an I/O error if an incorrect page address is selected.
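To make the proposed mapping concrete, here is a minimal user-space sketch of the encode/decode round trip. It is not code from the patch: it assumes `offset` is the upper-memory byte address (128..255), that every bank spans all 256 pages, and the names `linear_encode`, `linear_decode`, and `struct paged_addr` are hypothetical. Decoding subtracts the lower-memory page first so that the last page of a bank does not spill into the next bank.

```c
#include <stdint.h>

#define OPTOE_PAGE_SIZE 128u /* bytes per page, as in the driver */
#define PAGES_PER_BANK  256u /* assumption: full page range 0x00-0xFF per bank */

/* Hypothetical container for a decoded address. */
struct paged_addr {
	uint8_t bank;
	uint8_t page;
	uint8_t offset; /* 128..255: byte address within upper memory */
};

/* Encode (bank, page, offset) using the formula proposed in the review. */
static uint32_t linear_encode(uint8_t bank, uint8_t page, uint8_t offset)
{
	return ((uint32_t)bank * PAGES_PER_BANK + page) * OPTOE_PAGE_SIZE + offset;
}

/* Decode a linear address; only meaningful for linear >= OPTOE_PAGE_SIZE. */
static struct paged_addr linear_decode(uint32_t linear)
{
	uint32_t paged = linear - OPTOE_PAGE_SIZE; /* skip lower memory */
	struct paged_addr a;

	a.bank = (uint8_t)(paged / (PAGES_PER_BANK * OPTOE_PAGE_SIZE));
	a.page = (uint8_t)((paged / OPTOE_PAGE_SIZE) % PAGES_PER_BANK);
	a.offset = (uint8_t)(OPTOE_PAGE_SIZE + paged % OPTOE_PAGE_SIZE);
	return a;
}
```

Under these assumptions, bank 3, page 0x11, byte 200 encodes to 100680 and decodes back to the same triple.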
@prgeor This is the linear address map of the current implementation.
Are you suggesting to change it like below?
Returning I/O errors for the grayed-out area will prevent us from using cat on the EEPROM file as shown below.
admin@sonic:~$ hexdump /sys/class/i2c-dev/i2c-1/device/1-0050/eeprom
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
001e880
I suggest keeping the current implementation or returning the value from bank 0, rather than returning I/O errors.
What do you think?
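For reference, the end offset 001e880 in the hexdump above is consistent with a compact layout in which bank 0 exposes all 256 pages and each additional bank contributes only its banked pages. The sketch below reproduces that arithmetic. It is an inference, not code from the patch: it assumes 4 banks and 16 unbanked pages (0x00-0x0F), and `compact_eeprom_size` is a hypothetical name.

```c
#include <stdint.h>

#define OPTOE_PAGE_SIZE  128u /* bytes per page, as in the driver */
#define ARCH_PAGES       256u /* pages 0x00-0xFF */
#define NON_BANKED_PAGES 16u  /* assumption: pages 0x00-0x0F are not banked */

/* Size of the linear EEPROM file when unbanked pages are mapped only once. */
static uint32_t compact_eeprom_size(uint32_t banks)
{
	uint32_t pages = ARCH_PAGES; /* bank 0: every page */

	if (banks > 1)
		pages += (banks - 1) * (ARCH_PAGES - NON_BANKED_PAGES);

	/* one page of lower memory plus all upper-memory pages */
	return OPTOE_PAGE_SIZE + pages * OPTOE_PAGE_SIZE;
}
```

With 4 banks this gives 128 + (256 + 3 * 240) * 128 = 125056 bytes, which is 0x1e880.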
@ishidawataru I am not sure hexdump output will be user friendly when the user has to scroll through so many pages across multiple banks. Alternatively, one can use userspace tools like sfputil to dump only the relevant page from a particular bank.
| + * For CMIS transceivers that support Banked Pages, access to these pages | ||
| + * is also supported. To access the banked pages, set the number of banks | ||
| + * to access via the `bank_size` sysfs entry. | ||
| + * By default, `bank_size` is set to 0, which disables this feature. |
@ishidawataru We probably don't need this. See my comments above.
| + * For just a Page Index change (mapping another Page in the current Bank), | ||
| + * or for mapping an arbitrary unbanked Page to Upper Memory, a host may WRITE only the PageSelect Byte. | ||
| + */ | ||
| + if (bank > 0) { |
@ishidawataru would it be good to have a sanity check that validates the bank value against what the module supports (Page 01h, Byte 142)?
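A sanity check along these lines could look like the sketch below. It is only an illustration: it assumes the two low bits of Page 01h, Byte 142 encode the number of supported banks as a power of two (1, 2, 4, or 8), which should be confirmed against the CMIS revision in use, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: bits 1-0 of Page 01h, Byte 142 advertise the supported banks
 * as 0 -> 1 bank, 1 -> 2 banks, 2 -> 4 banks, 3 -> 8 banks.
 */
static unsigned int banks_from_advert(uint8_t byte142)
{
	return 1u << (byte142 & 0x03);
}

/* Reject a bank index that the module does not advertise. */
static bool bank_is_valid(uint8_t bank, uint8_t byte142)
{
	return bank < banks_from_advert(byte142);
}
```

The driver would need to read and cache the advertisement byte (and refresh it when a module is swapped) before it could apply such a check.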
| if (ret < 0) { | ||
| dev_err(&client->dev, | ||
| - "Restore page register to 0 failed:%d!\n", ret); | ||
| + "Restore bank, page register to (0, 0) failed:%d!\n", ret); |
If this restore fails, the module's bank/page registers remain pointing to the last accessed bank/page (not reset to 0). If the next access in optoe_eeprom_update_client is intended for bank/page 0, it will skip the optoe_eeprom_write (because of the `if page/bank > 0` check) and actually access the wrong page/bank.
If the restore fails, we shouldn't skip the optoe_eeprom_write.
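The behaviour requested here can be sketched as a small piece of state that remembers a failed restore. This is an illustration under assumed names (`struct select_state`, `must_write_select`, `note_restore_result`), not the patch's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device state. */
struct select_state {
	bool restore_failed; /* true if the last restore to (0, 0) failed */
};

/* Decide whether the bank/page select registers must be written first. */
static bool must_write_select(const struct select_state *st,
			      uint8_t bank, uint8_t page)
{
	/* after a failed restore the registers are in an unknown state */
	if (st->restore_failed)
		return true;

	/* otherwise (0, 0) is already selected and the write can be skipped */
	return bank > 0 || page > 0;
}

/* Record the outcome of a restore attempt (ret < 0 means failure). */
static void note_restore_result(struct select_state *st, int ret)
{
	st->restore_failed = (ret < 0);
}
```

With this shape, an access to bank 0, page 0 skips the select write only while the previous restore is known to have succeeded.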
| + return count; | ||
| +} | ||
| + | ||
| +static ssize_t set_bank_size(struct device *dev, |
@ishidawataru do we need this? What is the use case? I thought the number of banks is already advertised, so the optoe driver can read the advertisement and cache the value. Are you considering a case where the module is swapped?
Signed-off-by: Wataru Ishida <wataru.ishid@gmail.com>
1b2fb60 to 053afe8
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Pull request overview
This pull request adds CMIS Bank support to the optoe3 driver to enable access to CMIS transceivers with more than 8 lanes (e.g., OSFP-XD, CPO OEs). The implementation adds a configurable bank_size sysfs interface that allows extending the addressable EEPROM space beyond the default single bank (256 pages) to support up to 8 banks, mapping them into a linear address space.
Changes:
- Adds bank register handling logic to the optoe driver for CMIS transceivers (optoe3 device class)
- Implements `bank_size` sysfs attribute to enable/configure bank support (defaults to 0 = disabled)
- Updates address translation logic to compute bank, page, and offset from linear address space
- Modifies page/bank register restoration logic to handle bank select register writes
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| patches-sonic/series | Adds the new bank support patch to the patch series, positioned correctly after existing optoe patches |
| patches-sonic/driver-support-optoe-bank-support.patch | Complete implementation of CMIS bank support including address translation, register handling, sysfs interface, and memory size calculations |
| + | ||
| + /* Indicates if page restore has failed. | ||
| + * If true, the driver doesn't skip writing to page select register | ||
| + * even for acesses to page 0. |
Spelling error: "acesses" should be "accesses".
| + * even for acesses to page 0. | |
| + * even for accesses to page 0. |
| + struct optoe_data *optoe = i2c_get_clientdata(client); | ||
| + unsigned int bank_size; | ||
| + | ||
| + // setting bank size is only supported for the CMIS device |
Use C-style comment syntax /* */ instead of C++ style //. Kernel code should use C-style comments for inline code comments. The comment should be: /* setting bank size is only supported for the CMIS device */
| + // setting bank size is only supported for the CMIS device | |
| + /* setting bank size is only supported for the CMIS device */ |
| + * Memory layout: multiple banks, each containing 256 pages of 128 bytes. | ||
| + */ | ||
| + loff_t offset_in_paged_area = *offset - OPTOE_PAGE_SIZE; | ||
| + const size_t bytes_per_bank = OPTOE_ARCH_PAGES * OPTOE_PAGE_SIZE; // 256 * 128 = 32KB |
Use C-style comment syntax /* */ instead of C++ style //. Kernel code should use C-style comments for inline code comments. The comment should be: /* 256 * 128 = 32KB */
| + const size_t bytes_per_bank = OPTOE_ARCH_PAGES * OPTOE_PAGE_SIZE; // 256 * 128 = 32KB | |
| + const size_t bytes_per_bank = OPTOE_ARCH_PAGES * OPTOE_PAGE_SIZE; /* 256 * 128 = 32KB */ |
Signed-off-by: Wataru Ishida <wataru.ishid@gmail.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Signed-off-by: Wataru Ishida <wataru.ishid@gmail.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).



This patch adds CMIS Bank support to the 'optoe3' device class in order
to enable access to CMIS transceivers with more than 8 lanes (e.g., OSFP-XD, CPO OEs).
The default bank size is set to 4, and can be modified via the 'bank_size' sysfs entry. For 'optoe3', the 'write_max' value is updated to 2 to comply with CMIS requirements, which mandate that both bank and page values be updated in a single WRITE operation.
Updated the behavior as below after discussing with @prgeor offline
automatically updated to 2 to comply with CMIS requirements,
which mandate that both bank and page values be updated in a single WRITE operation.
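As an illustration of why a 2-byte write is needed, the sketch below builds the payload for a combined bank/page select. It assumes the CMIS lower-memory layout with BankSelect at byte 126 (0x7E) and PageSelect at byte 127 (0x7F). `build_select_write` is a hypothetical helper, not a function from the patch, and the page-only path assumes the bank register is already 0.

```c
#include <stddef.h>
#include <stdint.h>

#define BANK_SELECT_REG 0x7E /* CMIS lower memory, byte 126 */
#define PAGE_SELECT_REG 0x7F /* CMIS lower memory, byte 127 */

/*
 * Fill buf with the register address followed by the data bytes and
 * return the number of bytes to send in one I2C WRITE.
 */
static size_t build_select_write(uint8_t bank, uint8_t page, uint8_t buf[3])
{
	if (bank > 0) {
		/* bank and page must change together: one 2-byte data write */
		buf[0] = BANK_SELECT_REG;
		buf[1] = bank;
		buf[2] = page;
		return 3;
	}

	/* page-only change within bank 0: write just the PageSelect byte */
	buf[0] = PAGE_SELECT_REG;
	buf[1] = page;
	return 2;
}
```

Selecting bank 3, page 0x11 yields the bytes 0x7E, 0x03, 0x11, so the data portion is 2 bytes long, which is what a 'write_max' of 2 allows in a single transfer.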
Only tested with the SONiC VM + i2c-stub.
Snip of dmesg. See bank:3 is accessed.
Not tested with real hardware yet.