Hi, smp_discover 1.62 20190124 from smp-utils 0.99-1 has a bug when the -p/--phy options is used; -n/--num is also broken. The main problem is that the -p option is handled incorrectly, e.g. smp_discover -p 5 not only starts at phy 5 but also skips the last 5 phys. For instance, on a system with 30 phys on expander-1:1: # smp_discover -m /dev/bsg/expander-1:1 # This works fine and is only shown to showcase the bugs phy 0:S:attached:[500605b009574bd1:07 i(SSP+STP+SMP)] 6 Gbps phy 1:S:attached:[500605b009574bd1:06 i(SSP+STP+SMP)] 6 Gbps phy 2:S:attached:[500605b009574bd1:05 i(SSP+STP+SMP)] 6 Gbps phy 3:S:attached:[500605b009574bd1:04 i(SSP+STP+SMP)] 6 Gbps phy 4:U:attached:[0000000000000000:00] phy 5:U:attached:[0000000000000000:00] phy 6:U:attached:[0000000000000000:00] phy 7:U:attached:[0000000000000000:00] phy 8:D:disabled phy 9:D:disabled phy 10:D:disabled phy 11:D:disabled phy 12:U:attached:[5000cca23c0b38a5:00 t(SSP)] 6 Gbps phy 13:U:attached:[5000cca23c0e74fd:00 t(SSP)] 6 Gbps phy 14:U:attached:[5000cca098215285:00 t(SSP)] 6 Gbps phy 15:U:attached:[5000cca25552e5f9:00 t(SSP)] 6 Gbps phy 16:U:attached:[5000cca2710aab6d:00 t(SSP)] 6 Gbps phy 17:U:attached:[5000cca23c0e75b9:00 t(SSP)] 6 Gbps phy 18:U:attached:[5000cca23c0ebaad:00 t(SSP)] 6 Gbps phy 19:U:attached:[5000cca23c0f9fe5:00 t(SSP)] 6 Gbps phy 20:U:attached:[5000cca232695239:00 t(SSP)] 6 Gbps phy 21:U:attached:[5000cca23c0dd615:00 t(SSP)] 6 Gbps phy 22:U:attached:[5000cca232717d11:00 t(SSP)] 6 Gbps phy 23:U:attached:[5000cca23c10e5c9:00 t(SSP)] 6 Gbps phy 24:D:disabled phy 25:D:disabled phy 26:D:disabled phy 27:D:disabled phy 28:D:attached:[5003048017874b3d:00 V i(SSP+SMP) t(SSP)] 6 Gbps phy 29:D:attached:[0000000000000000:00] # smp_discover -m /dev/bsg/expander-1:1 -p 10 # Not only are the first 10 phys skipped, the last 10 are skipped too phy 10:D:disabled phy 11:D:disabled phy 12:U:attached:[5000cca23c0b38a5:00 t(SSP)] 6 Gbps phy 13:U:attached:[5000cca23c0e74fd:00 t(SSP)] 6 Gbps phy 14:U:attached:[5000cca098215285:00 t(SSP)] 6 Gbps phy 15:U:attached:[5000cca25552e5f9:00 t(SSP)] 6 Gbps phy 16:U:attached:[5000cca2710aab6d:00 t(SSP)] 6 Gbps phy 17:U:attached:[5000cca23c0e75b9:00 t(SSP)] 6 Gbps phy 18:U:attached:[5000cca23c0ebaad:00 t(SSP)] 6 Gbps phy 19:U:attached:[5000cca23c0f9fe5:00 t(SSP)] 6 Gbps # smp_discover -m /dev/bsg/expander-1:1 -p 10 -n 12 # Instead of displaying 12 phys starting with phy 10, it only displays phys leading up to phy 12 phy 10:D:disabled phy 11:D:disabled # smp_discover -m /dev/bsg/expander-1:1 -p 10 -n 1 # This returns nothing On Debian 10 (I'm not showing the result on Debian 11 where smp-utils is the same version), with smp_discover 1.46 20140526 from smp-utils 0.98-2: # smp_discover -m /dev/bsg/expander-1:1 -p 10 phy 10:D:disabled phy 11:D:disabled phy 12:U:attached:[5000cca23c0b38a5:00 t(SSP)] 6 Gbps phy 13:U:attached:[5000cca23c0e74fd:00 t(SSP)] 6 Gbps phy 14:U:attached:[5000cca098215285:00 t(SSP)] 6 Gbps phy 15:U:attached:[5000cca25552e5f9:00 t(SSP)] 6 Gbps phy 16:U:attached:[5000cca2710aab6d:00 t(SSP)] 6 Gbps phy 17:U:attached:[5000cca23c0e75b9:00 t(SSP)] 6 Gbps phy 18:U:attached:[5000cca23c0ebaad:00 t(SSP)] 6 Gbps phy 19:U:attached:[5000cca23c0f9fe5:00 t(SSP)] 6 Gbps phy 20:U:attached:[5000cca232695239:00 t(SSP)] 6 Gbps phy 21:U:attached:[5000cca23c0dd615:00 t(SSP)] 6 Gbps phy 22:U:attached:[5000cca232717d11:00 t(SSP)] 6 Gbps phy 23:U:attached:[5000cca23c10e5c9:00 t(SSP)] 6 Gbps phy 24:D:disabled phy 25:D:disabled phy 26:D:disabled phy 27:D:disabled phy 28:D:attached:[5003048017874b3d:00 V i(SSP+SMP) t(SSP)] 6 Gbps phy 29:D:attached:[0000000000000000:00] # smp_discover -m /dev/bsg/expander-1:1 -p 10 -n 12 phy 10:D:disabled phy 11:D:disabled phy 12:U:attached:[5000cca23c0b38a5:00 t(SSP)] 6 Gbps phy 13:U:attached:[5000cca23c0e74fd:00 t(SSP)] 6 Gbps phy 14:U:attached:[5000cca098215285:00 t(SSP)] 6 Gbps phy 15:U:attached:[5000cca25552e5f9:00 t(SSP)] 6 Gbps phy 16:U:attached:[5000cca2710aab6d:00 t(SSP)] 6 Gbps phy 17:U:attached:[5000cca23c0e75b9:00 t(SSP)] 6 Gbps phy 18:U:attached:[5000cca23c0ebaad:00 t(SSP)] 6 Gbps phy 19:U:attached:[5000cca23c0f9fe5:00 t(SSP)] 6 Gbps phy 20:U:attached:[5000cca232695239:00 t(SSP)] 6 Gbps phy 21:U:attached:[5000cca23c0dd615:00 t(SSP)] 6 Gbps # smp_discover -m /dev/bsg/expander-1:1 -p 10 -n 1 phy 10:D:disabled This bug was introduced in git commit 5e9dd3dbab9f7fc85e2816ab2eb62bc92b068cc1, specifically this block, starting at line 923: https://github.com/doug-gilbert/smp_utils/commit/5e9dd3dbab9f7fc85e2816ab2eb62bc92b068cc1#diff-853a65985ccfe20502945ddfbc340cbedde6cda3b15a3b09c64424634cd53c74R923 <https://github.com/doug-gilbert/smp_utils/commit/5e9dd3dbab9f7fc85e2816ab2eb62bc92b068cc1#diff-853a65985ccfe20502945ddfbc340cbedde6cda3b15a3b09c64424634cd53c74R923> The working code was: num = op->do_num ? (op->phy_id + op->do_num) : MAX_PHY_ID; The new broken code is: num = get_num_phys(top, op, &has_t2t); if (num <= 0) num = op->do_num ? (op->phy_id + op->do_num) : MAX_PHY_ID; else { if (op->phy_id >= num) { printf("Given phy_id=%d at or beyond number of phys (%d)\n", op->phy_id, num); return 0; /* nothing to do */ } num -= op->phy_id; if (op->do_num) num = (num > op->do_num) ? op->do_num : num; } I don't really understand what upstream was trying to do here, possibly attempting to improve the error messages? I emailed them as they don't really seem to have a bug tracker. This was on 2025-01-17 and I haven't received a reply yet. I don't know if the project is still actively maintained as upstream's doesn't seem to be public, but the GitHub mirror hasn't been updated for about 2 years. I have opened a trivial merge request at https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4 <https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4> which is restoring the old code to make option parsing work again.
Have you reported it upstream ? That'd be the very first thing to do s3nt fr0m a $martph0ne, excuse typ0s
Have you reported it upstream ? That'd be the very first thing to do s3nt fr0m a $martph0ne, excuse typ0s
Hi, As I explained at the end of the bug report, I contacted Douglas Gilbert 10 days ago, using the email provided at the bottom of https://sg.danny.cz/sg/index.html. I haven't heard from him since. I could open an issue at https://github.com/doug-gilbert/smp_utils/issues but I doubt it would help as the author "still prefers to use a subversion repository on his own equipment (a Raspberry Pi) for development" according to https://sg.danny.cz/sg/smp_utils.html. If you have another way to reach him, please let me know.
Hi, As I explained at the end of the bug report, I contacted Douglas Gilbert 10 days ago, using the email provided at the bottom of https://sg.danny.cz/sg/index.html. I haven't heard from him since. I could open an issue at https://github.com/doug-gilbert/smp_utils/issues but I doubt it would help as the author "still prefers to use a subversion repository on his own equipment (a Raspberry Pi) for development" according to https://sg.danny.cz/sg/smp_utils.html. If you have another way to reach him, please let me know.
Opening an issue is worthwhile. The reason is because emails may fall under spam classification while opening an issue guarantees that it is reported. s3nt fr0m a $martph0ne, excuse typ0s
Opening an issue is worthwhile. The reason is because emails may fall under spam classification while opening an issue guarantees that it is reported. s3nt fr0m a $martph0ne, excuse typ0s
It's now reported at https://github.com/doug-gilbert/smp_utils/issues/4 Have you looked at https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4? Do you think it could still be applied to fix the Debian package or would you rather wait for a potential response from upstream?
It's now reported at https://github.com/doug-gilbert/smp_utils/issues/4 Have you looked at https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4? Do you think it could still be applied to fix the Debian package or would you rather wait for a potential response from upstream?
It is preferred to have the point of view from upstream as well. Let's wait and see. 🙏 s3nt fr0m a $martph0ne, excuse typ0s
It is preferred to have the point of view from upstream as well. Let's wait and see. 🙏 s3nt fr0m a $martph0ne, excuse typ0s
Hello Ritesh, It has now been ~2 weeks since I opened the issue on GitHub and almost a month since I emailed the author. Could you or another maintainer check my merge request at https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4 please? Having the option parsing code restored would be of great help. Have a nice day, Louis
Hello Ritesh, It has now been ~2 weeks since I opened the issue on GitHub and almost a month since I emailed the author. Could you or another maintainer check my merge request at https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4 please? Having the option parsing code restored would be of great help. Have a nice day, Louis
Hello Louis, I've copied some of the other DDs who may have stake in smp-utils. I personally no more actively maintain these packages but hope to help/chime in, if I have time. I've copied Douglas on this email, for what record I have. I have no idea why you are facing this problem. The changes look fine to me. They are just standardizing on using a more standard interface command in REPORT GENERAL. Also, that change is from almost 7 years ago. From personal experience, as such, if it was a general issue, it'd have been uncovered widely by now. I personally wouldn't override the current code with your proposed MR changes, for I have limited insight of its side-effects. For you, I guess if this change is a blocker, you could continue to use the older versions until the cause of this anomaly is sorted out. Thanks, Ritesh
Hi again,
Let me try to explain. The previous version determined the index of the
last phy to display with this very simple snippet:
num = op->do_num ? (op->phy_id + op->do_num) : MAX_PHY_ID;
if -n is passed (do_num), the index of the last phy to display was the
value passed to -p (phy_id) + the value passed to -n.
If -n was not used, the index of the last device was a constant set to
254 (I assume something caused the loop to stop before 254 because it
worked fine).
The following loop was used (still is in the newer version) to display phys:
for (k = op->phy_id; k < num; ++k) {
So, if we passed -n 5 -p 10, num would get the value 10 + 5 = 15. This
means smp_discover displayed devices starting at 10 and stopping at 15,
which was fine.
Now, the newer version initializes num at the number of phys, this is
fine when -n and -p are not used.
num = get_num_phys(top, op, &has_t2t);
The problem is this line:
num -= op->phy_id
It causes the index of the last phy to depend on the value of -p, the
index of the first phy. This seems wrong.
If we only pass -p 10, assuming a total of 30 phys, num takes the value
30 - 10 = 20, which is what I am observing in the initial bug report.
Only the first 20 phys are displayed although I only changed the value
of -p, not -n.
Then, this line breaks it even further if -n is used:
num = (num > op->do_num) ? op->do_num : num;
If we pass -p 10 -n 5, num was 20 (> 5), so it's eventually set to 5.
This causes the loop to display nothing because it attempts to iterate
from 10 to 5.
I have to admit that I am also surprised. Maybe smp_discover is a very
low usage package. Or maybe all its users are stuck on older versions.
The reason I filed this bug was to raise awareness of this issue after
spending hours pinpointing the problematic code.
I can offer access to one or more machines with an expander where the
issue occurs if it helps in any way. Feel free to email me and I'll
grant you or your fellow developers access.
Sure, I can do that as a last resort, but I feel like fixing this bug in
Debian would benefit everyone.
Cheers,