#1094405 smp-utils: option parsing broken for smp_discover -p/--phy and -n/--num

Package:
smp-utils
Source:
smp-utils
Description:
SAS Expander (SMP) utilities for SAS/SATA disk arrays
Submitter:
Louis Sautier
Date:
2025-02-10 20:42:02 UTC
Severity:
normal
#1094405#5
Date:
2025-01-27 20:50:02 UTC
From:
To:
Hi,
smp_discover 1.62 20190124 from smp-utils 0.99-1 has a bug when the
-p/--phy options is used; -n/--num is also broken.
The main problem is that the -p option is handled incorrectly, e.g.
smp_discover -p 5
not only starts at phy 5 but also skips the last 5 phys.

For instance, on a system with 30 phys on expander-1:1:
# smp_discover -m /dev/bsg/expander-1:1 # This works fine and is only
shown to showcase the bugs
    phy   0:S:attached:[500605b009574bd1:07  i(SSP+STP+SMP)]  6 Gbps

    phy   1:S:attached:[500605b009574bd1:06  i(SSP+STP+SMP)] 6 Gbps
    phy   2:S:attached:[500605b009574bd1:05  i(SSP+STP+SMP)] 6 Gbps
    phy   3:S:attached:[500605b009574bd1:04  i(SSP+STP+SMP)] 6 Gbps
    phy   4:U:attached:[0000000000000000:00]
    phy   5:U:attached:[0000000000000000:00]
    phy   6:U:attached:[0000000000000000:00]
    phy   7:U:attached:[0000000000000000:00]
    phy   8:D:disabled
    phy   9:D:disabled
    phy  10:D:disabled
    phy  11:D:disabled
    phy  12:U:attached:[5000cca23c0b38a5:00  t(SSP)]  6 Gbps
    phy  13:U:attached:[5000cca23c0e74fd:00  t(SSP)]  6 Gbps
    phy  14:U:attached:[5000cca098215285:00  t(SSP)]  6 Gbps
    phy  15:U:attached:[5000cca25552e5f9:00  t(SSP)]  6 Gbps
    phy  16:U:attached:[5000cca2710aab6d:00  t(SSP)]  6 Gbps
    phy  17:U:attached:[5000cca23c0e75b9:00  t(SSP)]  6 Gbps
    phy  18:U:attached:[5000cca23c0ebaad:00  t(SSP)]  6 Gbps
    phy  19:U:attached:[5000cca23c0f9fe5:00  t(SSP)]  6 Gbps
    phy  20:U:attached:[5000cca232695239:00  t(SSP)]  6 Gbps
    phy  21:U:attached:[5000cca23c0dd615:00  t(SSP)]  6 Gbps
    phy  22:U:attached:[5000cca232717d11:00  t(SSP)]  6 Gbps
    phy  23:U:attached:[5000cca23c10e5c9:00  t(SSP)]  6 Gbps
    phy  24:D:disabled
    phy  25:D:disabled
    phy  26:D:disabled
    phy  27:D:disabled
    phy  28:D:attached:[5003048017874b3d:00  V i(SSP+SMP) t(SSP)]  6 Gbps
    phy  29:D:attached:[0000000000000000:00]
# smp_discover -m /dev/bsg/expander-1:1 -p 10 # Not only are the first
10 phys skipped, the last 10 are skipped too
    phy  10:D:disabled
    phy  11:D:disabled
    phy  12:U:attached:[5000cca23c0b38a5:00  t(SSP)]  6 Gbps
    phy  13:U:attached:[5000cca23c0e74fd:00  t(SSP)]  6 Gbps
    phy  14:U:attached:[5000cca098215285:00  t(SSP)]  6 Gbps
    phy  15:U:attached:[5000cca25552e5f9:00  t(SSP)]  6 Gbps
    phy  16:U:attached:[5000cca2710aab6d:00  t(SSP)]  6 Gbps
    phy  17:U:attached:[5000cca23c0e75b9:00  t(SSP)]  6 Gbps
    phy  18:U:attached:[5000cca23c0ebaad:00  t(SSP)]  6 Gbps
    phy  19:U:attached:[5000cca23c0f9fe5:00  t(SSP)]  6 Gbps
# smp_discover -m /dev/bsg/expander-1:1 -p 10 -n 12 # Instead of
displaying 12 phys starting with phy 10, it only displays phys leading
up to phy 12
    phy  10:D:disabled
    phy  11:D:disabled
# smp_discover -m /dev/bsg/expander-1:1 -p 10 -n 1 # This returns nothing


On Debian 10 (I'm not showing the result on Debian 11 where smp-utils is
the same version), with smp_discover 1.46 20140526 from smp-utils 0.98-2:
# smp_discover -m /dev/bsg/expander-1:1 -p 10
    phy  10:D:disabled
    phy  11:D:disabled
    phy  12:U:attached:[5000cca23c0b38a5:00  t(SSP)]  6 Gbps
    phy  13:U:attached:[5000cca23c0e74fd:00  t(SSP)]  6 Gbps
    phy  14:U:attached:[5000cca098215285:00  t(SSP)]  6 Gbps
    phy  15:U:attached:[5000cca25552e5f9:00  t(SSP)]  6 Gbps
    phy  16:U:attached:[5000cca2710aab6d:00  t(SSP)]  6 Gbps
    phy  17:U:attached:[5000cca23c0e75b9:00  t(SSP)]  6 Gbps
    phy  18:U:attached:[5000cca23c0ebaad:00  t(SSP)]  6 Gbps
    phy  19:U:attached:[5000cca23c0f9fe5:00  t(SSP)]  6 Gbps
    phy  20:U:attached:[5000cca232695239:00  t(SSP)]  6 Gbps
    phy  21:U:attached:[5000cca23c0dd615:00  t(SSP)]  6 Gbps
    phy  22:U:attached:[5000cca232717d11:00  t(SSP)]  6 Gbps
    phy  23:U:attached:[5000cca23c10e5c9:00  t(SSP)]  6 Gbps
    phy  24:D:disabled
    phy  25:D:disabled
    phy  26:D:disabled
    phy  27:D:disabled
    phy  28:D:attached:[5003048017874b3d:00  V i(SSP+SMP) t(SSP)]  6 Gbps
    phy  29:D:attached:[0000000000000000:00]
# smp_discover -m /dev/bsg/expander-1:1 -p 10 -n 12
    phy  10:D:disabled
    phy  11:D:disabled
    phy  12:U:attached:[5000cca23c0b38a5:00  t(SSP)]  6 Gbps
    phy  13:U:attached:[5000cca23c0e74fd:00  t(SSP)]  6 Gbps
    phy  14:U:attached:[5000cca098215285:00  t(SSP)]  6 Gbps
    phy  15:U:attached:[5000cca25552e5f9:00  t(SSP)]  6 Gbps
    phy  16:U:attached:[5000cca2710aab6d:00  t(SSP)]  6 Gbps
    phy  17:U:attached:[5000cca23c0e75b9:00  t(SSP)]  6 Gbps
    phy  18:U:attached:[5000cca23c0ebaad:00  t(SSP)]  6 Gbps
    phy  19:U:attached:[5000cca23c0f9fe5:00  t(SSP)]  6 Gbps
    phy  20:U:attached:[5000cca232695239:00  t(SSP)]  6 Gbps
    phy  21:U:attached:[5000cca23c0dd615:00  t(SSP)]  6 Gbps
# smp_discover -m /dev/bsg/expander-1:1 -p 10 -n 1
    phy  10:D:disabled


This bug was introduced in git commit
5e9dd3dbab9f7fc85e2816ab2eb62bc92b068cc1, specifically this block,
starting at line 923:
https://github.com/doug-gilbert/smp_utils/commit/5e9dd3dbab9f7fc85e2816ab2eb62bc92b068cc1#diff-853a65985ccfe20502945ddfbc340cbedde6cda3b15a3b09c64424634cd53c74R923
<https://github.com/doug-gilbert/smp_utils/commit/5e9dd3dbab9f7fc85e2816ab2eb62bc92b068cc1#diff-853a65985ccfe20502945ddfbc340cbedde6cda3b15a3b09c64424634cd53c74R923>
The working code was:
num = op->do_num ? (op->phy_id + op->do_num) : MAX_PHY_ID;

The new broken code is:
num = get_num_phys(top, op, &has_t2t);
if (num <= 0)
      num = op->do_num ? (op->phy_id + op->do_num) : MAX_PHY_ID;
else {
      if (op->phy_id >= num) {
          printf("Given phy_id=%d at or beyond number of phys (%d)\n",
                 op->phy_id, num);
          return 0;   /* nothing to do */
      }
      num -= op->phy_id;
      if (op->do_num)
          num = (num > op->do_num) ? op->do_num : num;
}
I don't really understand what upstream was trying to do here, possibly
attempting to improve the error messages?
I emailed them as they don't really seem to have a bug tracker. This was
on 2025-01-17 and I haven't received a reply yet. I don't know if the
project is still actively maintained as upstream's doesn't seem to be
public, but the GitHub mirror hasn't been updated for about 2 years.

I have opened a trivial merge request at
https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4
<https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4>
which is restoring the old code to make option parsing work again.

#1094405#10
Date:
2025-01-28 05:44:23 UTC
From:
To:
Have you reported it upstream ? That'd be the very first thing to do

s3nt fr0m a $martph0ne, excuse typ0s

#1094405#15
Date:
2025-01-28 05:44:23 UTC
From:
To:
Have you reported it upstream ? That'd be the very first thing to do

s3nt fr0m a $martph0ne, excuse typ0s

#1094405#20
Date:
2025-01-28 07:00:26 UTC
From:
To:
Hi,

As I explained at the end of the bug report, I contacted Douglas Gilbert
10 days ago, using the email provided at the bottom of
https://sg.danny.cz/sg/index.html. I haven't heard from him since. I
could open an issue at https://github.com/doug-gilbert/smp_utils/issues
but I doubt it would help as the author "still prefers to use a
subversion repository on his own equipment (a Raspberry Pi) for
development" according to https://sg.danny.cz/sg/smp_utils.html. If you
have another way to reach him, please let me know.

#1094405#25
Date:
2025-01-28 07:00:26 UTC
From:
To:
Hi,

As I explained at the end of the bug report, I contacted Douglas Gilbert
10 days ago, using the email provided at the bottom of
https://sg.danny.cz/sg/index.html. I haven't heard from him since. I
could open an issue at https://github.com/doug-gilbert/smp_utils/issues
but I doubt it would help as the author "still prefers to use a
subversion repository on his own equipment (a Raspberry Pi) for
development" according to https://sg.danny.cz/sg/smp_utils.html. If you
have another way to reach him, please let me know.

#1094405#30
Date:
2025-01-28 08:22:23 UTC
From:
To:
Opening an issue is worthwhile. The reason is because emails may fall under
spam classification while opening an issue guarantees that it is reported.

s3nt fr0m a $martph0ne, excuse typ0s

#1094405#35
Date:
2025-01-28 08:22:23 UTC
From:
To:
Opening an issue is worthwhile. The reason is because emails may fall under
spam classification while opening an issue guarantees that it is reported.

s3nt fr0m a $martph0ne, excuse typ0s

#1094405#40
Date:
2025-01-28 10:47:09 UTC
From:
To:
It's now reported at https://github.com/doug-gilbert/smp_utils/issues/4
Have you looked at
https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4?
Do you think it could still be applied to fix the Debian package or
would you rather wait for a potential response from upstream?

#1094405#45
Date:
2025-01-28 10:47:09 UTC
From:
To:
It's now reported at https://github.com/doug-gilbert/smp_utils/issues/4
Have you looked at
https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4?
Do you think it could still be applied to fix the Debian package or
would you rather wait for a potential response from upstream?

#1094405#50
Date:
2025-01-29 07:04:04 UTC
From:
To:
It is preferred to have the point of view from upstream as well. Let's wait
and see. 🙏

s3nt fr0m a $martph0ne, excuse typ0s

#1094405#55
Date:
2025-01-29 07:04:04 UTC
From:
To:
It is preferred to have the point of view from upstream as well. Let's wait
and see. 🙏

s3nt fr0m a $martph0ne, excuse typ0s

#1094405#62
Date:
2025-02-10 03:27:28 UTC
From:
To:
Hello Ritesh,

It has now been ~2 weeks since I opened the issue on GitHub and almost a
month since I emailed the author.

Could you or another maintainer check my merge request at
https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4
please? Having the option parsing code restored would be of great help.

Have a nice day,

Louis

#1094405#67
Date:
2025-02-10 03:27:28 UTC
From:
To:
Hello Ritesh,

It has now been ~2 weeks since I opened the issue on GitHub and almost a
month since I emailed the author.

Could you or another maintainer check my merge request at
https://salsa.debian.org/linux-blocks-team/smp-utils/-/merge_requests/4
please? Having the option parsing code restored would be of great help.

Have a nice day,

Louis

#1094405#72
Date:
2025-02-10 08:59:20 UTC
From:
To:
Hello Louis,

I've copied some of the other DDs who may have stake in smp-utils. I
personally no more actively maintain these packages but hope to
help/chime in, if I have time.

I've copied Douglas on this email, for what record I have.


I have no idea why you are facing this problem. The changes look fine
to me. They are just standardizing on using a more standard interface
command in REPORT GENERAL. Also, that change is from almost 7 years
ago. From personal experience, as such, if it was a general issue, it'd
have been uncovered widely by now.

I personally wouldn't override the current code with your proposed MR
changes, for I have limited insight of its side-effects.

For you, I guess if this change is a blocker, you could continue to use
the older versions until the cause of this anomaly is sorted out.

Thanks,
Ritesh

#1094405#77
Date:
2025-02-10 20:39:11 UTC
From:
To:
Hi again,

Let me try to explain. The previous version determined the index of the
last phy to display with this very simple snippet:

num = op->do_num ? (op->phy_id + op->do_num) : MAX_PHY_ID;

if -n is passed (do_num), the index of the last phy to display was the
value passed to -p (phy_id) + the value passed to -n.

If -n was not used, the index of the last device was a constant set to
254 (I assume something caused the loop to stop before 254 because it
worked fine).

The following loop was used (still is in the newer version) to display phys:

for (k = op->phy_id; k < num; ++k) {

So, if we passed -n 5 -p 10, num would get the value 10 + 5 = 15. This
means smp_discover displayed devices starting at 10 and stopping at 15,
which was fine.

Now, the newer version initializes num at the number of phys, this is
fine when -n and -p are not used.

num = get_num_phys(top, op, &has_t2t);

The problem is this line:

num -= op->phy_id

It causes the index of the last phy to depend on the value of -p, the
index of the first phy. This seems wrong.
If we only pass -p 10, assuming a total of 30 phys, num takes the value
30 - 10 = 20, which is what I am observing in the initial bug report.
Only the first 20 phys are displayed although I only changed the value
of -p, not -n.

Then, this line breaks it even further if -n is used:

num = (num > op->do_num) ? op->do_num : num;

If we pass -p 10 -n 5, num was 20 (> 5), so it's eventually set to 5.
This causes the loop to display nothing because it attempts to iterate
from 10 to 5.

I have to admit that I am also surprised. Maybe smp_discover is a very
low usage package. Or maybe all its users are stuck on older versions.

The reason I filed this bug was to raise awareness of this issue after
spending hours pinpointing the problematic code.
I can offer access to one or more machines with an expander where the
issue occurs if it helps in any way. Feel free to email me and I'll
grant you or your fellow developers access.
Sure, I can do that as a last resort, but I feel like fixing this bug in
Debian would benefit everyone.

Cheers,