Explanation requested for the Bash sort -n option



I extract a number of IP addresses and then sort them for further display. The relevant code is:

| cut -w -f11 | sort -t. -k1,4  -u -n
69.156.151.245
99.226.129.44
108.170.136.226
142.126.92.197

However, this leaves out a known address, which does show up if the -n option is removed:

| cut -w -f11 | sort -t. -k1,4 -u
108.170.136.226
142.126.92.197
69.156.151.245
69.156.7.43
99.226.129.44
99.255.53.67

It also shows up again when the -u option is removed instead:

| cut -w -f11 | sort -t. -k1,4 -n   
69.156.151.245
69.156.7.43
69.156.7.43
69.156.7.43
99.226.129.44
99.255.53.67
99.255.53.67
108.170.136.226
142.126.92.197

My question is: why does combining -n with -u remove 69.156.7.43 from the output? I can guess that it has something to do with 69.156.151.245, but what exactly?

The answer given below produced this:

cut -w -f11 | sort -t. -k1,4  -u -V
69.156.7.43
69.156.151.245
99.226.129.44
99.255.53.67
108.170.136.226
142.126.92.197
216.185.71.41

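This is an interesting edge case. The -n option assumes the key is a single number with at most one decimal point, so it stops reading at the next dot. The comparison, and therefore the uniqueness check, looks at the first two tokens at most, and distinct addresses that share that prefix are treated as duplicates. The workaround is to use version sort: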

... | sort -V -u 
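
If -V is not available in your sort, one alternative is to give each octet its own numeric key, so that all four take part in both the ordering and the -u comparison. The following is only a sketch: the sample addresses are copied from the question, the variable names are made up for illustration, and LC_ALL=C is set on the assumption that you want the same C-locale behaviour shown in the debug output.

```shell
#!/bin/sh
# Assumption: force the C locale so "." is handled the same way everywhere.
LC_ALL=C
export LC_ALL

# Sample addresses taken from the question (duplicates included).
ips='69.156.151.245
69.156.7.43
69.156.7.43
99.226.129.44
99.255.53.67
108.170.136.226
142.126.92.197'

# One key spanning all four fields: -n reads only the leading number of the
# key, so distinct addresses can compare equal and -u then drops them.
collapsed=$(printf '%s\n' "$ips" | sort -t. -k1,4 -n -u)

# One numeric key per octet: every octet takes part in the comparison.
per_octet=$(printf '%s\n' "$ips" | sort -t. -k1,1n -k2,2n -k3,3n -k4,4n -u)

printf 'single key:\n%s\n\nper-octet keys:\n%s\n' "$collapsed" "$per_octet"
```

With the per-octet keys, 69.156.7.43 is kept and sorts ahead of 69.156.151.245. How many lines the single-key form loses can differ between sort implementations.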

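Yes, this is a bit nasty... Taking the input from the last snippet, and appending AA, BB, CC, DD and so on to the end of each line to see what is going on, we get this output: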

| sort --debug -t. -k1,4 -n -u
Memory to be used for sorting: 4294967296
Number of CPUs: 4
Using collate rules of C locale
Byte sort is used
Positive sign: <+>
Negative sign: <->
sort_method=mergesort
; k1=<99.226.129.44 EE>, k2=<99.255.53.67 FF>; s1=<99.226.129.44 EE>, s2=<99.255.53.67 FF>; cmp1=0
; k1=<99.255.53.67 FF>, k2=<99.255.53.67 GG>; s1=<99.255.53.67 FF>, s2=<99.255.53.67 GG>; cmp1=0
; k1=<99.255.53.67 GG>, k2=<108.170.136.226 HH>; s1=<99.255.53.67 GG>, s2=<108.170.136.226 HH>; cmp1=-1
; k1=<108.170.136.226 HH>, k2=<142.126.92.197 II>; s1=<108.170.136.226 HH>, s2=<142.126.92.197 II>; cmp1=-1
; k1=<69.156.151.245 AAA>, k2=<69.156.7.43 BB>; s1=<69.156.151.245 AAA>, s2=<69.156.7.43 BB>; cmp1=0
; k1=<69.156.7.43 CC>, k2=<69.156.7.43 DD>; s1=<69.156.7.43 CC>, s2=<69.156.7.43 DD>; cmp1=0
; k1=<69.156.151.245 AAA>, k2=<69.156.7.43 CC>; s1=<69.156.151.245 AAA>, s2=<69.156.7.43 CC>; cmp1=0
; k1=<69.156.7.43 CC>, k2=<69.156.7.43 BB>; s1=<69.156.7.43 CC>, s2=<69.156.7.43 BB>; cmp1=0
; k1=<69.156.151.245 AAA>, k2=<99.226.129.44 EE>; s1=<69.156.151.245 AAA>, s2=<99.226.129.44 EE>; cmp1=-1
; k1=<99.226.129.44 EE>, k2=<69.156.7.43 BB>; s1=<99.226.129.44 EE>, s2=<69.156.7.43 BB>; cmp1=1
; k1=<99.226.129.44 EE>, k2=<69.156.7.43 CC>; s1=<99.226.129.44 EE>, s2=<69.156.7.43 CC>; cmp1=1
; k1=<99.226.129.44 EE>, k2=<69.156.7.43 DD>; s1=<99.226.129.44 EE>, s2=<69.156.7.43 DD>; cmp1=1
69.156.151.245 AAA
; k1=<69.156.151.245 AAA>, k2=<69.156.7.43 BB>; s1=<69.156.151.245 AAA>, s2=<69.156.7.43 BB>; cmp1=0
; k1=<69.156.151.245 AAA>, k2=<69.156.7.43 CC>; s1=<69.156.151.245 AAA>, s2=<69.156.7.43 CC>; cmp1=0
; k1=<69.156.151.245 AAA>, k2=<69.156.7.43 DD>; s1=<69.156.151.245 AAA>, s2=<69.156.7.43 DD>; cmp1=0
; k1=<69.156.151.245 AAA>, k2=<99.226.129.44 EE>; s1=<69.156.151.245 AAA>, s2=<99.226.129.44 EE>; cmp1=-1
99.226.129.44 EE
; k1=<99.226.129.44 EE>, k2=<99.255.53.67 FF>; s1=<99.226.129.44 EE>, s2=<99.255.53.67 FF>; cmp1=0
; k1=<99.226.129.44 EE>, k2=<99.255.53.67 GG>; s1=<99.226.129.44 EE>, s2=<99.255.53.67 GG>; cmp1=0
; k1=<99.226.129.44 EE>, k2=<108.170.136.226 HH>; s1=<99.226.129.44 EE>, s2=<108.170.136.226 HH>; cmp1=-1
108.170.136.226 HH
; k1=<108.170.136.226 HH>, k2=<142.126.92.197 II>; s1=<108.170.136.226 HH>, s2=<142.126.92.197 II>; cmp1=-1
142.126.92.197 II

Even if I drop the key definition, it is still a bit odd.

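However, you can work around this by splitting it into two steps (assuming you are after the minimal set of IPs in numeric order): make the lines unique first, then sort with -n: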

| sort -u  | sort -n
69.156.151.245
69.156.7.43
99.226.129.44
99.255.53.67
108.170.136.226
142.126.92.197
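
To check whether a combined -n -u is silently discarding addresses on your own data, you can compare its line count with that of a plain sort -u. This is a rough sketch only: the function name dropped_by_numeric_unique is hypothetical, and LC_ALL=C is an assumption made to match the locale in the debug output.

```shell
#!/bin/sh
# Assumption: force the C locale so "." is handled the same way everywhere.
LC_ALL=C
export LC_ALL

# Hypothetical helper: reads lines on stdin and prints how many distinct
# lines "sort -n -u" discards compared with a plain "sort -u".
dropped_by_numeric_unique() {
    input=$(cat)
    # Count of lines that are distinct as whole strings.
    exact=$(printf '%s\n' "$input" | sort -u | grep -c .)
    # Count of lines that are distinct as far as -n can tell.
    numeric=$(printf '%s\n' "$input" | sort -n -u | grep -c .)
    echo $((exact - numeric))
}

dropped=$(printf '%s\n' 69.156.151.245 69.156.7.43 69.156.7.43 99.226.129.44 |
    dropped_by_numeric_unique)
echo "distinct lines lost to -n -u: $dropped"
```

A non-zero result means some distinct lines compared equal under -n, in which case the two-step pipeline above (or a different sort key) is the safer choice.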
