std::sort导致段错误



我编写了一个程序,使用std::unordered_map计算字符串出现的次数,然后将映射中的值复制到std::vector,并调用std::sort先按计数排序,然后按字符串排序。

见下面的代码。请原谅这一长串字符串。我无法很快找到一个更短的列表来重现这种行为。

编译器资源管理器同意GCC 10及以上版本的段错误。

这是一个编译器错误还是我错过了一些明显的东西?

#include <unordered_map>
#include <vector>
#include <algorithm>
#include <string>
int main(int argc, char** argv) {
std::unordered_map<
std::string,
int
> terms;
std::vector<const char*> input {
"Head", "of", "Data", "News", "UK", "We", "are", "News", "UK", "is", "a",
"great", "company", "full", "talented", "and", "creative", "people", "We",
"are", "an", "organisation", "that", "holds", "journalism", "at", "its",
"very", "heart", "Our", "newspapers", "and", "digital", "products",
"include", "some", "of", "the", "most", "powerful", "media", "brands", "in",
"the", "English", "speaking", "world:", "the", "Times", "The", "Sunday",
"Times", "and", "The", "Sun", "reaching", "30", "million", "people", "each",
"week", "Despite", "differences", "in", "audience", "and", "content", "our",
"brands", "are", "united", "by", "a", "commitment", "to", "independent",
"journalism", "that", "connects", "our", "customers"
};
for (const char* str : input) {
++terms[str];
}
std::vector<std::pair<std::string,int>> order;
order.reserve(terms.size());
for (const auto& [ term, n ] : terms)
order.emplace_back(term,n);
std::sort(order.begin(), order.end(),
[](const auto& a, const auto& b) -> bool {
if (a.second > b.second) return true;
return a.first < b.first;
}
);
}
下面是gdb的输出:
Program received signal SIGSEGV, Segmentation fault.
__memcmp_evex_movbe () at ../sysdeps/x86_64/multiarch/memcmp-evex-movbe.S:118
118     VMOVU_MASK (%rsi), %YMM2{%k2}
(gdb) bt
#0  __memcmp_evex_movbe () at ../sysdeps/x86_64/multiarch/memcmp-evex-movbe.S:118
#1  0x00007ffff7d4d7ad in std::char_traits<char>::compare (__n=<optimized out>, __s2=<optimized out>, __s1=<optimized out>)
at /usr/src/debug/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:385
#2  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::compare (this=<optimized out>, __str=...)
at /usr/src/debug/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:3148
#3  0x000055555555b408 in std::operator< <char, std::char_traits<char>, std::allocator<char> > (__lhs="Data", __rhs=<error: Cannot access memory at address 0x72656d6f74737563>)
at /usr/include/c++/12.1.0/bits/basic_string.h:3694
#4  0x00005555555573e2 in operator()<std::pair<std::__cxx11::basic_string<char>, int>, std::pair<std::__cxx11::basic_string<char>, int> > (__closure=0x7fffffffd9f7, a={...}, b={...}) at test.cc:108
#5  0x000055555555744b in __gnu_cxx::__ops::_Val_comp_iter<main(int, char**)::<lambda(const auto:1&, const auto:2&)> >::operator()<std::pair<std::__cxx11::basic_string<char>, int>, __gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char>, int>*, std::vector<std::pair<std::__cxx11::basic_string<char>, int> > > >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> &, __gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > >) (this=0x7fffffffd9f7, __val={...},
__it={first = <error: Cannot access memory at address 0x72656d6f74737563>, second = 2449}) at /usr/include/c++/12.1.0/bits/predefined_ops.h:240
#6  0x0000555555557096 in std::__unguarded_linear_insert<__gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char>, int>*, std::vector<std::pair<std::__cxx11::basic_string<char>, int> > >, __gnu_cxx::__ops::_Val_comp_iter<main(int, char**)::<lambda(const auto:1&, const auto:2&)> > >(__gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > >, __gnu_cxx::__ops::_Val_comp_iter<main(int, char**)::<lambda(const auto:1&, const auto:2&)> >) (__last={first = "", second = 3}, __comp=...) at /usr/include/c++/12.1.0/bits/stl_algo.h:1789
#7  0x0000555555556c28 in std::__unguarded_insertion_sort<__gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char>, int>*, std::vector<std::pair<std::__cxx11::basic_string<char>, int> > >, __gnu_cxx::__ops::_Iter_comp_iter<main(int, char**)::<lambda(const auto:1&, const auto:2&)> > >(__gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > >, __gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > >, __gnu_cxx::__ops::_Iter_comp_iter<main(int, char**)::<lambda(const auto:1&, const auto:2&)> >) (__first={first = "an", second = 1}, __last={first = "", second = 0}, __comp=...) at /usr/include/c++/12.1.0/bits/stl_algo.h:1830
#8  0x000055555555695b in std::__final_insertion_sort<__gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char>, int>*, std::vector<std::pair<std::__cxx11::basic_string<char>, int> > >, __gnu_cxx::__ops::_Iter_comp_iter<main(int, char**)::<lambda(const auto:1&, const auto:2&)> > >(__gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > >, __gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > >, __gnu_cxx::__ops::_Iter_comp_iter<main(int, char**)::<lambda(const auto:1&, const auto:2&)> >) (__first={first = "", second = 3}, __last={first = "", second = 0}, __comp=...) at /usr/include/c++/12.1.0/bits/stl_algo.h:1850
#9  0x0000555555556806 in std::__sort<__gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char>, int>*, std::vector<std::pair<std::__cxx11::basic_string<char>, int> > >, __gnu_cxx::__ops::_Iter_comp_iter<main(int, char**)::<lambda(const auto:1&, const auto:2&)> > >(__gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > >, __gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > >, __gnu_cxx::__ops::_Iter_comp_iter<main(int, char**)::<lambda(const auto:1&, const auto:2&)> >) (__first={first = "", second = 3}, __last={first = "", second = 0}, __comp=...) at /usr/include/c++/12.1.0/bits/stl_algo.h:1940
#10 0x0000555555556751 in std::sort<__gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char>, int>*, std::vector<std::pair<std::__cxx11::basic_string<char>, int> > >, main(int, char**)::<lambda(const auto:1&, const auto:2&)> >(__gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > >, __gnu_cxx::__normal_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > >, struct {...}) (__first={first = "", second = 3}, __last={first = "", second = 0}, __comp=...)
at /usr/include/c++/12.1.0/bits/stl_algo.h:4853
#11 0x000055555555665c in main (argc=1, argv=0x7fffffffdfc8) at test.cc:105
[](const auto& a, const auto& b) -> bool {
if (a.second > b.second) return true;
return a.first < b.first;
}

这个比较器函数违反了作为严格弱排序比较器的要求。为了清楚起见,我将使用两个int值,而不是一个int和一个字符串,因为这使问题更容易理解,但是如果其中一个值是字符串,也会发生同样的事情。为了便于解释,比较两个int值与比较两个std::string具有相同的语义。

因此,例如,使用以下两个值:

a={2, 4}
b={1, 2}

第一个将比较小于第二个。4 > 2和比较器返回true。

a={1, 2}
b={2, 4}

这里也是一样。2 > 4为假,但1 < 2为真,所以比较器也返回真。

最终结果:这里你有两个值,每一个都小于另一个。

就像史波克先生会说的:这是不合逻辑的。

这是未定义的行为。

由于比较不满足严格弱排序的要求,

您可以使用在头文件<utility>中声明的标准函数std::tie解决这个问题,方法如下

std::sort(order.begin(), order.end(),
[](const auto& a, const auto& b) -> bool {
return std::tie( b.second, a.first ) < std::tie( a.second, b.first );
} );

之后,前10个排序记录看起来像

and 4
are 3
the 3
News 2
The 2
Times 2
UK 2
We 2
a 2
brands 2

注意按字典顺序大写字母排在小写字母前面。

比较器函数没有反映您的排序标准。通常,当您使用元组作为排序键时,例如(ke1, key2),当且仅当key1与元组相同时才比较key2。因此,比较器函数在比较key2时必须将tie反映为先决条件。一般来说,比较器是这样的:

if (a.key1 < a.key1) return true;
if (a.key1 == a.key1) /* tie case for key1 */ 
return a.key2 < b.key2;
return false;

所以比较器可以改成

[](const auto& a, const auto& b) -> bool {
if (a.second > b.second) return true;  
if (a.second == b.second)  // This reflects a tie for key1 
return a.first < b.first;
return false;
}