首页 > 解决方案 > std::sort 函数有问题。在 2 轮迭代之后,1 个元素似乎总是有空值

问题描述

我正在尝试编写一个排序程序,试图对我拥有的数据集进行排序。排序的关键是 Grid_ID,它恰好是一个字母数字数据。我试图相应地排序

它给了我一个错误

在抛出 'std::out_of_range' 的实例后调用终止
what(): basic_string::at: __n (which is 0) >= this->size() (which is 0)

在进行调试时,代码的读取部分似乎可以正常工作。将文件内容读取到 DataContainer 数据中会填充正确的键和文本位置数据。但是当涉及到 std::sort 时,当调用程序“less”时,const GridLabel& elem2 在第二次迭代后总是为零或 null

下面是一些数据集和部分源代码(我这里不包括按排序顺序写的内容,但应该是可运行的)

感谢帮助!

这是部分数据集

Grid_Id,Speed,Acc,ID
K31,173,8.37,1
K29,143,3.36,2
K29,107,4.56,3
K30,133,5.97,4
K30,153,2.38,5
J27,203,1.86,6
J27,143,1.59,7
I26,73,7.66,8
I27,134,2.86,9

这是代码

#include <algorithm>
#include <functional>
#include <fstream>
#include <string>

#include <deque>
#include <vector>

#include <iostream>
#include <sstream>

struct GridLabel
{
    std::string key_;
    std::istream::pos_type pos_;        // position in stream where this line starts

    GridLabel( const std::string& key, const std::istream::pos_type& pos) : key_( key)
                                                                   , pos_( pos)
    {
    }

    const GridLabel& operator=( const GridLabel& other)
    {
        key_ = other.key_;
        pos_ = other.pos_;

        return *this;
    }
};

typedef std::vector<  GridLabel> DataContainer;

// Return whether first element is smaller than the second
bool less( const  GridLabel& elem1, const  GridLabel& elem2 )
{
   std::stringstream ss1, ss2;
   ss1 <<  elem1.key_.at(0);
   ss2 <<  elem2.key_.at(0);

   int value  = (ss1.str()).compare(ss2.str());

   if( value < 0 )
   {
       return true;
   }
   else if( value == 0)
   {
       // need to check if the rest are smaller
       std::string substr1 = elem1.key_.substr(1, std::string::npos);
       std::string substr2 = elem2.key_.substr(1, std::string::npos);

       return (std::atoi(substr1.c_str()) < std::atoi(substr2.c_str()));
   }
   else
   {
       return false;
   }
 }

int main(int argc, char* argv[])
{
   DataContainer data;

   // read data into the vector here
   std::ifstream input( "some_file.csv");

   // check if it is correct
   if ( !input.good())
   {
        std::cerr << "Input file can not be openned." << std::endl;
        return -1;
   }

   std::string text;
   std::string key;
   std::istream::pos_type pos;
   int count=0, save=0;

   // to skip the header
   std::getline( input, text);

   for( int line = 0; !input.eof(); ++line)
   {
      // store the position before reading the line
      pos = input.tellg();

      std::getline( input, text);

      // parse it
      save = text.find(",");

      key = text.substr(0,(save));

      data.push_back(  GridLabel( key, pos));
   }

   // sort the data in sorted order
   std::sort( data.begin(), data.end(), less);

   // create the new file
   ...............

   return 0;
}

标签: c++algorithmsorting

解决方案


一个简化less()的比较

  1. 的第一个字符GridLabel::key
  2. 从 的第 2个字符开始的整数GridLabel::key

这不会考虑还存储了什么GridLabel::key。(这可能是 OP 的意图。)

样本:

#include <algorithm>
#include <iostream>
#include <string>

struct GridLabel {
  std::string key;
};

bool less(const GridLabel &elem1, const GridLabel &elem2)
{
  // compare first chars of keys
  const char c1 = elem1.key.at(0), c2 = elem2.key.at(0);
  if (c1 != c2) return c1 < c2;
  // compare integral beginning in 2nd char of keys
  const int i1 = atoi(elem1.key.c_str() + 1);
  const int i2 = atoi(elem2.key.c_str() + 1);
  return i1 < i2;
}

int main()
{
  GridLabel data[] = {
    { "K31,173,8.37,1" },
    { "K29,143,3.36,2" },
    { "K29,107,4.56,3" },
    { "K30,133,5.97,4" },
    { "K30,153,2.38,5" },
    { "J27,203,1.86,6" },
    { "J27,143,1.59,7" },
    { "I26,73,7.66,8" },
    { "I27,134,2.86,9" }
  };
  { std::cout << "Original data:\n";
    int i = 0;
    for (const GridLabel &entry : data) {
      std::cout << i++ << ": '" << entry.key << "'\n";
    }
  }
  std::cout << "Sorting...";
  std::sort(std::begin(data), std::end(data), less);
  std::cout << " Done.\n";
  { std::cout << "Sorted data:\n";
    int i = 0;
    for (const GridLabel &entry : data) {
      std::cout << i++ << ": '" << entry.key << "'\n";
    }
  }
}

输出:

Original data:
0: 'K31,173,8.37,1'
1: 'K29,143,3.36,2'
2: 'K29,107,4.56,3'
3: 'K30,133,5.97,4'
4: 'K30,153,2.38,5'
5: 'J27,203,1.86,6'
6: 'J27,143,1.59,7'
7: 'I26,73,7.66,8'
8: 'I27,134,2.86,9'
Sorting... Done.
Sorted data:
0: 'I26,73,7.66,8'
1: 'I27,134,2.86,9'
2: 'J27,203,1.86,6'
3: 'J27,143,1.59,7'
4: 'K29,143,3.36,2'
5: 'K29,107,4.56,3'
6: 'K30,133,5.97,4'
7: 'K30,153,2.38,5'
8: 'K31,173,8.37,1'

coliru 现场演示

请注意(根据谓词less()的实现方式)有很多元素被认为是相等的:

  • I26,73,7.66,8I27,134,2.86,9
  • J27,203,1.86,6J27,143,1.59,7
  • 等等

这些元素在排序后会以任意顺序出现。

或者,std::stable_sort()可以使用在这些情况下保留原始顺序。


推荐阅读