Quicksort does not work with 10k elements



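My teacher gave me a quicksort function to use and to time, but when it reaches a list of 10,000 elements it throws a stack overflow, and I don't know why. I have tested it on several different computers and get the same result each time: it fails roughly 9,375 elements into the 10,000-element list.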

The quicksort file:

#include "swap.h"
/** Chooses a pivot for quicksort's partition algorithm and swaps
 *  it with the first item in an array.
 * @pre theArray[first..last] is an array; first <= last.
 * @post theArray[first] is the pivot.
 * @param theArray  The given array.
 * @param first  The first element to consider in theArray.
 * @param last  The last element to consider in theArray. */
void choosePivot(int theArray[], int first, int last){
    //cerr << "choosePivot(array, " << first << ", " << last << ")\n";
    int mid = (last - first) / 2;
    if( (theArray[first] <= theArray[mid] &&
         theArray[mid] <= theArray[last]) ||
        (theArray[last] <= theArray[mid] &&
         theArray[mid] <= theArray[first]) ){
        // value at mid index is middle of values at first and last indices 
        swap(theArray[first], theArray[mid]);
    }else if( (theArray[first] <= theArray[last] &&
               theArray[last] <= theArray[mid]) ||
              (theArray[mid] <= theArray[last] &&
               theArray[last] <= theArray[first])){
        // value at last index is middle of values
        swap(theArray[first], theArray[last]);
    }
}

/** Partitions an array for quicksort.
 * @pre theArray[first..last] is an array; first <= last.
 * @post Partitions theArray[first..last] such that:
 *    S1 = theArray[first..pivotIndex-1] <  pivot
 *         theArray[pivotIndex]          == pivot
 *    S2 = theArray[pivotIndex+1..last]  >= pivot
 * @param theArray  The given array.
 * @param first  The first element to consider in theArray.
 * @param last  The last element to consider in theArray.
 * @param pivotIndex  The index of the pivot after partitioning. */
void partition(int theArray[],
               int first, int last, int& pivotIndex){
   // place pivot in theArray[first]
   choosePivot(theArray, first, last);
   int pivot = theArray[first];     // copy pivot
   // initially, everything but pivot is in unknown
   int lastS1 = first;           // index of last item in S1
   int firstUnknown = first + 1; // index of first item in
                                 // unknown
   // move one item at a time until unknown region is empty
   for (; firstUnknown <= last; ++firstUnknown)
   {  // Invariant: theArray[first+1..lastS1] < pivot
      //         theArray[lastS1+1..firstUnknown-1] >= pivot
      // move item from unknown to proper region
      if (theArray[firstUnknown] < pivot)
      {  // item from unknown belongs in S1
         ++lastS1;
         swap(theArray[firstUnknown], theArray[lastS1]);
      }  // end if
      // else item from unknown belongs in S2
   }  // end for
   // place pivot in proper position and mark its location
   swap(theArray[first], theArray[lastS1]);
   pivotIndex = lastS1;
}  // end partition
/** sorts the items in an array into ascending order.
 * @pre theArray[first..last] is an array.
 * @post theArray[first..last] is sorted.
 * @param theArray  The given array.
 * @param first  The first element to consider in theArray.
 * @param last  The last element to consider in theArray. */
void quicksort(int theArray[], int first, int last){
    int pivotIndex;
   if (first < last)
   {  // create the partition: S1, pivot, S2
      partition(theArray, first, last, pivotIndex);
      // sort regions S1 and S2
      quicksort(theArray, first, pivotIndex-1);
      quicksort(theArray, pivotIndex+1, last);
   }  // end if
}  // end quicksort

The swap.h file:

#ifndef _SWAP_H
#define _SWAP_H
/** Swaps two items.
 * @pre x and y are the items to be swapped.
 * @post Contents of actual locations that x and y represent are
 *       swapped.
 * @param x  Given data item.
 * @param y  Given data item. */
void swap(int& x, int& y){
   int temp = x;
   x = y;
   y = temp;
}  // end swap
#endif /* _SWAP_H */

And the driver file:

//main.cpp
//Angelo Todaro
//Main driver to clock the timing efficiency of different sort algorithms for different sized lists
#include "quickSort.cpp"
#include <iostream>
#include <time.h>
using namespace std;
double diffclock(clock_t,clock_t);
int main(){
    clock_t begin, end;//clocks to store number of ticks at beginning and end
    srand(time(NULL));//initialize seed
    cout << "# of Elements\tQuick\n";
    for(int n = 10; n < 100000; n*=10){
        int* array = new int[n];
        cout << n << "\t\t";
        for(int i =0; i < n; i++){
            array[i]=rand()%1000;
        }

        //quick sort
        begin=clock();
        quicksort(array,0,n);
        end=clock();
        cout << diffclock(end,begin) << "\t";
    }
    return 0;
}

double diffclock(clock_t clock1, clock_t clock2){
    double diffticks = clock1-clock2;//finds difference between ticks
    double diffmili=diffticks/CLOCKS_PER_SEC;//turns ticks into seconds
    return diffmili;
}

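This is a recursive quicksort implementation, and it does not pick a very good pivot either. For some inputs it can end up making one nested function call per element, and 10k calls on the stack is a lot to handle. At that point, only an iterative in-place quicksort with a random pivot is a good algorithm.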
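To make the suggestion concrete, here is a minimal sketch of an iterative in-place quicksort with a random pivot. It is not the original author's code: the name `iterativeQuicksort`, the use of `std::vector<int>`, and the explicit `std::stack` of index ranges are all assumptions made for illustration. The partition step follows the same scheme as the posted `partition` function.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <stack>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original post): sorts `data` in place
// using an explicit stack of index ranges instead of recursive calls.
void iterativeQuicksort(std::vector<int>& data) {
    if (data.size() < 2) return;
    std::mt19937 rng(std::random_device{}());
    std::stack<std::pair<int, int>> pending;  // inclusive [first, last] ranges left to sort
    pending.push({0, static_cast<int>(data.size()) - 1});

    while (!pending.empty()) {
        const int first = pending.top().first;
        const int last = pending.top().second;
        pending.pop();
        if (first >= last) continue;  // zero or one element: nothing to do

        // Move a randomly chosen pivot to the front of the range.
        std::uniform_int_distribution<int> pick(first, last);
        std::swap(data[first], data[pick(rng)]);
        const int pivot = data[first];

        // Same partition scheme as the posted code: data[first+1..lastS1] < pivot.
        int lastS1 = first;
        for (int i = first + 1; i <= last; ++i) {
            if (data[i] < pivot) {
                ++lastS1;
                std::swap(data[i], data[lastS1]);
            }
        }
        std::swap(data[first], data[lastS1]);  // pivot lands at index lastS1

        // Push the larger side first so the smaller side is handled next;
        // this keeps the number of pending ranges small.
        const std::pair<int, int> left(first, lastS1 - 1);
        const std::pair<int, int> right(lastS1 + 1, last);
        if (left.second - left.first > right.second - right.first) {
            pending.push(left);
            pending.push(right);
        } else {
            pending.push(right);
            pending.push(left);
        }
    }
    // Postcondition check; compiled out when NDEBUG is defined.
    assert(std::is_sorted(data.begin(), data.end()));
}
```

Because the pending ranges live on the heap rather than the call stack, the depth of the call stack no longer depends on the input. Note that this partition scheme still slows down on inputs with many equal keys, since every element equal to the pivot ends up on the same side.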

When I debugged your code, I could see that when choosePivot is called on a region with only one element, that element gets completely wiped out:

enter quicksort(array, 2, 3)
1  //content of region
enter partition(array, 2, 3)
1  //content of region 
enter choosePivot(array, 2, 3)
1  //content of region 
0  //content of region - WAIT, WHAT HAPPENED TO THE ONE
end choosePivot
0   //content of region
end partition

So we have at least located the problem: choosePivot. Looking closely at that function, I eventually found the bug:

void choosePivot(int theArray[], int first, int last){
    //cerr << "enter choosePivot(array, " << first << ", " << last << ")\n";
    int mid = (last - first) / 2;
    if( (theArray[first] <= theArray[mid] &&
         theArray[mid] <= theArray[last]) ||
        (theArray[last] <= theArray[mid] &&
         theArray[mid] <= theArray[first]) ){

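theArray[last] is out of bounds, one past the end of the array: main calls quicksort(array, 0, n), so last is n on the outermost call, while the valid indices are 0 through n - 1. You are swapping a completely random, invalid number into the array and swapping an element out of it. This seems to happen quite often. This is why we test for correctness before we test with large inputs.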
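An alternative to editing choosePivot is to keep the documented contract, in which theArray[first..last] is an inclusive range, and pass n - 1 as the last index. The sketch below illustrates that approach; it is not the fix described in this answer. The namespace `sketch` and the wrapper `sortArray` are hypothetical names introduced so the code does not collide with the original definitions. It also computes `mid` as `first + (last - first) / 2`, on the assumption that `mid` is meant to be an index inside the range rather than an offset from zero.

```cpp
#include <cassert>
#include <utility>

namespace sketch {  // hypothetical namespace, used only for this illustration

// Moves the median of the first, middle, and last values to theArray[first].
void choosePivot(int theArray[], int first, int last) {
    assert(first <= last);                 // documented precondition
    int mid = first + (last - first) / 2;  // assumption: an index inside [first, last]
    if ((theArray[first] <= theArray[mid] && theArray[mid] <= theArray[last]) ||
        (theArray[last] <= theArray[mid] && theArray[mid] <= theArray[first])) {
        std::swap(theArray[first], theArray[mid]);   // middle value is the median
    } else if ((theArray[first] <= theArray[last] && theArray[last] <= theArray[mid]) ||
               (theArray[mid] <= theArray[last] && theArray[last] <= theArray[first])) {
        std::swap(theArray[first], theArray[last]);  // last value is the median
    }
    // Otherwise theArray[first] already holds the median.
}

// Partitions theArray[first..last] around the pivot chosen above.
void partition(int theArray[], int first, int last, int& pivotIndex) {
    choosePivot(theArray, first, last);
    int pivot = theArray[first];
    int lastS1 = first;  // index of the last item known to be < pivot
    for (int firstUnknown = first + 1; firstUnknown <= last; ++firstUnknown) {
        if (theArray[firstUnknown] < pivot) {
            ++lastS1;
            std::swap(theArray[firstUnknown], theArray[lastS1]);
        }
    }
    std::swap(theArray[first], theArray[lastS1]);
    pivotIndex = lastS1;
}

// Sorts the inclusive range theArray[first..last].
void quicksort(int theArray[], int first, int last) {
    if (first < last) {
        int pivotIndex;
        partition(theArray, first, last, pivotIndex);
        quicksort(theArray, first, pivotIndex - 1);
        quicksort(theArray, pivotIndex + 1, last);
    }
}

// Hypothetical wrapper: for n elements the last valid index is n - 1.
void sortArray(int theArray[], int n) {
    quicksort(theArray, 0, n - 1);
}

}  // namespace sketch
```

With this contract, the driver would call `sketch::sortArray(array, n)` (or `quicksort(array, 0, n - 1)` directly), and no access reaches index n.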

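When I changed theArray[last] to theArray[last-1], your code passed all of my tests. Notes:

  • The C++ library already has std::swap
  • You leak all of your memory. Match every new with a delete (here, new[] with delete[]), or better, use std::vector<int>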
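
The second note can be illustrated with a small timing helper built on std::vector<int>, which releases its storage automatically. This is only a sketch under stated assumptions: the name `timeSort`, the `Sorter` callable parameter, the fixed seed, and the use of std::chrono::steady_clock in place of clock() are choices made here, not part of the original driver.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: fills a vector with n random values in [0, 999],
// runs `sorter` on it, checks the result, and returns the elapsed seconds.
template <typename Sorter>
double timeSort(std::size_t n, Sorter sorter) {
    std::mt19937 rng(12345);  // fixed seed so runs are repeatable
    std::uniform_int_distribution<int> dist(0, 999);
    std::vector<int> data(n);  // freed automatically when the function returns
    for (int& x : data) x = dist(rng);

    auto begin = std::chrono::steady_clock::now();
    sorter(data);
    auto end = std::chrono::steady_clock::now();

    // Verify correctness before trusting the timing.
    assert(std::is_sorted(data.begin(), data.end()));
    return std::chrono::duration<double>(end - begin).count();
}
```

For example, `timeSort(10000, [](std::vector<int>& v) { std::sort(v.begin(), v.end()); })` times std::sort as a stand-in. A quicksort that takes a raw array could be passed the same way by calling it on `v.data()` with bounds 0 and `v.size() - 1`.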
