我在github中为quickselect
算法找到了这段代码,也称为order-statistics
。此代码运行良好。
我不理解medianOf3
方法,它应该按排序顺序排列第一个、中间个和最后一个索引。但实际上,在调用medianof3
方法后,当我输出数组时,它不会。除了swap(list, centerIndex, rightIndex - 1);
的最后一个调用之外,我可以按照这个方法来判断它在做什么。有人能解释一下为什么叫这个吗?
import java.util.Arrays;
/**
* This program determines the kth order statistic (the kth smallest number in a
* list) in O(n) time in the average case and O(n^2) time in the worst case. It
* achieves this through the Quickselect algorithm.
*
* @author John Kurlak <john@kurlak.com>
* @date 1/17/2013
*/
public class Quickselect {
/**
* Runs the program with an example list.
*
* @param args The command-line arguments.
*/
public static void main(String[] args) {
int[] list = { 3, 5, 9, 10, 7, 40, 23, 45, 21, 2 };
int k = 6;
int median = medianOf3(list, 0, list.length-1);
System.out.println(median);
System.out.println("list is "+ Arrays.toString(list));
Integer kthSmallest = quickselect(list, k);
if (kthSmallest != null) {
System.out.println("The kth smallest element in the list where k=" + k + " is " + kthSmallest + ".");
} else {
System.out.println("There is no kth smallest element in the list where k=" + k + ".");
}
System.out.println(Arrays.toString(list));
}
/**
* Determines the kth order statistic for the given list.
*
* @param list The list.
* @param k The k value to use.
* @return The kth order statistic for the list.
*/
public static Integer quickselect(int[] list, int k) {
return quickselect(list, 0, list.length - 1, k);
}
/**
* Recursively determines the kth order statistic for the given list.
*
* @param list The list.
* @param leftIndex The left index of the current sublist.
* @param rightIndex The right index of the current sublist.
* @param k The k value to use.
* @return The kth order statistic for the list.
*/
public static Integer quickselect(int[] list, int leftIndex, int rightIndex, int k) {
// Edge case
if (k < 1 || k > list.length) {
return null;
}
// Base case
if (leftIndex == rightIndex) {
return list[leftIndex];
}
// Partition the sublist into two halves
int pivotIndex = randomPartition(list, leftIndex, rightIndex);
int sizeLeft = pivotIndex - leftIndex + 1;
// Perform comparisons and recurse in binary search / quicksort fashion
if (sizeLeft == k) {
return list[pivotIndex];
} else if (sizeLeft > k) {
return quickselect(list, leftIndex, pivotIndex - 1, k);
} else {
return quickselect(list, pivotIndex + 1, rightIndex, k - sizeLeft);
}
}
/**
* Randomly partitions a set about a pivot such that the values to the left
* of the pivot are less than or equal to the pivot and the values to the
* right of the pivot are greater than the pivot.
*
* @param list The list.
* @param leftIndex The left index of the current sublist.
* @param rightIndex The right index of the current sublist.
* @return The index of the pivot.
*/
public static int randomPartition(int[] list, int leftIndex, int rightIndex) {
int pivotIndex = medianOf3(list, leftIndex, rightIndex);
int pivotValue = list[pivotIndex];
int storeIndex = leftIndex;
swap(list, pivotIndex, rightIndex);
for (int i = leftIndex; i < rightIndex; i++) {
if (list[i] <= pivotValue) {
swap(list, storeIndex, i);
storeIndex++;
}
}
swap(list, rightIndex, storeIndex);
return storeIndex;
}
/**
* Computes the median of the first value, middle value, and last value
* of a list. Also rearranges the first, middle, and last values of the
* list to be in sorted order.
*
* @param list The list.
* @param leftIndex The left index of the current sublist.
* @param rightIndex The right index of the current sublist.
* @return The index of the median value.
*/
public static int medianOf3(int[] list, int leftIndex, int rightIndex) {
int centerIndex = (leftIndex + rightIndex) / 2;
if (list[leftIndex] > list[rightIndex]) {
swap(list, leftIndex, centerIndex);
}
if (list[leftIndex] > list[rightIndex]) {
swap(list, leftIndex, rightIndex);
}
if (list[centerIndex] > list[rightIndex]) {
swap(list, centerIndex, rightIndex);
}
swap(list, centerIndex, rightIndex - 1);
return rightIndex - 1;
}
/**
* Swaps two elements in a list.
*
* @param list The list.
* @param index1 The index of the first element to swap.
* @param index2 The index of the second element to swap.
*/
public static void swap(int[] list, int index1, int index2) {
int temp = list[index1];
list[index1] = list[index2];
list[index2] = temp;
}
}
所以我编写了原始代码,但在使其可读性方面做得很差。
回过头来看,我不认为这行代码是必要的,但我认为这是一个小的优化。如果我们删除这行代码并返回centerIndex
,它似乎可以正常工作,没有任何问题。
不幸的是,它执行的优化应该从medianOf3()
中重构出来,并转移到randomPartition()
中。
从本质上讲,优化是为了在对子数组进行分区之前,尽可能多地对其进行"部分排序"。原因是:数据排序越多,我们未来的分区选择就越好,这意味着我们的运行时间有望更接近O(n),而不是O(n^2)。在randomPartition()
方法中,我们将pivot值移动到我们正在查看的子数组的最右边。这将最右边的值移动到子数组的中间。这是不希望的,因为最右边的值应该是"更大的值"。我的代码试图通过将数据透视索引放在最右边的索引旁边来防止这种情况。然后,当枢轴索引与randomPartition()
中最右侧的索引交换时,"较大"的最右侧值不会移动到子数组的中间,而是保持在右侧附近。
函数medianOf3
用于定义左中值和右中值的顺序。最后陈述
swap(list, centerIndex, rightIndex - 1)
用于实现以下前提条件:
然而,而不是像在quicksort中那样向两边递归,quickselect只递归到一侧&具有该元素的一侧正在搜索。这降低了O(n log n)(in快速排序)到O(n)(在快速选择中)。
然后算法继续:
for (int i = leftIndex; i < rightIndex; i++) {
if (list[i] <= pivotValue) {
swap(list, storeIndex, i);
storeIndex++;
}
}
以便
枢轴左侧的值小于或等于枢轴和枢轴右侧的值大于枢轴。