气泡无序播放-加权无序播放



可以设想对冒泡排序的修改;交换;以概率p随机发生而不是通过执行比较。结果可以称为";泡沫洗牌";。靠近前面的元素可能会留在那里,但有机会被转移到列表的后面。

修改从互联网上窃取的泡沫排序,你可以想出以下方法:

import random
def bubble_shuffle(arr, p):
arr = copy.copy(arr)
n = len(arr) 

# Traverse through all array elements 
for i in range(n-1): 
# range(n) also work but outer loop will repeat one time more than needed. 

# Last i elements are already in place 
for j in range(0, n-i-1): 

# traverse the array from 0 to n-i-1 
# Swap if random number [0, 1] is less than p
if random.random() < p:
arr[j], arr[j+1] = arr[j+1], arr[j]
return arr

这个算法是n阶平方。。。但是元素最终出现在任何特定位置的概率应该是可以提前计算的;需要";为n平方。是否可以采取更有效的方法?

我曾考虑过从几何分布中采样,并将其添加到原始索引中(加上(len(n)-i-1)/len(n)以打破平局(,但这并不能提供相同的动态。

我同意btilly和其他人的观点,即相关性使这非常困难,如果不是不可能的话。

关于你的方法,单次传球的运动确实是几何分布的。然而,对于许多次传递,中心极限定理开始发挥作用。忽略边界效应,在一次传递中,元素以概率p向左移动一次,否则(以概率(1-p)(以成功概率1-p向右移动几何次数。这个分布的平均值为零。第一种可能性对方差贡献CCD_ 6。第二个贡献了(1-p) sum_{i>=0} p^i (1-p) i^2,WolframAlpha将其评估为(1+p) p / (1-p)

假设这个方差为v = p + (1+p) p / (1-p),我们可以想象,t通过后元素的delta位置正态分布,平均值为零,标准偏差为sqrt(t v)。我们的下一个近似是从离散时间切换到连续时间,对于每个元素,提取一个正常样本x,并假设delta位置平滑地变化为sqrt(t v) x。对于最初位于i位置的元素,我们可以求解方程i + sqrt(t v) x = n - t,使t近似该元素何时被锁定。然后我们只按t降序进行排序。

下面是一个实现这一点的简短Python程序。希望它离得足够近。

import math
import random

def variance(p):
return p + (1 + p) * p / (1 - p)

def solve_quadratic(b, c):
assert c < 0
return (math.sqrt(b ** 2 - 4 * c) - b) / 2

def bubble_shuffle(arr, p):
n = len(arr)
s = math.sqrt(variance(p))
return [
arr[i]
for i in sorted(
range(n),
key=lambda i: solve_quadratic(random.gauss(0, s), i - n),
reverse=True,
)
]

if __name__ == "__main__":
print(bubble_shuffle(range(100), 0.5))

通过每(n,p(只需要进行一次预计算,我们可以在预期的线性时间内模拟bubble_shuffle运行(不包括预计算(。

方法:get_bub(n,p(:O(n^2(方法来运行bubble_shuffle

get_expected_bub(n,p(:O(n^2(方法计算运行bubble_shuffle 的每个位置的预期平均值

get_dist(pos,p(:simbub使用的O(1(方法,该方法根据所使用的p获得随机数目的连续交换。

get_simbb(n,p_arr(:期望O(n*min(n,(1/(1-p((方法来模拟运行bubble_shuffle。对于p=0.5,这是预期的O(n(。对于p=1-(1/n(,这是O(n^2(。

get_expected_simbb(n,p_arr(:O(n^2(方法计算运行simbub的每个位置的预期平均值。

get_p_arr(n,p,tolerance(:对于给定的n&p.

compare(n,p,p_arr,trials(:多次运行simbub并将结果与bubble_shuffle的预期进行比较的方法。

time_trials(n,p,seconds(:对于给定的n&p、 在输入秒内同时运行bubble_shuffle和simbub,并比较我们能够完成的运行次数。

所有代码都是用Ruby编写的。

# Run bubble_shuffle
def get_bub(n, p)
arr = [*0..(n-1)]
0.upto(n-1) do |i|
0.upto(n-i-2) do |j|
if rand < p
arr[j], arr[j+1] = arr[j+1], arr[j]
end
end
end
return arr
end

# Get the expected average results of running bubble_shuffle many times
# This works by iteratively distributing value according to p.
def get_expected_bub(n, p)
arr = [*0.upto(n-1)]  

(n-1).downto(0) do |last_index|
working_arr = arr.clone
0.upto(last_index) do |i|
working_arr[i] = 0
end
0.upto(last_index) do |source_index|
min_sink = [0, source_index-1].max
max_sink = last_index
min_sink.upto(max_sink) do |sink_index|
portion = 1.0
if sink_index == source_index - 1
portion *= p
else
portion *= (1-p) if source_index > 0
portion *= (p**(sink_index - source_index)) if sink_index > source_index
portion *= (1-p) if sink_index < last_index
end
working_arr[sink_index] += arr[source_index] * portion
end
end
0.upto(last_index) do |i|
arr[i] = working_arr[i]
end
end
return arr
end

# For simbub, randomly get the distance to the index being swapped into 
# the current position
def get_dist(pos, p)
return 0 if pos == 0
return [pos, Math.log(1 - rand, p).floor].min
end

# Run simbub from the last-to-first index
# p_arr is the array of probabilities corresponding to the effective probability
# of swapping used at each position. The last value of this array will always
# equal the p value being simulated. So will the first, though this is not used.
def get_simbub(n, p_arr)
arr = [*0..(n-1)]
(n-1).downto(0) do |pos|
p = p_arr[pos]
dist = get_dist(pos, p)
if dist > 0
val_moving_up = arr[pos - dist]
(pos - dist).upto(pos - 1) do |j|
arr[j] = arr[j+1]
end
arr[pos] = val_moving_up
end
end
return arr
end

# Get the expected average results of running simbub many times
# This works by iteratively distributing value according to p_arr.
def get_expected_simbub(n, p_arr)
arr = [*0.upto(n-1)]  

(n-1).downto(1) do |last_index|
working_arr = arr.clone
0.upto(last_index) do |i|
working_arr[i] = 0
end

p = p_arr[last_index]
cum_p_distance = 0
0.upto(last_index) do |distance|
if distance == last_index
p_distance = p ** distance
else
p_distance = (1-p) * (p ** distance)
end

working_arr[last_index] += p_distance * arr[last_index - distance]

if distance >= 1
working_arr[last_index - distance] = arr[last_index - distance] + (1 - cum_p_distance) * (arr[last_index - distance + 1] - arr[last_index - distance])
end

cum_p_distance += p_distance
end
arr = working_arr
end
return arr
end

# Solve for the p_arr that yields the same expected averages for simbub for 
# each position (within tolerance) as bub
def get_p_arr(n, p, tolerance = 0.00001)
expected_bub = get_expected_bub(n, p)
p_arr = [p] * n

(n-2).downto(1) do |pos|
min_pos_p = 0.0
max_pos_p = 1.0
while true do
expected_simbub = get_expected_simbub(n, p_arr)
if expected_simbub[pos] > expected_bub[pos] + tolerance
min_pos_p = p_arr[pos]
p_arr[pos] = (p_arr[pos] + max_pos_p) / 2.0
elsif expected_simbub[pos] < expected_bub[pos] - tolerance
max_pos_p = p_arr[pos]
p_arr[pos] = (p_arr[pos] + min_pos_p) / 2.0
else
break
end
end
end
return p_arr
end

def compare(n, p, p_arr, trials)
expected_bub = get_expected_bub(n, p)
#bub_totals = [0]*n
simbub_totals = [0]*n
trials.times do 
simbub_trial = get_simbub(n, p_arr, 0)
#bub_trial = bub(n, p)
0.upto(n-1) do |i|
simbub_totals[i] += simbub_trial[i] 
#bub_totals[i] += bub_trial[i]
end
end
puts "   #:  expbub |  simbub |   delta"
0.upto(n-1) do |i|
#b = bub_totals[i] / trials.to_f
b = expected_bub[i]
s = simbub_totals[i] / trials.to_f
puts "#{(i).to_s.rjust(4)}: #{b.round(2).to_s.rjust(7)} | #{s.round(2).to_s.rjust(7)} | #{(s-b).round(2).to_s.rjust(7)}"
end
end

def time_trials(n, p, seconds)
t = Time.now
bub_counter = 0
while Time.now < t + seconds do
get_bub(n, p)
bub_counter += 1
end
t = Time.now
p_arr = get_p_arr(n, p, 0.0001)
p_arr_seconds = Time.now - t
t = Time.now
simbub_counter = 0
while Time.now < t + seconds do
get_simbub(n, p_arr)
simbub_counter += 1
end
puts "Trial results (#{seconds} seconds): "
puts "Time to get p_arr for simbub: #{p_arr_seconds.round(2)}"
puts "bub runs: #{bub_counter}"
puts "simbub runs: #{simbub_counter}"
puts "ratio: #{(simbub_counter.to_f/bub_counter.to_f).round(2)}"
end

n=100,p=0.5 的误差与预期

compare(100, 0.5, p_arr, 10000)
#:  expbub |  simbub |   delta
0:   10.27 |   10.23 |   -0.04
1:   10.27 |   10.18 |   -0.09
2:   10.33 |   10.16 |   -0.16
3:   10.44 |   10.45 |    0.01
4:   10.61 |   10.66 |    0.05
5:   10.83 |   10.83 |   -0.01
6:   11.11 |    11.1 |   -0.02
7:   11.45 |    11.5 |    0.05
8:   11.84 |   11.92 |    0.08
9:   12.27 |   12.35 |    0.08
10:   12.76 |   12.78 |    0.02
11:   13.29 |   13.23 |   -0.06
12:   13.87 |   13.72 |   -0.15
13:   14.49 |   14.58 |    0.09
14:   15.15 |   15.14 |   -0.01
15:   15.85 |   15.83 |   -0.02
16:   16.58 |   16.51 |   -0.06
17:   17.34 |   17.35 |    0.01
18:   18.13 |   18.26 |    0.13
19:   18.95 |    19.0 |    0.05
20:   19.79 |   19.75 |   -0.04
21:   20.66 |   20.85 |    0.19
22:   21.54 |    21.7 |    0.16
23:   22.45 |   22.64 |    0.19
24:   23.36 |   23.49 |    0.13
25:   24.29 |   24.19 |   -0.11
26:   25.24 |   25.17 |   -0.07
27:   26.19 |   26.38 |    0.19
28:   27.15 |   27.16 |    0.01
29:   28.12 |   28.16 |    0.05
30:   29.09 |   28.99 |    -0.1
31:   30.07 |   30.08 |     0.0
32:   31.05 |   31.19 |    0.14
33:   32.04 |   31.88 |   -0.16
34:   33.03 |   33.07 |    0.03
35:   34.02 |   33.78 |   -0.24
36:   35.02 |   34.97 |   -0.05
37:   36.01 |   36.05 |    0.04
38:   37.01 |    37.0 |   -0.01
39:   38.01 |   37.95 |   -0.06
40:    39.0 |   38.94 |   -0.07
41:    40.0 |   39.94 |   -0.06
42:    41.0 |   41.01 |     0.0
43:    42.0 |   42.08 |    0.08
44:    43.0 |   42.87 |   -0.13
45:    44.0 |   43.88 |   -0.12
46:    45.0 |   44.99 |   -0.02
47:    46.0 |   45.92 |   -0.08
48:    47.0 |    46.8 |    -0.2
49:    48.0 |   47.92 |   -0.08
50:    49.0 |   49.01 |    0.01
51:    50.0 |   50.04 |    0.04
52:    51.0 |   51.11 |    0.11
53:    52.0 |   51.95 |   -0.05
54:    53.0 |   53.08 |    0.08
55:    54.0 |   54.05 |    0.05
56:    55.0 |   54.95 |   -0.05
57:    56.0 |   55.98 |   -0.02
58:    57.0 |   57.13 |    0.13
59:    58.0 |   58.01 |    0.01
60:    59.0 |   59.11 |    0.11
61:    60.0 |   60.01 |    0.01
62:    61.0 |   61.02 |    0.02
63:    62.0 |   61.93 |   -0.07
64:    63.0 |   63.05 |    0.05
65:    64.0 |   64.01 |    0.01
66:    65.0 |    65.0 |    -0.0
67:    66.0 |   66.04 |    0.04
68:    67.0 |   67.11 |    0.11
69:    68.0 |   68.01 |    0.01
70:    69.0 |   69.03 |    0.03
71:    70.0 |   70.08 |    0.08
72:    71.0 |   70.96 |   -0.04
73:    72.0 |   72.01 |    0.01
74:    73.0 |   72.95 |   -0.05
75:    74.0 |    74.0 |    -0.0
76:    75.0 |   74.99 |   -0.01
77:    76.0 |   75.92 |   -0.08
78:    77.0 |   76.98 |   -0.02
79:    78.0 |   77.91 |   -0.09
80:    79.0 |   79.05 |    0.05
81:    80.0 |   79.96 |   -0.04
82:    81.0 |    81.0 |    -0.0
83:    82.0 |    82.0 |    -0.0
84:    83.0 |   82.98 |   -0.02
85:    84.0 |   84.06 |    0.06
86:    85.0 |   84.99 |   -0.01
87:    86.0 |   85.97 |   -0.03
88:    87.0 |    87.0 |    -0.0
89:    88.0 |   88.04 |    0.04
90:    89.0 |   88.95 |   -0.05
91:    90.0 |   90.03 |    0.03
92:    91.0 |   91.01 |    0.01
93:    92.0 |   91.97 |   -0.03
94:    93.0 |   92.98 |   -0.02
95:    94.0 |   94.01 |    0.01
96:    95.0 |   94.99 |   -0.01
97:    96.0 |   95.97 |   -0.03
98:    97.0 |   97.03 |    0.03
99:    98.0 |    98.0 |    -0.0

计时赛:(simbub在60秒内运行(/(bubble_shuffle在60秒后运行(

p=0.01  p=0.25  p=0.50  p=0.75  p=0.99
n = 100   10.85   10.17   10.75    9.53    4.16
n = 200   22.98   18.11   17.30   13.46    5.33 
n = 300   27.70   25.03   23.88   18.11    5.94
n = 400   41.09   29.46   27.11   21.81    6.92

最新更新