Probabilistic data structure



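The idea is to have a data structure whose elements can only be accessed randomly, with each element drawn according to a user-defined probability factor. So if a structure of 100 elements yields x with probability 0.5, then in theory, retrieving a random element one hundred times should return x about 50 times.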

I couldn't find an off-the-shelf solution that does this, so here is my take:

    import kotlin.math.absoluteValue

    /**
     * @author mhashim6 on 13/10/2019
     */
    class ProbabilitySet<T>(private val items: Array<out Pair<T, Float>>) {
        // Item indices, each repeated a number of times derived from its probability.
        private var probabilityIndices: List<Int>

        private fun calcFutureSize(count: Int, probability: Float) =
            ((count / (1f - probability)) - count).toInt().absoluteValue

        init {
            probabilityIndices = items.withIndex().flatMap { (i, item) ->
                item.act { (_, probability) ->
                    calcFutureSize(items.size, probability).minus(items.size).act { delta ->
                        Iterable { ConstIterator(delta, i) }
                    }
                }
            }
        }

        // Note: this draws uniformly from items; probabilityIndices is not consulted here.
        fun next(): T = items.random().first
    }

    // Yields the same constant `size` times.
    class ConstIterator(private var size: Int, private val const: Int) : IntIterator() {
        override fun nextInt(): Int {
            size--
            return const
        }

        override fun hasNext(): Boolean = size > 0
    }

    fun <E> probabilitySetOf(vararg items: Pair<E, Float>) = ProbabilitySet(items)

    inline fun <T, R> T.act(action: (T) -> R) = action(this)

I tried to make it mutable, but ran into a lot of complexity in terms of time and memory, so for now it is immutable.

Is this a viable implementation? Is there already an implementation that solves this problem? How can it be made mutable?

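I assume that if the element probabilities do not sum to 1, the actual probability of each element must be computed by dividing its original probability by the sum of all element probabilities. For example, a ProbabilitySet consisting of "A" to 0.1F and "B" to 0.3F returns "A" in 25% of cases and "B" in 75% of cases.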
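The normalization assumed above can be sketched as follows. The helper name normalize is hypothetical and is not part of either implementation in this thread; it simply divides each weight by the total, which is one straightforward reading of the assumption.

```kotlin
// Hypothetical helper (not part of the original code): turns raw weights
// into effective probabilities by dividing each one by the total weight.
fun <E> normalize(items: List<Pair<E, Float>>): Map<E, Float> {
    val total = items.map { it.second }.sum()
    require(total > 0f) { "At least one weight must be positive" }
    return items.associate { (element, weight) -> element to weight / total }
}
```

Calling normalize(listOf("A" to 0.1f, "B" to 0.3f)) should give approximately 0.25 for "A" and 0.75 for "B", up to Float rounding, since 0.1 / 0.4 = 0.25 and 0.3 / 0.4 = 0.75.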

Below is my implementation of a mutable ProbabilitySet, in which add runs in O(1) and next runs in O(log N):

    import kotlin.random.Random

    class ProbabilitySet<E>(
        private val random: Random = Random.Default
    ) {
        // Each node owns the range [previous sum, new sum] of the cumulative weight.
        private val nodes = mutableListOf<Node>()
        private var sum = 0F

        fun add(element: E, probability: Float) {
            require(probability >= 0) { "[$element]'s probability ($probability) is less than 0" }
            val oldSum = sum
            sum += probability
            nodes += Node(oldSum..sum, element)
        }

        fun isEmpty() = sum == 0F

        fun next(): E {
            if (isEmpty()) throw NoSuchElementException("ProbabilitySet is empty")
            // Pick a point in [0, sum) and binary-search for the node whose range contains it.
            val index = random.nextFloat() * sum
            return nodes[nodes.binarySearch {
                when {
                    it.range.start > index -> 1
                    it.range.endInclusive < index -> -1
                    else -> 0
                }
            }].element
        }

        private inner class Node(
            val range: ClosedRange<Float>,
            val element: E
        )
    }

Factory method:

    fun <E> probabilitySetOf(vararg items: Pair<E, Float>, random: Random = Random.Default) =
        ProbabilitySet<E>(random).apply {
            items.forEach { (element, probability) -> add(element, probability) }
        }

Usage example:

    val set = probabilitySetOf("A" to 0.4F, "B" to 0.3F)
    println(set.next())
    set.add("C", 0.9F)
    println(set.next())
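To check empirically that draws follow the normalized weights, something like the sketch below could be used. It is self-contained, so it does not reuse the class above: CumulativeSampler and countDraws are hypothetical names introduced only for this illustration. It keeps a list of running totals instead of range nodes, and it assumes strictly positive weights.

```kotlin
import kotlin.random.Random

// Hypothetical standalone sampler, for illustration only: it follows the same
// cumulative-weight idea, stored as a plain list of running totals.
class CumulativeSampler<E>(private val random: Random = Random.Default) {
    private val elements = mutableListOf<E>()
    private val upperBounds = mutableListOf<Float>() // running totals, increasing

    fun add(element: E, weight: Float) {
        require(weight > 0f) { "weight must be positive" }
        elements += element
        upperBounds += (upperBounds.lastOrNull() ?: 0f) + weight
    }

    fun next(): E {
        check(elements.isNotEmpty()) { "sampler is empty" }
        val target = random.nextFloat() * upperBounds.last()
        // binarySearch returns the match index, or -(insertionPoint) - 1 if absent.
        val found = upperBounds.binarySearch(target)
        // An exact hit on a bound belongs to the next element's half-open slice.
        val index = if (found >= 0) found + 1 else -(found + 1)
        return elements[index.coerceAtMost(elements.lastIndex)]
    }
}

// Draws `draws` samples and counts how often each element came back.
fun <E> countDraws(sampler: CumulativeSampler<E>, draws: Int): Map<E, Int> =
    (1..draws).map { sampler.next() }.groupingBy { it }.eachCount()
```

With the weights from the usage example ("A" 0.4, "B" 0.3, "C" 0.9, total 1.6), the expected shares are 25%, 18.75%, and 56.25%, so over many draws "C" should come back most often and "B" least often.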
