我需要一个建议,告诉我如何用程序计算从开始节点到结束节点的所有可能路径(程序化(,我可以使用C#、Python或Matlab,但我不知道哪一个更容易、更快地编码,也不知道从哪里开始,我还需要自动绘制节点之间的边,然后计算更多的公式。我有以下数据
Node Date Data
A 2020-01-01 2.09
B 2020-01-05 0.89
C 2020-01-08 3.17
D 2020-01-08 1.15
E 2020-01-15 3.65
我想做以下事情:
创建从任何节点到另一个节点的路径(单向,仅从较低日期到较高日期(,并且必须通过路径中日期之间的所有节点,例如:
1.1从(A(到(E(应该提取以下路径:
- A、B、C、E
- A、 B、D、E
因为C和D在同一日期,所以我们有两条不同的路径。
列表项
使用Python groupby和来自Python itertools模块的产品
代码
from itertools import groupby, product
def all_paths(s):
# Convert string to list of node triplets
nodes = [n.strip().split() for n in s.split('n')]
# Skip header and sort by time
# (since time is padded, no need to convert to datetime object to sort)
nodes = sorted(nodes[1:], key=lambda n: n[1]) # sorting by time which is second item in list (n[1])
# Group nodes by time (i.e. list of list, with nodes with same time in same sublist)
# Just keeping the node field of each item in sublist (i.e. node[0] is Node field)
labels = [[node[0] for node in v] for k, v in groupby(nodes, lambda n:n[1])]
# Path is the product of the labels (list of lists)
return list(product(*labels))
用法
示例1:字符串中的数据
data = '''Node Date Data
A 2020-01-01 2.09
B 2020-01-05 0.89
C 2020-01-08 3.17
D 2020-01-08 1.15
E 2020-01-15 3.65'''
print(all_paths(data))
示例2:文件Data.txt 中的数据
with open('data.txt', 'r') as fin:
print(all_paths(fin.read()))
输出(两种情况(:
[('A', 'B', 'C', 'E'), ('A', 'B', 'D', 'E')]
这将为每个路径生成一个节点链列表。它使用python列表和字典进行散列,因此如果您使用大型数据集,它将非常缓慢。如果是这种情况,请查看pandas库(groupby(。同样的逻辑也会起作用,但您肯定会在nodes_by_date函数中节省时间。该库中可能还有其他工具来生成所需的路径。
nodes = [('E',15,3.65), ('A',1,2.09), ('B',5,.89), ('C',8,3.17), ('D',8,1.15)]
#nodes = [('E',15,3.65), ('A',1,2.09), ('B',5,.89), ('C',8,3.17), ('D',8,1.15), ('F', 16, 100), ('G', 16, 200), ('H', 17, 1000)]
def all_paths(nodes):
nbd = nodes_by_date(nodes)
return get_chains(nbd, 0)
def nodes_by_date(nodes):
# sort by date (not sure what format your data is in, but might be easier to convert to a datenum/utc time)
nodes = sorted(nodes, key=lambda x: x[1])
# build a list of lists, each containing all of the nodes with a certain date
# if your data is larger, look into the pandas library's groupby operation
dates = {}
for n in nodes:
d = dates.get(n[1], [])
d.append(n)
dates[n[1]]=d
return list(dates.values())
def get_chains(nbd, i):
if i == len(nbd):
return []
# depth-first recursion so tails are only generated once
tails = get_chains(nbd, i+1)
chains = []
for n in nbd[i]:
if len(tails):
for t in tails:
newchain = [n]
# only performant for smaller data
newchain.extend(t)
chains.append(newchain)
# end of recursion
else:
chains.append([n])
return chains