Stopping and starting Erlang tracers without losing trace events

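I have a question about tracers in Erlang, and how to switch them off and on without losing any trace events. Suppose I have a process P1 that is being traced with the send and 'receive' trace flags, like so: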

erlang:trace(P1Pid, true, [set_on_spawn, send, 'receive', {tracer, T1Pid}])

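Because the set_on_spawn flag is specified, as soon as a (child) process P2 is spawned by P1, the same flags (i.e. set_on_spawn, send, 'receive') are applied to P2 as well. Now suppose I want to attach a new tracer to P2 only, so that tracer T1 handles the traces from P1 and tracer T2 handles the traces from P2. To do this (since Erlang allows only one tracer per process), I first need to unset the trace flags on P2 (i.e. set_on_spawn, send, 'receive'), which were inherited automatically because of set_on_spawn, and then set them on P2 again, like so: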

% Unset trace flags on P2. 
erlang:trace(P2Pid, false, [set_on_spawn, send, 'receive']),
% We might lose trace events at this instant which were raised
% by process P2 while un-setting the tracer on P2 and setting
% it again.
% Now set again trace flags on P2, directing the trace to 
% a new tracer T2.
erlang:trace(P2Pid, true, [set_on_spawn, send, 'receive', {tracer, T2Pid}]),

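Between the call that unsets the tracer and the call that sets it again there is a race condition: any trace events raised by process P2 in that window can be lost.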
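To check the inheritance described above, here is a minimal sketch that inspects a child's tracer and flags with erlang:trace_info/2. The module name trace_inherit_demo, the helper sink/0 and the process bodies are made up for illustration; only erlang:trace/3 and erlang:trace_info/2 are real BIFs.

```erlang
-module(trace_inherit_demo).
-export([run/0]).

%% Hypothetical tracer body: discards every trace message it receives.
sink() ->
    receive _ -> sink() end.

run() ->
    Self = self(),
    T1 = spawn(fun sink/0),
    %% P1 waits for 'go' so that tracing is enabled before it spawns P2.
    P1 = spawn(fun() ->
        receive go -> ok end,
        P2 = spawn(fun() -> receive stop -> ok end end),
        Self ! {child, P2},
        receive stop -> ok end
    end),
    1 = erlang:trace(P1, true, [set_on_spawn, send, 'receive', {tracer, T1}]),
    P1 ! go,
    P2Pid = receive {child, Pid} -> Pid end,
    %% Ask the runtime which tracer and which flags the child ended up with.
    {tracer, Tracer} = erlang:trace_info(P2Pid, tracer),
    {flags, Flags} = erlang:trace_info(P2Pid, flags),
    %% Clean up the demo processes.
    [exit(P, kill) || P <- [P2Pid, P1, T1]],
    {Tracer =:= T1, lists:sort(Flags)}.
```

Calling trace_inherit_demo:run() returns a tuple saying whether the child's tracer is T1, together with the sorted list of flags the child inherited.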

My question is: can this be achieved without losing trace events?

Does Erlang provide a way to perform this "tracer switch" (i.e. from T1 to T2) atomically?

Alternatively, is it possible to pause the Erlang VM and suspend tracing, so that no trace events are lost?

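I have looked into this further and may have found a partial, less-than-ideal solution (see below). After reading the Erlang documentation, I came across the erlang:suspend_process/1 and erlang:resume_process/1 BIFs. Using these two, I can get the desired behavior like so: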

% Suspend process P2. According to the Erlang docs, this function
% blocks the caller (i.e. the current tracer) until P2 is suspended.
% This way, we do not lose trace events.
erlang:suspend_process(P2Pid),
% Unset trace flags on P2. 
erlang:trace(P2Pid, false, [set_on_spawn, send, 'receive']),
% We should not lose any trace events from P2, since it is
% currently suspended, and therefore cannot generate any.
% However, we can still lose receive trace events that are 
% generated as a result of other processes sending messages 
% to P2.
% Now set again trace flags on P2, directing the trace to 
% a new tracer T2.
erlang:trace(P2Pid, true, [set_on_spawn, send, 'receive', {tracer, T2Pid}]),
% Finally, resume process P2, so that we can receive any trace 
% messages generated by P2 on the new tracer T2.
erlang:resume_process(P2Pid).
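The same steps can be wrapped so that the resume always runs, even if one of the erlang:trace/3 calls raises. This is only a sketch under assumptions: the module name tracer_switch and the function switch_tracer/3 are hypothetical, the caller must not be the tracee itself, and Flags must list every flag currently set on the tracee so that the first call clears them all and releases the old tracer. It does nothing about 'receive' events caused by other processes sending to the suspended process.

```erlang
-module(tracer_switch).
-export([switch_tracer/3]).

%% Hypothetical helper: move the tracee Pid from its current tracer to
%% NewTracer. Flags is the list of trace flags without the tracer option,
%% e.g. [set_on_spawn, send, 'receive'].
switch_tracer(Pid, Flags, NewTracer) ->
    %% Blocks the caller until Pid is suspended; raises badarg if Pid is
    %% not alive or is the calling process.
    true = erlang:suspend_process(Pid),
    try
        %% Clear the flags; with no flags left the old tracer is released.
        erlang:trace(Pid, false, Flags),
        %% Set the same flags again, now directed at the new tracer.
        erlang:trace(Pid, true, [{tracer, NewTracer} | Flags])
    after
        %% Always undo the suspension, even if a call above failed.
        erlang:resume_process(Pid)
    end.
```

A call would look like tracer_switch:switch_tracer(P2Pid, [set_on_spawn, send, 'receive'], T2Pid), and the result can be checked afterwards with erlang:trace_info(P2Pid, tracer).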

I have three concerns with this approach:

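1. The Erlang documentation for erlang:suspend_process/1 and erlang:resume_process/1 explicitly states that they are intended for debugging purposes only. As the example shows, unless process P2 is suspended we risk losing trace events while switching from tracer T1 to tracer T2, so why can't these functions be used in production?
2. We are effectively tampering with the system (i.e. we are interfering with its scheduling). Are there any risks associated with this, apart from the possibility of forgetting to call erlang:resume_process/1 on a previously suspended process?
3. More importantly, even though we can stop process P2 from taking any action, we cannot stop other processes from sending messages to P2. These messages result in {trace, Pid, 'receive', ...} trace events, which may be lost while we are switching tracers. Is there a way to avoid this?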

NB: if p' (the process that called erlang:suspend_process/1) dies, the process p that was suspended by p' is automatically resumed.
