Reducing peak memory usage with @autoreleasepool

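I work on an iPad application that has a sync process which uses web services and Core Data in a tight loop. To reduce the memory footprint, as Apple recommends, I periodically allocate and drain an NSAutoreleasePool. This currently works well, and the application has no memory issues. However, I plan to move to ARC, where NSAutoreleasePool is no longer valid, and I would like to keep the same kind of performance. I created a few examples and timed them, and I am wondering what the best approach is under ARC to get the same performance while keeping the code readable.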

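For testing purposes I came up with 3 scenarios, each of which creates a string from every number between 1 and 10,000,000. I ran each example 3 times to see how long it took, using a 64-bit Mac application built with the Apple LLVM 3.0 compiler (without gdb, at -O0) and Xcode 4.2. I also ran each example through Instruments to get a rough idea of the memory peak.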

Each of the examples below is contained within the following code block:

int main (int argc, const char * argv[])
{
    @autoreleasepool {
        NSDate *now = [NSDate date];
        //Code Example ...
        // timeIntervalSinceNow is negative for a date in the past, so negate it.
        NSTimeInterval interval = -[now timeIntervalSinceNow];
        printf("Duration: %f\n", interval);
    }
}
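The examples refer to MAX_ALLOCATIONS, which the post never defines. A minimal sketch of the missing pieces might look like the following; the value is assumed from the "1 to 10,000,000" range mentioned above, and SecondsToRun is a hypothetical helper, not part of the original harness.

```objc
#import <Foundation/Foundation.h>

// Assumed value: the post says the strings cover 1 through 10,000,000,
// but never shows how MAX_ALLOCATIONS is defined.
static const uint32_t MAX_ALLOCATIONS = 10000000;

// Hypothetical helper: returns the wall-clock seconds spent running `body`.
static NSTimeInterval SecondsToRun(void (^body)(void))
{
    NSDate *start = [NSDate date];
    body();
    // timeIntervalSinceNow is negative for a date in the past, so negate it.
    return -[start timeIntervalSinceNow];
}
```

Each example body could then be passed to SecondsToRun as a block and the result printed with printf("Duration: %f\n", ...).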

NSAutoreleasePool batch [original, pre-ARC] (peak memory: ~116 KB)

    static const NSUInteger BATCH_SIZE = 1500;
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    for(uint32_t count = 0; count < MAX_ALLOCATIONS; count++)
    {
        NSString *text = [NSString stringWithFormat:@"%u", count + 1U];
        [text class];
        if((count + 1) % BATCH_SIZE == 0)
        {
            [pool drain];
            pool = [[NSAutoreleasePool alloc] init];
        }
    }
    [pool drain];

Run times:
10.928158
10.912849
11.084716

Outer @autoreleasepool (peak memory: ~382 MB)

    @autoreleasepool {
        for(uint32_t count = 0; count < MAX_ALLOCATIONS; count++)
        {
            NSString *text = [NSString stringWithFormat:@"%u", count + 1U];
            [text class];
        }
    }

Run times:
11.489350
11.310462
11.344662

Inner @autoreleasepool (peak memory: ~61.2 KB)

    for(uint32_t count = 0; count < MAX_ALLOCATIONS; count++)
    {
        @autoreleasepool {
            NSString *text = [NSString stringWithFormat:@"%u", count + 1U];
            [text class];
        }
    }

Run times (the first of the three values is unreadable in the source):
14.284014
14.099625

@autoreleasepool w/ goto (peak memory: ~115 KB)

    static const NSUInteger BATCH_SIZE = 1500;
    uint32_t count = 0;
    next_batch:
    @autoreleasepool {
        for(;count < MAX_ALLOCATIONS; count++)
        {
            NSString *text = [NSString stringWithFormat:@"%u", count + 1U];
            [text class];
            if((count + 1) % BATCH_SIZE == 0)
            {
                count++; //Increment count manually
                goto next_batch;
            }
        }
    }

Run times:
10.908756
10.960189
11.018382

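The goto version comes closest in performance, but it uses a goto. Any thoughts?

Update:

Note: as stated in the documentation, a goto is a normal exit from an @autoreleasepool block and will not leak memory:

Upon entry, an autorelease pool is pushed. Upon normal exit (break, return, goto, fall-through, and so on) the autorelease pool is popped. For compatibility with existing code, if exit is due to an exception, the autorelease pool is not popped.

The following should achieve the same thing as the goto version, without the goto: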

for (NSUInteger count = 0; count < MAX_ALLOCATIONS;)
{
    @autoreleasepool
    {
        for (NSUInteger j = 0; j < BATCH_SIZE && count < MAX_ALLOCATIONS; j++, count++)
        {
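            // NSUInteger is 64 bits wide on 64-bit targets, so cast and use %lu instead of %u.
            NSString *text = [NSString stringWithFormat:@"%lu", (unsigned long)(count + 1)];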
            [text class];
        }
    }
}
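If the same batching is needed in several places, the nested loop above could be wrapped in a small helper. This is only a sketch: RunInBatches is a hypothetical name, and calling a block per iteration adds some overhead that none of the timings in this thread measure.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): calls `work` for every
// index in [0, total), draining an autorelease pool after each batch of
// at most `batchSize` calls.
static void RunInBatches(NSUInteger total, NSUInteger batchSize,
                         void (^work)(NSUInteger index))
{
    if (batchSize == 0) {
        return; // A zero batch size would never advance; treat it as a no-op.
    }
    NSUInteger start = 0;
    while (start < total)
    {
        // Clamp the end of the batch to `total` without overflowing.
        NSUInteger end = (total - start < batchSize) ? total : start + batchSize;
        @autoreleasepool
        {
            for (NSUInteger i = start; i < end; i++)
            {
                work(i);
            }
        } // The pool is popped here, on normal exit from the block.
        start = end;
    }
}
```

A call such as RunInBatches(MAX_ALLOCATIONS, BATCH_SIZE, ^(NSUInteger i) { ... }) would then replace the two explicit loops, with the string creation moved into the block.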

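Note that ARC enables significant optimizations that are not enabled at -O0. If you are going to measure performance under ARC, you must test with optimizations enabled. Otherwise, you will be measuring your hand-tuned retain/release placement against ARC's "naive mode".

Run your tests again with optimizations enabled and see what happens.

Update: I was curious, so I ran it myself. These are the run time results in release mode (-Os), with 7,000,000 allocations.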

arc-perf[43645:f803] outer: 8.1259
arc-perf[43645:f803] outer: 8.2089
arc-perf[43645:f803] outer: 9.1104
arc-perf[43645:f803] inner: 8.4817
arc-perf[43645:f803] inner: 8.3687
arc-perf[43645:f803] inner: 8.5470
arc-perf[43645:f803] withGoto: 7.6133
arc-perf[43645:f803] withGoto: 7.7465
arc-perf[43645:f803] withGoto: 7.7007
arc-perf[43645:f803] non-ARC: 7.3443
arc-perf[43645:f803] non-ARC: 7.3188
arc-perf[43645:f803] non-ARC: 7.3098

Memory peaks (run with only 100,000 allocations, because Instruments was taking forever):

Outer: 2.55 MB
Inner: 723 KB
withGoto: ~747 KB
Non-ARC: ~748 KB

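These results surprise me a little. Well, the memory peak results don't; they are exactly what you would expect. But the run time difference between inner and withGoto, even with optimizations enabled, is larger than I anticipated.

Of course, this is a somewhat pathological micro-benchmark, and it is very unlikely to model the real-world performance of any application. The takeaway is that ARC may indeed carry some overhead, but you should always measure your actual application before making assumptions.

(Also, I tested @ipmcc's answer using nested for loops; it behaved almost exactly like the goto version.)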
