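I am trying to optimize a functional operation in Java 8, but I am running into some serious performance problems.
Situation
I have to parse HTTP headers (String, List<String>) against the values of a given enum, which maps a HeaderName to its many possible variations (String, Set<String>).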
Given the following HttpHeaders:
public static final Map<String, List<String>> httpHeaders = new HashMap<>();
httpHeaders.put("Content-Type", Arrays.asList("application/json", "text/x-json"));
httpHeaders.put("SID", Arrays.asList("ABC123"));
httpHeaders.put("CORRELATION-ID", Arrays.asList("ZYX666"));
and my custom enum:
public enum LogHeaders {
    SESSION_ID("_sid", Arrays.asList("SESSION-ID", "SID")),
    CORRELATION_ID("cid", Arrays.asList("CORRELATION-ID", "CID"));

    protected final String logKey;
    protected final Set<String> logKeyVariations;

    private LogHeaders(final String logKey, final List<String> logKeyVariations) {
        this.logKey = logKey;
        this.logKeyVariations = new HashSet<>(logKeyVariations);
    }

    // Accessor used by the parsing code below
    public Set<String> getLogKeyVariations() {
        return this.logKeyVariations;
    }

    @Override
    public String toString() {
        return this.logKey;
    }
}
The result should be a map from each "LogHeaders" key to the matching header values. For a given header, only one variation can be present:
// {LogHeaders.key : HttpHeaderValue}
{
_sid=[ABC123],
_cid=[ZYX666]
}
Procedural code
final Map<String, List<String>> logHeadersToValue = new HashMap<>();
for (final LogHeaders header : LogHeaders.values()) {
for (final String variation : header.getLogKeyVariations()) {
final List<String> headerValue = httpHeaders.get(variation);
if (headerValue != null) {
logHeadersToValue.put(header.logKey, headerValue);
break;
}
}
}
Functional code
final Map<String, List<String>> logHeadersToValue =
EnumSet.allOf(LogHeaders.class)
.stream()
.collect(Collectors.toMap(
LogHeaders::toString,
logHeader -> logHeader.getLogKeyVariations().stream()
.map(variation -> httpHeaders.get(variation)).filter(Objects::nonNull)
.collect(singletonCollector())));
public static <T> Collector<T, ?, T> singletonCollector() {
return Collectors.collectingAndThen(Collectors.toList(), list -> {
if (list.size() < 1) {
return null;
}
return list.get(0);
});
}
Current benchmark
Any idea how I can optimize the functional part? Thanks.
Update benchmark
I ran 100k warm-up iterations + 100k measured iterations with @Tagir Valeev's code:
FunctionalParsing : 0.040s
ProceduralParsing : 0.010s
Update benchmark #2
I ran 100k warm-up iterations + 100k measured iterations with @Misha's code:
FunctionalParsing : 0.025s
ProceduralParsing : 0.017s
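I am absolutely sure you did the benchmark wrong. Most likely you executed the code only once. You don't really care whether your program runs in 0.001 seconds or 0.086 seconds, right? Either way it is faster than the blink of an eye. So you probably want to run this code many times. But it seems you measured the time only once and wrongly assumed that every subsequent run would take roughly the same time. During the first run the code is mostly executed by the interpreter; later it gets JIT-compiled and runs much faster. This matters a lot for stream-related code.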
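To illustrate the warm-up point, here is a minimal hand-rolled sketch that discards a batch of warm-up runs before timing. The class name WarmupSketch and the helpers timeNanos and benchmark are hypothetical names made up for this example, and a dedicated harness such as JMH would likely give more trustworthy numbers than this kind of loop.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

class WarmupSketch {

    // Hypothetical helper: runs the task `iterations` times and returns the elapsed nanoseconds.
    static long timeNanos(Supplier<?> task, int iterations) {
        long sink = 0; // accumulate results so the work is not trivially dead code
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += Objects.hashCode(task.get());
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically never printed; keeps `sink` in use
        }
        return elapsed;
    }

    // Runs `warmup` untimed-in-effect iterations (result discarded), then times `measured` iterations.
    static long benchmark(Supplier<?> task, int warmup, int measured) {
        timeNanos(task, warmup);
        return timeNanos(task, measured);
    }

    public static void main(String[] args) {
        Map<String, List<String>> httpHeaders = new HashMap<>();
        httpHeaders.put("SID", Arrays.asList("ABC123"));
        List<String> variations = Arrays.asList("SESSION-ID", "SID");

        // The stream lookup under test: first variation that is present in the headers.
        Supplier<List<String>> lookup = () -> variations.stream()
                .map(httpHeaders::get)
                .filter(Objects::nonNull)
                .findFirst()
                .orElse(null);

        long cold = timeNanos(lookup, 1);
        long warm = benchmark(lookup, 100_000, 100_000);
        System.out.println("first call: " + cold + " ns");
        System.out.println("average after warm-up: " + (warm / 100_000) + " ns");
    }
}
```

The absolute numbers will vary by machine and JVM; the point is only to compare the very first call against the average after warm-up.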
As for your code, a custom collector seems unnecessary. You can implement it like this:
final Map<String, List<String>> logHeadersToValue =
EnumSet.allOf(LogHeaders.class)
.stream()
.collect(Collectors.toMap(
LogHeaders::toString,
logHeader -> logHeader.getLogKeyVariations().stream()
.map(httpHeaders::get).filter(Objects::nonNull)
.findFirst().orElse(null)));
This solution may also be faster because it does not read more than one HTTP header (just as the procedural code stops via break).
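The short-circuiting claim above can be checked with a small counter. The following is only a sketch: ShortCircuitSketch and countLookups are hypothetical names, and the extra variation "S-ID" is invented for the demonstration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;

class ShortCircuitSketch {

    // Counts how many header lookups run before findFirst() yields a result.
    static int countLookups(List<String> variations, Map<String, List<String>> httpHeaders) {
        AtomicInteger lookups = new AtomicInteger();
        variations.stream()
                .map(v -> {
                    lookups.incrementAndGet(); // one increment per httpHeaders.get call
                    return httpHeaders.get(v);
                })
                .filter(Objects::nonNull)
                .findFirst(); // sequential stream: stops pulling elements after the first match
        return lookups.get();
    }

    public static void main(String[] args) {
        Map<String, List<String>> httpHeaders = new HashMap<>();
        httpHeaders.put("SID", Arrays.asList("ABC123"));

        // "S-ID" is a made-up third variation, never looked up in either call.
        System.out.println(countLookups(Arrays.asList("SID", "SESSION-ID", "S-ID"), httpHeaders)); // prints 1
        System.out.println(countLookups(Arrays.asList("SESSION-ID", "SID", "S-ID"), httpHeaders)); // prints 2
    }
}
```

A collector that first gathers every match into a list, like the singletonCollector in the question, would instead perform one lookup per variation.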
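Your functional code does not do the same thing as your original code. If one of the LogHeaders fails to match a header, the old code skips it, whereas the functional code throws a NullPointerException.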
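The exception comes from Collectors.toMap, which rejects null values. A minimal sketch of that behavior, using plain header names instead of the enum (ToMapNullSketch, toMapThrowsOnNull and skipMissing are hypothetical names for this example):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToMapNullSketch {

    // Returns true if collecting with toMap fails because a value mapper returned null.
    static boolean toMapThrowsOnNull(Map<String, List<String>> httpHeaders, String... names) {
        try {
            Stream.of(names).collect(Collectors.toMap(n -> n, httpHeaders::get));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Filtering out unmatched names first mirrors the "skip it" behavior of the loop.
    static Map<String, List<String>> skipMissing(Map<String, List<String>> httpHeaders, String... names) {
        return Stream.of(names)
                .filter(httpHeaders::containsKey)
                .collect(Collectors.toMap(n -> n, httpHeaders::get));
    }

    public static void main(String[] args) {
        Map<String, List<String>> httpHeaders = new HashMap<>();
        httpHeaders.put("SID", Arrays.asList("ABC123"));

        System.out.println(toMapThrowsOnNull(httpHeaders, "SID", "CID")); // prints true
        System.out.println(skipMissing(httpHeaders, "SID", "CID"));       // prints {SID=[ABC123]}
    }
}
```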
A direct translation of the original code into streams would look like this:
Map<String, List<String>> logHeadersToValue = Arrays.stream(LogHeaders.values())
.collect(
HashMap::new,
(map, logHeader) -> logHeader.getLogKeyVariations().stream()
.filter(httpHeaders::containsKey)
.findAny()
.ifPresent(x -> map.put(logHeader.key, httpHeaders.get(x))),
Map::putAll
);
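If you want to make it more efficient and easier to read, consider precomputing a Map<String,String> from every variation to its key. You could modify the enum like this: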
enum LogHeaders {
SESSION_ID("_sid", "SESSION-ID", "SID"),
CORRELATION_ID("cid", "CORRELATION-ID", "CID");
final String key;
final Map<String, String> variations;
private LogHeaders(final String key, String... variation) {
this.key = key;
variations = Arrays.stream(variation).collect(collectingAndThen(
toMap(x -> x, x -> key),
Collections::unmodifiableMap
));
}
// unmodifiable map of every variation to its key
public final static Map<String, String> variationToKey =
Arrays.stream(LogHeaders.values())
.flatMap(lh -> lh.variations.entrySet().stream())
.collect(collectingAndThen(
toMap(Map.Entry<String, String>::getKey, Map.Entry<String, String>::getValue),
Collections::unmodifiableMap
)); // will throw if 2 keys have the same variation
}
The advantage of this approach is that it fails fast if there are duplicate variations. Now the code becomes very simple:
Map<String, List<String>> logHeadersToValue = LogHeaders.variationToKey.keySet().stream()
.filter(httpHeaders::containsKey)
.collect(toMap(LogHeaders.variationToKey::get, httpHeaders::get));
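To see both properties in isolation, here is a rough, self-contained sketch that uses a plain map in place of the enum. VariationLookupSketch, variationToKey (as a method) and parse are hypothetical names for this example; the keys and variations are copied from the enum above.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class VariationLookupSketch {

    // Inverts key -> variations into variation -> key.
    // toMap throws IllegalStateException if the same variation appears under two keys.
    static Map<String, String> variationToKey(Map<String, List<String>> keyToVariations) {
        return Collections.unmodifiableMap(keyToVariations.entrySet().stream()
                .flatMap(e -> e.getValue().stream()
                        .<Map.Entry<String, String>>map(
                                v -> new AbstractMap.SimpleImmutableEntry<>(v, e.getKey())))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)));
    }

    // Same shape as the final lookup above: keep variations present in the headers.
    // Assumes at most one variation per key is present; two would make toMap throw.
    static Map<String, List<String>> parse(Map<String, String> variationToKey,
                                           Map<String, List<String>> httpHeaders) {
        return variationToKey.keySet().stream()
                .filter(httpHeaders::containsKey)
                .collect(Collectors.toMap(variationToKey::get, httpHeaders::get));
    }

    public static void main(String[] args) {
        Map<String, List<String>> keyToVariations = new HashMap<>();
        keyToVariations.put("_sid", Arrays.asList("SESSION-ID", "SID"));
        keyToVariations.put("cid", Arrays.asList("CORRELATION-ID", "CID"));

        Map<String, List<String>> httpHeaders = new HashMap<>();
        httpHeaders.put("Content-Type", Arrays.asList("application/json", "text/x-json"));
        httpHeaders.put("SID", Arrays.asList("ABC123"));
        httpHeaders.put("CORRELATION-ID", Arrays.asList("ZYX666"));

        // Two entries: _sid -> [ABC123] and cid -> [ZYX666] (HashMap order is unspecified).
        System.out.println(parse(variationToKey(keyToVariations), httpHeaders));
    }
}
```

Note that the same duplicate-key check that makes the enum fail fast also applies to the final lookup: if a request carried two variations of the same header, that toMap call would throw as well.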