I can't figure out how to concatenate 2 strings with Rcpp; I suspect there is an obvious answer, but the documentation hasn't helped me.
http://gallery.rcpp.org/articles/working-with-Rcpp-StringVector/
http://gallery.rcpp.org/articles/strings_with_rcpp/
StringVector concatenate(StringVector a, StringVector b)
{
StringVector c;
c= ??;
return c;
}
I would expect this output:
a=c("a","b"); b=c("c","d");
concatenate(a,b)
[1] "ac" "bd"
There are probably several different ways to approach this, but here is one option using std::transform:
#include <Rcpp.h>
using namespace Rcpp;

struct Functor {
    std::string
    operator()(const std::string& lhs, const internal::string_proxy<STRSXP>& rhs) const
    {
        return lhs + rhs;
    }
};

// [[Rcpp::export]]
CharacterVector paste2(CharacterVector lhs, CharacterVector rhs)
{
    std::vector<std::string> res(lhs.begin(), lhs.end());
    std::transform(
        res.begin(), res.end(),
        rhs.begin(), res.begin(),
        Functor()
    );
    return wrap(res);
}
/*** R
lhs <- letters[1:2]; rhs <- letters[3:4]
paste(lhs, rhs, sep = "")
# [1] "ac" "bd"
paste2(lhs, rhs)
# [1] "ac" "bd"
*/
The reason for first copying the left-hand expression into a std::vector<std::string> is that the internal::string_proxy<> class provides operator+ with the signature
std::string operator+(const std::string& x, const internal::string_proxy<STRSXP>& y)
rather than, for example,
operator+(const internal::string_proxy<STRSXP>& x, const internal::string_proxy<STRSXP>& y)
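To make the overload point concrete without needing R, here is a minimal standalone sketch (C++11). `ToyProxy` and `toy_paste` are hypothetical names invented for illustration; `ToyProxy` is only a stand-in and is not the real internal::string_proxy<STRSXP>. It has just the (std::string, proxy) overload of operator+, so the left-hand side has to be turned into std::string before anything can be appended. The sketch assumes rhs is at least as long as lhs.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a string proxy: it just points at a C string.
// This is NOT Rcpp's internal::string_proxy<STRSXP>.
struct ToyProxy {
    const char* data;
};

// Only the (std::string, proxy) overload exists, mirroring the signature
// quoted above. There is deliberately no (proxy, proxy) overload.
inline std::string operator+(const std::string& x, const ToyProxy& y) {
    return x + y.data;
}

// Copy the left-hand side into std::string first, then append each
// right-hand proxy element-wise. Assumes rhs.size() >= lhs.size().
inline std::vector<std::string> toy_paste(const std::vector<ToyProxy>& lhs,
                                          const std::vector<ToyProxy>& rhs) {
    std::vector<std::string> res;
    res.reserve(lhs.size());
    for (const ToyProxy& p : lhs) {
        res.push_back(p.data);  // proxy -> std::string copy
    }
    std::transform(res.begin(), res.end(), rhs.begin(), res.begin(),
                   [](const std::string& x, const ToyProxy& y) { return x + y; });
    return res;
}
```

Writing `lhs[i] + rhs[i]` directly on two `ToyProxy` values would not compile here, which is the same situation the copy into std::vector<std::string> works around.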
If your compiler supports C++11, paste2 can be written a little more cleanly:
// [[Rcpp::plugins(cpp11)]]
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
CharacterVector paste3(CharacterVector lhs, CharacterVector rhs)
{
    using proxy_t = internal::string_proxy<STRSXP>;
    std::vector<std::string> res(lhs.begin(), lhs.end());
    std::transform(res.begin(), res.end(), rhs.begin(), res.begin(),
        [&](const std::string& x, const proxy_t& y) {
            return x + y;
        }
    );
    return wrap(res);
}
/*** R
lhs <- letters[1:2]; rhs <- letters[3:4]
paste(lhs, rhs, sep = "")
# [1] "ac" "bd"
paste3(lhs, rhs)
# [1] "ac" "bd"
*/
One working solution is to use:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
CharacterVector concatenate(std::string x, std::string y)
{
    return wrap(x + y);
}
Then:
Vconcatenate=Vectorize(concatenate)
Vconcatenate(letters[1:2],letters[3:4])
Or:
// [[Rcpp::export]]
CharacterVector concatenate(std::vector<std::string> x, std::vector<std::string> y)
{
    // Assumes x and y have the same length.
    std::vector<std::string> res(x.size());
    for (std::size_t i = 0; i < x.size(); i++)
    {
        res[i] = x[i] + y[i];
    }
    return wrap(res);
}
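The loop above reads y[i] for every index of x, so it relies on y being at least as long as x. As a rough sketch of one way to relax that, the standalone function below recycles the shorter input. `concat_recycled` is a hypothetical helper, not part of Rcpp, and returning an empty result for an empty input is a choice made for this sketch rather than a reproduction of what paste does.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Rcpp): element-wise concatenation that
// recycles the shorter input instead of reading past its end.
inline std::vector<std::string> concat_recycled(const std::vector<std::string>& x,
                                                const std::vector<std::string>& y) {
    // Choice made for this sketch: an empty input gives an empty result.
    if (x.empty() || y.empty()) {
        return std::vector<std::string>();
    }

    const std::size_t n = std::max(x.size(), y.size());
    std::vector<std::string> res(n);  // preallocate once, then fill
    for (std::size_t i = 0; i < n; ++i) {
        res[i] = x[i % x.size()] + y[i % y.size()];
    }
    return res;
}
```

Since the loop version above already takes std::vector<std::string> arguments and returns wrap(res), the same body could presumably be dropped into an exported Rcpp function in the same way.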
I'm leaving this answer up, but please note the warning from @nrussell about using push_back()!
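I'm still getting to grips with Rcpp myself, so I built up the strings with a string stream in a loop. Note that this joins the elements within each input vector rather than pairing them element-wise: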
library(Rcpp)
cppFunction('StringVector concatenate(StringVector a, StringVector b)
{
    StringVector c;
    std::ostringstream x;
    std::ostringstream y;
    // concatenate inputs
    for (int i = 0; i < a.size(); i++)
        x << a[i];
    for (int i = 0; i < b.size(); i++)
        y << b[i];
    c.push_back(x.str());
    c.push_back(y.str());
    return c;
}')
a=c("a","b"); b=c("c","d");
concatenate(a,b)
# [1] "ab" "cd"
Comparing the performance of (i) repeated calls to push_back with (ii) a preallocate-and-fill strategy, we can see that the latter is far preferable:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
CharacterVector pbpaste(CharacterVector lhs, CharacterVector rhs)
{
    R_xlen_t i = 0, sz = lhs.size();
    CharacterVector res;
    for (std::ostringstream oss; i < sz; i++, oss.str("")) {
        oss << lhs[i] << rhs[i];
        res.push_back(oss.str());
    }
    return res;
}

// [[Rcpp::export]]
CharacterVector sspaste(CharacterVector lhs, CharacterVector rhs)
{
    R_xlen_t i = 0, sz = lhs.size();
    CharacterVector res(sz);
    for (std::ostringstream oss; i < sz; i++, oss.str("")) {
        oss << lhs[i] << rhs[i];
        res[i] = oss.str();
    }
    return res;
}
/*** R
lhs <- as.character(1:5000); rhs <- as.character(5001:10000)
all.equal(pbpaste(lhs, rhs), sspaste(lhs, rhs))
# [1] TRUE
microbenchmark::microbenchmark(
"push_back" = pbpaste(lhs, rhs),
"preallocate" = sspaste(lhs, rhs),
times = 200L
)
# Unit: milliseconds
#         expr        min         lq       mean     median         uq        max neval cld
#    push_back 101.521579 105.334649 115.156544 107.275678 110.957420 256.722239   200   b
#  preallocate   1.364213   1.585818   1.789564   1.778153   1.934758   2.955352   200  a
*/
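A likely explanation for the gap is that an R vector cannot grow in place, so each push_back on an Rcpp vector ends up building a new, longer vector and copying the existing elements into it. The standalone sketch below only models that assumed behaviour with std::vector; it is not Rcpp source code, and `copies_when_growing` and `copies_when_preallocated` are hypothetical names. It counts how many times an already-stored element gets copied under each strategy.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Model of "grow by reallocating": every append builds a new buffer of size
// n + 1 and copies the n existing elements into it. This is an assumption
// about how push_back behaves on an R-backed vector, not Rcpp's actual code.
inline std::size_t copies_when_growing(std::size_t n) {
    std::vector<std::string> buf;
    std::size_t copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::vector<std::string> bigger(buf.size() + 1);
        for (std::size_t j = 0; j < buf.size(); ++j) {
            bigger[j] = buf[j];  // existing element copied again
            ++copies;
        }
        bigger[buf.size()] = "x";  // the newly appended element
        buf.swap(bigger);
    }
    return copies;  // 0 + 1 + ... + (n - 1) = n * (n - 1) / 2
}

// Preallocate-and-fill: the buffer is sized once and every slot is written
// exactly once, so no already-stored element is ever copied.
inline std::size_t copies_when_preallocated(std::size_t n) {
    std::vector<std::string> buf(n);
    std::size_t copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = "x";  // first and only write to this slot
    }
    return copies;
}
```

Under this model the growing strategy does quadratic work in the number of elements while the preallocated one stays linear, which is at least consistent with the direction of the timings above.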