How do I get all unique values from a list of strings that each contain a set of attributes in R?



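This is for one of my classes. I have a data frame to analyze and draw some statistical conclusions from. One of the variables consists of strings that look like this: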

{obj1,"obj2",obj3,"obj4",obj5,"obj6",obj7,obj8,obj9,"obj10","obj11","obj12",obj13}

The items in double quotes are the ones that contain spaces.

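I am trying to build a single list from all of these lists appended together, and then filter it to remove every {, " and } (some entries also contain backslashes).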

*Edit: I did not know I had to use double backslashes here.

I tried a few different approaches that I found online. That part of my R code currently looks like this:

#analyzing amenities
amm_ord = c(df1$amenities)
unique (amm_ord)
k = c()
for (elements in amm_ord){

#for (element in elements) {k = append(k,unlist(element))
#for (element in elements) {k = c( k , element )
for (element in elements){
k1 = strsplit(element, regexpr('“{  ,\”  \”,\”  \”,  \”}' ,))
k = c (k , k1)
}
} 
uni_amm = unique(k)
print(uni_amm)
#k1 = strsplit(amm_ord , '\",\"  ,\"  "{  ')
my_string <- "{  ,",",",  ",  "}"
k1 = strsplit(amm_ord , my_string)
k1 = strsplit(amm_ord , regexpr('“{  ,\”  \”,\”  \”,  \”}'))
k1 = str_extract(k , regexpr('“{  ,\”  \”,\”  \”,  \”}'))
#amm_ord = strsplit(k1 , '\",')
#k1 = strsplit(amm_ord , '"\",\""\"')
unique(k1)
unique(amm_ord)

I keep getting errors and cannot get regexpr to work. How would you do this?

My latest idea is:

for (elements in amm_ord){

for (element in elements){
k1 = strsplit(element, regexpr('“{  ,\”  \”,\”  \”,  \”}' ,[idk what to put here]))
k = c (k , k1)
}
} 

These are my outputs:

Error in check_lengths(string, pattern) : 
argument "pattern" is missing, with no default
>     k1 = str_extract(element, ('“{  ,\”  \”,\”  \”,  \”}'))
Error in stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) : 
Error in {min,max} interval. (U_REGEX_BAD_INTERVAL, context=`“{  ,”  ”,”  ”,  ”}`)
>     k1 = str_extract(element, regex('“{  ,\”  \”,\”  \”,  \”}'))
Error in stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) : 
Error in {min,max} interval. (U_REGEX_BAD_INTERVAL, context=`“{  ,”  ”,”  ”,  ”}`)
>     k1 = str_extract(element, regexpr('“{  ,\”  \”,\”  \”,  \”}'))
Error in is.factor(text) : argument "text" is missing, with no default
>     k1 = strsplit(element, regexpr('“{  ,\”  \”,\”  \”,  \”}'))
Error in is.factor(text) : argument "text" is missing, with no default
>     k1 = strsplit(element, regexpr('“{  ,\”  \”,\”  \”,  \”}' , txt))
Error in is.factor(text) : object 'txt' not found
>     k1 = strsplit(element, regexpr('“{  ,\”  \”,\”  \”,  \”}' , element))
Error in regexpr("“{  ,\”  \”,\”  \”,  \”}", element) : 
invalid regular expression, reason 'Invalid contents of {}'
> 
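library(stringr)
# Single quotes around the string so the inner double quotes need no escaping
s <- '{obj1,"obj2",obj3,"obj4",obj5,"obj6",obj7,obj8,obj9,"obj10","obj11","obj12",obj13}'
# Drop the surrounding braces, then split on a comma with an optional quote on either side
str_split(substr(s, 2, nchar(s)-1), '"?,"?')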
[[1]]
[1] "obj1"  "obj2"  "obj3"  "obj4"  "obj5"  "obj6"  "obj7"  "obj8"  "obj9" 
[10] "obj10" "obj11" "obj12" "obj13"
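
To go from one string to the unique values across the whole column, the same idea can be applied to every row at once. The following is only a sketch in base R under a few assumptions: the `amenities` vector is made-up sample data standing in for `df1$amenities`, no attribute contains a comma inside its quotes, and the names `inner`, `parts` and the sample values are illustrative. It also sidesteps the errors shown in the question: `regexpr()` returns match positions rather than a pattern, so `strsplit()` should be given the pattern string directly, and an unescaped `{` is read by the regex engine as the start of a `{min,max}` quantifier.

```r
# Hypothetical sample data standing in for df1$amenities
amenities <- c(
  '{TV,"Cable TV",Wifi,"Air conditioning"}',
  '{Wifi,Kitchen,"Free parking on premises"}',
  '{}'
)

# Remove the leading "{" and trailing "}" (escaped, since { } are regex metacharacters)
inner <- gsub("^\\{|\\}$", "", amenities)

# Split every row on commas and flatten the result into one character vector
parts <- unlist(strsplit(inner, ",", fixed = TRUE))

# Strip double quotes and backslashes as literal characters (fixed = TRUE, no regex)
parts <- gsub('"', "", parts, fixed = TRUE)
parts <- gsub("\\", "", parts, fixed = TRUE)

# Trim stray whitespace, drop empty entries, keep each value once
parts <- trimws(parts)
uni_amm <- unique(parts[nzchar(parts)])

print(uni_amm)
# [1] "TV"  "Cable TV"  "Wifi"  "Air conditioning"  "Kitchen"  "Free parking on premises"
```

If the column is a factor rather than character, wrapping it in `as.character()` before the first `gsub()` call avoids surprises.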
