R - Capitalize the first letter of both words in a two-word string



Say I have a two-word string and I want to capitalize both words.

name <- c("zip code", "state", "final count")

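The Hmisc package has a function capitalize that capitalizes the first word, but I'm not sure how to get the second word capitalized. The help page for capitalize doesn't suggest that it can perform that task.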

library(Hmisc)
capitalize(name)
# [1] "Zip code"    "State"       "Final count"

I want to get:

c("Zip Code", "State", "Final Count")

What about a three-word string:

name2 <- c("I like pizza")

There is also a built-in base-R solution for title case:

tools::toTitleCase("demonstrating the title case")
## [1] "Demonstrating the Title Case"

or

library(tools)
toTitleCase("demonstrating the title case")
## [1] "Demonstrating the Title Case"
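
As a quick check against the vector from the question, here is a minimal sketch that assumes only the tools package shipped with R. Note that toTitleCase follows English title conventions, so small words such as "the" stay lower case, as in the output above.

```r
name <- c("zip code", "state", "final count")

# toTitleCase is vectorized over its input, so no sapply() is needed
titled <- tools::toTitleCase(name)
print(titled)
```

If every word must be capitalized regardless of title conventions, one of the regex or strsplit approaches may be a better fit.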

The base R function to perform capitalization is toupper(x). The help file ?toupper contains this function, which does what you need:

simpleCap <- function(x) {
  s <- strsplit(x, " ")[[1]]
  paste(toupper(substring(s, 1,1)), substring(s, 2),
      sep="", collapse=" ")
}
name <- c("zip code", "state", "final count")
sapply(name, simpleCap)
     zip code         state   final count 
   "Zip Code"       "State" "Final Count" 

Edit: This works for any string, regardless of word count:

simpleCap("I like pizza a lot")
[1] "I Like Pizza A Lot"
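
Match a regular expression that starts at the beginning of the string (^) or after a whitespace character ([[:space:]]) and is followed by an alphabetical character ([[:alpha:]]). Globally (the g in gsub) replace all such occurrences with the matched beginning or space followed by the upper-case version of the matched alphabetical character, \\1\\U\\2. This has to be done with perl-style regular expression matching.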

gsub("(^|[[:space:]])([[:alpha:]])", "\\1\\U\\2", name, perl=TRUE)
# [1] "Zip Code"    "State"       "Final Count"

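In a little more detail on the replacement argument to gsub(): \\1 says "use the part of x matching the first sub-expression", i.e. the part of x matching (^|[[:space:]]). Likewise, \\2 says to use the part of x matching the second sub-expression, ([[:alpha:]]). \\U is syntax enabled by perl=TRUE and converts what follows it in the replacement to upper case. So for the second match in "zip code", \\1 is " ", \\2 is "c", \\U\\2 is "C", and \\1\\U\\2 is " C".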

The ?regexp page is helpful for understanding regular expressions, and ?gsub for putting things together.
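
To see what the pattern actually matches before replacing anything, here is a small sketch in base R. The variable names pattern and capitalized are just for illustration.

```r
name <- c("zip code", "state", "final count")

# Start of string or a whitespace character (group 1), then one letter (group 2)
pattern <- "(^|[[:space:]])([[:alpha:]])"

# List the full matches per string: the first letter of each word,
# including the preceding space when there is one
print(regmatches(name, gregexpr(pattern, name, perl = TRUE)))

# Backslashes are doubled because R string literals need "\\" to pass
# a single backslash on to the regex engine
capitalized <- gsub(pattern, "\\1\\U\\2", name, perl = TRUE)
print(capitalized)
```

Because each match covers at most a space and one letter, the rest of every word is left as it was.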

Use this function from the stringi package:

library(stringi)
stri_trans_totitle(c("zip code", "state", "final count"))
## [1] "Zip Code"    "State"       "Final Count"
stri_trans_totitle("i like pizza very much")
## [1] "I Like Pizza Very Much"

Alternative:

library(stringr)
a = c("capitalise this", "and this")
a
[1] "capitalise this" "and this"       
str_to_title(a)
[1] "Capitalise This" "And This"   

Try:

require(Hmisc)
sapply(name, function(x) {
  paste(sapply(strsplit(x, ' '), capitalize), collapse=' ')
})

From the help page for ?toupper:

.simpleCap <- function(x) {
    s <- strsplit(x, " ")[[1]]
    paste(toupper(substring(s, 1,1)), substring(s, 2),
          sep="", collapse=" ")
}

> sapply(name, .simpleCap)
zip code         state   final count 
"Zip Code"       "State" "Final Count"

BBmisc now contains the function capitalizeStrings:

library("BBmisc")
capitalizeStrings(c("the taIl", "wags The dOg", "That Looks fuNny!")
    , all.words = TRUE, lower.back = TRUE)
[1] "The Tail"          "Wags The Dog"      "That Looks Funny!"

An alternative using substring and regexpr:

substring(name, 1) <- toupper(substring(name, 1, 1))
pos <- regexpr(" ", name, perl=TRUE) + 1
substring(name, pos) <- toupper(substring(name, pos, pos))
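
Since regexpr only finds the first space, the snippet above capitalizes at most two words per string. A rough sketch of the same substring idea extended to any number of words might look like this. capAllWords is a hypothetical helper name, and it assumes words are separated by single spaces.

```r
# Hypothetical helper: upper-case the first character of every
# space-separated word using substring<- replacement.
capAllWords <- function(x) {
  for (i in seq_along(x)) {
    s <- x[i]
    # Word starts: position 1, plus one past each space (if any)
    spaces <- gregexpr(" ", s, fixed = TRUE)[[1]]
    starts <- if (spaces[1] == -1) 1 else c(1, spaces + 1)
    for (p in starts) {
      substring(s, p, p) <- toupper(substring(s, p, p))
    }
    x[i] <- s
  }
  x
}

result <- capAllWords(c("i like pizza", "state"))
print(result)
```

The explicit loop is slower than the gsub approach on long vectors, so this is mainly useful for seeing how the positions are found.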

You can also use the snakecase package:

install.packages("snakecase")
library(snakecase)
name <- c("zip code", "state", "final count")
to_title_case(name)
#> [1] "Zip Code"    "State"       "Final Count"
# or 
to_upper_camel_case(name, sep_out = " ")
#> [1] "Zip Code"    "State"       "Final Count"

https://github.com/Tazinho/snakecase

This gives capital letters to all major words:

library(lettercase)
xString = str_title_case(xString)

Another version, using StrCap from DescTools:

Text = c("This is my phrase in r", "No, this is not my phrase in r")
DescTools::StrCap(Text) # Default only first character capitalized
[1] "This is my phrase in r"         "No, this is not my phrase in r"
DescTools::StrCap(Text, method = "word") # Capitalize each word
[1] "This Is My Phrase In R"        "No This Is Not My Phrase In R"
> DescTools::StrCap(Text, method = "title") # Capitalize as in titles
[1] "This Is My Phrase in R"         "No, This Is Not My Phrase in R"

✓ one line
✓ one existing function; no new package
✓ works for a list / all words
✓ capitalizes the first letter and lower-cases the rest of the word:

name <- c("zip CODE", "statE", "final couNt")
gsub("([\\w])([\\w]+)", "\\U\\1\\L\\2", name, perl = TRUE)
[1] "Zip Code"    "State"       "Final Count"

If you plan on using it a lot, I guess you could make a wrapper function for it:

capFirst <- function(x) gsub("([\\w])([\\w]+)", "\\U\\1\\L\\2", x, perl = TRUE)
capFirst(name)

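If you have special letters, you can use this regex instead:

capFirst <- function(x) gsub("(\\p{L})(\\p{L}+)", "\\U\\1\\L\\2", x, perl = TRUE)
capFirst(name)

Except that perl does not know how to upper- or lower-case those letters afterwards, so there is always: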

stringi::stri_trans_totitle(c("zip CODE", "éTAts", "final couNt"))
#[1] "Zip Code"    "États"       "Final Count"

This is a slight improvement on the accepted answer that avoids having to use sapply(). It also forces the non-first characters to lower case.

titleCase <- Vectorize(function(x) {
  
  # Split input vector value
  s <- strsplit(x, " ")[[1]]
  
  # Perform title casing and recombine
  ret <- paste(toupper(substring(s, 1,1)), tolower(substring(s, 2)),
        sep="", collapse=" ")
  
  return(ret)
  
}, USE.NAMES = FALSE)

name <- c("zip CODE", "statE", "final couNt")
titleCase(name)
#> "Zip Code"       "State" "Final Count" 

This might be useful for some. If the word is in upper case, it first has to be converted to lower case.

tools::toTitleCase("FRANCE")
[1] "FRANCE"

Instead, use:

tools::toTitleCase(tolower("FRANCE"))
[1] "France"
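
If this comes up often, the two calls can be wrapped in a small helper. This is only a sketch, and toTitleCaseLower is a made-up name. Lower-casing first also flattens acronyms, so it may not suit every input.

```r
# Hypothetical wrapper: lower-case first, then apply title case
toTitleCaseLower <- function(x) tools::toTitleCase(tolower(x))

fixed <- toTitleCaseLower(c("FRANCE", "zip CODE"))
print(fixed)
```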
