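The contains_acronym function checks the text for the presence of 2 or more letters or digits surrounded by parentheses, where at least the first character is uppercase (if it is a letter). It returns True if the condition is met and False otherwise. For example, "Instant messaging (IM) is a set of communication technologies used for text-based communication" should return True, since (IM) satisfies the match condition. Fill in the regular expression in this function: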
import re
def contains_acronym(text):
    pattern = ___
    result = re.search(pattern, text)
    return result is not None
print(contains_acronym("Instant messaging (IM) is a set of communication technologies used for text-based communication")) # True
print(contains_acronym("American Standard Code for Information Interchange (ASCII) is a character encoding standard for electronic communication")) # True
print(contains_acronym("Please do NOT enter without permission!")) # False
print(contains_acronym("PostScript is a fourth-generation programming language (4GL)")) # True
print(contains_acronym("Have fun using a self-contained underwater breathing apparatus (Scuba)!")) # True
I tried this pattern, but it does not work for all of the given inputs:
pattern = r"\(([A-Z0-9_]+)\)"
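To see which input the attempt misses, here is a minimal sketch. It assumes the literal parentheses in the attempt are escaped as \( and \); the shortened sample strings and the helper name `check` are illustrative, not part of the original exercise.

```python
import re

# Assumption: the attempted pattern, with its literal parentheses escaped.
ATTEMPT = r"\(([A-Z0-9_]+)\)"

# Shortened versions of the exercise inputs, mapped to the expected result.
samples = {
    "Instant messaging (IM)": True,
    "programming language (4GL)": True,
    "breathing apparatus (Scuba)!": True,
    "Please do NOT enter without permission!": False,
}

def check(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

# Collect every sample whose actual result differs from the expected one.
failures = [text for text, expected in samples.items()
            if check(ATTEMPT, text) != expected]
print(failures)  # ['breathing apparatus (Scuba)!']
```

The only failure is (Scuba): the character class allows uppercase letters, digits, and underscore, so the lowercase letters after the S stop the match.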
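In the end I tried the pattern below. With the following code it covers all of the scenarios above:
import re
def contains_acronym(text):
    # Escaped parentheses around two or more letters or digits.
    pattern = r"\([A-Za-z0-9]{2,}\)"
    result = re.search(pattern, text)
    return result is not None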
print(contains_acronym("Instant messaging (IM) is a set of communication technologies used for text-based communication")) # True
print(contains_acronym("American Standard Code for Information Interchange (ASCII) is a character encoding standard for electronic communication")) # True
print(contains_acronym("Please do NOT enter without permission!")) # False
print(contains_acronym("PostScript is a fourth-generation programming language (4GL)")) # True
print(contains_acronym("Have fun using a self-contained underwater breathing apparatus (Scuba)!")) # True
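One thing the five test sentences do not exercise is a parenthesized word that starts with a lowercase letter. The sketch below compares a pattern that accepts any case against one that requires an uppercase letter or digit first. Both patterns assume escaped parentheses, and the sample sentence and the names `ANY_CASE`, `UPPER_FIRST`, and `found` are made up for illustration.

```python
import re

# Assumption: parentheses escaped as \( and \) in both patterns.
ANY_CASE = r"\([A-Za-z0-9]{2,}\)"          # any two or more letters/digits
UPPER_FIRST = r"\([A-Z0-9][A-Za-z0-9]+\)"  # first char uppercase or digit

def found(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

text = "bring snacks (etc) to the meeting"  # hypothetical input
print(found(ANY_CASE, text))     # True
print(found(UPPER_FIRST, text))  # False
```

If the "first character uppercase" rule from the task statement matters, the second form is the stricter of the two.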
Use this as the pattern:
pattern = r"\(\w.*\w\)"
Here "\w" matches a letter, a digit, or an underscore.
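A short sketch of what this looser pattern accepts, assuming the intended form uses escaped parentheses and \w. The sample strings and the helper name `matches` are hypothetical.

```python
import re

# Assumption: escaped parentheses, \w at both ends, anything in between.
LOOSE = r"\(\w.*\w\)"

def matches(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

# \w covers the underscore as well as letters and digits.
print(re.fullmatch(r"\w", "_") is not None)                   # True

# The acronym case works...
print(matches(LOOSE, "Instant messaging (IM)"))               # True

# ...but ".*" also lets spaces and lowercase words through.
print(matches(LOOSE, "see the note (not an acronym) here"))   # True
```

So this pattern passes the exercise's test sentences, but it is more permissive than the task description.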
import re
def contains_acronym(text):
    # Escaped parentheses around two or more letters or digits.
    pattern = r'\([A-Za-z0-9]{2,}\)'
    result = re.search(pattern, text)
    return result is not None
print(contains_acronym("Instant messaging (IM) is a set of communication technologies used for text-based communication")) # True
print(contains_acronym("American Standard Code for Information Interchange (ASCII) is a character encoding standard for electronic communication")) # True
print(contains_acronym("Please do NOT enter without permission!")) # False
print(contains_acronym("PostScript is a fourth-generation programming language (4GL)")) # True
print(contains_acronym("Have fun using a self-contained underwater breathing apparatus (Scuba)!")) # True
import re
def contains_acronym(text):
    # First character: uppercase letter or digit; then one or more letters or digits.
    pattern = r"\([A-Z0-9][A-Z0-9a-z]+\)"
    result = re.search(pattern, text)
    return result is not None
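Your regular expression is almost correct. You just forgot that the match must contain at least 2 characters, so keep the first range as a fixed single character and follow it with the same range plus lowercase letters, repeated with + (one or more):
pattern = r"\(([A-Z0-9_][A-Za-z0-9_]+)\)"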
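A quick way to sanity-check this suggestion is to run it against the five sentences from the question. The sketch assumes the outer parentheses are escaped as \( and \); the `cases` list and the `results` variable are only scaffolding for the check.

```python
import re

def contains_acronym(text):
    # Assumption: outer parentheses escaped; inner group kept as in the answer.
    pattern = r"\(([A-Z0-9_][A-Za-z0-9_]+)\)"
    return re.search(pattern, text) is not None

# The five sentences from the question, paired with their expected results.
cases = [
    ("Instant messaging (IM) is a set of communication technologies used for text-based communication", True),
    ("American Standard Code for Information Interchange (ASCII) is a character encoding standard for electronic communication", True),
    ("Please do NOT enter without permission!", False),
    ("PostScript is a fourth-generation programming language (4GL)", True),
    ("Have fun using a self-contained underwater breathing apparatus (Scuba)!", True),
]

results = [contains_acronym(text) == expected for text, expected in cases]
print(all(results))  # True
```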
This should work:
pattern = r"\([A-Z0-9].*\)"
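Notes:
- "+" is a metacharacter that matches one or more times
- Use "*" instead
- The double parentheses are not needed either
- \w matches lowercase characters as well as uppercase characters and digits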
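The quantifiers mentioned in these notes can be compared directly with re.fullmatch, which succeeds only if the whole string matches. This is a small illustrative sketch; the helper name `full` and the sample strings are not from the thread.

```python
import re

def full(pattern, s):
    """Return True if the entire string matches the pattern."""
    return re.fullmatch(pattern, s) is not None

print(full(r"[A-Z]+", ""))        # False: "+" needs at least one character
print(full(r"[A-Z]*", ""))        # True:  "*" also accepts zero characters
print(full(r"[A-Z]{2,}", "A"))    # False: "{2,}" needs at least two
print(full(r"[A-Z]{2,}", "AB"))   # True
print(full(r"\w+", "aB3_"))       # True:  \w covers a-z, A-Z, 0-9 and "_"
```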
pattern = r"\(\w{2,}\)"
This pattern works and looks nicer.
pattern = r"\([A-Z0-9]\)*"
This one works for me. It is very simple.
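Escape the opening and closing parentheses, define the first character as a digit or an uppercase character, then define the following characters as uppercase, lowercase, or digits. I follow that with a * to indicate that it can repeat many times or not at all. That is, you will get the correct output with both (IM) and (i).
pattern = r"\([A-Z0-9][A-Za-z0-9]*\)"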
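The choice between * and + only shows up on a single-character acronym. The sketch below assumes escaped parentheses in both variants; the names `STAR`, `PLUS`, and `found` and the short test strings are illustrative.

```python
import re

# Assumption: parentheses escaped as \( and \) in both variants.
STAR = r"\([A-Z0-9][A-Za-z0-9]*\)"  # one character is enough
PLUS = r"\([A-Z0-9][A-Za-z0-9]+\)"  # at least two characters

def found(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

for sample in ["(IM)", "(I)", "(i)"]:
    print(sample, found(STAR, sample), found(PLUS, sample))
# (IM) True True
# (I) True False
# (i) False False
```

Since the task statement asks for 2 or more characters inside the parentheses, the + form follows it more closely, while the * form also accepts (I).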