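I am trying to clean up abstracts by finding the text that appears between a period and a colon that is followed by an uppercase character. For this I use the regular expression:
re.findall(r"\.\s(.*?):\s?[A-Z]", text)
on the text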
text = 'Background: Flavonoids constitute one of the best-characterized groups of plant secondary metabolites with enormous pharmaceutical potential. A flavone type of plant flavonoid, cirsilineol, has been reported to exhibit proapoptotic effects against malignant human cells. Objectives: The present study was designed to investigate the antiproliferative effects of cirsilineol against human gastric cancer cells. Materials and Methods: Cell viability was assessed by 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) and colony formation assays. Apoptosis was detected by acridine orange/ethidium bromide (AO/EB) and annexin V/propidium iodide (PI) assay. Protein expression was examined by western blotting analysis. Results: The results showed cirsilineol inhibits the proliferation of human gastric cancer cells. The IC50 of cirsilineol against human gastric cancer cells (BGC-823, SGC-7901, and MGC-803) ranged from 8 to 10 mu M. Nonetheless, cirsilineol exhibited comparatively lower antiproliferative effects against normal GES-1 cells. The IC50 of cirsilineol against normal GES-1 cells was found to be 120 mu M. Colony formation assay showed that cirsilineol suppressed the colony formation of BGC-823 and MGC-803 cells in a dose-dependent manner. Acridine orange and ethidium bromide (AO/EB) staining showed that cirsilineol induced apoptosis in BGC-823 and MGC-803 cells. The percentage of apoptosis increased from 7.4% in control to 40.5% in BGC-823 cells and from 6.56% in control to 33.53% in MGC-803 cells at 8 mu M cirsilineol. Western blotting showed cirsilineol caused an increase in Bax and cleaved caspase-3 and a decrease in Bcl-2 expression in both BGC-823 and MGC-803 cells. Conclusion: Together, the results are indicative of the proapoptotic and antitumor potential of cirsilineol against gastric cancer cells, suggestive of its possible therapeutic significance in future.'
However, the first extracted match is:
'A flavone type of plant flavonoid, cirsilineol, has been reported to exhibit proapoptotic effects against malignant human cells. Objectives',
when it should be 'Objectives'.
What am I missing here?
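The lazy modifier means "stop as soon as the match can end here" instead of consuming more text. It does not affect where the match starts, so the match still begins at the first period followed by whitespace and simply extends to the nearest qualifying colon, crossing any sentence boundaries on the way.
For what you describe, you need to exclude the period (.) from the captured text. The regular expression for that is: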
\.\s([^.]*?):\s?[A-Z]
That way, no period other than the leading one is allowed inside the match.
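To make the difference concrete, here is a small sketch comparing the lazy pattern with the negated character class. The string `sample` is a shortened, made-up abstract used only for illustration, and the variable names are my own; neither comes from the question.

```python
import re

# Shortened, made-up abstract (illustrative only, not the text from the question).
sample = (
    "Background: Plants make flavonoids. Some are well studied. "
    "Objectives: We test one compound. "
    "Materials and Methods: Cells were assayed. "
    "Results: Growth fell by 7.4%. "
    "Conclusion: The compound looks promising."
)

# Lazy pattern: starts at the first ". " and runs to the nearest qualifying colon,
# so the first capture spans a whole extra sentence.
lazy = re.findall(r"\.\s(.*?):\s?[A-Z]", sample)

# Negated class: the capture may not contain a period, so an attempt that
# starts one sentence too early fails and the engine retries further right.
strict = re.findall(r"\.\s([^.]*?):\s?[A-Z]", sample)

print(lazy[0])  # Some are well studied. Objectives
print(strict)   # ['Objectives', 'Materials and Methods', 'Results', 'Conclusion']
```

Note that 'Background' is not returned by either pattern, because no period precedes it at the start of the string.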
You can also use
(?<=\.\s)[^.]+(?=:\s?[A-Z])
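With this form, the match itself contains only the text between the period and the colon followed by an uppercase letter, without the period, the colon, or the uppercase letter. That is useful if you need the same pattern in other languages.
In Python, both approaches work!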
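As a rough check that the two forms agree in Python, the sketch below runs both on the same kind of input. Again, `sample` is a made-up abstract and the names `lookaround`, `capturing`, `whole` and `grouped` are hypothetical, chosen only for this example.

```python
import re

# Shortened, made-up abstract (illustrative only, not the text from the question).
sample = (
    "Background: Plants make flavonoids. Some are well studied. "
    "Objectives: We test one compound. "
    "Materials and Methods: Cells were assayed. "
    "Results: Growth fell by 7.4%. "
    "Conclusion: The compound looks promising."
)

# Lookaround form: the delimiters are asserted but not consumed,
# so the whole match (group 0) is already the heading.
lookaround = re.compile(r"(?<=\.\s)[^.]+(?=:\s?[A-Z])")

# Capturing form: the delimiters are consumed, the heading is group 1.
capturing = re.compile(r"\.\s([^.]*?):\s?[A-Z]")

whole = [m.group(0) for m in lookaround.finditer(sample)]
grouped = [m.group(1) for m in capturing.finditer(sample)]

print(whole == grouped)                    # True
print(capturing.search(sample).group(0))   # . Objectives: W
```

The last line shows why the capture group matters in the second form: its full match still includes the period, the colon and the uppercase letter.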