Regex works in an online tester (https://regexr.com/) but not in Python / Google Colab



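I am trying to capture the footnotes of the attached text. The logic is that a footnote starts with a number at the beginning of a new line (\n\d+\s) and ends at the next number at the beginning of a new line (\n\d+\s) or at the word "Page". This logic seems to work in several online regex testers, but in a Python script on Google Colab it returns a list of empty strings.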

Here is the code:

m=re.findall(r'\n\d+(?:(?!\n\d+\s)(?!Page)(.|\n))*', text)

The output is a list of empty strings.

Below are two sample texts:

However, we also urge the Commission to go further, and revisit the entire NMS Plan 
process -- which is fundamentally conflicted and severely outdated.3 The NMS Plan 
process was devised over forty years ago at a time of fewer exchanges that were also 
mutually owned, not-for-profit entities. In stark contrast, today, these entities are 
for-profit publicly traded entities with third-party shareholders. While the proposed 
reforms would assert the Commission’s ability to more effectively regulate NMS Plan 
fee filings, it would not directly address the conflicts of interest of having for-profit 
entities acting as both regulators and fee setters on essential capital market utilities. We 
ask the Commission to boldly address these bigger issues, and to the extent necessary, 
seek assistance from Congress. 
1 The Healthy Markets Association is an investor-focused not-for-profit coalition working to educate 
market participants and promote data-driven reforms to market structure challenges. Our members, who 
range from a few billion to hundreds of billions of dollars in assets under management, have come 
together behind one basic principle: Informed investors and policymakers are essential for healthy capital 
markets. To learn more about Healthy Markets or our members, please see our website at 
http://healthymarkets.org. 
2 Rescission of Effective-Upon-Filing Procedure for NMS Plan Fee Amendments, Sec. and Exch. 
Comm’n, 84 Fed. Reg. 54794 (Oct. 11, 2019), available at 
https://www.govinfo.gov/content/pkg/FR-2019-10-11/pdf/2019-21770.pdf (“Rescission Proposal”). 
3 See, e.g., Remarks of Hon. Dan Gallagher, before the 2014 SRO Outreach Conference, Sept. 16, 
2014, available at https://www.sec.gov/news/speech/2014-spch091614dmg-sro 
Page 1 of 12 
http://www.healthymarkets.org/
https://www.govinfo.gov/content/pkg/FR-2019-10-11/pdf/2019-21770.pdf
https://www.sec.gov/news/speech/2014-spch091614dmg-sro
rule-comments@sec.gov
Background on NMS Plans Generally 
In the early 1970s, it became clear that the government needed to step into the markets 
to provide a mechanism to consolidate information and accountability across a myriad 
of trading venue. The Commission began outlining the contours of a “central market 
system for listed securities.”4 
The Commission has asked whether, as an alternative to rescinding the Fee Exception and 
instating the standard procedure, it should consider permitting NMS plan fee applications to 
take effect 60 days after filing if the Commission does not act.  In our view, this approach – 
while preferable to the current Fee Exception – essentially establishes a Fee Exception by 
another name.  As is the case under the current Fee Exception, the alternative, 60-day 
9 Proposal at 17. 
10 Proposal at 38. 
4 
https://transition.10
process, as described by the Commission, would still allow a fee change to take effect without 
an affirmative determination by the Commission during the 60-day abrogation period that the 
fee change comports with the requirements of the Exchange Act. More fundamentally, the 
Commission has taken notice of the fact that NMS plan fees are not economically de minimis 
or otherwise trivial; to the contrary, the total revenues generated by fees for core data totaled 
more than $500 million in just 2017. Given that fact alone, we are hard-pressed to understand 
why fee-related applications should be subject to a lesser standard of review than applications 
that pertain to other matters and that may be less significant economically to investors and 
other market participants. 

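For the second sample, the output we expect is: m=['9 Proposal at 17.', '10 Proposal at 38.']

In short, each item in the list m should be a single footnote. How should we solve this with a regular expression in Python on Google Colab?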

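It will work when you put parentheses around the whole pattern so that it becomes a capturing group. When a pattern contains capturing groups, re.findall returns the contents of those groups instead of the whole match, so the original pattern returned only what the inner (.|\n) group captured last:

m=re.findall(r'(\n\d+\s(?:(?!\n\d+\s)(?!Page)(.|\n))*)', text)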
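The difference between the variants can be checked on a small input. The following is a minimal sketch, not the original poster's code: `sample` is a shortened, hypothetical excerpt adapted from the first text in the question, and the names `INNER_ONLY`, `WITH_OUTER` and `NON_CAPTURING` are made up for illustration. It also shows a third option, assuming plain strings are wanted: make the inner group non-capturing so that re.findall returns the whole match.

```python
import re

# Hypothetical, shortened excerpt adapted from the first sample in the question.
sample = (
    "seek assistance from Congress. \n"
    "1 The Healthy Markets Association is an investor-focused coalition. \n"
    "2 Rescission of Effective-Upon-Filing Procedure, \n"
    "Comm'n, 84 Fed. Reg. 54794 (Oct. 11, 2019). \n"
    "Page 1 of 12 \n"
)

# Inner capturing group only: findall returns what that group captured last,
# which is a single character per match, not the footnote.
INNER_ONLY = r'\n\d+\s(?:(?!\n\d+\s)(?!Page)(.|\n))*'
inner_only = re.findall(INNER_ONLY, sample)

# Outer group added, as in the answer: findall now returns one tuple per match,
# (whole footnote, last character captured by the inner group).
WITH_OUTER = r'(\n\d+\s(?:(?!\n\d+\s)(?!Page)(.|\n))*)'
with_outer = re.findall(WITH_OUTER, sample)

# No capturing groups at all: findall returns the whole match as a string.
NON_CAPTURING = r'\n\d+\s(?:(?!\n\d+\s)(?!Page)(?:.|\n))*'
raw_matches = re.findall(NON_CAPTURING, sample)

# Collapse line breaks and surrounding whitespace inside each footnote.
footnotes = [" ".join(match.split()) for match in raw_matches]

print(inner_only)
print(with_outer)
print(footnotes)
```

With the tuple-returning variant, the footnote text is the first element of each tuple, for example `[t[0] for t in with_outer]`. Note that any line beginning with digits followed by whitespace is treated as the start of a footnote, so a stray page number such as the `4 ` line in the second sample would probably be picked up as well and may need separate filtering.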
