How to construct a pd.Series from a string



How can I construct a pd.Series object from a string (imported from a txt file) in which each entry should become one cell?

The string:

'Hegselmann, R. (2012). Thomas C. Schelling and the Computer: Some Notes on Schelling’s Essay „Letting a Computer Help with the Work“. Journal of Artificial Societies and Social Simulation, 15(4). http://jasss.soc.surrey.ac.uk/15/4/9.html\nDowney, A. (2012). Think Python. How to Think Like a Computer Scientist. O’Reilly Media, Incorporated. http://www.greenteapress.com/thinkpython/html/index.html\nBird, S., Klein, E., & Loper, E. (2009). Natural Language Processing with Python—Analyzing Text with the Natural Language Toolkit. O’Reilly Media. https://sites.google.com/site/naturallanguagetoolkit/book'

First, I converted the file to CSV:

import pandas as pd
import numpy as np
df = pd.read_fwf('E1_TM_1.txt')
df.to_csv('E1_TM_1.csv')

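I now want to represent it as a vector (is that the correct term?). It should look like a simple table: the first column is an index starting at 1, and the second column contains each of the references from the string.

I tried this code, but the result does not look the way I expected.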

pd.read_fwf('E1_TM_1.csv', encoding='utf8', index_col=0)
,"Hegselmann, R. (2012). Thomas C.","Schelling and the Computer: Some Notes on Schelling’s Essay „Letting a Computer Help with the Work“. Journal of Artificial Societies and Social Simulation, 15(4). http://jasss.soc.surrey.ac.uk/15/4/9.html"
0,"Downey, A. (2012). Think Python.","How to Think Like a Computer Scientist. O’Reilly Media, Incorporated. http://www.greenteapress.com/thinkpython/html/index.html"
1,"Bird, S., Klein, E., & Loper, E.",(2009). Natural Language Processing with Python—Analyzing Text with the Natural Language Toolkit. O’Reilly Media. https://sites.google.com/site/naturallanguagetoolkit/book

In addition, the utf8 encoding does not work for the whole string.

First, I suggest splitting the string on " " (a single space):

import numpy as np
import pandas as pd

string1 = "Hegselmann, R. (2012). Thomas C. Schelling and the Computer: Some Notes on Schelling’s Essay „Letting a Computer Help with the Work“. Journal of Artificial Societies and Social Simulation, 15(4). http://jasss.soc.surrey.ac.uk/15/4/9.html\nDowney, A. (2012). Think Python. How to Think Like a Computer Scientist. O’Reilly Media, Incorporated. http://www.greenteapress.com/thinkpython/html/index.html\nBird, S., Klein, E., & Loper, E. (2009). Natural Language Processing with Python—Analyzing Text with the Natural Language Toolkit. O’Reilly Media. https://sites.google.com/site/naturallanguagetoolkit/book."
# Split on single spaces to get a list of words
list_string = string1.split(' ')
# Turn the list of words into a NumPy array
np.array(list_string)

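To be honest, you described the task rather briefly... I think you can clean up the list and pick out the words you need before creating the array.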
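
If the goal is one reference per row rather than one word per element, a possible alternative is to split on the line breaks instead of on spaces. The following is only a sketch under some assumptions: the references are separated by "\n" as in the string shown in the question, the sample text is a shortened stand-in for that string, and the names text, references and series are made up for illustration.

import pandas as pd

# Shortened stand-in for the question's string (titles abbreviated for
# readability); the entries are assumed to be separated by "\n".
text = (
    "Hegselmann, R. (2012). Thomas C. Schelling and the Computer.\n"
    "Downey, A. (2012). Think Python.\n"
    "Bird, S., Klein, E., & Loper, E. (2009). Natural Language Processing with Python."
)

# One element per non-empty line, with surrounding whitespace removed.
references = [line.strip() for line in text.splitlines() if line.strip()]

# Label the entries 1..n instead of the default 0..n-1.
series = pd.Series(references, index=range(1, len(references) + 1), name="References")
print(series)

To work from the file itself, text could presumably be read with open('E1_TM_1.txt', encoding='utf-8').read(), assuming the file really is UTF-8 encoded. Reading it as plain text avoids read_fwf, which guesses fixed-width column boundaries and is probably what cut each reference into two columns.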
