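I have a dataset with two columns: Industry Classifications and Stock Tickers. For each company, the Industry Classifications column contains multiple labels separated by a semicolon (;). I want to select only the first label.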
import pandas as pd
training = pd.read_excel('Training Data.xlsx')
Current file structure (a sample of the column):
Industry Classifications
Beauty Care Products (Primary); Consumer Staples (Primary); Hair Care Products (Primary);
Catalog Flowers, Gifts and Novelties (Primary); Catalog Hobbies, Games and Toy Retail (Primary);
Information Technology (Primary); Internet Software and Services (Primary);
Casualty (Primary); Financials (Primary); Fire and Marine Insurance (Primary);
Commercial and Professional Services (Primary); Commercial Services and Supplies (Primary);
Banks (Primary); Banks (Primary); Diversified Banks (Primary); Financials (Primary);
Application Software (Primary); Information Technology (Primary); Software (Primary);
Commercial and Professional Services (Primary); Consulting Services (Primary); Industrials (Primary);
Banks (Primary); Banks (Primary); Financials (Primary); National and State Commercial Banks (Primary);
Expected output:
Industry Classifications
Beauty Care Products (Primary)
Catalog Flowers
Information Technology (Primary)
Casualty (Primary)
Commercial and Professional Services (Primary)
Banks (Primary); Banks (Primary)
Application Software (Primary)
Commercial and Professional Services (Primary)
Banks (Primary); Banks (Primary)
You can extract the column as you have already done, then split each value on the semicolon and take the first element of the result.
first_tag = col.split(';')[0]
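The one-liner above works when col is a single string. Applied to a whole pandas column, the same split-and-take-first idea might look like the sketch below. The small DataFrame is a hypothetical stand-in for the contents of 'Training Data.xlsx', and the name first_tags is chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for pd.read_excel('Training Data.xlsx')
training = pd.DataFrame({
    "Industry Classifications": [
        "Beauty Care Products (Primary); Consumer Staples (Primary); Hair Care Products (Primary);",
        "Information Technology (Primary); Internet Software and Services (Primary);",
        "Banks (Primary); Banks (Primary); Diversified Banks (Primary); Financials (Primary);",
    ]
})

# Split every cell on ';', keep the first piece, and trim stray whitespace
first_tags = (
    training["Industry Classifications"]
    .str.split(";")
    .str[0]
    .str.strip()
)

# Overwrite the column with the first label only
training["Industry Classifications"] = first_tags
print(training)
```

Note that this keeps exactly one label per row, so a row that begins with a repeated label such as "Banks (Primary); Banks (Primary); ..." yields a single "Banks (Primary)".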