How to split text and keep only the part before the first delimiter in a DataFrame



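I have a dataset with two columns: Industry Classifications and Stock Tickers. Each company has multiple tags in the Industry Classifications column, separated by the ; delimiter. I want to select only the first tag.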

import pandas as pd
training = pd.read_excel('Training Data.xlsx')

Current file structure (this is a sample of the column):

Industry Classifications
Beauty Care Products (Primary); Consumer Staples (Primary); Hair Care Products (Primary);
Catalog Flowers, Gifts and Novelties (Primary); Catalog Hobbies, Games and Toy Retail (Primary);
Information Technology (Primary); Internet Software and Services (Primary);
Casualty (Primary); Financials (Primary); Fire and Marine Insurance (Primary); 
Commercial and Professional Services (Primary); Commercial Services and Supplies (Primary); 
Banks (Primary); Banks (Primary); Diversified Banks (Primary); Financials (Primary); 
Application Software (Primary); Information Technology (Primary); Software (Primary);
Commercial and Professional Services (Primary); Consulting Services (Primary); Industrials (Primary);
Banks (Primary); Banks (Primary); Financials (Primary); National and State Commercial Banks (Primary); 

Expected output:

Industry Classifications
Beauty Care Products (Primary)
Catalog Flowers
Information Technology (Primary)
Casualty (Primary)
Commercial and Professional Services (Primary) 
Banks (Primary); Banks (Primary)
Application Software (Primary)
Commercial and Professional Services (Primary)
Banks (Primary); Banks (Primary)

You can extract the first column as you have already done, then split on the semicolon and take the first element of the result.

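# col is the pandas Series for the column; a plain col.split(';')[0] only works on a single string
first_tag = col.str.split(';').str[0]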
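For a whole column, the same idea can be sketched with the pandas `.str` accessor. This is a minimal example, not a definitive solution: the small inline DataFrame is a hypothetical stand-in for 'Training Data.xlsx', and the name `first_tags` is chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for the data read from 'Training Data.xlsx'
training = pd.DataFrame({
    "Industry Classifications": [
        "Beauty Care Products (Primary); Consumer Staples (Primary); Hair Care Products (Primary);",
        "Information Technology (Primary); Internet Software and Services (Primary);",
        "Banks (Primary); Banks (Primary); Diversified Banks (Primary); Financials (Primary); ",
    ]
})

# Split each cell at most once on ';', keep the part before it,
# and trim any surrounding whitespace
first_tags = (
    training["Industry Classifications"]
    .str.split(";", n=1)
    .str[0]
    .str.strip()
)

print(first_tags.tolist())
```

Passing `n=1` stops after the first semicolon, so the rest of the string is never split. Note that this always keeps a single tag, so a row starting with "Banks (Primary); Banks (Primary)" yields just "Banks (Primary)", which differs from the duplicated entries shown in the expected output above.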
