Using pandas, I want to derive a last-name column for the set of first names that are 4 or more characters long.
Here is what I tried:
import pandas as pd
data = pd.read_csv("Data.csv")
data
#split the EmployeeName into firstname and lastname
flname = data['EmployeeName'].str.split(expand=True)
flname
#add first name column to data frame
data['FirstName'] = flname[0]
#apply condition on first name
dfname = data['FirstName'].apply(lambda x: x if len(x) > 4 else None)
dfname = dfname.dropna()
#add last name and new first name columns to data frame
data['LastName'] = flname[0]
data['NewFirstName'] = dfname
data
#This is the wrong bit that throws an error
derived_name = data.apply(lambda x: x if data['FirstName'] in data['NewFirstName'] else None)
derived_name.dropna()
2 answers
u3r8eeie1#
I solved this using the answer to question 1387.
Thanks, everyone. But is there a shorter way to do this?
ac1kyiln2#
Split the data
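A minimal sketch of the split step, not the answerer's original code. It assumes `EmployeeName` holds a first name followed by a last name, separated by whitespace. The sample names are made up for illustration, and `n=1` is my own choice so that everything after the first space stays in the last name.

```python
import pandas as pd

# Hypothetical sample data standing in for Data.csv
data = pd.DataFrame(
    {"EmployeeName": ["Alexander Smith", "Bob Jones", "Mary Ann Lee"]}
)

# Split on the first run of whitespace only, so a multi-word
# remainder such as "Ann Lee" stays together in one column
parts = data["EmployeeName"].str.split(n=1, expand=True)
data["FirstName"] = parts[0]
data["LastName"] = parts[1]
print(data)
```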
After splitting the name column, you should use a mask, since it makes this operation very simple.
This should give the desired output.
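The mask approach could look roughly like the sketch below. This is not the answerer's original code: the sample names are invented, and the threshold `>= 4` follows the question's wording ("4 or more characters") rather than the `> 4` used in the attempted code.

```python
import pandas as pd

# Hypothetical sample data standing in for Data.csv
data = pd.DataFrame(
    {"EmployeeName": ["Alexander Smith", "Bob Jones", "Mary Lee"]}
)

# Split into first name and the rest (assumes whitespace-separated names)
parts = data["EmployeeName"].str.split(n=1, expand=True)
data["FirstName"] = parts[0]

# Boolean mask: True where the first name has 4 or more characters
long_enough = data["FirstName"].str.len() >= 4

# Keep the last name where the mask is True; other rows become NaN
data["LastName"] = parts[1].where(long_enough)

# Rows that actually received a last name
derived = data.dropna(subset=["LastName"])
print(derived)
```

Because the comparison is vectorized, no `apply` with a lambda is needed. Writing `parts[1].mask(~long_enough)` should be equivalent, since `Series.mask` replaces values where the condition is True.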