Splitting a column based on conditions in Python

wfveoks0  posted on 2022-09-21  in Python

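I have a dataset in which a single column holds several values, and some of those values may be missing. I need to create three separate columns from this column, but neither the number of characters nor their positions are fixed.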

Original data:

import pandas as pd

# Each list must have the same length (8 rows), otherwise pandas raises a ValueError
df = pd.DataFrame({
    'Date': ['2-18-2019', '2-18-2019', '2-19-2019', '2-19-2019', '2-20-2019', '2-21-2019', '2-21-2019', '2-22-2019'],
    'Item': ['NY01', 'Ld01', 'Du02', 'Du01', 'Ps55', 'L55', 'Du85', 'L85'],
    'SizeAgeQuantity': ['13 3/8 5 846', '4 1/2 557 85', '9 5/8 47 4464', '30 58', '32 304 304', '7 6588 685', '4118 587', '29'],
})

   Date    |    Item    |    SizeAgeQuantity
2-18-2019  |    NY01    |     13 3/8 5 846         
2-18-2019  |    Ld01    |     4 1/2 557 85        
2-19-2019  |    Du02    |     9 5/8 47 4464         
2-19-2019  |    Du01    |         30 58      
2-20-2019  |    Ps55    |     32 304 304      
2-21-2019  |    L55     |     7  6588 685  
2-21-2019  |    Du85    |        4118 587       
2-22-2019  |    L85     |        29

The result I want looks like this:

Date    |    Item    |    Size    |    Age   |   Quantity
2-18-2019  |    NY01    |   13 3/8   |     5    |     846         
2-18-2019  |    Ld01    |    4 1/2   |    557   |     85        
2-19-2019  |    Du02    |    9 5/8   |    47    |     4464         
2-19-2019  |    Du01    |     30     |    58    |  
2-20-2019  |    Ps55    |     32     |    304   |     304      
2-21-2019  |    L55     |     7      |    6588  |     685  
2-21-2019  |    Du85    |            |    4118  |     587       
2-22-2019  |    L85     |            |    29    |

The only consistent thing is that the "Size" column should contain only values from the following set: ("4 1/2", "7", "9 5/8", "13 3/8", "18", "30", "32").

I tried the following code: df['Size'], df['FrakS'], df['Age'], df['Quantity'] = df['SizeAgeQuantity'].str.split(' ', 3).str

But the result was:

Date    |    Item    |    Size    |   FrakS   |    Age   |   Quantity
2-18-2019  |    NY01    |     13     |    3/8    |     5    |     846         
2-18-2019  |    Ld01    |     4      |    1/2    |    557   |     85        
2-19-2019  |    Du02    |     9      |    5/8    |    47    |     4464         
2-19-2019  |    Du01    |     30     |    58     |          |  
2-20-2019  |    Ps55    |     32     |    304    |    304   |           
2-21-2019  |    L55     |     7      |    6588   |    685   |       
2-21-2019  |    Du85    |    4118    |    587    |          |          
2-22-2019  |    L85     |     29     |           |          |
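
To see why a purely positional split misaligns the columns, here is a minimal sketch on a few of the sample strings. It assumes a recent pandas version, where the split limit is passed as the keyword `n` and `expand=True` returns one column per token position. The column names are taken from the attempt above, and the variable names `s` and `parts` are only illustrative.

```python
import pandas as pd

# A few of the sample strings from the question
s = pd.Series(['13 3/8 5 846', '30 58', '4118 587', '29'])

# expand=True returns a DataFrame with one column per token position;
# rows with fewer tokens are padded with missing values on the right
parts = s.str.split(' ', n=3, expand=True)
parts.columns = ['Size', 'FrakS', 'Age', 'Quantity']

# Tokens are assigned by position only, so '30 58' puts 58 under FrakS
# and '4118 587' puts 4118 under Size
print(parts)
```

Because the split knows nothing about which token is a size, any fix has to look at the token values themselves, for example by checking them against the list of allowed sizes.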

I would be very grateful if anyone could help.

sqxo8psd  #1

Try:

# Valid sizes listed in the question; used to tell "size age" from "age quantity"
SIZES = {'4 1/2', '7', '9 5/8', '13 3/8', '18', '30', '32'}

# Split each string into a list of whitespace-separated tokens
df['SizeAgeQuantity'] = df['SizeAgeQuantity'].str.split()

def f_size(x):
    if (len(x)==4) and ('/' in x[1]):
        return ' '.join(x[:2])      # fractional size, e.g. '13 3/8'
    elif (len(x)==3):
        return x[0]
    elif (len(x)==2) and (x[0] in SIZES):
        return x[0]                 # size and age only
    else:
        return ''                   # no size in this row

def f_age(x):
    if (len(x)==4) and ('/' in x[1]):
        return x[2]
    elif (len(x)==3):
        return x[1]
    elif (len(x)==2) and (x[0] in SIZES):
        return x[1]
    elif (len(x)>=1):
        return x[0]                 # no size: age is the first token
    else:
        return ''

def f_qty(x):
    if (len(x)==4) and ('/' in x[1]):
        return x[-1]
    elif (len(x)==3):
        return x[-1]
    elif (len(x)==2) and (x[0] not in SIZES):
        return x[1]                 # age and quantity only
    else:
        return ''

df['size'] = df['SizeAgeQuantity'].map(f_size)
df['age'] = df['SizeAgeQuantity'].map(f_age)
df['qty'] = df['SizeAgeQuantity'].map(f_qty)

df
    Date        Item    SizeAgeQuantity     size        age     qty
0   2-18-2019   NY01    [13, 3/8, 5, 846]   13 3/8      5       846
1   2-18-2019   Ld01    [4, 1/2, 557, 85]   4 1/2       557     85
2   2-19-2019   Du02    [9, 5/8, 47, 4464]  9 5/8       47      4464
3   2-19-2019   Du01    [30, 58]            30          58
4   2-20-2019   Ps55    [32, 304, 304]      32          304     304
5   2-21-2019   L55     [7, 6588, 685]      7           6588    685
6   2-21-2019   Du85    [4118, 587]                     4118    587
7   2-22-2019   L85     [29]                            29
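
As an alternative to the list-based functions above, a regular expression built from the allowed sizes can do the disambiguation in one pass. This is only a sketch under stated assumptions: the Size set from the question is complete, Age and Quantity are always plain integers, and a leading token that matches an allowed size is always treated as the size. The names `SIZES`, `size_alt`, `pattern`, `parts` and `out` are illustrative and do not come from the original post.

```python
import re
import pandas as pd

# Sample strings from the question, plus the allowed Size values it lists
df = pd.DataFrame({'SizeAgeQuantity': [
    '13 3/8 5 846', '4 1/2 557 85', '9 5/8 47 4464', '30 58',
    '32 304 304', '7 6588 685', '4118 587', '29',
]})
SIZES = ['4 1/2', '7', '9 5/8', '13 3/8', '18', '30', '32']

# Longest alternatives first, so fractional sizes are tried before short ones
size_alt = '|'.join(re.escape(s) for s in sorted(SIZES, key=len, reverse=True))

# Optional size (must be followed by whitespace or end of string),
# then an optional age and an optional quantity, both plain digits
pattern = (
    r'^\s*(?:(?P<Size>' + size_alt + r')(?!\S)\s*)?'
    r'(?P<Age>\d+)?(?:\s+(?P<Quantity>\d+))?\s*$'
)

# Named groups become column names; groups that do not match become NaN
parts = df['SizeAgeQuantity'].str.extract(pattern)
out = df.join(parts)
print(out.fillna(''))
```

The `(?!\S)` lookahead keeps a size such as `30` from matching the start of a longer number such as `3058`. The extracted columns are strings; `pd.to_numeric` could be applied to `Age` and `Quantity` if numeric types are needed.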
