Calculating adjusted cost basis with Python pandas (portfolio analysis of stock buys/sells)

pcww981p  posted on 2023-04-28  in Python

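I am trying to do a portfolio analysis of my trades and to calculate the adjusted cost basis price. I have tried almost everything, but nothing seems to work. I can calculate the adjusted quantity, but I cannot get the adjusted buy price. Could anyone please help?
Here is a sample of the raw trade log data: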

import pandas as pd
import numpy as np

raw_data = {'Date':  ['04-23-2020', '05-05-2020', '05-05-2020', '05-11-2020', '05-11-2020', 
              '05-12-2020', '05-12-2020', '05-27-2020', '06-03-2020', '06-03-2020', 
              '06-03-2020', '06-03-2020', '06-03-2020'],
    'Type': ['Buy', 'Buy', 'Buy', 'Buy', 'Buy', 'Buy', 'Buy', 
             'Sell', 'Sell', 'Sell', 'Buy', 'Sell', 'Sell'],
    'Symbol': ['TSE:AC', 'TSE:AC', 'TSE:HEXO', 'TSE:BPY.UN', 'TSE:BPY.UN', 
               'TSE:BPY.UN', 'TSE:AC', 'TSE:BPY.UN', 'TSE:AC', 'TSE:BPY.UN', 
               'TSE:AC', 'TSE:BPY.UN', 'TSE:HEXO'],
    'Quantity': [75, 100, 1450, 200, 50, 80, 150, 100, 125, 100, 100, 50, 1450],
    'Amount per unit': [18.04, 17.29, 0.73, 13.04, 13.06, 12.65, 15.9, 15.01, 
                        18.05, 14.75, 15.8, 14.7, 1.07],
    'Turnover': [1353, 1729, 1058.5, 2608, 653, 1012, 2385, 1501, 2256.25, 1475, 1580, 735, 1551.5],
    }
df = pd.DataFrame (raw_data, columns = ['Date','Type','Symbol','Quantity','Amount per unit', 'Turnover']).sort_values(['Date','Symbol']).reset_index(drop = True)

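I can get the adjusted quantity without any problem, but I cannot get the correct adjusted price per unit. The condition is that when I sell shares, my adjusted price per unit should not change: it should stay at the last adjusted price from when shares were bought.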

#to calculate adjusted quantity. this works as expected
df['Adjusted Quantity'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Quantity'], axis = 1)
df['Adjusted Quantity'] = df.groupby('Symbol')['Adjusted Quantity'].cumsum()

#section where I am having problem. Works good until I reach the row where sell was made
df['Adjusted Price Per Unit'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Turnover'], axis = 1)
df['Adjusted Price Per Unit'] = df.groupby('Symbol')['Adjusted Price Per Unit'].cumsum().div(df['Adjusted Quantity'])

Running this code produces the following result:

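For example, the adjusted price for the row at index 7 should be 12.948 (the same as the row at index 6), not 12.052. Likewise, the adjusted price for the last row should be 0.73 (the same as the row at index 2), because I bought and sold the same number of shares.
Example 2: at index 6 I bought 80 shares of BPY at 12.65, which brought my average price down to 12.94 for a total of 330 shares (250 + 80). Then I sold 100 shares at 15.01 (index 7). My code makes the adjusted cost 12.05, but I need it to be 12.94. Simply put, if the transaction type is "Sell", ignore the adjusted price and use the last adjusted price from the most recent Buy transaction for that stock.
The last 2 lines of my code are incorrect. Can you help me calculate the adjusted price per unit correctly? Thanks :)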
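
To see where the 12.052 comes from, the arithmetic for the TSE:BPY.UN rows can be reproduced by hand. The sketch below uses only figures from the sample trade log; the variable names are illustrative and not part of the original code. It suggests that the cumulative sum subtracts the sale proceeds (valued at the sell price) from the running cost, which is what pulls the per-unit figure down.

```python
# Figures taken from the TSE:BPY.UN rows of the sample trade log
buy_turnover = 2608 + 653 + 1012      # turnover of the three buys
shares_held = 200 + 50 + 80           # 330 shares after the three buys

# Average price after the last buy (index 6)
avg_after_buys = buy_turnover / shares_held
print(round(avg_after_buys, 3))       # 12.948

# What the cumsum-based code computes on the sell row (index 7):
# the sale proceeds are subtracted from the running turnover
sell_turnover = 1501                  # 100 shares sold at 15.01
price_from_cumsum = (buy_turnover - sell_turnover) / (shares_held - 100)
print(round(price_from_cumsum, 3))    # 12.052
```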

p5cysglq  #1

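If you do not want an adjusted price calculated for sells (as you describe), you can set the sell rows to NA and fill them with the previous value for the same stock. One thing to confirm in your code: before computing "Adjusted Quantity", don't the rows need to be ordered so that trades of the same stock are together?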

df.sort_values(['Symbol','Date','Type'], ascending=[True, True, True], inplace=True)
# your code
df['Adjusted Quantity'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Quantity'], axis = 1)
df['Adjusted Quantity'] = df.groupby('Symbol')['Adjusted Quantity'].cumsum()
df['Adjusted Price Per Unit'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Turnover'], axis = 1)
df['Adjusted Price Per Unit'] = df.groupby('Symbol')['Adjusted Price Per Unit'].cumsum().div(df['Adjusted Quantity'])

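# blank out the price on sell rows, then carry the last buy price forward within each symbol
# (np.nan and a per-symbol ffill replace np.NaN and fillna(method='ffill'), which newer NumPy/pandas releases drop or deprecate)
df.loc[df['Type'] == 'Sell', 'Adjusted Price Per Unit'] = np.nan
df['Adjusted Price Per Unit'] = df.groupby('Symbol')['Adjusted Price Per Unit'].ffill()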

|    | Date       | Type   | Symbol     |   Quantity |   Amount per unit |   Turnover |   Adjusted Quantity |   Adjusted Price Per Unit |
|---:|:-----------|:-------|:-----------|-----------:|------------------:|-----------:|--------------------:|--------------------------:|
|  0 | 04-23-2020 | Buy    | TSE:AC     |         75 |             18.04 |    1353    |                  75 |                   18.04   |
|  1 | 05-05-2020 | Buy    | TSE:AC     |        100 |             17.29 |    1729    |                 175 |                   17.6114 |
|  5 | 05-12-2020 | Buy    | TSE:AC     |        150 |             15.9  |    2385    |                 325 |                   16.8215 |
|  9 | 06-03-2020 | Buy    | TSE:AC     |        100 |             15.8  |    1580    |                 425 |                   16.5812 |
|  8 | 06-03-2020 | Sell   | TSE:AC     |        125 |             18.05 |    2256.25 |                 300 |                   16.5812 |
|  3 | 05-11-2020 | Buy    | TSE:BPY.UN |        200 |             13.04 |    2608    |                 200 |                   13.04   |
|  4 | 05-11-2020 | Buy    | TSE:BPY.UN |         50 |             13.06 |     653    |                 250 |                   13.044  |
|  6 | 05-12-2020 | Buy    | TSE:BPY.UN |         80 |             12.65 |    1012    |                 330 |                   12.9485 |
|  7 | 05-27-2020 | Sell   | TSE:BPY.UN |        100 |             15.01 |    1501    |                 230 |                   12.9485 |
| 10 | 06-03-2020 | Sell   | TSE:BPY.UN |        100 |             14.75 |    1475    |                 130 |                   12.9485 |
| 11 | 06-03-2020 | Sell   | TSE:BPY.UN |         50 |             14.7  |     735    |                  80 |                   12.9485 |
|  2 | 05-05-2020 | Buy    | TSE:HEXO   |       1450 |              0.73 |    1058.5  |                1450 |                    0.73   |
| 12 | 06-03-2020 | Sell   | TSE:HEXO   |       1450 |              1.07 |    1551.5  |                   0 |                    0.73   |
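
One case worth checking with this approach is a new purchase after a position has been sold down to zero. The sketch below is a small, self-contained reproduction, not the answerer's code verbatim: it reuses the two TSE:HEXO trades from the sample and adds the 3000-share re-buy at 2.50 that a later answer in this thread uses. Because the cumulative turnover still carries the earlier buy and sell, the price on the re-buy row comes out near 2.336 rather than 2.50.

```python
import numpy as np
import pandas as pd

# TSE:HEXO trades: buy, full sell, then a fresh buy after the position reached zero
hexo = pd.DataFrame({
    'Type': ['Buy', 'Sell', 'Buy'],
    'Symbol': ['TSE:HEXO'] * 3,
    'Quantity': [1450, 1450, 3000],
    'Turnover': [1058.5, 1551.5, 7500.0],
})

# +1 for buys, -1 for sells
sign = np.where(hexo['Type'] == 'Buy', 1, -1)

# running quantity and running signed turnover per symbol
hexo['Adjusted Quantity'] = (sign * hexo['Quantity']).groupby(hexo['Symbol']).cumsum()
running_turnover = (sign * hexo['Turnover']).groupby(hexo['Symbol']).cumsum()
hexo['Adjusted Price Per Unit'] = running_turnover / hexo['Adjusted Quantity']

# mask sell rows and carry the last buy price forward within the symbol
hexo.loc[hexo['Type'] == 'Sell', 'Adjusted Price Per Unit'] = np.nan
hexo['Adjusted Price Per Unit'] = hexo.groupby('Symbol')['Adjusted Price Per Unit'].ffill()

print(hexo)
# The last row shows (1058.5 - 1551.5 + 7500) / 3000, not 7500 / 3000
```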

e0bqpujr  #2

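The answer above by @r-beginners does not correctly handle a new purchase after the quantity has reached 0, as @guialmachado and @Rene Chan observed. So here is my code, which uses Rene Chan's example and calculates it correctly.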

import pandas as pd
import numpy as np

raw_data = {'Date':  ['04-23-2020', '05-05-2020', '05-05-2020', '05-11-2020', '05-11-2020', 
              '05-12-2020', '05-12-2020', '05-27-2020', '06-03-2020', '06-03-2020', 
              '06-03-2020', '06-03-2020', '06-03-2020', '06-05-2020'],
    'Type': ['Buy', 'Buy', 'Buy', 'Buy', 'Buy', 'Buy', 'Buy', 
             'Sell', 'Sell', 'Sell', 'Buy', 'Sell', 'Sell', 'Buy'],
    'Symbol': ['TSE:AC', 'TSE:AC', 'TSE:HEXO', 'TSE:BPY.UN', 'TSE:BPY.UN', 
               'TSE:BPY.UN', 'TSE:AC', 'TSE:BPY.UN', 'TSE:AC', 'TSE:BPY.UN', 
               'TSE:AC', 'TSE:BPY.UN', 'TSE:HEXO', 'TSE:HEXO'],
    'Quantity': [75, 100, 1450, 200, 50, 80, 150, 100, 125, 100, 100, 50, 1450, 3000],
    'Amount per unit': [18.04, 17.29, 0.73, 13.04, 13.06, 12.65, 15.9, 15.01, 
                        18.05, 14.75, 15.8, 14.7, 1.07, 2.50],
    'Turnover': [1353, 1729, 1058.5, 2608, 653, 1012, 2385, 1501, 2256.25, 1475, 1580, 735, 1551.5, 7500],
    }
df = pd.DataFrame (raw_data, columns = ['Date','Type','Symbol','Quantity','Amount per unit', 'Turnover']).sort_values(['Date','Symbol']).reset_index(drop = True)

df.sort_values(['Symbol','Date','Type'], ascending=[True, True, True], inplace=True)

df['Adjusted Quantity'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Quantity'], axis = 1)
df['Adjusted Quantity'] = df.groupby('Symbol')['Adjusted Quantity'].cumsum()

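def calculate_adjusted_price_per_unit(group):
    group = group.copy()  # work on a copy rather than the slice handed over by groupby
    group['Adjusted Turnover'] = group.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Turnover'], axis = 1)
    group['Adjusted Price Per Unit'] = 0.0  # float column, so the float prices assigned later keep the dtype
    
    current_turnover = 0
    last_price = 0.0  # adjusted price set by the most recent Buy
    for index, row in group.iterrows():
        current_turnover += row['Adjusted Turnover']
        
        if row['Type'] == 'Buy':
            last_price = current_turnover / row['Adjusted Quantity']
        # Sell rows carry forward the price from the last Buy instead of staying at 0
        group.at[index, 'Adjusted Price Per Unit'] = last_price
        
        # position fully closed: restart the running turnover for the next Buy
        if row['Adjusted Quantity'] == 0:
            current_turnover = 0
    
    return group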

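# group_keys=False keeps the original row index instead of adding Symbol as an extra index level
df = df.groupby('Symbol', group_keys=False).apply(calculate_adjusted_price_per_unit)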
df.drop(columns=['Adjusted Turnover'], inplace=True)

Correct output

          Date  Type      Symbol  Quantity  Amount per unit  Turnover  Adjusted Quantity  Adjusted Price Per Unit
0   04-23-2020   Buy      TSE:AC        75            18.04   1353.00                 75                18.040000
1   05-05-2020   Buy      TSE:AC       100            17.29   1729.00                175                17.611429
5   05-12-2020   Buy      TSE:AC       150            15.90   2385.00                325                16.821538
9   06-03-2020   Buy      TSE:AC       100            15.80   1580.00                425                16.581176
8   06-03-2020  Sell      TSE:AC       125            18.05   2256.25                300                16.581176
3   05-11-2020   Buy  TSE:BPY.UN       200            13.04   2608.00                200                13.040000
4   05-11-2020   Buy  TSE:BPY.UN        50            13.06    653.00                250                13.044000
6   05-12-2020   Buy  TSE:BPY.UN        80            12.65   1012.00                330                12.948485
7   05-27-2020  Sell  TSE:BPY.UN       100            15.01   1501.00                230                12.948485
10  06-03-2020  Sell  TSE:BPY.UN       100            14.75   1475.00                130                12.948485
11  06-03-2020  Sell  TSE:BPY.UN        50            14.70    735.00                 80                12.948485
2   05-05-2020   Buy    TSE:HEXO      1450             0.73   1058.50               1450                 0.730000
12  06-03-2020  Sell    TSE:HEXO      1450             1.07   1551.50                  0                 0.730000
13  06-05-2020   Buy    TSE:HEXO      3000             2.50   7500.00               3000                 2.500000
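
One case the sample data does not exercise is a buy that follows a partial sell. The loop above reduces the running turnover by the sale proceeds, so a later buy would be averaged against a figure that depends on the sell price. The sketch below is one possible alternative, assuming average-cost accounting (an assumption, not something stated in the question): a sell removes shares at the current average cost, so the average only moves on buys. The function name `add_average_cost` and the ticker `XYZ` are made up for illustration, and rows are assumed to already be in execution order.

```python
import pandas as pd

def add_average_cost(trades: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: running quantity and average cost per symbol.

    Assumes rows are in execution order and that the columns
    'Type', 'Symbol', 'Quantity' and 'Turnover' exist.
    """
    out = trades.copy()
    held = {}      # symbol -> shares currently held
    cost = {}      # symbol -> total cost of the shares currently held
    average = {}   # symbol -> average cost set by the most recent Buy
    quantities, prices = [], []
    cols = ['Type', 'Symbol', 'Quantity', 'Turnover']
    for row in out[cols].itertuples(index=False):
        sym = row.Symbol
        if row.Type == 'Buy':
            held[sym] = held.get(sym, 0) + row.Quantity
            cost[sym] = cost.get(sym, 0.0) + row.Turnover
            average[sym] = cost[sym] / held[sym]
        else:
            # a sell removes shares at average cost, so the average is unchanged
            held[sym] = held.get(sym, 0) - row.Quantity
            cost[sym] = average.get(sym, 0.0) * held[sym]
        quantities.append(held[sym])
        prices.append(average.get(sym, 0.0))
    out['Adjusted Quantity'] = quantities
    out['Adjusted Price Per Unit'] = prices
    return out

# Made-up trades: buy 100 at 10, sell 50 at 15, buy 50 at 20
trades = pd.DataFrame({
    'Type': ['Buy', 'Sell', 'Buy'],
    'Symbol': ['XYZ', 'XYZ', 'XYZ'],
    'Quantity': [100, 50, 50],
    'Turnover': [1000.0, 750.0, 1000.0],
})
result = add_average_cost(trades)
print(result)
```

On these made-up trades the last row comes out at 15.0 (50 remaining shares at a cost of 10 plus 50 new shares at 20), whereas netting the sale proceeds would give (1000 - 750 + 1000) / 100 = 12.5. A position that is sold down to zero resets on its own here, because the remaining cost becomes 0 along with the quantity.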
