numpy: How to duplicate each row in a DataFrame a certain number of times while incrementing a specific column one by one? [closed]

Posted by neekobn8 on 2023-01-20 in Other


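- **Closed.** This question needs to be more focused. It is not currently accepting answers.
- **Want to improve this question?** Update the question so it focuses on one problem only by editing this post.

Closed 5 hours ago.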
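Suppose I have a regression model that, given only the month, the day, the like ratio, and the number of views (in thousands), tells me how many people will share a video on that day.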

Month  Day  Like_ratio  Thousands of views  Number of shares
07     02   0.279323    0.877446            7
12     23   0.328068    0.837669            8
11     30   0.107959    0.678297            12
02     26   0.131555    0.418380            3
06     12   0.999961    0.619517            4
10     17   0.129270    0.024533            8
05     08   0.441010    0.741781            9
07     31   0.682101    0.375660            2
08     24   0.754488    0.352293            9

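Now I have been given a list of videos, each identified by a number, and I have been asked to predict, for each video, the total number of shares over one month, assuming the like/dislike ratio and the view count stay constant.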

Video_ID  date   ratio_liked  accomulated_views
45        08-01  0.540457     0.826594
87        06-07  0.979323     0.977446
34        02-09  0.128068     0.1237669
25        01-07  0.507959     0.378297
23        09-03  0.731555     0.818380
85        02-01  0.999961     0.619517
92        04-07  0.129270     0.024533
51        07-03  0.441010     0.741781
37        12-01  0.682101     0.375660
50        11-10  0.754488     0.352293

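The only approach I could come up with is:
1. First, I create a date range between the selected date and the date one (full) month later: pandas.date_range(date, date + DateOffset(months=1), freq='d')
2. Then, for each video, I try to repeat the same Video_ID, like ratio, and views values 30 times while advancing the date by one day per row. (I could not get this to work.)
3. I extract the month and the day from the date.
4. I run the regression with the model.
5. I group by Video_ID and sum all the predicted share counts.
The one thing I really cannot do is the second step. Can anyone help me?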
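One possible way to do the second and third steps is sketched below. It is not taken from the original post. It assumes each video gets exactly 30 consecutive days (the "30 times" mentioned above, rather than a calendar month via DateOffset), and it assumes a year of 2023 because the dates in the question carry no year. The names YEAR, N_DAYS, start, day_date, and expanded are made up for illustration; only the first three videos from the question are used.

```python
import numpy as np
import pandas as pd

# First three rows of the video list from the question.
videos = pd.DataFrame({
    "Video_ID": [45, 87, 34],
    "date": ["08-01", "06-07", "02-09"],
    "ratio_liked": [0.540457, 0.979323, 0.128068],
    "accomulated_views": [0.826594, 0.977446, 0.1237669],
})

YEAR = 2023   # assumption: the dates have no year, so one is supplied here
N_DAYS = 30   # assumption: "one month" is treated as 30 consecutive days

# Parse "MM-DD" (assumed order) into a full start date per video.
videos["start"] = pd.to_datetime(str(YEAR) + "-" + videos["date"], format="%Y-%m-%d")

# Step 2: repeat every row N_DAYS times; rows of one video stay adjacent.
expanded = videos.loc[videos.index.repeat(N_DAYS)].reset_index(drop=True)

# Day offsets 0..N_DAYS-1, restarted for each video, added to the start date.
day_offset = np.tile(np.arange(N_DAYS), len(videos))
expanded["day_date"] = expanded["start"] + pd.to_timedelta(day_offset, unit="D")

# Step 3: extract the month and the day that the model expects as inputs.
expanded["Month"] = expanded["day_date"].dt.month
expanded["Day"] = expanded["day_date"].dt.day

print(expanded[["Video_ID", "Month", "Day", "ratio_liked", "accomulated_views"]].head())
```

The like ratio and views columns are copied unchanged into every repeated row, so the expanded frame can be passed to the model and then grouped by Video_ID to sum the predictions.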

numpy

Source: https://stackoverflow.com/questions/75180970/how-to-duplicate-each-row-in-a-dataframe-a-certain-number-of-times-meanwhile-add

1 Answer

kqlmhetl #1

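If I have understood your question correctly, you can group number_of_shares by Video_ID and by the month taken from the date column, and then take a cumulative sum within each group.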

# str[3:] takes the characters after the hyphen; use str[:2] instead if date is formatted MM-DD
df['accumulated_shares'] = df.groupby([df['date'].str[3:], 'Video_ID'])['number_of_shares'].cumsum()
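A small sketch of how this grouping behaves, and how it differs from the per-video total the question asks for. The share counts below are placeholder values invented purely for illustration, not model output, and the sketch assumes the date strings are MM-DD, so the month is taken with str[:2]. The names month and total_per_video are hypothetical.

```python
import pandas as pd

# Placeholder predictions, invented only to show the mechanics.
df = pd.DataFrame({
    "Video_ID": [45, 45, 45, 87, 87],
    "date": ["08-01", "08-02", "08-03", "06-07", "06-08"],
    "number_of_shares": [7, 8, 12, 3, 4],
})

# Assumption: dates are MM-DD, so the first two characters are the month.
month = df["date"].str[:2].rename("month")

# Running total within each (video, month) group, as the answer suggests.
df["accumulated_shares"] = (
    df.groupby(["Video_ID", month])["number_of_shares"].cumsum()
)

# One total per video, which is what step 5 of the question describes.
total_per_video = df.groupby("Video_ID")["number_of_shares"].sum()

print(df)
print(total_per_video)
```

Note that a 30-day window can cross a month boundary, in which case grouping by month splits one video's running total into two groups; grouping by Video_ID alone avoids that.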

2023-01-20
