How do I solve this Python Spotify recommendation problem?

fhg3lkii posted on 2022-10-22 in Python

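I'm trying to build my recommendation list using this dataset.
I want to group the songs by artists_upd, id, and artists. The original looks like this, and I want mine to look like this, but it fails with KeyError: 'artists' and I don't understand why.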

import pandas as pd
import numpy as np
import json
import re 
import sys
import itertools

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
from spotipy.oauth2 import SpotifyOAuth
import spotipy.util as util

import warnings
warnings.filterwarnings("ignore")
%matplotlib inline

from IPython.display import display, HTML
display(HTML("<style>.container { width:90% !important; }</style>"))

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

spotify_df = pd.read_csv('/content/drive/MyDrive/tracksong/tracks.csv')
spotify_df.head()

data_w_genre = pd.read_csv('/content/drive/MyDrive/tracksong/artists.csv')
data_w_genre.head()

spotify_df['artists_upd_v1'] = spotify_df['artists'].apply(lambda x: re.findall(r"'([^']*)'", x))
spotify_df['artists_upd_v1'].values[0][0]

spotify_df[spotify_df['artists_upd_v1'].apply(lambda x: not x)].head(5)

spotify_df['artists_upd_v2'] = spotify_df['artists'].apply(lambda x: re.findall('\"(.*?)\"',x))
spotify_df['artists_upd'] = np.where(spotify_df['artists_upd_v1'].apply(lambda x: not x), spotify_df['artists_upd_v2'], spotify_df['artists_upd_v1'] )

spotify_df['artists_song'] = spotify_df.apply(lambda row: row['artists_upd'][0]+str(row['name']),axis = 1)

# original code is -> spotify_df['artists_song'] = spotify_df.apply(lambda row: row['artists_upd'][0]+row['name'],axis = 1)

spotify_df.sort_values(['artists_song','release_date'], ascending = False, inplace = True)

spotify_df[spotify_df['name']=='Adore You']

spotify_df.drop_duplicates('artists_song',inplace = True)

spotify_df[spotify_df['name']=='Adore You']

artists_exploded = spotify_df[['artists_upd','id']].explode('artists_upd')

artists_exploded_enriched = artists_exploded.merge(data_w_genre, how = 'left', left_on = 'artists_upd',right_on = 'artists')

KeyError: 'artists'
This is my code (the last line), and this is the original code (line 25).

f8rj6qna #1

The problem is that you are trying to merge on the artists column of data_w_genre (via the argument right_on = 'artists'), but data_w_genre has no column with that name.

artists_exploded_enriched = artists_exploded.merge(data_w_genre, how = 'left', left_on = 'artists_upd',right_on = 'artists')
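
The failure can be reproduced without the real CSV files. The sketch below uses two small made-up frames: the column names follow the question and the dtypes output in this answer, but every value is hypothetical. It shows only that merge raises KeyError when the key named in right_on is absent from the right-hand frame.

```python
import pandas as pd

# Toy stand-ins for the frames in the question (all values are made up).
artists_exploded = pd.DataFrame({
    "artists_upd": ["Artist A", "Artist B"],
    "id": ["t1", "t2"],
})

# Same column layout as the asker's artists.csv: there is no 'artists' column.
data_w_genre = pd.DataFrame({
    "id": ["a1", "a2"],
    "followers": [10.0, 20.0],
    "genres": ["['pop']", "[]"],
    "name": ["Artist A", "Artist B"],
    "popularity": [50.0, 40.0],
})

try:
    artists_exploded.merge(
        data_w_genre, how="left", left_on="artists_upd", right_on="artists"
    )
    merge_error = None
except KeyError as exc:
    # pandas cannot find the right-hand key and reports its name.
    merge_error = exc

print(repr(merge_error))
```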

In the original code, the CSV file that data_w_genre is loaded from does have an artists column:

data_w_genre.dtypes

Output in cell 9:

artists              object
acousticness        float64
danceability        float64
duration_ms         float64
energy              float64
instrumentalness    float64
liveness            float64
loudness            float64
speechiness         float64
tempo               float64
valence             float64
popularity          float64
key                   int64
mode                  int64
count                 int64
genres               object
dtype: object

But the same line in cell 9 of your code outputs:

id             object
followers     float64
genres         object
name           object
popularity    float64
dtype: object
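
A mismatch like this can be caught before calling merge by checking the frame for the columns the later code relies on. The helper below is a hypothetical sketch (missing_columns is not part of pandas), and the empty frame only mimics the column layout shown above.

```python
from typing import List

import pandas as pd


def missing_columns(df: pd.DataFrame, required: List[str]) -> List[str]:
    """Return the names in `required` that are not columns of `df`."""
    return [col for col in required if col not in df.columns]


# Empty frame with the same columns as the asker's data_w_genre.
data_w_genre = pd.DataFrame(
    columns=["id", "followers", "genres", "name", "popularity"]
)

# The merge in the question needs 'artists'; 'genres' is checked as well
# to show that columns that are present are not reported.
missing = missing_columns(data_w_genre, ["artists", "genres"])
print(missing)
```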

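Note that artists is missing. Without an artists column in data_w_genre, you cannot reproduce that line from the original code. I would look into where the original author got 'data_w_genre.csv' and try to obtain or create a file with the same format.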
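
If a file in the original format cannot be found, one possible workaround is to merge on the name column that your file does have. This is only a sketch under assumptions: that name holds the artist name in the same spelling as artists_upd, and that later steps need no more than genres from this frame. The original file also carries audio-feature columns that your file lacks. The data is made up and artist_id is a hypothetical column name.

```python
import pandas as pd

# Toy stand-ins for the frames in the question (all values are made up).
artists_exploded = pd.DataFrame({
    "artists_upd": ["Artist A", "Artist B", "Artist C"],
    "id": ["t1", "t1", "t2"],
})
data_w_genre = pd.DataFrame({
    "id": ["a1", "a2"],
    "followers": [10.0, 20.0],
    "genres": ["['pop']", "[]"],
    "name": ["Artist A", "Artist B"],
    "popularity": [50.0, 40.0],
})

# Assumption: 'name' is the artist name. Rename it to 'artists' so the merge
# line from the question works unchanged, and rename the artist 'id' so it
# does not clash with the track 'id' on the left-hand side.
artist_info = data_w_genre.rename(columns={"name": "artists", "id": "artist_id"})

artists_exploded_enriched = artists_exploded.merge(
    artist_info, how="left", left_on="artists_upd", right_on="artists"
)
print(artists_exploded_enriched)
```

Artist names are not guaranteed to be unique, so a merge on name can attach the wrong genres or duplicate rows. It is worth comparing the row counts before and after the merge.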
