How to extract full text from a list with FuzzyWuzzy?

Posted by yduiuuwa on 2021-08-20 in Java


Here is my code:

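from fuzzywuzzy import fuzz

# Read the candidate lines; mode "r" is needed to read ("a" opens the file for appending only)
with open("text.txt", "r") as check:
    possible_words = [line.strip() for line in check]

MIN_MATCH_SCORE = 30
heard_word = 'i5-1135G7 '
# Keep every line whose similarity to heard_word reaches the threshold
guessed_word = [word for word in possible_words
                if fuzz.ratio(heard_word, word) >= MIN_MATCH_SCORE]
print('this one - ', guessed_word)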

Expected output:

11th Generation Intel® Core™ i5-1135G7 Processor

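Is it possible to get the whole sentence of the expected output by giving only "i5-1135G7"? Is there any other solution that would achieve this? Thanks in advance.
Here is the link to text.txt: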
https://drive.google.com/file/d/1mo3qfmeoaqa3wppyg8spefvsjdx7aqbj/view

python machine-learning fuzzywuzzy nlp nltk

Source: https://stackoverflow.com/questions/68327690/how-to-extract-full-text-from-a-list-with-fuzzywuzzy

2 answers


l7wslrjt1#

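To offset the longer sentences and make sure the overlap happens at the word level, you should use token_set_ratio. Also, if you want full-word overlap, raise MIN_MATCH_SCORE to close to 100.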

from fuzzywuzzy import fuzz
MIN_MATCH_SCORE = 90
heard_word = 'i5-1135G7'
possible_words = ['11th Generation Intel® Core™ i5-1135G7 Processor (2.40 GHz,up to  4.20 GHz with Turbo Boost, 4 Cores, 8 Threads, 8 MB Cache)', 
                   'windows 10 64 bit', 'intel i7']
print ([word for word in possible_words 
        if fuzz.token_set_ratio(heard_word, word) >= MIN_MATCH_SCORE])

Output:

['11th Generation Intel® Core™ i5-1135G7 Processor (2.40 GHz,up to  4.20 GHz with Turbo Boost, 4 Cores, 8 Threads, 8 MB Cache)']
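For anyone wondering why token_set_ratio copes with the long sentence when the plain ratio does not, here is a rough sketch of the idea using only the standard library. It is an approximation for illustration, not FuzzyWuzzy's actual code; the names simple_ratio, tokens and token_set_sketch are made up for this example.

```python
import re
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """Plain character-level similarity on a 0-100 scale."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def tokens(text: str) -> set:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def token_set_sketch(a: str, b: str) -> int:
    """Approximate the token-set idea (hypothetical helper, not the library's code)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0
    common = " ".join(sorted(ta & tb))   # tokens shared by both strings
    only_a = " ".join(sorted(ta - tb))   # tokens found only in a
    only_b = " ".join(sorted(tb - ta))   # tokens found only in b
    combined_a = (common + " " + only_a).strip()
    combined_b = (common + " " + only_b).strip()
    # If one string's tokens are a subset of the other's, the first or
    # second comparison is between identical strings and scores 100.
    return max(
        simple_ratio(common, combined_a),
        simple_ratio(common, combined_b),
        simple_ratio(combined_a, combined_b),
    )


query = "i5-1135G7"
sentence = "11th Generation Intel® Core™ i5-1135G7 Processor"
print("plain ratio:", simple_ratio(query, sentence))
print("token set sketch:", token_set_sketch(query, sentence))
```

The plain ratio stays low because the extra words in the sentence count against the short query, while the token-set comparison reaches 100 as soon as every token of the query appears in the sentence.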

2021-08-20

nhhxz33t2#

token_set_ratio works fine!

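from fuzzywuzzy import fuzz

# Note: df1 and g are not defined in this snippet; they come from the poster's own code
s = []
for l in df1.values:
    l = ', '.join(l)
    s.append(l)
s = ', '.join(s)
main = [x for x in g if x]  # drop empty entries
MIN_MATCH_SCORE = 60
heard_word = 'i5-11th gen'
# Keep every entry whose token-set similarity reaches the threshold
guessed_word = [word for word in main
                if fuzz.token_set_ratio(heard_word, word) >= MIN_MATCH_SCORE]
print('this one - ', guessed_word)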


2021-08-20
