Bug Description
When the retriever finds no nodes, evaluating with RetrieverEvaluator fails with the error "Retrieved ids and expected ids must be provided".
Version
0.10.43
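Steps to Reproduce
Run RetrieverEvaluator on a query for which the retriever returns no nodes. The evaluation fails with the error "Retrieved ids and expected ids must be provided".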
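The failure can be reproduced without building an index. The sketch below is a stand-in for the check that raises, not the library's own code: the function name `original_hit_rate` is hypothetical, and only the condition and the error message are taken from this report.

```python
from typing import List, Optional


def original_hit_rate(
    expected_ids: Optional[List[str]] = None,
    retrieved_ids: Optional[List[str]] = None,
) -> float:
    """Hypothetical stand-in for the check that raises; not library code."""
    if (
        retrieved_ids is None
        or expected_ids is None
        or not retrieved_ids
        or not expected_ids
    ):
        raise ValueError("Retrieved ids and expected ids must be provided")
    return 1.0 if any(doc_id in expected_ids for doc_id in retrieved_ids) else 0.0


# A retriever that finds no nodes yields an empty list of retrieved ids,
# which is enough to trigger the error.
try:
    original_hit_rate(expected_ids=["node-1"], retrieved_ids=[])
except ValueError as err:
    print(f"ValueError: {err}")
```

Note that the empty list, not only `None`, trips the check, because `not retrieved_ids` is true for `[]`.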
Relevant Logs/Tracebacks
Proposed fix: change the `HitRate` class to:
from typing import List, Optional

from llama_index.core.evaluation.retrieval.metrics_base import (
    BaseRetrievalMetric,
    RetrievalMetricResult,
)


class HitRate(BaseRetrievalMetric):
    """Hit rate metric: Compute hit rate with two calculation options.

    - The default method checks for a single match between any of the retrieved docs and expected docs.
    - The more granular method checks for all potential matches between retrieved docs and expected docs.

    Attributes:
        use_granular_hit_rate (bool): Determines whether to use the granular method for calculation.
        metric_name (str): The name of the metric.
    """

    metric_name: str = "hit_rate"
    use_granular_hit_rate: bool = False

    def compute(
        self,
        query: Optional[str] = None,
        expected_ids: Optional[List[str]] = None,
        retrieved_ids: Optional[List[str]] = None,
        expected_texts: Optional[List[str]] = None,
        retrieved_texts: Optional[List[str]] = None,
    ) -> RetrievalMetricResult:
        """Compute metric based on the provided inputs.

        Parameters:
            query (Optional[str]): The query string (not used in the current implementation).
            expected_ids (Optional[List[str]]): Expected document IDs.
            retrieved_ids (Optional[List[str]]): Retrieved document IDs.
            expected_texts (Optional[List[str]]): Expected texts (not used in the current implementation).
            retrieved_texts (Optional[List[str]]): Retrieved texts (not used in the current implementation).

        Returns:
            RetrievalMetricResult: The result with the computed hit rate score,
                or a score of 0.0 if either ID list is missing or empty.
        """
        # Missing or empty ID lists count as a miss instead of raising.
        if (
            retrieved_ids is None
            or expected_ids is None
            or not retrieved_ids
            or not expected_ids
        ):
            # Previously: raise ValueError("Retrieved ids and expected ids must be provided")
            # Return a result object (not a bare float) to match the return type.
            return RetrievalMetricResult(score=0.0)

        if self.use_granular_hit_rate:
            # Granular HitRate calculation: count all hits and divide by the number of expected docs
            expected_set = set(expected_ids)
            hits = sum(1 for doc_id in retrieved_ids if doc_id in expected_set)
            score = hits / len(expected_ids) if expected_ids else 0.0
        else:
            # Default HitRate calculation: check if there is at least one hit
            is_hit = any(doc_id in expected_ids for doc_id in retrieved_ids)
            score = 1.0 if is_hit else 0.0

        return RetrievalMetricResult(score=score)
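To make the two calculation modes concrete, here is a minimal standalone sketch. The function `hit_rate` and its `granular` flag are hypothetical names used for illustration; they mirror the logic of `compute` above but do not depend on the library.

```python
from typing import List


def hit_rate(
    expected_ids: List[str],
    retrieved_ids: List[str],
    granular: bool = False,
) -> float:
    """Hypothetical standalone version of the two calculations."""
    # Missing or empty inputs are scored as a miss.
    if not expected_ids or not retrieved_ids:
        return 0.0
    if granular:
        # Count every retrieved id that is expected, relative to the expected total.
        expected_set = set(expected_ids)
        hits = sum(1 for doc_id in retrieved_ids if doc_id in expected_set)
        return hits / len(expected_ids)
    # Default: a single match is enough.
    return 1.0 if any(doc_id in expected_ids for doc_id in retrieved_ids) else 0.0


expected = ["a", "b", "c", "d"]
retrieved = ["a", "c", "x"]

print(hit_rate(expected, retrieved))                 # 1.0 (at least one match)
print(hit_rate(expected, retrieved, granular=True))  # 0.5 (2 of 4 expected ids found)
print(hit_rate(expected, []))                        # 0.0 (nothing retrieved)
```

One property worth knowing: the granular sum iterates over `retrieved_ids`, so a duplicated retrieved id is counted once per occurrence and the score can exceed 1.0 if the retriever returns duplicates.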
1 Answer
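To fix the error that `RetrieverEvaluator` raises when no nodes are retrieved, you can modify the `HitRate` class to handle the case where `retrieved_ids` or `expected_ids` is not provided or is empty. The evaluation then returns a score of 0.0 instead of raising an error. Here is the updated `HitRate` class:
class HitRate(BaseRetrievalMetric):
    """Hit rate metric: Compute hit rate with two calculation options.

    Attributes:
        use_granular_hit_rate (bool): Determines whether to use the granular method for calculation.
        metric_name (str): The name of the metric.
    """

    metric_name: str = "hit_rate"
    use_granular_hit_rate: bool = False

    def compute(
        self,
        query: Optional[str] = None,
        expected_ids: Optional[List[str]] = None,
        retrieved_ids: Optional[List[str]] = None,
        expected_texts: Optional[List[str]] = None,
        retrieved_texts: Optional[List[str]] = None,
    ) -> RetrievalMetricResult:
        """Compute metric based on the provided inputs.

        Parameters:
            query (Optional[str]): The query string (not used in the current implementation).
            expected_ids (Optional[List[str]]): Expected document IDs.
            retrieved_ids (Optional[List[str]]): Retrieved document IDs.
            expected_texts (Optional[List[str]]): Expected texts (not used in the current implementation).
            retrieved_texts (Optional[List[str]]): Retrieved texts (not used in the current implementation).

        Returns:
            RetrievalMetricResult: The result with the computed hit rate score.
        """
        # Checking for the required arguments
        if (
            retrieved_ids is None
            or expected_ids is None
            or not retrieved_ids
            or not expected_ids
        ):
            return RetrievalMetricResult(score=0.0)

        # The hit rate calculation that follows is unchanged from the version in the question.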
With this change, if `expected_ids` or `retrieved_ids` is `None` or an empty list, the method returns a score of 0.0 instead of raising an error.
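The practical effect shows up when scores are averaged over many queries: a query with no retrieved nodes now counts as a miss rather than aborting the run. The sketch below illustrates this under stated assumptions: `Result` is a hypothetical stand-in for `RetrievalMetricResult` with only a `score` field, and `safe_hit_rate` is a hypothetical helper that applies the default (non-granular) calculation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Result:
    """Hypothetical stand-in for RetrievalMetricResult (score field only)."""

    score: float


def safe_hit_rate(
    expected_ids: Optional[List[str]],
    retrieved_ids: Optional[List[str]],
) -> Result:
    # `not x` is true for both None and an empty list, so this guard
    # covers every case listed in the patched condition.
    if not expected_ids or not retrieved_ids:
        return Result(score=0.0)
    is_hit = any(doc_id in expected_ids for doc_id in retrieved_ids)
    return Result(score=1.0 if is_hit else 0.0)


runs = [
    (["n1"], ["n1", "n7"]),  # hit
    (["n2"], ["n9"]),        # miss
    (["n3"], []),            # retriever found nothing
    (["n4"], None),          # ids not provided at all
]

scores = [safe_hit_rate(expected, retrieved).score for expected, retrieved in runs]
print(scores)        # [1.0, 0.0, 0.0, 0.0]
print(mean(scores))  # 0.25
```

A trade-off to keep in mind: scoring an empty retrieval as 0.0 makes it indistinguishable from an ordinary miss in the aggregate, so if you need to tell the two apart you would have to track empty results separately.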