What is the problem?
Follow-up on Llama 3
Although the output should be deterministic and reproducible with a fixed seed, temperature set to 0, and a fixed num_ctx, the generated Llama 3 output differs slightly between the first and the second execution of this code (without restarting the kernel). All executions after the second are identical to the second one:
Code snippet taken from LLMs from scratch - Evaluation with Ollama:
import urllib.request
import json


def query_model(prompt, model="llama3", url="http://localhost:11434/api/chat"):
    # Create the data payload as a dictionary
    data = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": prompt
            }
        ],
        "options": {
            "seed": 123,
            "temperature": 0,
            "num_ctx": 2048  # must be set, otherwise slightly random output
        }
    }

    # Convert the dictionary to a JSON formatted string and encode it to bytes
    payload = json.dumps(data).encode("utf-8")

    # Create a request object, setting the method to POST and adding necessary headers
    request = urllib.request.Request(url, data=payload, method="POST")
    request.add_header("Content-Type", "application/json")

    # Send the request and capture the response
    response_data = ""
    with urllib.request.urlopen(request) as response:
        # Read and decode the response
        while True:
            line = response.readline().decode("utf-8")
            if not line:
                break
            response_json = json.loads(line)
            response_data += response_json["message"]["content"]

    return response_data


result = query_model("What do Llamas eat?")
print(result)
Output of the 1st execution (your output may differ):
Llamas are herbivores, which means they primarily feed on plant-based foods. Their diet typically consists of:
1. Grasses: Llamas love to graze on various types of grasses, including tall grasses, short grasses, and even weeds.
2. Hay: High-quality hay, such as alfalfa or timothy hay, is a staple in a llama's diet. They enjoy munching on hay as a snack or as a main meal.
3. Grains: Llamas may be fed grains like oats, barley, or corn as an occasional treat or to supplement their diet.
4. Fruits and vegetables: Fresh fruits and veggies, such as apples, carrots, and sweet potatoes, can be given as treats or added to their meals for variety.
5. Leaves and shrubs: Llamas will also eat leaves from trees and shrubs, like willow or cedar.
In the wild, llamas might eat:
* Various grasses and plants
* Leaves from trees and shrubs
* Fruits and berries
* Bark (in some cases)
Domesticated llamas, on the other hand, typically receive a diet that includes:
* Hay as their main staple
* Grains or pellets as a supplement
* Fresh fruits and veggies as treats
It's essential to provide llamas with a balanced diet that meets their nutritional needs. Consult with a veterinarian or an experienced llama breeder to determine the best feeding plan for your llama.
Output of the 2nd through n-th execution (the output should be reproducible):
Llamas are herbivores, which means they primarily feed on plant-based foods. Their diet typically consists of:
1. Grasses: Llamas love to graze on various types of grasses, including tall grasses, short grasses, and even weeds.
2. Hay: High-quality hay, such as alfalfa or timothy hay, is a staple in a llama's diet. They enjoy munching on hay cubes or loose hay.
3. Grains: Llamas may receive grains like oats, barley, or corn as part of their diet. However, these should be given in moderation to avoid digestive issues.
4. Fruits and vegetables: Fresh fruits and veggies can be a tasty treat for llamas. Some favorites include apples, carrots, sweet potatoes, and leafy greens like kale or spinach.
5. Minerals: Llamas need access to mineral supplements, such as salt licks or loose minerals, to ensure they're getting the necessary nutrients.
In the wild, llamas might also eat:
1. Leaves: They'll munch on leaves from trees and shrubs, like willow or cedar.
2. Bark: In some cases, llamas may eat the bark of certain trees, like aspen or birch.
3. Mosses: Llamas have been known to graze on mosses and other non-woody plant material.
It's essential to provide a balanced diet for your llama, taking into account their age, size, and individual needs. Consult with a veterinarian or experienced llama breeder to determine the best feeding plan for your llama.
Observations:
- As you can see, the output of the first execution is somewhat random, while the second and all subsequent executions produce a consistently deterministic output.
- I tried different platforms (Windows, an Ubuntu image in Docker), and the generated output seems to differ between these operating systems: the first execution is always a bit random, but the subsequent outputs are consistent on a given platform. However, the consistent deterministic output this code produces on Windows differs from the one on Ubuntu, for example.
- I tried setting the Python hash seed, which did not fix the issue.
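The reported pattern ("run 1 differs, runs 2..n are identical") is easy to check programmatically by collecting the responses from several calls to query_model and comparing them. A minimal sketch of such a comparison helper (the helper itself needs no server; digest and first_run_differs are hypothetical names introduced here, and the placeholder strings stand in for real model outputs):

```python
import hashlib

def digest(text):
    # Short, stable fingerprint of a response; handy for eyeballing
    # run-to-run drift without printing whole completions.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]

def first_run_differs(responses):
    # True if only the first response deviates, i.e. the pattern reported
    # above: runs 2..n are identical to each other but not to run 1.
    if len(responses) < 3:
        raise ValueError("need at least three runs to detect the pattern")
    tail = responses[1:]
    return all(r == tail[0] for r in tail) and responses[0] != tail[0]

# Placeholder strings standing in for the outputs of repeated calls:
runs = ["output A", "output B", "output B", "output B"]
print([digest(r) for r in runs])
print(first_run_differs(runs))  # True: matches the reported behaviour
```

In practice you would fill `runs` with repeated `query_model("What do Llamas eat?")` calls against a running Ollama server.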
OS: Linux, macOS, Windows, Docker, WSL2
GPU: Nvidia
CPU: AMD
Llama version: 0.1.46
3 Answers

Answer #1:
Your version does include ead259d, so I'm not sure why this happens.
Answer #2:
You can try applying this patch:
The cache_prompt flag was set to true by commit a64570d. From https://github.com/ggerganov/llama.cpp/tree/master/examples/server#api-endpoints, it says: prompt: Provide the prompt for this completion as a string or as an array of strings or numbers representing tokens. If cache_prompt is true, the prompt is internally compared to the previous completion and only the "unseen" suffix is evaluated.
Once this patch is applied, I get exactly the same output whether or not the kernel is restarted, as long as I send the same prompt with the same seed and the same temperature. For example:
First output:
Second output:
I suppose this flag should be made configurable?
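For reference, the cache_prompt option quoted above belongs in the request body of the llama.cpp server's completion endpoint, not in Ollama's options. A hypothetical payload disabling prompt caching might look like this (field names are those from the llama.cpp server README linked above; the exact request is an assumption, shown only to illustrate where the flag goes):

```python
import json

# Hypothetical body for the llama.cpp server's completion endpoint.
# "prompt", "seed", "temperature" and "cache_prompt" are the field names
# from the server README quoted above.
payload = {
    "prompt": "What do Llamas eat?",
    "seed": 123,
    "temperature": 0,
    "cache_prompt": False,  # skip the compare-with-previous-completion path
}
body = json.dumps(payload).encode("utf-8")
print(body.decode("utf-8"))
```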
Answer #3:
Honestly, I don't know whether this fixes the problem. For your outputs you used a different model and a different prompt, and you did not verify it on different operating systems.
The KV cache is actually a useful feature, but it may be initialized differently on different operating systems. So disabling it might work around the problem, but it does not fix the KV-cache initialization issue.
ggerganov/llama.cpp#4902
But you actually gave me an idea: setting num_keep=0 (this does not disable the cache, but at least no tokens are stored in it). I don't know how to install Ollama with your change, neither on Ubuntu nor on Windows, so I will test it once a new Ollama release includes it. Thanks for your PR!
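If one wanted to try that idea with the snippet from the question, num_keep would go into the same "options" dictionary that already carries seed, temperature, and num_ctx. A sketch (whether num_keep=0 actually avoids storing tokens in the cache is exactly what remains to be verified, as noted above):

```python
import json

# The question's options dict, extended with the num_keep idea from above.
# NOTE: the effect of num_keep=0 on caching is unverified; this only shows
# where the option would be placed in the Ollama request.
options = {
    "seed": 123,
    "temperature": 0,
    "num_ctx": 2048,
    "num_keep": 0,  # keep no tokens from the previous context
}
print(json.dumps(options))
```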
By the way, I also opened a PR on llama.cpp to make the output 100% deterministic:
ggerganov/llama.cpp#8265
When temperature=0 is used, a small coefficient may be applied to prevent a division by zero. In some cases this can slightly change the generated output, depending on the model used. It is therefore better to turn off beam search and multinomial sampling to get deterministic sampling. Setting a seed only makes sense when non-deterministic sampling (such as top-k or top-p sampling) is used, to make it reproducible. The Ollama example code here does not entirely make sense: you don't need to set a seed, because with temperature=0 the generated output is already deterministic. But when you set a temperature greater than 0.0, you do need to set a seed to make the output reproducible.