How do I format an OpenAI transcription as JSON with timestamps in Python?

Posted by k4aesqcs on 2023-08-08 in Python
import os

import openai

# Read credentials from the environment instead of hardcoding them in the script.
openai.organization = os.environ["OPENAI_ORGANIZATION"]
openai.api_key = os.environ["OPENAI_API_KEY"]

audio_file_path = "/Users/tejaksha/Downloads/dhoni.mp4"

# Note: you need to be using OpenAI Python v0.27.0 for the code below to work

with open(audio_file_path, "rb") as audio_file:
    transcript = openai.Audio.transcribe("whisper-1", audio_file)

With the code above, I get the following output:

{
    "text": "Flat back, just got a little tight to him, he was wagging for it, set up for the slower ball and punished it. The one's going straight down the ground. And MS Daini just taking control."
}


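However, I would like the output in the following format, with timestamps. How can I get that from an OpenAI transcription?
The format I actually need is: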

{
  "transcript": [
    {
      "text": "[Music]",
      "start": 7.39,
      "duration": 4.1
    },
    {
      "text": "once upon a time",
      "start": 16.48,
      "duration": 4.4
    },
    {
      "text": "in ancient china there lived three",
      "start": 17.6,
      "duration": 6.64
    },
    {
      "text": "old monks their names are not remembered",
      "start": 20.88,
      "duration": 6.559
    }
  ]
}

ix0qys7i (answer #1)

I don't think the OpenAI API supports this. However, you can use the whisper library, which returns timestamps for each segment.

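import whisper

audio_file_path = "/Users/tejaksha/Downloads/dhoni.mp4"  # same file as in the question

model = whisper.load_model("base")  # one of the smaller local checkpoints
audio = whisper.load_audio(audio_file_path)
result = model.transcribe(audio)
# Each segment dict includes "start", "end", and "text".
print(result["segments"])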

This does mean you need your own GPU or PC to run inference.
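
The segments that whisper returns carry "start" and "end" times rather than the "start" and "duration" fields asked for in the question, so a small conversion step is still needed. The following is a minimal sketch of that step, not a definitive implementation: the helper name `segments_to_transcript` is hypothetical, the rounding to three decimals is my own choice, and the sample segments are hand-written illustrative values rather than real model output.

```python
import json


def segments_to_transcript(segments):
    """Convert Whisper-style segments into {"transcript": [{text, start, duration}]}.

    Assumes each segment is a dict with "start", "end", and "text" keys.
    """
    entries = []
    for seg in segments:
        start = float(seg["start"])
        end = float(seg["end"])
        entries.append(
            {
                # Segment text usually has a leading space; strip it for clean output.
                "text": seg["text"].strip(),
                "start": round(start, 3),
                # Whisper reports an end time; the target format wants a duration.
                "duration": round(end - start, 3),
            }
        )
    return {"transcript": entries}


if __name__ == "__main__":
    # Hand-written sample segments (illustrative values, not real model output).
    sample_segments = [
        {"start": 0.0, "end": 4.5, "text": " Flat back, just got a little tight to him,"},
        {"start": 4.5, "end": 9.25, "text": " set up for the slower ball and punished it."},
    ]
    print(json.dumps(segments_to_transcript(sample_segments), indent=2))
```

With the whisper snippet above, you would call `segments_to_transcript(result["segments"])` and pass the result to `json.dumps`. As far as I know, the hosted endpoint can return similarly shaped segments when `response_format="verbose_json"` is passed to `openai.Audio.transcribe`, but nothing in this thread confirms that, so treat it as an assumption to verify.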
