我想用AzureOpenAI和Langchain做一个文档问答程序

sczxawaw  于 2023-11-21  发布在  其他
关注(0)|答案(2)|浏览(252)
  1. llm = AzureOpenAI(openai_api_key=OPENAI_API_KEY, deployment_name=OPENAI_DEPLOYMENT_NAME, model_name=MODEL_NAME)
  2. # Configure the location of the PDF file.
  3. pdfReader = PdfReader('data\borders.pdf')
  4. # Extract the text from the PDF file.
  5. raw_text = ''
  6. for i, page in enumerate(pdfReader.pages):
  7. text = page.extract_text()
  8. if text:
  9. raw_text += text
  10. # Show first 1000 characters of the text.
  11. raw_text[:1000]
  12. # Split the text into chunks of 1000 characters with 200 characters overlap.
  13. text_splitter = CharacterTextSplitter(
  14. separator = "\n",
  15. chunk_size = 1000,
  16. chunk_overlap = 200,
  17. length_function = len,
  18. )
  19. pdfTexts = text_splitter.split_text(raw_text)
  20. # Show how many chunks of text are generated.
  21. len(pdfTexts)
  22. # Pass the text chunks to the Embedding Model from Azure OpenAI API to generate embeddings.
  23. embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY, deployment=OPENAI_EMBEDDING_MODEL_NAME, client="azure", chunk_size=1)
  24. # Use FAISS to index the embeddings. This will allow us to perform a similarity search on the texts using the embeddings.
  25. # https://python.langchain.com/en/latest/modules/indexes/vectorstores/examples/faiss.html
  26. pdfDocSearch = FAISS.from_texts(pdfTexts, embeddings)
  27. # Create a Question Answering chain using the embeddings and the similarity search.
  28. # https://docs.langchain.com/docs/components/chains/index_related_chains
  29. chain = load_qa_chain(llm, chain_type="stuff")
  30. # Perform first sample of question answering.
  31. inquiry = "Who is the author of this book?"
  32. docs = pdfDocSearch.similarity_search(inquiry)
  33. chain.run(input_documents=docs, question=inquiry)

字符串
它给出了这个错误:openai.error.InvalidRequestError:完成操作不适用于指定的模型gpt-4。请选择其他模型并重试。您可以在此处了解有关每个操作可以使用哪些模型的详细信息:https://go.microsoft.com/fwlink/?linkid=2197993

aiqt4smr

aiqt4smr1#

它给出了这个错误:openai.error.InvalidRequestError:完成操作不适用于指定的模型gpt-4。请选择其他模型并重试。您可以在此处了解有关每个操作可以使用哪些模型的详细信息。
当您在配置中传递错误的模型或不正确的部署时,会发生上述错误。
根据这个**Document-1Document-2**,需要**text-davinci-003模型来完成,需要text-embedding-ada-002**模型来嵌入。
当我尝试使用上面的模型时,代码执行并给我输出。

产品代码:

  1. from langchain.llms import AzureOpenAI
  2. from PyPDF2 import PdfReader
  3. from langchain.text_splitter import CharacterTextSplitter
  4. from langchain.embeddings.openai import OpenAIEmbeddings
  5. from langchain.vectorstores.faiss import FAISS
  6. from langchain.chains.question_answering import load_qa_chain
  7. OPENAI_API_KEY="xxxxx"
  8. OPENAI_DEPLOYMENT_NAME="testxxxa" #deployment name with text-embedding-ada-002 model
  9. deployment="textxxx" #deployment name with text-davinci-003 model
  10. openai_api_base1="xxxxxx"
  11. llm = AzureOpenAI(openai_api_key=OPENAI_API_KEY, deployment_name=deployment,openai_api_base=openai_api_base1,openai_api_version="2022-12-01",openai_api_type="azure")
  12. pdfReader = PdfReader('example.pdf')
  13. raw_text = ''
  14. for i, page in enumerate(pdfReader.pages):
  15. text = page.extract_text()
  16. if text:
  17. raw_text += text
  18. raw_text[:1000]
  19. text_splitter = CharacterTextSplitter(
  20. separator = "\n",
  21. chunk_size = 1000,
  22. chunk_overlap = 200,
  23. length_function = len,
  24. )
  25. pdfTexts = text_splitter.split_text(raw_text)
  26. len(pdfTexts)
  27. embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY, deployment=OPENAI_DEPLOYMENT_NAME, openai_api_base=openai_api_base1, openai_api_type="azure", openai_api_version="2022-12-01",chunk_size=1)
  28. pdfDocSearch = FAISS.from_texts(pdfTexts, embeddings)
  29. chain = load_qa_chain(llm, chain_type="stuff")
  30. inquiry = "Which month is specified?"
  31. docs = pdfDocSearch.similarity_search(inquiry)
  32. print(chain.run(input_documents=docs, question=inquiry))

字符串

输出:

  1. September


x1c 0d1x的数据

展开查看全部
ergxz8rk

ergxz8rk2#

在OpenAI中,你必须对文本生成进行主要操作:

  • 第一个月
  • chatCompletion

一些模型可用于完成(例如:GPT3.5版本0301,GPT-4等),其他可用于聊天完成(例如:GPT3.5版本0613,GPT-4等)。
在你的代码中有一些东西是不可见的,那就是langchain将在其步骤load_qa_chain中使用OpenAI和completion操作。
文档:https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability



因此,在您的情况下,您应该在设置llm时传递一个符合completion查询的部署:

  1. llm = AzureOpenAI(openai_api_key=OPENAI_API_KEY, deployment_name=OPENAI_DEPLOYMENT_NAME, model_name=MODEL_NAME)

字符串

展开查看全部

相关问题