How to upload a CSV file in FastAPI and convert it into JSON?

at0kjp5o  posted on 2022-12-06  in Other

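I am trying to upload a .csv file to my FastAPI server, convert it into JSON, and return it to the client. However, when I try to process it directly (without storing it somewhere), I get the following error: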

    Error: FileNotFoundError: [Errno 2] No such file or directory: 'testdata.csv'

This is my FastAPI code:

    async def upload(file: UploadFile = File(...)):
        data = {}
        with open(file.filename, encoding='utf-8') as csvf:
            csvReader = csv.DictReader(csvf)
            for rows in csvReader:
                key = rows['No']
                data[key] = rows
        return {data}
pprl5pva  #1

Below are several options for converting an uploaded .csv file into JSON. The following sample .csv file is used in the examples below.

data.csv

    Id,name,age,height,weight
    1,Alice,20,62,120.6
    2,Freddie,21,74,190.6
    3,Bob,17,68,120.0

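Option 1

The csv.DictReader() method can also accept a file object as its file argument (see this answer for more details). You can access that object through the .file attribute of the UploadFile object. However, since FastAPI/Starlette opens the file in bytes mode, passing it directly to csv.DictReader() raises _csv.Error: iterator should return strings, not bytes. Hence, you can use codecs.iterdecode() (as suggested in this answer), which uses an incremental decoder to iteratively decode the input provided by the iterator (in this case, from bytes to str). Example: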

    from fastapi import FastAPI, File, UploadFile
    import csv
    import codecs

    app = FastAPI()

    @app.post("/upload")
    def upload(file: UploadFile = File(...)):
        csvReader = csv.DictReader(codecs.iterdecode(file.file, 'utf-8'))
        data = {}
        for rows in csvReader:
            key = rows['Id']  # Assuming a column named 'Id' to be the primary key
            data[key] = rows
        file.file.close()
        return data

Output

    {
      "1": {
        "Id": "1",
        "name": "Alice",
        "age": "20",
        "height": "62",
        "weight": "120.6"
      },
      "2": {
        "Id": "2",
        "name": "Freddie",
        "age": "21",
        "height": "74",
        "weight": "190.6"
      },
      "3": {
        "Id": "3",
        "name": "Bob",
        "age": "17",
        "height": "68",
        "weight": "120.0"
      }
    }
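
The bytes-versus-str behaviour described for Option 1 can be reproduced without a server. The sketch below is only an illustration: BytesIO stands in for the binary file.file object, and RAW, read_without_decoding and read_with_iterdecode are made-up names that are not part of FastAPI or of the answer.

```python
import codecs
import csv
from io import BytesIO

# Hypothetical sample payload; BytesIO plays the role of the binary `file.file`.
RAW = b"Id,name,age\n1,Alice,20\n2,Freddie,21\n"


def read_without_decoding(raw: bytes):
    """Hand the binary stream straight to csv.DictReader, which rejects bytes."""
    try:
        return list(csv.DictReader(BytesIO(raw)))
    except csv.Error as exc:
        # The csv module raises an error because the iterator yields bytes.
        return str(exc)


def read_with_iterdecode(raw: bytes):
    """Decode each binary line to str lazily before the csv module sees it."""
    lines = codecs.iterdecode(BytesIO(raw), "utf-8")
    return list(csv.DictReader(lines))


if __name__ == "__main__":
    print(read_without_decoding(RAW))
    print(read_with_iterdecode(RAW))
```

Because codecs.iterdecode() decodes one chunk at a time, the upload does not need to be converted to a single large string first.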

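If you would like to return a list of dictionaries instead, you can use the code below. Because the rows are read in the return statement itself, the endpoint has no point after reading at which it can call file.file.close(); a BackgroundTasks task (which runs after the response is returned) can close the file instead: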

    from fastapi import FastAPI, File, UploadFile, BackgroundTasks
    import csv
    import codecs

    app = FastAPI()

    @app.post("/upload")
    def upload(background_tasks: BackgroundTasks, file: UploadFile = File(...)):
        csvReader = csv.DictReader(codecs.iterdecode(file.file, 'utf-8'))
        background_tasks.add_task(file.file.close)
        return list(csvReader)

Output

    [
      {
        "Id": "1",
        "name": "Alice",
        "age": "20",
        "height": "62",
        "weight": "120.6"
      },
      {
        "Id": "2",
        "name": "Freddie",
        "age": "21",
        "height": "74",
        "weight": "190.6"
      },
      {
        "Id": "3",
        "name": "Bob",
        "age": "17",
        "height": "68",
        "weight": "120.0"
      }
    ]
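
As a possible alternative to the background task, the rows can be materialised inside a try block and the stream closed in finally. This is a minimal sketch under that assumption, not the answer's method; rows_as_list is a hypothetical helper, and inside an endpoint it would presumably be called as rows_as_list(file.file).

```python
import codecs
import csv
from typing import BinaryIO


def rows_as_list(binary_file: BinaryIO, encoding: str = "utf-8") -> list:
    """Read every CSV row into a list of dicts, then close the stream."""
    try:
        reader = csv.DictReader(codecs.iterdecode(binary_file, encoding))
        # list() consumes the reader completely before the function returns,
        # so the stream is no longer needed once this expression is evaluated.
        return list(reader)
    finally:
        # Runs after the list has been built, on success or on error.
        binary_file.close()
```

The finally clause runs only after list(reader) has finished, so the returned rows are complete even though the stream is already closed by the time the caller receives them.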

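Option 2

Another solution is to read the byte data of the uploaded file using contents = file.file.read() (for async reading/writing, see this answer), then convert the bytes into a string, and finally load them into an in-memory text buffer (i.e., StringIO), as mentioned here, which can be passed to csv.DictReader():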

    from fastapi import FastAPI, File, UploadFile
    import csv
    from io import StringIO

    app = FastAPI()

    @app.post("/upload")
    def upload(file: UploadFile = File(...)):
        data = {}
        contents = file.file.read()
        buffer = StringIO(contents.decode('utf-8'))
        csvReader = csv.DictReader(buffer)
        for row in csvReader:
            key = row['Id']  # Assuming a column named 'Id' to be the primary key
            data[key] = row
        buffer.close()
        file.file.close()
        return data
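
One thing to watch in the decode step of Option 2: some spreadsheet exports begin with a UTF-8 byte order mark, which plain 'utf-8' decoding leaves attached to the first header, so row['Id'] would then fail. The sketch below assumes that situation and decodes with 'utf-8-sig' instead; csv_bytes_to_dict and its parameters are hypothetical names, not part of the answer.

```python
import csv
from io import StringIO


def csv_bytes_to_dict(contents: bytes, key_column: str = "Id",
                      encoding: str = "utf-8-sig") -> dict:
    """Decode uploaded bytes and index the CSV rows by `key_column`."""
    # 'utf-8-sig' drops a leading byte order mark if present and otherwise
    # behaves like 'utf-8'.
    text = contents.decode(encoding)
    with StringIO(text, newline="") as buffer:
        reader = csv.DictReader(buffer)
        if key_column not in (reader.fieldnames or []):
            raise ValueError(f"missing key column: {key_column!r}")
        # The dict is fully built before the buffer is closed.
        return {row[key_column]: row for row in reader}
```

In an endpoint this would take the result of file.file.read() as its contents argument.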

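Option 3

To approach the problem in your own way (i.e., reading the csv file through a file path instead of using the file contents or the file-like object directly, as described earlier), you can copy the file contents into a NamedTemporaryFile, which, unlike the SpooledTemporaryFile that UploadFile provides, "has a visible name in the file system" that "can be used to open the file" (again, see this answer for more information). Below is a working example: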

    from fastapi import FastAPI, File, UploadFile
    from tempfile import NamedTemporaryFile
    import os
    import csv

    app = FastAPI()

    @app.post("/upload")
    def upload(file: UploadFile = File(...)):
        data = {}
        temp = NamedTemporaryFile(delete=False)
        try:
            try:
                contents = file.file.read()
                with temp as f:
                    f.write(contents)
            except Exception:
                return {"message": "There was an error uploading the file"}
            finally:
                file.file.close()

            with open(temp.name, 'r', encoding='utf-8') as csvf:
                csvReader = csv.DictReader(csvf)
                for rows in csvReader:
                    key = rows['Id']  # Assuming a column named 'Id' to be the primary key
                    data[key] = rows
        except Exception:
            return {"message": "There was an error processing the file"}
        finally:
            #temp.close()  # the `with` statement above takes care of closing the file
            os.remove(temp.name)  # Delete the file

        return data
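
The path-based round trip in Option 3 can also be tried on its own, without FastAPI. A minimal sketch, assuming the upload's bytes are already in memory; parse_via_named_temp_file is a hypothetical helper name.

```python
import csv
import os
from tempfile import NamedTemporaryFile


def parse_via_named_temp_file(contents: bytes, key_column: str = "Id") -> dict:
    """Write bytes to a named temporary file, then re-open it by its path."""
    # delete=False keeps the file on disk after it is closed, so it can be
    # opened again by name; it is removed explicitly in `finally`.
    temp = NamedTemporaryFile(delete=False, suffix=".csv")
    try:
        with temp as f:
            f.write(contents)
        with open(temp.name, "r", encoding="utf-8", newline="") as csvf:
            return {row[key_column]: row for row in csv.DictReader(csvf)}
    finally:
        os.remove(temp.name)
```

Closing the temporary file before re-opening it by name matters on platforms that do not allow a second open of a file that is still open.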

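Option 4

You could also write the bytes from the uploaded file to a BytesIO stream and then convert it into a Pandas DataFrame. As described in this answer, you can then convert the DataFrame into a dictionary and return it; behind the scenes, FastAPI uses jsonable_encoder to convert it into JSON-compatible data, then serializes the data and returns a JSONResponse (see this answer for details). Alternatively, you can use the to_json() method and return a Response directly, as described in Option 1 (Update 2) of this answer.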

    from fastapi import FastAPI, File, UploadFile
    from io import BytesIO
    import pandas as pd

    app = FastAPI()

    @app.post("/upload")
    def upload(file: UploadFile = File(...)):
        contents = file.file.read()
        buffer = BytesIO(contents)
        df = pd.read_csv(buffer)
        buffer.close()
        file.file.close()
        return df.to_dict(orient='records')
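
The two return styles mentioned for Option 4 can be compared outside a server. The sketch below uses only pandas; CSV_BYTES mirrors the sample file, and csv_bytes_to_records and csv_bytes_to_json are hypothetical helper names. Note that pandas infers numeric column types, whereas csv.DictReader leaves every value as a string.

```python
import json
from io import BytesIO

import pandas as pd

# The sample data.csv content as bytes, as it would arrive in an upload.
CSV_BYTES = (
    b"Id,name,age,height,weight\n"
    b"1,Alice,20,62,120.6\n"
    b"2,Freddie,21,74,190.6\n"
    b"3,Bob,17,68,120.0\n"
)


def csv_bytes_to_records(contents: bytes) -> list:
    """Return the rows as a list of dicts, one dict per CSV row."""
    df = pd.read_csv(BytesIO(contents))
    return df.to_dict(orient="records")


def csv_bytes_to_json(contents: bytes) -> str:
    """Return the rows as a JSON string produced by pandas itself."""
    df = pd.read_csv(BytesIO(contents))
    return df.to_json(orient="records")


if __name__ == "__main__":
    print(csv_bytes_to_records(CSV_BYTES))
    print(json.loads(csv_bytes_to_json(CSV_BYTES)))
```

In an endpoint, the string from csv_bytes_to_json could presumably be returned as Response(content=..., media_type="application/json"), which skips FastAPI's own encoding step.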

Note: If the file is too large and consumes all the memory, and/or processing it or returning the results takes too long, see this answer, this answer, and this answer.

q7solyqu  #2

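You get FileNotFoundError: [Errno 2] No such file or directory: 'testdata.csv' because the file you are trying to read is not stored locally: open(file.filename) looks for that path on the server's disk, but filename is only the name the client sent.
If you want to read the file that way, save the uploaded file before proceeding: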

    import csv
    import os

    from fastapi import File, UploadFile

    async def upload(uploaded_file: UploadFile = File(...)):
        # save csv to local dir
        csv_name = uploaded_file.filename
        csv_path = 'path_to/csv_dir/'
        file_path = os.path.join(csv_path, csv_name)
        with open(file_path, mode='wb+') as f:
            f.write(uploaded_file.file.read())
        # read csv and convert to json
        data = {}
        with open(file_path, mode='r', encoding='utf-8') as csvf:
            csvReader = csv.DictReader(csvf)
            for rows in csvReader:
                key = rows['No']
                data[key] = rows
        return data  # a plain dict; `{data}` would be a set literal and raise TypeError
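
When saving an upload to disk like this, two details may be worth handling: the client-supplied name can contain directory parts, and read() loads the whole file into memory. The sketch below is one way to address both, under the assumption that only the final path component should be kept; save_upload is a hypothetical helper, not a FastAPI function.

```python
import os
import shutil
from typing import BinaryIO


def save_upload(source: BinaryIO, dest_dir: str, client_filename: str) -> str:
    """Copy a binary stream into dest_dir and return the saved path."""
    # Keep only the last path component so a name such as '../x.csv' cannot
    # escape dest_dir. On POSIX only '/' is treated as a separator here.
    safe_name = os.path.basename(client_filename)
    if not safe_name:
        raise ValueError("empty file name")
    os.makedirs(dest_dir, exist_ok=True)
    file_path = os.path.join(dest_dir, safe_name)
    with open(file_path, "wb") as out:
        # copyfileobj copies in chunks instead of reading everything at once.
        shutil.copyfileobj(source, out)
    return file_path
```

Inside the endpoint it would be called with uploaded_file.file and uploaded_file.filename, and the returned path can then be opened in text mode for csv.DictReader.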
t30tvxxf  #3

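The file in the async function upload() is already open, so you can read from it directly without opening it again. Also, in FastAPI, the UploadFile class wraps a tempfile.SpooledTemporaryFile from the standard library (available as file.file), which cannot be accessed by specifying a path to the temporary file.
For example, if you use CPython and read the value of file.file.name inside upload() on a Unix-like system, it returns a number rather than a well-formed path, because any instance of SpooledTemporaryFile creates a file descriptor (at some point, once the stored data exceeds max_size) and simply returns that file descriptor (which should be a number on Unix) when SpooledTemporaryFile.name is accessed.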
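
The lack of a usable path can be observed with the standard library alone. A small sketch, in which spooled_names is a hypothetical helper and the tiny max_size is chosen only to force the spill to disk:

```python
from tempfile import SpooledTemporaryFile


def spooled_names(max_size: int = 16):
    """Return the `name` of a spooled file before and after it spills to disk."""
    with SpooledTemporaryFile(max_size=max_size) as spooled:
        spooled.write(b"x" * (max_size // 2))
        # Still held in memory: there is no underlying file, so no name.
        name_in_memory = spooled.name
        spooled.write(b"x" * (max_size * 2))
        # The data now exceeds max_size, so it has been rolled over to a
        # real temporary file.
        name_on_disk = spooled.name
    return name_in_memory, name_on_disk


if __name__ == "__main__":
    print(spooled_names())
```

On a Unix-like system the second value is expected to be an integer file descriptor rather than a path, which is why open() cannot be pointed at it.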
