Background
I'm running langflow using Google Cloud Run. This works well for my use case which involves very infrequent but bursty (100,000s of tokens per second) requests. It's connected to a postgres database hosted elsewhere.
I've created a solution which works, but it's hacky and inelegant, and I'm hoping to get some suggestions for better approaches.
The issue
My issue is that the docker image as it stands doesn't allow write access to the /app/data/.cache folder by default, which langflow requires to run. The gcs-fuse adaptor isn't POSIX compliant and doesn't provide concurrency control. This means that when more than one worker or more than one instance is running, I get lots of errors like: 'IO Error: Could not set lock on file "/app/data/.cache/"'
My solution
I've created a separate Dockerfile and script which creates a ramdisk mounted at /app/data. When a container instance is started, the ramdisk is created and all writes go to RAM.
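For illustration, a minimal sketch of what such an entrypoint script could look like. The mount size, directory layout, and the langflow start command and port below are assumptions that should be matched to the actual image, and the runtime must permit mounting a tmpfs inside the container:

```sh
#!/bin/sh
# entrypoint.sh -- illustrative sketch: back /app/data with a ramdisk before
# starting langflow. Size, paths, and the start command are assumptions.
set -e

mkdir -p /app/data

# Mount a tmpfs over /app/data so every cache write goes to RAM.
# This only works if the runtime allows the container to perform the mount.
mount -t tmpfs -o size=512m tmpfs /app/data
mkdir -p /app/data/.cache

# Hand off to langflow; Cloud Run supplies the PORT environment variable.
exec langflow run --host 0.0.0.0 --port "${PORT:-7860}"
```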
From what I can see, this solution has the following limitations:
- Any logs and chat histories created in that instance are lost as soon as the container instance shuts down.
- Creating the ramdisk for each instance slows down the start up times.
Any ideas on how I could approach this in a better way?
5 answers
#1
Hi @h-arnold! I'm here to help you with any bugs, questions, or contributions. Let's work through this together.
To manage ephemeral Langflow instances on Google Cloud Run and resolve the /app/data/.cache folder issue, you can use Kubernetes and the Helm chart for a more robust solution. Here are the steps you can follow:

- Instead of using a ramdisk for the /app/data/.cache folder, configure persistent storage.
- Update the values.yaml file to configure an external database.
- Adjust the replicaCount value in the values.yaml file.
- Adjust the resources values in the values.yaml file.

By following these steps, you can improve concurrency management and data persistence for ephemeral Langflow instances running on Google Cloud Run. [1](https://github.com/langflow-ai/langflow/blob/main/docs/docs/deployment/kubernetes.md)
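For illustration only, a sketch of the kind of values.yaml overrides being described. The key names, in particular the external-database block, are assumptions and should be checked against the actual chart's values.yaml before use:

```yaml
# Illustrative values.yaml overrides -- key names are assumptions and must be
# verified against the Langflow Helm chart you deploy.
replicaCount: 2                # run more than one replica for bursty traffic

externalDatabase:              # point Langflow at the existing Postgres instance
  enabled: true
  host: my-postgres.example.com
  port: 5432
  database: langflow
  existingSecret: langflow-db-secret

resources:                     # size each replica for the expected burst load
  requests:
    cpu: "1"
    memory: 2Gi
  limits:
    cpu: "2"
    memory: 4Gi
```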
#2
The monitoring service seems to use its own duckdb instance rather than postgresql, and that appears to be hardcoded. Is that the only write access .cache is needed for?
#3
The monitoring service is the main component that needs write access to the /app/data/.cache folder in Langflow. This can be seen in the MonitorService class, which handles database operations such as adding rows, deleting messages, and updating messages, all of which write to the monitor.duckdb file located in the cache directory.

However, other components also need write access to the cache directory. For example, the utility functions in src/backend/langflow/services/cache/utils.py also perform writes. These include save_binary_file, save_uploaded_file, and clear_old_cache_files, which are responsible for saving files and managing cached files in the cache directory.

To address this more effectively, you can make sure that only the necessary components have write access to that directory, either by setting appropriate filesystem permissions or by configuring the services with the permissions they need. [1](https://github.com/langflow-ai/langflow/blob/dev/src/backend/base/langflow/services/monitor/service.py) [2](https://github.com/logspace-ai/langflow/blob/dev/src/backend/base/langflow/services/monitor/service.py) [3](https://github.com/logspace-ai/langflow/blob/dev/src/backend/langflow/services/cache/utils.py) [4](https://github.com/langflow-ai/langflow/blob/dev/src/backend/base/langflow/services/cache/utils.py) [5](https://github.com/logspace-ai/langflow/blob/dev/src/backend/base/langflow/services/cache/utils.py)
#4
Given the code you have access to, can you find a way to modify it so that all of this is moved into the database? For example, changing which database the MonitorService class uses, or saving binary files to a document database?
#5
To move all of the MonitorService class's write operations out of the /app/data/.cache folder and into a database, you can replace the DuckDB operations with operations against a document database such as MongoDB. Here is an outline of how to modify the MonitorService class to use MongoDB instead of DuckDB:

- Install the pymongo package.
- Modify the MonitorService class to use MongoDB.

This replaces the DuckDB operations with MongoDB operations, ensuring that all writes go to a MongoDB database instead. [1](https://github.com/logspace-ai/langflow/blob/dev/src/backend/base/langflow/services/monitor/service.py) [2](https://github.com/langflow-ai/langflow/blob/dev/src/backend/base/langflow/services/monitor/service.py)
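As a rough illustration of that approach, here is a sketch of a MongoDB-backed monitor service. The class name, method signatures, connection URI, and collection names are assumptions, and the actual MonitorService interface in langflow should be checked before adapting it:

```python
# Illustrative sketch only: a MongoDB-backed replacement for the DuckDB-based
# monitor storage. Method names mirror the operations described above; the
# real MonitorService interface in langflow should be checked before adapting.
# Requires: pip install pymongo
from pymongo import MongoClient


class MongoMonitorService:
    def __init__(self, mongo_uri: str = "mongodb://localhost:27017",
                 db_name: str = "langflow_monitor"):
        # One client per service instance; collections replace the DuckDB tables.
        self.client = MongoClient(mongo_uri)
        self.db = self.client[db_name]

    def add_row(self, table_name: str, data: dict) -> None:
        # Insert a monitoring record into the collection named after the table.
        self.db[table_name].insert_one(data)

    def delete_messages(self, session_id: str) -> None:
        # Remove all messages belonging to a session.
        self.db["messages"].delete_many({"session_id": session_id})

    def update_message(self, message_id, updates: dict) -> None:
        # Apply partial updates to a single message document.
        self.db["messages"].update_one({"_id": message_id}, {"$set": updates})
```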