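I am running the C# code below, which computes optical flow maps and saves them as PNGs during gameplay, using Unity with a VR headset that enforces a 90 FPS cap. Without this code the project runs smoothly at 90 FPS. To run it in the same project and stay consistently above 80 FPS, I had to use WaitForSeconds(0.2f), but ideally I would compute and save an optical flow map for every frame of the game, or at least with a much shorter delay of about 0.01 seconds. I am already using AsyncGPUReadback and WriteAsync.
- Main question: how can I speed up this code further?
- Side question: is there a way to dump the computed optical flow maps as consecutive rows in a CSV file, so that everything goes into one file instead of a separate PNG per map? Or would that be slower?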
using System.Collections;
using UnityEngine;
using System.IO;
using UnityEngine.Rendering;

namespace OpticalFlowAlternative
{
    public class OpticalFlow : MonoBehaviour
    {
        protected enum Pass
        {
            Flow = 0,
            DownSample = 1,
            BlurH = 2,
            BlurV = 3,
            Visualize = 4
        };

        public RenderTexture Flow { get { return resultBuffer; } }

        [SerializeField] protected Material flowMaterial;
        protected RenderTexture prevFrame, flowBuffer, resultBuffer, renderTexture, rt;
        public string customOutputFolderPath = "";
        private string filepathforflow;
        private int imageCount = 0;
        int targetTextureWidth, targetTextureHeight;
        private EyeTrackingV2 eyeTracking;

        protected void Start()
        {
            eyeTracking = GameObject.Find("XR Rig").GetComponent<EyeTrackingV2>();
            targetTextureWidth = Screen.width / 16;
            targetTextureHeight = Screen.height / 16;
            flowMaterial.SetFloat("_Ratio", 1f * Screen.height / Screen.width);
            renderTexture = new RenderTexture(targetTextureWidth, targetTextureHeight, 0);
            rt = new RenderTexture(Screen.width, Screen.height, 0);
            StartCoroutine("StartCapture");
        }

        protected void LateUpdate()
        {
            eyeTracking.flowCount = imageCount;
        }

        protected void OnDestroy()
        {
            if (prevFrame != null)
            {
                prevFrame.Release();
                prevFrame = null;
                flowBuffer.Release();
                flowBuffer = null;
                rt.Release();
                rt = null;
                renderTexture.Release();
                renderTexture = null;
            }
        }

        IEnumerator StartCapture()
        {
            while (true)
            {
                yield return new WaitForSeconds(0.2f);
                ScreenCapture.CaptureScreenshotIntoRenderTexture(rt);
                //compensating for image flip
                Graphics.Blit(rt, renderTexture, new Vector2(1, -1), new Vector2(0, 1));
                if (prevFrame == null)
                {
                    Setup(targetTextureWidth, targetTextureHeight);
                    Graphics.Blit(renderTexture, prevFrame);
                }
                flowMaterial.SetTexture("_PrevTex", prevFrame);
                //calculating motion flow frame here
                Graphics.Blit(renderTexture, flowBuffer, flowMaterial, (int)Pass.Flow);
                Graphics.Blit(renderTexture, prevFrame);
                AsyncGPUReadback.Request(flowBuffer, 0, TextureFormat.ARGB32, OnCompleteReadback);
            }
        }

        void OnCompleteReadback(AsyncGPUReadbackRequest request)
        {
            if (request.hasError)
                return;
            var tex = new Texture2D(targetTextureWidth, targetTextureHeight, TextureFormat.ARGB32, false);
            tex.LoadRawTextureData(request.GetData<uint>());
            tex.Apply();
            WriteTextureAsync(tex);
        }

        async void WriteTextureAsync(Texture2D tex)
        {
            imageCount++;
            filepathforflow = customOutputFolderPath + imageCount + ".png";
            var stream = new FileStream(filepathforflow, FileMode.OpenOrCreate);
            var bytes = tex.EncodeToPNG();
            await stream.WriteAsync(bytes, 0, bytes.Length);
        }

        protected void Setup(int width, int height)
        {
            prevFrame = new RenderTexture(width, height, 0);
            prevFrame.format = RenderTextureFormat.ARGBFloat;
            prevFrame.wrapMode = TextureWrapMode.Repeat;
            prevFrame.Create();
            flowBuffer = new RenderTexture(width, height, 0);
            flowBuffer.format = RenderTextureFormat.ARGBFloat;
            flowBuffer.wrapMode = TextureWrapMode.Repeat;
            flowBuffer.Create();
        }
    }
}
1 Answer
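The first step is to use CommandBuffers. With them you can read the screen back without an extra copy, apply your computation, and store the results in separate buffers (textures). You can then request readbacks of part of a texture, or of textures spread over several frames, without blocking access to the texture currently being computed. Once a readback completes, the best approach is to encode it to PNG/JPG on a separate thread so that the main thread is not blocked.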
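A minimal sketch of the CommandBuffer idea, under several assumptions that are not in the original post: the built-in render pipeline (camera command buffers are not executed this way under URP/HDRP), a non-stereo camera target, and a flow shader whose pass 0 matches the question's `Pass.Flow`. `FlowCapture` is a hypothetical class name. The image flip that the question compensates for may still need handling on your platform.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical component: records the capture, flow pass and readback
// request once, then lets the camera replay them every frame.
[RequireComponent(typeof(Camera))]
public class FlowCapture : MonoBehaviour
{
    [SerializeField] Material flowMaterial; // assumed: same shader as in the question
    RenderTexture current, previous, flow;
    CommandBuffer cmd;
    Camera cam;

    void OnEnable()
    {
        cam = GetComponent<Camera>();
        int w = Screen.width / 16, h = Screen.height / 16;
        current  = new RenderTexture(w, h, 0, RenderTextureFormat.ARGB32);
        previous = new RenderTexture(w, h, 0, RenderTextureFormat.ARGB32);
        flow     = new RenderTexture(w, h, 0, RenderTextureFormat.ARGB32);
        flowMaterial.SetTexture("_PrevTex", previous);

        cmd = new CommandBuffer { name = "Optical flow capture" };
        // Downsample the camera target on the GPU instead of taking a screenshot.
        cmd.Blit(BuiltinRenderTextureType.CameraTarget, current);
        cmd.Blit(current, flow, flowMaterial, 0); // 0 = Pass.Flow in the question
        cmd.Blit(current, previous);              // becomes _PrevTex for the next frame
        // Queue the readback in the same command stream; it completes a few frames later.
        cmd.RequestAsyncReadback(flow, OnReadback);
        cam.AddCommandBuffer(CameraEvent.AfterEverything, cmd);
    }

    void OnReadback(AsyncGPUReadbackRequest request)
    {
        if (request.hasError) return;
        // The native buffer is short-lived, so copy it to managed memory here.
        byte[] pixels = request.GetData<byte>().ToArray();
        // Hand `pixels` to a worker thread for encoding and writing (not shown).
    }

    void OnDisable()
    {
        cam.RemoveCommandBuffer(CameraEvent.AfterEverything, cmd);
        cmd.Release();
        current.Release();
        previous.Release();
        flow.Release();
    }
}
```

On the very first frame `previous` is still empty, so the first flow map is meaningless and can be discarded.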
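If you are on DX11/desktop, you can also configure the D3D buffers for fast CPU readback and map them every frame, which avoids the few frames of latency that asynchronous readback introduces.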
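Creating a texture from the buffer is another waste of performance. The readback already gives you the pixel values, so you can use a general-purpose PNG encoder and save from multiple threads (texture creation, by contrast, is only allowed on the main thread).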
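One way to keep encoding and disk writes off the main thread is a single background worker fed by a bounded queue. The sketch below is plain C# with no Unity dependency; `FrameWriter`, its parameters, and the `encode` delegate are hypothetical names introduced for illustration. The delegate stands in for whichever encoder you pick. Unity's `ImageConversion.EncodeArrayToPNG` is documented as thread safe in recent versions, but check the documentation for the version you use.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// Hypothetical helper: encodes and writes frames on one background thread.
public sealed class FrameWriter : IDisposable
{
    readonly BlockingCollection<(int index, byte[] pixels)> queue =
        new BlockingCollection<(int index, byte[] pixels)>(64); // bounded to cap memory use
    readonly Thread worker;
    readonly string folder;
    readonly string extension;
    readonly Func<byte[], byte[]> encode;

    public FrameWriter(string folder, Func<byte[], byte[]> encode, string extension = ".png")
    {
        this.folder = folder;
        this.encode = encode;
        this.extension = extension;
        Directory.CreateDirectory(folder);
        worker = new Thread(Run) { IsBackground = true };
        worker.Start();
    }

    // Call from the readback callback. Returns false (frame dropped)
    // when the queue is full, i.e. the disk cannot keep up.
    public bool Enqueue(int index, byte[] pixels) => queue.TryAdd((index, pixels));

    void Run()
    {
        // Blocks until items arrive; ends after CompleteAdding once the queue drains.
        foreach (var (index, pixels) in queue.GetConsumingEnumerable())
            File.WriteAllBytes(Path.Combine(folder, index + extension), encode(pixels));
    }

    public void Dispose()
    {
        queue.CompleteAdding();
        worker.Join(); // flush everything still queued
        queue.Dispose();
    }
}
```

In the readback callback this would be used roughly as `writer.Enqueue(frameIndex, request.GetData<byte>().ToArray())`, with no `Texture2D` involved.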
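If the latency is acceptable but you need an exact mapping from frame number to image, you can also encode the frame number into the target image, so it is always available before you save to PNG.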
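As an illustration of stamping the frame number into the image, the sketch below overwrites the first pixel of a 4-bytes-per-pixel buffer with a 32-bit frame index and reads it back. `FrameStamp` is a hypothetical helper, and the choice of the first pixel is an assumption. The index should be captured when the readback is requested (for example from `Time.frameCount`), not when it completes, and the stamp only survives lossless formats such as PNG.

```csharp
using System;

// Hypothetical helper: stores a frame index in the first 4-byte pixel.
public static class FrameStamp
{
    // Overwrites bytes 0..3 (one pixel of an 8-bit, 4-channel image), big-endian.
    public static void Write(byte[] pixels, int frameIndex)
    {
        if (pixels.Length < 4)
            throw new ArgumentException("Need at least one 4-byte pixel.", nameof(pixels));
        pixels[0] = (byte)(frameIndex >> 24);
        pixels[1] = (byte)(frameIndex >> 16);
        pixels[2] = (byte)(frameIndex >> 8);
        pixels[3] = (byte)frameIndex;
    }

    // Recovers the index written by Write.
    public static int Read(byte[] pixels) =>
        (pixels[0] << 24) | (pixels[1] << 16) | (pixels[2] << 8) | pixels[3];
}
```

This sacrifices one pixel of flow data per image; carrying the index alongside the pixel array, as in a queue item, avoids that if the data never leaves your own pipeline.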
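As for the side question, CSV may be faster than the default PNG encoding, because PNG uses zip-like (DEFLATE) compression internally, whereas CSV is just a bunch of numbers formatted into a string.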
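A sketch of the single-file CSV variant, assuming each flow map has already been flattened into a `float[]` (for example the x and y components of every pixel, interleaved). `FlowCsvWriter` is a hypothetical name. It writes one row per frame, with the frame index in the first column, and is meant to be called from a worker thread rather than the main thread. Whether it actually beats PNG in your project is something to measure, since the text output can be considerably larger than a compressed image.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical helper: appends one CSV row per optical flow frame to a single file.
public sealed class FlowCsvWriter : IDisposable
{
    readonly StreamWriter writer;
    readonly StringBuilder line = new StringBuilder();

    public FlowCsvWriter(string path)
    {
        // A large buffer keeps the number of actual disk writes low.
        writer = new StreamWriter(path, false, Encoding.ASCII, 1 << 16);
    }

    // Row layout: frameIndex,v0,v1,...,vN
    public void WriteFrame(int frameIndex, float[] values)
    {
        line.Clear();
        line.Append(frameIndex.ToString(CultureInfo.InvariantCulture));
        foreach (float v in values)
        {
            // Invariant culture so the decimal separator is always '.'.
            line.Append(',').Append(v.ToString("R", CultureInfo.InvariantCulture));
        }
        writer.WriteLine(line.ToString());
    }

    public void Dispose() => writer.Dispose();
}
```

The writer is not thread safe, so all calls to `WriteFrame` should come from the same thread.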