Is there any way to make this ESP32 code faster? I'm out of ideas at this point

Posted by 8aqjt8rx on 2023-10-20 in Other


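I'm working on an ESP32-based hard disk emulator for a vintage computer (the Apple Lisa), and I've been having performance problems with my code. The emulator works fine on one model of Lisa, but another Lisa model performs its hard disk operations much faster, and the emulator just can't keep up. The part of the code that is too slow is shown below: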

#define busOffset 12 // The parallel bus starts on ESP32 pin 12.
#define CMDPin 21 // CMD is on ESP32 pin 21.
#define STRBPin 25 // STRB is on ESP32 pin 25.

uint32_t IRAM_ATTR prevState = 1; // Variables used for falling-edge detection on the strobe signal.
uint32_t IRAM_ATTR currentState = 1;
uint8_t blockData[532]; // The 532-byte hard disk block that we're currently working with.
volatile uint32_t IRAM_ATTR continueLoop = true;
byte *pointer;

void IRAM_ATTR exitLoopISR(){ // An ISR that clears the continueLoop flag when it's time to stop the data transfer.
  continueLoop = false;
}

void emulatorRead(){
  pointer = blockData; // Make our pointer point to the start of the blockData array.

  REG_WRITE(GPIO_OUT_W1TS_REG, *pointer << busOffset); // Put the first data byte on the bus and increment the value of the pointer.
  REG_WRITE(GPIO_OUT_W1TC_REG, ((byte)~*pointer++ << busOffset)); 

  attachInterrupt(CMDPin, exitLoopISR, FALLING); // Attach a falling-edge interrupt to CMD so that whenever the host computer lowers CMD, exitLoopISR will be called and will make us exit the data transfer loop.

  while(continueLoop){ // Keep transferring data until the host computer lowers CMD.
    currentState = bitRead(REG_READ(GPIO_IN_REG), STRBPin); // Check the current state of the data strobe.
    if(currentState == 0 and prevState == 1){ // If we're on the falling edge of the strobe, it's time to put the next data byte on the bus.
      REG_WRITE(GPIO_OUT_W1TS_REG, *pointer << busOffset); // So go ahead and put it on the bus. And then increment our blockData pointer to the next byte.
      REG_WRITE(GPIO_OUT_W1TC_REG, ((byte)~*pointer++ << busOffset));
    }
    prevState = currentState; // Set the previous strobe state to the current strobe state to prepare for the next iteration of the loop.
  }

  detachInterrupt(CMDPin); // Detach the CMD pin interrupt now that we're done.
}

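I've included the declarations of the variables used in the emulatorRead function, the ISR used by the data transfer loop, and the snippet of emulatorRead that's giving me problems. Basically, it's supposed to grab the next byte from the 532-byte blockData array and put it on the parallel bus each time it detects a falling edge on the data strobe. It should keep doing this until the CMD pin goes low (that's what the ISR is for). And yes, I know that if the emulator receives more than 532 strobe pulses before CMD goes low, we'll run past the end of the blockData array, but that's something I can worry about once I get the darn thing working in the first place! The problem is that the strobe is just too fast on this model of Lisa, and the emulator misses a couple of strobe pulses. From my testing, it usually only misses one or two out of the 532, but that's obviously enough to mess everything up!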
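For reference, the behaviour described above can be modelled off-target in plain C++. This is only a sketch under stated assumptions: `BusModel`, `setMask`, `clearMask`, and `replayStrobe` are hypothetical names that are not part of the emulator, the strobe is replaced by a list of pre-sampled levels, and the GPIO output register is replaced by an ordinary variable. It also adds the bounds guard mentioned above, so a 533rd falling edge is ignored instead of reading past the end of the block.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned kBusOffset = 12;       // same value as busOffset in the question
constexpr std::size_t kBlockSize = 532;   // size of blockData in the question

// Value that would be written to GPIO_OUT_W1TS_REG: the 1 bits of the byte.
constexpr uint32_t setMask(uint8_t b) {
    return static_cast<uint32_t>(b) << kBusOffset;
}

// Value that would be written to GPIO_OUT_W1TC_REG: the 0 bits of the byte.
constexpr uint32_t clearMask(uint8_t b) {
    return static_cast<uint32_t>(static_cast<uint8_t>(~b)) << kBusOffset;
}

// Hypothetical stand-in for the GPIO output register (host-side model only).
struct BusModel {
    uint32_t out = 0;
    void writeByte(uint8_t b) {
        out |= setMask(b);      // models the W1TS write
        out &= ~clearMask(b);   // models the W1TC write
    }
    uint8_t busByte() const { return static_cast<uint8_t>(out >> kBusOffset); }
};

// Replays sampled strobe levels (1 = high, 0 = low) through the same
// falling-edge test as the polling loop. Returns how many bytes were placed
// on the bus, including the first byte that is driven before any strobe.
std::size_t replayStrobe(const std::vector<int>& samples,
                         const uint8_t (&block)[kBlockSize],
                         BusModel& bus) {
    std::size_t index = 0;
    int prev = 1;
    bus.writeByte(block[index++]);
    for (int cur : samples) {
        // Falling edge, and still inside the block (the added bounds guard).
        if (cur == 0 && prev == 1 && index < kBlockSize) {
            bus.writeByte(block[index++]);
        }
        prev = cur;
    }
    return index;
}
```

Replaying a captured strobe trace through a model like this makes it possible to check the edge-detection logic separately from the timing problem, which only shows up on the real hardware.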
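I've tried a lot of things, but nothing has made the code fast enough to work properly. I tried changing the compiler optimization flag from -Os to -Ofast in the platforms.txt file, to no avail. I also tried using an interrupt to detect the falling edge of the strobe and write the data to the parallel bus, but that actually ended up being slower than the polling approach I'm using here! With the interrupt approach it typically misses 3 or 4 strobe pulses, whereas the polling approach only misses 1 or 2. I've also tried putting certain variables (prevState, currentState, and continueLoop) into IRAM to speed things up, but that doesn't seem to help either. I had to make all of those variables 32 bits wide, even though they don't need to be, because IRAM seems to require every quantity stored in it to be 32 bits long (whenever I used a non-32-bit quantity, I always got a LoadStoreError panic).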
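To make the missed-pulse symptom concrete, here is a small timing model in portable C++. It is a sketch with made-up numbers, not measured Lisa or ESP32 timing: `strobeLevel` and `countDetectedEdges` are hypothetical helpers, the strobe is assumed to be a regular pulse train, and the loop is assumed to sample at a fixed interval. It only shows the general effect: once the gap between two samples is longer than the time the strobe stays low, some falling edges are never observed.

```cpp
#include <cstdint>

// Hypothetical strobe model: pulse k (k = 1..pulses) holds the line low during
// [k * periodNs, k * periodNs + lowNs); the line is high at all other times.
// Returns true when the line is high at time t (in nanoseconds).
inline bool strobeLevel(uint64_t t, uint64_t periodNs, uint64_t lowNs,
                        uint64_t pulses) {
    const uint64_t k = t / periodNs;
    const bool inPulse = (k >= 1 && k <= pulses && (t % periodNs) < lowNs);
    return !inPulse;
}

// Counts the falling edges seen by a loop that samples the strobe once every
// pollNs nanoseconds, using the same "was high, is now low" test as the
// polling loop in the question.
inline uint64_t countDetectedEdges(uint64_t periodNs, uint64_t lowNs,
                                   uint64_t pollNs, uint64_t pulses) {
    uint64_t detected = 0;
    bool prev = true;  // the strobe idles high
    const uint64_t endNs = (pulses + 1) * periodNs;
    for (uint64_t t = 0; t < endNs; t += pollNs) {
        const bool cur = strobeLevel(t, periodNs, lowNs, pulses);
        if (!cur && prev) {
            ++detected;
        }
        prev = cur;
    }
    return detected;
}
```

With the illustrative values `countDetectedEdges(1000, 200, 100, 532)` every pulse is sampled while low, whereas `countDetectedEdges(1000, 200, 300, 532)` samples too coarsely and reports fewer than 532 edges.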
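So, can anyone think of any more optimizations that I could use to make this work? At this point I'm out of ways to speed up the loop, so I'd really appreciate some help!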

c++

Source: https://stackoverflow.com/questions/77319611/is-there-any-way-to-make-this-esp32-code-faster-im-out-of-ideas-at-this-point