Why doesn't scipy understand a WAV file recorded in Kotlin?

roqulrg3  posted on 2022-12-18  in Kotlin

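I appeal to the kind support of smart programmers like you to solve a problem that needs special skills.
I am trying to record a WAV file with MediaRecorder in a Kotlin Android app. Recording and playback work fine. My bottleneck is processing this file in a Python Flask app with scipy.io.wavfile.read. I get an error like "ValueError: File format b'\x00\x00\x00\x18' not understood. Only 'RIFF' and 'RIFX' supported."
While debugging after the fact, I downloaded the recorded file from the server; that is, the audio went from the phone to the server and back to my computer. It plays fine in Windows Media Player and VLC, and both recognize it as a .wav file. However, opening the file in Notepad, I see that WAV files that work with scipy start with the string "RIFF$", and my recording does not. Maybe that is a hint at the problem. I don't know.
Here is the Kotlin code that records the file: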

val tmpRecordfile: File = 
        File.createTempFile("birdrec${System.currentTimeMillis()}", ".wav", requireContext().cacheDir)
        if (tmpRecordfile.exists()) recordingFilePath = tmpRecordfile.path.toString()

        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.S) {
            recorder =  MediaRecorder(requireContext())
        }else {
            recorder =  MediaRecorder()
        }
        recorder!!.setAudioSource(MediaRecorder.AudioSource.MIC)
        recorder!!.setAudioSamplingRate(RECORDER_SAMPLE_RATE) // is 44100
        // working wav file, but not recognized by Scipy
        recorder!!.setOutputFormat(MediaRecorder.OutputFormat.AMR_WB)
        recorder!!.setOutputFormat(AudioFormat.ENCODING_PCM_16BIT)
        recorder!!.setAudioChannels(1)
        recorder!!.setOutputFile(recordingFilePath)

Later, when posting to the Flask app:

val requestBody: RequestBody =
                        MultipartBody.Builder().setType(MultipartBody.FORM)
                            .addFormDataPart("audioFile", tmpRecordfile.name,File(tmpRecordfile.path).asRequestBody("audio/wav".toMediaType())
                            )
                            .build()

                    val request: Request = Request.Builder().url(serverURL).post(requestBody).build()
                    val response: Response = client.newCall(request).execute()
                    Log.d("Server response: ",response.body!!.string())

The file transfers smoothly from the phone, and when it reaches the server it is received by the following Python code:

        # Get file from post
        request.files['audioFile'].save(save_path)
        app.logger.info("Save received file in " + save_path )
        #continue processing...
        try:
            sampleIn, dataIn = wavfile.read(save_path)  #<-- this line is responsible for my misery 
        except Exception:
            app.logger.info("Error reading wav file  " + save_path)


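That is when I get the error message: ValueError: File format b'\x00\x00\x00\x18' not understood. Only 'RIFF' and 'RIFX' supported.
Maybe my wave file is not a real WAV file and something is missing, perhaps the "RIFF$" string at the start, but I don't know how to fix it.
Any hints on this? Thanks in advance!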

yhxst69z1#

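Posting the answer in the hope that it helps those who come later, especially because it was hard to find answers that led me to the solution. I really needed to deal with bytes.
The problem is that MediaRecorder cannot save in WAV format. Posting the recorded data with the "audio/wav" MIME type does not make it a wave file, just as nobody becomes a ninja merely by wearing a kimono ;)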
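A quick way to see what the recorder really produced is to look at the first bytes of the file on the server before passing it to scipy. The sketch below uses a hypothetical helper, `sniff_audio_container`, which is not part of scipy or Flask. Reading `00 00 00 18` as the 4-byte length of an MP4/3GP `ftyp` box is an inference from the error message, not something the post confirms, and the `mp42` brand in the sample bytes is made up for illustration.

```python
def sniff_audio_container(head: bytes) -> str:
    """Guess the container from the first 12 bytes of a file (hypothetical helper)."""
    if len(head) >= 12 and head[:4] in (b"RIFF", b"RIFX") and head[8:12] == b"WAVE":
        return "wav"
    # ISO base media files (MP4, 3GP) start with a 4-byte box size followed by "ftyp".
    if len(head) >= 8 and head[4:8] == b"ftyp":
        return "mp4/3gp"
    # AMR files start with a plain-text magic line ("#!AMR" or "#!AMR-WB").
    if head.startswith(b"#!AMR"):
        return "amr"
    return "unknown"


# Sample inputs: the start of a WAV header, and bytes like those in the error message.
wav_head = b"RIFF" + (36).to_bytes(4, "little") + b"WAVE"
mp4_head = b"\x00\x00\x00\x18" + b"ftyp" + b"mp42"  # brand is an assumed example

print(sniff_audio_container(wav_head))
print(sniff_audio_container(mp4_head))
```

On the server, the same check could be run on `open(save_path, "rb").read(12)` to log a clearer message than the generic ValueError.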
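To get a wave file we need to:
(1) Use AudioRecord instead of MediaRecorder and save the raw audio data, as suggested here
(2) Save the raw data as a wave file: you need to write the WAV file byte by byte and add the wave header.
(3) Adjust the wave header to avoid scipy's "file ended prematurely" error
Here is working code for (1), recording the raw data with AudioRecord (the quick and dirty way):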

//First check whether the above object actually initialized
            if (audioRecord!!.state != AudioRecord.STATE_INITIALIZED) {
                //return
                Log.e("[some tag]", "AudioRecord not initialized")
            }

            // size the buffer so that bytes are read in chunks smaller than AudioRecord's internal buffer
            isRecordingAudio = true

            val data = ByteArray(BUFFER_SIZE_RECORDING / 2)
            var outputStream: FileOutputStream? = null

            //Now start the audio recording
            audioRecord.startRecording()

            // Recording thread start
            Thread{

                try {
                    outputStream = FileOutputStream(tmpRecordfile)
                } catch (e: FileNotFoundException) {
                    Log.e("[some tag]", "tmpRecordfile access fail"+ e.message)
                    
                }

                while (isRecordingAudio) {
                    val read = audioRecord!!.read(data, 0, data.size)
                    try {
                        outputStream!!.write(data, 0, read)
                        // clean up file writing operations
                    } catch (e: IOException) {
                        Log.e("[some tag]", " writing audio into tmpFile>> file access fail"+ e.message)
                        e.printStackTrace()
                    }
                }

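In the code for part (2), I save the raw data to a WAV file by adding the WAV header in front of the raw data obtained in part (1). The key point is that every byte of the header must (i) conform to the WAVE standard and (ii) use the little-endian format that WAVE requires. Simply put, the bytes are written backwards: the 16-bit integer value 16 (0x0010) is written as "10 00" rather than "00 10".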
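The byte order is easy to check from Python with the standard `struct` module. This is a small illustration only, using the values from the recording code (16 bits per sample, 44100 Hz); the variable names are chosen for this example.

```python
import struct

# "<" selects little-endian; "H" is an unsigned 16-bit int, "I" an unsigned 32-bit int.
fmt_chunk_size = struct.pack("<I", 16)    # 32-bit field holding 16
bits_per_sample = struct.pack("<H", 16)   # 16-bit field holding 16
sample_rate = struct.pack("<I", 44100)    # 44100 == 0x0000AC44

print(fmt_chunk_size.hex(" "))   # 10 00 00 00
print(bits_per_sample.hex(" "))  # 10 00
print(sample_rate.hex(" "))      # 44 ac 00 00

# The same four bytes built by hand with shifts and masks, lowest byte first.
manual = bytes((44100 >> shift) & 0xFF for shift in (0, 8, 16, 24))
print(manual == sample_rate)     # True
```

The hand-built version mirrors the `shr` / `and 0xff` pattern used for the header fields in the Kotlin code.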
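At this point, if you can run away, please do! The following code is here to help karmic souls pay for their sins by writing bytes ;)
Before you code, you need to equip yourself with a special weapon, because you need to understand the wave header as explained here: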

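An indispensable tool is a hex editor plus a hex-to-decimal converter. I used Notepad++ with the HEX plugin to inspect the bytes in the file, together with an online HEX<>DEC converter that handles little-endian encoding, such as this online converter.
For example, the first 4 bytes hold the string "RIFF", and the following 4 bytes hold the size of your file. You need to check that this value, read as little-endian hex, matches your file size minus 8 bytes, because the RIFF size field does not count the "RIFF" tag or the size field itself. This kind of check is needed for every field described in the wave standard. Below is an example of my wave header: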

The code continues below. Ah, important: in this Kotlin code I use a thread to record the data and buffers to manage the data ;)

while (isRecordingAudio) {
        val read = audioRecord!!.read(data, 0, data.size)
        try {
            outputStream!!.write(data, 0, read)
            // clean up file writing operations
        } catch (e: IOException) {
            Log.e("LOG TAG", " writing audio into tmpFile>> file access fail"+ e.message)
            e.printStackTrace()
        }
    }
    try {
        // SAVE AS WAVE ------------------------------------
        //Create the file used to storage the mixed audio file.

        //Open the buffer for this file.
        //val bufferedSink = Okio.buffer(Okio.appendingSink(file))
        val bufferedSink = finalAudioFile.appendingSink().buffer()

        //Data header of the file.
        val header = Buffer()
        //Data of the file.
        val audioData = Buffer()

        // copy tmpFile in  tempBuffer
        tmpRecordfile.source().buffer()?.let { file ->
            //Create a new buffer for every audio address.
            val buffer = Buffer()
            //Read every byte on the buffer.
            file.readAll(buffer)
            //Get the buffer and write every byte on the sink.
            audioData.writeAll(buffer)
            //Close the sink.
            buffer.close()
            file.close()
        }
        //Count of bytes on the data buffer.
        val fileSize = audioData.size.toInt()
        //The data is ready to be written on the sink.
        audioData.close()

        val totalFileSize = fileSize + 44 // The size of the raw data plus the 44 bytes of the wave header
        // The RIFF size field excludes the 8 bytes of the "RIFF" tag and the size field itself
        val fileSizeSCIPY = totalFileSize - 8
        val byteRate = (RECORDER_SAMPLE_RATE * CHANNELS * BITS_PER_SAMPLE ) / 8
        val blocklAlign:Int = ( CHANNELS * BITS_PER_SAMPLE ) / 8

        Log.e("LOG TAG", "Raw file size: "+ fileSize.toString() +" Final file size: "+totalFileSize.toString())
        Log.e("LOG TAG", "Block allign: "+ blocklAlign.toString() +" byteRate: "+byteRate.toString())
        Log.e("LOG TAG", "BITS_PER_SAMPLE: "+ BITS_PER_SAMPLE.toString() +" CHANNELS: "+CHANNELS.toString())
        Log.e("LOG TAG", "AUDIO_FORMAT: "+ AUDIO_FORMAT.toString())

        //Write the header of the final file.
        header.writeUtf8("RIFF")
            //RIFF chunk size (total file size minus 8), 4 bytes little-endian
            .writeByte(fileSizeSCIPY and 0xff)
            .writeByte((fileSizeSCIPY shr 8) and 0xff)
            .writeByte((fileSizeSCIPY shr 16) and 0xff)
            .writeByte((fileSizeSCIPY shr 24) and 0xff)
            .writeUtf8("WAVE") //Inform the type of file.
            .writeUtf8("fmt ") //Add the "fmt " letters (note the trailing space)
            //Size of the fmt subchunk: 16 for PCM, 4 bytes little-endian
            .writeByte(16)
            .writeByte(0)
            .writeByte(0)
            .writeByte(0)
            //Audio format: PCM = 1, 2 bytes little-endian
            .writeByte(1)
            .writeByte(0)
            //Number of channels, 2 bytes little-endian
            .writeByte(CHANNELS)
            .writeByte(0)
            //Sample rate, 4 bytes little-endian
            .writeByte(RECORDER_SAMPLE_RATE and 0xff)
            .writeByte((RECORDER_SAMPLE_RATE shr 8) and 0xff)
            .writeByte((RECORDER_SAMPLE_RATE shr 16) and 0xff)
            .writeByte((RECORDER_SAMPLE_RATE shr 24) and 0xff)
            //Byte rate, 4 bytes little-endian
            .writeByte(byteRate and 0xff)
            .writeByte((byteRate shr 8) and 0xff)
            .writeByte((byteRate shr 16) and 0xff)
            .writeByte((byteRate shr 24) and 0xff)
            //Block align, 2 bytes little-endian
            .writeByte(blocklAlign)
            .writeByte(0)
            //Bits per sample, 2 bytes little-endian
            .writeByte(BITS_PER_SAMPLE)
            .writeByte(0)
            .writeUtf8("data")
            //Size of the data chunk: the raw audio bytes only, 4 bytes little-endian
            .writeByte(fileSize and 0xff)
            .writeByte((fileSize shr 8) and 0xff)
            .writeByte((fileSize shr 16) and 0xff)
            .writeByte((fileSize shr 24) and 0xff)
            .close()

        // bufferedSink writes to finalAudioFile
        with (bufferedSink) {
            writeAll(header)
            writeAll(audioData)
            close() //Close and write the file on the memory.
        }

        // continue
        outputStream?.flush()
        outputStream?.close()
    } catch (e: IOException) {
        Log.e("LOG TAG", "exception while closing output stream $e")
        e.printStackTrace()
    }

}.start() // end recording
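For checking the result off the device, the same 44-byte header can be produced in a few lines of Python. This is a minimal sketch, assuming 16-bit mono PCM at 44100 Hz as in the recording code. `pcm_to_wav_bytes` is a hypothetical helper name, the second of silence is stand-in data, and the standard-library `wave` module stands in for scipy as the reader.

```python
import io
import struct
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 44100,
                     channels: int = 1, bits_per_sample: int = 16) -> bytes:
    """Prepend a canonical 44-byte PCM WAV header to raw little-endian samples."""
    block_align = channels * bits_per_sample // 8
    byte_rate = sample_rate * block_align
    header = struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF",
        36 + len(pcm),      # RIFF chunk size: total file size minus 8
        b"WAVE",
        b"fmt ",
        16,                 # fmt chunk size for plain PCM
        1,                  # audio format 1 = PCM
        channels,
        sample_rate,
        byte_rate,
        block_align,
        bits_per_sample,
        b"data",
        len(pcm),           # data chunk size: raw sample bytes only
    )
    return header + pcm


# One second of silence as stand-in sample data (assumed, not a real recording).
raw = b"\x00\x00" * 44100
wav_bytes = pcm_to_wav_bytes(raw)

# The two size fields must agree with the real file length.
riff_size = struct.unpack_from("<I", wav_bytes, 4)[0]
data_size = struct.unpack_from("<I", wav_bytes, 40)[0]
print(len(wav_bytes), riff_size, data_size)  # 88244 88236 88200

# Read it back with the standard library to confirm the header parses.
with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
    frame_info = (wf.getframerate(), wf.getnchannels(), wf.getsampwidth(), wf.getnframes())
print(frame_info)  # (44100, 1, 2, 44100)
```

Comparing `riff_size` and `data_size` against a file pulled from the phone is a quick way to spot a header whose size fields do not match the actual data length.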

I hope this code helps those who need to save wave files for use with scipy or elsewhere. If it helps, please rate and comment! :)
