silero-vad now has an official Java demo

Previously I had to port the Kotlin code from gkonovalov/android-vad (Android Voice Activity Detection (VAD) library; supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models) into a Java demo by hand, which was a real chore.

Last month the official repository added an example at https://github.com/snakers4/silero-vad/tree/master/examples/java-example, which makes detecting silence in PCM audio from Java much simpler.

package org.example;

import ai.onnxruntime.OrtException;
import javax.sound.sampled.*;
import java.util.Map;

public class App {

    private static final String MODEL_PATH = "src/main/resources/silero_vad.onnx";
    private static final int SAMPLE_RATE = 16000;
    private static final float START_THRESHOLD = 0.6f;
    private static final float END_THRESHOLD = 0.45f;
    private static final int MIN_SILENCE_DURATION_MS = 600;
    private static final int SPEECH_PAD_MS = 500;
    private static final int WINDOW_SIZE_SAMPLES = 2048;

    public static void main(String[] args) {
        // Initialize the Voice Activity Detector
        SlieroVadDetector vadDetector;
        try {
            vadDetector = new SlieroVadDetector(MODEL_PATH, START_THRESHOLD, END_THRESHOLD, SAMPLE_RATE, MIN_SILENCE_DURATION_MS, SPEECH_PAD_MS);
        } catch (OrtException e) {
            System.err.println("Error initializing the VAD detector: " + e.getMessage());
            return;
        }

        // Set audio format
        AudioFormat format = new AudioFormat(SAMPLE_RATE, 16, 1, true, false);
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);

        // Get the target data line and open it with the specified format
        TargetDataLine targetDataLine;
        try {
            targetDataLine = (TargetDataLine) AudioSystem.getLine(info);
            targetDataLine.open(format);
            targetDataLine.start();
        } catch (LineUnavailableException e) {
            System.err.println("Error opening target data line: " + e.getMessage());
            return;
        }

        // Main loop to continuously read data and apply Voice Activity Detection
        while (targetDataLine.isOpen()) {
            // The buffer is sized in bytes: 2048 bytes hold 1024 16-bit samples (64 ms at 16 kHz)
            byte[] data = new byte[WINDOW_SIZE_SAMPLES];

            int numBytesRead = targetDataLine.read(data, 0, data.length);
            if (numBytesRead <= 0) {
                System.err.println("Error reading data from target data line.");
                continue;
            }

            // Apply the Voice Activity Detector to the data and get the result
            Map<String, Double> detectResult;
            try {
                detectResult = vadDetector.apply(data, true);
            } catch (Exception e) {
                System.err.println("Error applying VAD detector: " + e.getMessage());
                continue;
            }

            if (!detectResult.isEmpty()) {
                System.out.println(detectResult);
            }
        }

        // Close the target data line to release audio resources
        targetDataLine.close();
    }
}

To run it, load the ONNX Runtime DLL first (download it from GitHub):

System.load("F:\\jar\\onnxruntime-win-x64-1.16.3\\lib\\onnxruntime.dll");

With this in place, detecting silence in PCM data obtained from FreeSWITCH becomes straightforward.
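
Before a PCM frame can be scored, the raw bytes have to become normalized float samples. Here is a minimal sketch of that conversion, assuming the FreeSWITCH audio arrives as 16-bit signed little-endian mono PCM. `PcmFrames` and `toFloatSamples` are hypothetical names used for illustration and are not part of the official example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the official silero-vad example.
final class PcmFrames {
    private PcmFrames() {}

    /**
     * Converts 16-bit signed little-endian PCM bytes to floats in [-1, 1).
     * Only the first `length` bytes are used; a trailing odd byte is ignored.
     */
    static float[] toFloatSamples(byte[] pcm, int length) {
        int sampleCount = length / 2; // two bytes per 16-bit sample
        float[] samples = new float[sampleCount];
        ByteBuffer buffer = ByteBuffer.wrap(pcm, 0, sampleCount * 2)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < sampleCount; i++) {
            // Scale the short range [-32768, 32767] down to [-1, 1)
            samples[i] = buffer.getShort() / 32768.0f;
        }
        return samples;
    }
}
```

If the audio is big-endian or 8-bit, the byte order and the scaling factor above would need to change accordingly.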

The main entry point is vadDetector.apply(data, true); at its core it just gets a float speech probability from the model:

// Call the model to get the prediction probability of speech
float speechProb = 0;
try {
    speechProb = model.call(new float[][]{audioData}, samplingRate)[0];
} catch (OrtException e) {
    throw new RuntimeException(e);
}
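
To show how a speech probability like this can be turned into start and end events, here is a simplified sketch of threshold hysteresis using the same kind of parameters as the demo (start threshold, end threshold, minimum silence duration). `SpeechGate` is a hypothetical class written for illustration; it is not the logic inside `SlieroVadDetector`, and it leaves out speech padding.

```java
// Hypothetical illustration of start/end hysteresis, not the official detector logic.
final class SpeechGate {
    private final float startThreshold;
    private final float endThreshold;
    private final int minSilenceSamples;
    private boolean speaking = false;
    private int silentSamples = 0;

    SpeechGate(float startThreshold, float endThreshold, int sampleRate, int minSilenceMs) {
        this.startThreshold = startThreshold;
        this.endThreshold = endThreshold;
        this.minSilenceSamples = sampleRate * minSilenceMs / 1000;
    }

    /**
     * Feeds the speech probability of one window.
     * Returns "start" or "end" when the state changes, otherwise null.
     */
    String update(float speechProb, int windowSamples) {
        if (speechProb >= startThreshold) {
            silentSamples = 0; // clear speech resets the silence counter
            if (!speaking) {
                speaking = true;
                return "start";
            }
            return null;
        }
        if (speaking && speechProb < endThreshold) {
            // Only windows below the end threshold count as silence
            silentSamples += windowSamples;
            if (silentSamples >= minSilenceSamples) {
                speaking = false;
                silentSamples = 0;
                return "end";
            }
        }
        return null;
    }
}
```

With the demo's values (0.6f, 0.45f, 16000 Hz, 600 ms) and 1024-sample windows, a probability between the two thresholds changes nothing, and speech ends only after enough low-probability windows have accumulated to cover 600 ms.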

 

If you are interested, see https://item.taobao.com/item.htm?id=653611115230
