* The AudioTrack class manages and plays a single audio resource for Java applications.
* It allows streaming of PCM audio buffers to the audio sink for playback. This is
* achieved by "pushing" the data to the AudioTrack object using one of the
* {@link #write(byte[], int, int)}, {@link #write(short[], int, int)},
* and {@link #write(float[], int, int, int)} methods.
* An AudioTrack instance can operate under two modes: static or streaming.
* In Streaming mode, the application writes a continuous stream of data to the AudioTrack, using
* one of the {@code write()} methods. These are blocking and return when the data has been
* transferred from the Java layer to the native layer and queued for playback. The streaming
* mode is most useful when playing blocks of audio data that for instance are:
* - too big to fit in memory because of the duration of the sound to play,
* - too big to fit in memory because of the characteristics of the audio data
* (high sampling rate, bits per sample ...)
* - received or generated while previously queued audio is playing.
* The static mode should be chosen when dealing with short sounds that fit in memory and
* that need to be played with the smallest latency possible. The static mode will
* therefore be preferred for UI and game sounds that are played often, and with the
* smallest overhead possible.
* Upon creation, an AudioTrack object initializes its associated audio buffer.
* The size of this buffer, specified during the construction, determines how long an AudioTrack
* can play before running out of data.
* For an AudioTrack using the static mode, this size is the maximum size of the sound that can
* be played from it.
* For the streaming mode, data will be written to the audio sink in chunks of
* sizes less than or equal to the total buffer size.
* AudioTrack is not final and thus permits subclasses, but such use is not recommended.
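To make the link between buffer size and playback time concrete, the arithmetic for linear PCM can be sketched in plain Java. The class and method names below are hypothetical and are not part of the Android API. The sketch assumes uncompressed PCM with a fixed number of bytes per sample.

```java
public class BufferDuration {
    /** Bytes consumed per second of playback for linear PCM. */
    public static long bytesPerSecond(int sampleRateInHz, int channelCount, int bytesPerSample) {
        return (long) sampleRateInHz * channelCount * bytesPerSample;
    }

    /** Milliseconds of audio that a buffer of the given size can hold. */
    public static long bufferDurationMs(int bufferSizeInBytes, int sampleRateInHz,
                                        int channelCount, int bytesPerSample) {
        return bufferSizeInBytes * 1000L
                / bytesPerSecond(sampleRateInHz, channelCount, bytesPerSample);
    }

    public static void main(String[] args) {
        // 16 kHz, mono, 16-bit PCM consumes 32000 bytes per second.
        System.out.println(bytesPerSecond(16000, 1, 2));          // prints 32000
        // A 32000-byte buffer therefore holds one second of audio.
        System.out.println(bufferDurationMs(32000, 16000, 1, 2)); // prints 1000
    }
}
```

A streaming track with a buffer of this size would need a fresh write at least once per second to avoid running out of data.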
II. AudioTrack constructor parameters
public AudioTrack(int streamType, int sampleRateInHz, int channelConfig, int audioFormat,
        int bufferSizeInBytes, int mode)
        throws IllegalArgumentException {
    this(streamType, sampleRateInHz, channelConfig, audioFormat,
            bufferSizeInBytes, mode, AudioManager.AUDIO_SESSION_ID_GENERATE);
}
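streamType: Android divides system sounds into several stream types. Common ones are:
· STREAM_MUSIC: music playback
· STREAM_SYSTEM: system sounds, such as the low-battery alert and the lock-screen sound
sampleRateInHz: the sample rate. MediaRecorder typically samples at 8000 Hz and AAC at 44100 Hz. 44100 Hz is the most common rate today, and the official documentation describes it as compatible with all devices.
audioFormat: the sample encoding (bit depth), given as one of the constants defined in the AudioFormat class. The usual choices are ENCODING_PCM_16BIT and ENCODING_PCM_8BIT. PCM stands for pulse-code modulation, which is simply raw audio samples, so each sample can have a resolution of 16 or 8 bits. 16-bit samples take more space and processing power but reproduce the sound more faithfully.
mode: AudioTrack has two data loading modes, MODE_STREAM and MODE_STATIC.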
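To illustrate what "raw audio samples" means for the two encodings, here is a minimal sketch in plain Java that generates the same sine tone as 16-bit and as 8-bit PCM. The class SinePcm and its methods are hypothetical names. The sketch assumes mono audio, little-endian byte order for 16-bit samples, and unsigned 8-bit samples centered on 128, which is how ENCODING_PCM_8BIT is documented.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SinePcm {
    /** Mono sine tone as signed 16-bit little-endian PCM (2 bytes per sample). */
    public static byte[] sine16(int sampleRateInHz, double frequencyHz, int sampleCount) {
        ByteBuffer out = ByteBuffer.allocate(sampleCount * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < sampleCount; i++) {
            double phase = 2.0 * Math.PI * frequencyHz * i / sampleRateInHz;
            out.putShort((short) Math.round(Math.sin(phase) * Short.MAX_VALUE));
        }
        return out.array();
    }

    /** The same tone as unsigned 8-bit PCM (1 byte per sample, silence is 128). */
    public static byte[] sine8(int sampleRateInHz, double frequencyHz, int sampleCount) {
        byte[] out = new byte[sampleCount];
        for (int i = 0; i < sampleCount; i++) {
            double phase = 2.0 * Math.PI * frequencyHz * i / sampleRateInHz;
            out[i] = (byte) (Math.round(Math.sin(phase) * 127) + 128);
        }
        return out;
    }

    public static void main(String[] args) {
        // One second at 16 kHz: the 16-bit version needs twice the bytes.
        System.out.println(sine16(16000, 440.0, 16000).length); // prints 32000
        System.out.println(sine8(16000, 440.0, 16000).length);  // prints 16000
    }
}
```

Either array is the kind of data that would be handed to one of the write() methods, provided the AudioTrack was created with the matching sample rate, channel configuration, and encoding.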
III. Using the two AudioTrack modes
3.1 Using AudioTrack in MODE_STATIC
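this.audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, 44100,
        // The channel and encoding arguments are missing from the original listing; mono 16-bit PCM is assumed.
        AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT,
        audioData.length, AudioTrack.MODE_STATIC);
Log.d(TAG, "Writing audio data...");
this.audioTrack.write(audioData, 0, audioData.length);
Log.d(TAG, "Starting playback");
this.audioTrack.play();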
3.2 Using AudioTrack in MODE_STREAM
package com.ubtechinc.cruzr.voice.pcm;
import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;
import androidx.annotation.NonNull;
import com.ubtrobot.cruzr.core.log.ELog;
import okio.ByteString;
import static android.media.AudioTrack.PLAYSTATE_PLAYING;
public class AudioTrackManager {
    private static final String TAG = "AudioTrackManager";
    private AudioTrack mAudioTrack;
    private volatile static AudioTrackManager mInstance;
    private long bufferCount;

    /** Audio stream type. */
    private static final int mStreamType = AudioManager.STREAM_MUSIC;

    /**
     * Sample rate in Hz. MediaRecorder typically samples at 8000 Hz and AAC at 44100 Hz.
     * 44100 Hz is the most common rate and the official documentation describes it as
     * compatible with all devices. This class plays 16000 Hz PCM.
     */
    private static final int mSampleRateInHz = 16000;

    /** Channel configuration, given as one of the constants defined in AudioFormat. */
    private static final int mChannelConfig = AudioFormat.CHANNEL_OUT_MONO; // mono

    /**
     * Sample encoding (bit depth), given as one of the constants defined in AudioFormat.
     * The usual choices are ENCODING_PCM_16BIT and ENCODING_PCM_8BIT. PCM stands for
     * pulse-code modulation, which is simply raw audio samples. 16-bit samples take more
     * space and processing power but reproduce the sound more faithfully.
     */
    private static final int mAudioFormat = AudioFormat.ENCODING_PCM_16BIT;

    /** Minimum buffer size, obtained from AudioTrack.getMinBufferSize(). */
    private int mMinBufferSize;

    /**
     * MODE_STREAM means the application feeds data to the AudioTrack through repeated
     * write() calls, much like sending data over a socket: the application obtains PCM
     * data from somewhere, for example a decoder, and writes it to the AudioTrack.
     */
    private static int mMode = AudioTrack.MODE_STREAM;
    private IAudioPlayStateListener iAudioPlayStateListener;
    private static final int BUFFER_CAPITAL = 10;

    /**
     * Returns the singleton instance.
     *
     * @return the shared AudioTrackManager
     */
    public static AudioTrackManager getInstance() {
        if (mInstance == null) {
            synchronized (AudioTrackManager.class) {
                if (mInstance == null) {
                    mInstance = new AudioTrackManager();
                }
            }
        }
        return mInstance;
    }

    public AudioTrackManager() {
        // The body was lost from the original listing; calling initAudioTrack() is assumed.
        initAudioTrack();
    }

    private void initAudioTrack() {
        // Use the minimum buffer size multiplied by BUFFER_CAPITAL (10)
        mMinBufferSize = AudioTrack.getMinBufferSize(mSampleRateInHz, mChannelConfig, mAudioFormat);
        ELog.i(TAG, "initAudioTrack: mMinBufferSize: " + mMinBufferSize * BUFFER_CAPITAL + " b");
        mAudioTrack = new AudioTrack(mStreamType, mSampleRateInHz, mChannelConfig,
                mAudioFormat, mMinBufferSize * BUFFER_CAPITAL, mMode);
    }
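
    public void addAudioPlayStateListener(IAudioPlayStateListener iAudioPlayStateListener) {
        this.iAudioPlayStateListener = iAudioPlayStateListener;
    }

    public void prepareAudioTrack() {
        bufferCount = 0;
        ELog.i(TAG, "prepareAudioTrack:------> ");
        if (null == mAudioTrack) {
            return;
        }
        if (mAudioTrack.getState() == AudioTrack.STATE_UNINITIALIZED) {
            initAudioTrack(); // assumed: statement lost from the original listing
        }
        mAudioTrack.play(); // assumed: statement lost from the original listing
        if (null != iAudioPlayStateListener) {
            // Listener callback not shown in the original listing.
        }
    }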

    public synchronized void write(@NonNull final ByteString bytes) {
        if (null != mAudioTrack) {
            int byteSize = bytes.size();
            bufferCount += byteSize;
            int write = mAudioTrack.write(bytes.toByteArray(), 0, byteSize);
            ELog.d(TAG, "write: received " + byteSize + " b | written so far " + bufferCount + " b");
            if (write == 0 && null != iAudioPlayStateListener) {
                try {
                    // Blocking call and listener callback not shown in the original listing.
                } catch (InterruptedException e) {
                    // Handler body not shown in the original listing.
                }
            }
        }
    }
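
    public void stopPlay() {
        ELog.i(TAG, "stopPlay: ");
        if (null == mAudioTrack) {
            return;
        }
        if (null != iAudioPlayStateListener) {
            // Listener callback not shown in the original listing.
        }
        try {
            if (mAudioTrack.getPlayState() == PLAYSTATE_PLAYING) {
                mAudioTrack.stop(); // assumed: statement lost from the original listing
            }
        } catch (IllegalStateException e) {
            ELog.e(TAG, "stop: " + e.toString());
        }
    }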

    public void release() {
        if (null == mAudioTrack) {
            return;
        }
        ELog.i(TAG, "release: ");
        iAudioPlayStateListener = null;
        try {
            mAudioTrack.release(); // assumed: statement lost from the original listing
            mAudioTrack = null;
        } catch (Exception e) {
            ELog.e(TAG, "release: " + e.toString());
        }
    }

    public void setBufferParams(int pcmFileSize) {
        // Size the buffer at 10% of the PCM file size
        ELog.d(TAG, "setFileSize: PCM file size: " + pcmFileSize + " b, minimum buffer: " + mMinBufferSize * BUFFER_CAPITAL + " b");
        if (pcmFileSize < mMinBufferSize * BUFFER_CAPITAL) {
            mAudioTrack = new AudioTrack(mStreamType, mSampleRateInHz, mChannelConfig,
                    mAudioFormat, mMinBufferSize, mMode);
            ELog.d(TAG, "setFileSize: PCM file is smaller than 10x the minimum buffer, using 1x the minimum buffer ------>");
        } else {
            // The buffer is 10% of the PCM file size, rounded up to a multiple of mMinBufferSize.
            // If that is below mMinBufferSize * BUFFER_CAPITAL, fall back to that default.
            int cacheFileSize = (int) (pcmFileSize * 0.1);
            int realBufferSize = (cacheFileSize / mMinBufferSize + 1) * mMinBufferSize;
            ELog.d(TAG, "Computed buffer size: " + realBufferSize + " b, minimum buffer: " + mMinBufferSize * BUFFER_CAPITAL + " b");
            if (realBufferSize < mMinBufferSize * BUFFER_CAPITAL) {
                realBufferSize = mMinBufferSize * BUFFER_CAPITAL;
            }
            mAudioTrack = new AudioTrack(mStreamType, mSampleRateInHz, mChannelConfig,
                    mAudioFormat, realBufferSize, mMode);
            ELog.d(TAG, "setFileSize: buffer size reset to: " + realBufferSize + " b | " + realBufferSize / 1024 + " kb");
        }
        bufferCount = 0;
    }
}
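The sizing rule in setBufferParams can be isolated as a pure function, which makes it easier to check by hand. This is only a sketch: the class BufferSizing and the method chooseBufferSize are hypothetical names, and minBufferSize stands in for the value that AudioTrack.getMinBufferSize would return on a device.

```java
public class BufferSizing {
    /**
     * Mirrors the rule used by setBufferParams:
     * small files get one minimum buffer, larger files get 10% of the file size
     * rounded up to a multiple of minBufferSize, never less than
     * minBufferSize * multiplier.
     */
    public static int chooseBufferSize(int pcmFileSize, int minBufferSize, int multiplier) {
        int floor = minBufferSize * multiplier;
        if (pcmFileSize < floor) {
            return minBufferSize;
        }
        int tenPercent = (int) (pcmFileSize * 0.1);
        // Integer division plus one always adds a chunk, even for exact multiples.
        int rounded = (tenPercent / minBufferSize + 1) * minBufferSize;
        return Math.max(rounded, floor);
    }

    public static void main(String[] args) {
        System.out.println(chooseBufferSize(5000, 1000, 10));   // prints 1000
        System.out.println(chooseBufferSize(50000, 1000, 10));  // prints 10000
        System.out.println(chooseBufferSize(250000, 1000, 10)); // prints 26000
    }
}
```

Note that the rounding step always adds one extra minBufferSize chunk, so a file whose 10% is already an exact multiple (25000 in the last call) still gets 26000 bytes.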
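IV. AudioTrack compared with MediaPlayer
4.1 Differences
- The biggest difference is that MediaPlayer can play many audio file formats, such as MP3, AAC, WAV, OGG, and MIDI, because it creates the matching audio decoder in the framework layer. AudioTrack creates no decoder and can only play a PCM stream that is already decoded. In terms of file formats, that effectively limits AudioTrack to WAV files, since most WAV files carry PCM data and need no decoding.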
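Because AudioTrack does no decoding, playing a WAV file means locating the PCM bytes inside it first. The sketch below, in plain Java, walks the RIFF chunks of a WAV file held in memory and reports the format fields together with the position of the data chunk. The class WavPcm and its Info type are hypothetical names used for illustration. The sketch assumes a well-formed file with a standard "fmt " chunk and does not handle extensible or compressed formats.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class WavPcm {
    /** Format fields needed to configure playback, plus where the PCM bytes live. */
    public static final class Info {
        public final int audioFormat;   // 1 means linear PCM
        public final int channels;
        public final int sampleRate;
        public final int bitsPerSample;
        public final int dataOffset;    // index of the first PCM byte
        public final int dataLength;    // number of PCM bytes

        Info(int audioFormat, int channels, int sampleRate, int bitsPerSample,
             int dataOffset, int dataLength) {
            this.audioFormat = audioFormat;
            this.channels = channels;
            this.sampleRate = sampleRate;
            this.bitsPerSample = bitsPerSample;
            this.dataOffset = dataOffset;
            this.dataLength = dataLength;
        }
    }

    public static Info parse(byte[] wav) {
        if (wav.length < 12 || !tag(wav, 0).equals("RIFF") || !tag(wav, 8).equals("WAVE")) {
            throw new IllegalArgumentException("not a RIFF/WAVE file");
        }
        ByteBuffer buf = ByteBuffer.wrap(wav).order(ByteOrder.LITTLE_ENDIAN);
        int format = -1, channels = 0, rate = 0, bits = 0;
        int pos = 12; // first chunk starts after "RIFF", size, "WAVE"
        while (pos + 8 <= wav.length) {
            String id = tag(wav, pos);
            int size = buf.getInt(pos + 4);
            int body = pos + 8;
            if (size < 0) {
                throw new IllegalArgumentException("corrupt chunk size");
            }
            if (id.equals("fmt ")) {
                format = buf.getShort(body) & 0xFFFF;
                channels = buf.getShort(body + 2) & 0xFFFF;
                rate = buf.getInt(body + 4);
                bits = buf.getShort(body + 14) & 0xFFFF;
            } else if (id.equals("data")) {
                int length = Math.min(size, wav.length - body);
                return new Info(format, channels, rate, bits, body, length);
            }
            pos = body + size + (size & 1); // chunks are padded to an even length
        }
        throw new IllegalArgumentException("no data chunk found");
    }

    private static String tag(byte[] bytes, int offset) {
        return new String(bytes, offset, 4, StandardCharsets.US_ASCII);
    }
}
```

With the result in hand, an application would check that audioFormat is 1, configure the AudioTrack from sampleRate, channels, and bitsPerSample, and write only the bytes from dataOffset for dataLength.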
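4.2 Relationship
- MediaPlayer still creates an AudioTrack in the framework layer. It passes the decoded PCM stream to the AudioTrack, which hands it to AudioFlinger for mixing before it reaches the hardware for playback. In that sense MediaPlayer contains an AudioTrack.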
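4.3 SoundPool
- While working with the Android audio playback APIs, I found that SoundPool can also play audio. The three suit different scenarios: MediaPlayer is better suited to long-running background playback of local music files or online streams; SoundPool suits short clips such as game sounds, key clicks, and ringtone snippets, and it can play several clips at once; AudioTrack sits closer to the low level, offers very fine-grained control, supports low-latency playback, and suits streaming media and VoIP calls.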