pinyin4j的使用

pinyin4j的使用
pinyin4j是一个功能强悍的汉语拼音工具包,主要是从汉语获取各种格式和需求的拼音,功能强悍,下面看看如何使用pinyin4j。
本人以前用AscII编码提取工具,效果不理想,现在用pinyin4j简单实现了一个。功能还不是很完美,陆续再改进吧。
import net.sourceforge.pinyin4j.PinyinHelper;
import net.sourceforge.pinyin4j.format.HanyuPinyinCaseType;
import net.sourceforge.pinyin4j.format.HanyuPinyinOutputFormat;
import net.sourceforge.pinyin4j.format.HanyuPinyinToneType;
import net.sourceforge.pinyin4j.format.exception.BadHanyuPinyinOutputFormatCombination;

import java.io.UnsupportedEncodingException;

/**
* 拼音工具
*
* @author leizhimin 2009-7-15 15:26:21
*/

public class PinyinToolkit {

        /**
         * 获取汉字串拼音首字母,英文字符不变
         *
         * @param chinese 汉字串
         * @return 汉语拼音首字母
         */

        public static String cn2FirstSpell(String chinese) {
                StringBuffer pybf = new StringBuffer();
                char[] arr = chinese.toCharArray();
                HanyuPinyinOutputFormat defaultFormat = new HanyuPinyinOutputFormat();
                defaultFormat.setCaseType(HanyuPinyinCaseType.LOWERCASE);
                defaultFormat.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
                for ( int i = 0; i < arr.length; i++) {
                        if (arr[i] > 128) {
                                try {
                                        String[] _t = PinyinHelper.toHanyuPinyinStringArray(arr[i], defaultFormat);
                                        if (_t != null) {
                                                pybf.append(_t[0].charAt(0));
                                        }
                                } catch (BadHanyuPinyinOutputFormatCombination e) {
                                        e.printStackTrace();
                                }
                        } else {
                                pybf.append(arr[i]);
                        }
                }
                return pybf.toString().replaceAll( "\\W", "").trim();
        }

        /**
         * 获取汉字串拼音,英文字符不变
         *
         * @param chinese 汉字串
         * @return 汉语拼音
         */

        public static String cn2Spell(String chinese) {
                StringBuffer pybf = new StringBuffer();
                char[] arr = chinese.toCharArray();
                HanyuPinyinOutputFormat defaultFormat = new HanyuPinyinOutputFormat();
                defaultFormat.setCaseType(HanyuPinyinCaseType.LOWERCASE);
                defaultFormat.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
                for ( int i = 0; i < arr.length; i++) {
                        if (arr[i] > 128) {
                                try {
                                        pybf.append(PinyinHelper.toHanyuPinyinStringArray(arr[i], defaultFormat)[0]);
                                } catch (BadHanyuPinyinOutputFormatCombination e) {
                                        e.printStackTrace();
                                }
                        } else {
                                pybf.append(arr[i]);
                        }
                }
                return pybf.toString();
        }

        public static void main(String[] args) throws UnsupportedEncodingException {
                String x = "嘅囧誰說壞學生來勼髮視頻裆児";
                System.out.println(cn2FirstSpell(x));
                System.out.println(cn2Spell(x));
        }
}
运行结果:
kjsshxsljfspde
kaijiongshuishuohuaixueshenglaijiufashipindanger

Process finished with exit code 0
在某些系统上可能有字符集的问题,需要做预处理。

你可能感兴趣的:(pinyin4j的使用)