liuxinglanyue

htmlparser的编码问题

转：http://gbfd2012.iteye.com/blog/732042

  htmlparser在提取网站内容时，有时会出现乱码或者是编码不能转换的问题。这是htmlparser的一个小bug，因为htmlparser作为一个开源软件已经很长时间没有更新了。

org.htmlparser.util.EncodingChangeException: character mismatch (new: 中 [0x4e2d] != old: [0xd6?]) for encoding change from ISO-8859-1 to GB2312 at character offset 23或者会出现页面的乱码问题。
    为了彻底避免上述问题，我们可以改下htmlparser的源码的两个类。
    package org.htmlparser.lexer;和InputStreamSource类。另外我们还要用 CodepageDetectorProxy（根据二进制流来分析网页编码）来提前解析网页编码。
    htmlparser中设置编码一般为
    Parser parser=new Parser(url);
    parser.setEnconding("bianma");
    但存在漏洞。
    htmlparser编码的分析过程：htmlparser会根据服务器返回的文件头信息与网页的meta标签中的编码进行对比，如果服务器返回的文件头编码为空，默认返回为ISO-8859-1的编码，它会meta标签的charset里的编码对比。

    改进的思路：利用CodepageDetectorProxy.jar对网页进行编码分析，获得网页的编码格式。将htmlparser的服务器返回的默认编码设置为CodepageDetectorProxy解析的编码。这样的编码和meta标签的编码总能保持一致了。。代码如下：
整个修改过程如下：

import info.monitorenter.cpdetector.io.CodepageDetectorProxy;
import info.monitorenter.cpdetector.io.ParsingDetector;
import java.net.MalformedURLException;
import java.net.URL;
import org.htmlparser.lexer.Page;

public class WebEncoding {

	public String AnalyEnconding(String path){
	URL url=null;
	try {
		url=new URL(path);
	} catch (MalformedURLException e) {
		e.printStackTrace();
	}
	CodepageDetectorProxy detector = CodepageDetectorProxy.getInstance(); 
	detector.add(new ParsingDetector(false)); 
	java.nio.charset.Charset charset = null;  
	try {  
	      charset = detector.detectCodepage(url);  
	} catch (Exception ex) {ex.printStackTrace();}
	 if(charset.name().equalsIgnoreCase("utf-8")||charset.name().equals("UTF-8")){
		 Page.GaoBinDEFAULT_CHARSET="utf-8";
      }else{
    	  Page.GaoBinDEFAULT_CHARSET="gb2312";
     }
	   return Page.getGaoBinDEFAULT_CHARSET();
	}
}

package org.htmlparser.lexer;

import java.io.*;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.net.*;
import java.util.zip.*;
import org.htmlparser.http.ConnectionManager;
import org.htmlparser.util.ParserException;

// Referenced classes of package org.htmlparser.lexer:
//            InputStreamSource, PageIndex, StringSource, Cursor, 
//            Stream, Source

public class Page
    implements Serializable
{

    public Page()
    {
        this("");
    }

    public Page(URLConnection connection)
        throws ParserException
    {
        if(null == connection)
        {
            throw new IllegalArgumentException("connection cannot be null");
        } else
        {
            setConnection(connection);
            mBaseUrl = null;
            return;
        }
    }

    public Page(InputStream stream, String charset)
        throws UnsupportedEncodingException
    {
        if(null == stream)
            throw new IllegalArgumentException("stream cannot be null");
        if(null == charset)
            charset = "ISO-8859-1";
        mSource = new InputStreamSource(stream, charset);
        mIndex = new PageIndex(this);
        mConnection = null;
        mUrl = null;
        mBaseUrl = null;
    }

    public Page(String text, String charset)
    {
        if(null == text)
            throw new IllegalArgumentException("text cannot be null");
        if(null == charset)
            charset = "ISO-8859-1";
        mSource = new StringSource(text, charset);
        mIndex = new PageIndex(this);
        mConnection = null;
        mUrl = null;
        mBaseUrl = null;
    }

    public Page(String text)
    {
        this(text, null);
    }

    public Page(Source source)
    {
        if(null == source)
        {
            throw new IllegalArgumentException("source cannot be null");
        } else
        {
            mSource = source;
            mIndex = new PageIndex(this);
            mConnection = null;
            mUrl = null;
            mBaseUrl = null;
            return;
        }
    }

    public static ConnectionManager getConnectionManager()
    {
        return mConnectionManager;
    }

    public static void setConnectionManager(ConnectionManager manager)
    {
        mConnectionManager = manager;
    }

    public String getCharset(String content)
    {
        String CHARSET_STRING = "charset";
        String ret;
        if(null == mSource)
            ret = "ISO-8859-1";
        else
            ret = mSource.getEncoding();
        if(null != content)
        {
            int index = content.indexOf("charset");
            if(index != -1)
            {
                content = content.substring(index + "charset".length()).trim();
                if(content.startsWith("="))
                {
                    content = content.substring(1).trim();
                    index = content.indexOf(";");
                    if(index != -1)
                        content = content.substring(0, index);
                    if(content.startsWith("\"") && content.endsWith("\"") && 1 < content.length())
                        content = content.substring(1, content.length() - 1);
                    if(content.startsWith("'") && content.endsWith("'") && 1 < content.length())
                        content = content.substring(1, content.length() - 1);
                    ret = findCharset(content, ret);
                }
            }
        }
        return ret;
    }

    public static String findCharset(String name, String fallback)
    {
        String ret;
        try
        {
            Class cls = Class.forName("java.nio.charset.Charset");
            Method method = cls.getMethod("forName", new Class[] {
                java.lang.String.class
            });
            Object object = method.invoke(null, new Object[] {
                name
            });
            method = cls.getMethod("name", new Class[0]);
            object = method.invoke(object, new Object[0]);
            ret = (String)object;
        }
        catch(ClassNotFoundException cnfe)
        {
            ret = name;
        }
        catch(NoSuchMethodException nsme)
        {
            ret = name;
        }
        catch(IllegalAccessException ia)
        {
            ret = name;
        }
        catch(InvocationTargetException ita)
        {
            ret = fallback;
            System.out.println("unable to determine cannonical charset name for " + name + " - using " + fallback);
        }
        return ret;
    }

    private void writeObject(ObjectOutputStream out)
        throws IOException
    {
        if(null != getConnection())
        {
            out.writeBoolean(true);
            out.writeInt(mSource.offset());
            String href = getUrl();
            out.writeObject(href);
            setUrl(getConnection().getURL().toExternalForm());
            Source source = getSource();
            mSource = null;
            PageIndex index = mIndex;
            mIndex = null;
            out.defaultWriteObject();
            mSource = source;
            mIndex = index;
        } else
        {
            out.writeBoolean(false);
            String href = getUrl();
            out.writeObject(href);
            setUrl(null);
            out.defaultWriteObject();
            setUrl(href);
        }
    }

    private void readObject(ObjectInputStream in)
        throws IOException, ClassNotFoundException
    {
        boolean fromurl = in.readBoolean();
        if(fromurl)
        {
            int offset = in.readInt();
            String href = (String)in.readObject();
            in.defaultReadObject();
            if(null != getUrl())
            {
                URL url = new URL(getUrl());
                try
                {
                    setConnection(url.openConnection());
                }
                catch(ParserException pe)
                {
                    throw new IOException(pe.getMessage());
                }
            }
            Cursor cursor = new Cursor(this, 0);
            for(int i = 0; i < offset; i++)
                try
                {
                    getCharacter(cursor);
                }
                catch(ParserException pe)
                {
                    throw new IOException(pe.getMessage());
                }

            setUrl(href);
        } else
        {
            String href = (String)in.readObject();
            in.defaultReadObject();
            setUrl(href);
        }
    }

    public void reset()
    {
        getSource().reset();
        mIndex = new PageIndex(this);
    }

    public void close()
        throws IOException
    {
        if(null != getSource())
            getSource().destroy();
    }

    protected void finalize()
        throws Throwable
    {
        close();
    }

    public URLConnection getConnection()
    {
        return mConnection;
    }

    public void setConnection(URLConnection connection)
        throws ParserException
    {
        mConnection = connection;
        mConnection.setConnectTimeout(6000);
        mConnection.setReadTimeout(6000);
        try
        {
            getConnection().connect();
        }
        catch(UnknownHostException uhe)
        {
            throw new ParserException("Connect to " + mConnection.getURL().toExternalForm() + " failed.", uhe);
        }
        catch(IOException ioe)
        {
            throw new ParserException("Exception connecting to " + mConnection.getURL().toExternalForm() + " (" + ioe.getMessage() + ").", ioe);
        }
        String type = getContentType();
        String charset = getCharset(type);
        try
        {
            String contentEncoding = connection.getContentEncoding();
            System.out.println("contentEncoding="+contentEncoding);
            Stream stream;
            if(null != contentEncoding && -1 != contentEncoding.indexOf("gzip"))
                stream = new Stream(new GZIPInputStream(getConnection().getInputStream()));
            else
            if(null != contentEncoding && -1 != contentEncoding.indexOf("deflate"))
                stream = new Stream(new InflaterInputStream(getConnection().getInputStream(), new Inflater(true)));
            else{
                stream = new Stream(getConnection().getInputStream());
            }

            try
            {
		        /*
            	 * 时间:2010年8月6日
            	 * 原因:当String charset = getCharset(type);返回来的是ISO-8859-1的时候,需要处理一下
            	 */
            	if(charset.indexOf("ISO-8859-1")!=-1){
            		
            		charset =getGaoBinDEFAULT_CHARSET() ;
            		
            	}
		mSource = new InputStreamSource(stream, charset);
            }
            catch(UnsupportedEncodingException uee)
            {
                charset = "ISO-8859-1";
                mSource = new InputStreamSource(stream, charset);
            }
        }
        catch(IOException ioe)
        {
            throw new ParserException("Exception getting input stream from " + mConnection.getURL().toExternalForm() + " (" + ioe.getMessage() + ").", ioe);
        }
        mUrl = connection.getURL().toExternalForm();
		mIndex = new PageIndex(this);
    }

    public String getUrl()
    {
        return mUrl;
    }

    public void setUrl(String url)
    {
        mUrl = url;
    }

    public String getBaseUrl()
    {
        return mBaseUrl;
    }

    public void setBaseUrl(String url)
    {
        mBaseUrl = url;
    }

    public Source getSource()
    {
        return mSource;
    }

    public String getContentType()
    {
        String ret = "text/html";
        URLConnection connection = getConnection();
        if(null != connection)
        {
            String content = connection.getHeaderField("Content-Type");
            if(null != content)
                ret = content;
        }
        return ret;
    }

    public char getCharacter(Cursor cursor)
        throws ParserException
    {
        int i = cursor.getPosition();
        int offset = mSource.offset();
        char ret;
        if(offset == i)
            try
            {
                i = mSource.read();
                if(-1 == i)
                {
                    ret = '\uFFFF';
                } else
                {
                    ret = (char)i;
                    cursor.advance();
                }
            }
            catch(IOException ioe)
            {
                throw new ParserException("problem reading a character at position " + cursor.getPosition(), ioe);
            }
        else
        if(offset > i)
        {
            try
            {
                ret = mSource.getCharacter(i);
            }
            catch(IOException ioe)
            {
                throw new ParserException("can't read a character at position " + i, ioe);
            }
            cursor.advance();
        } else
        {
            throw new ParserException("attempt to read future characters from source " + i + " > " + mSource.offset());
        }
        if('\r' == ret)
        {
            ret = '\n';
            if(mSource.offset() == cursor.getPosition())
                try
                {
                    i = mSource.read();
                    if(-1 != i)
                        if('\n' == (char)i)
                            cursor.advance();
                        else
                            try
                            {
                                mSource.unread();
                            }
                            catch(IOException ioe)
                            {
                                throw new ParserException("can't unread a character at position " + cursor.getPosition(), ioe);
                            }
                }
                catch(IOException ioe)
                {
                    throw new ParserException("problem reading a character at position " + cursor.getPosition(), ioe);
                }
            else
                try
                {
                    if('\n' == mSource.getCharacter(cursor.getPosition()))
                        cursor.advance();
                }
                catch(IOException ioe)
                {
                    throw new ParserException("can't read a character at position " + cursor.getPosition(), ioe);
                }
        }
        if('\n' == ret)
            mIndex.add(cursor);
        return ret;
    }

    public void ungetCharacter(Cursor cursor)
        throws ParserException
    {
        cursor.retreat();
        int i = cursor.getPosition();
        try
        {
            char ch = mSource.getCharacter(i);
            if('\n' == ch && 0 != i)
            {
                ch = mSource.getCharacter(i - 1);
                if('\r' == ch)
                    cursor.retreat();
            }
        }
        catch(IOException ioe)
        {
            throw new ParserException("can't read a character at position " + cursor.getPosition(), ioe);
        }
    }

    public String getEncoding()
    {
        return getSource().getEncoding();
    }

    public void setEncoding(String character_set)
        throws ParserException
    {
    	Page.GaoBinDEFAULT_CHARSET = character_set;
        getSource().setEncoding(character_set);
    }

    public URL constructUrl(String link, String base)
        throws MalformedURLException
    {
        return constructUrl(link, base, false);
    }

    public URL constructUrl(String link, String base, boolean strict)
        throws MalformedURLException
    {
        int index;
        URL url;
        if(!strict && '?' == link.charAt(0))
        {
            if(-1 != (index = base.lastIndexOf('?')))
                base = base.substring(0, index);
            url = new URL(base + link);
        } else
        {
            url = new URL(new URL(base), link);
        }
        String path = url.getFile();
        boolean modified = false;
        boolean absolute = link.startsWith("/");
        if(!absolute)
            do
            {
                if(!path.startsWith("/."))
                    break;
                if(path.startsWith("/../"))
                {
                    path = path.substring(3);
                    modified = true;
                    continue;
                }
                if(!path.startsWith("/./") && !path.startsWith("/."))
                    break;
                path = path.substring(2);
                modified = true;
            } while(true);
        while(-1 != (index = path.indexOf("/\\"))) 
        {
            path = path.substring(0, index + 1) + path.substring(index + 2);
            modified = true;
        }
        if(modified)
            url = new URL(url, path);
        return url;
    }

    public String getAbsoluteURL(String link)
    {
        return getAbsoluteURL(link, false);
    }

    public String getAbsoluteURL(String link, boolean strict)
    {
        String ret;
        if(null == link || "".equals(link))
            ret = "";
        else
            try
            {
                String base = getBaseUrl();
                if(null == base)
                    base = getUrl();
                if(null == base)
                {
                    ret = link;
                } else
                {
                    URL url = constructUrl(link, base, strict);
                    ret = url.toExternalForm();
                }
            }
            catch(MalformedURLException murle)
            {
                ret = link;
            }
        return ret;
    }

    public int row(Cursor cursor)
    {
        return mIndex.row(cursor);
    }

    public int row(int position)
    {
        return mIndex.row(position);
    }

    public int column(Cursor cursor)
    {
        return mIndex.column(cursor);
    }

    public int column(int position)
    {
        return mIndex.column(position);
    }

    public String getText(int start, int end)
        throws IllegalArgumentException
    {
        String ret;
        try
        {
            ret = mSource.getString(start, end - start);
        }
        catch(IOException ioe)
        {
            throw new IllegalArgumentException("can't get the " + (end - start) + "characters at position " + start + " - " + ioe.getMessage());
        }
        return ret;
    }

    public void getText(StringBuffer buffer, int start, int end)
        throws IllegalArgumentException
    {
        if(mSource.offset() < start || mSource.offset() < end)
            throw new IllegalArgumentException("attempt to extract future characters from source" + start + "|" + end + " > " + mSource.offset());
        int length;
        if(end < start)
        {
            length = end;
            end = start;
            start = length;
        }
        length = end - start;
        try
        {
            mSource.getCharacters(buffer, start, length);
        }
        catch(IOException ioe)
        {
            throw new IllegalArgumentException("can't get the " + (end - start) + "characters at position " + start + " - " + ioe.getMessage());
        }
    }

    public String getText()
    {
        return getText(0, mSource.offset());
    }

    public void getText(StringBuffer buffer)
    {
        getText(buffer, 0, mSource.offset());
    }

    public void getText(char array[], int offset, int start, int end)
        throws IllegalArgumentException
    {
        if(mSource.offset() < start || mSource.offset() < end)
            throw new IllegalArgumentException("attempt to extract future characters from source");
        int length;
        if(end < start)
        {
            length = end;
            end = start;
            start = length;
        }
        length = end - start;
        try
        {
            mSource.getCharacters(array, offset, start, end);
        }
        catch(IOException ioe)
        {
            throw new IllegalArgumentException("can't get the " + (end - start) + "characters at position " + start + " - " + ioe.getMessage());
        }
    }

    public String getLine(Cursor cursor)
    {
        int line = row(cursor);
        int size = mIndex.size();
        int start;
        int end;
        if(line < size)
        {
            start = mIndex.elementAt(line);
            if(++line <= size)
                end = mIndex.elementAt(line);
            else
                end = mSource.offset();
        } else
        {
            start = mIndex.elementAt(line - 1);
            end = mSource.offset();
        }
        return getText(start, end);
    }

    public String getLine(int position)
    {
        return getLine(new Cursor(this, position));
    }

    public String toString()
    {
        String ret;
        if(mSource.offset() > 0)
        {
            StringBuffer buffer = new StringBuffer(43);
            int start = mSource.offset() - 40;
            if(0 > start)
                start = 0;
            else
                buffer.append("...");
            getText(buffer, start, mSource.offset());
            ret = buffer.toString();
        } else
        {
            ret = super.toString();
        }
        return ret;
    }

    public static final String DEFAULT_CHARSET = "ISO-8859-1";
    public static String GaoBinDEFAULT_CHARSET;
	public static final String DEFAULT_CONTENT_TYPE = "text/html";
    public static final char EOF = 65535;
    protected String mUrl;
    protected String mBaseUrl;
    protected Source mSource;
    protected PageIndex mIndex;
    protected transient URLConnection mConnection;
    protected static ConnectionManager mConnectionManager = new ConnectionManager();
    
    public static String getGaoBinDEFAULT_CHARSET() {
		return GaoBinDEFAULT_CHARSET;
	}

	public static void setGaoBinDEFAULT_CHARSET(String gaoBinDEFAULT_CHARSET) {
		GaoBinDEFAULT_CHARSET = gaoBinDEFAULT_CHARSET;
	}

}

package org.htmlparser.lexer;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UnsupportedEncodingException;

import org.htmlparser.util.EncodingChangeException;
import org.htmlparser.util.ParserException;

public class InputStreamSource
    extends
        Source
{
    /**
     * An initial buffer size.
     * Has a default value of {16384}.
     */
    public static int BUFFER_SIZE = 16384;

    /**
     * The stream of bytes.
     * Set to <code>null</code> when the source is closed.
     */
    protected transient InputStream mStream;

    /**
     * The character set in use.
     */
    protected String mEncoding;

    /**
     * The converter from bytes to characters.
     */
    protected transient InputStreamReader mReader;

    /**
     * The characters read so far.
     */
    protected char[] mBuffer;

    /**
     * The number of valid bytes in the buffer.
     */
    protected int mLevel;

    /**
     * The offset of the next byte returned by read().
     */
    protected int mOffset;
    /**
     * The bookmark.
     */
    protected int mMark;

    /**
     * Create a source of characters using the default character set.
     * @param stream The stream of bytes to use.
     * @exception UnsupportedEncodingException If the default character set
     * is unsupported.
     */
    public InputStreamSource (InputStream stream)
        throws
            UnsupportedEncodingException
    {
        this (stream, null, BUFFER_SIZE);
    }

    public InputStreamSource (InputStream stream, String charset)
        throws
            UnsupportedEncodingException
    {
        this (stream, charset, BUFFER_SIZE);
    }

    /**
     * Create a source of characters.
     * @param stream The stream of bytes to use.
     * @param charset The character set used in encoding the stream.
     * @param size The initial character buffer size.
     * @exception UnsupportedEncodingException If the character set
     * is unsupported.
     */
    public InputStreamSource (InputStream stream, String charset, int size)
        throws
            UnsupportedEncodingException
    {
        if (null == stream)
            stream = new Stream (null);
        else
            // bug #1044707 mark()/reset() issues
            if (!stream.markSupported ())
                // wrap the stream so we can reset
                stream = new Stream (stream);
        mStream = stream;
        if (null == charset)
        {
            mReader = new InputStreamReader (stream);
            mEncoding = mReader.getEncoding ();
        }
        else
        {
            mEncoding = charset;
            mReader = new InputStreamReader (stream, charset);
        }
        mBuffer = new char[size];
        mLevel = 0;
        mOffset = 0;
        mMark = -1;
    }
  /**
     * Serialization support.
     * @param out Where to write this object.
     * @exception IOException If serialization has a problem.
     */
    private void writeObject (ObjectOutputStream out)
        throws
            IOException
    {
        int offset;
        char[] buffer;

        if (null != mStream)
        {
            // remember the offset, drain the input stream, restore the offset
            offset = mOffset;
            buffer = new char[4096];
            while (EOF != read (buffer))
                ;
            mOffset = offset;
        }

        out.defaultWriteObject ();
    }

    /**
     * Deserialization support.
     * @param in Where to read this object from.
     * @exception IOException If deserialization has a problem.
     */
    private void readObject (ObjectInputStream in)
        throws
            IOException,
            ClassNotFoundException
    {
        in.defaultReadObject ();
        if (null != mBuffer) // buffer is null when destroy's been called
            // pretend we're open, mStream goes null when exhausted
            mStream = new ByteArrayInputStream (new byte[0]);
    }

    /**
     * Get the input stream being used.
     * @return The current input stream.
     */
    public InputStream getStream ()
    {
        return (mStream);
    }

    /**
     * Get the encoding being used to convert characters.
     * @return The current encoding.
     */
    public String getEncoding ()
    {
        return (mEncoding);
    }

    /**
     * Begins reading from the source with the given character set.
     * If the current encoding is the same as the requested encoding,
     * this method is a no-op. Otherwise any subsequent characters read from
     * this page will have been decoded using the given character set.<p>
     * Some magic happens here to obtain this result if characters have already
     * been consumed from this source.
     * Since a Reader cannot be dynamically altered to use a different character
     * set, the underlying stream is reset, a new Source is constructed
     * and a comparison made of the characters read so far with the newly
     * read characters up to the current position.
     * If a difference is encountered, or some other problem occurs,
     * an exception is thrown.
     * @param character_set The character set to use to convert bytes into
     * characters.
     * @exception ParserException If a character mismatch occurs between
     * characters already provided and those that would have been returned
     * had the new character set been in effect from the beginning. An
     * exception is also thrown if the underlying stream won't put up with
     * these shenanigans.
     */
    public void setEncoding (String character_set)
        throws
            ParserException
    {   
        String encoding;
        InputStream stream;
        char[] buffer;
        int offset;
        char[] new_chars;

        encoding = getEncoding ();
        if(encoding!=null){
        	character_set=encoding;
        }
        if (!encoding.equalsIgnoreCase (character_set))
        {
            stream = getStream ();
            try
            {
                buffer = mBuffer;
                offset = mOffset;
                stream.reset ();
                try
                {
                    mEncoding = character_set;
                    mReader = new InputStreamReader (stream, character_set);
                    mBuffer = new char[mBuffer.length];
                    mLevel = 0;
                    mOffset = 0;
                    mMark = -1;
                    if (0 != offset)
                    {
                        new_chars = new char[offset];
                        if (offset != read (new_chars))
                            throw new ParserException ("reset stream failed");
                        for (int i = 0; i < offset; i++)
                            if (new_chars[i] != buffer[i])
                                throw new EncodingChangeException ("character mismatch (new: "
                                + new_chars[i]
                                + " [0x"
                                + Integer.toString (new_chars[i], 16)
                                + "] != old: "
                                + " [0x"
                                + Integer.toString (buffer[i], 16)
                                + buffer[i]
                                + "]) for encoding change from "
                                + encoding
                                + " to "
                                + character_set
                                + " at character offset "
                                + i);
                    }
                }
                catch (IOException ioe)
                {
                    throw new ParserException (ioe.getMessage (), ioe);
                }
            }
            catch (IOException ioe)
            {   // bug #1044707 mark()/reset() issues
                throw new ParserException ("Stream reset failed ("
                    + ioe.getMessage ()
                    + "), try wrapping it with a org.htmlparser.lexer.Stream",
                    ioe);
            }
        }
    }

    /**
     * Fetch more characters from the underlying reader.
     * Has no effect if the underlying reader has been drained.
     * @param min The minimum to read.
     * @exception IOException If the underlying reader read() throws one.
     */
    protected void fill (int min)
        throws
            IOException
    {
        char[] buffer;
        int size;
        int read;

        if (null != mReader) // mReader goes null when it's been sucked dry
        {
            size = mBuffer.length - mLevel; // available space
            if (size < min) // oops, better get some buffer space
            {
                // unknown length... keep doubling
                size = mBuffer.length * 2;
                read = mLevel + min;
                if (size < read) // or satisfy min, whichever is greater
                    size = read;
                else
                    min = size - mLevel; // read the max
                buffer = new char[size];
            }
            else
            {
                buffer = mBuffer;
                min = size;
            }

            // read into the end of the 'new' buffer
            read = mReader.read (buffer, mLevel, min);
            if (EOF == read)
            {
                mReader.close ();
                mReader = null;
            }
            else
            {
                if (mBuffer != buffer)
                {   // copy the bytes previously read
                    System.arraycopy (mBuffer, 0, buffer, 0, mLevel);
                    mBuffer = buffer;
                }
                mLevel += read;
            }
            // todo, should repeat on read shorter than original min
        }
    }
    /**
     * Does nothing.
     * It's supposed to close the source, but use destroy() instead.
     * @exception IOException <em>not used</em>
     * @see #destroy
     */
    public void close () throws IOException
    {
    }

    /**
     * Read a single character.
     * This method will block until a character is available,
     * an I/O error occurs, or the end of the stream is reached.
     * @return The character read, as an integer in the range 0 to 65535
     * (<tt>0x00-0xffff</tt>), or {@link #EOF EOF} if the end of the stream has
     * been reached
     * @exception IOException If an I/O error occurs.
     */
    public int read () throws IOException
    {
        int ret;

        if (mLevel - mOffset < 1)
        {
            if (null == mStream)
                throw new IOException ("source is closed");
            fill (1);
            if (mOffset >= mLevel)
                ret = EOF;
            else
                ret = mBuffer[mOffset++];
        }
        else
            ret = mBuffer[mOffset++];

        return (ret);
    }

    /**
     * Read characters into a portion of an array.  This method will block
     * until some input is available, an I/O error occurs, or the end of the
     * stream is reached.
     * @param cbuf Destination buffer
     * @param off Offset at which to start storing characters
     * @param len Maximum number of characters to read
     * @return The number of characters read, or {@link #EOF EOF} if the end of
     * the stream has been reached
     * @exception IOException If an I/O error occurs.
     */
    public int read (char[] cbuf, int off, int len) throws IOException
    {
        int ret;

        if (null == mStream)
            throw new IOException ("source is closed");
        if ((null == cbuf) || (0 > off) || (0 > len))
            throw new IOException ("illegal argument read ("
                + ((null == cbuf) ? "null" : "cbuf")
                + ", " + off + ", " + len + ")");
        if (mLevel - mOffset < len)
            fill (len - (mLevel - mOffset)); // minimum to satisfy this request
        if (mOffset >= mLevel)
            ret = EOF;
        else
        {
            ret = Math.min (mLevel - mOffset, len);
            System.arraycopy (mBuffer, mOffset, cbuf, off, ret);
            mOffset += ret;
        }

        return (ret);
    }

    /**
     * Read characters into an array.
     * This method will block until some input is available, an I/O error occurs,
     * or the end of the stream is reached.
     * @param cbuf Destination buffer.
     * @return The number of characters read, or {@link #EOF EOF} if the end of
     * the stream has been reached.
     * @exception IOException If an I/O error occurs.
     */
    public int read (char[] cbuf) throws IOException
    {
        return (read (cbuf, 0, cbuf.length));
    }

    /**
     * Reset the source.
     * Repositions the read point to begin at zero.
     * @exception IllegalStateException If the source has been closed.
     */
    public void reset ()
        throws
            IllegalStateException
    {
        if (null == mStream)
            throw new IllegalStateException ("source is closed");
        if (-1 != mMark)
            mOffset = mMark;
        else
            mOffset = 0;
    }

    /**
     * Tell whether this source supports the mark() operation.
     * @return <code>true</code>.
     */
    public boolean markSupported ()
    {
        return (true);
    }

    /**
     * Mark the present position in the source.
     * Subsequent calls to {@link #reset()}
     * will attempt to reposition the source to this point.
     * @param  readAheadLimit <em>Not used.</em>
     * @exception IOException If the source is closed.
     *
     */
    public void mark (int readAheadLimit) throws IOException
    {
        if (null == mStream)
            throw new IOException ("source is closed");
        mMark = mOffset;
    }

    /**
     * Tell whether this source is ready to be read.
     * @return <code>true</code> if the next read() is guaranteed not to block
     * for input, <code>false</code> otherwise.
     * Note that returning false does not guarantee that the next read will block.
     * @exception IOException If the source is closed.
     */
    public boolean ready () throws IOException
    {
        if (null == mStream)
            throw new IOException ("source is closed");
        return (mOffset < mLevel);
    }

    /**
     * Skip characters.
     * This method will block until some characters are available,
     * an I/O error occurs, or the end of the stream is reached.
     * <em>Note: n is treated as an int</em>
     * @param n The number of characters to skip.
     * @return The number of characters actually skipped
     * @exception IllegalArgumentException If <code>n</code> is negative.
     * @exception IOException If an I/O error occurs.
     */
    public long skip (long n)
        throws
            IOException,
            IllegalArgumentException
    {
        long ret;

        if (null == mStream)
            throw new IOException ("source is closed");
        if (0 > n)
            throw new IllegalArgumentException ("cannot skip backwards");
        else
        {
            if (mLevel - mOffset < n)
                fill ((int)(n - (mLevel - mOffset))); // minimum to satisfy this request
            if (mOffset >= mLevel)
                ret = EOF;
            else
            {
                ret = Math.min (mLevel - mOffset, n);
                mOffset += ret;
            }
        }

        return (ret);
    }
   /**
     * Undo the read of a single character.
     * @exception IOException If the source is closed or no characters have
     * been read.
     */
    public void unread () throws IOException
    {
        if (null == mStream)
            throw new IOException ("source is closed");
        if (0 < mOffset)
            mOffset--;
        else
            throw new IOException ("can't unread no characters");
    }

    /**
     * Retrieve a character again.
     * @param offset The offset of the character.
     * @return The character at <code>offset</code>.
     * @exception IOException If the offset is beyond {@link #offset()} or the
     * source is closed.
     */
    public char getCharacter (int offset) throws IOException
    {
        char ret;

        if (null == mStream)
            throw new IOException ("source is closed");
        if (offset >= mBuffer.length)
            throw new IOException ("illegal read ahead");
        else
            ret = mBuffer[offset];
        
        return (ret);
    }

    /**
     * Retrieve characters again.
     * @param array The array of characters.
     * @param offset The starting position in the array where characters are to be placed.
     * @param start The starting position, zero based.
     * @param end The ending position
     * (exclusive, i.e. the character at the ending position is not included),
     * zero based.
     * @exception IOException If the start or end is beyond {@link #offset()}
     * or the source is closed.
     */
    public void getCharacters (char[] array, int offset, int start, int end) throws IOException
    {
        if (null == mStream)
            throw new IOException ("source is closed");
        System.arraycopy (mBuffer, start, array, offset, end - start);
    }
    
    /**
     * Retrieve a string.
     * @param offset The offset of the first character.
     * @param length The number of characters to retrieve.
     * @return A string containing the <code>length</code> characters at <code>offset</code>.
     * @exception IOException If the offset or (offset + length) is beyond
     * {@link #offset()} or the source is closed.
     */
    public String getString (int offset, int length) throws IOException
    {
        String ret;

        if (null == mStream)
            throw new IOException ("source is closed");
        if (offset + length > mBuffer.length)
            throw new IOException ("illegal read ahead");
        else
            ret = new String (mBuffer, offset, length);
        
        return (ret);
    }

    /**
     * Append characters already read into a <code>StringBuffer</code>.
     * @param buffer The buffer to append to.
     * @param offset The offset of the first character.
     * @param length The number of characters to retrieve.
     * @exception IOException If the offset or (offset + length) is beyond
     * {@link #offset()} or the source is closed.
     */
    public void getCharacters (StringBuffer buffer, int offset, int length) throws IOException
    {
        if (null == mStream)
            throw new IOException ("source is closed");
        buffer.append (mBuffer, offset, length);
    }

    /**
     * Close the source.
     * Once a source has been closed, further {@link #read() read},
     * {@link #ready ready}, {@link #mark mark}, {@link #reset reset},
     * {@link #skip skip}, {@link #unread unread},
     * {@link #getCharacter getCharacter} or {@link #getString getString}
     * invocations will throw an IOException.
     * Closing a previously-closed source, however, has no effect.
     * @exception IOException If an I/O error occurs
     */
    public void destroy () throws IOException
    {
        mStream = null;
        if (null != mReader)
            mReader.close ();
        mReader = null;
        mBuffer = null;
        mLevel = 0;
        mOffset = 0;
        mMark = -1;
    }

    /**
     * Get the position (in characters).
     * @return The number of characters that have already been read, or
     * {@link #EOF EOF} if the source is closed.
     */
    public int offset ()
    {
        int ret;

        if (null == mStream)
            ret = EOF;
        else
            ret = mOffset;

        return (ret);
    }

    /**
     * Get the number of available characters.
     * @return The number of characters that can be read without blocking or
     * zero if the source is closed.
     */
    public int available ()
    {
        int ret;

        if (null == mStream)
            ret = 0;
        else
            ret = mLevel - mOffset;

        return (ret);
    }
}

你可能感兴趣的:(Blog,UP)

机器学习与深度学习间关系与区别 ℒℴѵℯ心·动ꦿ໊ོ꫞ 人工智能学习深度学习 python
一、机器学习概述定义机器学习（MachineLearning,ML）是一种通过数据驱动的方法，利用统计学和计算算法来训练模型，使计算机能够从数据中学习并自动进行预测或决策。机器学习通过分析大量数据样本，识别其中的模式和规律，从而对新的数据进行判断。其核心在于通过训练过程，让模型不断优化和提升其预测准确性。主要类型1.监督学习（SupervisedLearning）监督学习是指在训练数据集中包含输入
如何在 Fork 的 GitHub 项目中保留自己的修改并同步上游更新？github_fork_update iBaoxing github
如何在Fork的GitHub项目中保留自己的修改并同步上游更新？在GitHub上Fork了一个项目后，你可能会对项目进行一些修改，同时原作者也在不断更新。如果想要在保留自己修改的基础上，同步原作者的最新更新，很多人会不知所措。本文将详细讲解如何在不丢失自己改动的情况下，将上游仓库的更新合并到自己的仓库中。问题描述假设你在GitHub上Fork了一个项目，并基于该项目做了一些修改，随后你发现原作者对
【无标题】达瓦达瓦 JhonKI 考研
博客主页：https://blog.csdn.net/2301_779549673欢迎点赞收藏⭐留言如有错误敬请指正！本文由JohnKi原创，首发于CSDN未来很长，值得我们全力奔赴更美好的生活✨文章目录前言111️‍111❤️111111111111111总结111前言111骗骗流量券，嘿嘿111111111111111111111111111️‍111❤️111111111111111总结11
上图为是否色发 JhonKI 考研
博客主页：https://blog.csdn.net/2301_779549673欢迎点赞收藏⭐留言如有错误敬请指正！本文由JohnKi原创，首发于CSDN未来很长，值得我们全力奔赴更美好的生活✨文章目录前言111️‍111❤️111111111111111总结111前言111骗骗流量券，嘿嘿111111111111111111111111111️‍111❤️111111111111111总结11
每日一题——第八十八题互联网打工人no1 C语言程序设计每日一练 c语言
题目：输入一个9位的无符号整数，判断其是否有重复数字#include#include#includeintmain(){charnum_str[10];printf("请输入一个9位数的无符号数：");scanf_s("%9d",&num_str);if(strlen(num_str)!=9){printf("输入的不是一个9位无符号整数，请重新输入");}else{if(hasDuplicate
143234234123432 JhonKI 考研
博客主页：https://blog.csdn.net/2301_779549673欢迎点赞收藏⭐留言如有错误敬请指正！本文由JohnKi原创，首发于CSDN未来很长，值得我们全力奔赴更美好的生活✨文章目录前言111️‍111❤️111111111111111总结111前言111骗骗流量券，嘿嘿111111111111111111111111111️‍111❤️111111111111111总结11
SpringBlade dict-biz/list 接口 SQL 注入漏洞文章永久免费只为良心 oracle 数据库
SpringBladedict-biz/list接口SQL注入漏洞POC:构造请求包查看返回包你的网址/api/blade-system/dict-biz/list?updatexml(1,concat(0x7e,md5(1),0x7e),1)=1漏洞概述在SpringBlade框架中，如果dict-biz/list接口的后台处理逻辑没有正确地对用户输入进行过滤或参数化查询（PreparedSta
Python中深拷贝与浅拷贝的区别 yuxiaoyu.
转自：http://blog.csdn.net/u014745194/article/details/70271868定义：在Python中对象的赋值其实就是对象的引用。当创建一个对象，把它赋值给另一个变量的时候，python并没有拷贝这个对象，只是拷贝了这个对象的引用而已。浅拷贝：拷贝了最外围的对象本身，内部的元素都只是拷贝了一个引用而已。也就是，把对象复制一遍，但是该对象中引用的其他对象我不复
ExpRe[25] bash外的其它shell：zsh和fish tritone ExpRe bash linux ubuntu shell
文章目录zsh基础配置实用特性插件`autojump`语法高亮自动补全fish优点缺点时效性本篇撰写时间为2021.12.15，由于计算机技术日新月异，博客中所有内容都有时效和版本限制，具体做法不一定总行得通，链接可能改动失效，各种软件的用法可能有修改。但是其中透露的思想往往是值得学习的。本篇前置：ExpRe[10]Ubuntu[2]准备神秘软件、备份恢复软件https://www.cnblogs
Python实现简单的机器学习算法 master_chenchengg python python 办公效率 python开发 IT
Python实现简单的机器学习算法开篇：初探机器学习的奇妙之旅搭建环境：一切从安装开始必备工具箱第一步：安装Anaconda和JupyterNotebook小贴士：如何配置Python环境变量算法初体验：从零开始的Python机器学习线性回归：让数据说话数据准备：从哪里找数据编码实战：Python实现线性回归模型评估：如何判断模型好坏逻辑回归：从分类开始理论入门：什么是逻辑回归代码实现：使用skl
2024.9.14 Python，差分法解决区间加法，消除游戏，压缩字符串 RaidenQ python 游戏开发语言算法力扣
1.区间加法假设你有一个长度为n的数组，初始情况下所有的数字均为0，你将会被给出k个更新的操作。其中，每个操作会被表示为一个三元组：[startIndex,endIndex,inc]，你需要将子数组A[startIndex…endIndex]（包括startIndex和endIndex）增加inc。请你返回k次操作后的数组。示例:输入:length=5,updates=[[1,3,2],[2,4,
matlab mle 优化,MLE+: Matlab Toolbox for Integrated Modeling, Control and Optimization for Buildings... Simon Zhong matlab mle 优化
摘要：FollowingunilateralopticnervesectioninadultPVGhoodedrat,theaxonguidancecueephrin-A2isup-regulatedincaudalbutnotrostralsuperiorcolliculus(SC)andtheEphA5receptorisdown-regulatedinaxotomisedretinalgan
Vue中table合并单元格用法 weixin_30613343 javascript ViewUI
地名结果人名性别{{item.name}}已完成未完成{{item.groups[0].name}}{{item.groups[0].sex}}{{item.groups[son].name}}{{item.groups[son].sex}}exportdefault{data(){return{list:[{name:'地名1',result:'1',groups:[{name:'张三',sex
SpringCloudAlibaba—Sentinel(限流) 菜鸟爪哇
前言：自己在学习过程的记录，借鉴别人文章，记录自己实现的步骤。借鉴文章：https://blog.csdn.net/u014494148/article/details/105484410Sentinel介绍Sentinel诞生于阿里巴巴，其主要目标是流量控制和服务熔断。Sentinel是通过限制并发线程的数量（即信号隔离）来减少不稳定资源的影响，而不是使用线程池，省去了线程切换的性能开销。当资源
光盘文件系统 (iso9660) 格式解析穷人小水滴光盘文件系统 iso9660 deno GNU/Linux javascript
越简单的系统,越可靠,越不容易出问题.光盘文件系统(iso9660)十分简单,只需不到200行代码,即可实现定位读取其中的文件.参考资料:https://wiki.osdev.org/ISO_9660相关文章:《光盘防水嘛?DVD+R刻录光盘泡水实验》https://blog.csdn.net/secext2022/article/details/140583910《光驱的内部结构及日常使用》ht
springboot+vue项目实战一-创建SpringBoot简单项目苹果酱0567 面试题汇总与解析 spring boot 后端 java 中间件开发语言
这段时间抽空给女朋友搭建一个个人博客，想着记录一下建站的过程，就当做笔记吧。虽然复制zjblog只要一个小时就可以搞定一个网站，或者用cms系统，三四个小时就可以做出一个前后台都有的网站，而且想做成啥样也都行。但是就是要从新做，自己做的意义不一样，更何况，俺就是专门干这个的，嘿嘿嘿要做一个网站，而且从零开始，首先呢就是技术选型了，经过一番思量决定选择-SpringBoot做后端，前端使用Vue做一
科幻游戏《外卖员模拟器》主要地理环境设定 (1) 穷人小水滴游戏科幻设计
游戏名称:《外卖员模拟器》(英文名称:waimai_se)作者:穷人小水滴本故事纯属虚构,如有雷同实属巧合.故事发生在一个(架空)平行宇宙的地球,21世纪(超低空科幻流派).相关文章:https://blog.csdn.net/secext2022/article/details/141790630目录1星球整体地理设定2巨蛇国主要设定3海蛇市主要设定3.1主要地标建筑3.2交通3.3能源(电力)
C++ lambda闭包消除类成员变量 barbyQAQ c++c++java 算法
原文链接：https://blog.csdn.net/qq_51470638/article/details/142151502一、背景在面向对象编程时，常常要添加类成员变量。然而类成员一旦多了之后，也会带来干扰。拿到一个类，一看成员变量好几十个，就问你怕不怕？二、解决思路可以借助函数式编程思想，来消除一些不必要的类成员变量。三、实例举个例子：classClassA{public:...intfu
tiff批量转png 诺有缸的高飞鸟 opencv 图像处理 python opencv 图像处理
目录写在前面代码完写在前面1、本文内容tiff批量转png2、平台/环境opencv,python3、转载请注明出处：https://blog.csdn.net/qq_41102371/article/details/132975023代码importnumpyasnpimportcv2importosdeffindAllFile(base):file_list=[]forroot,ds,fsin
博客网站制作教程 2401_85194651 java maven
首先就是技术框架：后端：Java+SpringBoot数据库：MySQL前端：Vue.js数据库连接：JPA(JavaPersistenceAPI)1.项目结构blog-app/├──backend/│├──src/main/java/com/example/blogapp/││├──BlogApplication.java││├──config/│││└──DatabaseConfig.java
00. 这里整理了最全的爬虫框架（Java + Python）有一只柴犬爬虫系列爬虫 java python
目录1、前言2、什么是网络爬虫3、常见的爬虫框架3.1、java框架3.1.1、WebMagic3.1.2、Jsoup3.1.3、HttpClient3.1.4、Crawler4j3.1.5、HtmlUnit3.1.6、Selenium3.2、Python框架3.2.1、Scrapy3.2.2、BeautifulSoup+Requests3.2.3、Selenium3.2.4、PyQuery3.2
详解：如何设计出健壮的秒杀系统？夜空_2cd3
作者：Yrion博客园：cnblogs.com/wyq178/p/11261711.html前言：秒杀系统相信很多人见过，比如京东或者淘宝的秒杀，小米手机的秒杀。那么秒杀系统的后台是如何实现的呢？我们如何设计一个秒杀系统呢？对于秒杀系统应该考虑哪些问题？如何设计出健壮的秒杀系统？本期我们就来探讨一下这个问题：image目录一：****秒杀系统应该考虑的问题二：****秒杀系统的设计和技术方案三：*
【树一线性代数】005入门 Owlet_woodBird 算法
Index本文稍后补全，推荐阅读：https://blog.csdn.net/weixin_60702024/article/details/141874376分析实现总结本文稍后补全，推荐阅读：https://blog.csdn.net/weixin_60702024/article/details/141874376已知非空二叉树T的结点值均为正整数，采用顺序存储方式保存，数据结构定义如下:t
Ubuntu18.04 Docker部署Kinship(Django)项目过程 Dante617
1Docker的安装https://blog.csdn.net/weixin_41735055/article/details/1003551792下载镜像dockerpullprogramize/python3.6.8-dlib下载的镜像里包含python3.6.8和dlib19.17.03启动镜像dockerrun-it--namekinship-p7777:80-p3307:3306-p55
leetcode中等.数组(21-40)python 九日火 python leetcode
80.RemoveDuplicatesfromSortedArrayII(m-21)Givenasortedarraynums,removetheduplicatesin-placesuchthatduplicatesappearedatmosttwiceandreturnthenewlength.Donotallocateextraspaceforanotherarray,youmustdoth
使用datepicker和uploadify的冲突解决（IE双击才能打开附件上传对话框） zhanglb12
在开发的过程当中，IE的兼容无疑是我们的一块绊脚石，在我们使用的如期的datepicker插件和使用上传附件的uploadify插件的时候，两者就产生冲突，只要点击过时间的插件，uploadify上传框要双才能打开ie浏览器提示错误Missinginstancedataforthisdatepicker解决方案//if(.browser.msie&&'9.0'===.browser.version
[Swift]LeetCode943. 最短超级串 | Find the Shortest Superstring 黄小二哥 swift
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★➤微信公众号：山青咏芝（shanqingyongzhi）➤博客园地址：山青咏芝（https://www.cnblogs.com/strengthen/）➤GitHub地址：https://github.com/strengthen/LeetCode➤原文地址：https://www.cnblogs.com/streng
[Swift]LeetCode767. 重构字符串 | Reorganize String weixin_30591551 swift runtime
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★➤微信公众号：山青咏芝（shanqingyongzhi）➤博客园地址：山青咏芝（https://www.cnblogs.com/strengthen/）➤GitHub地址：https://github.com/strengthen/LeetCode➤原文地址：https://www.cnblogs.com/streng
前端代码上传文件余生逆风飞翔前端 javascript 开发语言
点击上传文件import{ElNotification}from'element-plus'import{API_CONFIG}from'../config/index.js'import{UploadFilled}from'@element-plus/icons-vue'import{reactive}from'vue'import{BASE_URL}from'../config/index'i
Dockerfile命令详解之 FROM 清风怎不知意容器化 java 前端 javascript
许多同学不知道Dockerfile应该如何写，不清楚Dockerfile中的指令分别有什么意义，能达到什么样的目的，接下来我将在容器化专栏中详细的为大家解释每一个指令的含义以及用法。专栏订阅传送门https://blog.csdn.net/qq_38220908/category_11989778.html指令不区分大小写。但是，按照惯例，它们应该是大写的，以便更容易地将它们与参数区分开来。(引用
java线程Thread和Runnable区别和联系 zx_code java jvm thread 多线程 Runnable
我们都晓得java实现线程2种方式，一个是继承Thread，另一个是实现Runnable。模拟窗口买票，第一例子继承thread，代码如下 package thread; public class ThreadTest { public static void main(String[] args) { Thread1 t1 = new Thread1(
【转】JSON与XML的区别比较丁_新 json xml
1.定义介绍 (1).XML定义扩展标记语言 (Extensible Markup Language, XML) ，用于标记电子文件使其具有结构性的标记语言，可以用来标记数据、定义数据类型，是一种允许用户对自己的标记语言进行定义的源语言。 XML使用DTD(document type definition)文档类型定义来组织数据;格式统一，跨平台和语言，早已成为业界公认的标准。 XML是标
c++ 实现五种基础的排序算法 CrazyMizzz C++c 算法
#include<iostream> using namespace std; //辅助函数，交换两数之值 template<class T> void mySwap(T &x, T &y){ T temp = x; x = y; y = temp; } const int size = 10; //一、用直接插入排
我的软件麦田的设计者我的软件音乐类娱乐放松
这是我写的一款app软件，耗时三个月，是一个根据央视节目开门大吉改变的，提供音调，猜歌曲名。1、手机拥有者在android手机市场下载本APP，同意权限，安装到手机上。2、游客初次进入时会有引导页面提醒用户注册。（同时软件自动播放背景音乐）。3、用户登录到主页后，会有五个模块。a、点击不胫而走，用户得到开门大吉首页部分新闻，点击进入有新闻详情。b、
linux awk命令详解被触发 linux awk
awk是行处理器: 相比较屏幕处理的优点，在处理庞大文件时不会出现内存溢出或是处理缓慢的问题，通常用来格式化文本信息 awk处理过程: 依次对每一行进行处理，然后输出 awk命令形式: awk [-F|-f|-v] ‘BEGIN{} //{command1; command2} END{}’ file [-F|-f|-v]大参数，-F指定分隔符，-f调用脚本，-v定义变量 var=val
各种语言比较 _wy_ 编程语言
Java Ruby PHP 擅长领域
oracle 中数据类型为clob的编辑知了ing oracle clob
public void updateKpiStatus(String kpiStatus,String taskId){ Connection dbc=null; Statement stmt=null; PreparedStatement ps=null; try { dbc = new DBConn().getNewConnection(); //stmt = db
分布式服务框架 Zookeeper -- 管理分布式环境中的数据矮蛋蛋 zookeeper
原文地址： http://www.ibm.com/developerworks/cn/opensource/os-cn-zookeeper/ 安装和配置详解本文介绍的 Zookeeper 是以 3.2.2 这个稳定版本为基础，最新的版本可以通过官网 http://hadoop.apache.org/zookeeper/来获取，Zookeeper 的安装非常简单，下面将从单机模式和集群模式两
tomcat数据源 alafqq tomcat
数据库 JNDI(Java Naming and Directory Interface，Java命名和目录接口)是一组在Java应用中访问命名和目录服务的API。没有使用JNDI时我用要这样连接数据库： 03. Class.forName("com.mysql.jdbc.Driver"); 04. conn
遍历的方法百合不是茶遍历
遍历在java的泛
linux查看硬件信息的命令 bijian1013 linux
linux查看硬件信息的命令一.查看CPU： cat /proc/cpuinfo 二.查看内存： free 三.查看硬盘： df linux下查看硬件信息 1、lspci 列出所有PCI 设备； lspci - list all PCI devices:列出机器中的PCI设备（声卡、显卡、Modem、网卡、USB、主板集成设备也能
java常见的ClassNotFoundException bijian1013 java
1.java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory 添加包common-logging.jar2.java.lang.ClassNotFoundException: javax.transaction.Synchronization
【Gson五】日期对象的序列化和反序列化 bit1129 反序列化
对日期类型的数据进行序列化和反序列化时，需要考虑如下问题： 1. 序列化时，Date对象序列化的字符串日期格式如何 2. 反序列化时，把日期字符串序列化为Date对象，也需要考虑日期格式问题 3. Date A -> str -> Date B,A和B对象是否equals 默认序列化和反序列化 import com
【Spark八十六】Spark Streaming之DStream vs. InputDStream bit1129 Stream
1. DStream的类说明文档： /** * A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous * sequence of RDDs (of the same type) representing a continuous st
通过nginx获取header信息 ronin47 nginx header
1. 提取整个的Cookies内容到一个变量，然后可以在需要时引用，比如记录到日志里面， if ( $http_cookie ~* "(.*)$") { set $all_cookie $1; } 变量$all_cookie就获得了cookie的值，可以用于运算了
java-65.输入数字n，按顺序输出从1最大的n位10进制数。比如输入3，则输出1、2、3一直到最大的3位数即999 bylijinnan java
参考了网上的http://blog.csdn.net/peasking_dd/article/details/6342984 写了个java版的： public class Print_1_To_NDigit { /** * Q65.输入数字n，按顺序输出从1最大的n位10进制数。比如输入3，则输出1、2、3一直到最大的3位数即999 * 1.使用字符串
Netty源码学习-ReplayingDecoder bylijinnan java netty
ReplayingDecoder是FrameDecoder的子类，不熟悉FrameDecoder的，可以先看看 http://bylijinnan.iteye.com/blog/1982618 API说，ReplayingDecoder简化了操作，比如： FrameDecoder在decode时，需要判断数据是否接收完全： public class IntegerH
js特殊字符过滤 cngolon js特殊字符 js特殊字符过滤
1.js中用正则表达式过滤特殊字符, 校验所有输入域是否含有特殊符号function stripscript(s) { var pattern = new RegExp("[`~!@#$^&*()=|{}':;',\\[\\].<>/?~！@#￥……&*（）——|{}【】‘；：”“'。，、？]"
hibernate使用sql查询 ctrain Hibernate
import java.util.Iterator; import java.util.List; import java.util.Map; import org.hibernate.Hibernate; import org.hibernate.SQLQuery; import org.hibernate.Session; import org.hibernate.Transa
linux shell脚本中切换用户执行命令方法 daizj linux shell 命令切换用户
经常在写shell脚本时，会碰到要以另外一个用户来执行相关命令，其方法简单记下： 1、执行单个命令：su - user -c "command" 如：下面命令是以test用户在/data目录下创建test123目录 [root@slave19 /data]# su - test -c "mkdir /data/test123"
好的代码里只要一个 return 语句 dcj3sjt126com return
别再这样写了：public boolean foo() { if (true) { return true; } else { return false;
Android动画效果学习 dcj3sjt126com android
1、透明动画效果方法一：代码实现 public View onCreateView(LayoutInflater inflater, ViewGroup container, Bundle savedInstanceState) { View rootView = inflater.inflate(R.layout.fragment_main, container, fals
linux复习笔记之bash shell (4)管道命令 eksliang linux管道命令汇总 linux管道命令 linux常用管道命令
转载请出自出处： http://eksliang.iteye.com/blog/2105461 bash命令执行的完毕以后，通常这个命令都会有返回结果，怎么对这个返回的结果做一些操作呢？那就得用管道命令‘|’。上面那段话，简单说了下管道命令的作用，那什么事管道命令呢？答：非常的经典的一句话，记住了，何为管
Android系统中自定义按键的短按、双击、长按事件 gqdy365 android
在项目中碰到这样的问题：由于系统中的按键在底层做了重新定义或者新增了按键，此时需要在APP层对按键事件（keyevent）做分解处理，模拟Android系统做法，把keyevent分解成： 1、单击事件：就是普通key的单击； 2、双击事件：500ms内同一按键单击两次； 3、长按事件：同一按键长按超过1000ms（系统中长按事件为500ms）； 4、组合按键：两个以上按键同时按住；
asp.net获取站点根目录下子目录的名称 hvt .net C#asp.net hovertree Web Forms
使用Visual Studio建立一个.aspx文件(Web Forms)，例如hovertree.aspx,在页面上加入一个ListBox代码如下： <asp:ListBox runat="server" ID="lbKeleyiFolder" /> 那么在页面上显示根目录子文件夹的代码如下： string[] m_sub
Eclipse程序员要掌握的常用快捷键 justjavac java eclipse 快捷键 ide
判断一个人的编程水平，就看他用键盘多，还是鼠标多。用键盘一是为了输入代码（当然了，也包括注释），再有就是熟练使用快捷键。曾有人在豆瓣评《卓有成效的程序员》：“人有多大懒，才有多大闲”。之前我整理了一个程序员图书列表，目的也就是通过读书，让程序员变懒。写道程序员作为特殊的群体，有的人可以这么懒，懒到事情都交给机器去做，而有的人又可
c++编程随记 lx.asymmetric C++笔记
为了字体更好看，改变了格式…… &&运算符： #include<iostream> using namespace std; int main(){ int a=-1,b=4,k; k=(++a<0)&&!(b--
linux标准IO缓冲机制研究音频数据 linux
一、什么是缓存I/O(Buffered I/O)缓存I/O又被称作标准I/O,大多数文件系统默认I/O操作都是缓存I/O。在Linux的缓存I/O机制中，操作系统会将I/O的数据缓存在文件系统的页缓存(page cache)中，也就是说，数据会先被拷贝到操作系统内核的缓冲区中，然后才会从操作系统内核的缓冲区拷贝到应用程序的地址空间。1.缓存I/O有以下优点:A.缓存I/O使用了操作系统内核缓冲区，
随想生活暗黑小菠萝生活
其实账户之前就申请了，但是决定要自己更新一些东西看也是最近。从毕业到现在已经一年了。没有进步是假的，但是有多大的进步可能只有我自己知道。毕业的时候班里12个女生，真正最后做到软件开发的只要两个包括我，PS：我不是说测试不好。当时因为考研完全放弃找工作，考研失败，我想这只是我的借口。那个时候才想到为什么大学的时候不能好好的学习技术，增强自己的实战能力，以至于后来找工作比较费劲。我
我认为POJO是一个错误的概念 windshome java POJO 编程 J2EE 设计
这篇内容其实没有经过太多的深思熟虑，只是个人一时的感觉。从个人风格上来讲，我倾向简单质朴的设计开发理念；从方法论上，我更加倾向自顶向下的设计；从做事情的目标上来看，我追求质量优先，更愿意使用较为保守和稳妥的理念和方法。 &