wu1g119

Java: converting PDF to HTML

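Some features in our project need PDF-to-HTML conversion. My first reaction on getting this requirement was to look for existing open-source code, since reinventing the wheel isn't worth the effort.

After a lot of comparing and experimenting, I settled on tika-app, Apache's solution for parsing files in many formats. The conversion code below calls the org.apache.pdfbox classes, which the tika-app jar bundles.

With that as a lead, I found sample code on GitHub: https://github.com/neumino/PDF-to-standard-HTML

A few usage details in that author's example aren't ideal, though, so it needs slight changes before real use.

The project needs some supporting jars. fontbox-x.jar and tika-app-x.jar can be downloaded from the official site; recent versions of tika-app-x.jar are close to 50 MB, which is very large.

Here is the conversion code that uses tika-app: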

package com.neumino.pdftostandardhtml;

import org.apache.pdfbox.pdfparser.PDFStreamParser;
import org.apache.pdfbox.pdfwriter.ContentStreamWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDFontDescriptor;
import org.apache.pdfbox.util.PDFImageWriter;
import org.apache.pdfbox.util.PDFOperator;
import org.apache.pdfbox.util.PDFTextStripper;
import org.apache.pdfbox.util.TextPosition;

import java.awt.image.BufferedImage;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;

import java.util.ArrayList;
import java.util.List;


public class Pdf2Html extends PDFTextStripper {
    private BufferedWriter htmlFile;
    private String outputFileName;
    private int type = 0;
    private float zoom = (float) 2;
    private int marginTopBackground = 0;
    private int lastMarginTop = 0;
    private int max_gap = 15;
    float previousAveCharWidth = -1;
    private int resolution = 72; //default resolution
    private boolean needToStartNewSpan = false;
    private int lastMarginLeft = 0;
    private int lastMarginRight = 0;
    private int numberSpace = 0;
    private int sizeAllSpace = 0;
    private boolean addSpace;
    private int startXLine;
    private boolean wasBold = false;
    private boolean wasItalic = false;
    private int lastFontSizePx = 0;
    private String lastFontString = "";
    private StringBuffer currentLine = new StringBuffer();

    /**
     * Public constructor
     * @param outputFileName The html file
     * @param type represents how we are going to create the html file
     *                         0: we create a new block for every letter
     *                         1: we create a new block for every word
     *                         2: we create a new block for every line
     *                         3: we create a new block for every line - using a cache to set the word-spacing property
     * @param zoom 1.5 - 2 is a good range
     * @throws IOException
     */
    public Pdf2Html(String outputFileName, int type, float zoom) throws IOException {
        try {
            htmlFile = new BufferedWriter(new OutputStreamWriter(
                        new FileOutputStream(outputFileName), "UTF8"));

            String header = "" + "" +
                "" +
                "Html file" + "" +
                "" + "";
            htmlFile.write(header);
            this.type = type;
            this.zoom = zoom;
            this.outputFileName = outputFileName;
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
            System.err.println("Error: Unsupported encoding.");
            System.exit(1);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
            System.err.println("Error: File not found.");
            System.exit(1);
        } catch (IOException e) {
            e.printStackTrace();
            System.err.println("Error: IO error, could not open html file.");
            System.exit(1);
        }
    }

    /**
     * Close the HTML file
     */
    public void closeFile() {
        try {
            htmlFile.close();
        } catch (IOException e) {
            e.printStackTrace();
            System.err.println("Error: IO error, could not close html file.");
            System.exit(1);
        }
    }

    /**
     * Convert a PDF file to HTML
     *
     * @param pathToPdf Path to the PDF file
     *
     * @throws IOException If there is an error processing the operation.
     */
    public void convertPdfToHtml(String pathToPdf) throws Exception {
        int positionDotPdf = pathToPdf.lastIndexOf(".pdf");

        if (positionDotPdf == -1) {
            System.err.println("File doesn't have .pdf extension");
            System.exit(1);
        }

        int positionLastSlash = outputFileName.lastIndexOf("/");

        if (positionLastSlash == -1) {
            positionLastSlash = 0;
        } else {
            positionLastSlash++;
        }

        String fileName = outputFileName.substring(positionLastSlash,outputFileName.lastIndexOf(".html"));

        PDDocument document = null;

        try {
            document = PDDocument.load(pathToPdf);

            if (document.isEncrypted()) {
                document.decrypt("");
            }

            List allPages = document.getDocumentCatalog().getAllPages();

            // Retrieve and save text in the HTML file
            for (int i = 0; i < allPages.size(); i++) {
                System.out.println("Processing page " + i);

                PDPage page = (PDPage) allPages.get(i);

                BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, resolution);

                htmlFile.write("");
                marginTopBackground += (zoom * image.getHeight());

                PDStream contents = page.getContents();

                if (contents != null) {
                    this.processStream(page, page.findResources(), page.getContents().getStream());
                }

                htmlFile.write("");
                htmlFile.write("");
            }

            // Remove the text
            for (int i = 0; i < allPages.size(); i++) {
                PDPage page = (PDPage) allPages.get(i);
                PDFStreamParser parser = new PDFStreamParser(page.getContents());
                parser.parse();

                List tokens = parser.getTokens();
                List newTokens = new ArrayList();

                for (int j = 0; j < tokens.size(); j++) {
                    Object token = tokens.get(j);

                    if (token instanceof PDFOperator) {
                        PDFOperator op = (PDFOperator) token;

                        if (op.getOperation().equals("TJ") || op.getOperation().equals("Tj")) {
                            newTokens.remove(newTokens.size() - 1);

                            continue;
                        }
                    }

                    newTokens.add(token);
                }

                PDStream newContents = new PDStream(document);
                ContentStreamWriter writer = new ContentStreamWriter(newContents.createOutputStream());
                writer.writeTokens(newTokens);
                //newContents.addCompression(); //Looks like it faster without the compression, but no extensive tests have been run.
                page.setContents(newContents);
            }

            //Save background images
            //TODO: Do not save the image if it's blank. (Retrieve the text of one page, remove it from the document, get the image, check if it's blank, save it or not and write the html file)
            PDFImageWriter imageWriter = new PDFImageWriter();

            String imageFormat = "png";
            String password = "";
            int startPage = 1;
            int endPage = Integer.MAX_VALUE;
            String outputPrefix = outputFileName.substring(0, outputFileName.lastIndexOf("/") + 1) +
                fileName + "/";
            int imageType = BufferedImage.TYPE_INT_RGB;

            //new dir
            File newdir = new File(outputPrefix);

            if (!newdir.exists()) {
                newdir.mkdirs();
            }

            boolean success = imageWriter.writeImage(document, imageFormat, password, startPage,endPage, outputPrefix + fileName, imageType, (int) (resolution * zoom));

            if (!success) {
                System.err.println("Error: no writer found for image format '" + imageFormat + "'");
                System.exit(1);
            }
        } finally {
            if (document != null) {
                document.close();
            }
        }
    }

    /**
     * A method provided as an event interface to allow a subclass to perform
     * some specific functionality when text needs to be processed.
     *
     * @param text The text to be processed
     */
    protected void processTextPosition(TextPosition text) {
        try {
            int marginLeft = (int) ((text.getXDirAdj()) * zoom);
            int fontSizePx = Math.round(text.getFontSizeInPt() / 72 * resolution * zoom);
            int marginTop = (int) (((text.getYDirAdj()) * zoom) - fontSizePx);

            String fontString = "";
            PDFont font = text.getFont();
            PDFontDescriptor fontDescriptor = font.getFontDescriptor();

            if (fontDescriptor != null) {
                fontString = fontDescriptor.getFontName();
            } else {
                fontString = "";
            }

            int indexPlus = fontString.indexOf("+");

            if (indexPlus != -1) {
                fontString = fontString.substring(indexPlus + 1);
            }

            boolean isBold = fontString.contains("Bold");
            boolean isItalic = fontString.contains("Italic");

            int indexDash = fontString.indexOf("-");

            if (indexDash != -1) {
                fontString = fontString.substring(0, indexDash);
            }

            int indexComa = fontString.indexOf(",");

            if (indexComa != -1) {
                fontString = fontString.substring(0, indexComa);
            }

            switch (type) {
            case 0:
                renderingSimple(text, marginLeft, marginTop, fontSizePx, fontString, isBold,
                    isItalic);

                break;

            case 1:
                renderingGroupByWord(text, marginLeft, marginTop, fontSizePx, fontString, isBold,
                    isItalic);

                break;

            case 2:
                renderingGroupByLineNoCache(text, marginLeft, marginTop, fontSizePx, fontString,
                    isBold, isItalic);

                break;

            case 3:
                renderingGroupByLineWithCache(text, marginLeft, marginTop, fontSizePx, fontString,
                    isBold, isItalic);

                break;

            default:
                renderingSimple(text, marginLeft, marginTop, fontSizePx, fontString, isBold,
                    isItalic);

                break;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * The method that given one character is going to write it in the HTML file.
     *
     * @param text
     * @param marginLeft
     * @param marginTop
     * @param fontSizePx
     * @param fontString
     * @param isBold
     * @param isItalic
     * @throws IOException
    
     */
    private void renderingSimple(TextPosition text, int marginLeft, int marginTop, int fontSizePx,
        String fontString, boolean isBold, boolean isItalic) throws IOException {
        htmlFile.write("");

        htmlFile.write(text.getCharacter());

        htmlFile.write("");
    }

    /**
     * The method that given one character is going to write it only if it's the end of a word in the HTML file.
     *
     * @param text
     * @param marginLeft
     * @param marginTop
     * @param fontSizePx
     * @param fontString
     * @param isBold
     * @param isItalic
     * @throws IOException
    
     */
    private void renderingGroupByWord(TextPosition text, int marginLeft, int marginTop,
        int fontSizePx, String fontString, boolean isBold, boolean isItalic)
        throws IOException {
        if (lastMarginTop == marginTop) {
            if ((needToStartNewSpan) || (wasBold != isBold) || (wasItalic != isItalic) ||
                    (lastFontSizePx != fontSizePx) || (lastMarginLeft > marginLeft) ||
                    ((marginLeft - lastMarginRight) > max_gap)) {
                if (lastMarginTop != 0) {
                    htmlFile.write("");
                }

                htmlFile.write("");

                needToStartNewSpan = false;
            }

            if (text.getCharacter().equals(" ")) {
                htmlFile.write(" ");

                needToStartNewSpan = true;
            } else {
                htmlFile.write(text.getCharacter().replace("<", "&lt;").replace(">", "&gt;"));
            }
        } else {
            if (text.getCharacter().equals(" ")) {
                htmlFile.write(" ");
                needToStartNewSpan = true;
            } else {
                needToStartNewSpan = false;

                if (lastMarginTop != 0) {
                    htmlFile.write("");
                }

                htmlFile.write("");

                htmlFile.write(text.getCharacter().replace("<", "&lt;").replace(">", "&gt;"));
            }

            lastMarginTop = marginTop;
        }

        lastMarginLeft = marginLeft;
        lastMarginRight = (int) (marginLeft + text.getWidth());

        wasBold = isBold;
        wasItalic = isItalic;
        lastFontSizePx = fontSizePx;
    }

    /**
     * The method that given one character is going to write it only if it's the end of a line in the HTML file.
     *
     * @param text
     * @param marginLeft
     * @param marginTop
     * @param fontSizePx
     * @param fontString
     * @param isBold
     * @param isItalic
     * @throws IOException
    
     */
    private void renderingGroupByLineNoCache(TextPosition text, int marginLeft, int marginTop,
        int fontSizePx, String fontString, boolean isBold, boolean isItalic)
        throws IOException {
        if (lastMarginTop == marginTop) {
            if (lastMarginLeft > marginLeft) {
                htmlFile.write("");
                htmlFile.write("");
            }

            lastMarginTop = marginTop;
        } else {
            if (lastMarginTop != 0) {
                htmlFile.write("");
            }

            htmlFile.write("");

            lastMarginTop = marginTop;
        }

        htmlFile.write(text.getCharacter().replace("<", "&lt;").replace(">", "&gt;"));
        lastMarginLeft = marginLeft;
    }

    /**
     * The method that given one character is going to write it only if it's the end of a line in the HTML file.
     * A cache is used to set the word-spacing property.
     *
     * @param text
     * @param marginLeft
     * @param marginTop
     * @param fontSizePx
     * @param fontString
     * @param isBold
     * @param isItalic
     * @throws IOException
    
     */
    private void renderingGroupByLineWithCache(TextPosition text, int marginLeft, int marginTop,
        int fontSizePx, String fontString, boolean isBold, boolean isItalic)
        throws IOException {
        if ((marginLeft - lastMarginRight) > text.getWidthOfSpace()) {
            currentLine.append(" ");

            sizeAllSpace += (marginLeft - lastMarginRight);
            numberSpace++;
            addSpace = false;
        }

        if ((lastMarginTop != marginTop) || (!lastFontString.equals(fontString)) ||
                (wasBold != isBold) || (wasItalic != isItalic) || (lastFontSizePx != fontSizePx) ||
                (lastMarginLeft > marginLeft) || ((marginLeft - lastMarginRight) > 150)) {
            if (lastMarginTop != 0) {
                boolean display = true;

                // if the block is empty, we do not display it (for a lighter result)
                if (currentLine.length() == 1) {
                    char firstChar = currentLine.charAt(0);

                    if (firstChar == ' ') {
                        display = false;
                    }
                }

                if (display) {
                    if (numberSpace != 0) {
                        int spaceWidth = Math.round((((float) sizeAllSpace) / ((float) numberSpace)) -
                                text.getWidthOfSpace());
                        htmlFile.write("");
                    }

                    htmlFile.write(currentLine.toString());

                    htmlFile.write("\n");
                }
            }

            numberSpace = 0;
            sizeAllSpace = 0;

            currentLine = new StringBuffer();
            startXLine = marginLeft;
            lastMarginTop = marginTop;
            wasBold = isBold;
            wasItalic = isItalic;
            lastFontSizePx = fontSizePx;
            lastFontString = fontString;

            addSpace = false;
        } else {
            int sizeCurrentSpace = (int) (marginLeft - lastMarginRight - text.getWidthOfSpace());

            if (sizeCurrentSpace > 5) {
                if (lastMarginTop != 0) {
                    if (numberSpace != 0) {
                        int spaceWidth = Math.round((((float) sizeAllSpace) / ((float) numberSpace)) -
                                text.getWidthOfSpace());
                        htmlFile.write("");
                    }

                    htmlFile.write(currentLine.toString());

                    htmlFile.write("\n");
                }

                numberSpace = 0;
                sizeAllSpace = 0;

                currentLine = new StringBuffer();
                startXLine = marginLeft;
                lastMarginTop = marginTop;
                wasBold = isBold;
                wasItalic = isItalic;
                lastFontSizePx = fontSizePx;
                lastFontString = fontString;

                addSpace = false;
            } else {
                if (addSpace) {
                    currentLine.append(" ");

                    sizeAllSpace += (marginLeft - lastMarginRight);
                    numberSpace++;
                    addSpace = false;
                }
            }
        }

        if (text.getCharacter().equals(" ")) {
            addSpace = true;

            //sizeAllSpace += text.getWidthOfSpace();
        } else {
            currentLine.append(text.getCharacter().replace("<", "&lt;").replace(">", "&gt;"));
        }

        lastMarginLeft = marginLeft;
        lastMarginRight = (int) (marginLeft + (text.getWidth() * zoom));
    }
}
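
The rendering methods write PDF text straight into the HTML file, so any character that is significant in HTML has to be escaped first, and `&` matters as much as `<` and `>`. Below is a minimal sketch of one way to do that. It assumes a hypothetical helper class named HtmlEscaper, which is not part of the original project; the rendering methods could call it in place of their chained replace calls.

```java
/** Hypothetical helper for illustration; not part of the original Pdf2Html class. */
final class HtmlEscaper {
    private HtmlEscaper() {
    }

    /** Escapes the characters that are significant in HTML text and attribute values. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Small demonstration of the helper on a string with all four special characters.
        System.out.println(escape("a < b && c > \"d\""));
    }
}
```

Walking the string once means the `&` that belongs to an entity already emitted is never escaped a second time, which is easy to get wrong when chaining replace calls in the wrong order.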

Code that calls the converter:

package test;

import com.neumino.pdftostandardhtml.Pdf2Html;

public class T1 {

	public static void main(String[] args) {
//		if (args.length != 4) {
//			System.out.println("Invalid arguments");
//			System.out.println("Use: java -jar PDF-to-standard-HTML.jar path/to/pdf/file.pdf path/to/output/file.html type zoom");
//			System.out.println("type is an int:");
//			System.out.println("0 for the simplest method");
//			System.out.println("1 to group letters by word");
//			System.out.println("2 to group letters by line");
//			System.out.println("3 to group letters by line using a cache");
//			System.exit(1);
//		}
		long s=System.currentTimeMillis();
		
		String pdfFileName = "d:/test3.pdf";

		String outputFileName =  "D:/t2/t8.html";
		int type =  2;
		float zoom = 2;

		try {
			Pdf2Html pdf2Html = new Pdf2Html(outputFileName, type, zoom);
			pdf2Html.convertPdfToHtml(pdfFileName);
			pdf2Html.closeFile();
		} catch (Exception e) {
            System.err.println( "Failed to convert Pdf to Html." );
			e.printStackTrace();
		}
		
		System.out.println("Done"+(System.currentTimeMillis()-s));
	}

}
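
The converter derives the image folder from the output file name by searching for "/" and ".html" in the string, which fails if the name has no ".html" suffix. Below is a rough sketch of the same derivation using java.nio.file. The class name OutputPaths and its two methods are assumptions made for illustration; they are not part of the original project.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not part of the original project. */
final class OutputPaths {
    private OutputPaths() {
    }

    /** Returns the file name without its directory and without its last extension. */
    static String baseName(String htmlPath) {
        String name = Paths.get(htmlPath).getFileName().toString();
        int dot = name.lastIndexOf('.');
        // dot <= 0 means "no extension" or a dot-file such as ".hidden": keep the name as it is.
        return dot > 0 ? name.substring(0, dot) : name;
    }

    /** Returns the folder for page images: a sibling of the HTML file, named after it. */
    static Path imageDir(String htmlPath) {
        Path parent = Paths.get(htmlPath).getParent();
        String base = baseName(htmlPath);
        return parent == null ? Paths.get(base) : parent.resolve(base);
    }

    public static void main(String[] args) {
        // Demonstration with a relative path; the printed separator depends on the platform.
        System.out.println(baseName("out/t8.html"));
        System.out.println(imageDir("out/t8.html"));
    }
}
```

With the output file from the caller above, imageDir would point at a folder named t8 next to t8.html, matching where the converter writes its page images.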

Finally, the code is attached with tika-app-x.jar removed, because that jar is too large to upload.

PDF-to-standard-HTML.rar (7.7 MB)
