brandohero

[Abstracts] Image Text Detection and Extraction Algorithms

Scene Text Recognition with Bilateral Regression
Jacqueline Feild and Erik Learned-Miller
Technical Report UM-CS-2012-021
University of Massachusetts Amherst
Abstract
This paper focuses on improving the recognition of text in images of natural scenes,
such as storefront signs or street signs. This is a difficult problem due to lighting
conditions, variation in font shape and color, and complex backgrounds. We present a
word recognition system that addresses these difficulties using an innovative technique
to extract and recognize foreground text in an image. First, we develop a new method,
called bilateral regression, for extracting and modeling one coherent (although not
necessarily contiguous) region from an image. The method models smooth color changes
across an image region without being corrupted by neighboring image regions. Second,
rather than making a hard decision early in the pipeline about which region is
foreground, we generate a set of possible foreground hypotheses, and choose among these
using feedback from a recognition system. We show increased recognition performance
using our segmentation method compared to the current state of the art. Overall, using
our system we also show a substantial increase in word accuracy on the word spotting
task over the current state of the art on the ICDAR 2003 word recognition data set.

Scene text detection using graph model built upon maximally stable extremal regions

Pattern Recognition Letters 34 (2013)

Abstract

Scene text detection could be formulated as a bi-label (text and non-text regions) segmentation problem.
However, due to the high degree of intraclass variation of scene characters as well as the limited number
of training samples, single information source or classifier is not enough to segment text from non-text
background. Thus, in this paper, we propose a novel scene text detection approach using graph model
built upon Maximally Stable Extremal Regions (MSERs) to incorporate various information sources into
one framework. Concretely, after detecting MSERs in the original image, an irregular graph whose nodes
are MSERs, is constructed to label MSERs as text regions or non-text ones. Carefully designed features
contribute to the unary potential to assess the individual penalties for labeling a MSER node as text or
non-text, and color and geometric features are used to define the pairwise potential to punish the likely
discontinuities. By minimizing the cost function via graph cut algorithm, different information carried by
the cost function could be optimally balanced to get the final MSERs labeling result. The proposed method
is naturally context-relevant and scale-insensitive. Experimental results on the ICDAR 2011 competition
dataset show that the proposed approach outperforms state-of-the-art methods both in recall and
precision.

Text extraction from scene images by character appearance and structure modeling

Computer Vision and Image Understanding 117 (2013)

Abstract

In this paper, we propose a novel algorithm to detect text information from natural scene images. Scene
text classification and detection are still open research topics. Our proposed algorithm is able to model
both character appearance and structure to generate representative and discriminative text descriptors.
The contributions of this paper include three aspects: (1) a new character appearance model by a
structure correlation algorithm which extracts discriminative appearance features from detected interest
points of character samples; (2) a new text descriptor based on structons and correlatons, which model
character structure by structure differences among character samples and structure component
co-occurrence; and (3) a new text region localization method by combining color decomposition, character
contour refinement, and string line alignment to localize character candidates and refine detected text
regions. We perform three groups of experiments to evaluate the effectiveness of our proposed algorithm,
including text classification, text detection, and character identification. The evaluation results on
benchmark datasets demonstrate that our algorithm achieves the state-of-the-art performance on scene text
classification and detection, and significantly outperforms the existing algorithms for character
identification.

Text detection in images using sparse representation with discriminative dictionaries

Image and Vision Computing 28 (2010)

Text detection is important in the retrieval of texts from digital pictures, video databases and webpages.
However, it can be very challenging since the text is often embedded in a complex background. In this paper,
we propose a classification-based algorithm for text detection using a sparse representation with
discriminative dictionaries. First, the edges are detected by the wavelet transform and scanned into patches
by a sliding window. Then, candidate text areas are obtained by applying a simple classification procedure
using two learned discriminative dictionaries. Finally, the adaptive run-length smoothing algorithm and
projection profile analysis are used to further refine the candidate text areas. The proposed method is
evaluated on the Microsoft common test set, the ICDAR 2003 text locating set, and an image set collected
from the web. Extensive experiments show that the proposed method can effectively detect texts of various
sizes, fonts and colors from images and videos.

Scene Text Localization Using Gradient Local Correlation

2013 12th International Conference on Document Analysis and Recognition

In this paper, we propose an efficient scene text
localization method using gradient local correlation, which can
characterize the density of pairwise edges and stroke width
consistency to get a text confidence map. Gradient local
correlation is insensitive to the gradient direction and robust to
noise, small character size and shadow. Based on the text
confidence map, the regions with high confidence are segmented
into connected components (CCs), which are classified to text
CCs and non-text CCs using an SVM classifier. Then, the text
CCs with similar color and stroke width are grouped into text
lines, which are in turn partitioned into words. Experimental
results on the ICDAR 2003 text locating competition dataset
demonstrate the effectiveness of our method.

Scene Text Detection using Sparse Stroke Information and MLP

Pattern Recognition (ICPR 2012)

In this article, we present a novel set of features for
detection of text in images of natural scenes using a
multi-layer perceptron (MLP) classifier. An estimate of
the uniformity in stroke thickness is one of our features
and we obtain the same using only a subset of the
distance transform values of the concerned region.
Estimation of the uniformity in stroke thickness on the
basis of sparse sampling of the distance transform
values is a novel approach. Another feature is the
distance between the foreground and background
colors computed in a perceptually uniform and
illumination-invariant color space. Remaining features
include two ratios of anti-parallel edge gradient
orientations, a regularity measure between the skeletal
representation and Canny edgemap of the object,
average edge gradient magnitude, variation in the
foreground gray levels and five others. Here, we
present the results of the proposed approach on the
ICDAR 2003 database and another database of scene
images consisting of text of Indian scripts.

Bayesian Network Scores Based Text Localization in Scene

2014 International Joint Conference on Neural Networks (IJCNN)
July 6-11, 2014, Beijing, China

Text localization in scene images is an essential and
interesting task to analyze the image contents. In this work, a
Bayesian network scores using K2 algorithm in conjunction
with the geometric features based effective text localization
method with the help of maximally stable extremal regions
(MSERs). First, all MSER-based extracted candidate characters
are directly compared with an existing text localization method
to find text regions. Second, adjacent extracted MSER-based
candidate characters are not encompassed into text regions
due to strict edges constraint. Therefore, extracted candidate
character regions are incorporated into text regions using
selection rules. Third, K2 algorithm-based Bayesian networks
scores are learned for the complimentary candidate character
regions. Bayesian logistic regression classifier is built on the
Bayesian network scores by computing the posterior probability
of complimentary candidate character region corresponding
to non-character candidates. The higher posterior probability
of complimentary Candidate character regions are further
grouped into words or sentences. Bayesian networks scores
based text localization system, na

ICDAR 2013 Robust Reading Competition (Challenge
2 Task 2.1: Text Localization) database. Experimental results
have established significant competitive performance with the
state-of-the-art text detection systems.

K2 Algorithm-based Text Detection with An Adaptive Classifier Threshold

International Journal of Image Processing (IJIP), Volume (8) : Issue (3) : 2014

In natural scene images, text detection is a challenging study area for dissimilar content-based
image analysis tasks. In this paper, a Bayesian network scores are used to classify candidate
character regions by computing posterior probabilities. The posterior probabilities are used to
define an adaptive threshold to detect text in scene images with accuracy. Therefore, candidate
character regions are extracted through maximally stable extremal region. K2 algorithm-based
Bayesian network scores are learned by evaluating dependencies amongst features of a given
candidate character region. Bayesian logistic regression classifier is trained to compute posterior
probabilities to define an adaptive classifier threshold. The candidate character regions below
from adaptive classifier threshold are discarded as non-character regions. Finally, text regions are
detected with the use of effective text localization scheme based on geometric features. The
entire system is evaluated on the ICDAR 2013 competition database. Experimental results
produce competitive performance (precision, recall and harmonic mean) with the recently
published algorithms.

Text localization techniques can be grouped into region-based, connected component (CC)-
based [1] and hybrid methods [2].

Region-based techniques employ a sliding window to look for
image text with the use of machine learning techniques for text identification. Sliding window
based methods tend to be slow due to multi scale processing of images. A new text detection
algorithm extracts six dissimilar classes of text features. Modest AdaBoost classifier is used to
recognize text regions based on text features [3].

CC-based methods group extracted candidate
characters into text regions with similar geometric features. CC-based methods are demanding to
apply additional checks for eliminating false positives. To find CCs, stroke width for every pixel is
computed to group neighboring pixels. These CCs were screened and grouped into text regions
[4].

Pan et al. [2] proposed hybrid method that exploits image regions to detect text candidates
and extracts CCs as candidate characters by local binarization. False positive components are
eliminated efficiently with the use of conditional random field (CRFs) model. Finally, character
components are grouped into lines/words. Recently, Yin et al. [5] extracted maximally stable
extremal regions (MSERs) as letter candidates. Non-letter candidates are eliminated using
geometric information. Candidate text regions are constructed by grouping similar letter
candidates using disjoint set. For each candidate text region, vertical and horizontal variances,
color, geometry and stroke width are extracted to identify text regions using Adaboost classifier.
Besides, MSER based method is the winner of ICDAR 2011 Robust Reading Competition [6] with
promising performance.
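
The disjoint-set grouping of similar letter candidates mentioned above can be sketched as follows. This is only an illustration of the general union-find idea, not the implementation of Yin et al. [5]: the `similar` predicate and the toy rule on position and height are assumptions made up for this example.

```python
class DisjointSet:
    """Union-find with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        # Walk up to the root, shortening the path along the way.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def group_candidates(candidates, similar):
    """Merge every pair of candidates for which similar(a, b) is true."""
    ds = DisjointSet(len(candidates))
    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates)):
            if similar(candidates[i], candidates[j]):
                ds.union(i, j)
    groups = {}
    for i in range(len(candidates)):
        groups.setdefault(ds.find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical similarity rule: candidates are (x, height) pairs that are
# horizontally close and of comparable height.
def toy_similar(a, b):
    return abs(a[0] - b[0]) <= 20 and max(a[1], b[1]) / min(a[1], b[1]) < 1.5


letters = [(0, 10), (15, 11), (30, 10), (200, 40)]
print(group_candidates(letters, toy_similar))  # [[0, 1, 2], [3]]
```

Because similarity is chained through the set, the first and third letters end up in the same group even though they are not directly similar to each other.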

Keywords: Bayesian Network, Adaptive Threshold, Bayesian Logistic Regression, Scene Image

Our text localization method achieves a recall of 62.37, a precision of 84.97 and a harmonic mean of 71.94, which is competitive with the leading methods reported by [10]. In addition, our text localization method performs better than the ICDAR 2011 Robust Reading Competition methods reported by [6].
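
As a quick check, the harmonic mean quoted above follows directly from the reported precision and recall. The function name `f_measure` is just a label chosen for this sketch.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(round(f_measure(84.97, 62.37), 2))  # 71.94
```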

TABLE 1: Performance (%) Comparison of Text Detection Methods on ICDAR 2013 Dataset.

A Skeleton Based Descriptor for Detecting Text in Real Scene Images

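Key method: similar regions are treated as neighbors and their similarity distances are used to build a graph; skeleton descriptor: examine the skeleton and stroke width of each region; graph cut is used to eliminate false regions.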

21st International Conference on Pattern Recognition (ICPR 2012)
November 11-15, 2012. Tsukuba, Japan

Text extraction from natural scene image: A survey

Edges are reliable features for text detection. Usually, an edge detector (e.g., Canny) is applied first, followed by morphological operations to extract text from the background and to eliminate non-text regions. Edge-based methods are usually more efficient and simpler for natural scene text extraction. Good performance is often found on scene images exhibiting strong edges. For the same reason, a major problem of edge-based methods is that good edge profiles are hard to obtain under the influence of shadow or highlight...

Scene text detection via stroke width

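Key method: MSER (with contrast enhancement), stroke width (compute the variance and filter on it), grouping of adjacent regions (computing a distance map and an orientation map adds robustness)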

INTRODUCTION

In recent decades, detecting text in complex natural scenes has been a hot topic in computer vision, since text in images provides much semantic information that helps humans understand the environment. Moreover, text detection is a prerequisite for several tasks, such as content-based image analysis and image retrieval. Unlike overlay text detection in video frames, where a lot of prior knowledge can be employed, text detection in natural scene images is a difficult problem due to complex backgrounds and variations in the size, font, color and orientation of text as well as in lighting conditions.

Generally, methods on this topic can be divided into two categories: learning-based methods and connected component (CC)-based methods.

In order to distinguish text regions from non-text ones, learning-based methods use some features to train a classifier (e.g., SVM or AdaBoost). Pan et al. [6] use a polynomial classifier in the verification stage and evaluate five widely used features, including HOG, LBP, DCT, Gabor filter and wavelet, and find that the combination of HOG and wavelet shows the best performance. Wang et al. [9] use a gray scale contrast feature and an edge orientation histogram feature to train an SVM. The main limitations of learning-based methods are high computational complexity and the difficulty of selecting the best features for scene text detection.

Figure 1. Overview of text detection process. (a) Detected MSERs. (b) CCs after geometric filtering. (c) CCs after stroke width extraction. (d) Detected text.


CC-based methods, on the other hand, usually generate separated CCs using some properties, such as edge, stroke width and color. After that, some geometric constraints are designed to remove false positives. Epshtein et al. [1] propose the stroke width transform, which converts the value of each color pixel into the width of the most likely stroke. Zhang and Kasturi [11] use HOG to locate text edges, and then Graph Spectrum is utilized to group the characters and remove false positives. The advantage of these methods is that their computational complexity is low. However, the performance of CC-based methods is likely to degrade when dealing with text in complex backgrounds.

In this paper, a novel CC-based text detection algorithm is proposed to overcome the difficulties mentioned above. We make three major contributions compared with other methods available in the literature. (1) Though MSER has been exploited in the text detection task, such as in [5], most of those approaches use the bare MSER algorithm, ignoring the fact that MSER is sensitive to image blur. We overcome this obstacle by incorporating intensity information on the boundary between text and background. (2) Since stroke width is one of the inherent properties of text and is insensitive to the size, font, color and orientation of text, stroke width on the skeleton of CCs is extracted to distinguish between text and non-text regions. (3) We only detect text on one scale, which is more efficient than the work [6] that requires an image pyramid in order to detect text of different sizes.

SECTION II

TEXT DETECTION ALGORITHM

An overview of our algorithm is depicted in Figure 1. Each input color image is first resized to a fixed resolution; MSERs are then detected and considered as text region candidates (Section 2.1). Next, we design some simple heuristic rules to remove MSERs that are not text regions (Section 2.2). Unlike the stroke width transform in [1], we use the stroke width generated by a distance transform on the skeleton of each CC to eliminate non-text areas (Section 2.3). In the final step, we group characters into words based on Euclidean distance, orientation and similarities between characters (Section 2.4).

2.1. Contrast-enhanced MSER Detection

The concept of MSER was introduced by Matas et al. [4]. Since a single letter usually has a uniform color and its intensity is often quite different from the background, MSER can locate these text regions efficiently. MSER has many good properties, such as invariance to affine transformation of image intensities and stability [4]; however, it is sensitive to image blur. An example demonstrating this is shown in Figure 3 (b). Most of the characters are blurred and connected, so it is difficult to get the true stroke width of every character in Section 2.3. In order to overcome this problem, we propose a novel contrast-enhanced MSER algorithm as follows.

For an input image I, based on the observation that there are large changes in intensity at the boundary between text pixels and background, an intensity image In is obtained in HSI color space. After that, we check the intensity gradient computed on In against a threshold; if this condition is met, the pixel value is updated, where the update uses a second predefined threshold. The aim of this procedure is to enhance the contrast between characters and background (Figure 2). Finally, we conduct MSER detection on this contrast-enhanced image. Figure 3 (c) illustrates the result of our contrast-enhanced MSER detection, where all letters in the same word are separated.

2.2. Geometric Filtering

After locating the bounding boxes of the MSERs, we design some simple geometric rules to filter out obvious non-text regions. Firstly, by assuming all characters have been separated, we limit the aspect ratio of each bounding box to between 0.3 and 3. Secondly, text region candidates with low saturation (less than 0.3) or small area (less than 30 pixels) are unlikely to be text regions, thus they are removed. Thirdly, since text may be surrounded by non-text CCs (e.g., the signboard containing characters is detected in Figure 1 (a)), we reject this kind of false positive by limiting the number of bounding boxes within a particular bounding box to three. For definitions of aspect ratio, saturation and area, see [12].
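
The three rules above can be sketched as a simple filter. This is a rough illustration under stated assumptions, not the authors' code: the `Candidate` record, the width-over-height aspect ratio, and the choice to count enclosed boxes among all candidates before filtering are assumptions, and `saturation` is taken as a precomputed input because its definition lives in [12].

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    x: int              # left edge of the bounding box
    y: int              # top edge of the bounding box
    w: int              # bounding-box width
    h: int              # bounding-box height
    area: int           # region size in pixels (assumed definition)
    saturation: float   # precomputed; see [12] for the definition


def contains(outer, inner):
    """True if inner's bounding box lies entirely inside outer's."""
    return (outer is not inner
            and outer.x <= inner.x and outer.y <= inner.y
            and inner.x + inner.w <= outer.x + outer.w
            and inner.y + inner.h <= outer.y + outer.h)


def geometric_filter(cands, max_inner=3):
    kept = []
    for c in cands:
        aspect = c.w / c.h  # assumed to be width over height
        if not 0.3 <= aspect <= 3:
            continue
        if c.saturation < 0.3 or c.area < 30:
            continue
        # Reject boxes that enclose more than max_inner other boxes.
        if sum(contains(c, other) for other in cands) > max_inner:
            continue
        kept.append(c)
    return kept


cands = [
    Candidate(0, 0, 200, 100, 15000, 0.8),  # signboard enclosing the letters
    Candidate(10, 30, 20, 40, 400, 0.6),    # letter
    Candidate(40, 30, 20, 40, 420, 0.6),    # letter
    Candidate(70, 30, 20, 40, 390, 0.6),    # letter
    Candidate(100, 30, 20, 40, 410, 0.6),   # letter
    Candidate(150, 40, 100, 5, 300, 0.6),   # thin bar, aspect ratio 20
    Candidate(130, 30, 5, 5, 20, 0.6),      # speck, area below 30
]
kept = geometric_filter(cands)
print(len(kept))  # 4: only the letters survive
```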

Figure 2. Contrast enhancement process.


Figure 3. (a) Original characters. (b) Bare MSER detection. (c) Contrast-enhanced MSER detection.


2.3. Stroke Width Extraction

Stroke width is defined as the length of a straight line from a text edge pixel to another along its gradient direction. The basic motivation of our stroke width extraction algorithm is that stroke width remains almost the same within a single character, whereas it changes significantly in non-text regions as a result of their irregularity. Several works have exploited this property, such as [1] and [10], both of which calculate stroke width from one stroke boundary to another along the gradient direction. Since the skeleton is an effective tool to represent the structure of a region, and inspired by the work [8], which uses the skeleton to analyze text string straightness, we take advantage of the skeleton to extract stroke width.

The initial step of stroke width extraction is to get the skeletons of the remaining MSERs. On every foreground pixel of the skeleton, a distance transform is applied to compute the Euclidean distance from this pixel to the nearest boundary of the corresponding MSER. We thus obtain a skeleton-distance map. This process is depicted in Figure 4. Figure 4 (a) illustrates a non-text MSER and a text MSER from Figure 1 (a), and their corresponding skeletons and skeleton-distance maps are shown in Figure 4 (b) and Figure 4 (c) respectively.

The variance of the skeleton-distance map of each CC is computed to measure the difference between text regions and false positives. Table 1 lists the variances obtained from Figure 4 (c). Note that text characters have much smaller variances than the false positive. Based on this property, we remove CCs with large variances. It can be seen in Figure 1 (c) that some false positives are eliminated after this procedure.
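
A small sketch may make the variance test concrete. It is a simplified illustration, not the paper's pipeline: skeletonization is not implemented (the skeleton pixels are supplied by hand), the distance transform is a brute-force search, and whether the 0.2 threshold from the experiments applies to raw pixel distances as used here is an assumption.

```python
import math
from statistics import pvariance


def skeleton_distances(mask, skeleton):
    """Distance from each skeleton pixel to the nearest background pixel.

    mask: list of equal-length strings, '#' marks foreground.
    skeleton: (row, col) pixels, assumed to come from a separate
    skeletonization step. Pixels outside the grid count as background.
    """
    rows, cols = len(mask), len(mask[0])
    background = [(r, c)
                  for r in range(-1, rows + 1)
                  for c in range(-1, cols + 1)
                  if not (0 <= r < rows and 0 <= c < cols
                          and mask[r][c] == '#')]
    return [min(math.hypot(r - br, c - bc) for br, bc in background)
            for r, c in skeleton]


def is_text_like(distances, max_variance=0.2):
    """Keep a CC only if its skeleton-distance variance is small."""
    return pvariance(distances) <= max_variance


# A stroke of constant thickness: every skeleton pixel is equally far
# from the boundary, so the variance is zero.
bar = ["#######"] * 3
bar_d = skeleton_distances(bar, [(1, c) for c in range(1, 6)])

# A wedge whose thickness changes along its length.
wedge = ["#......", "###....", "#####..", "#######",
         "#####..", "###....", "#......"]
wedge_d = skeleton_distances(wedge, [(3, c) for c in range(7)])

print(is_text_like(bar_d), is_text_like(wedge_d))
```

The bar passes the test while the wedge does not, which mirrors the observation that irregular regions have a larger spread of stroke widths.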

Figure 4. (a) Detected MSER of false positive and text. (b) Skeleton map. (c) Skeleton-distance map.


TABLE 1. VARIANCES OF FALSE POSITIVE AND CHARACTERS.

2.4. CC grouping

The main aim of CC grouping is to group adjacent characters detected in the previous steps into separate meaningful words and to further reject false positives. Characters in the same word usually share similar properties, such as intensity, size and stroke width, and this information can be utilized in CC grouping. The details of our CC grouping method are described below.

Center points of CCs are extracted as the first step of the proposed method. Then we obtain two maps, namely a distance map and an orientation map, by computing the Euclidean distance D and the orientation angle between each pair of CCs. If D is smaller than MaxDistance, which is defined as the maximum Euclidean distance from each CC to another, the two CCs are considered adjacent candidates.

In the following step, we check the orientation angle between each adjacent pair of CCs on the orientation map. Assuming that text usually lies in the horizontal direction, we restrict this angle to a fixed range. Every pair of CCs satisfying this rule is then checked against the similarity criteria below:

wi + wj > 1.2 × D
max(wi/wj, wj/wi) < 5
max(hi/hj, hj/hi) < 2
max(si/sj, sj/si) < 1.6
max(ni/nj, nj/ni) < 1.7

where w, h, s and n denote the width, height, mean stroke width and intensity of the bounding box respectively, and all the thresholds are obtained from the ICDAR 2003 training set. This is based on the observation that adjacent characters in the same word usually share similar stroke width and intensity. Adjacent CCs obeying all the rules are considered true adjacent text characters and are grouped together. The result of our CC grouping method is illustrated in Figure 1 (d): all characters are grouped successfully, while all false positives are rejected.
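
The adjacency test and the five similarity criteria can be put together in one pairwise check, sketched below. The dictionary keys and the function name are invented for this example, and `max_angle_deg` is purely a placeholder because the orientation bounds are not given in this copy of the text; the remaining thresholds are the ones listed above, with MaxDistance taken from the experiments section.

```python
import math


def similar_pair(a, b, max_distance=300, max_angle_deg=30.0):
    """Check the adjacency and similarity rules for two CCs.

    a, b: dicts with centre 'cx', 'cy', width 'w', height 'h',
    mean stroke width 's' and intensity 'n'.
    max_angle_deg is a hypothetical bound, not a value from the source.
    """
    dx, dy = b["cx"] - a["cx"], b["cy"] - a["cy"]
    d = math.hypot(dx, dy)
    if d >= max_distance:
        return False
    # Angle of the line joining the centres, measured from the horizontal.
    angle = math.degrees(math.atan2(abs(dy), abs(dx)))
    if angle > max_angle_deg:
        return False

    def ratio(u, v):
        return max(u / v, v / u)

    return (a["w"] + b["w"] > 1.2 * d
            and ratio(a["w"], b["w"]) < 5
            and ratio(a["h"], b["h"]) < 2
            and ratio(a["s"], b["s"]) < 1.6
            and ratio(a["n"], b["n"]) < 1.7)


c1 = dict(cx=20, cy=50, w=20, h=40, s=4.0, n=200)
c2 = dict(cx=45, cy=52, w=22, h=38, s=4.5, n=190)
c3 = dict(cx=70, cy=50, w=20, h=90, s=4.0, n=200)  # more than twice as tall

print(similar_pair(c1, c2))  # True
print(similar_pair(c2, c3))  # False: the height ratio exceeds 2
```

Pairs that pass this check could then be merged into words with any connected-components or union-find routine.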

SECTION III

EXPERIMENTS

To evaluate the robustness of the proposed algorithm, we adopt the testing images of the public benchmark ICDAR 2003 text locating dataset [3] in our experiment. Three widely used measurement criteria, namely precision (p), recall (r) and f-measure, are used to evaluate the performance of our method. In order to detect both bright and dark text objects, two rounds of MSER detection are performed for each testing image, and the final result is the combination of the results of the two rounds.

As for the parameter settings, we empirically set the gradient threshold to 30 and the second threshold of Section 2.1 to 50. In addition, CCs whose stroke width variance is larger than 0.2 are rejected. Furthermore, MaxDistance is set to 300 to bound the maximum distance between two letters.

We compare our text detection result with a number of state-of-the-art methods tested on the same database using p, r and f criteria. The comparison result is shown in Table 2. We can see that the proposed approach has the highest recall rate of 0.59.

Recently, the ICDAR 2011 Robust Reading Competition [7] was organized to evaluate the state-of-the-art progress in text detection from complex natural scenes. We also adopt the dataset used in this competition. Table 3 shows our text detection results on this dataset.

Figure 5 illustrates some results of our robust text detection algorithm. Estimated text regions are surrounded by blue bounding boxes. Note that the proposed method is insensitive to text color, font, size and position. With the proposed method, most text regions are detected, while few false positives remain.

We also present some failure examples in Figure 6. Because of the illumination problem, ‘Bus’ and ‘Times’ in Figure 6 (a) are not detected. All letters are discarded in Figure 6 (b) due to the similar color of text and background. Moreover, the characters ‘X’, ‘M’, and ‘L’ in Figure 6 (c) are eliminated because of large changes in stroke width, but this kind of text is rare in the dataset and does not affect the overall result to a large extent. We notice that the performance of our algorithm depends heavily on the potential text regions detected in the initial step (e.g., sometimes text cannot be detected using the contrast-enhanced MSER algorithm).

Figure 5. Sample output of our method.


Figure 6. Failure examples.


TABLE 2. RESULT ON ICDAR 2003 DATASET.

TABLE 3. RESULT ON ICDAR 2011 DATASET.

SECTION IV

CONCLUSION

In this work, a novel CC-based methodology for text detection in natural scene images is presented. MSERs are first utilized as potential text regions. A significant novelty of our work compared with previous research is that we apply the skeleton to extract stroke width. Moreover, our robust CC grouping method can not only group characters into separate words, but also eliminate false positives at the same time. Text detection results on the ICDAR datasets demonstrate that our algorithm performs comparably to other methods.

你可能感兴趣的:(【摘要】图像文本检测提取算法)

数据标注工具详解 Sally璐璐 ai 大数据
数据标注工具是构建高质量AI训练数据集的核心基础设施，其功能覆盖图像、文本、视频、音频、3D点云等多模态数据的标注与管理。以下从工具类型、核心功能、行业应用及技术趋势等方面进行系统介绍：一、主流数据标注工具分类与特性1.通用型标注平台LabelStudio由Heartex开发的开源工具，支持文本、图像、视频、音频及时间序列数据标注，可通过YAML自定义标注界面19。其内置质量控制机制（如标注审核、
信息抽取领域关键Benchmark方法：分类体系
信息抽取领域关键Benchmark方法：分类体系摘要信息抽取（InformationExtraction,IE）作为自然语言处理的核心任务之一，旨在从非结构化文本中识别并结构化关键信息（如实体、关系、事件等），广泛应用于知识图谱构建、智能问答和数据分析等领域。近年来，随着深度学习技术的快速发展，信息抽取方法在性能和应用范围上取得了显著进步，但同时也面临着任务多样性、跨领域泛化性以及低资源场景下的适
基于级联深度学习算法在双参数MRI中检测前列腺病变的评估| 文献速递-AI辅助的放射影像疾病诊断有Li 人工智能深度学习算法
Title题目EvaluationofaCascadedDeepLearning–basedAlgorithmforProstateLesionDetectionatBiparametricMRI基于级联深度学习算法在双参数MRI中检测前列腺病变的评估Background背景MultiparametricMRI(mpMRI)improvesprostatecancer(PCa)detectionc
常见的强化学习算法分类及其特点 ywfwyht 人工智能算法分类人工智能
强化学习（ReinforcementLearning,RL）是一种机器学习方法，通过智能体（Agent）与环境（Environment）的交互来学习如何采取行动以最大化累积奖励。以下是一些常见的强化学习算法分类及其特点：1.基于值函数的算法这些算法通过估计状态或状态-动作对的价值来指导决策。Q-Learning无模型的离线学习算法。通过更新Q值表来学习最优策略。更新公式：Q(s,a)←Q(s,a)
图像处理100问-中文版(记录) STO检测王学习
https://gitee.com/mengfansheng163/ImageProcessing100Wen
【Python】PyRoboPath：Python机器人路径规划的终极指南宅男很神经 python 开发语言
PyRoboPath：Python机器人路径规划的终极指南第1部分：PyRoboPath与路径规划基础第1章：PyRoboPath概览与核心理念1.1什么是PyRoboPath？PyRoboPath是一个先进的、开源的Python库，致力于为学术研究人员、行业工程师以及机器人爱好者提供一套完整、高效、易用且可扩展的机器人路径规划解决方案。它不仅仅是一个算法的集合，更是一个集成了机器人建模、环境表示
最新抖音 iOS 设备注册算法（配合心跳做不上榜人气用） qq_1771238069 ios 算法 cocoa
最新业务需要研究了一周时间做出来了可以配合心跳包做抖音人气用一下部分代码#-*-encoding:utf-8-*-importjson,random,time,sysimportrequestsfromurllib.parseimporturlparse,parse_qsimportratelimitfromloguruimportloggerfromspiders.reg.confimportm
Scikit-learn：机器学习的「万能工具箱」科技林总 DeepSeek学AI 人工智能
——三行代码构建AI模型的全栈指南**###**一、诞生背景：让机器学习从实验室走向大众****2010年前的AI困境**：-学术界模型难以工程化-算法实现碎片化（MATLAB/C++主导）-企业应用门槛极高>**破局者**：DavidCournapeau发起*Scikit-learn*项目，**统一算法接口**+**Python简易语法**=机器学习民主化革命---###**二、设计哲学：一致性
基于MATLAB图像特征识别及提取实现图像分类 jghhh01 机器学习算法人工智能
基于MATLAB的图形处理程序，可以进行图像特征识别及提取，进而实现图像分类。hog_svm.m,2276svm_images/test_image/1.jpg,20980svm_images/test_image/2.jpg,18246svm_images/test_image/3.jpg,13835svm_images/test_image/4.jpg,18539svm_images/test
Edge-TTS在广电系统中的语音合成技术的创新应用
Edge-TTS在广电系统中的语音合成技术的创新应用作者：本人是一名县级融媒体中心的工程师，多年来一直坚持学习、提升自己。喜欢Python编程、人工智能、网络安全等多领域的技术。摘要随着人工智能技术的快速发展，文字转语音(Text-to-Speech,TTS)系统已成为多种应用的重要组成部分，尤其在广播电视领域。本文介绍了一种基于Edge-TTS大模型的文字转语音工具，该工具结合了现代文本处理和语
掌握软件工程领域持续集成的部署流程
掌握软件工程领域持续集成的部署流程关键词：持续集成、自动化构建、版本控制、单元测试、持续交付、DevOps、流水线摘要：本文通过面包工厂的生动比喻，揭示持续集成的核心原理。我们将构建一条"代码加工流水线"，用真实的Jenkins配置案例展示从代码提交到自动化部署的全过程，并探讨现代软件开发中持续集成带来的革命性变化。背景介绍目的和范围本文面向初入软件行业的开发者，系统讲解持续集成（Continuo
Selenium测试安全策略：防止逆向工程软件工程实践软件工程最佳实践 AI软件构建大数据系统架构 selenium 网络 tcp/ip ai
Selenium测试安全策略：防止逆向工程关键词：Selenium自动化测试、逆向工程、代码安全、敏感信息保护、测试脚本防护摘要：本文从Selenium自动化测试的实际场景出发，深入解析测试脚本面临的逆向工程风险（如敏感信息泄露、测试逻辑被破解），通过生活案例类比技术概念，系统讲解代码混淆、敏感信息加密、日志脱敏等核心安全策略，并提供可落地的实战代码与工具推荐，帮助测试人员构建“防逆向”的安全测试
Serverless架构下的持续交付实践软件工程实践软件工程最佳实践 AI软件构建大数据系统架构 serverless 架构运维 ai
Serverless架构下的持续交付实践关键词：Serverless架构、持续交付、DevOps、无服务器计算、自动化部署摘要：本文深入探讨了Serverless架构下的持续交付实践。首先介绍了Serverless架构和持续交付的背景知识，接着解释了相关核心概念及其关系，详细阐述了核心算法原理与操作步骤，通过数学模型加深理解，结合实际项目案例展示了代码实现与解读，探讨了实际应用场景，推荐了相关工具
联咏NT98567高度集成边缘IPC应用SoC规格特性 weixin_Todd_Wong2010 边缘计算人工智能计算机视觉 python c++神经网络
联咏NT98567MQG是一款高度集成的SoC，具有高图像质量、低比特率和低功耗的特点，适用于电池应用，目标是2Mp至5Mp/8Mp边缘IP摄像头应用。该SoC集成了双核ARMCortexA7CPU、新一代ISP、H.265/H.264视频压缩编解码器、视频处理引擎（VPE）用于双传感器拼接和鱼眼去畸变、高性能硬件DLA模块、图形引擎、显示控制器、以太网PHY、USB2.0主机/设备、音频编解码器
海思Hi3519DV500方案1200万无人机吊舱套板 weixin_Todd_Wong2010 嵌入式硬件 AI 前端边缘计算图像处理
海思Hi3519DV500方案1200万无人机吊舱套板Hi3519DV500是一颗面向行业市场推出的超高清智能网络摄像头SoC。该芯片最高支持四路sensor输入，支持最高4K@30fps的ISP图像处理能力，支持2FWDR、多级降噪、六轴防抖、全景拼接、多光谱融合等多种传统图像增强和处理算法，支持通过AI算法对输入图像进行实时降躁等处理，为用户提供了卓越的图像处理能力，集成了高效的神经网络推理引