GarfieldEr007

Content-Based Information Retrieval 基于内容的信息检索

Content-Based Information Retrieval

Information Era

The Internet and the WWW
Milestones of the Information Era

Multimedia Information Retrieval

Keyword / Text - based Search
Content-based Search

References

Information Era

The second half of the 20th century and the beginning of the third millenium can be described as the information era of the mankind's development because of the enormous impact of information on the human lifestyle and way of thinking. Permanent and intensive exposure to information broaden people's views and deepen knowledge and awareness about the environment and the world in general. This process globalises society and creates new living and educational standards (Hanjalic e.a., 2000).

The information era is a consequence of digital revolution, which started in the nineteen forties - fifties and has continuously built up. Representation of information in the digital form allows for a lossless data compression or lossy compression with a low quality loss, which in turn results in a large reduction of the time and channel capacity for data transmission and of the space required for data storage. Possibilities to combine and transmit or process different types of information such as audio, visual or textual data without quality loss and permanently increasing performance-to-price ratio of digital transmission, storage, and processing resulted in the advent and continuous advance of multimediasystems and applications. Today's digital telecommunication networks such as the Internet provide extremely high-speed information transfer, called frequently "information superhighway".

In a broad sense, multimedia is a general framework for interaction with information avaliable from different sources, and the more that is known about the content means the better can be its representation, processing, analysis, and so forth, in terms of efficiency and allowed functionalities (Rao e. a., 2002). Digital imagery is the most important part of multimedia data types. If video and audio data are in the predominant use of entertainment and news applications, images are actively exploited in almost all human activities (Castelli & Bergman, 2002). Large collections of images are maintained in many application domains, for example, medical, astronomical, geologic image databases, digital libraries and museums accessible via the Internet, collections of space and aerial remetely sensed images of the Earth's surface, and so on. Although digital imagery have been used, starting from the nineteen sixties - seventies, the Internet and the World Wide Web browsers brought a new world, the virtual cyberspace, that stimulates the era of information explosion (Shih, 2002).

The Internet and the WWW

The terms Internet and the World Wide Web are not synonymous although describe two related things. The Internet is a massive networking infrastructure linking millons of computers together globally. In this network any computer can communicate with any other computer as long as they are both connected to the Internet. More than 100 countries are linked into exchanges of data, news and opinions. Unlike online services, which are centrally controlled, the Internet is decentralised by design. Each Internet computer, called a host, is independent. Its operator chooses which Internet services to use and which local services to make available to other computers. "Remarkably, this anarchy by design works exceedingly well" (Webopedia, 2002).

Information travels over the Internet via a variety of languages known as protocols. A protocol consists of a set of conventions or rules, which govern communications by allowing networks to interconnect and ensuring compatibility between devices of different manufacturers. Examples of the protocols are:

TCP	Transmission Control Protocol	converts messages into streams of packets at the source and then reassembles them back into messages at the destination
IP	Internet Protocol	handles addressing and routing of packets across multiple nodes and even multiple networks with multiple standards
TCP/IP	combined TCP and IP
FTP	File Transfer Protocol	transfers files from one computer to another; based on TCP/IP protocol
HTTP	Hypertext Transfer Protocol	transfers compound documents with links; based on TCP/IP protocol
IPP	Internet Printing Protocol
IIP	Internet Imaging Protocol	transports high-quality images and metadata across the Internet, using the Flashpix format; integrated with TCP/IP and HTTP
SMTP	Simple Mail Transfer Protocol

The protocols deal with Internet media types, which identify type/subtype and encoding of transmitted data. The media types are used by Multipurpose Internet Mail Extensions (MIME) and others. Basic standard media types registered with Internet Assigned Numbers Authority (IANA) are text, application (e.g. document), audio, image, and video (Buckley & Beretta, 2000).

The World Wide Web, or simply Web, is a way of accessing information over the medium of the Internet. It is an information-sharing model built on the top of the Internet. The Web is based on three specifications: URL (Uniform Resource Locator) to locate information, HTML (Hypertext Markup Language) to write simple documents, and HTTP. The Web uses the HTTP protocol, only one of the languages spoken over the Internet, to transmit data. Web services, which use the HTTP to allow applications to communicate, use the Web to share information. The Web utilises browsers to access Web documents (called Web pages) that are linked to each other via hyperlinks. Web documents also contain text, graphics, sounds, and video.

Content-Based Information Retrieval 基于内容的信息检索_第1张图片

The Web is just one of the ways that information can be disseminated over the Internet. The Internet, not the Web, is also used for e-mail, which relies on the SMTP, instant messaging, Usenet news groups, and FTP. Thus the Web is only a large portion of the Internet as well as the Web is the basic way to publish and access information on networks within companies (Intranet). Although it is nominally based on the HTML standard, a steady stream of innovations in the domains of multimedia and interactivity has greatly expanded Web capabilities (Blumberg & Hughes, 1997).

In addition to Internet standards, Multimedia communication is governed by own standards that provide a compromise between what is theoretically possible and what is stechnically feasible as well as guarantee well balanced cost / performance ratio (Rao e. a.):

MPEG-1	ISO/IEC IS 11172	Coding of moving pictures and associated audio	multimedia CD-ROM applications at a bit rate up to about 1.5 Mb/s; the standard format for distribution of video material across the Web
MPEG-2	ISO/IEC IS 13818	Generic coding of moving pictures and associated audio	high-quality coding for all digital multi-media transmissions at data rates of 2 to 50 Mb/s; digital TV and HDTV in home entertainment, business, and scientific aplications)
MPEG-4	ISO/IEC IS 14496	Coding of audiovisual objects	coding and flexible representation of both natural and synthetic, real time and non-real time audio and video data; digital TV, interactive graphics applications, interactive multimedia (distribution of and access to content on the Web)
MPEG-4 VTC	ISO/IEC IS 14496-2 Pt.2 Visual	Visual Texture Coding	the MPEG-4 algorithm for compressing the texture information in photo-realistic 3D models; it can be used for compression still images
JPEG2000		Still-image compression for multimedia communication	an emerging standard intended to provide rate distortion and subject image quality performance superior to existing standards
MPEG-7	ISO/IEC IS 15938	Multimedia Content Description Interface	to describe multimedia content so that users can search, browse, and retrieve the content more efficiently and effectively than with existing mainly text-based search engines
MPEG21	ISO/IEC IS 18034	Multimedia framework	to enable transparent and augmented use of mulimedia resources across a wide range of networks and devices

Milestones of the Information Era

For last two decades, an average information consumer permanently was raising his expectations regarding the amount, variety and technical quality of the received multimedia information, as well as of the systems for information receiving, processing, storage, and replay or display. The Internet and the Web created a virtual world linking us together, having unique multimedia capabilities, and yielding a new concept of E-Utopia. The new concept realises new activities such as e-conferencing, e-entertainment, e-commerce, e-learning, telemedicine, and so forth. All these activities involve distributed multimedia databases and techniques for multimedia communication and content-based information search and retrieval (Hanjalic e.a., 2000; Rao e.a., 2002; Shih, 2002).

The advent of virtual reality environments (e.g., Apple Computer's Quick Time VR) and the virtual reality markup language (VRML) for rendering 3D objects and scenes added much to the Web unique multimedia capabilities. But the Web continues to grow as both an interactive and a publishing environment and offers new types of interactions and ways to distribute, utilise, and visualise information. Some experts anticipate that today's interaction with a database via the Web will necessarily evolve to become the interaction with a variety of knowledge bases and will result in the more intelligent Web.

In addition to the present e-activities over the Web, one can expect the advent of smart houses, which can communicate with owners, repair services, shops, police, and others over the Web in order to suggest appropriate decisions using current measurements together with specific knowledge bases. For instance, an appliance may connect through the Web to a central facility and inform the vendor about the status of all of its subsystems for deriving the most cost-effective time schedule for service and routing of service.

In near future most of households are expected to be equipped with receivers for Digital Video Broadcasting (DVB) and Audio Broadcasting (DAB), providing together hundreds of high-quality audiovisual channels, combined with a high-speed Internet connection to access countless archives of invormation all over the world. Today we witness a fast development of home digital multimedia archives and digital libraries for a large-scale collection of multimedia information, e.g. digital museum archives or professional digital multimedia archives at service providers such as TV and radio broadcasters, Internet providers, etc. Digital medical images are widely used in telemedicine based on the Web, mainly, for continuing medical education and diagnostic purposes (Della Mea e.a., 2001).

At present, more than a hundred million digital images and videos are already embedded in Web pages, and these collections are rapidly expanding because "a picture is worth a thousand words". Gigabytes of new images, audio and video clips are stored everyday in various repositories accessed through the Web (Shih, 2002). In some cases, such as space and aerial imagery of the Earth's surface, amounts of the stored data exceed thousands of Terabytes. Thus, among other new challenges of the information era, mechanisms for content-based information retrieval, especially, for efficient retrieval of image and video information stored in the Web-based multimedia databases, become the most important and difficult issue.

Multimedia Information Retrieval

"Anyone who has surfed the Web has explained at one point or another that there is so much information available, so much to search and so much to keep up with". (Smeulders & Jain, 1997)

Multimedia information differs from conventional text or numerical data in that multimedia objects require a large amount of memory and special processing operations. A multimedia database management system should be able to handle various data types (image, video, audio, text) and a large number of such objects, provide a high-performance and cost-effective storage of the objects, and support such functions as insert, delete, update, and search (Shih, 2002). A typical multimedia document or presentation contains a number of objects of different types, such as picture, music, and text. Thus content-based multimedia information retrieval has become a very important new research issue. Unlike a traditional searching scheme based on text and numerical data comparison, it is hard to model the searching and matching criteria of multimedia information.

Image and video retrieval is based on how contents of an image or a chain of images can be represented. Conventional techniques of text data retrieval can be applied only if every image and video record is accompanied with a textual content description (image metadata). But image or video content is much more versatile compared to text, and in the most cases the query topics are not reflected in the textual metadata available. Images, by their very nature, contain "non-textual", unstructured information, which hardly can be captured automatically. Computational techniques that pursue the goals of indexing the unstructured visual information are called content-based video information retrieval (CBVIR), or more frequently content-based image retrieval (CBIR). Thus both the content-based video information retrieval and the general-purpose content-based information retrieval share the same abbreviation (CBIR).

Content-Based Information Retrieval 基于内容的信息检索_第2张图片

Architecture of a CBIR system

In CBIR, the user should describe the desired content in terms of visual features, images should be ranked with respect to similarity to the description, and the top-rank (or most similar) images should be retrieved. At the lowest, or initial level of description, an image is considered as a collection of pixels. Although a pixel-level content might be of interest for some specific applications (say, in remote sensing of the Earth's surface), today's CBIR is based on more elaborated descriptors showing specific local and global photometric and geometric features of visual objects and semantic relationships between the features.

Features can be divided into general-purpose and domain-specific. In the most cases general features are colour, texture, geometric shape, sketch, and spatial relationships. Domain-specific features are used in special applications such as surveying and mapping of the Earth's surface using remotely sensed imagery or biometrics based on human face or fingerprint recognition. But extraction of adequate descriptors and especially inference of semantic content are extremely difficult problems having no universal solution. Higher levels of image content description involve objects and abstract relationships. Such a description is more or less easily formed by human vision, but it is often difficult to detect and recognise objects of interest in one or more images (Castelli & Bergman, 2002).

The most difficult issue of multimedia information retrieval is how to make a query describing the needs of the user. For example, it is a hard task to conduct a query like "Find me a picture with a house and a car" and it is even harder to match a specification against the large amount of picture files in a multimedia database. Generally, human and automated content-based information retrieval differ much. Human retrieval tasks (queries) are stated at cognitive level and exploit human knowledge, analysis, and understanding of the information context in terms of objects, persons, sceneries, meaning of speech fragments or the context of a story in general. Therefore, the queries by content can be formulated in different ways, e.g.

"Find the most recent image of Australian Prime Minister John Howard"
"Find all images of an American bald eagle"
"Find the movie scene where Titanic hits the iceberg"
"Classify all the images according to the place where they are taken"
"Select recent aerial images of Rangitoto"
"Find images depicting similar tornadoes in Alabama"
"Select most impressive sunset images" and so on.

The notion of content is hardly formalised at present. Among a host of possible definitions, content is defined on the Web as:

meaning or message contained and communicated by a work of art, including its emotional, intellectual, symbolic, thematic, and narrative connotations (see www.ackland.org/tours/classes/glossary.html)
the subject matter of a work of art and its values apart from the artist's ability; form and content are the two elements that comprise a work (see www.worldimages.com/art_glossary.php)
that which conveys information, eg text, data, symbols, numerals, images, sound and vision (see www.naa.gov.au/recordkeeping/er/guidelines/14-glossary.html)
the "meat" of a document, as opposed to its format, or appearance (see www.microsoft.com/technet/prodtechnol/visio/visio2002/plan/glossary.mspx)
media to communicate information or knowledge in information retrieval (see www.cordis.lu/ist/ka1/administrations/publications/glossary.htm)

Current computer vision does not allow to easily and automatically extract semantic information. An ultimate image encoding should capture an image semantic content in a way that corresponds to human interpretation. But the initially sensed image encoding consists of the raw pixel values - grey values or colours. Image analysis addresses a spectrum of intermediate possibilities between these two extremes but mostly focusses on low-level features being functions of the pixel values (Cox e.a.,2000). Although some features like colour relate to image semantics in some cases, but typically do not reflect true meaning of the image, and a much higher level of image description is necessary to effectively and practically represent the content.

So far the content is described in terms of general-purpose and domain-specific quantitative features. The general-purpose features include colour, texture, geometric shape, sketch, and spatial relationships of regions in an image or a video sequence. The domain-specific features appear in some special applications e.g. face detection and recognition or remote sensing of the Earth. Description of semantics (meaning) is a very hard problem with no universal solution because typically each meaningful description (interpretation) of video data easily formed by human vision turns to be extremely difficult for computation and vice versa.

Let us look, for instance, to a small database of natural images below:

These 3D scenes contain various objects such as horses, foals, cows, grass fields, bushes, water, hills, and many others, and their content is manifold because the interpretation of scenes, objects, and relations between objects in each such image depend on an observer, time, goal, and other subjective and objective factors...

The most difficult problem is how to describe what content the user needs and has in mind when makes a query. The simplest but still difficult example is to explicitly outline semantic elements to be searched for: "Find a picture with a brown foal near a bush". Even a harder task is to match such or more general specification against the large multimedia database. Human queries for a data search are always on a cognitive level exploiting human knowledge of the context in terms of objects, persons, sceneries, scenarios, etc. These queries may be formulated in different ways using natural language(s) and visual examples. But a query to a CBIR system has to account for much more constrained abilities of automatic data description and search.

The content-based video information retrieval has first to cope with a "sensory gap" (Smeulders e.a., 2000) caused by distinctions between the properties of an object in the world and the properties of its computational description derived from an image or a series of images. The sensory gap makes the content description problem ill-posed and notably limits capabilities of formal representation of the image content. Secondly, there is a semantic gap, or "a discrepancy between the query a user ideally would and one which the user actually could submit to an information retrieval system" (Castelli & Bergman, 2002). Semantics ("significant" in Greek) describes relationships between words and their meanings in linguistics and between signs and what they mean in logic. In relation to images, semantics is concerned with meaning of depicted objects and their features.

The semantic gap results in considerable distinctions between a description extracted from visual data and human interpretation of the same data in each particular case. The main restriction of the content-based retrieval is that the user searches for semantic, i.e. meaningful similarity, whereas the CBIR system provides only similarity of quantitative features obtained with data processing. Semantic relationships encode human interpretations of images which are relevant to each particular application, but these interpretations constitute only a small subset of all the possible meaningful interpretations. This is why automatic description of a "true" image contents is an unsolvable problem due to an intrinsically subjective human perception of images and video sequences.

Contents is so far described with digital signatures combining recognised objects, shapes, features, and relationships, and images are ranked by their quantitative similarity to a query description in terms of these objects, shapes, features, and their relationships. The top-rank, i.e. most similar images are retrieved and output. Informally, content of a still image includes, in increasing level of complexity, perceptual, oralgorithmic properties of visual information, semantic properties, e.g. abstract primitives such as objects, roles, and scenes, and subjectiveattributes such as impressions, emotions and meaning associated to the perceptual properties (Shih, 2002). Content-based retrieval of video records involves not only the objects shown but also the timing and spatial patterns of object movements.

But tools for content description by computational image / video understanding, object tracing, and semantic analysis are still and will be for a very long future time under development. First of all, the content of an image is a very subjective notion, and there are no "objective" ways to annotate the content at a semantic level to reflect all or even most of subjective interpretations of this image. Secondly, the gaps between "formal" and "human" (user) semantics should be bridged from both sides, by extending the image descriptions and adapting the user queries to how a CBIR system operates.

As mentioned by Cox e.a., 2000, to codify image semantics one needs a language to express them. Because it has to be used for human queries and human interpretation of a database image's description, the language must be natural for expressing search goals and give accurate and consistent description of each database image. Thus, it is very difficult to design such a consistent formal language. Today's CBIR systems exploit a more practical way of using hidden languages for semantic encoding and probabilistic learning and classification frameworks for linking image features and semantic classes. In particular, specific random field models of features and their spatial distributions are involved to account for wide variations of features within the same �semantic� class, and "semantic" representations of an image are built with modern feature clustering and classification techniques such as Support Vector Machines (SVM) or Bayesian networks. The resulting feature-based labelling of blocks (regions) of an image is used for interpreting its semantic content.

The users of a CBIR system have a diversity of goals, in particular, search by association, search for a specific image, or category search(Smeulders e.a., 2000). Search by association has first no partricular aim and implies highly interactive iterative refinement of the search using sketches or example images. Search for a precise copy of the image in mind (e.g., in an art catalogue) or for another image of the same object assumes that the targer can be interactively specified as similar to a group of given examples. Category search retrieves an arbitrary image representative of a certain class either specified also by an example or derived from labels or other database information.

At present, the only feasible analysis of a video, or an image, or a musical piece, or a speech fragment, or a text can be performed only atalgorithmic level. Such analyses involve computable features of audio and video signals, e.g. colour, texture, shape, frequency components, temporal characteristics of signals, as well as algorithms operating on these features.

Content-Based Information Retrieval 基于内容的信息检索_第12张图片

In image and video retrieval, various algorithms of image segmentation into homogeneous regions, detection of moving objects in successive frames, extraction of particular (e.g., spatially invariant) types of textures and geometric shapes, determination of relations among different objects, and analysis of 2D frequency spectrum are used for getting features. But in contrast to most of computer vision applications, image and video retrieval combines automatic image recognition with active user participation in the retrieval process (Castelli & Bergman, 2002). Also, retrieval relates inherently to image ranking by similarity to a query example, rather than to image classification by matching to a model. In CBIR systems the user evaluates system responses, refines the query, and determines whether the receieved answers are relevant to that query.

Of course, there is almost no parallelism in results of the cognition-based and feature-based retrieval even in the simple tasks like "an image containing a bird". As underlined in Chang e.a., "the multimedia information is highly distributed, minimally indexed, and lacks appropriate schemas. The critical question in multimedia search is how to design a scalable, visual information retrieval system? Such audio and visual information systems require large resources for transmission, storage and processing, factors which make indexing, retrieving, and managing visual information an immense challenge".

Keyword / Text - based Search

An image can hardly be described by text annotations or keywords, although these latter are to some extent associated with semantics. In principle, by extensive investigation of an image database one may obtain a set of specific keywords covering a broad range of semantic attributes ( Cox e.a., 2000 ). The set implies an additional set of keywords defining general categories , e.g. the specific attribute "horse" yields the category attribute "animal" to be present.

At present, most popular multimedia search engines, including all first generation visual information, or image retrieval (IR) systems, are still textual, even though the Web is now a multimedia-based repository with a variety of audio, video, image, and text formats. Some popular formats for different media types are as follows (Chang e.a., 2001):

Media type	Media format	File extension
text	plain	`txt`
text	HTML	`html`, `htm`
document	PDF Portable Document Format	`pdf`
	TEX DVI Device Independent Data	`dvi`
	Postscript	`ai`, `eps`, `ps`,
image	PNG (Portable Network Graphics) image	`png`
	Windows Bitmap	`bmp`
	X Bitmap	`xbm`
	TIFF (Tag Image File Format) image	`tif`
	JPEG (Joint Photographic Experts Group) image	`jpg`
	GIF (Graphics Interchange Format) image	`gif`
audio	Midi	`midi`
	MP3	`mp3`
	RealAudio	`ra`, `ram`
	WAV Audio	`wav`
video	MPEG (Moving Picture Experts Group) Video	`mpeg`, `mpg`, `mpe`, `mpv`, `mpegv`
	QuickTime	`qt`, `mov`, `moov`
	RealMedia	`ra`, `ram`
	MPEG Audio	`mp2`, `mpa`, `abs`, `mpega`
	AVI	`avi`

In the case of text or keyword - based search, users specify keywords, and multimedia relevant to these keywords should be retrieved. Such retrieval relies strongly on metadata represented by text strings, keywords, or full scripts (Shih, 2002). Several recently developed and deployed efficient commercial multimedia search engines, such as Google Image Search, AltaVista Photo Finder, Lycos Pictures and Sounds, Yahoo! Image Surfer, and Lycos Fast MP3 Search, exploit text or keyword-based retrieval. It requires an inverted file index that describes the multimedia content and allows for obtaining fast query response. Building an index is the core part of the keyword-based multimedia information search.

Content-Based Information Retrieval 基于内容的信息检索_第13张图片

Another indexing techniques are partitioning multimedia content into categories, which the user can browse through for images of interest that match category keywords and using the text embedded around multimedia content as a way to identify its content. But the keywords and texts relate only implicitly to image / video / audio content, and be it possible to examine directly such a content, the search results could be notably refined.

Content - based Search

In content or semantics-based search, retrieval criteria and queries are specified in terms of computable data features related to semantic content of a multimedia object (audio, image, or video). Most content-based video information retrieval (CBIR) systems allow for searching the visual database contents in several different ways, either alone or combined ( Chang e.a., 2001 ; Shih, 2002 , Smeulders e.a., 2000 ): .

General interactive browsing by a user with no specific idea about the desired images or video clips.
- category (subject) search to retrieve an arbitrary image that represents a specific class (in this case clustering of visually similar images into groups can reduce the number of retrieved undesired images and navigation through a subject hierarchy allows to get to the target subject and then browse or search only that limited subset of images);
- search by association having no specific aim and iteratively refined on the basis of the user's relevance feedback.
Search to illustrate a story or document or for arbitrary pictures of expected aesthetical value.
Search for a specific image, or Query-by-X where "X" can be:
- an image example or a group of examples (Query-by-Example, or QBE) specified by a user virtually anywhere in the Internet; the images that are most similar to the query image(s) are presented in descending order of the similarity score;
- a visual sketch of the desired image or video clip drawn by a user with graphics tools provided by the system;
- direct specification of visual features (e.g., colour, texture, shape, and motion properties), which appeals to more technical users; or
- a keyword or complete text entered by the user to search for visual information that has been previously annotated with that set of keywords.

from: https://www.cs.auckland.ac.nz/courses/compsci708s1c/lectures/Glect-html/topic1c708FSC.htm

你可能感兴趣的:(计算机视觉CV)

LocalDateTime 转 String igotyback java 开发语言
importjava.time.LocalDateTime;importjava.time.format.DateTimeFormatter;publicclassMain{publicstaticvoidmain(String[]args){//获取当前时间LocalDateTimenow=LocalDateTime.now();//定义日期格式化器DateTimeFormatterformat
Linux下QT开发的动态库界面弹出操作（SDL2） 13jjyao QT类 qt 开发语言 sdl2 linux
需求：操作系统为linux，开发框架为qt，做成需带界面的qt动态库，调用方为java等非qt程序难点：调用方为java等非qt程序，也就是说调用方肯定不带QApplication::exec()，缺少了这个，QTimer等事件和QT创建的窗口将不能弹出(包括opencv也是不能弹出)；这与qt调用本身qt库是有本质的区别的思路：1.调用方缺QApplication::exec()，那么我们在接口
多线程之——ExecutorCompletionService 阿福德
在我们开发中，经常会遇到这种情况，我们起多个线程来执行，等所有的线程都执行完成后，我们需要得到个线程的执行结果来进行聚合处理。我在内部代码评审时，发现了不少这种情况。看很多同学都使用正确，但比较啰嗦，效率也不高。本文介绍一个简单处理这种情况的方法：直接上代码：publicclassExecutorCompletionServiceTest{@TestpublicvoidtestExecutorCo
tiff批量转png 诺有缸的高飞鸟 opencv 图像处理 python opencv 图像处理
目录写在前面代码完写在前面1、本文内容tiff批量转png2、平台/环境opencv,python3、转载请注明出处：https://blog.csdn.net/qq_41102371/article/details/132975023代码importnumpyasnpimportcv2importosdeffindAllFile(base):file_list=[]forroot,ds,fsin
遥感影像的切片处理 sand&wich 计算机视觉 python 图像处理
在遥感影像分析中，经常需要将大尺寸的影像切分成小片段，以便于进行详细的分析和处理。这种方法特别适用于机器学习和图像处理任务，如对象检测、图像分类等。以下是如何使用Python和OpenCV库来实现这一过程，同时确保每个影像片段保留正确的地理信息。准备环境首先，确保安装了必要的Python库，包括numpy、opencv-python和xml.etree.ElementTree。这些库将用于图像处理
windows下python opencv ffmpeg读取摄像头实现rtsp推流拉流图像处理大大大大大牛啊 opencv实战代码讲解视觉图像项目 windows python opencv
windows下pythonopencvffmpeg读取摄像头实现rtsp推流拉流整体流程1.下载所需文件1.1下载rtsp推流服务器1.2下载ffmpeg2.开启RTSP服务器3.opencv读取摄像头并调用ffmpeg进行推流4.opencv进行拉流5.opencv异步拉流整体流程1.下载所需文件1.1下载rtsp推流服务器下载RTSP服务器下载页面https://github.com/blu
c++ opencv4.3 sift匹配图像处理大大大大大牛啊图像处理 opencv实战代码讲解 opencv sift c++opencv4 特征点
c++opencv4.3sift匹配main.cppintmain(){vectorkeypoints1,keypoints2;Matimg1,img2,descriptors1,descriptors2;intnumF
AI大模型的架构演进与最新发展季风泯灭的季节 AI大模型应用技术二人工智能架构
随着深度学习的发展，AI大模型（LargeLanguageModels,LLMs）在自然语言处理、计算机视觉等领域取得了革命性的进展。本文将详细探讨AI大模型的架构演进，包括从Transformer的提出到GPT、BERT、T5等模型的历史演变，并探讨这些模型的技术细节及其在现代人工智能中的核心作用。一、基础模型介绍：Transformer的核心原理Transformer架构的背景在Transfo
ubuntu安装opencv最快的方法 Derek重名了
最快方法，当然不能太多文字$sudoapt-getinstallpython-opencv借助python就可以把ubuntu的opencv环境搞起来，非常快非常容易参考：https://docs.opencv.org/trunk/d2/de6/tutorial_py_setup_in_ubuntu.html
代码的执行效果高天
packagecom20210409;publicclassdemo04{publicstaticvoidmain(String[]args){//////&&当前的条件不满足,则最后结果一定不满足,后面的条件不再执行////&不管条件是否满足所有条件均作判断//intx=1,y=1;//if(++y==2&&x++==2){//x=7;//}//System.out.println("x="+x
个人学习笔记7-6：动手学深度学习pytorch版-李沐浪子L 深度学习深度学习笔记计算机视觉 python 人工智能神经网络 pytorch
#人工智能##深度学习##语义分割##计算机视觉##神经网络#计算机视觉13.11全卷积网络全卷积网络（fullyconvolutionalnetwork，FCN）采用卷积神经网络实现了从图像像素到像素类别的变换。引入l转置卷积（transposedconvolution）实现的，输出的类别预测与输入图像在像素级别上具有一一对应关系：通道维的输出即该位置对应像素的类别预测。13.11.1构造模型下
计算机视觉中，Pooling的作用 Wils0nEdwards 计算机视觉人工智能
在计算机视觉中，Pooling（池化）是一种常见的操作，主要用于卷积神经网络（CNN）中。它通过对特征图进行下采样，减少数据的空间维度，同时保留重要的特征信息。Pooling的作用可以归纳为以下几个方面：1.降低计算复杂度与内存需求Pooling操作通过对特征图进行下采样，减少了特征图的空间分辨率（例如，高度和宽度）。这意味着网络需要处理的数据量会减少，从而降低了计算量和内存需求。这对大型神经网络
使用Python和Playwright破解滑动验证码 asfdsgdf python 开发语言
滑动验证码是一种常见的验证码形式，通过拖动滑块将缺失的拼图块对准原图中的空缺位置来验证用户操作。本文将介绍如何使用Python中的OpenCV进行模板匹配，并结合Playwright实现自动化破解滑动验证码的过程。所需技术OpenCV模板匹配：用于识别滑块在背景图中的正确位置。Python：主要编程语言。Playwright：用于浏览器自动化，模拟用户操作。破解过程概述获取验证码图像：下载背景图和
OpenCV图像处理技术（Python）——入门森屿_ opencv
©FuXianjun.AllRightsReserved.OpenCV入门图像作为人类感知世界的视觉基础，是人类获取信息、表达信息的重要手段，OpenCV作为一个开源的计算机视觉库，它包括几百个易用的图像成像和视觉函数，既可以用于学术研究，也可用于工业邻域，它于1999年由因特尔的GaryBradski启动，OpenCV库主要由C和C++语言编写，它可以在多个操作系统上运行。1.1图像处理基本操作
opencv学习：图像旋转的两种方法，旋转后的图片进行模板匹配代码实现夜清寒风学习 opencv 机器学习人工智能计算机视觉
图像旋转在图像处理中，rotate和rot90是两种常见的图像旋转方法，它们在功能和使用上有一些区别。下面我将分别介绍这两种方法，并解释它们的主要区别rot90方法rot90方法是NumPy提供的一种数组旋转函数，它主要用于对二维数组（如图像）进行90度的旋转。这个方法比较简单，只支持90度的倍数旋转，不支持任意角度旋转。使用NumPy进行旋转使用NumPy的rot90函数对模板图像进行旋转操作。
探索创新科技： Lite-Mono - 简约高效的小型化Mono框架杭律沛Meris
探索创新科技：Lite-Mono-简约高效的小型化Mono框架Lite-Mono[CVPR2023]Lite-Mono:ALightweightCNNandTransformerArchitectureforSelf-SupervisedMonocularDepthEstimation项目地址:https://gitcode.com/gh_mirrors/li/Lite-Mono如果你在寻找一个轻
Python OpenCV图像处理：从基础到高级的全方位指南极客代码玩转Python 开发语言 python opencv 图像处理计算机视觉
目录第一部分：PythonOpenCV图像处理基础1.1OpenCV简介1.2PythonOpenCV安装1.3实战案例：图像显示与保存1.4注意事项第二部分：PythonOpenCV图像处理高级技巧2.1图像变换2.2图像增强2.3图像复原第三部分：PythonOpenCV图像处理实战项目3.1图像滤波3.2图像分割3.3图像特征提取第四部分：PythonOpenCV图像处理注意事项与优化策略4
C# 禁止程序重复启动 wiseyao1219 c#
修改：Program.cs[STAThread]staticvoidMain(){Mutexmutex=newMutex(true,"NewGuid123456",outboolisCreatedNew);if(!isCreatedNew){MessageBox.Show(Application.ProductName+"isrunning...");return;}Application.Ena
2018-08-16【Swift 4.1】关于Swift4.0以后调用MJExtension无法模型转换问题码农happy
1、本人使用swift4.1，弄了一晚上才弄好，结果还是一个小问题真是尴尬，要在model中每个属性前面加上@objcimportUIKitclassUserModel:NSObject{@objcvardix=String()}letdic=["dix":"ffffff"]asNSDictionaryletmodel=UserModel.mj_object(withKeyValues:dic)!
python图像匹配_opencvpython中的图像匹配 weixin_39585675 python图像匹配
我一直在做一个项目，用opencvpython识别相机中显示的标志。我已经尝试过使用surf、颜色直方图匹配和模板匹配。但在这3个问题中，它并不总是返回正确的答案。我现在想要的是，解决我这个问题的最好办法是什么。模板图像示例：以下是摄像头中显示的标志示例。如果这是我想要识别的图像，该怎么用？在更新matchTemplate中的代码flags=["Cambodia.jpg","Laos.jpg","
利用Python+OpenCV实现截图匹配图像，支持自适应缩放、灰度匹配、区域匹配、匹配多个结果 xu-jssy Python自动化脚本 python opencv 开发语言图像处理自动化
可以直接通过pip获取，无需手动安装其他依赖pipinstallxug示例：importxugxug.find_image_on_screen(,,,)=========================================================================一、依赖安装pipinstallopencv-pythonpipinstallpyautogui二、获
day12 控制流程 if switch while do...while 猜数字游戏卓越小Y JAVA学习日志游戏 java 开发语言
控制流程顺序结构所有的程序都是按顺序执行if语句选择结构单选择语句if(a>0){System.out.println(“hello”);}packagecom.ckw.blog.select;importjava.util.Scanner;publicclassdemo01{publicstaticvoidmain(String[]args){intscore=0;Scannerscanner=
Vector和Stack的用法蟹道人 JavaSe java
/***作者：*日期：*功能：vector的用法*/packagecom.cg;importjava.util.*;publicclassDemo5{publicstaticvoidmain(String[]args){//Vector的使用Vectorvec=newVector();Empemp=newEmp("2011",25,"zhang");vec.add(emp);for(inti=0;
C#文件被占用的解决方案花北城 C#项目文件占用
问题打更新包时，提示文件被占用。System.IO.IOException:文件“D:\RS\RS_CCVI20111210.exe”正由另一进程使用，因此该进程无法访问该文件。在System.IO.__Error.WinIOError(Int32errorCode,StringmaybeFullPath)在System.IO.FileStream.Init(Stringpath,FileMode
数组拷贝Arraycopy xing2516 Arraycopy java
packageqing;//数组拷贝publicclassArraycopy{publicstaticvoidmain(String[]args){//一维数组拷贝Stringa[]={"小米","华为","阿里","腾讯","百度"};String[]aBak=newString[6];//从a数组第0个copy到数组aBak0个开始，长度是a数组长度System.arraycopy(a,0,a
discuz discuz_admincp.php 讲解,Discuz! 1.5-2.5 命令执行漏洞分析(CVE-2018-14729) weixin_39740419 discuz 讲解
0x00漏洞简述漏洞信息8月27号有人在GitHub上公布了有关Discuz1.5-2.5版本中后台数据库备份功能存在的命令执行漏洞的细节。漏洞影响版本Discuz!1.5-2.50x01漏洞复现官方论坛下载相应版本就好。0x02漏洞分析需要注意的是这个漏洞其实是需要登录后台的，并且能有数据库备份权限，所以比较鸡肋。我这边是用Discuz!2.5完成漏洞复现的，并用此进行漏洞分析的。漏洞点在：so
mysql 隐秘后门_【技术分享】CVE-2016-5483：利用mysqldump备份可生成后门 Toby Dai mysql 隐秘后门
预估稿费：100RMB投稿方式：发送邮件至linwei#360.cn，或登陆网页版在线投稿前言mysqldump是用来创建MySQL数据库逻辑备份的一个常用工具。它在默认配置下可以生成一个.sql文件，其中包含创建/删除表和插入数据等。在导入转储文件的时候，攻击者可以通过制造恶意表名来实现任意SQL语句查询和shell命令执行的目的。另一个与之相关的漏洞利用场景可以参考。攻击场景攻击者已经能够访问
CV、NLP、数据控掘推荐、量化海的那边- AI算法自然语言处理人工智能
下面是对CV（计算机视觉）、NLP（自然语言处理）、数据挖掘推荐和量化的简要概述及其应用领域的介绍：1.CV（计算机视觉，ComputerVision）定义：计算机视觉是一门让计算机能够从图像或视频中提取有用信息，并做出决策的学科。它通过模拟人类的视觉系统来识别、处理和理解视觉信息。主要任务：图像分类：识别图像中的物体并分类，比如猫、狗、车等。目标检测：在图像或视频中定位并识别多个对象，如人脸检测
解决mysql漏洞 Oracle MySQL Server远程安全漏洞(CVE-2015-0411) dieweidong5625 数据库运维 java
有时候会检测到服务器有很多漏洞，而大部分漏洞都是由于服务的版本过低的原因，因为官网出现漏洞就会发布新版本来修复这个漏洞，所以一般情况下，我们只需要对相应的软件包进行升级到安全版本即可。通过查阅官网信息，OracleMySQLServer远程安全漏洞(CVE-2015-0411)，受影响系统：OracleMySQLServer/usr/databases.sql//先备份原有所有数据，防止数据丢失。
opencv 学习 1 木木ainiks opencv 计算机视觉 python
opencv学习的第一天#coding:utf-8importcv2ascv#首先读图片src=cv.imread(“img/1.jpg”)#设置图片的名字cv.namedWindow(“1”,cv.WINDOW_AUTOSIZE)#显示图片第一个参数设置图片名，第二个参数图片的地址cv.imshow(“1”,src)cv.waitKey(0)#将图片写入固定位置cv.imwrite(“img/2
遍历dom 并且存储（将每一层的DOM元素存在数组中）换个号韩国红果果 JavaScript html
数组从0开始！！ var a=[],i=0; for(var j=0;j<30;j++){ a[j]=[];//数组里套数组，且第i层存储在第a[i]中 } function walkDOM(n){ do{ if(n.nodeType!==3)//筛选去除#text类型 a[i].push(n); //con
Android+Jquery Mobile学习系列(9)-总结和代码分享白糖_ JQuery Mobile
目录导航经过一个多月的边学习边练手，学会了Android基于Web开发的毛皮，其实开发过程中用Android原生API不是很多，更多的是HTML/Javascript/Css。个人觉得基于WebView的Jquery Mobile开发有以下优点： 1、对于刚从Java Web转型过来的同学非常适合，只要懂得HTML开发就可以上手做事。 2、jquerym
impala参考资料 dayutianfei impala
记录一些有用的Impala资料 1. 入门资料 >>官网翻译： http://my.oschina.net/weiqingbin/blog?catalog=423691 2. 实用进阶 >>代码&架构分析： Impala/Hive现状分析与前景展望：http
JAVA 静态变量与非静态变量初始化顺序之新解周凡杨 java 静态非静态顺序
今天和同事争论一问题，关于静态变量与非静态变量的初始化顺序，谁先谁后，最终想整理出来！测试代码： import java.util.Map; public class T { public static T t = new T(); private Map map = new HashMap(); public T(){ System.out.println(&quo
跳出iframe返回外层页面 g21121 iframe
在web开发过程中难免要用到iframe，但当连接超时或跳转到公共页面时就会出现超时页面显示在iframe中，这时我们就需要跳出这个iframe到达一个公共页面去。首先跳转到一个中间页，这个页面用于判断是否在iframe中，在页面加载的过程中调用如下代码： <script type="text/javascript"> //<!-- function
JAVA多线程监听JMS、MQ队列 510888780 java多线程
背景：消息队列中有非常多的消息需要处理，并且监听器onMessage（）方法中的业务逻辑也相对比较复杂，为了加快队列消息的读取、处理速度。可以通过加快读取速度和加快处理速度来考虑。因此从这两个方面都使用多线程来处理。对于消息处理的业务处理逻辑用线程池来做。对于加快消息监听读取速度可以使用1.使用多个监听器监听一个队列；2.使用一个监听器开启多线程监听。对于上面提到的方法2使用一个监听器开启多线
第一个SpringMvc例子布衣凌宇 spring mvc
第一步：导入需要的包；第二步：配置web.xml文件 <?xml version="1.0" encoding="UTF-8"?> <web-app version="2.5" xmlns="http://java.sun.com/xml/ns/javaee" xmlns:xsi=
我的spring学习笔记15-容器扩展点之PropertyOverrideConfigurer aijuans Spring3
PropertyOverrideConfigurer类似于PropertyPlaceholderConfigurer，但是与后者相比，前者对于bean属性可以有缺省值或者根本没有值。也就是说如果properties文件中没有某个bean属性的内容，那么将使用上下文（配置的xml文件）中相应定义的值。如果properties文件中有bean属性的内容，那么就用properties文件中的值来代替上下
通过XSD验证XML antlove xml schema xsd validation SchemaFactory
1. XmlValidation.java package xml.validation; import java.io.InputStream; import javax.xml.XMLConstants; import javax.xml.transform.stream.StreamSource; import javax.xml.validation.Schem
文本流与字符集百合不是茶 PrintWrite()的使用字符集名字别名获取
文本数据的输入输出; 输入;数据流,缓冲流输出;介绍向文本打印格式化的输出PrintWrite(); package 文本流; import java.io.FileNotFound
ibatis模糊查询sqlmap-mapping-**.xml配置 bijian1013 ibatis
正常我们写ibatis的sqlmap-mapping-*.xml文件时，传入的参数都用##标识，如下所示： <resultMap id="personInfo" class="com.bijian.study.dto.PersonDTO"> <res
java jvm常用命令工具——jdb命令(The Java Debugger) bijian1013 java jvm jdb
用来对core文件和正在运行的Java进程进行实时地调试，里面包含了丰富的命令帮助您进行调试，它的功能和Sun studio里面所带的dbx非常相似，但 jdb是专门用来针对Java应用程序的。现在应该说日常的开发中很少用到JDB了，因为现在的IDE已经帮我们封装好了，如使用ECLI
【Spring框架二】Spring常用注解之Component、Repository、Service和Controller注解 bit1129 controller
在Spring常用注解第一步部分【Spring框架一】Spring常用注解之Autowired和Resource注解（http://bit1129.iteye.com/blog/2114084）中介绍了Autowired和Resource两个注解的功能，它们用于将依赖根据名称或者类型进行自动的注入，这简化了在XML中，依赖注入部分的XML的编写，但是UserDao和UserService两个bea
cxf wsdl2java生成代码super出错,构造函数不匹配 bitray super
由于过去对于soap协议的cxf接触的不是很多,所以遇到了也是迷糊了一会.后来经过查找资料才得以解决. 初始原因一般是由于jaxws2.2规范和jdk6及以上不兼容导致的.所以要强制降为jaxws2.1进行编译生成.我们需要少量的修改: 我们原来的代码 wsdl2java com.test.xxx -client http://..... 修改后的代
动态页面正文部分中文乱码排障一例 ronin47
公司网站一部分动态页面，早先使用apache+resin的架构运行，考虑到高并发访问下的响应性能问题，在前不久逐步开始用nginx替换掉了apache。不过随后发现了一个问题，随意进入某一有分页的网页，第一页是正常的（因为静态化过了）；点“下一页”，出来的页面两边正常，中间部分的标题、关键字等也正常，唯独每个标题下的正文无法正常显示。因为有做过系统调整，所以第一反应就是新上
java-54- 调整数组顺序使奇数位于偶数前面 bylijinnan java
import java.util.Arrays; import java.util.Random; import ljn.help.Helper; public class OddBeforeEven { /** * Q 54 调整数组顺序使奇数位于偶数前面 * 输入一个整数数组，调整数组中数字的顺序，使得所有奇数位于数组的前半部分，所有偶数位于数组的后半
从100PV到1亿级PV网站架构演变 cfyme 网站架构
一个网站就像一个人，存在一个从小到大的过程。养一个网站和养一个人一样，不同时期需要不同的方法，不同的方法下有共同的原则。本文结合我自已14年网站人的经历记录一些架构演变中的体会。 1：积累是必不可少的架构师不是一天练成的。 1999年，我作了一个个人主页，在学校内的虚拟空间，参加了一次主页大赛，几个DREAMWEAVER的页面，几个TABLE作布局，一个DB连接，几行PHP的代码嵌入在HTM
[宇宙时代]宇宙时代的GIS是什么？ comsci Gis
我们都知道一个事实，在行星内部的时候，因为地理信息的坐标都是相对固定的，所以我们获取一组GIS数据之后，就可以存储到硬盘中，长久使用。。。但是，请注意，这种经验在宇宙时代是不能够被继续使用的宇宙是一个高维时空
详解create database命令 czmmiao database
完整命令 CREATE DATABASE mynewdb USER SYS IDENTIFIED BY sys_password USER SYSTEM IDENTIFIED BY system_password LOGFILE GROUP 1 ('/u01/logs/my/redo01a.log','/u02/logs/m
几句不中听却不得不认可的话 datageek
1、人丑就该多读书。 2、你不快乐是因为：你可以像猪一样懒，却无法像只猪一样懒得心安理得。 3、如果你太在意别人的看法，那么你的生活将变成一件裤衩，别人放什么屁，你都得接着。 4、你的问题主要在于：读书不多而买书太多，读书太少又特爱思考，还他妈话痨。 5、与禽兽搏斗的三种结局：(1)、赢了，比禽兽还禽兽。(2)、输了，禽兽不如。(3)、平了，跟禽兽没两样。结论：选择正确的对手很重要。 6
1 14:00 PHP中的“syntax error, unexpected T_PAAMAYIM_NEKUDOTAYIM”错误 dcj3sjt126com PHP
原文地址：http://www.kafka0102.com/2010/08/281.html 因为需要，今天晚些在本机使用PHP做些测试，PHP脚本依赖了一堆我也不清楚做什么用的库。结果一跑起来，就报出类似下面的错误：“Parse error: syntax error, unexpected T_PAAMAYIM_NEKUDOTAYIM in /home/kafka/test/
xcode6 Auto layout and size classes dcj3sjt126com ios
官方GUI https://developer.apple.com/library/ios/documentation/UserExperience/Conceptual/AutolayoutPG/Introduction/Introduction.html iOS中使用自动布局（一） http://www.cocoachina.com/ind
通过PreparedStatement批量执行sql语句【sql语句相同，值不同】梦见x光 sql 事务批量执行
比如说：我有一个List需要添加到数据库中，那么我该如何通过PreparedStatement来操作呢？ public void addCustomerByCommit(Connection conn , List<Customer> customerList) { String sql = "inseret into customer(id
程序员必知必会----linux常用命令之十【系统相关】 hanqunfeng Linux常用命令
一.linux快捷键 Ctrl+C : 终止当前命令 Ctrl+S : 暂停屏幕输出 Ctrl+Q : 恢复屏幕输出 Ctrl+U : 删除当前行光标前的所有字符 Ctrl+Z : 挂起当前正在执行的进程 Ctrl+L : 清除终端屏幕，相当于clear 二.终端命令 clear : 清除终端屏幕 reset : 重置视窗，当屏幕编码混乱时使用 time com
NGINX IXHONG nginx
pcre 编译安装 nginx conf/vhost/test.conf upstream admin { server 127.0.0.1:8080; } server { listen 80; &
设计模式--工厂模式 kerryg 设计模式
工厂方式模式分为三种： 1、普通工厂模式：建立一个工厂类，对实现了同一个接口的一些类进行实例的创建。 2、多个工厂方法的模式：就是对普通工厂方法模式的改进，在普通工厂方法模式中，如果传递的字符串出错，则不能正确创建对象，而多个工厂方法模式就是提供多个工厂方法，分别创建对象。 3、静态工厂方法模式：就是将上面的多个工厂方法模式里的方法置为静态，
Spring InitializingBean/init-method和DisposableBean/destroy-method mx_xiehd java spring bean xml
1.initializingBean/init-method 实现org.springframework.beans.factory.InitializingBean接口允许一个bean在它的所有必须属性被BeanFactory设置后，来执行初始化的工作，InitialzingBean仅仅指定了一个方法。通常InitializingBean接口的使用是能够被避免的，（不鼓励使用，因为没有必要
解决Centos下vim粘贴内容格式混乱问题 qindongliang1922 centos vim
有时候，我们在向vim打开的一个xml，或者任意文件中，拷贝粘贴的代码时，格式莫名其毛的就混乱了，然后自己一个个再重新，把格式排列好，非常耗时，而且很不爽，那么有没有办法避免呢？答案是肯定的，设置下缩进格式就可以了，非常简单：在用户的根目录下直接vi ~/.vimrc文件然后将set pastetoggle=<F9> 写入这个文件中，保存退出，重新登录，
netty大并发请求问题 tianzhihehe netty
多线程并发使用同一个channel java.nio.BufferOverflowException: null at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:183) ~[na:1.7.0_60-ea] at java.nio.ByteBuffer.put(ByteBuffer.java:832) ~[na:1.7.0_60-ea]
Hadoop NameNode单点问题解决方案之一 AvatarNode wyz2009107220 NameNode
我们遇到的情况 Hadoop NameNode存在单点问题。这个问题会影响分布式平台24*7运行。先说说我们的情况吧。我们的团队负责管理一个1200节点的集群(总大小12PB)，目前是运行版本为Hadoop 0.20，transaction logs写入一个共享的NFS filer(注：NetApp NFS Filer)。经常遇到需要中断服务的问题是给hadoop打补丁。 DataNod