Mr_Slower

django关于文件上传

1.简单示例:

from django import forms
class UploadFileForm(forms.Form):
title = forms.CharField(max_length=50)
file = forms.FileField()

A view handling this form will receive the file data in request.FILES, which is a dictionary containing a key for each FileField (or ImageField, or other FileField subclass) in the form. So the data from the above form would be accessible as request.FILES[‘file’].
Note that request.FILES will only contain data if the request method was POST and the < form> that posted the request has the attribute enctype=“multipart/form-data”. Otherwise, request.FILES will be empty.
Most of the time, you’ll simply pass the file data from request into the form as described in Binding uploaded files to a form. This would look something like:

from django.http import HttpResponseRedirect
from django.shortcuts import render
from .forms import UploadFileForm
#Imaginary function to handle an uploaded file.
from somewhere import handle_uploaded_file
def upload_file(request):
if request.method == 'POST':
form = UploadFileForm(request.POST, request.FILES)
if form.is_valid():
handle_uploaded_file(request.FILES['file'])
return HttpResponseRedirect('/success/url/')
else:
form = UploadFileForm()
return render(request, 'upload.html', {'form': form})

#Notice that we have to pass request.FILES into the form’s constructor; this is how file data gets bound into a form
#Here’s a common way you might handle an uploaded file:
def handle_uploaded_file(f):
with open('some/file/name.txt', 'wb+') as destination:
for chunk in f.chunks():
destination.write(chunk)

(1)上面示例为简单的上传文件字段及其视图函数;视图函数从request.FILES中获取文件数据,它是一个字典, 其中键为字段名,值为文件对象;注意只有当表单提交方式为POST并且表单具有 enctype=“multipart/form-data”属性时request.FILES中才有数据;否则,里面是空的;
(2)使用循环chunks方法将文件写入硬盘,而不是用read()方法,避免文件过大使内存过载;

Looping over UploadedFile.chunks() instead of using read() ensures that large files don’t overwhelm your system’s memory.
There are a few other methods and attributes available on UploadedFile objects; see UploadedFile for a complete reference.

2.Handling uploaded files with a model处理关于model的上传文件

If you’re saving a file on a Model with a FileField, using a ModelForm makes this process much easier. The file object will be saved to the location specified by the upload_to argument of the corresponding FileField when calling form.save():

from django.http import HttpResponseRedirect
from django.shortcuts import render
from .forms import ModelFormWithFileField
def upload_file(request):
if request.method == 'POST':
form = ModelFormWithFileField(request.POST, request.FILES)
if form.is_valid():
#file is saved
form.save()
return HttpResponseRedirect('/success/url/')
else:
form = ModelFormWithFileField()
return render(request, 'upload.html', {'form': form})

If you are constructing an object manually, you can simply assign the file object from request.FILES to the file field in the model:

from django.http import HttpResponseRedirect
from django.shortcuts import render
from .forms import UploadFileForm
from .models import ModelWithFileField
def upload_file(request):
if request.method == 'POST':
form = UploadFileForm(request.POST, request.FILES)
if form.is_valid():
instance = ModelWithFileField(file_field=request.FILES['file'])
instance.save()
return HttpResponseRedirect('/success/url/')
else:
form = UploadFileForm()
return render(request, 'upload.html', {'form': form})

如果你在model上的FileField保存文件, 那使用ModelForm会简单很多;当调用form.save()方法时,文件对象会保存到upload_to参数指定的位置,见第一例;
如果你想手动构建对象, 可用把request.FILE中的文件对象赋值给模型中的file field, 见第二例;

3.Uploading multiple files上传多个文件

If you want to upload multiple files using one form field, set the multiple HTML attribute of field’s widget:

from django import forms
class FileFieldForm(forms.Form):
file_field = forms.FileField(widget=forms.ClearableFileInput(attrs={'multiple':True}))

#Then override the post method of your FormView subclass to handle multiple file uploads:
from django.views.generic.edit import FormView
from .forms import FileFieldForm
class FileFieldView(FormView):
form_class = FileFieldForm
template_name = 'upload.html' # Replace with your template.
success_url = '...' # Replace with your URL or reverse().
def post(self, request, *args, **kwargs):
form_class = self.get_form_class()
form = self.get_form(form_class)
files = request.FILES.getlist('file_field')
if form.is_valid():
for f in files:
... # Do something with each file.
return self.form_valid(form)
else:
return self.form_invalid(form)

如果你想在一个表单字段上传多个文件,那么对字段插件设置multiple属性为True, 然后FormView子类覆盖post()方法,见上例;
4.Upload Handlers上传处理器

When a user uploads a file, Django passes off the file data to an upload handler – a small class that handles file data as it gets uploaded. Upload handlers are initially defined in the FILE_UPLOAD_HANDLERS setting, which defaults to:

["django.core.files.uploadhandler.MemoryFileUploadHandler",
"django.core.files.uploadhandler.TemporaryFileUploadHandler"]

Together MemoryFileUploadHandler and TemporaryFileUploadHandler provide Django’s default file upload behavior of reading small files into memory and large ones onto disk. You can write custom handlers that customize how Django handles files. You could, for example, use custom handlers to enforce user-level quotas, compress data on the fly, render progress bars, and even send data to another storage location directly without storing it locally. See Writing custom upload handlers for details on how you can customize or completely replace upload behavior.

当一个用户上传一个文件时, django将文件数据传递给一个上传处理器, 即一个处理文件数据的小类, 两个文件上传处理器见上述;
这两个处理器默认会提供如下行为:将小文件读入内存,将大文件写入硬盘; 你可以自己定制文件处理器来强制用户配额, 动态压缩数据, 渲染进度条, 甚至直接将数据发送到其他的存储位置,而无需在本地存储;

5.Where uploaded data is stored在哪里存放上传数据

Before you save uploaded files, the data needs to be stored somewhere.
By default, if an uploaded file is smaller than 2.5 megabytes, Django will hold the entire contents of the upload in memory. This means that saving the file involves only a read from memory and a write to disk and thus is very fast.
However, if an uploaded file is too large, Django will write the uploaded file to a temporary file stored in your system’s temporary directory. On a Unix-like platform this means you can expect Django to generate a file called something like /tmp/tmpzfp6I6.upload. If an upload is large enough, you can watch this file grow in size as Django streams the data onto disk.
These specifics – 2.5 megabytes; /tmp; etc. – are simply “reasonable defaults” which can be customized as described in the next section.

当你保存文件之前, 数据需要存储在某个地方;
默认情况下, 上传文件小于2.5M时, 直接保存在内存中, 然后保存文件是直接从内存中读取然后写入硬盘, 过程非常快;
当上传文件太大时, django将会把上传文件保存在系统的临时目录中, 在Unix类型的系统中, 意味着你可以在/tmp文件夹下找到;
当然2.5M的限制可自己定制更改;

6.Changing upload handler behavior改变上传处理器的行为

There are a few settings which control Django’s file upload behavior. See File Upload Settings for details.

参考相关章节;

7.Modifying upload handlers on the fly动态更改上传处理器

Sometimes particular views require different upload behavior. In these cases, you can override upload handlers on a per-request basis by modifying request.upload_handlers. By default, this list will contain the upload handlers given by FILE_UPLOAD_HANDLERS, but you can modify the list as you would any other list.
For instance, suppose you’ve written a ProgressBarUploadHandler that provides feedback on upload progress to some sort of AJAX widget. You’d add this handler to your upload handlers like this:

request.upload_handlers.insert(0, ProgressBarUploadHandler(request))

You’d probably want to use list.insert() in this case (instead of append()) because a progress bar handler would need to run before any other handlers. Remember, the upload handlers are processed in order.
If you want to replace the upload handlers completely, you can just assign a new list:

request.upload_handlers = [ProgressBarUploadHandler(request)]

某些特殊的视图函数可能需要不同的上传行为; 在这种情况下, 你可以基于每个request覆盖上传处理器, 通过更改request.upload_handlers属性;
默认情况下, 这个列表会包含FILE_UPLODE_HANDLERS中的上传处理器,但你可以更改为任何您想要的列表;
例如, 你写了一个ProcessBarHandler可以对某个Ajax插件提供回调函数, 你可以将handler添加到列表中, 注意使用insert而不是append因为该handler需要在其他handler之前运行,而handler是按顺序运行的;你想完全替代列表也是可以的, 可以直接将列表赋值给request.upload_handlers;

Note: You can only modify upload handlers before accessing request.POST or request.FILES – it doesn’t make sense to change upload handlers after upload handling has already started. If you try to modify request. upload_handlers after reading from request.POST or request.FILES Django will throw an error.
Thus, you should always modify uploading handlers as early in your view as possible.
Also, request.POST is accessed by CsrfViewMiddleware which is enabled by default. This means you will need to use csrf_exempt() on your view to allow you to change the upload handlers. You will then need to use csrf_protect() on the function that actually processes the request. Note that this means that the handlers may start receiving the file upload before the CSRF checks have been done. Example code:

from django.views.decorators.csrf import csrf_exempt, csrf_protect
@csrf_exempt
def upload_file_view(request):
request.upload_handlers.insert(0, ProgressBarUploadHandler(request))
return _upload_file_view(request)
@csrf_protect
def _upload_file_view(request):
... # Process request

注意,你只能在request.POST或request.FILES之前更改上传处理器, 如果在之后更改,则会抛出错误;而且应该尽早更改上传处理器;
另外, request.POST是被CsrfViewMiddleware处理的, 这意味着你需要使用csrf_expect()以及csrf_protect();注意这意味着处理器可能在csrf检查完成之前开始接收上传文件;

8.上传的文件对象的方法;

class UploadedFile
During file uploads, the actual file data is stored in request.FILES. Each entry in this dictionary is an UploadedFile object (or a subclass) – a simple wrapper around an uploaded file. You’ll usually use one of these methods to access the uploaded content:

UploadedFile.read()
Read the entire uploaded data from the file. Be careful with this method: if the uploaded file is huge it can overwhelm your system if you try to read it into memory. You’ll probably want to use chunks() instead; see below.

UploadedFile.multiple_chunks(chunk_size=None)
Returns True if the uploaded file is big enough to require reading in multiple chunks. By default this will be any file larger than 2.5 megabytes, but that’s configurable; see below.

UploadedFile.chunks(chunk_size=None)
A generator returning chunks of the file. If multiple_chunks() is True, you should use this method in a loop instead of read().
In practice, it’s often easiest simply to use chunks() all the time. Looping over chunks() instead of using read() ensures that large files don’t overwhelm your system’s memory.

在上传文件期间, 实际的文件数据被存放在request.FILES中, 该字典的每个值都是一个UploadedFile对象, 一个上传文件的简单封装;你通常可以使用如下几个方法访问文内容;

(1)UploadedFile.read()
将整个上传文件读入内存,使用的时候要小心:如果上传文件比较巨大, 可能压垮你的系统;你可以使用chunks()方法;
(2)UploadedFile.multiple_chunks(chunk_size=None)
如果上传文件足够大到需要分多个块读取时返回True;默认情况下指的是大于2.5M的文件;
(3)UploadedFile.chunks(chunk_size=None)
一个返回文件块的生成器;如果multiple_chunks()返回True, 那么你应该使用该方法在循环中,而不是使用read();
实际情况下, 使用chunks()经常是最简单的方法;

其他几种有用的方法:
(4)UploadedFile.name

The name of the uploaded file (e.g. my_file.txt).

返回上传文件的文件名

(5)UploadedFile.size

The size, in bytes, of the uploaded file.

上传文件的大小;

(6)UploadedFile.content_type

The content-type header uploaded with the file (e.g. text/plain or application/pdf). Like any data supplied by the user, you shouldn’t trust that the uploaded file is actually this type. You’ll still need to validate that the file contains the content that the content-type header claims – “trust but verify.”

和上传文件一起的content-type头, 与用户提供的任何数据一样,你不应该相信上传的文件真的是这种类型;你仍然需要校验文件包含的内容;

(7)UploadedFile.content_type_extra

A dictionary containing extra parameters passed to the content-type header. This is typically provided by services, such as Google App Engine, that intercept and handle file uploads on your behalf. As a result your handler may not receive the uploaded file content, but instead a URL or other pointer to the file. (see RFC 2388 section 5.3).

一个包含传入content-type头的额外参数的字典,这通常由Google App Engine等提供服务,替代你拦截并处理上传文件;结果可能是你无法接收上传文件内容,而接到的是一个URL或指向该文件的其他指针;

(8)UploadedFile.charset

For text/* content-types, the character set (i.e. utf8) supplied by the browser. Again, “trust but verify” is the best policy here.

对于text/* 内容类型, 浏览器应用的字符集;同样, 需要验证;

Note: Like regular Python files, you can read the file line-by-line simply by iterating over the uploaded file:

for line in uploadedfile:
do_something_with(line)

Lines are split using universal newlines. The following are recognized as ending a line: the Unix end-of-line convention ‘\n’, the Windows convention ‘\r\n’, and the old Macintosh convention ‘\r’.

注意,常规的python文件,可以通过一行一行的方式读取,如上例;每行间通过通用换行符分隔;

(9)UploadedFile的子类:

Subclasses of UploadedFile include:
class TemporaryUploadedFile
A file uploaded to a temporary location (i.e. stream-to-disk). This class is used by the TemporaryFileUploadHandler. In addition to the methods from UploadedFile, it has one additional method:
TemporaryUploadedFile.temporary_file_path()
Returns the full path to the temporary uploaded file.

class InMemoryUploadedFile
A file uploaded into memory (i.e. stream-to-memory). This class is used by the MemoryFileUploadHandler.

包括两个子类:
1)TemporaryUploadedFile
上传到临时位置的文件;该类被TemporaryFileUploadHandler使用,除了上述方法外, 还有一个额外的方法:
TemporaryUploadedFile.temporary_file_path()
返回临时上传文件的全路径;

2)InMemoryUploadedFile
上传到内存中的文件;该类被MemoryFileUploadHandler使用

(10)Built-in upload handlers内置文件处理器

Together the MemoryFileUploadHandler and TemporaryFileUploadHandler provide Django’s default file upload behavior of reading small files into memory and large ones onto disk. They are located in django.core.files.uploadhandler.

class MemoryFileUploadHandler
File upload handler to stream uploads into memory (used for small files).
将文件存入内存,适用于小文件;

class TemporaryFileUploadHandler
Upload handler that streams data into a temporary file using TemporaryUploadedFile.
将文件存入临时目录;

(11)Writing custom upload handlers写自定制上传处理器

class FileUploadHandler
All file upload handlers should be subclasses of django.core.files.uploadhandler. FileUploadHandler. You can define upload handlers wherever you wish.
Required methods
Custom file upload handlers must define the following methods:

所有的上传处理器都应该是FileUploadHandler的子类,有几个必须定义的方法:

1)FileUploadHandler.receive_data_chunk(raw_data, start)

Receives a “chunk” of data from the file upload.
raw_data is a byte string containing the uploaded data.
start is the position in the file where this raw_data chunk begins.
The data you return will get fed into the subsequent upload handlers’ receive_data_chunk methods. In this way, one handler can be a “filter” for other handlers.
Return None from receive_data_chunk to short-circuit remaining upload handlers from getting this chunk. This is useful if you’re storing the uploaded data yourself and don’t want future handlers to store a copy of the data.
If you raise a StopUpload or a SkipFile exception, the upload will abort or the file will be completely skipped.

从上传文件中接收块数据;
raw_data是包含上传数据的字节串;
start是该方法从文件中的开始位置;
返回的数据将会传入后续的上传处理器中, 从这个角度讲, 一个处理器可以成为其他处理器的过滤器;
如果返回None, 则可以使后续的处理器短路从而无法获得文件块; 这在你想要自己保存上传文件并不想以后的处理器保存数据的备份时非常有用;
如果抛出StopUpload和SkipFile异常,上传将会终止或者文件将会完全跳过;

2)FileUploadHandler.file_complete(file_size)

Called when a file has finished uploading.
The handler should return an UploadedFile object that will be stored in request.FILES. Handlers may also return None to indicate that the UploadedFile object should come from subsequent upload handlers.

当一个文件上传结束时调用;
处理器应该返回一个UploadedFile对象,该对象将会被存储在request.FILES中,如果返回None,则表明该对象应该在后续的处理器中产生;

还有可选方法:
1)FileUploadHandler.chunk_size

Size, in bytes, of the “chunks” Django should store into memory and feed into the handler. That is, this attribute controls the size of chunks fed into FileUploadHandler.receive_data_chunk.
For maximum performance the chunk sizes should be divisible by 4 and should not exceed 2 GB (2³¹ bytes) in size. When there are multiple chunk sizes provided by multiple handlers, Django will use the smallest chunk size defined by any handler.
The default is 64*210 bytes, or 64 KB.

django放入内存并且传入处理器的块的大小;也就是说, 这个属性控制传入receive_data_chunk()方法的块的大小;
最大应该是4的倍数并且不超过2GB,如果多个处理器规定了不同的chunk_size ,那django将会选取其中最小的值;
默认为64KB;

2)FileUploadHandler.new_file(field_name, file_name, content_type, content_length, charset, content_type_extra)

Callback signaling that a new file upload is starting. This is called before any data has been fed to any upload handlers.
field_name is a string name of the file < input> field.
file_name is the filename provided by the browser.
content_type is the MIME type provided by the browser – E.g. ‘image/jpeg’.
content_length is the length of the image given by the browser. Sometimes this won’t be provided and will be None.
charset is the character set (i.e. utf8) given by the browser. Like content_length, this sometimes won’t be provided.
content_type_extra is extra information about the file from the content-type header. See
UploadedFile.content_type_extra.
This method may raise a StopFutureHandlers exception to prevent future handlers from handling this file.

当新文件开始上传时的回调信号;在任何数据被传入任何处理器前被调用;
参数说明见上述;

3)FileUploadHandler.upload_complete()

Callback signaling that the entire upload (all files) has completed.

当整个文件上传完成时的回调信号;

4)FileUploadHandler.handle_raw_input(input_data, META, content_length, boundary, encoding)

Allows the handler to completely override the parsing of the raw HTTP input.
input_data is a file-like object that supports read()-ing.
META is the same object as request.META.
content_length is the length of the data in input_data. Don’t read more than content_length bytes from input_data.
boundary is the MIME boundary for this request.
encoding is the encoding of the request.
Return None if you want upload handling to continue, or a tuple of (POST, FILES) if you want to return the new data structures suitable for the request directly.

允许处理器完全覆盖原生HTTP输入的解析;

你可能感兴趣的:(web)

理解Gunicorn：Python WSGI服务器的基石范范0825 ipython linux 运维
理解Gunicorn：PythonWSGI服务器的基石介绍Gunicorn，全称GreenUnicorn，是一个为PythonWSGI（WebServerGatewayInterface）应用设计的高效、轻量级HTTP服务器。作为PythonWeb应用部署的常用工具，Gunicorn以其高性能和易用性著称。本文将介绍Gunicorn的基本概念、安装和配置，帮助初学者快速上手。1.什么是Gunico
Google earth studio 简介陟彼高冈yu 旅游
GoogleEarthStudio是一个基于Web的动画工具，专为创作使用GoogleEarth数据的动画和视频而设计。它利用了GoogleEarth强大的三维地图和卫星影像数据库，使用户能够轻松地创建逼真的地球动画、航拍视频和动态地图可视化。网址为https://www.google.com/earth/studio/。GoogleEarthStudio是一个基于Web的动画工具，专为创作使用G
PHP环境搭建详细教程好看资源平台前端 php
PHP是一个流行的服务器端脚本语言，广泛用于Web开发。为了使PHP能够在本地或服务器上运行，我们需要搭建一个合适的PHP环境。本教程将结合最新资料，介绍在不同操作系统上搭建PHP开发环境的多种方法，包括Windows、macOS和Linux系统的安装步骤，以及本地和Docker环境的配置。1.PHP环境搭建概述PHP环境的搭建主要分为以下几类：集成开发环境：例如XAMPP、WAMP、MAMP，这
DIV+CSS+JavaScript技术制作网页（旅游主题网页设计与制作）云南大理 STU学生网页设计网页设计期末网页作业 html静态网页 html5期末大作业网页设计 web大作业
️精彩专栏推荐作者主页:【进入主页—获取更多源码】web前端期末大作业：【HTML5网页期末作业(1000套)】程序员有趣的告白方式：【HTML七夕情人节表白网页制作(110套)】文章目录二、网站介绍三、网站效果▶️1.视频演示2.图片演示四、网站代码HTML结构代码CSS样式代码五、更多源码二、网站介绍网站布局方面：计划采用目前主流的、能兼容各大主流浏览器、显示效果稳定的浮动网页布局结构。网站程
关于城市旅游的HTML网页设计——(旅游风景云南 5页)HTML+CSS+JavaScript 二挡起步 web前端期末大作业 javascript html css 旅游风景
⛵源码获取文末联系✈Web前端开发技术描述网页设计题材，DIV+CSS布局制作,HTML+CSS网页设计期末课程大作业|游景点介绍|旅游风景区|家乡介绍|等网站的设计与制作|HTML期末大学生网页设计作业，Web大学生网页HTML：结构CSS：样式在操作方面上运用了html5和css3，采用了div+css结构、表单、超链接、浮动、绝对定位、相对定位、字体样式、引用视频等基础知识JavaScrip
HTML网页设计制作大作业（div+css）云南我的家乡旅游景点带文字滚动二挡起步 web前端期末大作业 web设计网页规划与设计 html css javascript dreamweaver 前端
Web前端开发技术描述网页设计题材，DIV+CSS布局制作,HTML+CSS网页设计期末课程大作业游景点介绍|旅游风景区|家乡介绍|等网站的设计与制作HTML期末大学生网页设计作业HTML：结构CSS：样式在操作方面上运用了html5和css3，采用了div+css结构、表单、超链接、浮动、绝对定位、相对定位、字体样式、引用视频等基础知识JavaScript：做与用户的交互行为文章目录前端学习路线
git - Webhook让部署自动化大猪大猪
我们现在有一个需求，将项目打包上传到gitlab或者github后，程序能自动部署，不用手动地去服务器中进行项目更新并运行，如何做到？这里我们可以使用gitlab与github的挂钩，挂钩的原理就是，每当我们有请求到gitlab与github服务器时，这时他俩会根据我们配置的挂钩地扯进行访问，webhook挂钩程序会一直监听着某个端口请求，一但收到他们发过来的请求，这时就知道用户有请求提交了，这时
webpack图片等资源的处理 dmengmeng
需要的loaderfile-loader（让我们可以引入这些资源文件）url-loader（其实是file-loader的二次封装）img-loader（处理图片所需要的）在没有使用任何处理图片的loader之前，比如说css中用到了背景图片，那么最后打包会报错的，因为他没办法处理图片。其实你只想能够使用图片的话。只加一个file-loader就可以，打开网页能准确看到图片。{test:/\.(p
「豆包Marscode体验官」 | 云端 IDE 启动 & Rust 体验张风捷特烈 ide rust 开发语言后端
theme:cyanosis我正在参加「豆包MarsCode初体验」征文活动MarsCode可以看作一个运行在服务端的远程VSCode开发环境。对于我这种想要学习体验某些语言，但不想在电脑里装环境的人来说非常友好。本文就来介绍一下在MarsCode里，我的体验rust开发体验。一、MarsCode是什么它的本质是:提供代码助手和云端IDE服务的web网站，可通过下面的链接访问https://www
Python神器！WEB自动化测试集成工具 DrissionPage 亚丁号 python 开发语言
一、前言用requests做数据采集面对要登录的网站时，要分析数据包、JS源码，构造复杂的请求，往往还要应付验证码、JS混淆、签名参数等反爬手段，门槛较高。若数据是由JS计算生成的，还须重现计算过程，体验不好，开发效率不高。使用浏览器，可以很大程度上绕过这些坑，但浏览器运行效率不高。因此，这个库设计初衷，是将它们合而为一，能够在不同须要时切换相应模式，并提供一种人性化的使用方法，提高开发和运行效率
Java爬虫框架（一）--架构设计狼图腾-狼之传说 java 框架 java 任务 html解析器存储电子商务
一、架构图那里搜网络爬虫框架主要针对电子商务网站进行数据爬取，分析，存储，索引。爬虫：爬虫负责爬取，解析，处理电子商务网站的网页的内容数据库：存储商品信息索引：商品的全文搜索索引Task队列：需要爬取的网页列表Visited表：已经爬取过的网页列表爬虫监控平台：web平台可以启动，停止爬虫，管理爬虫，task队列，visited表。二、爬虫1.流程1)Scheduler启动爬虫器，TaskMast
Java：爬虫框架 dingcho Java java 爬虫
一、ApacheNutch2【参考地址】Nutch是一个开源Java实现的搜索引擎。它提供了我们运行自己的搜索引擎所需的全部工具。包括全文搜索和Web爬虫。Nutch致力于让每个人能很容易,同时花费很少就可以配置世界一流的Web搜索引擎.为了完成这一宏伟的目标,Nutch必须能够做到:每个月取几十亿网页为这些网页维护一个索引对索引文件进行每秒上千次的搜索提供高质量的搜索结果简单来说Nutch支持分
MongoDB知识概括 GeorgeLin98 持久层 mongodb
MongoDB知识概括MongoDB相关概念单机部署基本常用命令索引-IndexSpirngDataMongoDB集成副本集分片集群安全认证MongoDB相关概念业务应用场景：传统的关系型数据库（如MySQL），在数据操作的“三高”需求以及应对Web2.0的网站需求面前，显得力不从心。解释：“三高”需求：①Highperformance-对数据库高并发读写的需求。②HugeStorage-对海量数
Python实现下载当前年份的谷歌影像 sand&wich python 开发语言
在GIS项目和地图应用中，获取最新的地理影像数据是非常重要的。本文将介绍如何使用Python代码从Google地图自动下载当前年份的影像数据，并将其保存为高分辨率的TIFF格式文件。这个过程涉及地理坐标转换、多线程下载和图像处理。关键功能该脚本的核心功能包括：坐标转换：支持WGS-84与WebMercator投影之间转换，以及处理中国GCJ-02偏移。自动化下载：多线程下载地图瓦片，提高效率。图像
Spring MVC 全面指南：从入门到精通的详细解析一杯梅子酱技术栈学习 spring mvc java
引言：SpringMVC，作为Spring框架的一个重要模块，为构建Web应用提供了强大的功能和灵活性。无论是初学者还是有一定经验的开发者，掌握SpringMVC都将显著提升你的Web开发技能。本文旨在为初学者提供一个全面且易于理解的学习路径，通过详细的知识点分析和实际案例，帮助你快速上手SpringMVC，让学习过程既深刻又高效。一、SpringMVC简介1.1什么是SpringMVC？Spri
Spring Boot中实现跨域请求 BABA8891 spring boot 后端 java
在SpringBoot中实现跨域请求（CORS，Cross-OriginResourceSharing）可以通过多种方式，以下是几种常见的方法：1.使用@CrossOrigin注解在SpringBoot中，你可以在控制器或者具体的请求处理方法上使用@CrossOrigin注解来允许跨域请求。在控制器上应用：importorg.springframework.web.bind.annotation.
WebMagic：强大的Java爬虫框架解析与实战 Aaron_945 Java java 爬虫开发语言
文章目录引言官网链接WebMagic原理概述基础使用1.添加依赖2.编写PageProcessor高级使用1.自定义Pipeline2.分布式抓取优点结论引言在大数据时代，网络爬虫作为数据收集的重要工具，扮演着不可或缺的角色。Java作为一门广泛使用的编程语言，在爬虫开发领域也有其独特的优势。WebMagic是一个开源的Java爬虫框架，它提供了简单灵活的API，支持多线程、分布式抓取，以及丰富的
00. 这里整理了最全的爬虫框架（Java + Python）有一只柴犬爬虫系列爬虫 java python
目录1、前言2、什么是网络爬虫3、常见的爬虫框架3.1、java框架3.1.1、WebMagic3.1.2、Jsoup3.1.3、HttpClient3.1.4、Crawler4j3.1.5、HtmlUnit3.1.6、Selenium3.2、Python框架3.2.1、Scrapy3.2.2、BeautifulSoup+Requests3.2.3、Selenium3.2.4、PyQuery3.2
最简单将静态网页挂载到服务器上(不用nginx) 全能全知者服务器 nginx 运维前端 html 笔记
最简单将静态网页挂载到服务器上(不用nginx)如果随便弄个静态网页挂在服务器都要用nignx就太麻烦了，所以直接使用Apache来搭建一些简单前端静态网页会相对方便很多检查Web服务器服务状态：sudosystemctlstatushttpd#ApacheWeb服务器如果发现没有安装web服务器：安装Apache：sudoyuminstallhttpd启动Apache：sudosystemctl
uniapp使用内置地图选择插件，实现地址选择并在地图上标点神夜大侠 Uniapp vue.js uniapp
uniapp使用内置地图选择插件，实现地址选择并在地图上标点代码如下：page{background:#F4F5F6;}::-webkit-scrollbar{width:0;height:0;color:transparent;}page{height:100%;width:100%;font-size:24rpx;}image,view,input,textarea,label,text,na
【Golang】实现 Excel 文件下载功能 RumIV Golang golang excel 开发语言
在当今的网络应用开发中，提供数据导出功能是一项常见的需求。Excel作为一种广泛使用的电子表格格式，通常是数据导出的首选格式之一。在本教程中，我们将学习如何使用Go语言和GinWeb框架来创建一个Excel文件，并允许用户通过HTTP请求下载该文件。准备工作在开始之前，请确保您的开发环境中已经安装了Go语言和相关的开发工具。此外，您还需要安装GinWeb框架和excelize包，这两个包都将用于我
VUE3 + xterm + nestjs实现web远程终端或连接开启SSH登录的路由器和交换机。焚木灵 node.js vue
可远程连接系统终端或开启SSH登录的路由器和交换机。相关资料：xtermjs/xterm.js:Aterminalfortheweb(github.com)后端实现(NestJS)：1、安装依赖：npminstallnode-ssh@nestjs/websockets@nestjs/platform-socket.io2、我们将创建一个名为RemoteControlModule的NestJS模块，
metaRTC8.0，一个全新架构的webRTC SDK库 metaRTC webrtc 音视频
概述metaRTC8.0是metaRTC开源以来架构变化最大的一个版本，是metaIPC3.0等高性能的基础。metaRTC8.0是一个全新架构版本，并非在metaRTC7.0版本上简单升级，在QOS/语音对讲/内存占用/视频文件录制读取等方面新增多个模块，在弱网对抗/语音对讲/内存优化等效果上有显著提升。metaRTC8.0在一年多的开发中进行了近200次迭代，metaRTC8.0社区版计划在2
metaRTC/webRTC QOS 方案与实践 metaRTC metaRTC 解决方案 webrtc qos
概述质量服务(QOS/QualityofService)是指利用各种技术方案提高网络通信质量的技术，网络通信质量需要解决下面两个问题：网络问题：UDP/不稳定网络/弱网下的丢包/延时/乱序/抖动数据量问题：发送数据量超带宽负载和平滑发送拥塞控制是各种技术方案的数据基础，丢包恢复解决丢包问题，抗乱序抖动解决网络乱序抖动问题，流量控制解决平滑发送数据/数据超带宽负载/延时问题。拥塞控制(Congest
metaRTC5.0 API编程指南(一) metaRTC metaRTC c++c语言 webrtc
概述metaRTC5.0版本API进行了重构，本篇文章将介绍webrtc传输调用流程和例子。metaRTC5.0版本提供了C++和纯C两种接口。纯C接口YangPeerConnection头文件:include/yangrtc/YangPeerConnection.htypedefstruct{void*conn;YangAVInfo*avinfo;YangStreamConfigstreamco
详解“c:/work/src/components/a/b.vue“‘ has no default export报错原因 hw_happy 开发语言前端 vue.js javascript
前情提要在一个vue文件中需要引入定义的b.vue文件，但是提示b文件没有默认导出，对于vue2文件来说有exportdefault，在中，所有定义的变量、函数和组件都会自动被视为默认导出的组件内容。因此，不需要显式地使用exportdefault来导出组件。但是在我引用这个文件的时候还是提示了这个错误，原来是我的项目使用了ts和vite\webpack，因为TypeScript和Vue的默认导出
原力元宇宙：Web3时代下的虚拟现实融合与普通人逆袭的机遇口碑信息传播者
在数字化浪潮席卷全球的今天，一个崭新的概念——原力元宇宙，正以其独特的魅力吸引着越来越多的目光。作为元宇宙国际性的一个项目，原力元宇宙不仅融合了Web3第三代互联网的前沿技术，更将虚拟现实与现实生活紧密相连，为我们描绘出一幅前所未有的数字新世界画卷。13分钟视频内容讲明白原力元宇宙创富项目，中国区运营服务对接微信：ForceZen原力元宇宙，是一个时代的跨越，它代表着互联网技术的又一次革新。Web
html+css网页设计旅游网站首页1个页面 html+css+js网页设计 html css 旅游
html+css网页设计旅游网站首页1个页面网页作品代码简单，可使用任意HTML辑软件（如：Dreamweaver、HBuilder、Vscode、Sublime、Webstorm、Text、Notepad++等任意html编辑软件进行运行及修改编辑等操作）。获取源码1，访问该网站https://download.csdn.net/download/qq_42431718/897527112，点击
bat+ffmpeg批处理图片，图片批量转码张雨zy 音视频 ffmpeg
直接在cmd中输入//批量转码文件for%ain("*.png")doffmpeg-i"%a"-fs1024k"%~na.webp"//删除所有pngdel*.png@echooff表示执行了这条命令后关闭所有命令(包括本身这条命令)的回显。而echooff命令则表示关闭其他所有命令(不包括本身这条命令)的回显，@的作用就是关闭紧跟其后的一条命令的回显脚本完整代码写入脚本中后，需要多加一个%，例如
css设置当字数超过限制后以省略号（...）显示周bro css 前端 vue css3 html 经验分享
1、文字超出一行，省略超出部分，显示’…’用text-overflow:ellipsis属性来，当然还需要加宽度width属来兼容部分浏览。overflow:hidden;text-overflow:ellipsis;white-space:nowrap;2、多行文本溢出显示省略号display:-webkit-box;-webkit-box-orient:vertical;-webkit-lin
ASM系列六利用TreeApi 添加和移除类成员 lijingyao8206 jvm 动态代理 ASM 字节码技术 TreeAPI
同生成的做法一样，添加和移除类成员只要去修改fields和methods中的元素即可。这里我们拿一个简单的类做例子，下面这个Task类，我们来移除isNeedRemove方法，并且添加一个int 类型的addedField属性。 package asm.core; /** * Created by yunshen.ljy on 2015/6/
Springmvc-权限设计 bee1314 spring Web jsp
万丈高楼平地起。权限管理对于管理系统而言已经是标配中的标配了吧，对于我等俗人更是不能免俗。同时就目前的项目状况而言，我们还不需要那么高大上的开源的解决方案，如Spring Security，Shiro。小伙伴一致决定我们还是从基本的功能迭代起来吧。目标： 1.实现权限的管理（CRUD） 2.实现部门管理（CRUD) 3.实现人员的管理（CRUD） 4.实现部门和权限
算法竞赛入门经典（第二版）第2章习题 CrazyMizzz c 算法
2.4.1 输出技巧 #include <stdio.h> int main() { int i, n; scanf("%d", &n); for (i = 1; i <= n; i++) printf("%d\n", i); return 0; } 习题2-2 水仙花数(daffodil
struts2中jsp自动跳转到Action 麦田的设计者 jsp webxml struts2 自动跳转
1、在struts2的开发中，经常需要用户点击网页后就直接跳转到一个Action，执行Action里面的方法，利用mvc分层思想执行相应操作在界面上得到动态数据。毕竟用户不可能在地址栏里输入一个Action（不是专业人士） 2、＜jsp:forward page="xxx.action" /＞，这个标签可以实现跳转，page的路径是相对地址,不同与jsp和j
php 操作webservice实例 IT独行者 PHP webservice
首先大家要简单了解了何谓webservice，接下来就做两个非常简单的例子，webservice还是逃不开server端与client端。我测试的环境为：apache2.2.11 php5.2.10做这个测试之前，要确认你的php配置文件中已经将soap扩展打开，即extension=php_soap.dll; OK 现在我们来体验webservice //server端 serve
Windows下使用Vagrant安装linux系统 _wy_ windows vagrant
准备工作：下载安装 VirtualBox ：https://www.virtualbox.org/ 下载安装 Vagrant ：http://www.vagrantup.com/ 下载需要使用的 box ：官方提供的范例：http://files.vagrantup.com/precise32.box 还可以在 http://www.vagrantbox.es/
更改linux的文件拥有者及用户组(chown和chgrp) 无量 c linux chgrp chown
本文（转） http://blog.163.com/yanenshun@126/blog/static/128388169201203011157308/ http://ydlmlh.iteye.com/blog/1435157 一、基本使用：使用chown命令可以修改文件或目录所属的用户：命令
linux下抓包工具矮蛋蛋 linux
原文地址： http://blog.chinaunix.net/uid-23670869-id-2610683.html tcpdump -nn -vv -X udp port 8888 上面命令是抓取udp包、端口为8888 netstat -tln 命令是用来查看linux的端口使用情况 13 . 列出所有的网络连接 lsof -i 14. 列出所有tcp 网络连接信息 l
我觉得mybatis是垃圾！：“每一个用mybatis的男纸，你伤不起” alafqq mybatis
最近看了每一个用mybatis的男纸，你伤不起原文地址：http://www.iteye.com/topic/1073938 发表一下个人看法。欢迎大神拍砖；个人一直使用的是Ibatis框架，公司对其进行过小小的改良；最近换了公司，要使用新的框架。听说mybatis不错；就对其进行了部分的研究；发现多了一个mapper层；个人感觉就是个dao；
解决java数据交换之谜百合不是茶数据交换
交换两个数字的方法有以下三种，其中第一种最常用 /* 输出最小的一个数 */ public class jiaohuan1 { public static void main(String[] args) { int a =4; int b = 3; if(a<b){ // 第一种交换方式 int tmep =
渐变显示 bijian1013 JavaScript
<style type="text/css"> #wxf { FILTER: progid:DXImageTransform.Microsoft.Gradient(GradientType=0, StartColorStr=#ffffff, EndColorStr=#97FF98); height: 25px; } </style>
探索JUnit4扩展：断言语法assertThat bijian1013 java 单元测试 assertThat
一.概述 JUnit 设计的目的就是有效地抓住编程人员写代码的意图，然后快速检查他们的代码是否与他们的意图相匹配。 JUnit 发展至今，版本不停的翻新，但是所有版本都一致致力于解决一个问题，那就是如何发现编程人员的代码意图，并且如何使得编程人员更加容易地表达他们的代码意图。JUnit 4.4 也是为了如何能够
【Gson三】Gson解析{"data":{"IM":["MSN","QQ","Gtalk"]}} bit1129 gson
如何把如下简单的JSON字符串反序列化为Java的POJO对象? {"data":{"IM":["MSN","QQ","Gtalk"]}} 下面的POJO类Model无法完成正确的解析： import com.google.gson.Gson;
【Kafka九】Kafka High Level API vs. Low Level API bit1129 kafka
1. Kafka提供了两种Consumer API High Level Consumer API Low Level Consumer API(Kafka诡异的称之为Simple Consumer API，实际上非常复杂) 在选用哪种Consumer API时，首先要弄清楚这两种API的工作原理，能做什么不能做什么，能做的话怎么做的以及用的时候，有哪些可能的问题
在nginx中集成lua脚本：添加自定义Http头，封IP等 ronin47 nginx lua
Lua是一个可以嵌入到Nginx配置文件中的动态脚本语言，从而可以在Nginx请求处理的任何阶段执行各种Lua代码。刚开始我们只是用Lua 把请求路由到后端服务器，但是它对我们架构的作用超出了我们的预期。下面就讲讲我们所做的工作。强制搜索引擎只索引mixlr.com Google把子域名当作完全独立的网站，我们不希望爬虫抓取子域名的页面，降低我们的Page rank。 location /{
java-归并排序 bylijinnan java
import java.util.Arrays; public class MergeSort { public static void main(String[] args) { int[] a={20,1,3,8,5,9,4,25}; mergeSort(a,0,a.length-1); System.out.println(Arrays.to
Netty源码学习-CompositeChannelBuffer bylijinnan java netty
CompositeChannelBuffer体现了Netty的“Transparent Zero Copy” 查看API（ http://docs.jboss.org/netty/3.2/api/org/jboss/netty/buffer/package-summary.html#package_description）可以看到，所谓“Transparent Zero Copy”是通
Android中给Activity添加返回键 hotsunshine Activity
// this need android:minSdkVersion="11" getActionBar().setDisplayHomeAsUpEnabled(true); @Override public boolean onOptionsItemSelected(MenuItem item) {
静态页面传参 ctrain 静态
$(document).ready(function () { var request = { QueryString : function (val) { var uri = window.location.search; var re = new RegExp("" + val + "=([^&?]*)", &
Windows中查找某个目录下的所有文件中包含某个字符串的命令 daizj windows 查找某个目录下的所有文件包含某个字符串
findstr可以完成这个工作。 [html] view plain copy >findstr /s /i "string" *.* 上面的命令表示，当前目录以及当前目录的所有子目录下的所有文件中查找"string&qu
改善程序代码质量的一些技巧 dcj3sjt126com 编程 PHP 重构
有很多理由都能说明为什么我们应该写出清晰、可读性好的程序。最重要的一点，程序你只写一次，但以后会无数次的阅读。当你第二天回头来看你的代码时，你就要开始阅读它了。当你把代码拿给其他人看时，他必须阅读你的代码。因此，在编写时多花一点时间，你会在阅读它时节省大量的时间。让我们看一些基本的编程技巧：尽量保持方法简短尽管很多人都遵
SharedPreferences对数据的存储 dcj3sjt126com
SharedPreferences简介： &nbs
linux复习笔记之bash shell (2) bash基础 eksliang bash bash shell
转载请出自出处： http://eksliang.iteye.com/blog/2104329 1.影响显示结果的语系变量（locale） 1.1locale这个命令就是查看当前系统支持多少种语系，命令使用如下： [root@localhost shell]# locale LANG=en_US.UTF-8 LC_CTYPE="en_US.UTF-8"
Android零碎知识总结 gqdy365 android
1、CopyOnWriteArrayList add(E) 和remove(int index)都是对新的数组进行修改和新增。所以在多线程操作时不会出现java.util.ConcurrentModificationException错误。所以最后得出结论：CopyOnWriteArrayList适合使用在读操作远远大于写操作的场景里，比如缓存。发生修改时候做copy，新老版本分离，保证读的高
HoverTree.Model.ArticleSelect类的作用 hvt Web .net C#hovertree asp.net
ArticleSelect类在命名空间HoverTree.Model中可以认为是文章查询条件类，用于存放查询文章时的条件，例如HvtId就是文章的id。HvtIsShow就是文章的显示属性，当为-1是，该条件不产生作用，当为0时，查询不公开显示的文章，当为1时查询公开显示的文章。HvtIsHome则为是否在首页显示。HoverTree系统源码完全开放，开发环境为Visual Studio 2013
PHP 判断是否使用代理 PHP Proxy Detector 天梯梦 proxy
1. php 类 I found this class looking for something else actually but I remembered I needed some while ago something similar and I never found one. I'm sure it will help a lot of developers who try to
apache的math库中的回归——regression（翻译） lvdccyb Math apache
这个Math库，虽然不向weka那样专业的ML库，但是用户友好，易用。多元线性回归，协方差和相关性（皮尔逊和斯皮尔曼），分布测试（假设检验，t，卡方，G），统计。数学库中还包含，Cholesky，LU，SVD，QR，特征根分解，真不错。基本覆盖了：线代，统计，矩阵，最优化理论曲线拟合常微分方程遗传算法（GA），还有3维的运算。。。
基础数据结构和算法十三：Undirected Graphs (2) sunwinner Algorithm
Design pattern for graph processing. Since we consider a large number of graph-processing algorithms, our initial design goal is to decouple our implementations from the graph representation
云计算平台最重要的五项技术 sumapp 云计算云平台智城云
云计算平台最重要的五项技术 1、云服务器云服务器提供简单高效，处理能力可弹性伸缩的计算服务，支持国内领先的云计算技术和大规模分布存储技术，使您的系统更稳定、数据更安全、传输更快速、部署更灵活。特性机型丰富通过高性能服务器虚拟化为云服务器，提供丰富配置类型虚拟机，极大简化数据存储、数据库搭建、web服务器搭建等工作；仅需要几分钟，根据CP
《京东技术解密》有奖试读获奖名单公布 ITeye管理员活动
ITeye携手博文视点举办的12月技术图书有奖试读活动已圆满结束，非常感谢广大用户对本次活动的关注与参与。 12月试读活动回顾： http://webmaster.iteye.com/blog/2164754 本次技术图书试读活动获奖名单及相应作品如下：一等奖（两名） Microhardest：http://microhardest.ite