File uploads in Django

1.Simple example:

from django import forms

class UploadFileForm(forms.Form):
    title = forms.CharField(max_length=50)
    file = forms.FileField()

A view handling this form will receive the file data in request.FILES, which is a dictionary containing a key for each FileField (or ImageField, or other FileField subclass) in the form. So the data from the above form would be accessible as request.FILES['file'].
Note that request.FILES will only contain data if the request method was POST and the <form> that posted the request has the attribute enctype="multipart/form-data". Otherwise, request.FILES will be empty.
Most of the time, you’ll simply pass the file data from request into the form as described in Binding uploaded files to a form. This would look something like:

from django.http import HttpResponseRedirect
from django.shortcuts import render
from .forms import UploadFileForm

# Imaginary function to handle an uploaded file.
from somewhere import handle_uploaded_file

def upload_file(request):
    if request.method == 'POST':
        form = UploadFileForm(request.POST, request.FILES)
        if form.is_valid():
            handle_uploaded_file(request.FILES['file'])
            return HttpResponseRedirect('/success/url/')
    else:
        form = UploadFileForm()
    return render(request, 'upload.html', {'form': form})

Notice that we have to pass request.FILES into the form's constructor; this is how file data gets bound into a form.
Here's a common way you might handle an uploaded file:

def handle_uploaded_file(f):
    with open('some/file/name.txt', 'wb+') as destination:
        for chunk in f.chunks():
            destination.write(chunk)

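(1)The example above shows a simple form with a file field and the view that handles it. The view reads the file data from request.FILES, a dictionary whose keys are field names and whose values are file objects. request.FILES only contains data when the form is submitted via POST and the <form> has the attribute enctype="multipart/form-data"; otherwise it is empty.
(2)Write the file to disk by looping over chunks() rather than calling read(), so that a large file does not exhaust memory.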

Looping over UploadedFile.chunks() instead of using read() ensures that large files don’t overwhelm your system’s memory.
There are a few other methods and attributes available on UploadedFile objects; see UploadedFile for a complete reference.
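The chunked write can be tried without a running Django project. The sketch below is only an illustration: FakeUpload is a hypothetical stand-in that mimics nothing but chunks(), and save_in_chunks and the 4-byte chunk size are names and values chosen for the demonstration, not part of Django.

```python
import os
import tempfile


class FakeUpload:
    """Hypothetical stand-in for an UploadedFile; only chunks() is mimicked."""

    def __init__(self, data, chunk_size=4):
        self._data = data
        self._chunk_size = chunk_size

    def chunks(self):
        # Yield the payload piece by piece instead of all at once.
        for start in range(0, len(self._data), self._chunk_size):
            yield self._data[start:start + self._chunk_size]


def save_in_chunks(upload, path):
    """Write an upload to `path` chunk by chunk and return the bytes written."""
    written = 0
    with open(path, "wb") as destination:
        for chunk in upload.chunks():
            destination.write(chunk)
            written += len(chunk)
    return written


# Write a small payload into a fresh temporary directory.
target = os.path.join(tempfile.mkdtemp(), "name.txt")
total = save_in_chunks(FakeUpload(b"hello, upload"), target)
```

Only one chunk is held in memory at a time, which is the property the note above relies on. In a real view, the object from request.FILES would take the place of FakeUpload.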

2.Handling uploaded files with a model

If you’re saving a file on a Model with a FileField, using a ModelForm makes this process much easier. The file object will be saved to the location specified by the upload_to argument of the corresponding FileField when calling form.save():

from django.http import HttpResponseRedirect
from django.shortcuts import render
from .forms import ModelFormWithFileField

def upload_file(request):
    if request.method == 'POST':
        form = ModelFormWithFileField(request.POST, request.FILES)
        if form.is_valid():
            # file is saved
            form.save()
            return HttpResponseRedirect('/success/url/')
    else:
        form = ModelFormWithFileField()
    return render(request, 'upload.html', {'form': form})

If you are constructing an object manually, you can simply assign the file object from request.FILES to the file field in the model:

from django.http import HttpResponseRedirect
from django.shortcuts import render
from .forms import UploadFileForm
from .models import ModelWithFileField

def upload_file(request):
    if request.method == 'POST':
        form = UploadFileForm(request.POST, request.FILES)
        if form.is_valid():
            instance = ModelWithFileField(file_field=request.FILES['file'])
            instance.save()
            return HttpResponseRedirect('/success/url/')
    else:
        form = UploadFileForm()
    return render(request, 'upload.html', {'form': form})

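If you store the file in a FileField on a model, using a ModelForm is much simpler: calling form.save() saves the file object to the location given by the upload_to argument (first example).
If you want to build the object by hand, assign the file object from request.FILES to the model's file field (second example).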

3.Uploading multiple files

If you want to upload multiple files using one form field, set the multiple HTML attribute of the field's widget:

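from django import forms

class FileFieldForm(forms.Form):
    # Note: recent Django releases raise ValueError when ClearableFileInput is
    # given the multiple attribute; check the documentation for your version.
    file_field = forms.FileField(widget=forms.ClearableFileInput(attrs={'multiple': True}))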

Then override the post method of your FormView subclass to handle multiple file uploads:

from django.views.generic.edit import FormView
from .forms import FileFieldForm

class FileFieldView(FormView):
    form_class = FileFieldForm
    template_name = 'upload.html'  # Replace with your template.
    success_url = '...'  # Replace with your URL or reverse().

    def post(self, request, *args, **kwargs):
        form_class = self.get_form_class()
        form = self.get_form(form_class)
        files = request.FILES.getlist('file_field')
        if form.is_valid():
            for f in files:
                ...  # Do something with each file.
            return self.form_valid(form)
        else:
            return self.form_invalid(form)

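To upload several files through one form field, set the multiple attribute to True on the field's widget, then override post() in your FormView subclass, as in the example above.

4.Upload Handlers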

When a user uploads a file, Django passes off the file data to an upload handler – a small class that handles file data as it gets uploaded. Upload handlers are initially defined in the FILE_UPLOAD_HANDLERS setting, which defaults to:

["django.core.files.uploadhandler.MemoryFileUploadHandler",
"django.core.files.uploadhandler.TemporaryFileUploadHandler"]

Together MemoryFileUploadHandler and TemporaryFileUploadHandler provide Django’s default file upload behavior of reading small files into memory and large ones onto disk. You can write custom handlers that customize how Django handles files. You could, for example, use custom handlers to enforce user-level quotas, compress data on the fly, render progress bars, and even send data to another storage location directly without storing it locally. See Writing custom upload handlers for details on how you can customize or completely replace upload behavior.

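When a user uploads a file, Django hands the file data to an upload handler, a small class that processes the data as it arrives. The two default handlers are listed above.
Together these two handlers provide the default behavior: small files are read into memory and large files are written to disk. You can write custom handlers to enforce per-user quotas, compress data on the fly, render progress bars, or even send the data straight to another storage location without storing it locally.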

5.Where uploaded data is stored

Before you save uploaded files, the data needs to be stored somewhere.
By default, if an uploaded file is smaller than 2.5 megabytes, Django will hold the entire contents of the upload in memory. This means that saving the file involves only a read from memory and a write to disk and thus is very fast.
However, if an uploaded file is too large, Django will write the uploaded file to a temporary file stored in your system’s temporary directory. On a Unix-like platform this means you can expect Django to generate a file called something like /tmp/tmpzfp6I6.upload. If an upload is large enough, you can watch this file grow in size as Django streams the data onto disk.
These specifics – 2.5 megabytes; /tmp; etc. – are simply “reasonable defaults” which can be customized as described in the next section.

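Before you save an uploaded file, its data has to be held somewhere.
By default, an upload smaller than 2.5 MB is held entirely in memory, so saving it is just a read from memory followed by a write to disk, which is very fast.
When the upload is too large, Django writes it to a file in the system's temporary directory; on a Unix-like system this means you can find it under /tmp.
Both the 2.5 MB threshold and the temporary location can be customized.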

6.Changing upload handler behavior

There are a few settings which control Django’s file upload behavior. See File Upload Settings for details.

See the File Upload Settings section of the Django documentation.
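As a rough illustration of what those settings look like, here is a settings fragment showing the defaults as I understand them. Only FILE_UPLOAD_HANDLERS is named in the text above; FILE_UPLOAD_MAX_MEMORY_SIZE and FILE_UPLOAD_TEMP_DIR are the Django setting names I believe govern the 2.5 MB threshold and the temporary directory, so verify them against the settings reference for your version.

```python
# settings.py (fragment)

# Uploads larger than this many bytes are streamed to a temporary file
# instead of being held in memory. 2621440 bytes is 2.5 MB.
FILE_UPLOAD_MAX_MEMORY_SIZE = 2621440

# Directory for temporary upload files. None means the system's standard
# temporary directory (typically /tmp on Unix-like platforms).
FILE_UPLOAD_TEMP_DIR = None

# The handler chain, shown here with its default value.
FILE_UPLOAD_HANDLERS = [
    "django.core.files.uploadhandler.MemoryFileUploadHandler",
    "django.core.files.uploadhandler.TemporaryFileUploadHandler",
]
```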

7.Modifying upload handlers on the fly

Sometimes particular views require different upload behavior. In these cases, you can override upload handlers on a per-request basis by modifying request.upload_handlers. By default, this list will contain the upload handlers given by FILE_UPLOAD_HANDLERS, but you can modify the list as you would any other list.
For instance, suppose you’ve written a ProgressBarUploadHandler that provides feedback on upload progress to some sort of AJAX widget. You’d add this handler to your upload handlers like this:

request.upload_handlers.insert(0, ProgressBarUploadHandler(request))

You’d probably want to use list.insert() in this case (instead of append()) because a progress bar handler would need to run before any other handlers. Remember, the upload handlers are processed in order.
If you want to replace the upload handlers completely, you can just assign a new list:

request.upload_handlers = [ProgressBarUploadHandler(request)]

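Some views need different upload behavior. In that case you can override the upload handlers per request by modifying request.upload_handlers.
By default this list contains the handlers given by FILE_UPLOAD_HANDLERS, but you can change it like any other list.
For example, if you have written a ProgressBarUploadHandler that reports upload progress to an AJAX widget, add it to the list with insert() rather than append(): handlers run in order, and this one has to run before the others. You can also replace the list entirely by assigning a new list to request.upload_handlers.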

Note: You can only modify upload handlers before accessing request.POST or request.FILES – it doesn’t make sense to change upload handlers after upload handling has already started. If you try to modify request.upload_handlers after reading from request.POST or request.FILES Django will throw an error.
Thus, you should always modify uploading handlers as early in your view as possible.
Also, request.POST is accessed by CsrfViewMiddleware which is enabled by default. This means you will need to use csrf_exempt() on your view to allow you to change the upload handlers. You will then need to use csrf_protect() on the function that actually processes the request. Note that this means that the handlers may start receiving the file upload before the CSRF checks have been done. Example code:

from django.views.decorators.csrf import csrf_exempt, csrf_protect

@csrf_exempt
def upload_file_view(request):
    request.upload_handlers.insert(0, ProgressBarUploadHandler(request))
    return _upload_file_view(request)

@csrf_protect
def _upload_file_view(request):
    ...  # Process request

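Note that you can only change the upload handlers before request.POST or request.FILES is accessed; changing them afterwards raises an error, so do it as early in the view as possible.
Also, request.POST is accessed by CsrfViewMiddleware, so you need csrf_exempt() on the outer view and csrf_protect() on the function that actually processes the request. This means the handlers may start receiving the upload before the CSRF check has completed.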

8.Methods and attributes of uploaded file objects

class UploadedFile
During file uploads, the actual file data is stored in request.FILES. Each entry in this dictionary is an UploadedFile object (or a subclass) – a simple wrapper around an uploaded file. You’ll usually use one of these methods to access the uploaded content:

UploadedFile.read()
Read the entire uploaded data from the file. Be careful with this method: if the uploaded file is huge it can overwhelm your system if you try to read it into memory. You’ll probably want to use chunks() instead; see below.

UploadedFile.multiple_chunks(chunk_size=None)
Returns True if the uploaded file is big enough to require reading in multiple chunks. By default this will be any file larger than 2.5 megabytes, but that’s configurable; see below.

UploadedFile.chunks(chunk_size=None)
A generator returning chunks of the file. If multiple_chunks() is True, you should use this method in a loop instead of read().
In practice, it’s often easiest simply to use chunks() all the time. Looping over chunks() instead of using read() ensures that large files don’t overwhelm your system’s memory.

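During an upload, the actual file data is stored in request.FILES. Each value in this dictionary is an UploadedFile object, a simple wrapper around an uploaded file. These methods are the usual way to access the content:

(1)UploadedFile.read()
Reads the whole upload into memory. Use it with care: a very large file can overwhelm your system, so chunks() is usually the better choice.
(2)UploadedFile.multiple_chunks(chunk_size=None)
Returns True if the file is large enough to need reading in multiple chunks; by default that means any file larger than 2.5 MB.
(3)UploadedFile.chunks(chunk_size=None)
A generator that yields chunks of the file. If multiple_chunks() returns True, loop over this method instead of calling read().
In practice, always using chunks() is often the simplest approach.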
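One case where looping over chunks() pays off is computing a checksum, since the digest can be updated piece by piece. The sketch below assumes nothing about Django beyond the chunks() method described above; InMemoryStandIn and sha256_of_upload are hypothetical names, and the 1024-byte chunk size is arbitrary.

```python
import hashlib


class InMemoryStandIn:
    """Hypothetical stand-in exposing read() and chunks() like an upload."""

    def __init__(self, data, chunk_size=1024):
        self._data = data
        self._chunk_size = chunk_size

    def read(self):
        # Returns everything at once; costly for large payloads.
        return self._data

    def chunks(self):
        # Yields fixed-size slices of the payload.
        for start in range(0, len(self._data), self._chunk_size):
            yield self._data[start:start + self._chunk_size]


def sha256_of_upload(upload):
    """Hash an upload incrementally so only one chunk is in memory at a time."""
    digest = hashlib.sha256()
    for chunk in upload.chunks():
        digest.update(chunk)
    return digest.hexdigest()


payload = b"abc" * 1000
checksum = sha256_of_upload(InMemoryStandIn(payload))
```

The incremental digest equals the one obtained by hashing the result of read() in a single call, so nothing is lost by always taking the chunked route.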

Other useful attributes:
(4)UploadedFile.name

The name of the uploaded file (e.g. my_file.txt).

The file name of the upload.

(5)UploadedFile.size

The size, in bytes, of the uploaded file.

The size of the upload, in bytes.

(6)UploadedFile.content_type

The content-type header uploaded with the file (e.g. text/plain or application/pdf). Like any data supplied by the user, you shouldn’t trust that the uploaded file is actually this type. You’ll still need to validate that the file contains the content that the content-type header claims – “trust but verify.”

The content-type header sent with the file. As with any user-supplied data, do not assume the file really is of this type; you still need to validate its content.
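One inexpensive way to "verify" is to compare the first bytes of the file with the well-known signature of the claimed type. This is only a sketch: content_matches_claim and MAGIC_PREFIXES are hypothetical names, the table covers just three formats, and a matching signature is a weak check rather than proof that the file is safe.

```python
# Leading bytes ("magic numbers") for a few common formats.
MAGIC_PREFIXES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "application/pdf": b"%PDF-",
}


def content_matches_claim(claimed_type, head):
    """Return True if `head` starts with the signature of `claimed_type`.

    Types missing from the table return False, so they are rejected by default.
    """
    prefix = MAGIC_PREFIXES.get(claimed_type)
    return prefix is not None and head.startswith(prefix)


# A PDF header passes for application/pdf but not for image/png.
pdf_ok = content_matches_claim("application/pdf", b"%PDF-1.7\n...")
png_ok = content_matches_claim("image/png", b"%PDF-1.7\n...")
```

In a view, head would be the first few bytes of the uploaded file and claimed_type its content_type attribute.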

(7)UploadedFile.content_type_extra

A dictionary containing extra parameters passed to the content-type header. This is typically provided by services, such as Google App Engine, that intercept and handle file uploads on your behalf. As a result your handler may not receive the uploaded file content, but instead a URL or other pointer to the file. (see RFC 2388 section 5.3).

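A dictionary of extra parameters passed in the content-type header. It is typically supplied by services such as Google App Engine that intercept and handle file uploads on your behalf; as a result, your handler may receive a URL or another pointer to the file rather than the file content itself.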

(8)UploadedFile.charset

For text/* content-types, the character set (i.e. utf8) supplied by the browser. Again, “trust but verify” is the best policy here.

For text/* content types, the character set supplied by the browser. Again, it needs to be verified.

Note: Like regular Python files, you can read the file line-by-line simply by iterating over the uploaded file:

for line in uploadedfile:
    do_something_with(line)

Lines are split using universal newlines. The following are recognized as ending a line: the Unix end-of-line convention '\n', the Windows convention '\r\n', and the old Macintosh convention '\r'.

Note: as with regular Python files, an uploaded file can be read line by line, as in the example above; lines are split on universal newlines.

(9)Subclasses of UploadedFile:

Subclasses of UploadedFile include:
class TemporaryUploadedFile
A file uploaded to a temporary location (i.e. stream-to-disk). This class is used by the TemporaryFileUploadHandler. In addition to the methods from UploadedFile, it has one additional method:
TemporaryUploadedFile.temporary_file_path()
Returns the full path to the temporary uploaded file.

class InMemoryUploadedFile
A file uploaded into memory (i.e. stream-to-memory). This class is used by the MemoryFileUploadHandler.

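There are two subclasses:
1)TemporaryUploadedFile
A file uploaded to a temporary location. This class is used by TemporaryFileUploadHandler. Besides the methods above, it has one additional method:
TemporaryUploadedFile.temporary_file_path()
Returns the full path of the temporary uploaded file.

2)InMemoryUploadedFile
A file uploaded into memory. This class is used by MemoryFileUploadHandler.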

(10)Built-in upload handlers

Together the MemoryFileUploadHandler and TemporaryFileUploadHandler provide Django’s default file upload behavior of reading small files into memory and large ones onto disk. They are located in django.core.files.uploadhandler.

class MemoryFileUploadHandler
File upload handler to stream uploads into memory (used for small files).
Keeps the file in memory; suitable for small files.

class TemporaryFileUploadHandler
Upload handler that streams data into a temporary file using TemporaryUploadedFile.
Writes the file to the temporary directory.

(11)Writing custom upload handlers

class FileUploadHandler
All file upload handlers should be subclasses of django.core.files.uploadhandler.FileUploadHandler. You can define upload handlers wherever you wish.
Required methods
Custom file upload handlers must define the following methods:

Every upload handler should be a subclass of FileUploadHandler and must define the following methods:

1)FileUploadHandler.receive_data_chunk(raw_data, start)

Receives a “chunk” of data from the file upload.
raw_data is a byte string containing the uploaded data.
start is the position in the file where this raw_data chunk begins.
The data you return will get fed into the subsequent upload handlers’ receive_data_chunk methods. In this way, one handler can be a “filter” for other handlers.
Return None from receive_data_chunk to short-circuit remaining upload handlers from getting this chunk. This is useful if you’re storing the uploaded data yourself and don’t want future handlers to store a copy of the data.
If you raise a StopUpload or a SkipFile exception, the upload will abort or the file will be completely skipped.

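Receives one chunk of data from the upload.
raw_data is a byte string containing the uploaded data.
start is the position in the file where this chunk begins.
The data you return is passed to the following upload handlers, so one handler can act as a filter for the others.
Returning None short-circuits the remaining handlers so that they never see the chunk. This is useful when you store the uploaded data yourself and do not want later handlers to keep a copy.
Raising StopUpload aborts the upload; raising SkipFile skips the file entirely.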

2)FileUploadHandler.file_complete(file_size)

Called when a file has finished uploading.
The handler should return an UploadedFile object that will be stored in request.FILES. Handlers may also return None to indicate that the UploadedFile object should come from subsequent upload handlers.

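Called when a file has finished uploading.
The handler should return an UploadedFile object, which is stored in request.FILES. Returning None indicates that the object should come from a later handler.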
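The filter and short-circuit behavior of receive_data_chunk can be modeled in plain Python. The sketch below is a simplified simulation of a handler chain, not Django's actual parser: the three handler classes and feed_chunk are hypothetical, and real handlers would subclass FileUploadHandler and also define file_complete().

```python
class UppercaseHandler:
    """Acts as a filter: transforms the chunk and passes it on."""

    def receive_data_chunk(self, raw_data, start):
        return raw_data.upper()


class CollectingHandler:
    """Stores chunks itself and returns None to short-circuit later handlers."""

    def __init__(self):
        self.parts = []

    def receive_data_chunk(self, raw_data, start):
        self.parts.append((start, raw_data))
        return None


class NeverReachedHandler:
    """Counts calls; it sits after the collector, so it should see nothing."""

    def __init__(self):
        self.calls = 0

    def receive_data_chunk(self, raw_data, start):
        self.calls += 1
        return raw_data


def feed_chunk(handlers, chunk, start):
    """Pass one chunk down the chain until a handler returns None."""
    for handler in handlers:
        chunk = handler.receive_data_chunk(chunk, start)
        if chunk is None:
            break


collector = CollectingHandler()
unreached = NeverReachedHandler()
chain = [UppercaseHandler(), collector, unreached]

# Feed two chunks, tracking the offset at which each one begins.
position = 0
for piece in (b"abc", b"def"):
    feed_chunk(chain, piece, position)
    position += len(piece)
```

The collector receives the uppercased data together with each chunk's start offset, while the handler placed after it is never called.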

Optional methods and attributes:
1)FileUploadHandler.chunk_size

Size, in bytes, of the “chunks” Django should store into memory and feed into the handler. That is, this attribute controls the size of chunks fed into FileUploadHandler.receive_data_chunk.
For maximum performance the chunk sizes should be divisible by 4 and should not exceed 2 GB (2^31 bytes) in size. When there are multiple chunk sizes provided by multiple handlers, Django will use the smallest chunk size defined by any handler.
The default is 64 * 2^10 bytes, or 64 KB.

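The size of the chunks that Django stores in memory and feeds to the handler; in other words, this attribute controls the size of the chunks passed to receive_data_chunk().
For best performance the chunk size should be divisible by 4 and should not exceed 2 GB. If several handlers specify different chunk sizes, Django uses the smallest one.
The default is 64 KB.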
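The arithmetic behind these numbers is easy to check. In the sketch below the list of handler chunk sizes is made up for illustration; only the 64 KB default, the 2 GB ceiling, the divisible-by-4 advice, and the smallest-value rule come from the text above.

```python
# Hypothetical chunk_size values declared by three handlers, in bytes.
handler_chunk_sizes = [64 * 2 ** 10, 16 * 2 ** 10, 256 * 2 ** 10]

default_chunk_size = 64 * 2 ** 10  # 65536 bytes, i.e. 64 KB
upper_bound = 2 ** 31              # 2147483648 bytes, i.e. 2 GB

# When handlers disagree, the smallest declared size is the one used.
effective_chunk_size = min(handler_chunk_sizes)

# The recommendation: divisible by 4 and no larger than 2 GB.
follows_advice = (
    effective_chunk_size % 4 == 0 and effective_chunk_size <= upper_bound
)
```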

2)FileUploadHandler.new_file(field_name, file_name, content_type, content_length, charset, content_type_extra)

Callback signaling that a new file upload is starting. This is called before any data has been fed to any upload handlers.
field_name is a string name of the file <input> field.
file_name is the filename provided by the browser.
content_type is the MIME type provided by the browser – E.g. 'image/jpeg'.
content_length is the length of the image given by the browser. Sometimes this won’t be provided and will be None.
charset is the character set (i.e. utf8) given by the browser. Like content_length, this sometimes won’t be provided.
content_type_extra is extra information about the file from the content-type header. See UploadedFile.content_type_extra.
This method may raise a StopFutureHandlers exception to prevent future handlers from handling this file.

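Callback signaling that a new file upload is starting; it is called before any data has been fed to any handler.
The parameters are described above.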

3)FileUploadHandler.upload_complete()

Callback signaling that the entire upload (all files) has completed.

Callback invoked once the entire upload (all files) has completed.

4)FileUploadHandler.handle_raw_input(input_data, META, content_length, boundary, encoding)

Allows the handler to completely override the parsing of the raw HTTP input.
input_data is a file-like object that supports read()-ing.
META is the same object as request.META.
content_length is the length of the data in input_data. Don’t read more than content_length bytes from input_data.
boundary is the MIME boundary for this request.
encoding is the encoding of the request.
Return None if you want upload handling to continue, or a tuple of (POST, FILES) if you want to return the new data structures suitable for the request directly.

Lets the handler completely override the parsing of the raw HTTP input.
