I found something in the Specification of OpenGL Version 4.6 (Core Profile):
The output of Vertex Shader:
If the output variables are passed directly to the vertex processing stages lead- ing to rasterization, the values of all outputs are expected to be interpolated across the primitive being rendered, unless flatshaded. Otherwise the values of all out- puts are collected by the primitive assembly stage and passed on to the subsequent pipeline stage once enough data for one primitive has been collected.
Seems that if there is no TES and GS, the "Primitive Assembly" will be done later. Otherwise where will be an "early primitive assembly " as said in official documentation.
However in "11.1.3 Shader Execution" of the specification:
The following sequence of operations is performed:
- Vertices are processed by the vertex shader (see section 11.1) and assembled into primitives as described in sections 10.1 through 10.3.
- If the current program contains a tessellation control shader, each indi- vidual patch primitive is processed by the tessellation control shader (sec- tion 11.2.1). Otherwise, primitives are passed through unmodified. If active, the tessellation control shader consumes its input patch and produces a new patch primitive, which is passed to subsequent pipeline stages.
- If the current program contains a tessellation evaluation shader, each indi- vidual patch primitive is processed by the tessellation primitive generator (section 11.2.2) and tessellation evaluation shader (see section 11.2.3). Oth- erwise, primitives are passed through unmodified. When a tessellation eval- uation shader is active, the tessellation primitive generator produces a new collection of point, line, or triangle primitives to be passed to subsequent pipeline stages. The vertices of these primitives are processed by the tes- sellation evaluation shader. The patch primitive passed to the tessellation primitive generator is consumed by this process.
- If the current program contains a geometry shader, each individual primitive is processed by the geometry shader (section 11.3). Otherwise, primitives are passed through unmodified. If active, the geometry shader consumes its input patch primitive. However, each geometry shader invocation may emit new vertices, which are arranged into primitives and passed to subsequent pipeline stages.
Following shader execution, the fixed-function operations described in chap- ter 13 are applied.
Have a look at the "fixed-function operations" in chapter 13, there will do all the "Vertex Post-processing" such as clipping, perspective-divide, viewport transform and Transform-feedback. In chapter 13, I found:
After programmable vertex processing, the following fixed-function operations are applied to vertices of the resulting primitives:...
My understand
I think the accurate time of "Primitive Assembly" happened maybe is a little hard to tell, but I tend to believe that this is done just after vertex process. As Nicol said What's clear is that the process of clipping knows about the individual primitives, so some primitive assembly has happened prior to reaching that stage..
I think one of main task of the so-called "Primitive Assembly" stage which between vertex process and rasterization is the face culling. (I am not familiar with multi-draw maybe this stage is to do with this too.)
A figure of vertex process:
A simple pipeline based on my simple understand:
# Start
(Vertices Data)
|
|
|
V
Vertex Shader # Do primitive assembly here
|
|
| (primitives)
|
V
[Tessellation Shaders]
|
|
| (primitives)
|
V
[Geometry Shader]
|
|
| (primitives)
|
V
Vertex Post-processing
|
|
| (primitives)
|
V
Primitive Assembly # Mainly do face culling
|
|
| (primitives)
|
V
Rasterization
|
|
| (fragment)
|
V
Fragment Shader
# [] means can be ignored