Unity Shader PBR vs BlinnPhong - ARM Mobile Studio - Graphics Analyzer & Mali Offline Compiler Analysis

Contents

  • Purpose
  • Download ARM Mobile Studio from the official site
  • Launch Graphics Analyzer
  • Analyzing shader unit usage with Mali Offline Compiler
    • Make sure your APK is built in dev mode (debug mode)
    • Set up ADB
    • Connect a physical device with a Mali GPU
    • Identify the chip model
    • Analyze with Mali Offline Compiler
    • pbr.vert shader
    • Locate the shader asset with shadermap
    • pbr.vert analysis results
    • pbr.frag shader
  • Optimizing the shader: replacing Unity's default BRDF3 with the Blinn-Phong lighting model
    • Analyzing unit usage of blinn_phong.vert and blinn_phong.frag
      • blinn_phong.vert shader
      • pbr.vert vs blinn_phong.vert
      • blinn_phong.frag shader
      • pbr.frag vs blinn_phong.frag
  • Project
  • References


Purpose

Notes for my own reference, kept here so they are easy to look up later.


Download ARM Mobile Studio from the official site

https://developer.arm.com/downloads/search?term=Arm%20Mobile%20Studio

[Figure 1]


Launch Graphics Analyzer

After installation, type "Graphics Analyzer" in the Start menu to launch it.

[Figure 2]

Startup screen:
[Figure 3]


Analyzing shader unit usage with Mali Offline Compiler


Make sure your APK is built in dev mode (debug mode)

Make sure the build installed on the physical device is a debuggable APK.

[Figure 4]


Set up ADB

Locate adb with the Everything search tool, or from a command prompt with: where adb
[Figure]

Then set that path in AGA:
[Figure 5]

[Figure 6]


Connect a physical device with a Mali GPU

Run adb devices and confirm that the device shows up in the list of attached devices.
[Figure]

Open the AGA device manager.
[Figure]

Wait for AGA to detect the device.
[Figure 7]

Select the app to launch, and make sure its Debuggable column shows a green check mark.
Here we use the OpenGL ES option.
[Figure 8]


Identify the chip model

First, check the device's chip model in the AGA (Arm Graphics Analyzer) window opened earlier.

The Graphics Analyzer Console reports: Mali-T880
[Figure 9]


Analyze with Mali Offline Compiler

For usage details, see my earlier post: Mali Offline Compiler - 官方视频教学 - 笔录 (notes on the official video tutorial)


pbr.vert shader

We can use AGA's shadermap to find out which shader asset a given draw uses.


Locate the shader asset with shadermap

[Figure]

Then capture a frame, look at the shadermap framebuffer, and find the shader that corresponds to the object you care about.

Note that this view only gives you the fragment shader ID. For the vertex shader, use the fragment shader ID minus 1.

For example, for Fragment Shader 14 below, the vertex shader ID is 13.

[Figure 10]

Double-click the shader to open it.
[Figure 11]

[Figure 12]

Then copy the shader source into a file in the same directory as malioc.exe (no special reason, it is just convenient for testing).
[Figure 13]

I recommend using the Everything tool to find the malioc.exe directory.

Search for: malioc

[Figure 14]

Here is the vertex shader source:

#version 320 es

#define HLSLCC_ENABLE_UNIFORM_BUFFERS 1
#if HLSLCC_ENABLE_UNIFORM_BUFFERS
#define UNITY_UNIFORM
#else
#define UNITY_UNIFORM uniform
#endif
#define UNITY_SUPPORTS_UNIFORM_LOCATION 1
#if UNITY_SUPPORTS_UNIFORM_LOCATION
#define UNITY_LOCATION(x) layout(location = x)
#define UNITY_BINDING(x) layout(binding = x, std140)
#else
#define UNITY_LOCATION(x)
#define UNITY_BINDING(x) layout(std140)
#endif
uniform 	vec3 _WorldSpaceCameraPos;
uniform 	mediump vec4 unity_SHBr;
uniform 	mediump vec4 unity_SHBg;
uniform 	mediump vec4 unity_SHBb;
uniform 	mediump vec4 unity_SHC;
uniform 	vec4 hlslcc_mtx4x4unity_ObjectToWorld[4];
uniform 	vec4 hlslcc_mtx4x4unity_WorldToObject[4];
uniform 	vec4 unity_WorldTransformParams;
uniform 	vec4 hlslcc_mtx4x4unity_MatrixVP[4];
uniform 	vec4 _MainTex_ST;
uniform 	vec4 _DetailAlbedoMap_ST;
uniform 	mediump float _UVSec;
in highp vec4 in_POSITION0;
in mediump vec3 in_NORMAL0;
in highp vec2 in_TEXCOORD0;
in highp vec2 in_TEXCOORD1;
in mediump vec4 in_TANGENT0;
out highp vec4 vs_TEXCOORD0;
out highp vec4 vs_TEXCOORD1;
out highp vec4 vs_TEXCOORD2;
out highp vec4 vs_TEXCOORD3;
out highp vec4 vs_TEXCOORD4;
out mediump vec4 vs_TEXCOORD5;
out highp vec4 vs_TEXCOORD7;
out highp vec3 vs_TEXCOORD8;
vec4 u_xlat0;
mediump vec4 u_xlat16_0;
bool u_xlatb0;
vec4 u_xlat1;
mediump vec3 u_xlat16_2;
mediump vec3 u_xlat16_3;
float u_xlat12;
void main()
{
    u_xlat0 = in_POSITION0.yyyy * hlslcc_mtx4x4unity_ObjectToWorld[1];
    u_xlat0 = hlslcc_mtx4x4unity_ObjectToWorld[0] * in_POSITION0.xxxx + u_xlat0;
    u_xlat0 = hlslcc_mtx4x4unity_ObjectToWorld[2] * in_POSITION0.zzzz + u_xlat0;
    u_xlat0 = u_xlat0 + hlslcc_mtx4x4unity_ObjectToWorld[3];
    u_xlat1 = u_xlat0.yyyy * hlslcc_mtx4x4unity_MatrixVP[1];
    u_xlat1 = hlslcc_mtx4x4unity_MatrixVP[0] * u_xlat0.xxxx + u_xlat1;
    u_xlat1 = hlslcc_mtx4x4unity_MatrixVP[2] * u_xlat0.zzzz + u_xlat1;
    gl_Position = hlslcc_mtx4x4unity_MatrixVP[3] * u_xlat0.wwww + u_xlat1;
#ifdef UNITY_ADRENO_ES3
    u_xlatb0 = !!(_UVSec==0.0);
#else
    u_xlatb0 = _UVSec==0.0;
#endif
    u_xlat0.xy = (bool(u_xlatb0)) ? in_TEXCOORD0.xy : in_TEXCOORD1.xy;
    vs_TEXCOORD0.zw = u_xlat0.xy * _DetailAlbedoMap_ST.xy + _DetailAlbedoMap_ST.zw;
    vs_TEXCOORD0.xy = in_TEXCOORD0.xy * _MainTex_ST.xy + _MainTex_ST.zw;
    u_xlat0.xyz = in_POSITION0.yyy * hlslcc_mtx4x4unity_ObjectToWorld[1].xyz;
    u_xlat0.xyz = hlslcc_mtx4x4unity_ObjectToWorld[0].xyz * in_POSITION0.xxx + u_xlat0.xyz;
    u_xlat0.xyz = hlslcc_mtx4x4unity_ObjectToWorld[2].xyz * in_POSITION0.zzz + u_xlat0.xyz;
    u_xlat0.xyz = hlslcc_mtx4x4unity_ObjectToWorld[3].xyz * in_POSITION0.www + u_xlat0.xyz;
    vs_TEXCOORD1.xyz = u_xlat0.xyz + (-_WorldSpaceCameraPos.xyz);
    vs_TEXCOORD8.xyz = u_xlat0.xyz;
    vs_TEXCOORD1.w = 0.0;
    u_xlat0.xyz = in_TANGENT0.yyy * hlslcc_mtx4x4unity_ObjectToWorld[1].xyz;
    u_xlat0.xyz = hlslcc_mtx4x4unity_ObjectToWorld[0].xyz * in_TANGENT0.xxx + u_xlat0.xyz;
    u_xlat0.xyz = hlslcc_mtx4x4unity_ObjectToWorld[2].xyz * in_TANGENT0.zzz + u_xlat0.xyz;
    u_xlat12 = dot(u_xlat0.xyz, u_xlat0.xyz);
    u_xlat12 = inversesqrt(u_xlat12);
    u_xlat0.xyz = vec3(u_xlat12) * u_xlat0.xyz;
    vs_TEXCOORD2.xyz = u_xlat0.xyz;
    vs_TEXCOORD2.w = 0.0;
    u_xlat1.x = dot(in_NORMAL0.xyz, hlslcc_mtx4x4unity_WorldToObject[0].xyz);
    u_xlat1.y = dot(in_NORMAL0.xyz, hlslcc_mtx4x4unity_WorldToObject[1].xyz);
    u_xlat1.z = dot(in_NORMAL0.xyz, hlslcc_mtx4x4unity_WorldToObject[2].xyz);
    u_xlat12 = dot(u_xlat1.xyz, u_xlat1.xyz);
    u_xlat12 = inversesqrt(u_xlat12);
    u_xlat1.xyz = vec3(u_xlat12) * u_xlat1.xyz;
    u_xlat16_2.xyz = u_xlat0.yzx * u_xlat1.zxy;
    u_xlat16_2.xyz = u_xlat1.yzx * u_xlat0.zxy + (-u_xlat16_2.xyz);
    u_xlat0.x = in_TANGENT0.w * unity_WorldTransformParams.w;
    u_xlat16_2.xyz = u_xlat0.xxx * u_xlat16_2.xyz;
    vs_TEXCOORD3.xyz = u_xlat16_2.xyz;
    vs_TEXCOORD3.w = 0.0;
    vs_TEXCOORD4.xyz = u_xlat1.xyz;
    vs_TEXCOORD4.w = 0.0;
    u_xlat16_2.x = u_xlat1.y * u_xlat1.y;
    u_xlat16_2.x = u_xlat1.x * u_xlat1.x + (-u_xlat16_2.x);
    u_xlat16_0 = u_xlat1.yzzx * u_xlat1.xyzz;
    u_xlat16_3.x = dot(unity_SHBr, u_xlat16_0);
    u_xlat16_3.y = dot(unity_SHBg, u_xlat16_0);
    u_xlat16_3.z = dot(unity_SHBb, u_xlat16_0);
    vs_TEXCOORD5.xyz = unity_SHC.xyz * u_xlat16_2.xxx + u_xlat16_3.xyz;
    vs_TEXCOORD5.w = 0.0;
    vs_TEXCOORD7 = vec4(0.0, 0.0, 0.0, 0.0);
    return;
}

Now run it through Mali Offline Compiler. If you are not sure how to use it, see my earlier post: Mali Offline Compiler - 官方视频教学 - 笔录 (notes on the official video tutorial)
[Figure 15]


pbr.vert analysis results

In the vertex shader results, both A (arithmetic) and LS (load/store) usage are fairly high.
[Figure 16]


pbr.frag shader

#version 320 es
#ifdef GL_EXT_shader_texture_lod
#extension GL_EXT_shader_texture_lod : enable
#endif

precision highp float;
precision highp int;
#define HLSLCC_ENABLE_UNIFORM_BUFFERS 1
#if HLSLCC_ENABLE_UNIFORM_BUFFERS
#define UNITY_UNIFORM
#else
#define UNITY_UNIFORM uniform
#endif
#define UNITY_SUPPORTS_UNIFORM_LOCATION 1
#if UNITY_SUPPORTS_UNIFORM_LOCATION
#define UNITY_LOCATION(x) layout(location = x)
#define UNITY_BINDING(x) layout(binding = x, std140)
#else
#define UNITY_LOCATION(x)
#define UNITY_BINDING(x) layout(std140)
#endif
uniform 	mediump vec4 _WorldSpaceLightPos0;
uniform 	mediump vec4 unity_SHAr;
uniform 	mediump vec4 unity_SHAg;
uniform 	mediump vec4 unity_SHAb;
uniform 	mediump vec4 unity_SpecCube0_HDR;
uniform 	mediump vec4 _LightColor0;
uniform 	mediump vec4 _Color;
uniform 	mediump float _BumpScale;
uniform 	mediump float _Metallic;
uniform 	float _Glossiness;
uniform 	mediump float _OcclusionStrength;
uniform 	float _MainTex_mipmapBias;
uniform 	float _BumpMap_mipmapBias;
UNITY_LOCATION(0) uniform mediump sampler2D _MRATTex;
UNITY_LOCATION(1) uniform mediump sampler2D _MainTex;
UNITY_LOCATION(2) uniform mediump sampler2D _BumpMap;
UNITY_LOCATION(3) uniform mediump samplerCube unity_SpecCube0;
in highp vec4 vs_TEXCOORD0;
in highp vec4 vs_TEXCOORD1;
in highp vec4 vs_TEXCOORD2;
in highp vec4 vs_TEXCOORD3;
in highp vec4 vs_TEXCOORD4;
in mediump vec4 vs_TEXCOORD5;
layout(location = 0) out mediump vec4 SV_Target0;
vec3 u_xlat0;
vec3 u_xlat1;
mediump vec4 u_xlat16_1;
mediump vec3 u_xlat16_2;
vec4 u_xlat3;
mediump vec3 u_xlat16_3;
mediump vec4 u_xlat16_4;
mediump vec4 u_xlat16_5;
vec3 u_xlat6;
mediump float u_xlat16_7;
mediump vec3 u_xlat16_8;
vec3 u_xlat9;
mediump vec3 u_xlat16_9;
mediump vec3 u_xlat16_13;
mediump float u_xlat16_14;
mediump vec3 u_xlat16_16;
float u_xlat18;
float u_xlat27;
float u_xlat28;
float u_xlat29;
mediump float u_xlat16_32;
void main()
{
    u_xlat0.x = dot(vs_TEXCOORD1.xyz, vs_TEXCOORD1.xyz);
    u_xlat0.x = inversesqrt(u_xlat0.x);
    u_xlat9.xyz = (-vs_TEXCOORD1.xyz) * u_xlat0.xxx + _WorldSpaceLightPos0.xyz;
    u_xlat1.xyz = u_xlat0.xxx * vs_TEXCOORD1.xyz;
    u_xlat0.x = dot(u_xlat9.xyz, u_xlat9.xyz);
    u_xlat0.x = max(u_xlat0.x, 0.00100000005);
    u_xlat0.x = inversesqrt(u_xlat0.x);
    u_xlat0.xyz = u_xlat0.xxx * u_xlat9.xyz;
    u_xlat27 = dot(_WorldSpaceLightPos0.xyz, u_xlat0.xyz);
#ifdef UNITY_ADRENO_ES3
    u_xlat27 = min(max(u_xlat27, 0.0), 1.0);
#else
    u_xlat27 = clamp(u_xlat27, 0.0, 1.0);
#endif
    u_xlat27 = u_xlat27 * u_xlat27;
    u_xlat27 = max(u_xlat27, 0.100000001);
    u_xlat16_2.xyz = texture(_MRATTex, vs_TEXCOORD0.xy).xyz;
    u_xlat28 = (-_Glossiness) * u_xlat16_2.y + 1.0;
    u_xlat29 = u_xlat28 * u_xlat28 + 0.5;
    u_xlat27 = u_xlat27 * u_xlat29;
    u_xlat16_3.xyz = texture(_BumpMap, vs_TEXCOORD0.xy, _BumpMap_mipmapBias).xyz;
    u_xlat16_4.xyz = u_xlat16_3.xyz * vec3(2.0, 2.0, 2.0) + vec3(-1.0, -1.0, -1.0);
    u_xlat16_4.xy = u_xlat16_4.xy * vec2(_BumpScale);
    u_xlat16_5.xyz = u_xlat16_4.yyy * vs_TEXCOORD3.xyz;
    u_xlat16_4.xyw = vs_TEXCOORD2.xyz * u_xlat16_4.xxx + u_xlat16_5.xyz;
    u_xlat16_4.xyz = vs_TEXCOORD4.xyz * u_xlat16_4.zzz + u_xlat16_4.xyw;
    u_xlat29 = dot(u_xlat16_4.xyz, u_xlat16_4.xyz);
    u_xlat29 = inversesqrt(u_xlat29);
    u_xlat3.xyz = vec3(u_xlat29) * u_xlat16_4.xyz;
    u_xlat0.x = dot(u_xlat3.xyz, u_xlat0.xyz);
#ifdef UNITY_ADRENO_ES3
    u_xlat0.x = min(max(u_xlat0.x, 0.0), 1.0);
#else
    u_xlat0.x = clamp(u_xlat0.x, 0.0, 1.0);
#endif
    u_xlat0.x = u_xlat0.x * u_xlat0.x;
    u_xlat0.y = u_xlat28 * u_xlat28;
    u_xlat18 = u_xlat0.y * u_xlat0.y + -1.0;
    u_xlat0.x = u_xlat0.x * u_xlat18 + 1.00001001;
    u_xlat0.xz = u_xlat0.xy * u_xlat0.xy;
    u_xlat0.x = u_xlat0.x * u_xlat27;
    u_xlat0.x = u_xlat0.x * 4.0;
    u_xlat16_4.x = u_xlat28 * u_xlat0.y;
    u_xlat0.x = u_xlat0.z / u_xlat0.x;
    u_xlat0.x = u_xlat0.x + -9.99999975e-05;
    u_xlat0.x = max(u_xlat0.x, 0.0);
    u_xlat0.x = min(u_xlat0.x, 100.0);
    u_xlat16_9.xyz = texture(_MainTex, vs_TEXCOORD0.xy, _MainTex_mipmapBias).xyz;
    u_xlat6.xyz = u_xlat16_9.xyz * _Color.xyz;
    u_xlat16_13.xyz = _Color.xyz * u_xlat16_9.xyz + vec3(-0.0399999991, -0.0399999991, -0.0399999991);
    u_xlat16_5.x = u_xlat16_2.x * _Metallic;
    u_xlat16_14 = (-u_xlat16_5.x) * 0.959999979 + 0.959999979;
    u_xlat16_13.xyz = u_xlat16_5.xxx * u_xlat16_13.xyz + vec3(0.0399999991, 0.0399999991, 0.0399999991);
    u_xlat16_5.xzw = vec3(u_xlat16_14) * u_xlat6.xyz;
    u_xlat16_14 = (-u_xlat16_14) + 1.0;
    u_xlat16_14 = _Glossiness * u_xlat16_2.y + u_xlat16_14;
#ifdef UNITY_ADRENO_ES3
    u_xlat16_14 = min(max(u_xlat16_14, 0.0), 1.0);
#else
    u_xlat16_14 = clamp(u_xlat16_14, 0.0, 1.0);
#endif
    u_xlat16_7 = u_xlat16_2.z + -1.0;
    u_xlat16_7 = _OcclusionStrength * u_xlat16_7 + 1.0;
    u_xlat16_16.xyz = (-u_xlat16_13.xyz) + vec3(u_xlat16_14);
    u_xlat0.xyz = u_xlat0.xxx * u_xlat16_13.xyz + u_xlat16_5.xzw;
    u_xlat0.xyz = u_xlat0.xyz * _LightColor0.xyz;
    u_xlat3.w = 1.0;
    u_xlat16_8.x = dot(unity_SHAr, u_xlat3);
    u_xlat16_8.y = dot(unity_SHAg, u_xlat3);
    u_xlat16_8.z = dot(unity_SHAb, u_xlat3);
    u_xlat16_8.xyz = u_xlat16_8.xyz + vs_TEXCOORD5.xyz;
    u_xlat16_8.xyz = max(u_xlat16_8.xyz, vec3(0.0, 0.0, 0.0));
    u_xlat16_8.xyz = vec3(u_xlat16_7) * u_xlat16_8.xyz;
    u_xlat16_5.xyz = u_xlat16_5.xzw * u_xlat16_8.xyz;
    u_xlat27 = dot(u_xlat3.xyz, _WorldSpaceLightPos0.xyz);
#ifdef UNITY_ADRENO_ES3
    u_xlat27 = min(max(u_xlat27, 0.0), 1.0);
#else
    u_xlat27 = clamp(u_xlat27, 0.0, 1.0);
#endif
    u_xlat0.xyz = u_xlat0.xyz * vec3(u_xlat27) + u_xlat16_5.xyz;
    u_xlat16_5.x = dot(u_xlat1.xyz, u_xlat3.xyz);
    u_xlat16_5.x = u_xlat16_5.x + u_xlat16_5.x;
    u_xlat16_5.xyz = u_xlat3.xyz * (-u_xlat16_5.xxx) + u_xlat1.xyz;
    u_xlat27 = dot(u_xlat3.xyz, (-u_xlat1.xyz));
#ifdef UNITY_ADRENO_ES3
    u_xlat27 = min(max(u_xlat27, 0.0), 1.0);
#else
    u_xlat27 = clamp(u_xlat27, 0.0, 1.0);
#endif
    u_xlat16_32 = (-u_xlat27) + 1.0;
    u_xlat16_32 = u_xlat16_32 * u_xlat16_32;
    u_xlat16_32 = u_xlat16_32 * u_xlat16_32;
    u_xlat16_13.xyz = vec3(u_xlat16_32) * u_xlat16_16.xyz + u_xlat16_13.xyz;
    u_xlat16_16.xy = (-vec2(u_xlat28)) * vec2(0.699999988, 0.0799999982) + vec2(1.70000005, 0.600000024);
    u_xlat16_32 = u_xlat28 * u_xlat16_16.x;
    u_xlat16_4.x = (-u_xlat16_4.x) * u_xlat16_16.y + 1.0;
    u_xlat16_32 = u_xlat16_32 * 6.0;
    u_xlat16_1 = textureLod(unity_SpecCube0, u_xlat16_5.xyz, u_xlat16_32);
    u_xlat16_5.x = u_xlat16_1.w + -1.0;
    u_xlat16_5.x = unity_SpecCube0_HDR.w * u_xlat16_5.x + 1.0;
    u_xlat16_5.x = log2(u_xlat16_5.x);
    u_xlat16_5.x = u_xlat16_5.x * unity_SpecCube0_HDR.y;
    u_xlat16_5.x = exp2(u_xlat16_5.x);
    u_xlat16_5.x = u_xlat16_5.x * unity_SpecCube0_HDR.x;
    u_xlat16_5.xyz = u_xlat16_1.xyz * u_xlat16_5.xxx;
    u_xlat16_5.xyz = vec3(u_xlat16_7) * u_xlat16_5.xyz;
    u_xlat16_5.xyz = u_xlat16_4.xxx * u_xlat16_5.xyz;
    u_xlat0.xyz = u_xlat16_5.xyz * u_xlat16_13.xyz + u_xlat0.xyz;
    SV_Target0.xyz = u_xlat0.xyz;
    SV_Target0.w = 1.0;
    return;
}

[Figure 17]

The heaviest units are A and LS; T (texture) is less of a problem.


Optimizing the shader: replacing Unity's default BRDF3 with the Blinn-Phong lighting model


Analyzing unit usage of blinn_phong.vert and blinn_phong.frag


blinn_phong.vert shader

#version 320 es

#define HLSLCC_ENABLE_UNIFORM_BUFFERS 1
#if HLSLCC_ENABLE_UNIFORM_BUFFERS
#define UNITY_UNIFORM
#else
#define UNITY_UNIFORM uniform
#endif
#define UNITY_SUPPORTS_UNIFORM_LOCATION 1
#if UNITY_SUPPORTS_UNIFORM_LOCATION
#define UNITY_LOCATION(x) layout(location = x)
#define UNITY_BINDING(x) layout(binding = x, std140)
#else
#define UNITY_LOCATION(x)
#define UNITY_BINDING(x) layout(std140)
#endif
uniform 	mediump vec4 unity_SHBr;
uniform 	mediump vec4 unity_SHBg;
uniform 	mediump vec4 unity_SHBb;
uniform 	mediump vec4 unity_SHC;
uniform 	vec4 hlslcc_mtx4x4unity_ObjectToWorld[4];
uniform 	vec4 hlslcc_mtx4x4unity_WorldToObject[4];
uniform 	vec4 unity_WorldTransformParams;
uniform 	vec4 hlslcc_mtx4x4unity_MatrixVP[4];
uniform 	vec4 _MainTex_ST;
in highp vec4 in_POSITION0;
in highp vec4 in_TANGENT0;
in highp vec3 in_NORMAL0;
in highp vec4 in_TEXCOORD0;
out highp vec2 vs_TEXCOORD0;
out highp vec4 vs_TEXCOORD1;
out highp vec4 vs_TEXCOORD2;
out highp vec4 vs_TEXCOORD3;
out highp vec4 vs_TEXCOORD4;
out mediump vec3 vs_TEXCOORD5;
out mediump vec3 vs_TEXCOORD6;
out highp vec4 vs_TEXCOORD7;
out highp vec4 vs_TEXCOORD8;
vec4 u_xlat0;
mediump vec4 u_xlat16_0;
vec4 u_xlat1;
vec4 u_xlat2;
vec3 u_xlat3;
mediump float u_xlat16_4;
mediump vec3 u_xlat16_5;
void main()
{
    u_xlat0 = in_POSITION0.yyyy * hlslcc_mtx4x4unity_ObjectToWorld[1];
    u_xlat0 = hlslcc_mtx4x4unity_ObjectToWorld[0] * in_POSITION0.xxxx + u_xlat0;
    u_xlat0 = hlslcc_mtx4x4unity_ObjectToWorld[2] * in_POSITION0.zzzz + u_xlat0;
    u_xlat1 = u_xlat0 + hlslcc_mtx4x4unity_ObjectToWorld[3];
    u_xlat0.xyz = hlslcc_mtx4x4unity_ObjectToWorld[3].xyz * in_POSITION0.www + u_xlat0.xyz;
    u_xlat2 = u_xlat1.yyyy * hlslcc_mtx4x4unity_MatrixVP[1];
    u_xlat2 = hlslcc_mtx4x4unity_MatrixVP[0] * u_xlat1.xxxx + u_xlat2;
    u_xlat2 = hlslcc_mtx4x4unity_MatrixVP[2] * u_xlat1.zzzz + u_xlat2;
    gl_Position = hlslcc_mtx4x4unity_MatrixVP[3] * u_xlat1.wwww + u_xlat2;
    vs_TEXCOORD0.xy = in_TEXCOORD0.xy * _MainTex_ST.xy + _MainTex_ST.zw;
    vs_TEXCOORD1.w = u_xlat0.x;
    u_xlat1.xyz = in_TANGENT0.yyy * hlslcc_mtx4x4unity_ObjectToWorld[1].yzx;
    u_xlat1.xyz = hlslcc_mtx4x4unity_ObjectToWorld[0].yzx * in_TANGENT0.xxx + u_xlat1.xyz;
    u_xlat1.xyz = hlslcc_mtx4x4unity_ObjectToWorld[2].yzx * in_TANGENT0.zzz + u_xlat1.xyz;
    u_xlat0.x = dot(u_xlat1.xyz, u_xlat1.xyz);
    u_xlat0.x = inversesqrt(u_xlat0.x);
    u_xlat1.xyz = u_xlat0.xxx * u_xlat1.xyz;
    vs_TEXCOORD1.x = u_xlat1.z;
    u_xlat2.x = dot(in_NORMAL0.xyz, hlslcc_mtx4x4unity_WorldToObject[0].xyz);
    u_xlat2.y = dot(in_NORMAL0.xyz, hlslcc_mtx4x4unity_WorldToObject[1].xyz);
    u_xlat2.z = dot(in_NORMAL0.xyz, hlslcc_mtx4x4unity_WorldToObject[2].xyz);
    u_xlat0.x = dot(u_xlat2.xyz, u_xlat2.xyz);
    u_xlat0.x = inversesqrt(u_xlat0.x);
    u_xlat2 = u_xlat0.xxxx * u_xlat2.xyzz;
    u_xlat3.xyz = u_xlat1.xyz * u_xlat2.wxy;
    u_xlat3.xyz = u_xlat2.ywx * u_xlat1.yzx + (-u_xlat3.xyz);
    u_xlat0.x = in_TANGENT0.w * unity_WorldTransformParams.w;
    u_xlat3.xyz = u_xlat0.xxx * u_xlat3.xyz;
    vs_TEXCOORD1.y = u_xlat3.x;
    vs_TEXCOORD1.z = u_xlat2.x;
    vs_TEXCOORD2.x = u_xlat1.x;
    vs_TEXCOORD3.x = u_xlat1.y;
    vs_TEXCOORD2.w = u_xlat0.y;
    vs_TEXCOORD3.w = u_xlat0.z;
    vs_TEXCOORD2.y = u_xlat3.y;
    vs_TEXCOORD3.y = u_xlat3.z;
    vs_TEXCOORD2.z = u_xlat2.y;
    vs_TEXCOORD3.z = u_xlat2.w;
    vs_TEXCOORD4 = vec4(0.0, 0.0, 0.0, 0.0);
    vs_TEXCOORD5.xyz = vec3(0.0, 0.0, 0.0);
    u_xlat16_4 = u_xlat2.y * u_xlat2.y;
    u_xlat16_4 = u_xlat2.x * u_xlat2.x + (-u_xlat16_4);
    u_xlat16_0 = u_xlat2.ywzx * u_xlat2;
    u_xlat16_5.x = dot(unity_SHBr, u_xlat16_0);
    u_xlat16_5.y = dot(unity_SHBg, u_xlat16_0);
    u_xlat16_5.z = dot(unity_SHBb, u_xlat16_0);
    vs_TEXCOORD6.xyz = unity_SHC.xyz * vec3(u_xlat16_4) + u_xlat16_5.xyz;
    vs_TEXCOORD7 = vec4(0.0, 0.0, 0.0, 0.0);
    vs_TEXCOORD8 = vec4(0.0, 0.0, 0.0, 0.0);
    return;
}

[Figure 18]


pbr.vert vs blinn_phong.vert

Below is pbr.vert:
[Figure 19]

Below is blinn_phong.vert:
[Figure 20]

A drops from 10~11 to 9.6~10.0.
LS also drops a little, from 17 to 15.

So the vertex shader gains little from the change.


blinn_phong.frag shader

#version 320 es

precision highp float;
precision highp int;
#define HLSLCC_ENABLE_UNIFORM_BUFFERS 1
#if HLSLCC_ENABLE_UNIFORM_BUFFERS
#define UNITY_UNIFORM
#else
#define UNITY_UNIFORM uniform
#endif
#define UNITY_SUPPORTS_UNIFORM_LOCATION 1
#if UNITY_SUPPORTS_UNIFORM_LOCATION
#define UNITY_LOCATION(x) layout(location = x)
#define UNITY_BINDING(x) layout(binding = x, std140)
#else
#define UNITY_LOCATION(x)
#define UNITY_BINDING(x) layout(std140)
#endif
uniform 	vec3 _WorldSpaceCameraPos;
uniform 	mediump vec4 _WorldSpaceLightPos0;
uniform 	mediump vec4 unity_SHAr;
uniform 	mediump vec4 unity_SHAg;
uniform 	mediump vec4 unity_SHAb;
uniform 	mediump vec4 _LightColor0;
uniform 	mediump float _Metallic;
uniform 	float _Glossiness;
uniform 	mediump vec4 _Color;
uniform 	mediump float _Cutoff;
UNITY_LOCATION(0) uniform mediump sampler2D _MainTex;
UNITY_LOCATION(1) uniform mediump sampler2D _BumpMap;
UNITY_LOCATION(2) uniform mediump sampler2D _MRATTex;
in highp vec2 vs_TEXCOORD0;
in highp vec4 vs_TEXCOORD1;
in highp vec4 vs_TEXCOORD2;
in highp vec4 vs_TEXCOORD3;
in mediump vec3 vs_TEXCOORD6;
layout(location = 0) out mediump vec4 SV_Target0;
mediump vec4 u_xlat16_0;
mediump vec3 u_xlat16_1;
vec4 u_xlat2;
mediump vec3 u_xlat16_2;
bool u_xlatb2;
mediump vec3 u_xlat16_3;
mediump vec3 u_xlat16_4;
float u_xlat5;
mediump vec3 u_xlat16_7;
mediump vec2 u_xlat16_11;
mediump float u_xlat16_13;
mediump float u_xlat16_19;
float u_xlat20;
float u_xlat23;
void main()
{
    u_xlat16_0 = texture(_MainTex, vs_TEXCOORD0.xy);
    u_xlat16_1.x = u_xlat16_0.w * _Color.w + (-_Cutoff);
#ifdef UNITY_ADRENO_ES3
    u_xlatb2 = !!(u_xlat16_1.x<0.0);
#else
    u_xlatb2 = u_xlat16_1.x<0.0;
#endif
    if(u_xlatb2){discard;}
    u_xlat2.x = vs_TEXCOORD1.w;
    u_xlat2.y = vs_TEXCOORD2.w;
    u_xlat2.z = vs_TEXCOORD3.w;
    u_xlat2.xyz = (-u_xlat2.xyz) + _WorldSpaceCameraPos.xyz;
    u_xlat20 = dot(u_xlat2.xyz, u_xlat2.xyz);
    u_xlat20 = inversesqrt(u_xlat20);
    u_xlat16_1.xyz = u_xlat2.xyz * vec3(u_xlat20) + _WorldSpaceLightPos0.xyz;
    u_xlat16_19 = dot(u_xlat16_1.xyz, u_xlat16_1.xyz);
    u_xlat16_19 = inversesqrt(u_xlat16_19);
    u_xlat16_1.xyz = vec3(u_xlat16_19) * u_xlat16_1.xyz;
    u_xlat16_2.xyz = texture(_BumpMap, vs_TEXCOORD0.xy).xyz;
    u_xlat16_3.xyz = u_xlat16_2.xyz * vec3(2.0, 2.0, 2.0) + vec3(-1.0, -1.0, -1.0);
    u_xlat16_4.x = dot(vs_TEXCOORD1.xyz, u_xlat16_3.xyz);
    u_xlat16_4.y = dot(vs_TEXCOORD2.xyz, u_xlat16_3.xyz);
    u_xlat16_4.z = dot(vs_TEXCOORD3.xyz, u_xlat16_3.xyz);
    u_xlat2.x = dot(u_xlat16_4.xyz, u_xlat16_4.xyz);
    u_xlat2.x = inversesqrt(u_xlat2.x);
    u_xlat2.xyz = u_xlat2.xxx * u_xlat16_4.xyz;
    u_xlat16_1.x = dot(u_xlat2.xyz, u_xlat16_1.xyz);
    u_xlat16_1.x = max(u_xlat16_1.x, 0.0);
    u_xlat5 = log2(u_xlat16_1.x);
    u_xlat16_11.xy = texture(_MRATTex, vs_TEXCOORD0.xy).xy;
    u_xlat23 = (-_Glossiness) * u_xlat16_11.y + 1.0;
    u_xlat23 = u_xlat23 * u_xlat23;
    u_xlat16_1.x = u_xlat23 * u_xlat23;
    u_xlat23 = max(u_xlat16_1.x, 9.99999975e-05);
    u_xlat16_1.x = 2.0 / u_xlat23;
    u_xlat16_1.x = u_xlat16_1.x + -2.0;
    u_xlat23 = max(u_xlat16_1.x, 9.99999975e-05);
    u_xlat16_1.x = u_xlat23 * 4.0;
    u_xlat5 = u_xlat5 * u_xlat16_1.x;
    u_xlat5 = exp2(u_xlat5);
    u_xlat16_1.xy = vec2(u_xlat5) * vec2(0.0120000001, 11.9879999);
    u_xlat5 = u_xlat16_11.y * _Glossiness;
    u_xlat16_13 = u_xlat16_11.x * _Metallic;
    u_xlat16_1.x = u_xlat5 * u_xlat16_1.y + u_xlat16_1.x;
    u_xlat16_3.xyz = u_xlat16_0.xyz * _Color.xyz + vec3(-0.0399999991, -0.0399999991, -0.0399999991);
    u_xlat16_0 = u_xlat16_0 * _Color;
    u_xlat16_3.xyz = vec3(u_xlat16_13) * u_xlat16_3.xyz + vec3(0.0399999991, 0.0399999991, 0.0399999991);
    u_xlat16_7.x = (-u_xlat16_13) * 0.959999979 + 0.959999979;
    u_xlat16_7.xyz = u_xlat16_0.xyz * u_xlat16_7.xxx;
    SV_Target0.w = u_xlat16_0.w;
    u_xlat16_3.xyz = u_xlat16_1.xxx * u_xlat16_3.xyz;
    u_xlat16_1.x = dot(u_xlat2.xyz, _WorldSpaceLightPos0.xyz);
    u_xlat16_1.x = max(u_xlat16_1.x, 0.0);
    u_xlat16_3.xyz = u_xlat16_7.xyz * u_xlat16_1.xxx + u_xlat16_3.xyz;
    u_xlat2.w = 1.0;
    u_xlat16_4.x = dot(unity_SHAr, u_xlat2);
    u_xlat16_4.y = dot(unity_SHAg, u_xlat2);
    u_xlat16_4.z = dot(unity_SHAb, u_xlat2);
    u_xlat16_4.xyz = u_xlat16_4.xyz + vs_TEXCOORD6.xyz;
    u_xlat16_4.xyz = max(u_xlat16_4.xyz, vec3(0.0, 0.0, 0.0));
    u_xlat16_1.xyz = u_xlat16_7.xyz * u_xlat16_4.xyz;
    SV_Target0.xyz = u_xlat16_3.xyz * _LightColor0.xyz + u_xlat16_1.xyz;
    return;
}

[Figure 21]


pbr.frag vs blinn_phong.frag

Below is pbr.frag:
[Figure 22]

Below is blinn_phong.frag:
[Figure 23]

A, LS, and T all go down.

The drop is largest in the shortest path cycles, where A, LS, and T are all much lower.

This Blinn-Phong vertex/fragment pair still includes the metallic, glossiness, and AO calculations. Compared with the PBR version, only the BRDF part has been optimized.

If the PBR shader is Shader LOD 300, this Blinn-Phong version can be treated as Shader LOD 200.
As a further step, the BRDF could be cut down to plain diffuse + specular: dot(l,n) * albedo + pow(dot(h,n), glossiness) * spec_col * dot(l,n), which would use even fewer unit cycles.
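
The cut-down diffuse + specular formula above can be sketched as a standalone fragment shader. This is only an illustration, not the shader used in this post. Every uniform and varying name in it (u_lightDir, u_specColor, u_shininess, u_albedoTex, v_uv, v_worldNormal, v_worldViewDir) is hypothetical. It assumes a single directional light and a per-vertex normal with no normal map.

```glsl
#version 320 es
precision mediump float;

// All names below are hypothetical and chosen for illustration only.
uniform mediump vec3 u_lightDir;      // l: normalized world-space direction toward the light
uniform mediump vec3 u_specColor;     // spec_col in the formula
uniform mediump float u_shininess;    // exponent ("glossiness" in the formula), assumed > 0
uniform mediump sampler2D u_albedoTex;

in highp vec2 v_uv;
in mediump vec3 v_worldNormal;        // n, interpolated, so it is renormalized below
in highp vec3 v_worldViewDir;         // surface-to-camera vector, not normalized

layout(location = 0) out mediump vec4 fragColor;

void main()
{
    mediump vec3 n = normalize(v_worldNormal);
    mediump vec3 l = u_lightDir;
    mediump vec3 v = normalize(v_worldViewDir);
    mediump vec3 h = normalize(l + v);            // half vector

    mediump float ndotl = max(dot(n, l), 0.0);
    mediump float ndoth = max(dot(n, h), 0.0);

    mediump vec3 albedo = texture(u_albedoTex, v_uv).rgb;

    // dot(l,n) * albedo
    mediump vec3 diffuse = albedo * ndotl;
    // pow(dot(h,n), glossiness) * spec_col * dot(l,n)
    mediump vec3 specular = u_specColor * pow(ndoth, u_shininess) * ndotl;

    fragColor = vec4(diffuse + specular, 1.0);
}
```

Saved to a .frag file, a variant like this can be run through malioc in the same way as the listings above to see how its A, LS, and T cycles compare.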

Packing register components (filling every component so that fewer registers are needed) would reduce unit usage even further.
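
One way to read "filling every component" is the packing that the blinn_phong.vert listing above already does: the world position is carried in the .w components of vs_TEXCOORD1 to vs_TEXCOORD3 instead of in a separate varying. The vertex shader below is a minimal sketch of that idea. All names in it are hypothetical, it assumes a uniformly scaled object-to-world matrix, and whether it actually lowers the LS count on a given Mali core is something to confirm with malioc.

```glsl
#version 320 es

// All names below are hypothetical and chosen for illustration only.
uniform highp mat4 u_objectToWorld;   // assumed to have uniform scale
uniform highp mat4 u_viewProj;

in highp vec4 a_position;
in highp vec3 a_normal;
in highp vec4 a_tangent;              // w holds the bitangent sign

// Unpacked, this data would need four varyings (tangent, bitangent, normal, worldPos),
// each leaving one component unused. Packed, it fits in three full vec4 varyings.
out highp vec4 v_tbn0;                // xyz = (T.x, B.x, N.x), w = worldPos.x
out highp vec4 v_tbn1;                // xyz = (T.y, B.y, N.y), w = worldPos.y
out highp vec4 v_tbn2;                // xyz = (T.z, B.z, N.z), w = worldPos.z

void main()
{
    highp vec4 worldPos = u_objectToWorld * a_position;

    // With uniform scale, the upper-left 3x3 can rotate normals and tangents directly.
    highp mat3 m = mat3(u_objectToWorld);
    highp vec3 n = normalize(m * a_normal);
    highp vec3 t = normalize(m * a_tangent.xyz);
    highp vec3 b = cross(n, t) * a_tangent.w;

    v_tbn0 = vec4(t.x, b.x, n.x, worldPos.x);
    v_tbn1 = vec4(t.y, b.y, n.y, worldPos.y);
    v_tbn2 = vec4(t.z, b.z, n.z, worldPos.z);

    gl_Position = u_viewProj * worldPos;
}
```

In the fragment shader, a tangent-space normal is then brought to world space with three dot products against the .xyz of these varyings, and the world position is rebuilt from the three .w values, as blinn_phong.frag above does.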


Project

Backup project (not public, since it contains some of my private assets)
AGA_capture_测试PBR_BlinnPhong的unity工程用例.unitypackage


References

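  • Arm Mobile Studio(二)使用Graphics Analyzer(配合unity)分析手机端OpenGL API调用和Shader (Arm Mobile Studio, part 2: using Graphics Analyzer with Unity to analyze OpenGL API calls and shaders on mobile)
  • Mali Offline Compiler - 官方视频教学 - 笔录 (notes on the official video tutorial)
  • 2023/07/12: after finishing this post, I found an official Arm performance-analysis series on Bilibili. I plan to work through all of it later.
    • 实例演示如何使用 Unity Profiler 和 Arm 等分析工具优化移动端游戏性能(一) (Hands-on demo: optimizing mobile game performance with Unity Profiler and Arm analysis tools, part 1)
    • 实例演示如何使用 Unity Profiler 和 Arm 等分析工具优化移动端游戏性能(二) (part 2)
    • 实例演示如何使用 Unity Profiler 和 Arm 等分析工具优化移动端游戏性能(三) (part 3)
    • 实例演示如何使用 Unity Profiler 和 Arm 等分析工具优化移动端游戏性能(四) (part 4)
  • Unity大咖作客 | 知乎大V「放牛的星星」,是这么做性能优化的 (a Unity guest talk on how the Zhihu author 「放牛的星星」 approaches performance optimization). It also covers some optimization topics related to units and registers.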
