http://neilkemp.us/src/sse_tutorial/sse_tutorial.html
Intel SSE Tutorial : An Introduction to the SSE Instruction Set
Table of Contents
Introduction, Prerequisites, and Summary
I am writing this tutorial on SSE (Streaming SIMD Extensions) for three reasons. First, there is a lack of well-organized tutorials on the subject; second, it's an educational process I'm personally undertaking to better familiarize myself with low-level optimization techniques; and finally, what better way to have fun than to mess around with a bunch of registers!?
There are some things you should already be familiar with in order to benefit from this tutorial. For starters you should have a general understanding of computer organization, the Intel x86 architecture and its assembly language, and a solid understanding of C++. These tutorials are mostly concerned with SSE optimizations of common graphics operations, so a good understanding of 3D math will also be useful.
Being a 3D graphics programmer, I am always interested in ways to make my applications run faster. Common sense holds that the fewer CPU operations executed between frames, the more frames my application can draw per second. 3D math is one area that benefits greatly from SSE optimization, since matrix transformations and vector calculations (e.g., dot and cross products) eat up valuable CPU cycles. The focus of this tutorial will therefore be on building optimized 3D math functions to use with graphics applications, introducing the instruction set by solving common vector operations. So without further ado, let's have a look at the fun-packed world of SSE.
First, what the heck is SSE? Basically, SSE is a collection of 128-bit CPU registers. Each register can be packed with four 32-bit scalars, after which an operation can be performed on all four elements simultaneously. In contrast, it may take four or more instructions in regular x86 assembly to do the same thing. Below you can see two vectors (SSE registers) packed with scalars. Multiplying them with MULPS performs four multiplications and stores the result with a single instruction. The benefits of using SSE are too great to ignore.
Data Movement Instructions:
MOVUPS and MOVAPS move four packed single-precision floats between memory and an XMM register; MOVUPS works on unaligned data, while MOVAPS requires 16-byte-aligned data.
Logical Instructions:
Each logical instruction operates bitwise across the full 128-bit register. The truth table below shows the result of each instruction for a single pair of bits (ANDNPS inverts the destination bit before ANDing it with the source):
Destination | Source | ANDPS | ANDNPS | ORPS | XORPS
     1      |   1    |   1   |   0    |  1   |   0
     1      |   0    |   0   |   0    |  1   |   1
     0      |   1    |   0   |   1    |  1   |   1
     0      |   0    |   0   |   0    |  0   |   0
Other instructions that are not covered here include data conversion between x86 and MMX registers, cache control instructions, and state management instructions. To learn more details about any of these instructions you can follow one of the links provided at the bottom of this page.
Now that we are more familiar with the instruction palette, let's take a look at our first example. In this example we will add two 4-element vectors using a C++ function and inline assembly. I'll start by showing you the source and then explain each step in detail.
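A minimal sketch of such a function follows. The tutorial targets MSVC's `__asm` syntax; this equivalent uses GCC/Clang extended asm so it also builds for x86-64, and the `Vec4` and `AddVectors` names are my own:

```cpp
#include <cassert>

// A plain 4-float vector type; nothing here guarantees 16-byte
// alignment, hence the unaligned MOVUPS instructions below.
struct Vec4 { float x, y, z, w; };

// Adds two Vec4s with SSE. The "a"/"b" constraints pin the vector
// pointers to EAX/EBX (RAX/RBX in 64-bit builds), mirroring the
// steps described in the text.
Vec4 AddVectors(const Vec4 &a, const Vec4 &b)
{
    Vec4 r;
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"   // unaligned load of a into XMM0
        "movups (%2), %%xmm1\n\t"   // unaligned load of b into XMM1
        "addps  %%xmm1, %%xmm0\n\t" // four single-precision adds at once
        "movups %%xmm0, (%0)\n\t"   // unaligned store of the result
        :
        : "r"(&r), "a"(&a), "b"(&b)
        : "xmm0", "xmm1", "memory");
    return r;
}
```

Calling `AddVectors` on (1, 2, 3, 4) and (5, 6, 7, 8) yields (6, 8, 10, 12).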
Because we pass references to the function rather than copies, we first need to move the vector pointers into the 32-bit registers EAX and EBX. Using MOVUPS, which handles unaligned data, we load the values those pointers refer to into the SIMD registers XMM0 and XMM1. Next we use ADDPS to add the two register operands and store the resulting vector in XMM0. The final step is to move the contents of XMM0 into a vector structure before returning it; once again we use MOVUPS to do so.
So that doesn't look so hard. We have an unaligned 4-element vector to send to and get back from the addition function. I said that the vector is unaligned, but what does that mean? To be more specific, the vector is not guaranteed to be a 16-byte-aligned struct. Huh? The SSE registers are each 128 bits, or 16 bytes, wide. Data is said to be 16-byte aligned when it starts on a 16-byte boundary in memory. In this case we are doing nothing special to guarantee that alignment. In later examples we will see how to manually align data. Some compilers, such as the Intel compiler, will automatically align data during compilation. It should be noted that operating on aligned data is much faster than on unaligned data.
Shuffling is an easy way to reorder the elements of a single register or combine the elements of two separate registers. The SHUFPS instruction takes two SSE registers and an 8-bit immediate value. The first two elements of the destination register are overwritten by any two elements of the destination itself, while the third and fourth elements are overwritten by any two elements of the source register. The immediate value tells the instruction which elements to select: it is split into four 2-bit fields, and the binary values 00, 01, 10, and 11 index the four elements of a register.
Rather than redefine what an intrinsic is, I'll just quote from the MSDN.
"An intrinsic is a function known by the compiler that directly maps to a sequence of one or more assembly language instructions. Intrinsic functions are inherently more efficient than called functions because no calling linkage is required.
Intrinsics make the use of processor-specific enhancements easier because they provide a C/C++ language interface to assembly instructions. In doing so, the compiler manages things that the user would normally have to be concerned with, such as register names, register allocations, and memory locations of data." - MSDN
In short, we no longer need to use inline assembly to use SSE from our C++ applications, and we no longer need to worry about register names. Working with intrinsics sits somewhere between assembly programming and high-level programming. Let's take a look at how to use them with what we already know.
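As a sketch, here is the earlier addition function rewritten with intrinsics; `_mm_loadu_ps` and `_mm_storeu_ps` correspond to MOVUPS and `_mm_add_ps` to ADDPS, and the `Vec4` and `AddVectors` names are again my own:

```cpp
#include <xmmintrin.h>  // SSE intrinsics
#include <cassert>

struct Vec4 { float x, y, z, w; };

// The same vector addition as before, written with intrinsics.
// The compiler handles register allocation; no __asm block needed.
Vec4 AddVectors(const Vec4 &a, const Vec4 &b)
{
    Vec4 r;
    __m128 ma  = _mm_loadu_ps(&a.x);  // MOVUPS: unaligned load
    __m128 mb  = _mm_loadu_ps(&b.x);  // MOVUPS: unaligned load
    __m128 sum = _mm_add_ps(ma, mb);  // ADDPS: packed add
    _mm_storeu_ps(&r.x, sum);         // MOVUPS: unaligned store
    return r;
}
```

Note that this version compiles with both the x86 and x64 compilers, since no inline assembly is involved.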
The source code and HTML for this tutorial can be downloaded here. It demonstrates vector operations using SSE and inline assembly.
Because the examples use inline assembly, the source will not compile with the x64 Visual C++ compiler, which does not support __asm blocks.