Introduction to Web Assembly With C/C++

I've been taking advantage of Web Assembly lately. It is supported by all the major browsers, lets you reuse useful code that was written for other environments, and provides some performance benefits over JavaScript. Web Assembly has a lot of potential and support, and I'd like to introduce other developers to it. I'm going to be using C++ in this post, but it is by no means the only language that can be used with Web Assembly. In this post, I talk about why someone might want to consider Web Assembly and how to get a development environment set up.

What Is Web Assembly?

Web Assembly is a specification for a stack-based virtual machine that runs in the browser. Compared to the highly dynamic JavaScript, Web Assembly can achieve much higher performance for some workloads. Contrary to a popular misconception, though, Web Assembly doesn't completely replace JavaScript; you will probably use the two together. Web Assembly is closely tied to LLVM, a compiler infrastructure (the name originally stood for Low Level Virtual Machine) that compilers can target, and toolchains such as Emscripten are built on it. If someone wanted to make a new programming language, they could have the compiler for their language produce LLVM code and then use an already existing tool chain to compile it to platform-specific code. A person building a compiler for a new language wouldn't need to make completely separate systems for different CPU architectures. Because LLVM-based tool chains can produce Web Assembly, it can run code written in a variety of languages. At the time of writing, there isn't support for garbage collection, which restricts the languages that can target it. C/C++, C#, and Rust are a few languages that can be used with Web Assembly today, with more expected in the future.

What Other Languages Can I Use?

  • C/C++ - I'll be using this language in this article.
  • C#/.NET - I'm interested in this one and will write about it in the future.
  • Elixir
  • Go
  • Java
  • Python
  • Rust - This is a newer language.

Why Use Web Assembly?

I suggest Web Assembly primarily for the performance benefits in computationally expensive operations. The binary format it uses is much stricter than JavaScript, which makes it more suitable for computationally intensive work. There is also a lot of existing, well-tested C/C++ code for tasks such as cryptography or video decoding that one might want to use in a page. Despite all its flexibility, interpreted JavaScript code doesn't run as fast as a native binary. For some types of applications (a word processor, for example), this difference in performance isn't important. For other applications, differences in performance translate into differences in experience.

While the demand for performance is a motivation to make a native binary, there are also security considerations. Native binaries may have access to more system resources than a web-based solution. There may be more concern with ensuring that a program (especially one from a third party) doesn't do anything malicious or access resources without permission. Web Assembly helps bridge the gap between these two needs: it provides a higher-performance execution environment within a sandbox.

C++? Can't I Cause a Buffer Overflow With That?

Sure, but only within the confines of the sandbox in which the code runs. A buffer overflow could crash your program, but it can't cause arbitrary execution of code outside the sandbox. Also note that Web Assembly presently doesn't have any bindings to host APIs. When you target Web Assembly, you don't get an environment that lets you bypass the security restrictions under which JavaScript code runs. There's no direct access to the file system and no access to memory outside of your program, and you are still restricted to communicating through WebSockets and HTTP requests that don't violate CORS restrictions.

How Do I Set Up a Developer Environment?

There are different versions of instructions on the Internet for installing the Web Assembly tools. If you are running Windows 10, you may come across a set of instructions that starts by telling you to install the Windows Subsystem for Linux. Don't use those instructions; I personally think they are unnecessarily complex. While I have the Windows Subsystem for Linux installed and running for other purposes, that's not where I like to compile my Web Assembly code.

Using your operating system of choice (Windows 10/8/7, macOS, Linux), clone the Emscripten git repository, run a few scripts from it, and you are ready to go. Here are the commands to use. If you are on Windows, omit the ./ at the beginning of the commands.

git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
git pull
./emsdk install latest
./emsdk activate latest

With the tools installed, you will also want to set some environment variables. There is a script for doing this. On Windows 10, run:

emsdk_env.bat

For the other operating systems, run:

source emsdk_env.sh

The updates that this script makes to environment variables aren't persistent; it will need to be run again in each new terminal session, including after a reboot. For an editor, I suggest using Visual Studio Code, but feel free to use the editor of your choice. I'll be compiling from the command line in this article.

Web Assembly Explorer

I don't use this tool in this article, but Web Assembly Explorer is available as an online tool for compiling C++ into Web Assembly and is an option if you don't have the tools installed. https://mbebenita.github.io/WasmExplorer/

Hello World

Now that we have the tools installed, we can compile and run something. We will write a hello world program. Type the following source code and save it in hello.cpp.

#include <stdio.h>  // declares printf

int main(int argc, char** argv)
{
    printf("Hello World!\n");
    return 0;
}

To compile the code from the command line, type the following:

emcc hello.cpp -o hello.html

After the compiler runs, you will have three new files:

  • hello.wasm - the compiled version of your program
  • hello.html - an HTML page for hosting your Web Assembly
  • hello.js - JavaScript for loading your Web Assembly into the page

If you try to open the HTML file directly, your code probably will not run. Instead, the page has to be served through an HTTP server. If you have Node.js installed, you can use the http-server package. You can install it with:

npm install http-server -g

Then, start the server from the directory that contains your hello.html:

http-server . -p 81

Here, I've instructed the http-server to run on port 81. You can use the port of your choice here provided nothing else is using it. Remember to substitute the port that you chose throughout the rest of these instructions.

Open a browser and navigate to http://localhost:81/hello.html. You'll see your code run. If you view the source for the page, there is a lot of "noise" in the file. Much of that noise comes from the displayed images being embedded within the HTML. That's fine for playing around, but you will want something customized to your own needs.

We can provide a shell or template file for the compiler to use. Emscripten has a minimal file available at https://github.com/emscripten-core/emscripten/blob/master/src/shell_minimal.html. Download that file. It will be used as our starting point. Having everything in one file is convenient for distribution, but I don't like the CSS and JavaScript being embedded within the file. The CSS here isn't needed, so I'm deleting it. I'm moving the JavaScript to its own file and adding a script reference to it in my HTML. There are several items within the HTML and the script that are not necessarily needed. Let's look at the script first and start making this minimal file even more minimalist.

At the top of the script, there are three variables that reference page elements used to indicate download status and progress. Those are not absolutely necessary, so I'm deleting them along with the code that references them. Lower in the JavaScript is a method named setStatus. I'm replacing its body with a call to console.log() to print the text that is passed to it. The first set of programs that I'm going to write won't use a canvas. The element isn't needed for now; I'm commenting it out instead of deleting it so that I can use it later.

Having deleted the first three lines of this file and the code that references them, I'm returning to the HTML. Most of it is being deleted. I've commented out the canvas reference. There is a line in the HTML file with the text {{{ SCRIPT }}}. The compiler will take this file as a template and replace {{{ SCRIPT }}} with the reference to the script specific to our Web Assembly file.



  
    
    
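[Listing: the trimmed shell_minimal.html template. The markup did not survive extraction; only the page title "Emscripten-Generated Code" and the {{{ SCRIPT }}} placeholder remain.]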



When the Web Assembly program executes a printf(), the text will be written to the textarea element. I place my hello.cpp file among these files and then compile it with the following command.

emcc hello.cpp --shell-file shell_minimal.html -o hello.html

The --shell-file argument indicates which file to use as a template. The -o parameter specifies the name of the HTML file to write. If you look at hello.html, you can see it is almost identical to the input template. Run the site now and you'll see the same result, but with a much cleaner interface.

Binding Functions

I mentioned earlier that Web Assembly doesn't have any bindings to operating system functions. It also doesn't have bindings to the browser, nor does it have access to the DOM. It is up to the page that loads the Web Assembly to expose functions to it. In emscripten.js, the Module object defines a number of functions that are going to be made available to the Web Assembly. When the C/C++ code calls printf, it will be passed through the JavaScript function defined here of the same name. It isn't a requirement that the names be the same, but it is easier to keep track of function associations if they are.

Calling C/C++ From JavaScript

But what if you have your own functions that you wish to bind so that your JavaScript code can call the C++ code? The Module object has a function named ccall that can be used to call C/C++ code from JavaScript and another function named cwrap to build a function object that we can hold onto for repeated calls to the same function. To use these functions, some additional compile flags will be needed.

To demonstrate the use of both of these methods of calling C/C++ code from JavaScript, I'm going to declare three new functions in the C++ code.

  • void testCall() - accepts no parameters and returns no value. This function only prints a string so that we know that our call to it was successful.
  • void printNumber(int num) - accepts an integer argument and prints it. This lets us know that our value was successfully passed.
  • int square(int c) - accepts an integer and returns the square of that integer. This lets us see that a value can be returned from the code.

The C++ language performs what is called name mangling; the names of the functions in the compiled code are different from the names in the source code. For the functions that we want to use from outside the C++ code, we need to wrap the declarations in an extern "C" block. If our code were being written in C instead of C++, this wouldn't be necessary. I still prefer C++ because of some of the features that the language offers. Normally, I would have a declaration such as this in a header file. But for now, my C++ program is in a single file. Close to the top of the program, I make the following declarations:

extern "C" {
    void testCall();
    void printNumber(int f);
    int square(int c);
}

The implementation for the functions is what you would expect.

void testCall() 
{
    printf("function was called!\n");
}

void printNumber(int f) {
    printf("Printing the number %d\n", f);
}

int square(int c)
{
    return c*c;
}
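As a rough sketch, the declarations and implementations above could sit together in one file as shown below. Two things here are my assumptions rather than part of the original code: the EMSCRIPTEN_KEEPALIVE macro from emscripten.h, used so that the compiler does not discard functions that are never called from C++, and the empty fallback #define, which exists only so the same file also builds with an ordinary native compiler. The author may well rely on compile flags instead.

```cpp
#include <stdio.h>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>        // provides EMSCRIPTEN_KEEPALIVE
#else
#define EMSCRIPTEN_KEEPALIVE   // assumption: no-op fallback for native builds
#endif

// extern "C" turns off C++ name mangling, so each function keeps a
// predictable name that code outside the C++ program can look up.
extern "C" {
    void testCall();
    void printNumber(int f);
    int square(int c);
}

// Prints a fixed message so a caller can confirm the call arrived.
EMSCRIPTEN_KEEPALIVE void testCall()
{
    printf("function was called!\n");
}

// Prints the integer it receives, confirming the argument was passed.
EMSCRIPTEN_KEEPALIVE void printNumber(int f)
{
    printf("Printing the number %d\n", f);
}

// Returns the square of its argument, demonstrating a return value.
EMSCRIPTEN_KEEPALIVE int square(int c)
{
    return c * c;
}
```

On the JavaScript side, a call such as Module.ccall('square', 'number', ['number'], [7]) would be expected to reach square, provided ccall and cwrap are exported when compiling. The name of the flag that exports runtime methods has changed between Emscripten releases, so check the documentation for the version you installed.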

There's a change to my main function too. I've had to include a new header file, emscripten.h, because I am about to use one of the macros that it provides. In main, I added the following line.

EM_ASM ( InitWrappers());
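To show where that line might live, here is a minimal sketch that wraps the EM_ASM call in a helper. The helper name notifyPage and the native fallback branch are hypothetical additions of mine, included so the file still compiles outside of Emscripten; the original simply places the EM_ASM line inside main.

```cpp
#include <stdio.h>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>   // provides the EM_ASM macro
#endif

// Hypothetical helper (not in the original article): asks the hosting page
// to run its InitWrappers() function when built with Emscripten.
// Returns 1 if the call was issued, or 0 in a native build where there is
// no JavaScript host to call into.
int notifyPage()
{
#ifdef __EMSCRIPTEN__
    // The JavaScript inside EM_ASM runs in the hosting page, so the page's
    // script must define InitWrappers() before this line executes.
    EM_ASM( InitWrappers(); );
    return 1;
#else
    printf("No JavaScript host; skipping InitWrappers()\n");
    return 0;
#endif
}
```

With this arrangement, main would call notifyPage() once before returning, which has the same effect as writing the EM_ASM line in main directly.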

The EM_ASM line causes a JavaScript function named InitWrappers() to be called. I will talk about how EM_ASM works in a following section. I'm adding a third
