
Memory Ordering in Modern Microprocessors, Part I

Since the 2.0 kernel release, Linux has supported a large number of SMP systems based on a variety of CPUs. Linux has done an excellent job of abstracting differences among these CPUs, even in kernel code. This article is an overview of one important difference: how CPUs allow memory accesses to be reordered in SMP systems.

Memory accesses are among the slowest of a CPU's operations, because Moore's law has increased CPU instruction performance at a much greater rate than it has increased memory performance. As a result, memory operations have become increasingly expensive compared to simple register-to-register instructions. Modern CPUs sport increasingly large caches in order to reduce the overhead of these expensive memory accesses.

These caches can be thought of as simple hardware hash tables with fixed-size buckets and no chaining, as shown in Figure 1. This cache has 16 lines and two ways for a total of 32 entries, each entry containing a single 256-byte cache line, which is a 256-byte-aligned block of memory. This cache line size is a little on the large side, but it makes the hexadecimal arithmetic much simpler. In hardware parlance, this is a two-way set-associative cache. It is analogous to a software hash table with 16 buckets, where each bucket's hash chain is limited to two elements at most. Because this cache is implemented in hardware, the hash function is extremely simple: extract four bits from the memory address.

In Figure 1, each box corresponds to a cache entry that can contain a 256-byte cache line. However, a cache entry can be empty, as indicated by the empty boxes in the figure. The rest of the boxes are flagged with the memory address of the cache line they contain. Because the cache lines must be 256-byte aligned, the low eight bits of each address are zero. The choice of hardware hash function means the next-higher four bits match the line number.

The situation depicted in Figure 1 might arise if the program's code were located at addresses 0x43210E00 through 0x43210EFF, and this program accessed data sequentially from 0x12345000 through 0x12345EFF. Suppose that the program now were to access location 0x12345F00. This location hashes to line 0xF, and both ways of this line are empty, so the corresponding 256-byte line can be accommodated. If the program were to access location 0x1233000, which hashes to line 0x0, the corresponding 256-byte cache line can be accommodated in way 1. However, if the program were to access location 0x1233E00, which hashes to line 0xE, one of the existing lines must be ejected from the cache to make room for the new cache line. This background on hardware caching allows us to look at why CPUs reorder memory accesses.
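The hash function described above can be sketched in a few lines of C. This is only an illustration of the example cache's geometry (256-byte lines, 16 lines); the function and macro names are made up for this sketch and do not come from any real hardware or kernel interface.

```c
#include <stdint.h>

/* Geometry of the example cache in Figure 1 (assumed for illustration). */
#define CACHE_LINE_SHIFT 8   /* 256-byte cache lines: low 8 bits are the offset */
#define CACHE_LINE_COUNT 16  /* 16 lines, that is, 16 hash buckets */

/* The hardware "hash function": the four address bits just above the offset. */
static unsigned cache_line_index(uint32_t addr)
{
    return (addr >> CACHE_LINE_SHIFT) & (CACHE_LINE_COUNT - 1);
}

/* Address of the 256-byte-aligned cache line that contains addr. */
static uint32_t cache_line_base(uint32_t addr)
{
    return addr & ~((UINT32_C(1) << CACHE_LINE_SHIFT) - 1);
}
```

With these definitions, cache_line_index(0x12345F00) is 0xF, cache_line_index(0x1233000) is 0x0 and cache_line_index(0x1233E00) is 0xE, matching the three accesses discussed above.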

Figure 1. CPU Cache Structure for a Cache with 16 Lines and Two Entries Per Line

Why Reorder Memory Accesses?

In a word, performance! CPUs have become so fast that the large multimegabyte caches cannot keep up with them. Therefore, caches often are partitioned into nearly independent banks, as shown in Figure 2. This allows each of the banks to run in parallel, thus keeping up better with the CPU. Memory normally is divided among the cache banks by address. For example, all the even-numbered cache lines might be processed by bank 0 and all of the odd-numbered cache lines by bank 1.

However, this hardware parallelism has a dark side: memory operations now can complete out of order, which can result in some confusion, as illustrated in Figure 3. CPU 0 might write first to location 0x12345000, an even-numbered cache line, and then to location 0x12345100, an odd-numbered cache line. If bank 0 is busy with earlier requests but bank 1 is idle, the first write becomes visible to CPU 1 only after the second write. In other words, the writes are perceived out of order by CPU 1. Reads can be reordered in a similar manner. This reordering can cause many textbook parallel algorithms to fail.
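To make the banking example concrete, here is a toy model in C. It assumes the even/odd split described above and a deliberately simplistic timing rule (a write completes one cycle after its bank's backlog drains). The names and the timing rule are assumptions for illustration, not a description of any real cache.

```c
#include <stdint.h>

#define LINE_SHIFT 8  /* 256-byte cache lines, as in Figure 1 */
#define NUM_BANKS  2  /* even-numbered lines in bank 0, odd-numbered in bank 1 */

/* Select a bank from the low bit of the cache-line number. */
static unsigned cache_bank(uint32_t addr)
{
    return (addr >> LINE_SHIFT) % NUM_BANKS;
}

/*
 * Toy timing model: backlog[b] is the number of cycles of earlier work
 * queued at bank b. A write issued at issue_time completes one cycle
 * after that backlog has drained.
 */
static unsigned completion_time(const unsigned backlog[NUM_BANKS],
                                uint32_t addr, unsigned issue_time)
{
    return issue_time + backlog[cache_bank(addr)] + 1;
}
```

With a backlog of five cycles at bank 0 and none at bank 1, a write to 0x12345000 issued at time 0 completes at time 6, whereas a later write to 0x12345100 issued at time 1 completes at time 2. The second write finishes first, which is the reordering CPU 1 observes.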

Figure 2. Hardware parallelism divides one large cache into multiple banks.

Memory Reordering and SMP Software

A few machines offer sequential consistency, in which all operations happen in the order specified by the code and where all CPUs' views of these operations are consistent with a global ordering of the combined operations. Sequentially consistent systems have some nice properties, but high performance does not tend to be one of them. The need for global ordering severely constrains the hardware's ability to exploit parallelism, and therefore, commodity CPUs and systems do not offer sequential consistency.

On these systems, three orderings must be accounted for:

Program order: the order in which the memory operations are specified in the code running on a given CPU.
Execution order: the order in which the individual memory-reference instructions are executed on a given CPU. The execution order can differ from program order due to both compiler and CPU-implementation optimizations.
Perceived order: the order in which a given CPU perceives its and other CPUs' memory operations. The perceived order can differ from the execution order due to caching, interconnect and memory-system optimizations. Different CPUs might well perceive the same memory operations as occurring in different orders.

Figure 3. CPUs can do things out of order.

Popular memory-consistency models include x86's processor consistency, in which writes from a given CPU are seen in order by all CPUs, and weak consistency, which permits arbitrary reorderings limited only by explicit memory-barrier instructions. For more information on memory-consistency models, see Gharachorloo's exhaustive technical report, listed in the on-line Resources.

Summary of Memory Ordering

When it comes to how memory ordering works on different CPUs, there is good news and bad news. The bad news is each CPU's memory ordering is a bit different. The good news is you can count on a few things:

A given CPU always perceives its own memory operations as occurring in program order. That is, memory-reordering issues arise only when a CPU is observing other CPUs' memory operations.
An operation is reordered with a store only if the operation accesses a different location than does the store.
Aligned simple loads and stores are atomic.
Linux-kernel synchronization primitives contain any needed memory barriers, which is a good reason to use these primitives.

The most important differences are called out in Table 1. More detailed descriptions of specific CPUs' features will be addressed in a later installment. Parenthesized CPU names indicate modes that are allowed architecturally but rarely used in practice. The cells marked with a Y indicate weak memory ordering; the more Ys, the more reordering is possible. In general, it is easier to port SMP code from a CPU with many Ys to a CPU with fewer Ys, though your mileage may vary. However, code that uses standard synchronization primitives—spinlocks, semaphores, RCU—should not need explicit memory barriers, because any required barriers already are present in these primitives. Only tricky code that bypasses these synchronization primitives needs barriers. It is important to note that most atomic operations, for example, atomic_inc() and atomic_add(), do not include any memory barriers.
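The point about atomic operations lacking barriers can be illustrated in user space with C11 atomics. This is an analogy, not kernel code: a relaxed atomic_fetch_add_explicit() plays the role of atomic_inc(), and a sequentially consistent fence stands in for an explicit full memory barrier. The function names are hypothetical.

```c
#include <stdatomic.h>

static atomic_int refcount;  /* zero-initialized, like any static object */

/*
 * Rough analog of a barrier-free atomic increment: the update itself is
 * atomic, but it imposes no ordering on the surrounding loads and stores.
 */
static void counter_inc(atomic_int *c)
{
    atomic_fetch_add_explicit(c, 1, memory_order_relaxed);
}

/*
 * When surrounding accesses must be ordered around the increment, the
 * ordering has to be requested explicitly. Here full fences stand in
 * for explicit memory barriers placed before and after the operation.
 */
static void counter_inc_ordered(atomic_int *c)
{
    atomic_thread_fence(memory_order_seq_cst);
    atomic_fetch_add_explicit(c, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
}
```

Both functions increment the counter atomically; only the second also constrains how neighboring memory accesses may be reordered around it.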

In Table 1, the first four columns indicate whether a given CPU allows the four possible combinations of loads and stores to be reordered. The next two columns indicate whether a given CPU allows loads and stores to be reordered with atomic instructions. With only eight CPUs, we have five different combinations of load-store reorderings and three of the four possible atomic-instruction reorderings.

Table 1. Summary of Memory Ordering

The second-to-last column, dependent reads reordered, requires some explanation, which will be undertaken in the second installment of this series. The short version is Alpha requires memory barriers for readers as well as for updaters of linked data structures. Yes, this does mean that Alpha in effect can fetch the data pointed to before it fetches the pointer itself—strange but true. Please see the “Ask the Wizard” column on the manufacturer's site, listed in Resources, if you think that I am making this up. The benefit of this extremely weak memory model is Alpha can use simpler cache hardware, which in turn permitted higher clock frequencies in Alpha's heyday.

The last column in Table 1 indicates whether a given CPU has an incoherent instruction cache and pipeline. Such CPUs require that special instructions be executed for self-modifying code. In the absence of these instructions, the CPU might execute the old rather than the new version of the code. This might seem unimportant. After all, who writes self-modifying code these days? The answer is that every JIT out there does. Writers of JIT code generators for such CPUs must take special care to flush instruction caches and pipelines before attempting to execute any newly generated code. These CPUs also require that the exec() and page-fault code flush the instruction caches and pipelines before attempting to execute any binaries just read into memory, lest the CPU end up executing the prior contents of the affected pages.

How Linux Copes

One of Linux's great advantages is it runs on a wide variety of different CPUs. Unfortunately, as we have seen, these CPUs sport a wide variety of memory-consistency models. So what is a portable kernel to do?

Linux provides a carefully chosen set of memory-barrier primitives, as follows:

smp_mb(): “memory barrier” that orders both loads and stores. This means loads and stores preceding the memory barrier are committed to memory before any loads and stores following the memory barrier.
smp_rmb(): “read memory barrier” that orders only loads.
smp_wmb(): “write memory barrier” that orders only stores.
smp_read_barrier_depends(): forces subsequent operations that depend on prior operations to be ordered. This primitive is a no-op on all platforms except Alpha.

The smp_mb(), smp_rmb() and smp_wmb() primitives also force the compiler to eschew any optimizations that would have the effect of reordering memory operations across the barriers. The smp_read_barrier_depends() primitive must do the same, but only on Alpha CPUs.
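The kernel primitives cannot be run outside the kernel, but the usual pairing of a write barrier on the producer side with a read barrier on the consumer side can be sketched in user space with C11 fences. This is an approximation under stated assumptions: a release fence is used where kernel code would use smp_wmb(), and an acquire fence where it would use smp_rmb(). The C11 fences are somewhat stronger than their kernel counterparts, and the names publish() and try_consume() are invented for this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* ordinary data protected by the flag below */
static atomic_bool ready;  /* set once payload is valid */

/* Producer: write the data, then the flag, with a barrier in between. */
static void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);  /* stand-in for smp_wmb() */
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer: read the flag, then the data, with a barrier in between. */
static bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;                           /* nothing published yet */
    atomic_thread_fence(memory_order_acquire);  /* stand-in for smp_rmb() */
    *out = payload;
    return true;
}
```

Without the two fences, a weakly ordered CPU or an optimizing compiler could let the consumer see ready set while still reading a stale payload. With them, a consumer that sees the flag is guaranteed to see the data written before it.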

These primitives generate code only in SMP kernels; however, each also has a UP version (mb(), rmb(), wmb() and read_barrier_depends(), respectively) that generates a memory barrier even in UP kernels. The smp_ versions should be used in most cases. However, these latter primitives are useful when writing drivers, because memory-mapped I/O accesses must remain ordered even in UP kernels. In the absence of memory-barrier instructions, both CPUs and compilers happily would rearrange these accesses. At best, this would make the device act strangely; at worst, it would crash your kernel or, in some cases, even damage your hardware.
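The driver case can be sketched as follows. Everything here is hypothetical: the register layout, the struct and function names, and the use of ordinary memory in place of a real memory-mapped device. A C11 sequentially consistent fence stands in for the kernel's wmb(); a real driver would use the kernel's own barrier and I/O accessor functions instead.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical register block of an imaginary DMA-capable device. */
struct fake_dev_regs {
    volatile uint32_t dma_addr;  /* where the device should read from */
    volatile uint32_t doorbell;  /* writing 1 tells the device to start */
};

/*
 * The device must see the DMA address before it sees the doorbell.
 * The fence is a user-space stand-in for wmb(): it keeps both the
 * compiler and the CPU from reordering the two register writes.
 */
static void start_dma(struct fake_dev_regs *regs, uint32_t addr)
{
    regs->dma_addr = addr;
    atomic_thread_fence(memory_order_seq_cst);  /* stand-in for wmb() */
    regs->doorbell = 1;
}
```

If the two stores were reordered, the imaginary device could start a transfer from whatever address happened to be in dma_addr beforehand, which is the kind of strange behavior the paragraph above warns about.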

So most kernel programmers need not worry about the memory-barrier peculiarities of each and every CPU, as long as they stick to these memory-barrier interfaces. If you are working deep in a given CPU's architecture-specific code, of course, all bets are off.

But it gets better. All of Linux's locking primitives, including spinlocks, reader-writer locks, semaphores and read-copy update (RCU), include any needed barrier primitives. So if you are working with code that uses these primitives, you don't even need to worry about Linux's memory-ordering primitives. That said, deep knowledge of each CPU's memory-consistency model can be helpful when debugging, to say nothing of writing architecture-specific code or synchronization primitives.

Besides, they say a little knowledge is a dangerous thing. Just imagine the damage you could do with a lot of knowledge! For those who want to understand more about individual CPUs' memory consistency models, the next installment will describe those of the most popular and prominent CPUs.

Conclusions

As noted earlier, the good news is Linux's memory-ordering primitives and synchronization primitives make it unnecessary for most Linux kernel hackers to worry about memory barriers. This is especially good news given the large number of CPUs and systems that Linux supports and the resulting wide variety of memory-consistency models. However, there are times when knowing about memory barriers can be helpful, and I hope that this article has served as a good introduction to them.

Acknowledgements

I owe thanks to many CPU architects for patiently explaining the instruction- and memory-reordering features of their CPUs, particularly Wayne Cardoza, Ed Silha, Anton Blanchard, Tim Slegel, Juergen Probst, Ingo Adlung and Ravi Arimilli. Wayne deserves special thanks for his patience in explaining Alpha's reordering of dependent loads, a lesson that I resisted learning quite strenuously!

Legal Statement

This work represents the view of the author and does not necessarily represent the view of IBM. IBM, zSeries and Power PC are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. Linux is a registered trademark of Linus Torvalds. i386 is a trademark of Intel Corporation or its subsidiaries in the United States, other countries, or both. Other company, product, and service names may be trademarks or service marks of such companies. Copyright (c) 2005 by IBM Corporation.

Resources for this article: /article/8331.

Paul E. McKenney is a Distinguished Engineer with IBM's Linux Technology Center. He has worked on NUMA and SMP algorithms and, in particular, RCU for longer than he cares to admit. In his spare time, he jogs and supports the usual house-wife-and-kids habit.

--------------------------------------------------------------------------------------------------------------------------------------------
original url: http://www.linuxjournal.com/article/8211
