薰衣草之子

How Twitter Uses Redis To Scale - 105TB RAM, 39MM QPS, 10,000+ Instances

Yao Yu has worked on Twitter’s Cache team since 2010. She recently gave a really great talk: Scaling Redis at Twitter. It’s about Redis of course, but it's not just about Redis.

Yao has worked at Twitter for a few years. She's seen some things. She’s watched the growth of the cache service at Twitter explode from it being used by just one project to nearly a hundred projects using it. That's many thousands of machines, many clusters, and many terabytes of RAM.

It's clear from her talk that she's coming from a place of real personal experience, and that shines through in the practical way she explores issues. It's a talk well worth watching.

As you might expect, Twitter has a lot of cache.

Timeline Service for one datacenter using Hybrid List:

~40TB allocated heap
~30MM qps
> 6,000 instances

Use of BTree in one datacenter:

~65TB allocated heap
~9MM qps
>4,000 instances

You'll learn more about BTree and Hybrid List later in the post.

A couple of points stood out:

Redis is a brilliant idea because it takes underutilized resources on servers and turns them into valuable service.
Twitter specialized Redis with two new data types that fit their use cases perfectly. So they got the performance they needed, but it locked them into an older code base and made it hard to merge in new features. I have to wonder, why use Redis for this sort of thing? Just create a timeline service using your own data structures. Does Redis really add anything to the party?
Summarize large chunks of log data on the node, using your local CPU power, before saturating the network.
If you want something that’s high performance, separate the fast path, which is the data path, from the slow path, which is the command and control path.
Twitter is moving towards a container environment with Mesos as the job scheduler. This is still a new approach so it's interesting to hear about how it works. One issue is the Mesos wastage problem that stems from the requirement to specify hard resource usage limits in a complicated runtime world.
A central cluster manager is really important to keep a cluster in a state that’s easy to understand.
The JVM is slow and C is fast. Their cache proxy layer is moving back to C/C++.

With that in mind, let's learn more about how Redis is used at Twitter:

Why Redis?

Redis drives Timeline, Twitter’s most important service. Timeline is an index of tweets indexed by an id. Chaining tweets together in a list produces the Home Timeline. The User Timeline, which consists of tweets the user has tweeted, is just another list.
Why consider Redis instead of Memcache? The Network Bandwidth Problem and The Long Common Prefix Problem.
The Network Bandwidth Problem.
- Memcache didn’t work as well as Redis for the timeline. The problem was dealing with fanout.
- Twitter reads and writes happen incrementally and they are fairly small, but the timelines themselves are fairly large.
- When a tweet is generated it needs to be written to all relevant timelines. The tweet is a small piece of data that is attached to some data structure. On read it’s desirable to load a small batch of tweets. On a scroll down another batch is loaded.
- The Home Timeline can be largish, bounded by what is reasonable for a viewer to read in one sitting: maybe 3,000 entries, for example. For performance reasons, accessing the database should be avoided.
- A read-modify-write cycle for incremental writes, and small reads, on large objects (the timeline), is too expensive and creates a network bottleneck.
- On a gigabit link at 100K+ reads and writes per second, if the average object size is more than 1K, the network becomes the bottleneck.
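The bottleneck claim above can be checked with a little arithmetic. This is a back-of-envelope sketch only: the 8-byte entry size is an assumption for illustration, and protocol overhead is ignored.

```python
# Back-of-envelope check of the network bottleneck claim.
LINK_BITS_PER_SEC = 1_000_000_000   # 1 Gbit/s link, protocol overhead ignored
OPS_PER_SEC = 100_000               # reads + writes per second, from the talk


def link_utilization(avg_object_bytes: int) -> float:
    """Fraction of the link used if every operation moves one whole object."""
    bits_per_sec = OPS_PER_SEC * avg_object_bytes * 8
    return bits_per_sec / LINK_BITS_PER_SEC


full_object = link_utilization(1024)   # whole 1 KB object per operation
small_entry = link_utilization(8)      # assumed 8-byte entry per operation

print(f"1 KB objects: {full_object:.0%} of the link")
print(f"8 B entries:  {small_entry:.1%} of the link")
```

Moving a whole 1 KB object on every operation uses roughly four fifths of the link before any overhead, while moving only the small increment barely registers. That is the argument for a data structure server that can append to a timeline in place.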
The Long Common Prefix Problem (really two problems)
- A flexible schema approach is used for data formats. An object has certain attributes that may or may not exist. A separate key can be created for each individual attribute. This requires sending out a separate request for each individual attribute and not all attributes may be in the cache.
- Metrics that are observed over time have the same name, with each sample having a different timestamp. If each sample is stored individually, the long common prefix is stored many, many times.
- To be more space efficient in both scenarios, for metrics and a flexible schema, it is desirable to have a hierarchical key space.
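To make the space argument concrete, here is a minimal sketch that compares key bytes in a flat key space with a hierarchical one. The metric name and timestamps are hypothetical, and only key bytes are counted, not the overhead of any real data structure.

```python
# Hypothetical metric name and timestamps, for illustration only.
PREFIX = "cache.timeline.dc1.host42.requests_per_second"
timestamps = [1400000000 + 60 * i for i in range(1000)]

# Flat key space: every sample repeats the full prefix in its key.
flat_keys = [f"{PREFIX}:{ts}" for ts in timestamps]
flat_key_bytes = sum(len(k) for k in flat_keys)

# Hierarchical key space: the prefix is stored once and the samples
# hang under it, keyed only by their timestamp.
hier_key_bytes = len(PREFIX) + sum(len(str(ts)) for ts in timestamps)

print(flat_key_bytes, hier_key_bytes)
```

The saving grows with both the length of the shared prefix and the number of samples stored under it.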
A dedicated caching cluster under utilizes CPUs. For simple cases, in-memory key-value stores are CPU light. 1% of CPU time on a box can handle more than 1K requests per second for small key values. Though for different data structures the result can be different.
Redis is a brilliant idea. It sees what the server can do, but is not doing. For simple key-value stores, there’s a lot of CPU headroom on the server side for a service like Redis.
Redis was first used within Twitter in 2010 for the Timeline service. It is also used in the Ads service.
The on disk features of Redis are not used. Partly this is because inside Twitter the Cache and Storage services are in different teams so they use whatever mechanisms they think best. Partly this may be because the Storage team thinks another service fits their goals better than Redis.
Twitter forked Redis 2.4 and added some features to it, so they are stuck at 2.4 (2.8.14 is the latest stable version at the time of writing). The changes were: two data structure features within Redis; in-house cluster management features; in-house logging and data insight.
Hotkeys are a problem, so they are building a tiered caching solution with client side caching that will automatically cache hotkeys.

Hybrid List

Added Hybrid List to Redis for more predictable memory performance.
Timeline is a list of Tweet IDs, so it’s a list of integers. Each ID is small.
Redis supports two list types: ziplist and linked list. Ziplist is space efficient. Linked list is flexible, but as a doubly linked list it has the overhead of two pointers per entry, which given the size of an ID is very high overhead.
To use memory efficiently ziplists are used exclusively.
A Redis ziplist threshold is set to the max size of a Timeline. Never store a bigger Timeline than can be stored in a ziplist. This means a product decision, how many tweets can be in a Timeline, is linked to a low level component (Redis). Generally not desirable.
Adding to and deleting from a ziplist is inefficient, especially with a very large list. Deleting from a ziplist uses memmove to move data around, to make sure the list is still contiguous. Adding to a ziplist requires a memory realloc call to make enough space for the new entry.
Potential high latency for write operations due to Timeline size. Timelines vary a lot in size. Most users don’t tweet very much, so their User Timeline is small. Home Timelines, especially those involving celebrities, can be huge. When updating a large timeline and the cache runs out of heap, which is often the case when using a cache, a very large number of very small timelines will be evicted before there’s enough contiguous RAM to handle one big ziplist. As all this cache management takes time, a write operation can have a high latency.
Since writes are fanned out to a lot of timelines there’s a higher chance to be caught in a write latency trap as memory is used for expanding the timelines.
It’s hard to create a SLA for write operations given the high variability of write latencies.
Hybrid List is a linked list of ziplists. A threshold is set for how big each ziplist can be in bytes. In bytes because, to be memory efficient, it helps to allocate and deallocate blocks of the same size. When a list goes over the threshold it spills into the next ziplist. A ziplist is not recycled until the list is empty, which means it is possible, through deletion, to have each ziplist hold only one entry. In practice, tweets aren’t deleted all that often.
Before Hybrid List a workaround was to expire larger timelines more quickly, which freed up memory for other timelines, but was expensive when a user went to view their timeline.
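The Hybrid List idea can be sketched as a toy model. This is not Twitter's implementation: Python lists stand in for ziplists, the threshold is an entry count rather than bytes, and releasing a chunk as soon as it is empty is one reading of the recycling rule described above.

```python
class HybridList:
    """Toy model: a chain of bounded chunks standing in for ziplists."""

    def __init__(self, max_entries_per_chunk: int = 4):
        # The talk describes a byte threshold; an entry count is used
        # here only to keep the sketch short.
        self.max_entries = max_entries_per_chunk
        self.chunks = []  # stands in for the linked list of ziplists

    def push(self, tweet_id: int) -> None:
        # Spill into a new chunk when the tail chunk is full, so a write
        # never has to grow one large contiguous block.
        if not self.chunks or len(self.chunks[-1]) >= self.max_entries:
            self.chunks.append([])
        self.chunks[-1].append(tweet_id)

    def remove(self, tweet_id: int) -> bool:
        for index, chunk in enumerate(self.chunks):
            if tweet_id in chunk:
                chunk.remove(tweet_id)
                # A chunk is released only once it is completely empty,
                # so deletions can leave sparsely filled chunks behind.
                if not chunk:
                    del self.chunks[index]
                return True
        return False

    def __iter__(self):
        for chunk in self.chunks:
            yield from chunk


timeline = HybridList(max_entries_per_chunk=3)
for tweet_id in range(1, 8):
    timeline.push(tweet_id)

timeline.remove(2)
timeline.remove(3)
print(timeline.chunks)   # the first chunk now holds a single entry
```

Because every chunk has the same bounded size, a write to a huge timeline costs about the same as a write to a small one, which is the predictability the section is after.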

BTree

Added BTree to Redis to support range queries on hierarchical keys to return a list of results.
In Redis the way to deal with secondary keys or fields is a hash map. To have sorted data in order to perform a range query a sorted set is used. Sorted set orders by a score which is a double, so an arbitrary secondary key or an arbitrary name can’t be used for the sorting. Since hash map uses a linear search it’s not great if there are a lot of secondary keys or fields.
BTree is an attempt to fix the shortcomings of hash map and sorted set. It’s better to just have one data structure that does what you want. It’s easier to understand and reason about.
Borrowed the BSD implementation of BTree and added it to Redis to create a BTree. Supports key lookup as well as range query. Has good lookup performance. The code is relatively simple. The downside is BTree is not memory efficient. It has a lot of meta data overhead due to the pointers.
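The access pattern BTree supports, key lookup plus range query over arbitrary sorted field names, can be illustrated with a small stand-in. This sketch uses a sorted Python list with `bisect` rather than an actual B-tree, and the class and method names are hypothetical, not Twitter's commands.

```python
import bisect


class SortedFields:
    """Sorted field store standing in for a B-tree under one top-level key."""

    def __init__(self):
        self._fields = []   # field names, kept in sorted order
        self._values = {}   # field name -> value

    def set(self, field: str, value) -> None:
        if field not in self._values:
            bisect.insort(self._fields, field)
        self._values[field] = value

    def get(self, field: str):
        return self._values.get(field)

    def range(self, start: str, end: str):
        """Return (field, value) pairs with start <= field < end, in order."""
        lo = bisect.bisect_left(self._fields, start)
        hi = bisect.bisect_left(self._fields, end)
        return [(f, self._values[f]) for f in self._fields[lo:hi]]


samples = SortedFields()
samples.set("2014-09-01T00:02", 120)
samples.set("2014-09-01T00:00", 100)
samples.set("2014-09-01T00:01", 110)
samples.set("2014-09-01T00:03", 130)

print(samples.range("2014-09-01T00:01", "2014-09-01T00:03"))
```

Unlike a sorted set, the ordering key here is an arbitrary string rather than a double score, which is the limitation the section describes.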

Cluster Management

A cluster is using more than one instance of Redis for a single purpose. If a data set is larger than a single Redis instance can handle or throughput is higher than what a single instance can handle, the key space will need to be partitioned so the data can be stored in more than one shard, across a set of instances. Routing is taking a key and figuring out which shard the data for the key is on.
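Routing can be sketched in a few lines. The talk does not say which partitioning function Twitter uses, so hashing the key and taking it modulo the shard count is an assumption made purely for illustration.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical key names.
for key in ("timeline:12345", "timeline:67890"):
    print(key, "-> shard", shard_for(key, 16))
```

A stable hash such as MD5 is used instead of Python's built-in `hash()`, which is randomized per process for strings, because every router has to agree on where a key lives.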
Yao thinks cluster management is the number one reason Redis adoption hasn’t exploded. When a cluster is available there’s no reason not to migrate all cache use cases to Redis.
Tricky to get Redis cluster right. People use Redis because as a data structure server the idea is to perform frequent updates. But a lot of Redis operations are not idempotent. If there’s a network glitch a retry is required and the data can be corrupted.
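The retry hazard can be shown without a real server. In this sketch a plain Python list and set stand in for a Redis list and set, and `deliver` is a hypothetical helper that replays a write the way a client would after losing the first response.

```python
def deliver(apply, attempts: int) -> None:
    """Apply the same write `attempts` times, as a client retry would when
    the first response is lost but the write actually succeeded."""
    for _ in range(attempts):
        apply()


timeline = []   # list append is not idempotent
deliver(lambda: timeline.append(1001), attempts=2)

members = set()  # set add is idempotent
deliver(lambda: members.add(1001), attempts=2)

print(timeline)  # [1001, 1001]
print(members)   # {1001}
```

The retried append leaves a duplicate entry that looks like valid data, which is why this kind of corruption is hard to detect after the fact.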
Redis cluster favors having a centralized manager dictating the global view. With memcache a lot of clusters use a client side approach based on consistent hashing. If there’s inconsistent data, so be it. To provide really good services, a cluster needs features like detecting which shard is down and then replaying operations to get back in sync. After a long enough period spent down, cache state should be cleaned up. Corrupted data in Redis is hard to detect. When there’s a list and it’s missing a chunk, it’s hard to tell.
Twitter has made multiple attempts at building a Redis cluster. Twemproxy, which is not used by Twitter internally, was built for Twemcache, and Redis support was added later. Two more solutions were based on proxy style routing. One was associated with the Timeline service and not meant to be general. The second was a generalization of the Timeline solution that provided cluster management, replication, and shard repairing.
Three options in a cluster: servers talk to each other to reach agreement of what a cluster looks like; use a proxy; or do client side cluster management where the clients form a quorum.
Didn’t go with a server approach because the philosophy is to keep servers simple, dumb and fast.
Didn’t go with the client because changes are hard to propagate. Approximately 100 projects in Twitter use a cache cluster. Changing anything in the client would have to be pushed to 100 clients, and it could take years for changes to propagate. The need for quick iteration means it’s almost impossible to put code in the client.
Went with a proxy style routing approach and partitioning for two reasons. First, a cache service is a high performance service. If you want something that’s high performance, separate the fast path, which is the data path, from the slow path, which is the command and control path. Second, if cluster management is merged into the server, it complicates the code for Redis, which is a stateful service. Any time you want to fix a bug or upgrade the cluster management code, the stateful Redis service must be restarted too, which potentially throws away a lot of data. A rolling restart of a cluster is painful.
There was a concern with the proxy approach that another network hop is inserted between the client and the server. Profiling showed that the cost of the extra hop is a myth, at least in their ecosystem. Latency through the Redis server was less than 0.5 milliseconds. At Twitter most of the backend services are Java based and use Finagle to talk to each other. When going through the Finagle path the latency was close to 10 milliseconds. So the extra hop isn’t the problem. Inside the JVM is the problem. Outside the JVM you can do pretty much whatever you want, unless of course you go through another JVM.
Failure of a proxy doesn’t matter much. On the data path introducing a proxy layer isn’t so bad. The client doesn’t care which proxy they talk to. If a proxy fails after a timeout the client goes to another proxy. No sharding is happening at the proxy level, they are all stateless. To scale throughput simply add more proxies. The tradeoff is additional cost. The proxy layer is allocated resources just to do the forwarding. Cluster management, sharding, and doing the view of the cluster happens outside the proxies. The proxies don’t have to agree with each other.
Twitter has instances that have 100K open connections and it works fine. There’s just overhead to pay. There’s no reason to close connections. Just keep them open, it improves latency.
Cache clusters are used as a look-aside cache. The caches themselves are not responsible for data replenishment. The client is responsible for fetching a missing key from storage then caching it. If a node goes down the shard is moved to another node. The failed machine is flushed when it comes back so no data is left around. All this is done by the cluster leader. A central viewpoint is really important to keep a cluster in a state that’s easy to understand.
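The look-aside pattern described above can be sketched as follows. The class and function names are hypothetical, and plain dicts stand in for both the cache cluster and the storage service.

```python
from typing import Callable, Dict, Optional


class LookAsideClient:
    """The client, not the cache, replenishes missing keys from storage."""

    def __init__(self, cache: Dict[str, str],
                 load_from_storage: Callable[[str], Optional[str]]):
        self.cache = cache
        self.load_from_storage = load_from_storage

    def get(self, key: str) -> Optional[str]:
        value = self.cache.get(key)
        if value is not None:
            return value                      # cache hit
        value = self.load_from_storage(key)   # miss: go to storage
        if value is not None:
            self.cache[key] = value           # then populate the cache
        return value


storage = {"user:1": "alice"}
storage_reads = []


def load(key: str) -> Optional[str]:
    storage_reads.append(key)
    return storage.get(key)


cache: Dict[str, str] = {}
client = LookAsideClient(cache, load)

first = client.get("user:1")    # miss, falls through to storage
second = client.get("user:1")   # hit, storage is not touched
cache.clear()                   # simulates a node flushed after a failure
third = client.get("user:1")    # miss again, repopulated by the client
```

Because the client can always refill the cache from storage, flushing a failed node when it returns costs some extra storage reads but never loses data.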
Did an experiment with a proxy written in C++. The C++ proxy saw a significant performance increase (no number given). The proxy tier is being moved back to C and C++.

Data Insight

When there’s a call saying the cache system is misbehaving most of the time the cache is fine. Usually the clients are configured wrong. Or they are abusing the cache system by requesting way too many keys. Or requesting the same key over and over again and saturating the server or the link.
When you tell someone they are abusing your system they want proof. Which key? Which shard is bad? What kind of traffic leads to this behaviour? Proof requires metrics and analysis that can be shown to customers.
An SOA architecture doesn’t give you problem isolation or make debugging easier automatically. You have to have good visibility into every component that makes up the system.
Decided to build Insight into caching. The cache is written in C and is fast, so it can provide data that other components can’t. Other components can't handle the load of providing data for every request.
Logging every single command is possible. The cache can log everything at 100K qps. Only meta data is logged, values are not logged (Good joke about the NSA).
Avoid locking and blocking. Especially don’t block on disk writes.
At 100K qps and 100 bytes per log message, each box will log 10MB of data per second. That’s a lot of data to move off the box. 10% of network bandwidth would be used just in case something went bad. Economically not feasible.
Precompute logs on the box to reduce costs. The assumption is that it is already known what will be computed. A process reads the logs, generates a summary, and periodically sends out this view of the box. The view is tiny compared to the original data.
View data is aggregated by Storm, stored, and there’s a visualization system sitting on top. You can get data like: here are your top 20 keys; here’s your traffic by second and there’s a peak, which means the traffic pattern is spiky; here’s the number of unique keys, which helps with capacity planning. A lot can be done when every single log is captured.
Insight is very valuable for operations. If there are packet drops often that can be linked to either a hot key or spiky traffic behaviour.
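A minimal sketch of on-box summarization, assuming a hypothetical log line format of `timestamp command key` (the talk does not describe the real format). It reduces raw log lines to the kind of small view mentioned above: top keys, unique key count, and traffic per second.

```python
from collections import Counter

# Raw volume if every command were shipped off the box, using the
# 100K qps logging rate and 100 bytes per message quoted in the talk.
RAW_BYTES_PER_SEC = 100_000 * 100   # 10,000,000 bytes, about 10 MB/s


def summarize(log_lines, top_n: int = 20) -> dict:
    """Reduce raw command-log lines to a small per-box view."""
    key_counts = Counter()
    per_second = Counter()
    for line in log_lines:
        timestamp, _command, key = line.split()
        key_counts[key] += 1
        per_second[int(float(timestamp))] += 1
    return {
        "top_keys": key_counts.most_common(top_n),
        "unique_keys": len(key_counts),
        "requests_per_second": dict(per_second),
    }


# Hypothetical log lines: only metadata, never values.
lines = [
    "1.0 GET timeline:1",
    "1.5 GET timeline:1",
    "2.1 SET timeline:2",
]
view = summarize(lines, top_n=1)
print(view)
```

The view grows with the number of distinct hot keys and seconds covered, not with the request rate, which is what makes it cheap to ship to a central aggregator.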

Wish List For Redis

Explicit memory management.
Deployable (Lua) Scripts. Talked about near the start.
Multi-threading. Would make cluster management easier. Twitter has a lot of “tall boxes,” where a host has 100+ GB of memory and a lot of CPUs. To use the full capabilities of a server a lot of Redis instances need to be started on a physical machine. With multi-threading fewer instances would need to be started which is much easier to manage.

Lessons Learned

Scale demands predictability. The larger the cluster, the more customers, the more predictable and deterministic you want your service to be. When there’s one customer and there’s a problem you can dig into a problem and it’s intriguing. When you have 70 customers you can’t keep up.
Tail latencies matter. When you do fanouts to a lot of shards, when one is slow your entire query will be slow.
Deterministic configuration is operationally important. Twitter is moving towards a container environment. Mesos is used as the job scheduler. The scheduler fulfills the request for the amount of CPU, memory etc. A monitor kills any job that goes over its resource requirement. Redis causes a problem in a container environment. Redis introduces external fragmentation, meaning you use more memory to store the same amount of data. If you don’t want to be killed you have to compensate for that with oversupply. You have to think my memory fragmentation ratio won’t go over 5%, but I’ll allocate 10% more as a buffer space. Maybe even 20%. Or I think I’ll get 5000 connections per host, but just in case let me allocate memory for 10,000 connections. The result is a huge potential for waste. Super low latency services don’t play well with Mesos today, so these jobs are isolated from other jobs.
Knowing your resource usage at runtime is really helpful. In a large cluster bad stuff happens. You think you are safe but things happen and behaviour is unexpected. Most services today can’t degrade gracefully. For example, when a limit of 10GB of RAM is reached then requests are rejected until there’s free RAM. This only fails a small percentage of traffic that’s proportional to the resource that they require. That's graceful. Garbage collection problems are not graceful, traffic just gets dropped on the floor, this problem affects a lot of teams in a lot of companies every day.
Push computation to the data. If you look at relative network speeds, CPU speeds, and disk speeds, it makes sense to do computation before going to disk and do computation before going to the network. An example is summarizing logs on a node before they are pushed to a centralized monitoring service. Lua in Redis is another way to apply computation close to the data.
Lua is not production ready in Redis today. On demand scripting means service providers can’t guarantee their SLA. A loaded script can do anything. What service provider would want to take the risk of blowing their SLA because of someone else's code? A deployment model would be better. It would allow for code review and benchmarking, so resource usage and performance could be properly calculated.
Redis as the next high performance stream processing platform. It has pub-sub and scripting. Why not?
