chs_jdmdr

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs OrientDB vs Aerospike vs N

http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

(Yes it's a long title, since people kept asking me to write about this and that too :) I do when it has a point.)

While SQL databases are insanely useful tools, their monopoly of the last decades is coming to an end. And it's about time: I can't even count the things that were forced into relational databases, but never really fitted them. (That being said, relational databases will always be the best for the stuff that has relations.)

But the differences between NoSQL databases are much bigger than they ever were between one SQL database and another. This means that it is a bigger responsibility on software architects to choose the appropriate one for a project right at the beginning.

In this light, here is a comparison of Cassandra, MongoDB, CouchDB, Redis, Riak, Couchbase (ex-Membase), Hypertable, ElasticSearch, Accumulo, VoltDB, Kyoto Tycoon, Scalaris, OrientDB, Aerospike, Neo4j and HBase:

The most popular ones

Redis (V3.0RC)

Written in: C
Main point: Blazing fast
License: BSD
Protocol: Telnet-like, binary safe
Disk-backed in-memory database,
Dataset size limited to computer RAM (but can span multiple machines' RAM with clustering)
Master-slave replication, automatic failover
Simple values or data structures by keys
but complex operations like ZREVRANGEBYSCORE.
INCR & co (good for rate limiting or statistics)
Bit operations (for example to implement bloom filters)
Has sets (also union/diff/inter)
Has lists (also a queue; blocking pop)
Has hashes (objects of multiple fields)
Sorted sets (high score table, good for range queries)
Lua scripting capabilities (!)
Has transactions (!)
Values can be set to expire (as in a cache)
Pub/Sub lets one implement messaging

Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).

For example: To store real-time stock prices. Real-time analytics. Leaderboards. Real-time communication. And wherever you used memcached before.
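The "INCR & co" plus expiring-keys combination is what makes rate limiting so easy. Here is a minimal sketch of the idea in plain Python, assuming a fixed time window. TinyCounterStore and allow_request are hypothetical names made up for this illustration: the class is an in-memory stand-in for a key-value store with an atomic increment and a time-to-live, not a Redis client.

```python
import time


class TinyCounterStore:
    """In-memory stand-in (hypothetical) for a store with INCR-style counters and expiring keys."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def incr(self, key, ttl_seconds=None):
        """Increment the counter at `key`; the TTL starts only when the key is created."""
        now = self._clock()
        value, expires_at = self._data.get(key, (0, None))
        if expires_at is not None and now >= expires_at:
            value, expires_at = 0, None  # the key has expired, start over
        if value == 0 and ttl_seconds is not None:
            expires_at = now + ttl_seconds
        value += 1
        self._data[key] = (value, expires_at)
        return value


def allow_request(store, client_id, limit, window_seconds):
    """Fixed-window rate limit: at most `limit` requests per window per client."""
    count = store.incr("rate:" + client_id, ttl_seconds=window_seconds)
    return count <= limit
```

With a limit of 2 per 10 seconds, the third call inside the window is rejected, and the counter starts again once the key has expired.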

MongoDB (2.6.7)

Written in: C++
Main point: Retains some friendly properties of SQL. (Query, index)
License: AGPL (Drivers: Apache)
Protocol: Custom, binary (BSON)
Master/slave replication (auto failover with replica sets)
Sharding built-in
Queries are JavaScript expressions
Run arbitrary JavaScript functions server-side
Better update-in-place than CouchDB
Uses memory mapped files for data storage
Performance over features
Journaling (with --journal) is best turned on
On 32-bit systems, limited to ~2.5 GB
Text search integrated
GridFS to store big data + metadata (not actually an FS)
Has geospatial indexing
Data center aware

Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

For example: For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.
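To show what "dynamic queries" over documents without predefined columns means, here is a toy matcher in plain Python. It is only a sketch: it understands equality plus three operators spelled like MongoDB's $gt, $lt and $in, and the helper names matches and find are my own for this illustration. It says nothing about how MongoDB evaluates or indexes queries.

```python
def matches(document, query):
    """Return True if `document` satisfies every condition in `query`."""
    for field, condition in query.items():
        value = document.get(field)  # a missing field behaves like None
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if op == "$gt":
                    ok = value is not None and value > operand
                elif op == "$lt":
                    ok = value is not None and value < operand
                elif op == "$in":
                    ok = value in operand
                else:
                    raise ValueError("unsupported operator: " + op)
                if not ok:
                    return False
        elif value != condition:
            return False
    return True


def find(collection, query):
    """Return the documents of `collection` (a list of dicts) that match `query`."""
    return [doc for doc in collection if matches(doc, query)]
```

Documents that lack a field simply fail conditions on that field, so records with different shapes can live in the same collection.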

Cassandra (2.0)

Written in: Java
Main point: Store huge datasets in "almost" SQL
License: Apache
Protocol: CQL3 & Thrift
CQL3 is very similar to SQL, but with some limitations that come from the scalability (most notably: no JOINs, no aggregate functions.)
CQL3 is now the official interface. Don't look at Thrift, unless you're working on a legacy app. This way, you can live without understanding ColumnFamilies, SuperColumns, etc.
Querying by key, or key range (secondary indices are also available)
Tunable trade-offs for distribution and replication (N, R, W)
Data can have expiration (set on INSERT).
Writes can be much faster than reads (when reads are disk-bound)
Map/reduce possible with Apache Hadoop
All nodes are similar, as opposed to Hadoop/HBase
Very good and reliable cross-datacenter replication
Distributed counter datatype.
You can write triggers in Java.

Best used: When you need to store data so huge that it doesn't fit on a server, but still want a friendly, familiar interface to it.

For example: Web analytics, to count hits by hour, by browser, by IP, etc. Transaction logging. Data collection from huge sensor arrays.
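The (N, R, W) trade-off mentioned above comes down to simple arithmetic: with N replicas, a write acknowledged by W of them and a read that asks R of them must share at least one replica whenever R + W > N. A small sketch of that rule follows; the function names are mine, not part of any Cassandra API.

```python
def quorum(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def reads_see_latest_write(n, r, w):
    """True when every set of r replicas overlaps every set of w replicas."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n
```

For N = 3, quorum reads and quorum writes (2 + 2 > 3) overlap, while reading one replica and writing one replica (1 + 1) trades that guarantee for lower latency.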

ElasticSearch (0.20.1)

Written in: Java
Main point: Advanced Search
License: Apache
Protocol: JSON over HTTP (Plugins: Thrift, memcached)
Stores JSON documents
Has versioning
Parent and children documents
Documents can time out
Very versatile and sophisticated querying, scriptable
Write consistency: one, quorum or all
Sorting by score (!)
Geo distance sorting
Fuzzy searches (approximate date, etc) (!)
Asynchronous replication
Atomic, scripted updates (good for counters, etc)
Can maintain automatic "stats groups" (good for debugging)
Still depends very much on only one developer (kimchy).

Best used: When you have objects with (flexible) fields, and you need "advanced search" functionality.

For example: A dating service that handles age difference, geographic location, tastes and dislikes, etc. Or a leaderboard system that depends on many variables.
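For the dating-service example, "geo distance sorting" means ordering profiles by how far they are from the searcher. Here is a brute-force sketch using the haversine great-circle formula. It assumes a spherical Earth with a mean radius of 6371 km, the field names "lat" and "lon" are made up for the example, and a real search engine would not scan every document like this.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0  # mean radius, spherical approximation
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def sort_by_distance(profiles, lat, lon):
    """Return the profiles ordered from nearest to farthest from (lat, lon)."""
    return sorted(profiles, key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))
```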

Classic document and BigTable stores

CouchDB (V1.2)

Written in: Erlang
Main point: DB consistency, ease of use
License: Apache
Protocol: HTTP/REST
Bi-directional (!) replication,
continuous or ad-hoc,
with conflict detection,
thus, master-master replication. (!)
MVCC - write operations do not block reads
Previous versions of documents are available
Crash-only (reliable) design
Needs compacting from time to time
Views: embedded map/reduce
Formatting views: lists & shows
Server-side document validation possible
Authentication possible
Real-time updates via '_changes' (!)
Attachment handling
thus, CouchApps (standalone js apps)

Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.

For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
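The "pre-defined queries" are map/reduce views: a map function emits key/value pairs for each document, the pairs are grouped by key, and a reduce function folds each group. CouchDB views are normally written in JavaScript; the following is just a Python model of the idea, with run_view and posts_by_author as hypothetical names.

```python
from collections import defaultdict


def run_view(documents, map_fn, reduce_fn):
    """Apply map_fn to every document, group the emitted pairs by key, reduce each group."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    # Views are ordered by key, so return the groups in sorted key order.
    return {key: reduce_fn(values) for key, values in sorted(groups.items())}


def posts_by_author(doc):
    """Map function: emit (author, 1) for every document of type 'post'."""
    if doc.get("type") == "post":
        yield doc["author"], 1
```

Calling run_view(docs, posts_by_author, sum) counts posts per author. In this sketch everything is recomputed on every call, whereas a real view is stored and updated as documents change.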

Accumulo (1.4)

Written in: Java and C++
Main point: A BigTable with Cell-level security
License: Apache
Protocol: Thrift
Another BigTable clone, also runs on top of Hadoop
Originally from the NSA
Cell-level security
Bigger rows than memory are allowed
Keeps a memory map outside Java, in C++ STL
Map/reduce using Hadoop's facilities (ZooKeeper & co)
Some server-side programming

Best used: If you need to restrict access on the cell level.

For example: Same as HBase, since it's basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables is a requirement.

HBase (V0.92.0)

Written in: Java
Main point: Billions of rows X millions of columns
License: Apache
Protocol: HTTP/REST (also Thrift)
Modeled after Google's BigTable
Uses Hadoop's HDFS as storage
Map/reduce with Hadoop
Query predicate push down via server side scan and get filters
Optimizations for real time queries
A high performance Thrift gateway
HTTP supports XML, Protobuf, and binary
Jruby-based (JIRB) shell
Rolling restart for configuration changes and minor upgrades
Random access performance is like MySQL
A cluster consists of several different types of nodes

Best used: Hadoop is probably still the best way to run Map/Reduce jobs on huge datasets. Best if you use the Hadoop/HDFS stack already.

For example: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables is a requirement.
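Scanning a BigTable-style table works because rows are kept sorted by row key, so a range of keys is one contiguous slice. A minimal in-memory sketch of that data model follows. SortedTable is a hypothetical class for illustration only and has nothing to do with the HBase client API.

```python
import bisect


class SortedTable:
    """Toy table: rows kept in row-key order, each row a dict of column -> value."""

    def __init__(self):
        self._keys = []  # row keys, always sorted
        self._rows = {}  # row key -> {column: value}

    def put(self, row_key, column, value):
        """Set one cell, creating the row (in sorted position) if needed."""
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def scan(self, start_key, stop_key):
        """Yield (row_key, columns) for start_key <= row_key < stop_key, in key order."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]
```

This is why row-key design matters so much in these stores: putting the thing you scan by (say, a host name followed by a sequence number) at the front of the key turns a query into a single range scan.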

Hypertable (0.9.6.5)

Written in: C++
Main point: A faster, smaller HBase
License: GPL 2.0
Protocol: Thrift, C++ library, or HQL shell
Implements Google's BigTable design
Runs on Hadoop's HDFS
Uses its own, "SQL-like" language, HQL
Can search by key, by cell, or for values in column families.
Search can be limited to key/column ranges.
Sponsored by Baidu
Retains the last N historical values
Tables are in namespaces
Map/reduce with Hadoop

Best used: If you need a better HBase.

For example: Same as HBase, since it's basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables is a requirement.

Graph databases

OrientDB (2.0)

Written in: Java
Main point: Document-based graph database
License: Apache 2.0
Protocol: binary, HTTP REST/JSON, or Java API for embedding
Has transactions, full ACID conformity
Can be used both as a document and as a graph database (vertices with properties)
Both nodes and relationships can have metadata
Multi-master architecture
Supports relationships between documents via persistent pointers (LINK, LINKSET, LINKMAP, LINKLIST field types)
SQL-like query language (Note: no JOIN, but there are pointers)
Web-based GUI (quite good-looking, self-contained)
Inheritance between classes. Indexing of nodes and relationships
User functions in SQL or JavaScript
Sharding
Advanced path-finding with multiple algorithms and Gremlin traversal language
Advanced monitoring, online backups are commercially licensed

Best used: For graph-style, rich or complex, interconnected data.

For example: For searching routes in social relations, public transport links, road maps, or network topologies.

Neo4j (V1.5M02)

Written in: Java
Main point: Graph database - connected data
License: GPL, some features AGPL/commercial
Protocol: HTTP/REST (or embedding in Java)
Standalone, or embeddable into Java applications
Full ACID conformity (including durable data)
Both nodes and relationships can have metadata
Integrated pattern-matching-based query language ("Cypher")
Also the "Gremlin" graph traversal language can be used
Indexing of nodes and relationships
Nice self-contained web admin
Advanced path-finding with multiple algorithms
Indexing of keys and relationships
Optimized for reads
Has transactions (in the Java API)
Scriptable in Groovy
Clustering, replication, caching, online backup, advanced monitoring and High Availability are commercially licensed

Best used: For graph-style, rich or complex, interconnected data.

For example: For searching routes in social relations, public transport links, road maps, or network topologies.
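"Searching routes in social relations" is, at its simplest, a shortest-path search over relationships. Here is a sketch with breadth-first search over a plain adjacency dict. This is the textbook algorithm, not a claim about which path-finding algorithms OrientDB or Neo4j ship, and the graph layout (node name mapped to a list of neighbours) is an assumption of the example.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return a path with the fewest hops from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                if neighbour == goal:
                    # Walk the back-pointers from goal to start, then reverse.
                    path = [goal]
                    while previous[path[-1]] is not None:
                        path.append(previous[path[-1]])
                    return path[::-1]
                queue.append(neighbour)
    return None
```

In a relational database each hop would be another self-join; a graph database stores the relationships directly, which is what makes this kind of traversal cheap.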

The "long tail"
(Not widely known, but definitely worthy ones)

Couchbase (ex-Membase) (2.0)

Written in: Erlang & C
Main point: Memcache compatible, but with persistence and clustering
License: Apache
Protocol: memcached + extensions
Very fast (200k+/sec) access of data by key
Persistence to disk
All nodes are identical (master-master replication)
Provides memcached-style in-memory caching buckets, too
Write de-duplication to reduce IO
Friendly cluster-management web GUI
Connection proxy for connection pooling and multiplexing (Moxi)
Incremental map/reduce
Cross-datacenter replication

Best used: Any application where low-latency data access, high concurrency support and high availability is a requirement.

For example: Low-latency use-cases like ad targeting or highly-concurrent web apps like online gaming (e.g. Zynga).

Scalaris (0.5)

Written in: Erlang
Main point: Distributed P2P key-value store
License: Apache
Protocol: Proprietary & JSON-RPC
In-memory (disk when using Tokyo Cabinet as a backend)
Uses YAWS as a web server
Has transactions (an adapted Paxos commit)
Consistent, distributed write operations
From CAP, values Consistency over Availability (in case of network partitioning, only the bigger partition works)

Best used: If you like Erlang and wanted to use Mnesia or DETS or ETS, but you need something that is accessible from more languages (and scales much better than ETS or DETS).

For example: In an Erlang-based system when you want to give access to the DB to Python, Ruby or Java programmers.

Aerospike (3.4.1)

Written in: C
Main point: Speed, SSD-optimized storage
License: AGPL (Client: Apache)
Protocol: Proprietary
Cross-datacenter replication is commercially licensed
Very fast access of data by key
Uses SSD devices as a block device to store data (RAM + persistence also available)
Automatic failover and automatic rebalancing of data when nodes are added or removed from cluster
User Defined Functions in Lua
Cluster management with Web GUI
Has complex data types (lists and maps) as well as simple (integer, string, blob)
Secondary indices
Aggregation query model
Data can be set to expire with a time-to-live (TTL)
Large Data Types

Best used: Any application where low-latency data access, high concurrency support and high availability is a requirement.

For example: Storing massive amounts of profile data in online advertising or retail Web sites.

Riak (V1.2)

Written in: Erlang & C, some JavaScript
Main point: Fault tolerance
License: Apache
Protocol: HTTP/REST or custom binary
Stores blobs
Tunable trade-offs for distribution and replication
Pre- and post-commit hooks in JavaScript or Erlang, for validation and security.
Map/reduce in JavaScript or Erlang
Links & link walking: use it as a graph database
Secondary indices: but only one at once
Large object support (Luwak)
Comes in "open source" and "enterprise" editions
Full-text search, indexing, querying with Riak Search
In the process of migrating the storing backend from "Bitcask" to Google's "LevelDB"
Masterless multi-site replication and SNMP monitoring are commercially licensed

Best used: If you want Dynamo-like data storage, but there's no way you're gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you're ready to pay for multi-site replication.

For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt. Could be used as a well-update-able web server.

VoltDB (2.8.4.1)

Written in: Java
Main point: Fast transactions and rapidly changing data
License: AGPL v3 and proprietary
Protocol: Proprietary
In-memory relational database.
Can export data into Hadoop
Supports ANSI SQL
Stored procedures in Java
Cross-datacenter replication

Best used: Where you need to act fast on massive amounts of incoming data.

For example: Point-of-sales data analysis. Factory control systems.

Kyoto Tycoon (0.9.56)

Written in: C++
Main point: A lightweight network DBM
License: GPL
Protocol: HTTP (TSV-RPC or REST)
Based on Kyoto Cabinet, Tokyo Cabinet's successor
Multitudes of storage backends: Hash, Tree, Dir, etc (everything from Kyoto Cabinet)
Kyoto Cabinet can do 1M+ insert/select operations per sec (but Tycoon does less because of overhead)
Lua on the server side
Language bindings for C, Java, Python, Ruby, Perl, Lua, etc
Uses the "visitor" pattern
Hot backup, asynchronous replication
background snapshot of in-memory databases
Auto expiration (can be used as a cache server)

Best used: When you want to choose the backend storage algorithm engine very precisely. When speed is of the essence.

For example: Caching server. Stock prices. Analytics. Real-time data collection. Real-time communication. And wherever you used memcached before.

Of course, all these systems have many more features than what's listed here. I only wanted to list the key points that I base my decisions on. Also, development of all of them is very fast, so things are bound to change.

Discussion on Hacker News

Shameless plug: I'm a freelance software architect (resume), have a look at my services!

P.S.: And no, there's no date on this review. There are version numbers, since I update the databases one by one, not at the same time. And believe me, the basic properties of databases don't change that much.
