linuxheik

Introduction to Distributed Open-Source Libraries

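1. Some systems overlap in functionality.
For example, Redis is a key-value database, but it can also serve as a cache or as a message distribution system.
I will consider later how best to merge such entries so that the categorization is more accurate.

2. I plan to add an index later; there is so much material now that it is hard to browse.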

[Cluster Management]

mesos

Program against your datacenter like it’s a single pool of resources

Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.

What is Mesos?

A distributed systems kernel

Mesos is built using the same principles as the Linux kernel, only at a different level of abstraction. The Mesos kernel runs on every machine and provides applications (e.g., Hadoop, Spark, Kafka, Elasticsearch) with APIs for resource management and scheduling across entire datacenter and cloud environments.

Mesos Getting Started

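Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications or frameworks. It can run Hadoop, MPI, Hypertable, and Spark.

Features: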

Fault-tolerant replicated master using ZooKeeper
Scalability to 10,000s of nodes
Isolation between tasks with Linux Containers
Multi-resource scheduling (memory and CPU aware)
Java, Python and C++ APIs for developing new parallel applications
Web UI for viewing cluster state

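Books and articles
Mesos Made Simple

Mesos Made Simple (1): An operating system built for the software-defined datacenter
Mesos Made Simple (2): Mesos architecture and workflow
Mesos Made Simple (3): Persistent storage and fault tolerance
Mesos Made Simple (4): Resource allocation in Mesos
Mesos Made Simple (5): A successful open-source community
Mesos Made Simple (6): Hands-on experience with Apache Mesos
Apple rebuilds the Siri backend services with Apache Mesos
Singularity: a service deployment and job scheduling platform built on Apache Mesos
Autodesk's scalable event system based on Mesos
The Myriad project: Mesos and YARN working together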

[RPC]

hprose : github

High Performance Remote Object Service Engine

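hprose is an advanced, lightweight, cross-language, cross-platform, non-intrusive, high-performance engine library for dynamic remote object invocation. It is easy to use yet powerful, and is intended for building distributed application systems.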

protocolbuffer

Protocol Buffers - Google's data interchange format

Related pages
https://github.com/google/protobuf
https://developers.google.com/protocol-buffers/
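
As a small illustration of what "data interchange format" means at the byte level, the sketch below encodes an integer field the way the Protocol Buffers wire format does: integers are written as base-128 varints, and each field is prefixed with a tag built from the field number and the wire type. The function names are made up for this note; real code would use the classes generated by `protoc` rather than hand-rolled encoding.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Decode one varint; return (value, number_of_bytes_consumed)."""
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, index + 1
        shift += 7
    raise ValueError("truncated varint")


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Encode one integer field: tag (field number + wire type 0), then the value."""
    tag = (field_number << 3) | 0  # wire type 0 means "varint"
    return encode_varint(tag) + encode_varint(value)
```

For example, `encode_varint_field(1, 150)` produces the three bytes `08 96 01`, and `decode_varint(b"\xac\x02")` returns `(300, 2)`.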

grpc:github

Overview

Remote Procedure Calls (RPCs) provide a useful abstraction for building distributed applications and services. The libraries in this repository provide a concrete implementation of the gRPC protocol, layered over HTTP/2. These libraries enable communication between clients and servers using any combination of the supported languages.

The Go implementation of gRPC: A high performance, open source, general RPC framework that puts mobile and HTTP/2 first. For more information see the gRPC Quick Start guide.

Doc

thrift

The Apache Thrift software framework, for scalable cross-language services development,
combines a software stack with a code generation engine 
to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml and Delphi and other languages.

Document

Tutorial

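Thrift is a software framework (a remote procedure call framework) for developing scalable, cross-language services. It combines a powerful software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, and OCaml.

Thrift was originally developed at Facebook, was open-sourced in April 2007, entered the Apache Incubator in May 2008, and is now a top-level project of the Apache Foundation.

Thrift lets you define data types and service interfaces in a simple definition file. Taking that file as input, the compiler generates code for building RPC clients and servers that communicate seamlessly across programming languages.

Cassandra, the well-known key-value storage server, originally exposed its client API through Thrift.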

[Messaging Systems (Distributed Messaging)]

Kafka

Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.

- Fast
    A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients.

- Scalable
    Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization.
    It can be elastically and transparently expanded without downtime.
    Data streams are partitioned and spread over a cluster of machines to allow data streams larger than 
    the capability of any single machine and to allow clusters of co-ordinated consumers

- Durable
    Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.

- Distributed by Design
    Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.
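
The "distributed commit log" idea in the list above can be sketched in a few lines: each partition is an append-only list, a record's position in that list is its offset, and records with the same key always land in the same partition. This is a single-process toy, not the Kafka client API; the class and method names are invented for this note, and CRC32 stands in for whatever hash a real partitioner uses.

```python
import zlib
from typing import List, Tuple


class PartitionedLog:
    """Toy in-memory model of a topic split into append-only partitions."""

    def __init__(self, num_partitions: int) -> None:
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.partitions: List[List[bytes]] = [[] for _ in range(num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Assumption: a stable hash of the key picks the partition.
        return zlib.crc32(key) % len(self.partitions)

    def append(self, key: bytes, value: bytes) -> Tuple[int, int]:
        """Append a record and return (partition, offset)."""
        partition = self.partition_for(key)
        self.partitions[partition].append(value)
        return partition, len(self.partitions[partition]) - 1

    def read(self, partition: int, offset: int, max_records: int = 10) -> List[bytes]:
        """Read records starting at an offset; the consumer tracks its own offset."""
        return self.partitions[partition][offset:offset + max_records]
```

Because reads never remove anything, several consumers can each keep their own offset into the same partition and re-read old records when they need to.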

introduction

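Kafka is a high-throughput distributed publish-subscribe messaging system with the following features:

- Message persistence through O(1) disk data structures, which keep performance stable over long periods even with terabytes of stored messages.
- High throughput: even on very modest hardware, Kafka can handle hundreds of thousands of messages per second.
- Support for partitioning messages across Kafka servers and clusters of consumer machines.
- Support for parallel data loading into Hadoop.

Kafka aims to provide a publish-subscribe solution that can handle all the activity stream data of a consumer-scale website.

This activity (page views, searches, and other user actions) is a key ingredient of many social features on the modern web.

Because of the throughput requirements, this data is usually handled through log processing and log aggregation.

That is a workable way to feed log data into an offline analysis system such as Hadoop, but it is limiting when real-time processing is required.

Kafka aims to unify online and offline message processing through Hadoop's parallel loading mechanism, and to provide real-time consumption across a cluster of machines.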

NATS

NATS is an open-source, high-performance, lightweight cloud native messaging system

gnatsd Github:A High Performance NATS Server written in Go.

cnats Github:A C client for the NATS messaging system.

NATS Github:Golang client for NATS, the cloud native messaging system

Cloud Native Infrastructure. Open Source. Performant. Simple. Scalable.

NATS acts as a central nervous system for distributed systems at scale, such as mobile devices, IoT networks,
and cloud native infrastructure. **Written in Go**,
NATS powers some of the largest cloud platforms in production today. 
Unlike traditional enterprise messaging systems, 
NATS has an always-on dial tone that does whatever it takes to remain available.
NATS was created by Derek Collison, 
Founder/CEO of Apcera who has spent 20+ years designing, building,
and using publish-subscribe messaging systems.

documentation

NATS is a Docker Official Image

NATS is the most Performant Cloud Native messaging platform available
With gnatsd (Golang-based server), NATS can send up to 6 MILLION MESSAGES PER SECOND.

[Cache Servers, Proxy Servers, Load Balancing]

memcached

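memcached is a high-performance distributed in-memory cache server. It is typically used to cache database query results and so reduce the number of database accesses, which speeds up dynamic web applications and improves their scalability.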

What is Memcached?
Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load. 

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

Memcached is simple yet powerful. Its simple design promotes quick deployment, ease of development, and solves many problems facing large data caches. Its API is available for most popular languages.
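
The usual way to relieve database load with a cache like this is the cache-aside pattern: look in the cache first, fall back to the database on a miss, then store the result with an expiry time. The sketch below shows only the pattern. `TTLCache` is an in-process stand-in for a memcached client and `FakeDatabase` is a hypothetical slow data source; neither is part of memcached.

```python
import time
from typing import Any, Dict, Optional, Tuple


class TTLCache:
    """In-process stand-in for a memcached client (get/set with expiry)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Optional[Any]:
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)


class FakeDatabase:
    """Hypothetical database that counts how often it is queried."""

    def __init__(self) -> None:
        self.calls = 0

    def load_user(self, user_id: int) -> Dict[str, Any]:
        self.calls += 1
        return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache: TTLCache, db: FakeDatabase, user_id: int,
             ttl_seconds: float = 60.0) -> Dict[str, Any]:
    """Cache-aside read: try the cache, then the database, then fill the cache."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = db.load_user(user_id)
    cache.set(key, row, ttl_seconds)
    return row
```

Calling `get_user` twice for the same id queries the database only once; the second call is served from the cache until the entry expires.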

nginx

nginx [engine x] is an HTTP and reverse proxy server, a mail proxy server, and a generic TCP proxy server, originally written by Igor Sysoev. For a long time, it has been running on many heavily loaded Russian sites including Yandex, Mail.Ru, VK, and Rambler. According to Netcraft, nginx served or proxied 23.36% of the busiest sites in September 2015. Here are some of the success stories: Netflix, Wordpress.com, FastMail.FM.

The sources and documentation are distributed under the 2-clause BSD-like license.

Document

Now with support for HTTP/2, massive performance and security enhancements,
greater visibility into application health, and more.

redis

Redis is an open source (BSD licensed), in-memory data structure store, used as database, cache and message broker.

It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries. 

Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
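
To make "LRU eviction" concrete, here is a minimal exact LRU cache: when the cache is full, the key that was used least recently is dropped. This is an illustration of the policy only. Redis itself approximates LRU by sampling keys rather than keeping a strict ordering, and the class below is not a Redis API.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Exact least-recently-used cache with a fixed capacity."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Optional[Any] = None) -> Any:
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # reading a key makes it most recent
        return self._items[key]

    def set(self, key: Hashable, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used key

    def __contains__(self, key: Hashable) -> bool:
        return key in self._items
```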

try redis

[Distributed Parallel Computing Frameworks]

mapreduce

MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster.

Conceptually similar approaches have been very well known since 1995 with the Message Passing Interface standard having reduce and scatter operations.

Related pages
https://en.wikipedia.org/wiki/MapReduce
http://www-01.ibm.com/software/data/infosphere/hadoop/mapreduce/

About MapReduce

MapReduce is the heart of Hadoop®. It is this programming paradigm that allows for massive scalability across
hundreds or thousands of servers in a Hadoop cluster.
The MapReduce concept is fairly simple to understand for those who are familiar with clustered scale-out data
processing solutions.

For people new to this topic, it can be somewhat difficult to grasp, because it’s not typically something people have been exposed to previously.
If you’re new to Hadoop’s MapReduce jobs, don’t worry: we’re going to describe it in a way that gets you up
to speed quickly.

The term MapReduce actually refers to two separate and distinct tasks that Hadoop programs perform.
The first is the map job, which takes a set of data and converts it into another set of data, where individual elements are broken down into tuples (key/value pairs). 
The reduce job takes the output from a map as input and combines those data tuples into a smaller set of tuples.
As the sequence of the name MapReduce implies, the reduce job is always performed after the map job.
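
The map and reduce steps described above can be sketched with the classic word count, run in a single process. This only shows the shape of the model (map emits key/value tuples, a shuffle groups them by key, reduce combines each group); it is not the Hadoop API, and the function names are made up for this note.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: break one input line into (word, 1) tuples."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all values by key, as the framework does between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine the tuples for one key into a single smaller tuple."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    mapped = (pair for line in lines for pair in map_words(line))
    return dict(reduce_counts(key, values) for key, values in shuffle(mapped).items())
```

On a cluster the map calls run in parallel over input splits and the reduce calls run in parallel over key groups; the two functions themselves stay the same.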

MapReduce Tutorial

spark

Apache Spark™ is a fast and general engine for large-scale data processing.

Document

Programming Guides:

Quick Start:
a quick introduction to the Spark API; start here!
Spark Programming Guide:
detailed overview of Spark in all supported languages (Scala, Java, Python, R)

Deployment Guides:

Cluster Overview:
overview of concepts and components when running on a cluster
Submitting Applications:
packaging and deploying applications
Deployment modes:
- Amazon EC2: scripts that let you launch a cluster on EC2 in about 5 minutes
- Standalone Deploy Mode: launch a standalone cluster quickly without a third-party cluster manager
- Mesos: deploy a private cluster using Apache Mesos
- YARN: deploy Spark on top of Hadoop NextGen (YARN)

storm

Why use Storm?

Apache Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use!

Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.

Storm integrates with the queueing and database technologies you already use. A Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation however needed. Read more in the tutorial.

Document

Storm (event processor)
Apache Storm is a distributed computation framework written predominantly in the Clojure programming language. Originally created by Nathan Marz[1] and team at BackType,[2] the project was open sourced after being acquired by Twitter.[3] It uses custom created "spouts" and "bolts" to define information sources and manipulations to allow batch, distributed processing of streaming data. The initial release was on 17 September 2011.[4]

A Storm application is designed as a "topology" in the shape of a directed acyclic graph (DAG) with spouts and bolts acting as the graph vertices. Edges on the graph are named streams and direct data from one node to another. Together, the topology acts as a data transformation pipeline. At a superficial level the general topology structure is similar to a MapReduce job, with the main difference being that data is processed in real-time as opposed to in individual batches. Additionally, Storm topologies run indefinitely until killed, while a MapReduce job DAG must eventually end.[5]
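
A rough feel for spouts, bolts, and streams can be had with Python generators chained into a pipeline: the spout is a source of tuples, each bolt consumes one stream and emits another. This is a single-process analogy with invented names, not Storm's API, and unlike a real topology it stops when the (finite) source is exhausted.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def sentence_spout(sentences: Iterable[str]) -> Iterator[str]:
    """Spout: the source of the stream (finite here, unbounded in practice)."""
    for sentence in sentences:
        yield sentence


def split_bolt(stream: Iterable[str]) -> Iterator[str]:
    """Bolt: turn a stream of sentences into a stream of words."""
    for sentence in stream:
        for word in sentence.split():
            yield word


def count_bolt(stream: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Bolt: keep running counts and emit an update for every word seen."""
    counts: Counter = Counter()
    for word in stream:
        counts[word] += 1
        yield word, counts[word]


def run_topology(sentences: Iterable[str]) -> List[Tuple[str, int]]:
    """Wire spout -> split bolt -> count bolt, a three-vertex linear DAG."""
    return list(count_bolt(split_bolt(sentence_spout(sentences))))
```

Each word produces an output as soon as it arrives, which is the contrast with a batch job that reports counts only after all input has been read.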

Storm became an Apache Top-Level Project in September 2014[6] and was previously in incubation since September 2013.[7][8]

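Book: Storm Applied
Storm is a distributed, fault-tolerant real-time computation system. It was originally developed by BackType and was open-sourced after Twitter acquired BackType.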

hadoop

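Hadoop is an open-source, reliable, scalable framework for distributed parallel computing.
Main components: the HDFS distributed file system and MapReduce for executing computations.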

HDFS Architecture Guide

What Is Apache Hadoop?

The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

The project includes these modules:

Hadoop Common: The common utilities that support the other Hadoop modules.
Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
Hadoop YARN: A framework for job scheduling and cluster resource management.
Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.

Other Hadoop-related projects at Apache include:

Ambari™: A web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters which includes support for Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig and Sqoop. Ambari also provides a dashboard for viewing cluster health such as heatmaps and ability to view MapReduce, Pig and Hive applications visually along with features to diagnose their performance characteristics in a user-friendly manner.
Avro™: A data serialization system.
Cassandra™: A scalable multi-master database with no single points of failure.
Chukwa™: A data collection system for managing large distributed systems.
HBase™: A scalable, distributed database that supports structured data storage for large tables.
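Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.

Ad hoc query: a query that the user defines on the spot according to the need at hand. It is a flexible form of query report with no fixed conditions, and it gives users more ways to interact with the data.

Hive is a data warehouse solution built on Hadoop. Because Hadoop itself is highly scalable and fault-tolerant for both storage and computation, data warehouses built with Hive inherit these properties.

Put simply, Hive adds a SQL interface on top of Hadoop: it translates SQL into MapReduce jobs that run on Hadoop. Data developers and analysts can therefore use SQL for statistics and analysis over massive data sets, without the trouble of writing MapReduce programs in a programming language.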

Mahout™: A Scalable machine learning and data mining library.
Pig™: A high-level data-flow language and execution framework for parallel computation.
Spark™: A fast and general compute engine for Hadoop data. Spark provides a simple and expressive programming model that supports a wide range of applications, including ETL, machine learning, stream processing, and graph computation.
Tez™: A generalized data-flow programming framework, built on Hadoop YARN, which provides a powerful and flexible engine to execute an arbitrary DAG of tasks to process data for both batch and interactive use-cases. Tez is being adopted by Hive™, Pig™ and other frameworks in the Hadoop ecosystem, and also by other commercial software (e.g. ETL tools), to replace Hadoop™ MapReduce as the underlying execution engine.
ZooKeeper™: A high-performance coordination service for distributed applications.

[Getting Started]

Learn about Hadoop by reading the documentation.


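In short, Hadoop provides a stable shared storage and analysis system. Storage is provided by HDFS and analysis by MapReduce. Hadoop has other capabilities, but these two are its core.

1.3.1  Relational database management systems
Why can't we use databases with lots of disks to do large-scale batch analysis? Why is MapReduce needed?

The answer comes from another trend in disk drives: seek time is improving far more slowly than transfer rate. Seeking is the process of moving the disk head to a particular place on the disk to read or write data. It characterizes the latency of a disk operation, whereas the transfer rate corresponds to the disk's bandwidth.

If the data access pattern is dominated by seeks, reading or writing large portions of the dataset takes longer than streaming through it, which operates at the transfer rate.
On the other hand, for updating a small proportion of the records in a database, a traditional B-tree (the data structure used in relational databases, which is limited by the rate at which it can perform seeks) works well.

For updating the majority of a database, however, a B-tree is less efficient than MapReduce, which uses sort/merge to rebuild the database.

In many ways, MapReduce can be seen as a complement to an RDBMS (relational database management system). The differences between the two systems are shown in Table 1-1.

MapReduce is a good fit for problems that need to analyze the whole dataset in a batch fashion, particularly for ad hoc analysis.
An RDBMS is good for point queries and updates, where the dataset has been indexed to deliver low-latency retrieval and fast updates of a relatively small amount of data.
MapReduce suits applications where the data is written once and read many times, whereas a relational database is better for datasets that are continually updated.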

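Table 1-1: Comparison of relational databases and MapReduce

              Traditional RDBMS            MapReduce
Data size     Gigabytes                    Petabytes
Access        Interactive and batch        Batch
Updates       Read and write many times    Write once, read many times
Structure     Static schema                Dynamic schema
Integrity     High                         Low
Scaling       Nonlinear                    Linear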

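Another difference between MapReduce and an RDBMS is the amount of structure in the datasets they operate on. Structured data is data organized into entities that have a defined format, such as XML documents or database tables that conform to a particular predefined schema. This is the realm of the RDBMS.

Semi-structured data, on the other hand, is looser: there may be a schema, but it is often ignored, so it can serve only as a guide to the structure of the data. A spreadsheet is an example: its structure is the grid of cells, although the cells themselves may hold any form of data.
Unstructured data has no particular internal structure, for example plain text or image data. MapReduce works well on unstructured or semi-structured data because it is designed to interpret the data at processing time.

In other words, the input keys and values for MapReduce are not intrinsic properties of the data; they are chosen by the person analyzing the data.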
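
A small sketch of that last point: the same raw lines can be keyed in different ways depending on the question being asked, simply by swapping the map function. The log lines below are invented sample data, and `run_job` is a single-process stand-in for a MapReduce job that sums the values per key.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Made-up sample records: "<date> <path> <status>"
SAMPLE_LOG: List[str] = [
    "2015-09-01 /index 200",
    "2015-09-01 /about 404",
    "2015-09-02 /index 200",
]

Mapper = Callable[[str], Iterator[Tuple[str, int]]]


def map_by_day(line: str) -> Iterator[Tuple[str, int]]:
    """Interpret the line at processing time and choose the date as the key."""
    day, _path, _status = line.split()
    yield day, 1


def map_by_status(line: str) -> Iterator[Tuple[str, int]]:
    """Same input, different question: choose the status code as the key."""
    _day, _path, status = line.split()
    yield status, 1


def run_job(lines: Iterable[str], mapper: Mapper) -> Dict[str, int]:
    """Apply the mapper to every line and sum the values for each key."""
    totals: Dict[str, int] = defaultdict(int)
    for line in lines:
        for key, value in mapper(line):
            totals[key] += value
    return dict(totals)
```

Nothing about the stored data changes between the two jobs; the schema lives in the map function rather than in the storage layer.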

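Relational data is often normalized to retain its integrity and remove redundancy. Normalization poses problems for MapReduce because it makes reading a record a non-local operation, and one of the central assumptions MapReduce makes is that it can perform (high-speed) streaming reads and writes.

MapReduce is a linearly scalable programming model. The programmer writes two functions, map() and reduce(), each of which defines a mapping from one set of key/value pairs to another.
These functions are oblivious to the size of the data or the cluster they operate on, so they can be applied unchanged to a small dataset or a massive one.
More importantly, if you double the size of the input data, a job takes twice as long to run. But if you also double the size of the cluster, the job runs as fast as the original one. This is not generally true of SQL queries.

Over time, the differences between relational databases and MapReduce are likely to blur. Relational databases have started to incorporate some ideas from MapReduce (for example, the databases from Aster Data and Greenplum),
and, from the other direction, higher-level query languages built on MapReduce (such as Pig and Hive) make MapReduce systems more approachable to traditional database programmers.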

[NoSQL Databases + Key-Value Databases]

A comparison of 8 NoSQL database systems

ScyllaDB

NoSQL data store using the seastar framework, compatible with Apache Cassandra
http://scylladb.com

http://blog.jobbole.com/93027/
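ScyllaDB: Cassandra rewritten in C++, with a reported tenfold performance improvement
Its two core technologies: Intel's DPDK driver framework and the Seastar network framework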

cassandra

The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.

Cassandra's data model offers the convenience of column indexes with the performance of log-structured updates, strong support for denormalization and materialized views, and powerful built-in caching.

GettingStarted

About Apache Cassandra

This guide provides information for developers and administrators on installing, configuring, and using the features and capabilities of Cassandra.

What is Apache Cassandra?

Apache Cassandra™ is a massively scalable open source NoSQL database. Cassandra is perfect for managing large amounts of structured, semi-structured, and unstructured data across multiple data centers and the cloud. Cassandra delivers continuous availability, linear scalability, and operational simplicity across many commodity servers with no single point of failure, along with a powerful dynamic data model designed for maximum flexibility and fast response times.

How does Cassandra work?

Cassandra’s built-for-scale architecture means that it is capable of handling petabytes of information and thousands of concurrent users/operations per second.

http://www.ibm.com/developerworks/cn/opensource/os-cn-cassandra/index.html

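Apache Cassandra is an open-source distributed key-value storage system. It was originally developed at Facebook to store very large amounts of data. Cassandra is not a conventional relational database; it is a hybrid non-relational database similar to Google's BigTable.
The linked article introduces Cassandra from the following angles: the Cassandra data model, installing and configuring Cassandra, storing data in Cassandra from common programming languages, and setting up a Cassandra cluster.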

http://docs.datastax.com/en/cassandra/2.0/cassandra/gettingStartedCassandraIntro.html

etcd

etcd is a high-performance key-value store for shared configuration and service discovery.
A highly-available key value store for shared configuration and service discovery

Overview

etcd is a distributed key value store that provides a reliable way to store data across a cluster of machines. It’s open-source and available on GitHub. etcd gracefully handles master elections during network partitions and will tolerate machine failure, including the master.

Your applications can read and write data into etcd. A simple use-case is to store database connection details or feature flags in etcd as key value pairs. These values can be watched, allowing your app to reconfigure itself when they change.

Advanced uses take advantage of the consistency guarantees to implement database master elections or do distributed locking across a cluster of workers.
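
The "values can be watched" idea can be sketched with an in-process key-value store that calls back registered watchers whenever a key is written. This is only a model of the pattern: it is not the etcd client API, it has no cluster, consensus, or persistence, and all names are invented for this note.

```python
from typing import Callable, Dict, List, Optional

# A watcher receives (key, previous_value, new_value).
Watcher = Callable[[str, Optional[str], str], None]


class WatchableStore:
    """Toy key-value store that notifies watchers on every write to a key."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Watcher]] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def watch(self, key: str, callback: Watcher) -> None:
        """Register a callback to run whenever the key is written."""
        self._watchers.setdefault(key, []).append(callback)

    def put(self, key: str, value: str) -> None:
        previous = self._data.get(key)
        self._data[key] = value
        for callback in self._watchers.get(key, []):
            callback(key, previous, value)
```

An application would register a watcher on, say, a feature-flag key and reconfigure itself inside the callback instead of polling for changes.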

Getting Started with etcd

ceph

Ceph is a distributed object store and file system designed to provide excellent performance, reliability and scalability.

- Object Storage    
Ceph provides seamless access to objects using native language bindings or radosgw, a REST interface that’s compatible with applications written for S3 and Swift.

- Block Storage   
Ceph’s RADOS Block Device (RBD) provides access to block device images that are striped and replicated across the entire storage cluster.

- File System   
Ceph provides a POSIX-compliant network file system that aims for high performance, large data storage, and maximum compatibility with legacy applications.


#### [Document](http://docs.ceph.com/docs/v0.80.5/)
Ceph uniquely delivers object, block, and file storage in one unified system.



#### [Intro to Ceph](http://docs.ceph.com/docs/v0.80.5/start/intro/)
Whether you want to provide Ceph Object Storage and/or Ceph Block Device services to Cloud Platforms, 
deploy a Ceph Filesystem or use Ceph for another purpose, all Ceph Storage Cluster deployments begin with setting up each Ceph Node, your network and the Ceph Storage Cluster. 

A Ceph Storage Cluster requires at least one Ceph Monitor and at least two Ceph OSD Daemons.
The Ceph Metadata Server is essential when running Ceph Filesystem clients.

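Ceph: a petabyte-scale distributed file system for Linux

Ceph's main goal is to be a POSIX-based distributed file system with no single point of failure, in which data is fault-tolerant and seamlessly replicated. In March 2010, Linus Torvalds merged the Ceph client into kernel 2.6.34. An article on IBM developerWorks explores Ceph's architecture, its fault-tolerance implementation, and the features that simplify managing massive amounts of data.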

[Network Frameworks]

seastar

High performance server-side application framework (written in C++); it is the network framework of [scylla](https://github.com/scylladb/scylla).

SeaStar is an event-driven framework allowing you to write non-blocking, asynchronous code in a relatively straightforward manner (once understood). It is based on futures.
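
Seastar itself is a C++ library, so the following is only an analogy in Python's `asyncio`: an event loop drives several operations that never block the thread, and the caller composes their eventual results. The function names and delays are invented for this note, and nothing here reflects Seastar's actual API.

```python
import asyncio
from typing import List


async def fetch(name: str, delay_seconds: float) -> str:
    """Pretend I/O: yields to the event loop instead of blocking the thread."""
    await asyncio.sleep(delay_seconds)
    return f"{name} done"


async def handle_requests() -> List[str]:
    """Start both operations at once and wait for both results."""
    # gather returns results in the order the awaitables were passed,
    # regardless of which one finishes first.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))
```

Running `asyncio.run(handle_requests())` takes about as long as the slower of the two operations, because both are in flight at the same time on a single thread.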

POCO : github

POCO C++ Libraries-Cross-platform C++ libraries with a network/internet focus.

POrtable COmponents C++ Libraries are:

A collection of C++ class libraries, conceptually similar to the Java Class Library, the .NET Framework or Apple’s Cocoa.
Focused on solutions to frequently-encountered practical problems.
Focused on ‘internet-age’ network-centric applications.
Written in efficient, modern, 100% ANSI/ISO Standard C++.
Based on and complementing the C++ Standard Library/STL.
Highly portable and available on many different platforms.
Open Source, licensed under the Boost Software License.

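With C++11, the STL supports threads and strings can hold UTF-8, so cross-platform development is no longer a dream. I have high hopes for this library.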

[Distributed File Systems + Storage]

hbase

Apache HBase™ is the Hadoop database, a distributed, scalable, big data store

When Would I Use Apache HBase?

Use Apache HBase™ when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.

ceph

Ceph is a scalable distributed storage system
Ceph is a distributed object, block, and file storage platform

Ceph is a petabyte-scale distributed file system for Linux.


gcsfuse

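A user-space file system for interacting with Google Cloud Storage.

Written in Go; a file system built on the [Google Cloud Storage](https://cloud.google.com/storage/) API.
It is currently a beta release: it may contain latent bugs, and its interface may change in backwards-incompatible ways.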

Seafile

Written in C; a cloud storage platform.
Seafile is an open source cloud storage system with features on privacy protection and teamwork.

Goofys

Goofys is written in Go and is a Filey System built on the [S3](https://aws.amazon.com/s3/) API.
Goofys lets you mount an S3 bucket as a Filey System. Why a Filey System rather than a File System? Because goofys puts performance ahead of POSIX compliance.

[Other]

HDFS

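HDFS and KFS compared
Both are open-source implementations of GFS. HDFS is a subproject of Hadoop, implemented in Java, and provides high-throughput, scalable storage for large files to the applications layered on top of Hadoop.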

Kosmos filesystem (KFS) is a high-performance distributed filesystem for web-scale applications such as
storing log data, Map/Reduce data, etc.
It builds upon ideas from Google's well-known Google Filesystem project. It is implemented in C++.

mooseFS

Lustre

TFS : Taobao itself no longer uses it; updates stopped in 2011

mogileFS : github

FastDFS: github

FastDFS is an open source high performance distributed file system (DFS). 
Its major functions include file storing, file syncing and file accessing, and it is designed for high capacity and load balancing. 

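FastDFS is an open-source distributed file system modeled on Google FS. It is implemented in pure C and supports UNIX systems such as Linux, FreeBSD, and AIX.
Files can be accessed only through its proprietary API; it does not support a POSIX interface and cannot be mounted.
Strictly speaking, Google FS and the Google FS-like systems such as FastDFS, mogileFS, HDFS, and TFS are not system-level distributed file systems
but application-level distributed file storage services.

FastDFS is an open-source, lightweight distributed file system for managing files.
Its functions include file storage, file synchronization, and file access (upload and download), and it addresses the problems of large-capacity storage and load balancing.
It is particularly well suited to online services built around files, such as photo-album and video websites.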

gcsfuse

gcsfuse is a user-space file system for interacting with Google Cloud Storage.

Document

GCS Fuse

GCS Fuse is an open source Fuse adapter that allows you to **mount Google Cloud Storage buckets as file systems on Linux or OS X systems**. 

GCS Fuse can be run anywhere with connectivity to Google Cloud Storage (GCS) including Google Compute Engine VMs or on-premises systems.

GCS Fuse provides another means to access Google Cloud Storage objects in addition to the XML API,
JSON API, and the gsutil command line,
allowing even more applications to use Google Cloud Storage and take advantage of its immense scale, high availability, rock-solid durability,
exemplary performance, and low overall cost. GCS Fuse is a Google-developed and community-supported open-source tool, written in Go and hosted on GitHub.

GCS Fuse is open-source software, released under the Apache License.

It is distributed as-is, without warranties or conditions of any kind.

Best effort community support is available on Server Fault with the google-cloud-platform and gcsfuse tags.

Check the previous questions and answers to see if your issue is already answered. For bugs and feature requests, file an issue.

Technical Overview

GCS Fuse works by translating object storage names into a file and directory system, interpreting the “/” character in object names as a directory separator so that objects with the same common prefix are treated as files in the same directory. Applications can interact with the mounted bucket like any other file system, providing virtually limitless file storage running in the cloud, but accessed through a traditional POSIX interface.
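
The name-to-directory translation described above can be sketched as a function over a flat list of object names: everything up to a "/" after the current prefix is treated as a subdirectory, and everything else as a file. This illustrates the idea only; it is not gcsfuse's code, the function name is invented, and the object names in the usage note are made up.

```python
from typing import Iterable, List, Tuple


def list_directory(object_names: Iterable[str],
                   directory: str = "") -> Tuple[List[str], List[str]]:
    """Return (subdirectories, files) directly under `directory`.

    Object names are flat strings; "/" is interpreted as a separator.
    """
    prefix = directory.strip("/")
    prefix = prefix + "/" if prefix else ""
    subdirectories = set()
    files = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        remainder = name[len(prefix):]
        if not remainder:
            continue  # an object named exactly like the directory prefix
        head, separator, _rest = remainder.partition("/")
        if separator:
            subdirectories.add(head)  # something deeper lives under head/
        else:
            files.add(head)
    return sorted(subdirectories), sorted(files)
```

With objects named `logs/2015/a.txt`, `logs/readme`, and `top.txt`, listing the root shows one directory (`logs`) and one file (`top.txt`), even though the bucket itself stores no directory entries at all.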

While GCS Fuse has a file system interface, it is not like an NFS or CIFS file system on the backend. 
GCS Fuse retains the same fundamental characteristics of Google Cloud Storage, preserving the scalability of Google Cloud Storage in terms of size and aggregate performance while maintaining the same latency and single object performance. As with the other access methods, Google Cloud Storage does not support concurrency and locking. For example, if multiple GCS Fuse clients are writing to the same file, the last flush wins.

For more information about using GCS Fuse or to file an issue, go to the Google Cloud Platform GitHub repository.

In the repository, we recommend you review README, semantics, installing, and mounting.

When to use GCS Fuse

GCS Fuse is a utility that helps you make better and quicker use of Google Cloud Storage by allowing file-based applications to use Google Cloud Storage without need for rewriting their I/O code. It is ideal for use cases where Google Cloud Storage has the right performance and scalability characteristics for an application and only the POSIX semantics are missing.

For example, GCS Fuse will work well for genomics and biotech applications, some media/visual effects/rendering applications, financial services modeling applications, web serving content, FTP backends, and applications storing log files (presuming they do not flush too frequently).

support

GCS Fuse is supported in Linux kernel version 3.10 and newer. To check your kernel version, you can use uname -a.

Current status

Please treat gcsfuse as beta-quality software. Use it for whatever you like, but be aware that bugs may lurk, and that we reserve the right to make small backwards-incompatible changes.

The careful user should be sure to read semantics.md for information on how gcsfuse maps file system operations to GCS operations, and especially on surprising behaviors. The list of open issues may also be of interest.

Goofys

Goofys is a Filey-System interface to [S3](https://aws.amazon.com/s3/)

Overview

Goofys allows you to mount an S3 bucket as a filey system.

It's a Filey System instead of a File System because goofys strives for performance first and POSIX second. In particular, things that are difficult to support on S3 or that would translate into more than one round-trip either fail (random writes) or are faked (no per-file permissions). Goofys does not have an on-disk data cache, and its consistency model is close-to-open.

Seafile : github

Seafile is an open source cloud storage system with features on privacy protection and teamwork. Collections of files are called libraries, and each library can be synced separately. A library can also be encrypted with a user chosen password. Seafile also allows users to create groups and easily share files into groups.


Feature Summary

Seafile has the following features:

File syncing

Selective synchronization of file libraries. Each library can be synced separately.
Correct handling of file conflicts based on history instead of timestamp.
Only content not already on the server is transferred, and incomplete transfers can be resumed.
Sync with two or more servers.
Sync with existing folders.
Sync a sub-folder.

Sharing libraries between users or into groups.
Sharing sub-folders between users or into groups.
Download links with password protection
Upload links
Version control with configurable revision number.
Restoring deleted files from trash, history or snapshots.

Privacy protection

Library encryption with a user chosen password.
Client side encryption when using the desktop syncing.

Internal

Seafile's version control model is based on Git, but it is simplified for automatic synchronization, and Git does not need to be installed to run Seafile. Each Seafile library behaves like a Git repository. It has its own unique history, which consists of a list of commits. A commit points to the root of a file system snapshot. The snapshot consists of directories and files. Files are further divided into blocks for more efficient network transfer and storage usage.

Differences from Git:

Automatic synchronization.
Clients do not store file history, thus they avoid the overhead of storing data twice. Git is not efficient for larger files such as images.
Files are further divided into blocks for more efficient network transfer and storage usage.
File transfer can be paused and resumed.
Support for different storage backends on the server side.
Support for downloading from multiple block servers to accelerate file transfer.
More user-friendly file conflict handling. (Seafile adds the user's name as a suffix to conflicting files.)
Graceful handling of files the user modifies while auto-sync is running. Git is not designed to work in these cases.
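
Why dividing files into blocks saves transfer and storage can be shown with a small content-addressed block store: each block is stored under the hash of its contents, so two versions of a file that share blocks store the shared blocks only once. Fixed-size blocks, SHA-1 identifiers, and all names below are assumptions of this sketch, not details taken from Seafile.

```python
import hashlib
from typing import Dict, List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut the data into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class BlockStore:
    """Stores each distinct block once, keyed by the hash of its contents."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}

    def put_file(self, data: bytes, block_size: int = 4) -> List[str]:
        """Store a file version; return the ordered list of its block ids."""
        block_ids = []
        for block in split_into_blocks(data, block_size):
            block_id = hashlib.sha1(block).hexdigest()
            self.blocks.setdefault(block_id, block)  # already-known blocks are skipped
            block_ids.append(block_id)
        return block_ids

    def get_file(self, block_ids: List[str]) -> bytes:
        """Reassemble a file version from its block ids."""
        return b"".join(self.blocks[block_id] for block_id in block_ids)
```

Storing `b"aaaabbbbcccc"` and then `b"aaaabbbbdddd"` with a block size of 4 leaves four blocks in the store rather than six, and a client that already holds the first version would only need to fetch the one new block.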

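"Three Frameworks for Streaming Big Data Processing: Storm, Spark, and Samza"
Many distributed computing systems can process big data streams in real time or near real time.
That article briefly introduces three Apache frameworks and then attempts a quick, high-level overview of their similarities and differences.

Cloudera is about to release a new open-source storage engine, Kudu. The big data company Cloudera is developing Kudu as a large-scale open-source storage engine for storing and serving large volumes of varied unstructured data.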

分类: Distributed Systems

