Cassandra


Download

http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.5.1/apache-cassandra-0.5.1-bin.tar.gz

 

The hector-0.5.0-7.jar bundled with apache-cassandra-0.5.1 has a serious performance problem; replace it with hector-0.5.1-9.jar.

 

Resources

http://cassandra.apache.org/

http://wiki.apache.org/cassandra/FrontPage

 

 

Deployment

http://kauu.net/2010/02/27/cassandra%E5%88%9D%E4%BD%93%E9%AA%8C/

 

192.168.2.79

/home/bmb/apache-cassandra-0.5.1

 

 

Data Model Design

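- For all the columns in one row, you can define a sort order based on the column name (this is also the order in which the columns are stored). For example, when storing a user's friends, use the time each friend was added as the column name and sort the columns in reverse chronological order. This makes it easy to fetch the user's most recent friends.

Test: the columns of Keyspace1.Standard1 are sorted as BytesType, so no matter what order the set commands run in, get_slice returns the columns in the order a, b, c.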

set Keyspace1.Standard1['jsmith']['c'] = 'c'

set Keyspace1.Standard1['jsmith']['a'] = 'a'

set Keyspace1.Standard1['jsmith']['b'] = 'b'
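
The ordering behaviour in this test can be imitated with a small in-memory model. This is only a sketch, not Cassandra code: the class `Row` and its `set` / `get_slice` methods are hypothetical stand-ins for the real operations, and plain byte-wise comparison of the encoded names stands in for BytesType.

```python
import bisect


class Row:
    """Toy stand-in for one row whose columns stay sorted by name (byte order)."""

    def __init__(self):
        self._names = []    # column names as bytes, always kept sorted
        self._values = {}   # column name (bytes) -> value

    def set(self, name, value):
        key = name.encode("utf-8")
        if key not in self._values:
            bisect.insort(self._names, key)  # insert at the sorted position
        self._values[key] = value

    def get_slice(self, count=100, reverse=False):
        names = self._names[::-1] if reverse else self._names
        return [(n.decode("utf-8"), self._values[n]) for n in names[:count]]


row = Row()
for name in ("c", "a", "b"):   # same insertion order as the CLI test above
    row.set(name, name)

print(row.get_slice())         # [('a', 'a'), ('b', 'b'), ('c', 'c')]
```

Reading with `reverse=True` and a small `count` is the same idea as fetching the newest friends when the column names are times.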

 

- Super columns

Super columns are a great way to store one-to-many indexes to other records: make the sub column names TimeUUIDs (or whatever you'd like to use to sort the index), and have the values be the foreign key.

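For example, for a user's friends: the sub column name is the time the friend was added and the sub column value is the friend's ID, which can act as a foreign key into the table that holds the friends' details.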
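
A minimal sketch of this one-to-many index, using plain Python dictionaries rather than any Cassandra client. The names `users`, `user_friends` and `newest_friends`, as well as the sample IDs and timestamps, are made up for illustration.

```python
# Hypothetical "Users" table: row key -> columns
users = {
    "u1": {"name": "Alice"},
    "u2": {"name": "Bob"},
    "u3": {"name": "Carol"},
}

# Hypothetical friends index:
# row key -> {sub column name (time added) -> sub column value (friend ID)}
user_friends = {
    "jsmith": {
        1268000000: "u1",
        1268000500: "u3",
        1268000200: "u2",
    }
}


def newest_friends(user_key, count):
    """Return the `count` most recently added friends, newest first."""
    index = user_friends.get(user_key, {})
    newest_first = sorted(index.items(), reverse=True)[:count]
    # Follow each foreign key into the users table.
    return [(added_at, users[friend_id]["name"])
            for added_at, friend_id in newest_first]


print(newest_friends("jsmith", 2))
```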

- A composite key can be used as an equivalent of super columns, with time as the column name

Alternatively, we could preface the status keys with the user key, which has less temporal locality. If we used user_id:status_id as the status key, we could do range queries on the user fragment to get tweets-by-user, avoiding the need for a user_timeline super column.
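
The range query on the user fragment can be sketched as follows. The sketch assumes the row keys are available in sorted order (in Cassandra this depends on using an order-preserving partitioner); the sample keys and the helper `statuses_by_user` are hypothetical.

```python
import bisect

# Hypothetical status rows keyed by "user_id:status_id"
statuses = {
    "alice:0001": "first tweet",
    "alice:0002": "second tweet",
    "bob:0001": "hello",
    "carol:0001": "hi",
}
sorted_keys = sorted(statuses)   # stands in for keys stored in sorted order


def statuses_by_user(user_id):
    """Range scan over every key that starts with 'user_id:'."""
    start = user_id + ":"
    end = user_id + ";"          # ';' is the character right after ':' in ASCII
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, end)
    return [(key, statuses[key]) for key in sorted_keys[lo:hi]]


print(statuses_by_user("alice"))
```

Because the scan is bounded by the `user_id:` prefix, a shorter ID such as `al` does not accidentally match the keys of `alice`.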

- In column orientation, the column names are the data

- Using TimeUUIDs as column names and JSON as the column value format can solve some problems
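
As an illustration of that last point, here is a sketch that keeps events in a plain dictionary whose keys are time-based (version 1) UUIDs and whose values are JSON strings. The helpers `add_event` and `events_in_time_order` and the sample payloads are assumptions made for this example; no Cassandra client is involved.

```python
import json
import uuid


def add_event(row, payload):
    """Store `payload` under a new time-based column name."""
    name = uuid.uuid1()              # version-1 (time-based) UUID
    row[name] = json.dumps(payload)  # column value kept as a JSON string
    return name


def events_in_time_order(row):
    # Sort by the timestamp embedded in each version-1 UUID. Comparing the
    # raw bytes would not give time order, because the low bits of the
    # timestamp come first in the UUID layout.
    ordered = sorted(row.items(), key=lambda item: item[0].time)
    return [json.loads(value) for _, value in ordered]


row = {}
add_event(row, {"type": "status", "text": "hello"})
add_event(row, {"type": "friend_added", "friend_id": "u2"})

print(events_in_time_order(row))
```

The JSON value lets one column carry several fields, while the TimeUUID name keeps the columns unique and ordered by time.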


 

http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/

How Twitter uses Cassandra; a Twitter data model and a blog data model

http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model

A complete example provided by Digg

http://wiki.apache.org/cassandra/DataModel

http://wiki.apache.org/cassandra/CassandraLimitations

http://www.hellodba.net/2010/02/cassandra.html

(Chinese translation, with some discrepancies.) Introduces the Twitter data model; a useful reference

 

 

Modifying the schema definition: requires two restarts

https://issues.apache.org/jira/browse/CASSANDRA-44

Creating a Column Family dynamically, without restarting the server

http://github.com/NZKoz/cassandra_object

 

Starting a single-node cluster

Install JDK 6

   tar -zxvf cassandra-$VERSION.tgz

   cd cassandra-$VERSION

   sudo mkdir -p /var/log/cassandra

   sudo chown -R `whoami` /var/log/cassandra

   sudo mkdir -p /var/lib/cassandra

   sudo chown -R `whoami` /var/lib/cassandra

Change the startup (JMX) port in bin/cassandra.in.sh (-Dcom.sun.management.jmxremote.port=8080)

bin/cassandra -f

View the log

tail -f /var/log/cassandra/system.log

Connect with the client

cd /home/bmb/apache-cassandra-0.5.1

bin/cassandra-cli --host 192.168.2.79  --port 9160

 

  cassandra> set Keyspace1.Standard1['jsmith']['first'] = 'John'

  Value inserted.

  cassandra> set Keyspace1.Standard1['jsmith']['last'] = 'Smith'

  Value inserted.

  cassandra> set Keyspace1.Standard1['jsmith']['age'] = '42'

  Value inserted.

  cassandra> get Keyspace1.Standard1['jsmith']

    (column=age, value=42; timestamp=1249930062801)

    (column=first, value=John; timestamp=1249930053103)

    (column=last, value=Smith; timestamp=1249930058345)

  Returned 3 rows.

  cassandra>

Java clients

- Raw Thrift

http://apache.freelamp.com/incubator/thrift/0.2.0-incubating/thrift-0.2.0-incubating.tar.gz

Wrapper

http://github.com/charliem/OCM

 

Building the Java Thrift library manually

cd D:\7g\Personal\Resources\Architecture\Cassandra\thrift-0.2.0\lib\java

ant

 

- hector

http://prettyprint.me/2010/02/23/hector-a-java-cassandra-client/

http://github.com/rantav/hector/downloads

 

- OCM

http://github.com/charliem/OCM/downloads

 

 

 

Git

Download Git

http://kernel.org/pub/software/scm/git/git-1.7.0.3.tar.gz

Install

cd /home/bmb/apache-cassandra-0.5.1/git-1.7.0.3

./configure

make

make install

 

Git client

http://msysgit.googlecode.com/files/msysGit-fullinstall-1.7.0.2-preview20100309.exe

D:\7g\Personal\Resources\Architecture\Cassandra\msysgit\msysgit\git-cmd.bat

Check out OCM

git clone http://github.com/charliem/OCM.git

 

 

http://tortoisegit.googlecode.com/files/TortoiseGit-1.3.6.0-32bit.msi

 

 

 

Cluster configuration

http://pan-java.iteye.com/blog/604672

 

192.168.5.11

/u/iic/bmb/apache-cassandra-0.5.1

Edit the files under conf, replacing the default paths:

/var/log/cassandra -> /u/iic/bmb/apache-cassandra-0.5.1/log

/var/lib/cassandra -> /u/iic/bmb/apache-cassandra-0.5.1/log

Set JAVA_HOME in bin/cassandra

export JAVA_HOME=/u/iic/bmb/jdk6

 

 

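192.168.5.12 (same directory configuration as 5.11; both use 2.79 as the seed)

Note: when deploying the cluster, do not copy the whole directory from 5.11 to 5.12. If you do, both nodes end up with the same token, so 5.11 and 5.12 collide in the ring. Fix: delete the data and commit log directories.

Also, every address must be configured as an actual IP address; if localhost is configured, the ring will be incomplete.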
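
To see why two nodes must not share a token, here is a toy ring in Python. It is an assumption-laden sketch: the `Ring` class, the MD5-based `token_for` helper and the token values are invented for illustration, and how Cassandra itself reacts to a duplicate token is not modelled. The sketch simply raises an error to make the conflict visible.

```python
import bisect
import hashlib


def token_for(key):
    """Hash a row key to an integer token (MD5 chosen for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self):
        self._owners = {}   # token -> node address

    def join(self, node, token):
        if token in self._owners:
            raise ValueError(
                "token already owned by %s; %s cannot join with the same token"
                % (self._owners[token], node))
        self._owners[token] = node

    def node_for(self, key):
        """Return the node owning the first token >= the key's token."""
        tokens = sorted(self._owners)
        i = bisect.bisect_left(tokens, token_for(key))
        return self._owners[tokens[i % len(tokens)]]   # wrap around the ring


ring = Ring()
ring.join("192.168.5.11", 2 ** 126)
try:
    ring.join("192.168.5.12", 2 ** 126)   # copied directory, same token
except ValueError as err:
    print(err)
ring.join("192.168.5.12", 2 ** 127)       # fresh node, its own token
print(ring.node_for("jsmith"))
```

Each position on the ring can have only one owner, which is why a node started from a copied data directory clashes with the node it was copied from.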

 

192.168.2.79

/home/bmb/apache-cassandra-0.5.1

 

bin/cassandra -f

 

Using OCM

- Define the data structure: D:\7g\Personal\Resources\Architecture\Cassandra\Client\OCM Compiler\OCMSpecSample.txt

- Run the com.kissintellignetsystems.ocm.compiler.Compiler class with the following command-line arguments to generate the Java objects: "OCMSpecSample.txt", "keyspace1", "Java", "mynamespace", "output/"

- Test example: D:\7g\Personal\Resources\Architecture\Cassandra\Client\Output Languages\Java\TestHarness

CreatTest.java

 

 

get keyspace1.Users['charlie']

bin/cassandra-cli --host 192.168.2.79  --port 9160

bin/cassandra-cli --host 192.168.5.11  --port 9160

bin/cassandra-cli --host 192.168.5.12  --port 9160

 

 

 

 

View the cluster's node information

bin/nodeprobe -host localhost -port 8090 ring

 

Hadoop & Cassandra

Loading data from Hadoop into Cassandra through Binary Memtable

http://github.com/lenn0x/Cassandra-Hadoop-BMT/blob/master/src/java/org/digg/CassandraBulkLoader.java

 

http://blog.csdn.net/wdwbw/archive/2010/03/10/5366739.aspx

http://www.roadtofailure.com/2009/10/29/hbase-vs-cassandra-nosql-battle/

 

Lucene + Cassandra

Lucandra

 

 
