HBase: Deleting All Data in a Specified Column

Background

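A recent project overhaul required refreshing all the data in one column of an HBase table. The new values have to be computed by a daily scheduled job, and each day's result depends on several previous days of history, so the existing contents of that column had to be cleared out completely first.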

As we know, during table creation we would define only the column family not the column qualifier. So the column qualifier will be created on the fly and it depends on the need. Which means you won’t be having a column qualifier for all rows in that table. So there is no provision to delete column qualifiers for a particular column family.

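Roughly speaking: when a table is created, HBase is only concerned with the row key and the column families; nothing at creation time declares how many column qualifiers (cq) there will be. This follows from HBase's column-family-oriented storage model, and it is why the HBase API offers no single call that deletes all the data under one column across every row.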
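
To make the point about qualifiers being created on the fly concrete, here is a toy model in plain Scala. It is not the HBase API: the object `SparseColumns`, the sample rows, and the helper `dropQualifier` are all hypothetical, and the nested `Map` only imitates how each row carries its own set of qualifiers within one column family.

```scala
object SparseColumns {
  // Toy model (not the HBase API): row key -> (qualifier -> value) within one column family.
  type Row = Map[String, String]

  val table: Map[String, Row] = Map(
    "row1" -> Map("name" -> "a", "score" -> "1"),
    "row2" -> Map("name" -> "b"),               // this row never wrote a "score" qualifier
    "row3" -> Map("score" -> "3", "tag" -> "x")
  )

  // Dropping a qualifier means visiting every row and removing it where present;
  // there is no table-level list of qualifiers to delete from.
  def dropQualifier(t: Map[String, Row], cq: String): Map[String, Row] =
    t.map { case (rowKey, cells) => rowKey -> (cells - cq) }
}
```

Because no schema lists the qualifiers, the only way to clear one is to walk the rows, which is the same shape as a scan followed by per-row deletes.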

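Deletes are issued per row, and there are two ways to delete data:
1. Delete a specific cell identified by row key, cf, cq, and timestamp (ts)
2. Delete all columns under a cf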
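
The two deletion modes above map onto the client's `Delete` class roughly as follows. This is a minimal sketch assuming the HBase 1.x or later client library is on the classpath; the object name `DeleteModes` and both helper names are hypothetical, chosen only for illustration.

```scala
import org.apache.hadoop.hbase.client.Delete
import org.apache.hadoop.hbase.util.Bytes

object DeleteModes {
  // Way 1: delete one specific cell version, identified by row key, cf, cq and timestamp.
  def deleteCell(rowKey: String, cf: String, cq: String, ts: Long): Delete = {
    val d = new Delete(Bytes.toBytes(rowKey))
    d.addColumn(Bytes.toBytes(cf), Bytes.toBytes(cq), ts)
    d
  }

  // Way 2: delete every column under one column family, for the given row only.
  def deleteFamily(rowKey: String, cf: String): Delete = {
    val d = new Delete(Bytes.toBytes(rowKey))
    d.addFamily(Bytes.toBytes(cf))
    d
  }
}
```

Either object is then passed to `Table.delete`. Both are bound to a single row key, which is why clearing one column for the whole table needs a scan to find the affected rows first.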

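To make this requirement easier to implement, I am posting my code here for anyone who needs it later. It scans for the rows that contain the column and deletes the column from each of them.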

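import java.util

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory, Delete, Scan}
import org.apache.hadoop.hbase.util.Bytes

object HbaseConnect {
  // Placeholder: replace with the ZooKeeper quorum of your own cluster.
  private val zookeeperHosts = "zk-host1,zk-host2,zk-host3"

  def main(args: Array[String]): Unit = {
    val cf = Bytes.toBytes("cf_name")
    val cq = Bytes.toBytes("cq_name")

    val conn = getHbaseConfig()
    // Get the table from the existing connection instead of opening a second one.
    val t = conn.getTable(TableName.valueOf("tableName"))
    try {
      // Restrict the scan to the target column so only rows containing it come back.
      val scan = new Scan()
      scan.addColumn(cf, cq)
      val scanner = t.getScanner(scan)
      val deletes = new util.ArrayList[Delete]()
      try {
        val it = scanner.iterator()
        while (it.hasNext) {
          val v = it.next()
          if (v.containsColumn(cf, cq)) {
            val delete = new Delete(v.getRow)
            // addColumns removes all versions of the column (it replaces the deprecated deleteColumns).
            delete.addColumns(cf, cq)
            deletes.add(delete)
          }
        }
      } finally {
        scanner.close()
      }
      // All deletes are held in memory; for very large tables, send them in smaller batches.
      t.delete(deletes)
    } finally {
      t.close()
      conn.close()
    }
  }

  // Despite its name, this returns an open Connection, which the caller must close.
  def getHbaseConfig(): Connection = {
    val configuration = HBaseConfiguration.create()
    configuration.set("hbase.zookeeper.quorum", zookeeperHosts) // IPs of the cluster machines
    configuration.set("hbase.defaults.for.version.skip", "true")
    ConnectionFactory.createConnection(configuration)
  }
}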
