# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
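As the comments warn, /tmp is only good for a throwaway test; for anything longer-lived, point dataDir at a persistent directory (the path below is purely illustrative):

# assumption: any durable directory works; /var/lib/zookeeper is only an example
dataDir=/var/lib/zookeeper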
Start the server with ./zkServer.sh start; the console should print something like:
ZooKeeper JMX enabled by default
Using config: /home/mi/wanglei/soft/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
This indicates that ZooKeeper started successfully.
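If you want a sanity check before the telnet test below, the bundled status command works too:

./zkServer.sh status

For a standalone instance it should report Mode: standalone.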
3.5 Verify the ZooKeeper instance
In another terminal, run:
telnet 127.0.0.1 2181
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
Then type stat; it fails with an error:
stat is not executed because it is not in the whitelist.
Connection closed by foreign host.
One fix is to modify zkServer.sh. The relevant section looks like this:
...
ZOOMAIN="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=$JMXPORT -Dcom.sun.management.jmxremote.authenticate=$JMXAUTH -Dcom.sun.management.jmxremote.ssl=$JMXSSL -Dzookeeper.jmx.log4j.disable=$JMXLOG4J org.apache.zookeeper.server.quorum.QuorumPeerMain"
fi
else
echo "JMX disabled by user request" >&2
ZOOMAIN="org.apache.zookeeper.server.quorum.QuorumPeerMain"
fi
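The excerpt above is the stock script; in ZooKeeper 3.5.x the actual change needed is to put stat back on the four-letter-word whitelist. Assuming this is a test instance where exposing every four-letter command is acceptable, one common way is to add the corresponding system property to ZOOMAIN, or to set the option in zoo.cfg instead:

# in zkServer.sh: add the whitelist property to the java options (sketch; '*' enables all 4lw commands)
ZOOMAIN="-Dzookeeper.4lw.commands.whitelist=* ${ZOOMAIN}"

# or in conf/zoo.cfg, list only the commands you need
4lw.commands.whitelist=stat, ruok, conf, isro

After restarting ZooKeeper, typing stat in the telnet session (or running echo stat | nc 127.0.0.1 2181) should return server statistics instead of the whitelist error.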
With ZooKeeper running (and a Kafka broker registered with it), the following Spark Streaming job subscribes to the test topic and counts words in each 20-second batch:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object KafkaWordCount {
  def main(args: Array[String]): Unit = {
    // Kafka broker list used as bootstrap.servers (despite the name, not a ZooKeeper quorum)
    val zkQuorum = "xxx:9092"
    val topics = Array("test")
    val kafkaMap = Map[String, Object](
      "bootstrap.servers" -> zkQuorum,
      // deserializers must match the [String, String] type parameters used below
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "test",
      "auto.offset.reset" -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean),
      "session.timeout.ms" -> "30000"
    )

    val conf = new SparkConf()
      .setAppName(KafkaWordCount.getClass.getSimpleName)
      .setMaster("local[4]")
    // 20-second micro-batches
    val ssc = new StreamingContext(conf, Seconds(20))

    val consumer = ConsumerStrategies.Subscribe[String, String](topics, kafkaMap)
    // direct stream: one Spark partition per Kafka partition, no receiver
    val lines = KafkaUtils.createDirectStream(
      ssc,
      LocationStrategies.PreferConsistent,
      consumer
    ).map(_.value())

    // classic word count over each batch
    val wordCount = lines.flatMap(_.split(" "))
      .map(key => (key, 1L))
      .reduceByKey(_ + _)
    wordCount.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
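One thing the example glosses over: with enable.auto.commit set to false, consumed offsets are never committed anywhere, so a restart always falls back to auto.offset.reset. If the consumer group's progress should survive restarts, the spark-streaming-kafka-0-10 integration commits offsets explicitly on the stream returned by createDirectStream (i.e. before the .map). A minimal sketch, where stream stands for that original InputDStream:

import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

// stream is the value returned by KafkaUtils.createDirectStream, before .map(_.value())
stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // ... process the batch here ...
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}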
Run it in the IDE and you can see the output in the console:
20/05/12 15:16:40 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
20/05/12 15:16:40 INFO TaskSetManager: Finished task 0.0 in stage 19.0 (TID 22) in 5 ms on localhost (executor driver) (1/3)
20/05/12 15:16:40 INFO TaskSetManager: Finished task 1.0 in stage 19.0 (TID 23) in 4 ms on localhost (executor driver) (2/3)
20/05/12 15:16:40 INFO Executor: Finished task 2.0 in stage 19.0 (TID 24). 1718 bytes result sent to driver
20/05/12 15:16:40 INFO TaskSetManager: Finished task 2.0 in stage 19.0 (TID 24) in 5 ms on localhost (executor driver) (3/3)
20/05/12 15:16:40 INFO TaskSchedulerImpl: Removed TaskSet 19.0, whose tasks have all completed, from pool
20/05/12 15:16:40 INFO DAGScheduler: ResultStage 19 (print at KafkaWordCount.scala:40) finished in 0.006 s
20/05/12 15:16:40 INFO DAGScheduler: Job 9 finished: print at KafkaWordCount.scala:40, took 0.011706 s
20/05/12 15:16:40 INFO JobScheduler: Finished job streaming job 1589267800000 ms.0 from job set of time 1589267800000 ms
20/05/12 15:16:40 INFO JobScheduler: Total delay: 0.088 s for time 1589267800000 ms (execution: 0.064 s)
20/05/12 15:16:40 INFO ShuffledRDD: Removing RDD 19 from persistence list
20/05/12 15:16:40 INFO BlockManager: Removing RDD 19
20/05/12 15:16:40 INFO MapPartitionsRDD: Removing RDD 18 from persistence list
20/05/12 15:16:40 INFO BlockManager: Removing RDD 18
20/05/12 15:16:40 INFO MapPartitionsRDD: Removing RDD 17 from persistence list
20/05/12 15:16:40 INFO BlockManager: Removing RDD 17
20/05/12 15:16:40 INFO MapPartitionsRDD: Removing RDD 16 from persistence list
20/05/12 15:16:40 INFO BlockManager: Removing RDD 16
20/05/12 15:16:40 INFO KafkaRDD: Removing RDD 15 from persistence list
20/05/12 15:16:40 INFO BlockManager: Removing RDD 15
20/05/12 15:16:40 INFO ReceivedBlockTracker: Deleting batches:
20/05/12 15:16:40 INFO InputInfoTracker: remove old batch metadata: 1589267760000 ms
-------------------------------------------
Time: 1589267800000 ms
-------------------------------------------
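To see non-trivial word counts in those batches, something has to be producing messages to the test topic. The console producer that ships with Kafka is a quick way to type a few test lines (the flag is --broker-list on older releases and --bootstrap-server on newer ones; xxx:9092 is the same placeholder broker as in the code):

bin/kafka-console-producer.sh --broker-list xxx:9092 --topic test

Whatever is typed there shows up as (word, count) pairs in the next 20-second batch printed by the job.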