Some handy commands

Miscellaneous odds and ends, noted here because they may come in handy later. New ones will be appended as I come across them.

Convert a GBK-encoded file to UTF-8:

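# file1 is the file to convert; file(1) usually reports GBK text as "ISO-8859 text"
scode=GBK
dcode=UTF-8
coding=$(file -b "$file1" | cut -d ' ' -f1)
if [ "$coding" = "ISO-8859" ]; then
	tmpfile=$(mktemp)  # add "local" in front when this runs inside a function
	iconv -f "$scode" -t "$dcode" "$file1" > "$tmpfile" && mv "$tmpfile" "$file1"
fi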
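
The check above depends on file(1) guessing "ISO-8859", which is only a heuristic. As a rough sketch of an alternative, the function below converts whenever the file is not already valid UTF-8. The name convert_to_utf8 and its GBK default are my own choices for illustration, and it assumes that a file which fails UTF-8 validation really is in the given source encoding.

```bash
# convert_to_utf8 FILE [SOURCE_ENCODING]   (hypothetical helper, default source is GBK)
convert_to_utf8() {
    local file=$1 src=${2:-GBK} tmp
    # iconv exits non-zero on invalid input, so this succeeds only for valid UTF-8
    if iconv -f UTF-8 -t UTF-8 "$file" >/dev/null 2>&1; then
        return 0    # already UTF-8 (or plain ASCII): nothing to do
    fi
    tmp=$(mktemp) || return 1
    if iconv -f "$src" -t UTF-8 "$file" > "$tmp"; then
        # write back through the original path so its permissions are kept
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would look like: convert_to_utf8 data.txt GBK. Running it twice is harmless, because the second call sees valid UTF-8 and returns early.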


Convert to UTF-8 with BOM:

echo -ne '\xEF\xBB\xBF' > names.utf8.csv
iconv -f CP1252 -t UTF-8  names.csv >> names.utf8.csv
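
If the input is already UTF-8 and may or may not carry a BOM, prepending one blindly can leave two. A minimal sketch that only adds the BOM when it is missing (add_bom is a made-up name, and no transcoding is done here, so the input is assumed to be UTF-8 already):

```bash
# add_bom SRC DST   (hypothetical helper): copy SRC to DST, ensuring exactly one leading BOM
add_bom() {
    local src=$1 dst=$2
    # compare the first three bytes against EF BB BF
    if [ "$(head -c 3 "$src" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]; then
        cat "$src" > "$dst"                          # BOM already there
    else
        { printf '\357\273\277'; cat "$src"; } > "$dst"   # octal escapes for EF BB BF
    fi
}
```

printf with octal escapes is used instead of echo -ne because the latter is not understood by every sh.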

Show a process's start time and how long it has been running:

ps -eo pid,lstart,etime | grep xxx
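
Grepping the full listing can also match unrelated lines that happen to contain the same digits. When the PID is known, selecting it directly is more exact. A small sketch, assuming the procps ps found on most Linux systems (proc_times is a name I made up):

```bash
# proc_times PID   (hypothetical helper): print "PID  start time  elapsed time" for one process
proc_times() {
    # an empty header after "=" suppresses the header line for that column
    ps -p "$1" -o pid= -o lstart= -o etime=
}
```

For example, proc_times $$ shows when the current shell was started.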

Find the most CPU-intensive thread within a process:
ps -Lfp pid  # list all threads in the process: -L shows threads, -f full-format listing, -p selects by process ID
ps -mp pid -o THREAD,tid,time


Strip the BOM from a UTF-8 file that has one:
Any of the following works:

sed '1s/^\xef\xbb\xbf//' INFILE > OUTFILE
awk '{if(NR==1)sub(/^\xef\xbb\xbf/,"");print}' INFILE > OUTFILE
tail --bytes=+4 INFILE > OUTFILE  ## does not check whether a BOM is actually present
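
The tail variant drops the first three bytes unconditionally, so it corrupts files that have no BOM. A possible guarded version, as a sketch (strip_bom is a hypothetical name):

```bash
# strip_bom SRC DST   (hypothetical helper): copy SRC to DST without a leading UTF-8 BOM
strip_bom() {
    local src=$1 dst=$2
    if [ "$(head -c 3 "$src" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]; then
        tail -c +4 "$src" > "$dst"    # skip the 3 BOM bytes, start at byte 4
    else
        cat "$src" > "$dst"           # no BOM: copy unchanged
    fi
}
```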

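Find the most CPU-intensive thread in a Java process (with jstack):
top -Hp pid  # find the ID of the thread using the most CPU in the process
printf "%x\n" tid  # convert the thread ID to hexadecimal
jstack pid | grep hex_tid  # locate that thread in the stack dump; hex_tid is the value printed above, shown by jstack as nid=0x...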
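
The three manual steps can be glued together. The following is only a sketch under a few assumptions: the helper names are invented, ps comes from procps, and ps reports CPU usage averaged over the thread's lifetime, so its ranking may differ from the live view in top.

```bash
# tid_to_nid TID   (hypothetical helper): format a decimal thread ID the way jstack prints it
tid_to_nid() {
    printf 'nid=0x%x\n' "$1"
}

# hottest_thread PID   (hypothetical helper): print the TID with the highest %CPU according to ps
hottest_thread() {
    ps -L -p "$1" -o tid= -o pcpu= --sort=-pcpu | awk 'NR==1 {print $1}'
}
```

With these, something like jstack "$pid" | grep -A 20 "$(tid_to_nid "$(hottest_thread "$pid")")" would print the start of the busiest thread's stack; the 20 lines of context are an arbitrary choice.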

Dump a Java process's heap with jmap and analyze it with jhat:
jmap -dump:format=b,file=/tmp/dump.dat 21711  
jhat -J-Xmx512m -port 9998 /tmp/dump.dat

Commands to start the Storm daemons:
nohup ./storm nimbus >/dev/null 2>&1 &
nohup ./storm supervisor >/dev/null 2>&1 &
nohup ./storm ui >/dev/null 2>&1 &
nohup ./storm logviewer >/dev/null 2>&1 &

Commands to start the JStorm daemons:
nohup $JSTORM_HOME/bin/jstorm nimbus >/dev/null 2>&1 &
nohup $JSTORM_HOME/bin/jstorm supervisor >/dev/null 2>&1 &

Commands to kill the Storm processes:
kill `ps aux | egrep '(daemon\.nimbus)|(storm\.ui\.core)' | fgrep -v egrep | awk '{print $2}'`
kill `ps aux | fgrep storm | fgrep -v 'fgrep' | awk '{print $2}'`
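
Where pgrep is available (it ships with procps), it matches against the full command line with -f and never reports itself, so the extra fgrep -v filter is not needed. A sketch with invented helper names:

```bash
# find_pids PATTERN   (hypothetical helper): print PIDs whose full command line matches the regex
find_pids() {
    pgrep -f "$1"
}

# stop_procs PATTERN   (hypothetical helper): send SIGTERM to every match, or complain if none
stop_procs() {
    local pids
    pids=$(find_pids "$1") || { echo "no process matches: $1" >&2; return 1; }
    kill $pids    # unquoted on purpose: one argument per PID
}
```

For the Storm case this might be called as stop_procs 'daemon\.nimbus|storm\.ui\.core'. It is worth running find_pids first to see what would be hit, since -f also matches any other process whose command line contains the pattern.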

Commands to start the Hive services:
nohup ./hive --service hiveserver2 > hiveserver2.log 2>&1  &
nohup ./hive --service metastore > metastore.log 2>&1 &
nohup ./hive --service hwi > hwi.log 2>&1 &

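Find files under a directory that contain a given string (prints each matching line with its file name and line number):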
find . -type f -name "*.sh" -exec grep -nH "xxxxxx" {} \;
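
When only the list of file names is wanted, grep -l prints each matching file once instead of every matching line. A minimal sketch (files_containing is a made-up name; -F treats the search text as a fixed string rather than a regex):

```bash
# files_containing DIR GLOB TEXT   (hypothetical helper): list files under DIR matching GLOB that contain TEXT
files_containing() {
    local dir=$1 glob=$2 text=$3
    # "{} +" passes many files to each grep call instead of starting one grep per file
    find "$dir" -type f -name "$glob" -exec grep -lF -- "$text" {} +
}
```

For example: files_containing . '*.sh' 'xxxxxx'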

Free memory on Linux by dropping the page cache (run as root):
sync && echo 3 > /proc/sys/vm/drop_caches

Show the lines in a file that contain a given string, plus a set number of lines before and after each match:
grep -n -A 10 -B 10 "xxxx" file

tcpdump packet capture examples:
tcpdump -i eth1 -XvvS -s  0 tcp port 10020
tcpdump -S -nn -vvv -i eth1 port 10020

Spark job submission example (spark-submit options must come before the application jar; anything after the jar is passed to the application as arguments):
./spark-submit --deploy-mode cluster --master spark://10.49.133.77:6066 --jars hdfs://10.49.133.77:9000/spark/guava-14.0.1.jar --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:-UseGCOverheadLimit" --class spark.itil.video.ItilData hdfs://10.49.133.77:9000/spark/sparktest2-0.0.1-jar-with-dependencies.jar

Example of starting a Spark worker:
./spark-daemon.sh start org.apache.spark.deploy.worker.Worker 1 --webui-port 8081 --port 8092 spark://100.65.32.215:8070,100.65.32.212:8070

Spark SQL examples:
export SPARK_CLASSPATH=$SPARK_CLASSPATH:/data/webitil/hive/lib/mysql-connector-java-5.0.8-bin.jar
SPARK_CLASSPATH=$SPARK_CLASSPATH:/data/webitil/hive/lib/mysql-connector-java-5.0.8-bin.jar ./spark-sql --master spark://10.49.133.77:8070
./spark-sql --master spark://10.49.133.77:8070 --jars /data/webitil/hive/lib/mysql-connector-java-5.0.8-bin.jar

./spark-shell --jars /data/webitil/hive/lib/mysql-connector-java-5.0.8-bin.jar
./spark-shell --packages com.databricks:spark-csv_2.11:1.4.0
ADD_JARS=../elasticsearch-hadoop-2.1.0.Beta1/dist/elasticsearch-spark_2.10-2.1.0.Beta1.jar ./bin/spark-shell

./spark-shell
import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._
val url = "jdbc:mysql://10.198.30.118:3311/logplatform"
val table = " (select * from t_log_stat limit 5) as tb1"
val reader = sqlContext.read.format("jdbc")
reader.option("url", url)
reader.option("dbtable", table)
reader.option("driver", "com.mysql.jdbc.Driver")
reader.option("user", "logplat_w")
reader.option("password", "rm5Bey6x")
val df = reader.load()
df.show()


Example of installing your own jar into the local Maven repository:
mvn install:install-file -DgroupId=com.tencent.omg.itil.net -DartifactId=IpServiceJNI -Dversion=1.0 -Dpackaging=jar -Dfile=d:\storm\IpServiceJNI-1.0.jar

