The small files and code are in the QQ group (952211102); a video walkthrough is at the end of the post.
Download link: https://github.com/Embedding/Chinese-Word-Vectors
Download and unzip.
path = "D:/xxx/sgns.target.word-character.char1-2.dynwin5.thr10.neg5.dim300.iter5"
with open(path, "r", encoding="utf-8") as f:
    # Read only the first 10K characters instead of the whole file; a sample is enough
    chunk_data = f.read(1024 * 10)
print(chunk_data)
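The output shows that the data format is:
number-of-characters/words 300 (the vector dimension)
character/word 1, then the values of vector 1 ...
character/word 2, then the values of vector 2 ...
and so on.
The data looks roughly like this: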
636086 300
, -0.225854 0.107560 0.197237 -0.163468 0.090813 0.040628 0.176729 -0.011261 -0.053033 0.037572 -0.155545 0.053847 0.131007 0.250081 -0.071398 -0.089812 -0.034247 0.078562 0.023870 0.159746 0.100427 0.021786 0.266321 0.004339 0.105988 -0.002758 0.119828 0.004190 -0.154152 0.087963 0.179135 0.041696 -0.150765 0.112602 -0.003246 -0.115960 0.042190 0.108845 0.138592 -0.270801 0.276069 -0.377507 -0.133841 0.225290 -0.084972 -0.046473 -0.163377 -0.129677 0.178721 -0.008124 -0.037467 0.291655 0.144279 -0.118583 0.046584 0.021907 0.126214 0.054273 0.048182 0.079335 -0.126211 0.045360 -0.099212 -0.016365 -0.009512 -0.038277 -0.152457 0.013738 -0.210855 -0.151658 0.068768 0.310373 0.086278 0.065519 0.089834 0.264020 0.206357 -0.046300 0.111625 -0.112923 0.025023 0.266332 0.238958 -0.112658 0.037161 -0.228547 0.048586 0.243026 -0.143488 0.045040 0.028236 0.096553 0.011036 0.119268 0.068397 -0.000245 -0.011066 -0.096202 -0.020504 -0.104224 -0.152824 -0.126277 0.003383 0.146738 0.034192 -0.063062 -0.100550 0.081958 0.297142 -0.095431 0.047876 0.045076 0.061213 -0.103860 -0.046096 -0.108332 0.083888 -0.170114 0.091852 -0.111302 0.036355 0.048322 0.048027 -0.133125 -0.173485 -0.062455 0.133545 0.264515 -0.199027 -0.134663 -0.176003 -0.073278 -0.071808 -0.067675 0.065894 -0.061778 -0.207889 -0.035713 0.129135 0.160631 0.064196 0.036111 -0.037556 -0.123741 0.070222 -0.011605 0.095488 -0.026130 0.176827 0.135286 -0.091638 -0.196278 0.135840 -0.067259 -0.066008 -0.207676 -0.178852 -0.009413 -0.113950 0.196629 -0.114693 -0.026324 -0.141586 0.197364 -0.078522 -0.162726 0.052150 0.003707 0.034934 -0.067691 -0.014802 0.025208 -0.012278 0.014441 0.015678 0.044566 0.007233 -0.030680 -0.075503 0.143719 0.075201 0.141424 -0.038741 0.120257 0.066381 0.028938 -0.026662 0.052459 0.103320 -0.057982 0.058221 0.058726 -0.196115 -0.118826 -0.017446 0.047007 0.301567 0.037915 -0.147273 0.340786 -0.015451 -0.004354 0.009008 -0.036533 0.171037 0.224140 -0.119820 0.302488 -0.036199 -0.200074 0.108383 
0.048416 0.059023 0.092124 0.024632 0.049616 -0.205193 0.018068 -0.330599 0.047790 -0.031321 -0.066260 -0.077764 0.274229 -0.157499 -0.090307 -0.057102 0.099106 0.094118 -0.152254 -0.012646 0.065620 0.032115 0.122921 0.051477 0.019677 0.321413 0.100348 -0.195362 0.033550 0.171877 -0.054965 -0.090468 -0.046022 -0.023165 0.142064 0.160361 -0.100200 0.114204 -0.251116 -0.020862 0.259914 0.010826 -0.333081 -0.029773 -0.106668 -0.066178 -0.055028 0.032080 0.081552 0.237320 0.034470 0.116792 -0.054930 0.035778 -0.171559 -0.077482 0.091026 -0.050017 0.080905 -0.356599 -0.044822 -0.058992 0.191774 0.001098 0.036497 -0.047119 -0.051166 0.028191 0.230730 -0.093177 -0.086363 -0.153171 -0.000628 0.028436 -0.117305 -0.154677 -0.030172 -0.073724 0.022715 -0.036977 0.059616 0.153312 -0.103805 0.231885 0.247361 -0.134653 0.142064 0.144121 0.005673
的 -0.242538 0.100439 0.129818 -0.104647 -0.028103 0.058042 0.190883 0.153426 0.034308 0.071330 -0.000116 0.113657 0.097657 0.030841 0.060856 0.056382 -0.195434 0.031622 0.003772 0.059192 -0.021331 -0.109444 0.192544 0.012395 0.107907 0.179732 0.216159 -0.004080 -0.127886 0.022992 0.169664 0.191425 -0.022217 -0.095708 0.075299 -0.169385 0.042564 0.002497 0.033388 -0.279786 0.135520 0.028730 -0.006901 0.183539 0.175054 0.166405 0.106541 -0.030475 0.122642 -0.196793 0.247228 0.058643 0.177309 -0.197690 -0.088260 0.094268 0.117994 0.031037 0.069194 0.000642 -0.066777 0.101824 -0.002390 0.094974 0.121026 0.153325 -0.304356 0.173549 -0.093552 0.029033 0.101660 0.149433 0.072934 0.143490 0.083457 0.241503 -0.070801 -0.088046 0.003713 -0.280668 -0.001448 0.003456 0.101584 0.131760 -0.223845 -0.309329 0.016964 0.347164 0.132431 -0.111628 -0.138338 -0.064733 0.007556 0.122302 0.184578 -0.078595 -0.140727 -0.192051 -0.086686 -0.038096 -0.097754 -0.052457 -0.018865 0.045217 0.132015 0.010384 -0.070730 -0.116558 0.109532 -0.159887 -0.024422 0.011281 -0.006494 0.021118 -0.021956 0.045676 0.285816 -0.096120 0.045639 0.046192 -0.194560 0.143332 0.013284 0.181637 -0.135146 -0.213470 -0.122927 0.139591 -0.174840 -0.230727 -0.336673 0.028399 0.133554 -0.022328 0.263509 -0.135144 -0.085525 -0.068479 0.147214 0.148020 -0.165846 0.096487 0.216477 -0.130104 0.220343 0.022198 0.081715 0.190736 -0.112020 0.124746 -0.042398 -0.100392 0.217173 -0.025453 -0.261025 -0.122996 -0.065484 0.169312 -0.274064 0.073796 -0.042404 0.003309 -0.026870 0.224915 -0.086456 -0.116525 0.077721 -0.003964 0.094634 -0.345002 -0.055975 0.189918 -0.206350 -0.058314 0.003844 -0.008447 -0.021032 0.057915 0.084640 0.098421 0.103423 0.139302 0.069879 0.235352 -0.012435 -0.214576 0.140327 -0.096340 -0.000419 0.145002 -0.118673 -0.067662 -0.314651 0.103676 0.213736 0.119828 -0.093621 0.300272 -0.054337 0.236886 -0.066297 0.070531 0.055797 -0.052518 -0.042077 0.220657 -0.085996 0.439905 0.213758 -0.013311 0.172127 
-0.072370 0.025413 0.129522 0.082697 0.258775 -0.146191 -0.015176 -0.039916 0.097016 0.134828 -0.051018 0.105613 0.200699 -0.085717 -0.149180 -0.140295 -0.099351 -0.072185 0.008729 0.114468 -0.014246 0.211366 0.059199 0.042156 0.000897 0.234377 0.119545 -0.052635 -0.034904 -0.053223 -0.105491 -0.097634 -0.044138 0.039147 0.025329 0.121565 0.042493 0.119284 0.007208 0.110501 0.105863 0.014750 -0.279106 -0.178406 0.028334 -0.144416 0.213126 0.025383 0.247148 0.346476 -0.046433 0.199948 0.019231 0.053996 -0.044669 -0.117902 -0.048377 -0.114109 0.047294 -0.266003 -0.155737 0.022962 -0.032529 -0.112454 0.065954 0.005879 0.160480 -0.098461 0.098248 -0.110154 -0.067323 -0.102438 -0.100263 -0.001491 -0.205655 -0.219179 0.047583 -0.187761 0.135312 0.035478 0.002708 0.039958 -0.083279 0.195324 0.142303 -0.079450 0.133499 0.202978 -0.277668
。 -0.283826 -0.052346 0.080995 -0.139234 0.153747 0.052080 0.152875 0.159906 -0.100812 0.051320 -0.103536 -0.089473 0.056333 0.140998 -0.062160 -0.124558 -0.066892 -0.009883 0.091323 0.173555 -0.096824 0.053216 0.320953 -0.072564 0.084597 -0.016583 0.137165 0.005142 -0.181158 0.144163 0.155581 0.165243 -0.017603 -0.001569 -0.008859 -0.074905 0.062937 -0.126123 0.157542 -0.174461 0.277550 -0.226569 0.105378 0.384084 0.012730 0.064785 0.061948 0.034733 0.245869 -0.052040 -0.061160 0.229989 0.137800 0.058283 0.062240 0.165518 0.029029 0.008543 0.159878 0.128581 -0.132286 -0.042042 -0.064327 -0.029669 -0.012382 0.171713 -0.170834 -0.030781 -0.156063 -0.166197 0.083500 0.245971 0.158185 0.124231 0.016966 0.098247 0.108287 -0.033103 0.110902 0.085093 -0.012798 0.059657 0.207193 0.008308 -0.073832 -0.165532 0.103812 0.138122 -0.223544 -0.129617 0.024598 0.118812 0.023367 0.241243 0.167620 0.045504 0.004117 -0.133555 -0.034388 -0.069076 -0.219639 -0.210766 0.192454 0.116632 -0.013204 -0.170307 -0.193683 0.075764 0.209414 -0.036529 -0.005920 0.164980 0.069390 -0.044813 0.209077 -0.192445 0.179965 -0.183163 0.145443 -0.115985 0.078686 0.064413 0.106028 0.040743 0.007855 -0.077971 0.019152 0.060632 -0.025784 -0.157173 -0.069382 0.041079 0.079359 -0.061446 0.156869 -0.041106 -0.239221 -0.040970 -0.000015 0.099060 -0.247002 -0.020837 0.050309 0.002642 0.118486 -0.029898 0.186345 0.085188 0.178551 0.096495 -0.075727 -0.120875 0.101078 0.074043 -0.114990 -0.139079 -0.132218 0.178934 -0.198598 0.116678 0.085819 -0.047442 -0.343870 -0.023334 -0.127745 -0.187099 0.153834 -0.065911 0.212171 -0.226741 0.007796 0.170214 -0.123449 0.030632 -0.134519 0.026184 0.060357 0.023709 -0.105402 0.059923 -0.054748 0.163454 -0.021259 0.143792 0.039344 -0.113686 0.095763 0.047529 0.053945 -0.024458 -0.035755 -0.034898 -0.117274 -0.140923 -0.051384 0.073058 0.142643 0.218760 -0.172208 0.232220 0.078158 0.015812 0.180485 -0.130071 0.163176 0.193347 0.036909 0.212062 -0.014643 -0.164350 0.269914 
-0.020742 0.139275 0.116478 -0.010222 0.046338 -0.163462 0.078293 -0.194750 0.146771 -0.066055 0.023407 -0.031146 0.323978 -0.104894 -0.062218 -0.067920 -0.058051 -0.007136 -0.065643 0.057267 0.005363 0.113890 0.194012 0.130181 0.081436 0.086198 0.065030 -0.172616 0.074657 0.038350 -0.150484 -0.019897 -0.079627 0.163732 0.090669 0.121193 -0.269247 0.119581 -0.304608 0.071850 0.088829 0.151985 -0.040556 -0.166373 -0.112855 -0.022780 0.054751 -0.004542 -0.012059 0.113281 -0.085975 0.213007 0.050355 0.042661 -0.188214 -0.074528 0.242681 -0.223175 0.019245 -0.291517 -0.086909 0.100913 0.090165 0.080523 0.154252 0.056052 0.049938 0.099428 0.266409 -0.078517 -0.211588 -0.247789 -0.061397 0.011922 -0.010878 -0.138854 -0.032372 -0.191472 0.056607 0.051876 0.045863 0.213666 -0.076109 0.197351 0.265458 -0.068780 0.057721 0.142923 -0.091333
...
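The format described above can be parsed with a few lines of plain Python. This is a minimal sketch, assuming fields are separated by single spaces as in the sample; the helper names `parse_header` and `parse_row` are hypothetical and not part of any library.

```python
def parse_header(line):
    """Return (vocab_size, dim) from the first line, e.g. '636086 300'."""
    count, dim = line.split()
    return int(count), int(dim)


def parse_row(line, dim):
    """Split one data line into (token, list of floats)."""
    parts = line.rstrip().split(" ")
    token = parts[0]
    # Skip empty fields that appear if two spaces follow each other
    values = [float(v) for v in parts[1:] if v]
    if len(values) != dim:
        raise ValueError("expected %d values, got %d" % (dim, len(values)))
    return token, values


vocab_size, dim = parse_header("636086 300\n")
token, vector = parse_row("的 0.1 -0.2 0.3\n", 3)
print(vocab_size, dim, token, vector)
```

Checking the number of values against the header dimension catches rows that were truncated or wrapped during copying.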
Aside: if you only want the final usage, skip straight to the code in 2.3.2.
pip install gensim
Reference: using word vectors trained by someone else
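import gensim

# Word2Vec.load expects a model written by gensim's own save(), not a plain-text vector file.
# For the text format shown above, gensim.models.KeyedVectors.load_word2vec_format(path, binary=False)
# is the usual loader.
new_model = gensim.models.Word2Vec.load(path)
# Vectors are accessed through .wv; indexing the model object directly is deprecated
print(new_model.wv['的'])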
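Since the code is the same, the question is whether his data format matches ours.
I downloaded the 60-dimensional vector file Word60.model and opened it as utf-8:
It does not match our word2vec vector format: it shows up as garbled text. My guess is that it is a saved model, something like a set of vectors dumped directly with pickle, or else a binary file.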
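Rather than opening a file in an editor and guessing, the first few kilobytes can be probed programmatically. The sketch below is only a heuristic under stated assumptions: it treats a leading `0x80` byte as a sign of pickle (that is the opcode pickle protocol 2 and later start with), and treats a valid UTF-8 prefix beginning with a two-integer header as the text format. The function name `sniff_vector_file` and its return labels are hypothetical.

```python
import codecs


def sniff_vector_file(path, probe_bytes=4096):
    """Guess the storage format of a vector file from its first bytes."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    if head[:1] == b"\x80":
        return "pickle-like"
    try:
        # final=False tolerates a multi-byte character cut off at the probe boundary
        text = codecs.getincrementaldecoder("utf-8")().decode(head, final=False)
    except UnicodeDecodeError:
        return "binary"
    fields = text.split("\n", 1)[0].split()
    if len(fields) == 2 and all(x.isdigit() for x in fields):
        return "text word2vec"
    return "unknown text"
```

A result of "binary" or "pickle-like" suggests the file needs the loader of whatever tool saved it, not a line-by-line text parser.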
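I suspect I'll end up using tensorflow anyway, so let's get tensorflow set up.
[I used tf in earlier projects and during my internship, so I should be fairly familiar with it; I just haven't touched it for several months and now need to reclaim lost ground.]
Reference: installing the CPU version of tensorflow on Windows
But it turns out tf is already installed on my machine [from an earlier install].
My tf version is 1.13.1.
A simple test: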
import tensorflow as tf
sess = tf.Session()
a = tf.constant(1)
b = tf.constant(2)
print("this is test for tf.")
print(sess.run(a+b))
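Result: there were many warnings, but the last few lines show the correct output, so tf is installed correctly.
Reference: checking whether the installed tensorflow is the GPU or CPU version
Run the following code: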
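import os

# Set this before importing tensorflow so it is in place when tf starts logging.
# It filters C++-side messages only, so Python-level warnings can still appear.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "99"

from tensorflow.python.client import device_lib

if __name__ == "__main__":
    print(device_lib.list_local_devices())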
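Original author's material: TensorFlow 07: Word Embeddings (2) – Loading Pre-trained Vectors
Material I referenced and modified: https://blog.csdn.net/lxg0807/article/details/72518962
Running the code showed that it simply won't work locally. On a laptop with 8 GB of RAM, where ordinary software already takes 30%~50% of memory, loading a single 1.68 GB word-vector file pushed usage above 94%.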
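Some arithmetic helps explain the memory pressure: stored as float32, the full matrix would take about 636086 × 300 × 4 bytes ≈ 763 MB, but Python lists of strings or float objects cost several times that per value. One possible workaround, sketched below, is to load only the first rows into a compact `array("f")` buffer. The function name `load_top_rows` is hypothetical, and the idea that the first rows are the most useful assumes the file is ordered by frequency, which the sample (",", "的", "。") suggests but does not prove.

```python
from array import array
from itertools import islice


def load_top_rows(path, limit):
    """Load at most `limit` rows into a flat float32 buffer."""
    vocab = []
    flat = array("f")  # 4 bytes per value, stored contiguously
    with open(path, "r", encoding="utf-8") as f:
        _, dim = (int(x) for x in f.readline().split())
        for line in islice(f, limit):
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue  # skip malformed or wrapped rows
            vocab.append(parts[0])
            flat.extend(float(v) for v in parts[1:])
    return vocab, flat, dim
```

The vector of the i-th loaded token is `flat[i * dim:(i + 1) * dim]`, so the buffer can be reshaped into a (rows, dim) matrix later without another copy of the text.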
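Options include 1024lab, 极客云, and similar platforms.
Outside China, kaggle offers free notebooks; within China, Baidu AI Studio can be used at no cost if you know how to work it, but it only supports the paddlepaddle framework, not tensorflow.
Moving on to 2.3: reading a small file with tensorflow.
We can still write the code against four lines of data, right? The result isn't usable for real work, but the principle is exactly the same.
Let's use the following, saved as test300d.txt: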
3 300
, -0.225854 0.107560 0.197237 -0.163468 0.090813 0.040628 0.176729 -0.011261 -0.053033 0.037572 -0.155545 0.053847 0.131007 0.250081 -0.071398 -0.089812 -0.034247 0.078562 0.023870 0.159746 0.100427 0.021786 0.266321 0.004339 0.105988 -0.002758 0.119828 0.004190 -0.154152 0.087963 0.179135 0.041696 -0.150765 0.112602 -0.003246 -0.115960 0.042190 0.108845 0.138592 -0.270801 0.276069 -0.377507 -0.133841 0.225290 -0.084972 -0.046473 -0.163377 -0.129677 0.178721 -0.008124 -0.037467 0.291655 0.144279 -0.118583 0.046584 0.021907 0.126214 0.054273 0.048182 0.079335 -0.126211 0.045360 -0.099212 -0.016365 -0.009512 -0.038277 -0.152457 0.013738 -0.210855 -0.151658 0.068768 0.310373 0.086278 0.065519 0.089834 0.264020 0.206357 -0.046300 0.111625 -0.112923 0.025023 0.266332 0.238958 -0.112658 0.037161 -0.228547 0.048586 0.243026 -0.143488 0.045040 0.028236 0.096553 0.011036 0.119268 0.068397 -0.000245 -0.011066 -0.096202 -0.020504 -0.104224 -0.152824 -0.126277 0.003383 0.146738 0.034192 -0.063062 -0.100550 0.081958 0.297142 -0.095431 0.047876 0.045076 0.061213 -0.103860 -0.046096 -0.108332 0.083888 -0.170114 0.091852 -0.111302 0.036355 0.048322 0.048027 -0.133125 -0.173485 -0.062455 0.133545 0.264515 -0.199027 -0.134663 -0.176003 -0.073278 -0.071808 -0.067675 0.065894 -0.061778 -0.207889 -0.035713 0.129135 0.160631 0.064196 0.036111 -0.037556 -0.123741 0.070222 -0.011605 0.095488 -0.026130 0.176827 0.135286 -0.091638 -0.196278 0.135840 -0.067259 -0.066008 -0.207676 -0.178852 -0.009413 -0.113950 0.196629 -0.114693 -0.026324 -0.141586 0.197364 -0.078522 -0.162726 0.052150 0.003707 0.034934 -0.067691 -0.014802 0.025208 -0.012278 0.014441 0.015678 0.044566 0.007233 -0.030680 -0.075503 0.143719 0.075201 0.141424 -0.038741 0.120257 0.066381 0.028938 -0.026662 0.052459 0.103320 -0.057982 0.058221 0.058726 -0.196115 -0.118826 -0.017446 0.047007 0.301567 0.037915 -0.147273 0.340786 -0.015451 -0.004354 0.009008 -0.036533 0.171037 0.224140 -0.119820 0.302488 -0.036199 -0.200074 0.108383 
0.048416 0.059023 0.092124 0.024632 0.049616 -0.205193 0.018068 -0.330599 0.047790 -0.031321 -0.066260 -0.077764 0.274229 -0.157499 -0.090307 -0.057102 0.099106 0.094118 -0.152254 -0.012646 0.065620 0.032115 0.122921 0.051477 0.019677 0.321413 0.100348 -0.195362 0.033550 0.171877 -0.054965 -0.090468 -0.046022 -0.023165 0.142064 0.160361 -0.100200 0.114204 -0.251116 -0.020862 0.259914 0.010826 -0.333081 -0.029773 -0.106668 -0.066178 -0.055028 0.032080 0.081552 0.237320 0.034470 0.116792 -0.054930 0.035778 -0.171559 -0.077482 0.091026 -0.050017 0.080905 -0.356599 -0.044822 -0.058992 0.191774 0.001098 0.036497 -0.047119 -0.051166 0.028191 0.230730 -0.093177 -0.086363 -0.153171 -0.000628 0.028436 -0.117305 -0.154677 -0.030172 -0.073724 0.022715 -0.036977 0.059616 0.153312 -0.103805 0.231885 0.247361 -0.134653 0.142064 0.144121 0.005673
的 -0.242538 0.100439 0.129818 -0.104647 -0.028103 0.058042 0.190883 0.153426 0.034308 0.071330 -0.000116 0.113657 0.097657 0.030841 0.060856 0.056382 -0.195434 0.031622 0.003772 0.059192 -0.021331 -0.109444 0.192544 0.012395 0.107907 0.179732 0.216159 -0.004080 -0.127886 0.022992 0.169664 0.191425 -0.022217 -0.095708 0.075299 -0.169385 0.042564 0.002497 0.033388 -0.279786 0.135520 0.028730 -0.006901 0.183539 0.175054 0.166405 0.106541 -0.030475 0.122642 -0.196793 0.247228 0.058643 0.177309 -0.197690 -0.088260 0.094268 0.117994 0.031037 0.069194 0.000642 -0.066777 0.101824 -0.002390 0.094974 0.121026 0.153325 -0.304356 0.173549 -0.093552 0.029033 0.101660 0.149433 0.072934 0.143490 0.083457 0.241503 -0.070801 -0.088046 0.003713 -0.280668 -0.001448 0.003456 0.101584 0.131760 -0.223845 -0.309329 0.016964 0.347164 0.132431 -0.111628 -0.138338 -0.064733 0.007556 0.122302 0.184578 -0.078595 -0.140727 -0.192051 -0.086686 -0.038096 -0.097754 -0.052457 -0.018865 0.045217 0.132015 0.010384 -0.070730 -0.116558 0.109532 -0.159887 -0.024422 0.011281 -0.006494 0.021118 -0.021956 0.045676 0.285816 -0.096120 0.045639 0.046192 -0.194560 0.143332 0.013284 0.181637 -0.135146 -0.213470 -0.122927 0.139591 -0.174840 -0.230727 -0.336673 0.028399 0.133554 -0.022328 0.263509 -0.135144 -0.085525 -0.068479 0.147214 0.148020 -0.165846 0.096487 0.216477 -0.130104 0.220343 0.022198 0.081715 0.190736 -0.112020 0.124746 -0.042398 -0.100392 0.217173 -0.025453 -0.261025 -0.122996 -0.065484 0.169312 -0.274064 0.073796 -0.042404 0.003309 -0.026870 0.224915 -0.086456 -0.116525 0.077721 -0.003964 0.094634 -0.345002 -0.055975 0.189918 -0.206350 -0.058314 0.003844 -0.008447 -0.021032 0.057915 0.084640 0.098421 0.103423 0.139302 0.069879 0.235352 -0.012435 -0.214576 0.140327 -0.096340 -0.000419 0.145002 -0.118673 -0.067662 -0.314651 0.103676 0.213736 0.119828 -0.093621 0.300272 -0.054337 0.236886 -0.066297 0.070531 0.055797 -0.052518 -0.042077 0.220657 -0.085996 0.439905 0.213758 -0.013311 0.172127 
-0.072370 0.025413 0.129522 0.082697 0.258775 -0.146191 -0.015176 -0.039916 0.097016 0.134828 -0.051018 0.105613 0.200699 -0.085717 -0.149180 -0.140295 -0.099351 -0.072185 0.008729 0.114468 -0.014246 0.211366 0.059199 0.042156 0.000897 0.234377 0.119545 -0.052635 -0.034904 -0.053223 -0.105491 -0.097634 -0.044138 0.039147 0.025329 0.121565 0.042493 0.119284 0.007208 0.110501 0.105863 0.014750 -0.279106 -0.178406 0.028334 -0.144416 0.213126 0.025383 0.247148 0.346476 -0.046433 0.199948 0.019231 0.053996 -0.044669 -0.117902 -0.048377 -0.114109 0.047294 -0.266003 -0.155737 0.022962 -0.032529 -0.112454 0.065954 0.005879 0.160480 -0.098461 0.098248 -0.110154 -0.067323 -0.102438 -0.100263 -0.001491 -0.205655 -0.219179 0.047583 -0.187761 0.135312 0.035478 0.002708 0.039958 -0.083279 0.195324 0.142303 -0.079450 0.133499 0.202978 -0.277668
。 -0.283826 -0.052346 0.080995 -0.139234 0.153747 0.052080 0.152875 0.159906 -0.100812 0.051320 -0.103536 -0.089473 0.056333 0.140998 -0.062160 -0.124558 -0.066892 -0.009883 0.091323 0.173555 -0.096824 0.053216 0.320953 -0.072564 0.084597 -0.016583 0.137165 0.005142 -0.181158 0.144163 0.155581 0.165243 -0.017603 -0.001569 -0.008859 -0.074905 0.062937 -0.126123 0.157542 -0.174461 0.277550 -0.226569 0.105378 0.384084 0.012730 0.064785 0.061948 0.034733 0.245869 -0.052040 -0.061160 0.229989 0.137800 0.058283 0.062240 0.165518 0.029029 0.008543 0.159878 0.128581 -0.132286 -0.042042 -0.064327 -0.029669 -0.012382 0.171713 -0.170834 -0.030781 -0.156063 -0.166197 0.083500 0.245971 0.158185 0.124231 0.016966 0.098247 0.108287 -0.033103 0.110902 0.085093 -0.012798 0.059657 0.207193 0.008308 -0.073832 -0.165532 0.103812 0.138122 -0.223544 -0.129617 0.024598 0.118812 0.023367 0.241243 0.167620 0.045504 0.004117 -0.133555 -0.034388 -0.069076 -0.219639 -0.210766 0.192454 0.116632 -0.013204 -0.170307 -0.193683 0.075764 0.209414 -0.036529 -0.005920 0.164980 0.069390 -0.044813 0.209077 -0.192445 0.179965 -0.183163 0.145443 -0.115985 0.078686 0.064413 0.106028 0.040743 0.007855 -0.077971 0.019152 0.060632 -0.025784 -0.157173 -0.069382 0.041079 0.079359 -0.061446 0.156869 -0.041106 -0.239221 -0.040970 -0.000015 0.099060 -0.247002 -0.020837 0.050309 0.002642 0.118486 -0.029898 0.186345 0.085188 0.178551 0.096495 -0.075727 -0.120875 0.101078 0.074043 -0.114990 -0.139079 -0.132218 0.178934 -0.198598 0.116678 0.085819 -0.047442 -0.343870 -0.023334 -0.127745 -0.187099 0.153834 -0.065911 0.212171 -0.226741 0.007796 0.170214 -0.123449 0.030632 -0.134519 0.026184 0.060357 0.023709 -0.105402 0.059923 -0.054748 0.163454 -0.021259 0.143792 0.039344 -0.113686 0.095763 0.047529 0.053945 -0.024458 -0.035755 -0.034898 -0.117274 -0.140923 -0.051384 0.073058 0.142643 0.218760 -0.172208 0.232220 0.078158 0.015812 0.180485 -0.130071 0.163176 0.193347 0.036909 0.212062 -0.014643 -0.164350 0.269914 
-0.020742 0.139275 0.116478 -0.010222 0.046338 -0.163462 0.078293 -0.194750 0.146771 -0.066055 0.023407 -0.031146 0.323978 -0.104894 -0.062218 -0.067920 -0.058051 -0.007136 -0.065643 0.057267 0.005363 0.113890 0.194012 0.130181 0.081436 0.086198 0.065030 -0.172616 0.074657 0.038350 -0.150484 -0.019897 -0.079627 0.163732 0.090669 0.121193 -0.269247 0.119581 -0.304608 0.071850 0.088829 0.151985 -0.040556 -0.166373 -0.112855 -0.022780 0.054751 -0.004542 -0.012059 0.113281 -0.085975 0.213007 0.050355 0.042661 -0.188214 -0.074528 0.242681 -0.223175 0.019245 -0.291517 -0.086909 0.100913 0.090165 0.080523 0.154252 0.056052 0.049938 0.099428 0.266409 -0.078517 -0.211588 -0.247789 -0.061397 0.011922 -0.010878 -0.138854 -0.032372 -0.191472 0.056607 0.051876 0.045863 0.213666 -0.076109 0.197351 0.265458 -0.068780 0.057721 0.142923 -0.091333
import os

# Set this before importing tensorflow so it is in place when tf starts logging.
# It filters C++-side messages only, so Python-level warnings can still appear.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"  # show errors only

import numpy as np
import tensorflow as tf

filename = "D:/xxx/test300d.txt"


# py3
def loadWord2Vec(filename):
    vocab = []
    embd = []
    with open(filename, "r", encoding="utf-8") as fr:
        line = fr.readline().strip()
        # print(line)  # 3 300
        word_dim = int(line.split()[1])
        # Index 0 is reserved for unknown words and maps to an all-zero vector
        vocab.append("unk")
        embd.append([0.0] * word_dim)
        for line in fr:
            if not line.strip():
                continue  # ignore blank lines
            row = line.strip().split(" ")
            vocab.append(row[0])  # the first field is the character/word
            embd.append([float(v) for v in row[1:]])  # the rest is its vector
    print("loaded word2vec")
    return vocab, embd


vocab, embd = loadWord2Vec(filename)
vocab_size = len(vocab)  # 1 + 3
embedding_dim = len(embd[0])  # 300
embedding = np.asarray(embd, dtype=np.float32)

W = tf.Variable(tf.constant(0.0, shape=[vocab_size, embedding_dim]),
                trainable=False, name="W")
embedding_placeholder = tf.placeholder(tf.float32, [vocab_size, embedding_dim])
embedding_init = W.assign(embedding_placeholder)
# [1, 0] selects the embeddings of the tokens with ids 1 and 0
x = tf.nn.embedding_lookup(W, [1, 0])

# Initialize variables
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    sess.run(embedding_init, feed_dict={embedding_placeholder: embedding})
    y = sess.run(x)
    print(y)
    print(y.shape)
Main output:
[[-2.25854e-01 1.07560e-01 1.97237e-01 -1.63468e-01 9.08130e-02
4.06280e-02 1.76729e-01 -1.12610e-02 -5.30330e-02 3.75720e-02
-1.55545e-01 5.38470e-02 1.31007e-01 2.50081e-01 -7.13980e-02
-8.98120e-02 -3.42470e-02 7.85620e-02 2.38700e-02 1.59746e-01
1.00427e-01 2.17860e-02 2.66321e-01 4.33900e-03 1.05988e-01
-2.75800e-03 1.19828e-01 4.19000e-03 -1.54152e-01 8.79630e-02
1.79135e-01 4.16960e-02 -1.50765e-01 1.12602e-01 -3.24600e-03
-1.15960e-01 4.21900e-02 1.08845e-01 1.38592e-01 -2.70801e-01
2.76069e-01 -3.77507e-01 -1.33841e-01 2.25290e-01 -8.49720e-02
-4.64730e-02 -1.63377e-01 -1.29677e-01 1.78721e-01 -8.12400e-03
-3.74670e-02 2.91655e-01 1.44279e-01 -1.18583e-01 4.65840e-02
2.19070e-02 1.26214e-01 5.42730e-02 4.81820e-02 7.93350e-02
-1.26211e-01 4.53600e-02 -9.92120e-02 -1.63650e-02 -9.51200e-03
-3.82770e-02 -1.52457e-01 1.37380e-02 -2.10855e-01 -1.51658e-01
6.87680e-02 3.10373e-01 8.62780e-02 6.55190e-02 8.98340e-02
2.64020e-01 2.06357e-01 -4.63000e-02 1.11625e-01 -1.12923e-01
2.50230e-02 2.66332e-01 2.38958e-01 -1.12658e-01 3.71610e-02
-2.28547e-01 4.85860e-02 2.43026e-01 -1.43488e-01 4.50400e-02
2.82360e-02 9.65530e-02 1.10360e-02 1.19268e-01 6.83970e-02
-2.45000e-04 -1.10660e-02 -9.62020e-02 -2.05040e-02 -1.04224e-01
-1.52824e-01 -1.26277e-01 3.38300e-03 1.46738e-01 3.41920e-02
-6.30620e-02 -1.00550e-01 8.19580e-02 2.97142e-01 -9.54310e-02
4.78760e-02 4.50760e-02 6.12130e-02 -1.03860e-01 -4.60960e-02
-1.08332e-01 8.38880e-02 -1.70114e-01 9.18520e-02 -1.11302e-01
3.63550e-02 4.83220e-02 4.80270e-02 -1.33125e-01 -1.73485e-01
-6.24550e-02 1.33545e-01 2.64515e-01 -1.99027e-01 -1.34663e-01
-1.76003e-01 -7.32780e-02 -7.18080e-02 -6.76750e-02 6.58940e-02
-6.17780e-02 -2.07889e-01 -3.57130e-02 1.29135e-01 1.60631e-01
6.41960e-02 3.61110e-02 -3.75560e-02 -1.23741e-01 7.02220e-02
-1.16050e-02 9.54880e-02 -2.61300e-02 1.76827e-01 1.35286e-01
-9.16380e-02 -1.96278e-01 1.35840e-01 -6.72590e-02 -6.60080e-02
-2.07676e-01 -1.78852e-01 -9.41300e-03 -1.13950e-01 1.96629e-01
-1.14693e-01 -2.63240e-02 -1.41586e-01 1.97364e-01 -7.85220e-02
-1.62726e-01 5.21500e-02 3.70700e-03 3.49340e-02 -6.76910e-02
-1.48020e-02 2.52080e-02 -1.22780e-02 1.44410e-02 1.56780e-02
4.45660e-02 7.23300e-03 -3.06800e-02 -7.55030e-02 1.43719e-01
7.52010e-02 1.41424e-01 -3.87410e-02 1.20257e-01 6.63810e-02
2.89380e-02 -2.66620e-02 5.24590e-02 1.03320e-01 -5.79820e-02
5.82210e-02 5.87260e-02 -1.96115e-01 -1.18826e-01 -1.74460e-02
4.70070e-02 3.01567e-01 3.79150e-02 -1.47273e-01 3.40786e-01
-1.54510e-02 -4.35400e-03 9.00800e-03 -3.65330e-02 1.71037e-01
2.24140e-01 -1.19820e-01 3.02488e-01 -3.61990e-02 -2.00074e-01
1.08383e-01 4.84160e-02 5.90230e-02 9.21240e-02 2.46320e-02
4.96160e-02 -2.05193e-01 1.80680e-02 -3.30599e-01 4.77900e-02
-3.13210e-02 -6.62600e-02 -7.77640e-02 2.74229e-01 -1.57499e-01
-9.03070e-02 -5.71020e-02 9.91060e-02 9.41180e-02 -1.52254e-01
-1.26460e-02 6.56200e-02 3.21150e-02 1.22921e-01 5.14770e-02
1.96770e-02 3.21413e-01 1.00348e-01 -1.95362e-01 3.35500e-02
1.71877e-01 -5.49650e-02 -9.04680e-02 -4.60220e-02 -2.31650e-02
1.42064e-01 1.60361e-01 -1.00200e-01 1.14204e-01 -2.51116e-01
-2.08620e-02 2.59914e-01 1.08260e-02 -3.33081e-01 -2.97730e-02
-1.06668e-01 -6.61780e-02 -5.50280e-02 3.20800e-02 8.15520e-02
2.37320e-01 3.44700e-02 1.16792e-01 -5.49300e-02 3.57780e-02
-1.71559e-01 -7.74820e-02 9.10260e-02 -5.00170e-02 8.09050e-02
-3.56599e-01 -4.48220e-02 -5.89920e-02 1.91774e-01 1.09800e-03
3.64970e-02 -4.71190e-02 -5.11660e-02 2.81910e-02 2.30730e-01
-9.31770e-02 -8.63630e-02 -1.53171e-01 -6.28000e-04 2.84360e-02
-1.17305e-01 -1.54677e-01 -3.01720e-02 -7.37240e-02 2.27150e-02
-3.69770e-02 5.96160e-02 1.53312e-01 -1.03805e-01 2.31885e-01
2.47361e-01 -1.34653e-01 1.42064e-01 1.44121e-01 5.67300e-03]
[ 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00]]
(2, 300)
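The lookup above uses hard-coded ids. In practice the ids usually come from the vocabulary list, with unseen tokens falling back to the "unk" row at index 0. The following is a minimal sketch of that mapping in plain Python; `build_word_to_id` and `encode` are hypothetical helper names, and the small `vocab` list simply mirrors what the loader returns for test300d.txt.

```python
def build_word_to_id(vocab):
    """Map each token to its row index in the embedding matrix."""
    return {word: idx for idx, word in enumerate(vocab)}


def encode(tokens, word_to_id, unk_id=0):
    """Convert tokens to ids, using unk_id for tokens not in the vocabulary."""
    return [word_to_id.get(tok, unk_id) for tok in tokens]


vocab = ["unk", ",", "的", "。"]
word_to_id = build_word_to_id(vocab)
ids = encode(["的", "猫", "。"], word_to_id)
print(ids)  # [2, 0, 3]
```

The resulting id list can be passed to `tf.nn.embedding_lookup(W, ids)` in place of `[1, 0]`.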
That's it for now; there should be follow-up exploration later.