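Things About the University of Tokyo's Admissions Posters (Part 1)

An article decoding the University of Tokyo's admissions poster has been making the rounds on Renren lately. It is really a summary of what people on the STAGE1 forum worked out together.

Please go to this thread first and pay your respects to the S1 gurus. The decoding was their work, not mine.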


====


[Image 1]


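This is last year's admissions poster for the University of Tokyo's 情報理工学系研究科 (Graduate School of Information Science and Technology; roughly the graduate school of a computer science department, I'd guess). The image spread widely, and many of you have probably seen it on Renren or Weibo. It apparently set off all kinds of daydreaming among the male code monkeys in CS departments everywhere: the girl in the picture wears a classical, old-fashioned outfit and looks quiet and dignified, and everyone sighed that their own school, CS department and all, offers no such perks...


While everyone was daydreaming about the cute girl, did it occur to you that the string of binary 0s and 1s in front of her is not decoration made by rolling a face across the keyboard, but carries a hidden meaning?


In fact, the most interesting part of this image is not the girl in the background but precisely that string of 0s and 1s in front. Let's see what it means.


First, run OCR software on the image to turn the digits into text; where the digits are nearly the same color as the background they have to be read by hand. The string has 26 rows: 110 digits in each row except the last, which has 98, for 25 × 110 + 98 = 2848 digits in total. Then proofread the whole thing by hand once more. This step is a test of patience...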
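

A quick sanity check on the transcription could look like the sketch below. This is my own illustration, not something from the S1 thread, and `check_rows` is a made-up helper name. It only verifies that each row contains nothing but 0s and 1s, that every row except the last has the expected width, and it returns the total digit count.

```python
def check_rows(rows, width):
    """Validate transcribed rows of a bit string and return the total digit count.

    Every row except the last must be exactly `width` digits long; the last
    row may be shorter. Hypothetical helper, for illustration only.
    """
    for number, row in enumerate(rows, start=1):
        if set(row) - {"0", "1"}:
            raise ValueError(f"row {number} contains a non-binary character")
        is_last = number == len(rows)
        if is_last and len(row) > width:
            raise ValueError(f"row {number} is longer than {width} digits")
        if not is_last and len(row) != width:
            raise ValueError(f"row {number} has {len(row)} digits, expected {width}")
    return sum(len(row) for row in rows)


if __name__ == "__main__":
    # Tiny made-up sample; the real poster has 25 rows of 110 plus one of 98.
    print(check_rows(["0101", "1100", "10"], width=4))  # 10
    print(25 * 110 + 98)  # 2848
```

Running it over the 26 transcribed rows with `width=110` should return 2848 if nothing was dropped or doubled during OCR.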


00011111100010110000100000000000000000000000000000000000000000000000001000000011011011010101000011001011010011
10110000100100000000010100001111010101001100011110100001010101101010110101100000001110000000000011010100001101
10000101000100010011011010001110001000010110001100010011000110101100010011000011010010011010011000001101100010
11100000011010011010101001010111100001110100010011011001110101100000001011000011110111010010111101110010111000
11010110000001010010011000101110111111000000000011111111110000111000111101110000111000011110001100010110010011
01000011001000100110010011110011001001110010111001111101111001111010010011011110111110111001111101111111010111
11000111111000000001101100001000011110110010000001100001010100101011010010011010110011000010010010000010000010
10010100100001100011011000110010000110001101010110010001010101111001000101010010101100001100111100010011110111
10000101001010111110010000000001010000111010010001100010101101100001100110100010010011011110111111010010001000
01000110000111001011100001001100101110010000101010000101111100011100111101001111101110011011010110100000001011
10111101011101010110010010110000011011000010101111011010111010000111101101001110010110010110111001000111011011
11111110000100111110111110011110111100001100110111000001110100001001001001000001010100101100011010000101100011
00010011010110110000110011100110101001001100101001000111110000010110000000100010011011010101011100111000000100
11101011110110011001010101011001011010110111011111111010110000111010100100100110001111101011000010011100110110
10101001111011010101000111110000011110100011110110100001010101110111010101001110010001110101000101000101000001
01010010110001101011010110001100000111001001010000011000100110000110001000110010111111100101101010011110011101
10011110011000111100101110000101010101000110101101111010001000111001110100100001111110011111010101000110010101
00110010001111111011111000000100111001111001110101010001101110011000100100010110010111010011111000101100001011
11100010100010011000110110111010011001001001111011010010111000001111000011100100100100010011101011010001110010
01100101001101100001101100110000111111010010101000011010011000000111110100011000100001101010010110100110110110
01010001010000101001110011010110000101110001111010000101001011100100110100001001000001101000110101111110001010
01010000100100011000011000110110110111110100000010011110111001100000010011100100100101100010000010111111100101
10110110111000100000100010100000100001100001001110011101001000101000000110001110011000011101111100101000111101
01000011010010100100011000001000010001111011110001101010100110001111110100001101110100100001101101100010101101
11000010111011101011111111011110010000100010100101101101100100110101111000001101110010100000011100110101011000
01111011110010100000000010110111111110101010100011000010011100000110111110000000010000000000000000


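The first idea is of course ASCII. Since I didn't know whether the bits of each byte were big-endian (most significant bit first) or little-endian, I tried both, and neither produced a meaningful string. In case a Caesar cipher had been applied, I stripped everything but the Latin letters and generated all 26 possible shifts, but none of them meant anything either. So it does not seem to be ASCII text.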
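

The two attempts above can be sketched roughly as follows. The original tooling is not shown anywhere, so the function names `decode_ascii` and `caesar_candidates` are my own, and lowercasing before shifting is a simplification I chose.

```python
import string


def decode_ascii(bits, msb_first=True):
    """Turn a string of '0'/'1' into text, 8 bits per character.

    msb_first=True reads each group most significant bit first (what the
    post calls big-endian); False reverses each group before decoding.
    Trailing bits that do not fill a whole byte are ignored.
    """
    usable = len(bits) - len(bits) % 8
    chars = []
    for i in range(0, usable, 8):
        group = bits[i:i + 8]
        if not msb_first:
            group = group[::-1]
        chars.append(chr(int(group, 2)))
    return "".join(chars)


def caesar_candidates(text):
    """Return all 26 Caesar shifts of the Latin letters in `text`.

    Non-letters are dropped and letters are lowercased first (my assumption).
    """
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    return [
        "".join(chr((ord(c) - ord("a") + shift) % 26 + ord("a")) for c in letters)
        for shift in range(26)
    ]


if __name__ == "__main__":
    sample = "0100100001101001"  # made-up input, not poster data
    print(decode_ascii(sample))            # Hi
    print(caesar_candidates("Hi!")[1])     # ij
```

Eyeballing 2 × 26 candidate strings is tedious, but with a text this short it is quick to see that none of them reads like a language.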


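Since the length 2848 is a multiple of 8, the data could also be a binary file, so I again read it out both big-endian and little-endian and wrote each result to a binary file.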
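

Here is a minimal sketch of that step, assuming "big-endian" and "little-endian" refer to the bit order inside each byte. The names `pack_bits` and `write_both`, and the two output file names, are my own inventions.

```python
import tempfile
from pathlib import Path


def pack_bits(bits, msb_first=True):
    """Pack a non-empty string of '0'/'1' (length a multiple of 8) into bytes."""
    if not bits or len(bits) % 8:
        raise ValueError("bit string length must be a positive multiple of 8")
    if not msb_first:
        # Reverse the bit order inside every 8-bit group.
        bits = "".join(bits[i:i + 8][::-1] for i in range(0, len(bits), 8))
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def write_both(bits, directory):
    """Write both interpretations to `directory` and return the two paths."""
    directory = Path(directory)
    paths = []
    for name, msb_first in (("msb_first.bin", True), ("lsb_first.bin", False)):
        path = directory / name
        path.write_bytes(pack_bits(bits, msb_first))
        paths.append(path)
    return paths


if __name__ == "__main__":
    # First 24 digits of the poster's first row.
    head = "000111111000101100001000"
    print(pack_bits(head).hex())  # 1f8b08
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_both(head, tmp):
            print(path.name, path.read_bytes().hex())
```

With the full 2848-digit string each output file is 356 bytes long.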


GNOME is fairly smart and recognized the big-endian output as a gz archive. Even if it hadn't, you could just paste the first 3 bytes of the file (1f 8b 08) into Google.
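

The same check can be done by hand. The gzip format starts with the two magic bytes 1f 8b, followed by a compression-method byte, which is 8 for deflate. The sketch below (with a helper name, `looks_like_gzip`, that I made up) tests just those three bytes against the first 24 digits of the poster.

```python
GZIP_MAGIC = b"\x1f\x8b"  # ID1, ID2 from the gzip header
DEFLATE = 8               # compression method byte for deflate


def looks_like_gzip(data):
    """Return True if `data` begins with a gzip header using deflate."""
    return len(data) >= 3 and data[:2] == GZIP_MAGIC and data[2] == DEFLATE


if __name__ == "__main__":
    # First 24 digits of the poster, read most significant bit first.
    header = int("000111111000101100001000", 2).to_bytes(3, "big")
    print(header.hex())             # 1f8b08
    print(looks_like_gzip(header))  # True
```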


[Image 2]


Rename it to bin.gz and extract it to get a file called bin. Take a look:
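

If you would rather not rely on a desktop archive tool, the extraction can be sketched with Python's standard `gzip` module. `gunzip_file` is a hypothetical helper of mine that mimics what gunzip does with the file name (drop the `.gz` suffix); it is not what was used originally.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(path):
    """Decompress `path` (must end in .gz) next to itself; return the new path."""
    src = Path(path)
    if src.suffix != ".gz":
        raise ValueError("expected a file name ending in .gz")
    dst = src.with_suffix("")  # bin.gz -> bin
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:
        print(gunzip_file(sys.argv[1]))
```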


\CA\FE\BA\BE\00\00\002\00!
\00\00 \00\00\00
\00\00
\00\00\00\00\00<init>\00()V\00Code\00main\00([Ljava/lang/String;)V\00 StackMapTable \00\00 \00 \00\00\00)w-aurlwtcniewo./-t.kjhltiypioe.o/kvru.fae\00 \00\00\00 \00\00 \00i\00java/lang/Object\00java/lang/System\00out\00Ljava/io/PrintStream;\00java/lang/String\00charAt\00(I)C\00java/io/PrintStream\00print\00(C)V\00 \00\00\00\00\00\00\00\00\00\00\00 \00\00
\00\00\00\00\00\00\00\00*\B7\00\B1\00\00\00\00\00 \00 \00 \00\00
\00\00\00=\00\00\00\00\00!=)\A2\00\B2\00h)p\B6\00\B6\00\84\A7\FF\E5\B1\00\00\00\00 \00\00\00
\00\FD\00\00\F9\00\00\00\00


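Anyone familiar with Java should see at a glance that this is a compiled class file (if you can't read it, you can decompile it with jd-gui). Rename it to bin.class and run it, and you get: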


Exception in thread "main" java.lang.NoClassDefFoundError: bin (wrong name: i)
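

As an aside, the class-file guess can be confirmed from the first eight bytes of the dump. A class file starts with the magic number CAFEBABE, followed by the minor and major version as big-endian 16-bit integers. The sketch below assumes the dump shows those bytes faithfully (the `2` after the three `\00` is the byte 0x32); `read_class_header` is my own helper name.

```python
import struct

# Major version -> Java release, for the versions relevant here.
JAVA_RELEASE = {45: "1.1", 46: "1.2", 47: "1.3", 48: "1.4",
                49: "5", 50: "6", 51: "7", 52: "8"}


def read_class_header(data):
    """Return (major, minor) from the first 8 bytes of a Java class file."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes")
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor


if __name__ == "__main__":
    # First 8 bytes as they appear in the dump: \CA\FE\BA\BE\00\00\00 then "2".
    header = b"\xca\xfe\xba\xbe\x00\x00\x002"
    major, minor = read_class_header(header)
    print(major, minor)         # 50 0
    print(JAVA_RELEASE[major])  # 6
```

Major version 50 corresponds to Java SE 6, so any reasonably recent JVM can run the file once it has the right name.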


Rename it again to i.class, run it, and you get:


www.i.u-tokyo.ac.jp/fun/hikari-loveletter


Ooh! It feels like we're getting close to the truth.
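

For the curious: the dump contains the scrambled constant `w-aurlwtcniewo./-t.kjhltiypioe.o/kvru.fae` next to the names `charAt` and `print`, which suggests the class prints the characters one at a time in a permuted order. The sketch below is my reconstruction in Python, not a decompilation. The stride of 6 is not legible in the dump; it is an assumption, chosen because stepping through the string 6 positions at a time (wrapping around its length, 41) reproduces the printed URL.

```python
# String constant as it appears in the class-file dump.
SCRAMBLED = "w-aurlwtcniewo./-t.kjhltiypioe.o/kvru.fae"

# Assumption: inferred by matching the constant against the program's output.
STRIDE = 6


def unscramble(text, stride):
    """Read `text` by jumping `stride` positions at a time, wrapping around."""
    n = len(text)
    return "".join(text[(i * stride) % n] for i in range(n))


if __name__ == "__main__":
    print(unscramble(SCRAMBLED, STRIDE))
    # www.i.u-tokyo.ac.jp/fun/hikari-loveletter
```

Because 41 is prime, any stride from 1 to 40 visits every position exactly once, so a scheme like this never loses or repeats a character.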


[Image 3]


information is conceived in letters and sounds,
and wherever sentiment and intention are found -
poetry is formed when feelings rest within consonants and vowels,
melody is born when emotion in harmony with each soul resounds.

our desire to communicate is found at the core of the information we impart,
likes or dislikes, black or white, 0 or 1, science or art -
from our eyes and our ears - our whole bodies - each part,
and, most assuredly, from each one of our hearts.


Whoa, how artsy... Click next and there is a song, ヒカリラブレター (hikari loveletter), apparently sung by the girl on the poster, and it's actually pretty nice...


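I listened to it several times, treating it as listening practice... Well, one passage really does sound out of place: from 2:58 to 3:01 it is clearly not Japanese, and it sounds machine-synthesized. I cut that part out with Audacity to analyze it.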


[Image 4]


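After some fumbling around, the finding turned out to be simple: just play the clip in reverse. What you hear should be xxxくれて、ありがとう (my listening is poor; I honestly can't make it out...).


Some say that slowed down it can be heard as "聞いてくれてありがとう" (thank you for listening). I hear it as "いてくれて、ありがとう" (thank you for being here), though, which really hits home~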
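

Audacity's reverse effect is all you need, but for reference, reversing an uncompressed WAV clip can also be sketched with Python's standard `wave` module. This assumes the 2:58 - 3:01 segment has already been exported as a PCM WAV file; `reverse_frames` and `reverse_wav` are names I made up.

```python
import wave


def reverse_frames(raw, frame_size):
    """Reverse PCM audio frame by frame.

    A frame holds one sample for every channel, so its bytes must stay
    together; only the order of the frames is flipped.
    """
    if frame_size <= 0 or len(raw) % frame_size:
        raise ValueError("data length must be a multiple of the frame size")
    frames = [raw[i:i + frame_size] for i in range(0, len(raw), frame_size)]
    return b"".join(reversed(frames))


def reverse_wav(src, dst):
    """Write a time-reversed copy of the WAV file `src` to `dst`."""
    with wave.open(src, "rb") as fin:
        params = fin.getparams()
        raw = fin.readframes(fin.getnframes())
    frame_size = params.sampwidth * params.nchannels
    with wave.open(dst, "wb") as fout:
        fout.setparams(params)
        fout.writeframes(reverse_frames(raw, frame_size))


if __name__ == "__main__":
    # Three made-up 16-bit mono frames.
    print(reverse_frames(b"\x01\x00\x02\x00\x03\x00", 2))
    # b'\x03\x00\x02\x00\x01\x00'
```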


====


Next, let's play with the poster from the year before last.


[Image 5]


The girl is as cute as ever.


100011101110010011000010110010001110101011000010111010001100101001000000101001101100011011010000110111101101111
011011000010000001101111011001100010000001001001011011100110011001101111011100100110110101100001011101000110100
101101111011011100010000001010011011000110110100101100101011011100110001101100101001000000110000101101110011001
000010000001010100011001010110001101101000011011100110111101101100011011110110011101111001001011000010000001010
100011010000110010100100000010101010110111001101001011101100110010101110010011100110110100101110100011110010010
000001101111011001100010000001010100011011110110101101111001011011111000111011100100110000101100100011101010110
000101110100011001010010000001010011011000110110100001101111011011110110110000100000011011110110011000100000010
010010110111001100110011011110111001001101101011000010111010001101001011011110110111000100000010100110110001101
101001011001010110111001100011011001010010000001100001011011100110010000100000010101000110010101100011011010000
110111001101111011011000110111101100111011110010010110000100000010101000110100001100101001000000101010101101110
011010010111011001100101011100100111001101101001011101000111100100100000011011110110011000100000010101000110111
101101011011110010110111110001110111001001100001011001000111010101100001011101000110010100100000010100110110001
101101000011011110110111101101100001000000110111101100110001000000100100101101110011001100110111101110010011011
010110000101110100011010010110111101101110001000000101001101100011011010010110010101101110011000110110010100100
000011000010110111001100100001000000101010001100101011000110110100001101110011011110110110001101111011001110111
100100101100001000000101010001101000011001010010000001010101011011100110100101110110011001010111001001110011011
010010111010001111001001000000110111101100110001000000101010001101111011010110111100101101111100011101110010011
000010110010001110101011000010111010001100101001000000101001101100011011010000110111101101111011011000010000001
101111011001100010000001001001011011100110011001101111011100100110110101100001011101000110100101101111011011100
010000001010011011000110110100101100101011011100110001101100101001000000110000101101110011001000010000001010100
011001010110001101101000011011100110111101101100011011110110011101111001001011000010000001010100011010000110010
100100000010101010110111001101001011101100110010101110010011100110110100101110100011110010010000001101111011001
100010000001010100011011110110100011101110010011000010110010001110101011000010111010001100101001000000101001101
100011011010000110111101101111011011000010000001101111011001100010000001001001011011100110011001101111011100100
110110101100001011101000110100101101111011011100010000001010011011000110110100101100101011011100110001101100101
001000000110000101101110011001000010000001010100011001010110001101101000011011100110111101101100011011110110011
10111100100101100001000000101010001101000011001010010000001010101011011100110100101110110011001010111001001110011011


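27 rows in total: 111 digits in each row except the last, which has 116, for 26 × 111 + 116 = 3002 digits. I tried trimming the tail and converting to ASCII and to a binary file; both failed.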


I tried turning it into an image, which looks roughly like this.


[Image 6]
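

How the image was produced is not recorded, so the following is only one possible way to do it: write the rows out as a plain-text PBM (P1) bitmap, one pixel per digit, which most image viewers can open. `bits_to_pbm` is a hypothetical helper; in PBM a 1 is a black pixel and a 0 a white one.

```python
def bits_to_pbm(rows):
    """Render rows of '0'/'1' as a plain PBM (P1) image, one pixel per digit.

    Rows shorter than the widest one are padded with 0 (white). Raster
    lines are wrapped at 70 characters, as the format recommends.
    """
    if not rows:
        raise ValueError("need at least one row")
    width = max(len(row) for row in rows)
    lines = ["P1", f"{width} {len(rows)}"]
    for row in rows:
        padded = row.ljust(width, "0")
        lines.extend(padded[i:i + 70] for i in range(0, width, 70))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up 3x2 sample; save the real output as e.g. poster.pbm to view it.
    print(bits_to_pbm(["101", "01"]), end="")
```

Re-wrapping the full digit string at different row widths before rendering is a cheap way to hunt for periodic structure: when the width matches the period, the repeats line up as vertical stripes.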


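Looking closely, something seems to be repeated 4 times... So I ran a check with KMP and found that characters 1 through 623 are repeated 3 times, while the fourth repetition seems to have some noise at the end. Never mind that; first pull out characters 1-623.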
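

The exact KMP check is not shown, so here is a generic sketch of the idea: use the KMP prefix function to list every position where a leading chunk of the string occurs again. Evenly spaced hits reveal the period. `prefix_function` and `find_all` are my own names, and the sample data is made up.

```python
def prefix_function(s):
    """KMP prefix function: pi[i] is the length of the longest proper
    prefix of s[:i + 1] that is also a suffix of it."""
    pi = [0] * len(s)
    k = 0
    for i in range(1, len(s)):
        while k and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    return pi


def find_all(pattern, text, sep="#"):
    """Return the 0-based start offsets of every occurrence of `pattern`
    in `text`. `sep` must not occur in either string."""
    if not pattern:
        return []
    m = len(pattern)
    pi = prefix_function(pattern + sep + text)
    # A full match ending at combined index i starts at text offset i - 2m.
    return [i - 2 * m for i, value in enumerate(pi) if i > m and value == m]


if __name__ == "__main__":
    unit = "1101000"               # made-up 7-digit unit
    text = unit * 3 + "110"        # three clean copies plus a broken tail
    print(find_all(unit, text))    # [0, 7, 14]
```

On the poster data, searching for the first few dozen digits should return offsets spaced 623 apart, which is how the length of the repeated unit shows up.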


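623 is not a multiple of 8, but 623 = 7 × 89. Since the first bit of an ASCII code is always 0, I grouped every 7 bits into one character and tried printing that. Nothing came of it...


Wait: if the first bit is always 0, and every position in this string that is a multiple of 8 seems to hold a 0, then... could it be that a 0 should be appended at the end and the bytes read as little-endian this time? I tried it; still nothing...


Wait again: if every position that is a multiple of 8 holds a 0, it could also be that a 0 should be prepended before the first bit and the bytes read as big-endian. This time the result came out: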


[jffifa@jffifa acm]$ ./t
Graduate School of Information Science and Technology, The University of Tokyo
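

The source of `./t` is not included, so the following is a sketch of the decoding step rather than the original program. Put a 0 in front, then read 8 digits per character with the most significant bit first. The sentence is 78 characters, or 624 bits, and the poster simply omits the very first 0, which is where the odd length of 623 comes from. To keep the example short, the 623-digit unit is rebuilt from the known answer instead of being pasted in; `decode_with_leading_zero` is a made-up name.

```python
MESSAGE = ("Graduate School of Information Science and Technology, "
           "The University of Tokyo")


def decode_with_leading_zero(bits):
    """Prepend the missing 0 and decode 8 digits per ASCII character."""
    padded = "0" + bits
    if len(padded) % 8:
        raise ValueError("length must be one less than a multiple of 8")
    return int(padded, 2).to_bytes(len(padded) // 8, "big").decode("ascii")


if __name__ == "__main__":
    # Encode the sentence as 8-bit ASCII and drop the first digit.
    unit = "".join(f"{ord(c):08b}" for c in MESSAGE)[1:]
    print(len(unit))   # 623
    print(unit[:39])   # 100011101110010011000010110010001110101
    print(decode_with_leading_zero(unit))
```

The first 39 digits printed here match the start of the poster's first row, and this also explains why the 7-bit attempt failed: only the first character is short by one bit, while every later character occupies a full 8 bits.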


Why isn't it the girl's contact info...


By the way, the girl's outfit is called a "十二单" (jūnihitoe), an extremely elaborate traditional garment. Look it up if you're interested.


====


This year's University of Tokyo admissions poster...


A project for it has already been set up on GitHub: https://github.com/NaokiKuzumi/istposter2013


[Image 7]





