Problem:
In this kata you have to write a Morse code decoder for wired electrical telegraph.
Electric telegraph is operated on a 2-wire line with a key that, when pressed, connects the wires together, which can be detected on a remote station. The Morse code encodes every character being transmitted as a sequence of “dots” (short presses on the key) and “dashes” (long presses on the key).
When transmitting the Morse code, the international standard specifies that:
“Dot” – is 1 time unit long.
“Dash” – is 3 time units long.
Pause between dots and dashes in a character – is 1 time unit long.
Pause between characters inside a word – is 3 time units long.
Pause between words – is 7 time units long.
However, the standard does not specify how long that “time unit” is. And in fact different operators would transmit at different speed. An amateur person may need a few seconds to transmit a single character, a skilled professional can transmit 60 words per minute, and robotic transmitters may go way faster.
For this kata we assume the message receiving is performed automatically by the hardware that checks the line periodically, and if the line is connected (the key at the remote station is down), 1 is recorded, and if the line is not connected (remote key is up), 0 is recorded. After the message is fully received, it gets to you for decoding as a string containing only symbols 0 and 1.
For example, the message HEY JUDE, that is ···· · −·−− ·−−− ··− −·· · may be received as follows:
1100110011001100000011000000111111001100111111001111110000000000000011001111110011111100111111000000110011001111110000001111110011001100000011
As you may see, this transmission is perfectly accurate according to the standard, and the hardware sampled the line exactly two times per “dot”.
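As an aside, the timing rules and the sampling can be sketched in the reverse direction, turning Morse text into the 0/1 samples a receiver would record. This is only an illustration: the function name `encode_bits` and the `samples_per_unit` parameter are my own assumptions and are not part of the kata.

```python
def encode_bits(morse, samples_per_unit=1):
    """Turn Morse text ('.', '-', one space between characters, three
    between words) into a 0/1 sample string using the standard timing.

    Hypothetical helper for illustration; not part of the kata.
    """
    on = "1" * samples_per_unit    # line connected for one time unit
    off = "0" * samples_per_unit   # line open for one time unit
    words = []
    for word in morse.strip().split("   "):
        chars = []
        for char in word.split(" "):
            # dot = 1 unit on, dash = 3 units on
            signs = [on * 3 if sign == "-" else on for sign in char]
            # 1 unit off between signs inside a character
            chars.append(off.join(signs))
        # 3 units off between characters inside a word
        words.append((off * 3).join(chars))
    # 7 units off between words
    return (off * 7).join(words)
```

With `samples_per_unit=2` a dot becomes `11` and a dash becomes `111111`, which is the shape of the example transmission above.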
That said, your task is to implement two functions:
Function decodeBits(bits), that should find out the transmission rate of the message, correctly decode the message to dots ., dashes - and spaces (one between characters, three between words) and return those as a string. Note that some extra 0’s may naturally occur at the beginning and the end of a message, make sure to ignore them. Also if you have trouble discerning if the particular sequence of 1’s is a dot or a dash, assume it’s a dot.
Function decodeMorse(morseCode), that would take the output of the previous function and return a human-readable string.
NOTE: For coding purposes you have to use ASCII characters . and -, not Unicode characters.
The Morse code table is preloaded for you as MORSE_CODE dictionary; in Java MorseCode class is provided; in Haskell the codes are in a Map String String and can be accessed like this: morseCodes ! ".--"; in Racket MORSE-CODE and can be accessed like this: (hash-ref MORSE-CODE ".--"). Feel free to use this preload.
All the test strings would be valid to the point that they could be reliably decoded as described above, so you may skip checking for errors and exceptions, just do your best in figuring out what the message is!
Good luck!
After you master this kata, you may try to Decode the Morse code, for real.
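My approach:
This kata is called Decode the Morse code, advanced, and it is the follow-up to the earlier kata Decode the Morse code. The earlier kata asks us to translate "···· · −·−− ·−−− ··− −·· ·" into HEY JUDE. This one asks us to translate a string of 0s and 1s into readable text. The problem suggests writing two functions. The first, decodeBits(), translates
1100110011001100000011000000111111001100111111001111110000000000000011001111110011111100111111000000110011001111110000001111110011001100000011
into "···· · −·−− ·−−− ··− −·· ·". The second, decodeMorse(), translates "···· · −·−− ·−−− ··− −·· ·" into HEY JUDE, and its code can be copied straight from the earlier kata.
As the problem states, one unit of 1s is a "·", three units of 1s is a "−", one unit of 0s is the gap between the "·" and "−" signs inside a character, three units of 0s is the gap between characters inside a word, and seven units of 0s is the gap between words.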
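The first question is how many bits make up one unit. After stripping the leading and trailing 0s, count the 1s at the very beginning of the string: that run is either one unit long (a dot) or three units long (a dash). Assume for now that it is one unit and call its length unit. Then walk through the whole string. If no run of consecutive 0s or 1s is shorter than unit, then unit really is the length of one unit; otherwise lower unit to the length of the shorter run.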
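The search for the unit boils down to finding the shortest run of identical bits. Here is a minimal sketch of that idea using `itertools.groupby`; the helper name `find_unit` is hypothetical, and the fallback of 1 for an all-zero input is my own assumption, since the kata only promises valid messages.

```python
from itertools import groupby

def find_unit(bits):
    """Return the length of the shortest run of identical bits,
    ignoring leading and trailing zeros.

    Hypothetical helper; falls back to 1 if nothing is left after stripping.
    """
    bits = bits.strip("0")
    # groupby yields one group per run of equal characters
    return min((len(list(run)) for _, run in groupby(bits)), default=1)
```

For example, `find_unit("0011001100")` sees the runs `11`, `00`, `11` and returns 2, while `find_unit("1110001")` returns 1 because of the single trailing `1`.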
Sometimes the input is ambiguous. For example, "111000111000111" could be "···" or "− − −". That is why the problem adds: Also if you have trouble discerning if the particular sequence of 1’s is a dot or a dash, assume it’s a dot.
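Taking the shortest run as the unit gives the "assume it's a dot" behavior without any special case: when every run has the same length, that length becomes the unit, so no pulse is three units long. A small sketch to check this, where `classify_pulses` is a hypothetical helper that assumes the input contains at least one 1:

```python
import re

def classify_pulses(bits):
    """Label each run of 1s as '.' or '-', using the shortest run as the unit.

    Hypothetical helper; assumes bits contains at least one '1'.
    """
    bits = bits.strip("0")
    runs = re.findall(r"1+|0+", bits)        # consecutive runs of 1s or 0s
    unit = min(len(run) for run in runs)     # shortest run = one time unit
    return ["-" if len(run) == 3 * unit else "."
            for run in runs if run[0] == "1"]
```

Here `classify_pulses("111000111000111")` yields three dots, because all runs are 3 bits long and the unit is therefore 3. In `classify_pulses("1110001")` the single `1` forces the unit down to 1, so the leading `111` is read as a dash.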
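Once the unit length is known, the rest is translating by the rules: split the bits into words, split each word into characters, split each character into signs, then translate the signs one by one and put the spaces back.
My solution: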
def decodeBits(bits):
    bits = bits.strip("0")
    # Length of the leading run of 1s: either 1 unit (dot) or 3 units (dash).
    unit = 0
    for bit in bits:
        if bit != "0":
            unit += 1
        else:
            break
    # Scan every run of identical bits; the shortest run is the real unit.
    count = 1
    for i in range(1, len(bits)):
        if bits[i] == bits[i-1]:
            count += 1
        else:
            if count < unit:
                unit = count
            count = 1
    # The last run never reaches the else branch above, so check it here.
    if count < unit:
        unit = count
    # Split into words, then characters, then signs.
    words = []
    for word in bits.split("0" * 7 * unit):
        characters = []
        for character in word.split("0" * 3 * unit):
            signs = ""
            for sign in character.split("0" * unit):
                signs += "-" if sign == "1" * 3 * unit else "."
            characters.append(signs)
        # One space between characters, three between words.
        words.append(" ".join(characters))
    return "   ".join(words)
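def decodeMorse(morse_code):
    # strip() returns a new string, so the result has to be assigned.
    morse_code = morse_code.strip()
    result = ""
    characters = morse_code.split(" ")
    for character in characters:
        if character != "":
            result += MORSE_CODE[character]
        else:
            # Empty pieces come from the extra spaces between words.
            result += " "
    # Collapse each run of spaces between words into a single space.
    return ' '.join(result.split())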
Most Clever:
After solving the kata, I looked at the solution with the most "clever" votes:
def decodeBits(bits):
    import re
    # remove trailing and leading 0's
    bits = bits.strip('0')
    # find the least amount of occurrences of either a 0 or 1, and that is the time hop
    time_unit = min(len(m) for m in re.findall(r'1+|0+', bits))
    # hop through the bits and translate to morse
    return bits[::time_unit].replace('111', '-').replace('1','.').replace('0000000','   ').replace('000',' ').replace('0','')
def decodeMorse(morseCode):
    return ' '.join(''.join(MORSE_CODE[l] for l in w.split()) for w in morseCode.split('   '))
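This solution follows the same idea as mine, but it is remarkably clean and concise. Each line of code solves one problem.
The second statement computes the unit length. In it,
re.findall(r'1+|0+', bits)
uses a regular expression to find every run of consecutive 1s or consecutive 0s in bits. The length of the shortest run is the unit.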
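To see what the regular expression returns, here is a small sketch on a short hand-made input. The helper name `split_runs` and the sample string are mine, chosen only for illustration.

```python
import re

def split_runs(bits):
    """Return the consecutive runs of 1s and 0s in bits, in order."""
    return re.findall(r"1+|0+", bits)

# Hand-made sample: "..", a character gap, then "-.", sampled twice per unit.
sample = "1100110000001111110011"
runs = split_runs(sample)
# runs == ['11', '00', '11', '000000', '111111', '00', '11']
time_unit = min(len(run) for run in runs)
# time_unit == 2
```

Because `1+|0+` matches greedily, each match is a whole run, so the shortest match length is the number of samples per time unit.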
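The third statement first keeps one sample per unit with bits[::time_unit], then applies replace() directly. It is simple yet exactly enough, and it avoids the trouble of splitting and rejoining, which is very elegant. Note that the replacements must go from the longest pattern to the shortest.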
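The ordering matters because a short pattern also matches inside a longer one. The sketch below compares the longest-first chain with a deliberately wrong shortest-first chain. Both function names are hypothetical, and the input is assumed to be already reduced to one sample per unit.

```python
def to_morse(samples):
    """Translate a one-sample-per-unit bit string, longest patterns first."""
    return (samples.replace("111", "-")       # dash: 3 units on
                   .replace("1", ".")         # dot: 1 unit on
                   .replace("0000000", "   ") # word gap: 7 units off
                   .replace("000", " ")       # character gap: 3 units off
                   .replace("0", ""))         # sign gap: 1 unit off

def to_morse_wrong_order(samples):
    """Same replacements with the short patterns first, for comparison only."""
    return (samples.replace("1", ".")
                   .replace("111", "-")
                   .replace("0", "")
                   .replace("000", " ")
                   .replace("0000000", "   "))

doubled = "11111100111111000000110011"  # "-- .." sampled twice per unit
samples = doubled[::2]                  # keeps one sample per unit: "1110111000101"
```

On this input `to_morse(samples)` gives `-- ..`, while `to_morse_wrong_order(samples)` gives `........`: every 1 has already become a dot before the dash pattern is tried, and every 0 has been deleted before the gap patterns are tried.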
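Summary:
The part of this kata that takes some thought is finding the unit length; the rest is not difficult. The two solutions follow essentially the same idea, but the Most Clever one is much simpler and more elegant.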