In July 2019, a colleague published a post, on the Sysdream's blog, about an NF&A2P** certified alarm system. If you want more details about this system or the context of this post, check the blog post. Here is a perfect example about why are we saying "Never roll your own crypto". In this blog post, I explain how I break this trivial XOR-based encryption. Then, I will give some recommendations to avoid a case like this one.
This blog post has been written more than 2 years ago. The publication has been delayed due to "political" reasons. It has been published without informing Sysdream, Nicolas Chatelin nor the constructor. The exploitation code has been published in the original paper by Nicolas.
The part where I contributed is the following:
The box connected to a remote server (located outside of France), using the TCP port 6000. No SSL/TLS was involved, but, somehow, the communications seemed to be encrypted.
TL;DR: they rolled their own crypto.
After powering on the box, we could observe that:
- The client sent an 11-bytes (random ?) packet after connecting (HELLO packet);
- No block cipher seemed to be used, because packets size varies and are not aligned to any particular even block size.
After a few hours of research, we found that XOR-based encryption was being used.
We came up with the following Python code:
def keystream(clientid): index = 0 while True: if (index % 10) < 5: j = index % 10 else: j = (index % 10)+1 index += 1 yield clientid[5] ^ clientid[j % len(clientid)] def alarm_xor(clientid, encdata): return ''.join(chr(c ^ k) for c, k in zip(encdata, keystream(clientid)))
The encryption mechanism was generating a keystream based on the 6th byte of the client HELLO packet XORed with the rest (bytes 1 to 5 and 7 to 11) of the HELLO packet.
Then, another XOR was performed between the data and the keystream (we purposely omitted some parts of the cryptographic algorithm).
This is not so explicit about how we found this algorithm.
Initial statements
Nicolas comes to me with these statements:
He had the following strange TCP communication (in a JSON file here):
::uml:: format="svg" alt="Original communication between the alarm and the company" Alarm -> Server: fb 01 36 0c 6b 2b 10 12 28 b9 a7 Server -> Alarm: bf 46 71 5e 79 5e 4d 5b fd df b3 1c 48 15 25 7c Alarm -> Server: ec 1a 31 75 25 5c 50 70 e6 e9 a2 14 35 16 78 0a 0e 3a a2 \n\ bf e7 1a 2e 13 72 0a 7a 2a 9f 86 Server -> Alarm: 05 33 be df 85 69 5e 62 13 68 07 0e 98 Server -> Alarm: b0 e0 06 52 57 25 55 6a 66 e1 ff b9 45 73 19 68 09 0c 33 \n\ a3 a5 dd 20 Alarm -> Server: ec 1a 31 74 15 78 7a 46 c1 df ee 02 48 74 05 69 09 33 bb \n\ 81 da Server -> Alarm: 21 16 6c 7c 5c 77 d6 e9 a6 43 7e 42 09 55 5f 6c e1 b2 dd \n\ 20 Alarm -> Server: 16 2c 0b 13 6e 7a 40 d7 df 83 14 35 77 0c 6e 7e 5c db dc \n\ ae 6e 54 66 07 69 78 4f be bd e8 1b 2a 1e 70 08 0e 33 a1 \n\ b8 e2 1b 5e 0b 16 0a 17 37 bc bc 8f 7c 2c 14 1f 6d 08 2d \n\ a3 a2 e1 13 31 14 6e 0b 15 3a a2 bc fc 03 10 2d Server -> Alarm: 21 15 6c 7c 5c 77 d0 e0 bf 49 76 19 68 0b 0c 31 bc bd e7 \n\ 12 33 17 79 03 17 33 a2 ba fc 18 24 14 79 0d 15 35 a7 a0 \n\ e1 18 2b 15 71 0b 01 2a 9f 86 Alarm -> Server: 7c 09 15 50 c7 cf 93 6f 4e 74 7e 13 6b 42 c5 a5 dd 20 Alarm -> Server: 21 16 6c 68 5c 6d f6 c9 a6 4f 73 53 7e 13 7e 46 c6 ce 9c \n\ 65 5e 6c 1f 74 72 2a 9f 86 Server -> Alarm: ec 1b 31 74 15 78 7a 46 c1 df ee 27 17 Server -> Alarm: 1b 73 17 6a 66 fc e8 93 45 70 4a 21 55 5d 3d ba af e2 1a \n\ 2c 04 63 12 34 09 Alarm -> Server: ec 19 31 74 15 78 7a 46 c1 df ee 02 24 17 70 17 09 33 a2 \n\ bc e0 1a 31 64 0d 7f 66 4c d9 a5 dd 20 Server -> Alarm: ae b8 fc 79 78 49 24 78 56 6e ff ed be 4e 23 0f 63 09 0a \n\ 20 b1 a5 dd 20 Alarm -> Server: 21 13 6c 68 6c 40 d1 c9 83 79 23 0f 03 0b 0e 2f a2 bc e0 \n\ 1a 2d 17 6c 78 74 47 cd c3 9b 03 10 2d Server -> Alarm: 21 12 6c 68 5c 6d f6 cf bf 47 70 46 2e 5f 07 2b b1 be e0 \n\ 1b 3e 04 69 36 33 Alarm -> Server: 7c 0e 15 50 c7 cf 93 6f 4e 74 7e 13 7a 33 a5 a0 e0 1a 2d \n\ 17 70 0b 15 40 df c8 8f 65 56 0e 4d 31 Alarm -> Server: 05 31 be df b5 44 79 66 2c 5a 4b 6e ac a4 e1 12 2c 10 79 \n\ 0b 0a 34 a2 bf e4 18 2c 64 6c 0c 0e 34 a5 ac e7 1d 3d 10 \n\ 77 0c 10 0e 98 Server -> Alarm: 3f a0 a0 83 7f 5e 64 05 68 6a 3d ba c3 9b 03 10 2d Server -> Alarm: 7c 0d 15 50 f7 e2 b4 69 72 4a 2d 5a 57 67 ac a4 f3 18 2c \n\ 04 63 12 34 09 Alarm -> Server: b0 e6 06 4e 72 03 78 7c 50 c1 b2 f8 13 2d 17 6c 0b 09 33 \n\ a2 bc e0 06 5e 6a 04 64 76 48 bb 81 da Server -> Alarm: ae bb fc 79 78 49 24 78 56 6e ff ed be 4e 23 0f 63 09 09 \n\ 32 b1 af f9 27 17 Alarm -> Server: 16 2a 0b 13 6e 7a 40 d7 df 83 14 35 1e 70 0b 15 33 a2 bc \n\ e0 1a 2d 0b 03 76 7d 5c dd c7 f9 27 17 ::end-uml::
He also observed that a XOR encryption was used. He identified this statement by sending messages to the server as following (JSON here):
What he sent as first message | What he received |
---|---|
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 |
olly9etXoSc6U2eG |
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 |
olly9etXoSc6U2eG |
\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a |
jhk\x7f8fvUc\\f2R4dD |
\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00 |
omly9etXoSc7U2eG |
\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00 |
only9etXoSc4U2eG |
\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 |
nlly9etXoSb6U2eG |
\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 |
mlly9etXoSa6U2eG |
\x03\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01 |
mlly9etXoSa6U2eG |
\x02\x02\x02\x02\x02\x02\x02\x02\x02\x02\x02 |
olly9etXoSc6U2eG |
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01 |
olly9etXoRc6U2eG |
As we can see in the previous table:
- The first message received seems to be encrypted using the first message. This can be used to "authenticate" the server. I'm pretty sure it's not a good idea to proceed like that… 😋
- When we send twice the same probe, we receive the same answer. This is one more clue about that the first message is a key to encrypt the first answer.
- Some chars are just XOR-ed (see the 4th, 5th, 6th, 7th probe).
- Some vectors have an unexpected behavior (see probe #3).
Decryption of the first message
The first step, to understand the complete encryption algorithm, is to understand how the first message is encrypted. To do so, we searched to map the relation between input bytes and output ones. Thus, we asked encryption using the following keys (JSON here):
Encryption key | Output | Observations |
---|---|---|
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 |
olly9etXoSc6U2eG |
No bytes are changed |
\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 |
nlly9etXoSb6U2eG |
The 1st and the 11th bytes are XOR-ed by \x01 |
\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00 |
omly9etXoSc7U2eG |
The 2nd and the 12th bytes are XOR-ed by \x01 |
\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00 |
olmy9etXoSc6T2eG |
The 3rd and the 13th bytes are XOR-ed by \x01 |
\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00 |
ollx9etXoSc6U3eG |
The 4th and the 14th bytes are XOR-ed by \x01 |
\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00 |
olly8etXoSc6U2dG |
The 5th and the 15th bytes are XOR-ed by \x01 |
\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00 |
nmmx8duYnRb7T3dF |
All bytes are XOR-ed by \x01 |
\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00 |
olly9dtXoSc6U2eF |
The 6th and the 16th bytes are XOR-ed by \x01 |
\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00 |
olly9euXoSc6U2eG |
The 7th byte is XOR-ed by \x01 |
\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00 |
olly9etYoSc6U2eG |
The 8th byte is XOR-ed by \x01 |
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00 |
olly9etXnSc6U2eG |
The 9th byte is XOR-ed by \x01 |
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01 |
olly9etXoRc6U2eG |
The 10th byte is XOR-ed by \x01 |
To have another point of vue of these results, we expressed them as follows:
Output byte (affected one) | Input bytes (influencing one) |
---|---|
#1 | #1 and #6 |
#2 | #2 and #6 |
#3 | #3 and #6 |
#4 | #4 and #6 |
#5 | #5 and #6 |
#6 | #7 and #6 |
#7 | #8 and #6 |
#8 | #9 and #6 |
#9 | #10 and #6 |
#10 | #11 and #6 |
#11 | #1 and #6 |
#12 | #2 and #6 |
#13 | #3 and #6 |
#14 | #4 and #6 |
#15 | #5 and #6 |
#16 | #7 and #6 |
Thus, we observed the following statements:
- All output bytes are XOR-ed by the 6th one in input.
- Putting the 6th input byte aside, the sequence of input bytes, used to XOR each output ones, is composed of the 1st to the 5th then from the 7th to the 11th.
- The encryption sequence is circular and is used to encrypt all the messages. Thus, the XOR algorithm is reusing the key (and you should never forget the "One Time" in "One Time Pad").
To verify our observations, we created the following script.
import random
TESTS_QT = 1000
KEY_LENGTH = 11
CLEARTEXT = "olly9etXoSc6U2eG"
def request(key):
"""Request the encrypted text using our key"""
assert len(key) == KEY_LENGTH
# ...
return encrypted
def keystream(key):
"""Generate our keystream"""
assert len(key) == KEY_LENGTH
index = 0
while True:
# We avoid to reuse the 6th byte
if index < 5:
real_position = index
else:
real_position = index + 1
yield key[5] ^ key[real_position]
index += 1
index %= KEY_LENGTH - 1
def encrypt(key):
"""Encrypt the cleartext using our key"""
assert len(key) == KEY_LENGTH
return "".join(
chr(c ^ k)
for c, k in zip(CLEARTEXT, keystream(key))
)
for i in range(TESTS_QT):
random_key = random.getrandbits(KEY_LENGTH)
assert request(random_key) == encrypt(random_key)
And… it works.
Thus, we have an idea about how the encryption algorithm is working. But one detail persists: how are chained messages?
Message chaining
How are chained messages?
Some naive possibilities are:
- It regenerates the same keystream for each message. Thus, the key scheduling starts from 0 for each key.
- The keystream is generated for the entire communication. Thus, if a message stops using the keystream at the 3rd byte, the 1st byte of the following message will use the 4th. The key may be changed in the communication.
Let's verify these ideas!
The first idea is that the keystream is reset for each message.
from XXX import keystream, KEY_LENGTH
import json
from pwn import hexdump
messages = json.load(open("sample_origin.json"))
KEY = bytes.fromhex(messages[0]["message"])
assert len(KEY) == KEY_LENGTH
def decrypt(message):
"""Decrypt a cipher text using the key"""
return "".join(
chr(c ^ k)
for c, k in zip(message, keystream(KEY))
)
for m in messages[2:]:
print("From", m["sender"])
print(hexdump(decrypt(bytes.fromhex(m["message"]))))
print()
The first message is well decrypted:
From A
00000000 3c 30 2c 52 65 67 69 73 74 65 72 3e 28 31 38 31 │<0,R│egis│ter>│(181│
00000010 37 39 30 33 37 30 33 34 32 31 43 29 0d 0a │7903│7034│21C)│··│
0000001e
So, the protocol behind the crypto is made of ASCII chars. However, the second message of the global conversation seems to not been decrypted:
From S
00000000 c3 95 19 c2 a3 c3 b8 c3 85 52 67 61 c2 81 c3 a4 │····│····│·Rga│····│
00000010 c3 97 24 c2 85 │··$·│·│
00000015
Moreover, the second message sent by the alarm has successfully decrypted.
From A
00000000 3c 30 2c 53 55 43 43 45 53 53 3e 28 55 53 45 52 │<0,S│UCCE│SS>(│USER│
00000010 30 30 29 0d 0a │00)·│·│
00000015
Thus, we can think that there is a different keystream for each direction of communication and that we have not (yet) the server's keystream. However, this idea is invalidated by the decryption of the third message from the alarm.
From A
00000000 c3 86 06 16 34 2e 41 79 c3 94 4d 0f c3 84 1f 6a │····│4.Ay│··M·│···j│
00000010 2b 2e 45 65 c3 98 4e 22 c2 be 7e 7b 20 29 43 76 │+.Ee│··N"│··~{│ )Cv│
00000020 c2 bd 2f 64 c3 8b 00 03 57 48 35 0a c2 a2 2a 6e │··/d│····│WH5·│··*n│
00000030 c3 8b 74 16 31 4a 2c 0e c2 bf 2e 03 c2 ac 06 09 │··t·│1J,·│··.·│····│
00000040 38 2d 33 14 c2 a0 30 6d c3 83 1b 09 49 4b 2e 03 │8-3·│··0m│····│IK.·│
00000050 c2 a1 2e 70 c3 93 3a 30 │··.p│··:0│
Regarding the two messages we were able to decrypt, we may be on the right path.
The second idea we had was that the keystream is same during all the communication.
from XXX import keystream, KEY_LENGTH
import json
from pwn import hexdump
messages = json.load(open("sample_origin.json"))
KEY = bytes.fromhex(messages[0]["message"])
assert len(KEY) == KEY_LENGTH
stream = keystream(KEY)
def decrypt(message):
"""Decrypt a cipher text using the key"""
return "".join(
chr(c ^ k)
for c, k in zip(message, stream)
)
for m in messages[2:]:
print("From", m["sender"])
print(hexdump(decrypt(bytes.fromhex(m["message"]))))
print()
Using the previous python code, we are able to decrypt the first message. No more.
From A
00000000 3c 30 2c 52 65 67 69 73 74 65 72 3e 28 31 38 31 │<0,R│egis│ter>│(181│
00000010 37 39 30 33 37 30 33 34 32 31 43 29 0d 0a │7903│7034│21C)│··│
0000001e
From S
00000000 c3 95 19 c2 a3 c3 b8 c3 85 52 67 61 c2 81 c3 a4 │····│····│·Rga│····│
00000010 c3 97 24 c2 85 │··$·│·│
00000015
From S
00000000 c2 97 c2 a0 3d 6b 54 c2 b7 c3 99 c2 ba 4c c3 bc │····│=kT·│····│·L··│
00000010 c3 98 c3 b9 7e 4a 1a c3 ba c2 85 c3 9c 19 c2 be │····│~J··│····│····│
00000020 c2 82 c2 9d 1b │····│·│
00000025
From A
00000000 c3 95 19 c2 a3 c3 b8 c3 85 52 67 61 c2 81 c3 a4 │····│····│·Rga│····│
00000010 c3 97 01 c3 9a c3 b8 c3 95 43 14 14 c3 bb c2 ba │····│····│·C··│····│
00000020 c3 a3 │··│
00000022
[...]
Ok, this does not work.
When we reuse the same keystream for each message, the first two messages sent by the alarm has been well decrypted. Why?
We observe that the length of the first message (=0x1e
= 30 chars) is
congruent with the length of a keystream cycle (=10 chars). Thus, what if there
was one keystream used for messages sent from the alarm and another for the
messages sent by the server? After all, TCP allows full-duplex communications,
thus the server and the alarm have the possibility to send a message at
the same time. If this happens, and we were using the same keystream for all the
communication, the alarm and the server would be unable to decrypt the rest of
the communication.
from XXX import keystream, KEY_LENGTH
import json
from pwn import hexdump
messages = json.load(open("sample_origin.json"))
KEY = bytes.fromhex(messages[0]["message"])
assert len(KEY) == KEY_LENGTH
streams = {}
def decrypt(message, sender):
"""Decrypt a cipher text using the key"""
global streams
if streams.get(sender) is None:
streams[sender] = keystream(KEY)
stream = streams[sender]
return "".join(
chr(c ^ k)
for c, k in zip(message, stream)
)
for m in messages[1:]:
print("From", m["sender"])
print(hexdump(decrypt(bytes.fromhex(m["message"]), m["sender"])))
print()
Our python code decrypts the entire communication. Achievement completed! The entire decrypted communication has been pushed into this JSON file.
From S
00000000 6f 6c 6c 79 39 65 74 58 6f 53 63 36 55 32 65 47 │olly│9etX│oSc6│U2eG│
00000010
From A
00000000 3c 30 2c 52 65 67 69 73 74 65 72 3e 28 31 38 31 │<0,R│egis│ter>│(181│
00000010 37 39 30 33 37 30 33 34 32 31 43 29 0d 0a │7903│7034│21C)│··│
0000001e
From S
00000000 3c 30 2c 53 55 43 43 45 53 53 3e 0d 0a │<0,S│UCCE│SS>·│·│
0000000d
From S
00000000 3c 30 2c 4f 70 65 6e 53 65 73 73 69 6f 6e 3e 28 │<0,O│penS│essi│on>(│
00000010 32 35 30 31 29 0d 0a │2501│)··│
00000017
From A
00000000 3c 30 2c 53 55 43 43 45 53 53 3e 28 55 53 45 52 │<0,S│UCCE│SS>(│USER│
00000010 30 30 29 0d 0a │00)·│·│
00000015
From S
00000000 3c 31 2c 47 65 74 44 65 76 69 63 65 49 6e 66 6f │<1,G│etDe│vice│Info│
00000010 73 3e 0d 0a │s>··│
00000014
Conclusion
Never roll your own cryptography!
Definitely, never use your own cryptography. Neither your own algorithms nor your own implementation. Several algorithms and their implementations are designed by mathematicians. Moreover, even if some algorithms require lots of computation resource, lightweight crypto exists. The NIST is (when I write these lines) standardizing some lightweight crypto algorithms.
Moreover, if you want to use an algorithm XOR-ing data with a keystream, please, NEVER reuse the keystream. Here, our observations lead us to the encryption scheme quickly. If we had been sticking, we would use this vulnerability to try to decrypt the communication.
Thanks
- Nicolas Chatelain, to submit me this protocol 😄
Highlight
Some time ago, I decided to highlight "lesser known" artists on my README files and blog posts. This is a gift for curious people who find these texts, a digital time capsule of sorts!
For this project, I would like to highlight SARĀB.
SARĀB is a syrian-french arabic jazz rock band. I love its music and their featuring.