[Translation] Insulin Pumps, Decapped Chips, and Software-Defined Radio


Reverse engineering of an insulin pump for DIY therapy

About three years ago I heard about a website offering a bounty for something very close to my heart: reverse engineering the communications of an insulin pump. I had already helped build Loop, a system for my daughter that used a Medtronic pump, and had reverse engineered communications for it (most of the main Medtronic protocol had been decoded by Ben West using a Carelink USB device; I worked out the radio frequencies and did some additional work on the protocol). But the Medtronic pump had to be disconnected for a few hours during gymnastics. The tubeless design of the Omnipod pump looked interesting, and I had all the tools for the job.

The Omnipod system consists of a small disposable pump, called a pod (or module), and a controller called the PDM.



Since the PDM communicates with the pod by radio, and the pod has no controls of its own, the pod is entirely radio controlled. That makes full integration with Loop possible using only a RileyLink or a modified version of one.

James Wedding put up the bounty, which drew a lot of attention and, eventually, the right people to help with the work.

Software-defined radio


SDR is an awesome tool. It makes the hidden world of radio visible. Many different kinds of messages are constantly streaming through the air, and these tools let you poke around, look at the transmissions, and, after some work, decode the small bursts you see there. If you are looking for messages from a specific device, you need to know where to start the search. This is where the FCC's public documents come in handy.

The FCC documentation for the PDM, RBV-019, says that the device transmits in the 433 MHz band. With the SDR software tuned to the 433 MHz band, requesting a status from the PDM produces the following signals:



As I eventually learned, these two bright lines indicate a type of modulation called frequency shift keying, or FSK. This means that the frequency of the signal varies with the transmitted information. A 1 bit is sent at a higher frequency (upper line), and a 0 bit at a slightly lower frequency (lower line). Using the inspectrum tool, we can analyze the data to recognize the 1s and 0s more clearly. Here is a greatly enlarged view of the first message:



I wrote a Python script to extract these bits so that we can look at them as whole packets.
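The idea of turning two frequencies into bits can be illustrated with a minimal sketch. This is not the script mentioned above: the function names, the 8 samples per bit, and the frequency deviation are all assumptions chosen for illustration, and the input is a synthetic signal rather than a real capture.

```python
import cmath
import math

def fsk_modulate(bits, samples_per_bit=8, deviation=0.1):
    """Synthesize baseband 2-FSK samples (hypothetical parameters).

    A 1 bit advances the phase by +deviation cycles per sample,
    a 0 bit by -deviation cycles per sample.
    """
    phase = 0.0
    samples = []
    for bit in bits:
        step = 2 * math.pi * (deviation if bit else -deviation)
        for _ in range(samples_per_bit):
            samples.append(cmath.exp(1j * phase))
            phase += step
    return samples

def fsk_demodulate(samples, samples_per_bit=8):
    """Recover bits by looking at the sign of the instantaneous frequency."""
    # The phase advance between neighbouring samples is proportional
    # to the instantaneous frequency of the signal.
    freq = [cmath.phase(b * a.conjugate())
            for a, b in zip(samples, samples[1:])]
    bits = []
    for k in range(len(samples) // samples_per_bit):
        chunk = freq[k * samples_per_bit:(k + 1) * samples_per_bit]
        # Higher frequency (positive average) is a 1, lower is a 0.
        bits.append(1 if sum(chunk) > 0 else 0)
    return bits

sent = [1, 0, 1, 1, 0, 0, 1, 0]
print(fsk_demodulate(fsk_modulate(sent)))
```

A real capture would also need tuning, filtering, and symbol timing recovery, none of which this sketch attempts.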



It turns out that the repeating pattern at the start of each transmission is part of the preamble. To save energy, receivers often sleep and wake up periodically to check for a signal. The transmitter sends the preamble for long enough that the receiver will catch it during one of its short listening windows. When the receiver hears the preamble, it stays awake until the real data arrives.

There is one more layer to get through before you reach the actual packet data. You cannot simply send your data over the air as raw bits, because the receiver uses transitions in the signal to stay synchronized while it waits for the next bit. A long run of zeros or ones can cause the receiver to lose sync. Radio links therefore typically use an encoding that guarantees enough transitions. Omnipod communications use Manchester coding: each data bit is encoded as two bits on the air, with 1 sent as 10 and 0 sent as 01.

All this took a long time to figure out, and many theories were floated in the openomni channel on Slack as we tried to recover the original bits. Mark Brighton, Dan Caron and @larsonlr had some success capturing packets with RFCat and a TI Stick. Evarist Kurzho finally wrote a tool called rtlomni, which uses an rtl-sdr USB receiver to listen for packets and decode them. It turned out to be very convenient and more reliable than the TI Stick based methods.
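The 1 to 10 and 0 to 01 mapping described above can be sketched in a few lines of Python. The function names are hypothetical, and stopping at the first invalid pair is a design choice made for this illustration, not something taken from the Omnipod tools.

```python
def manchester_encode(bits):
    """Encode data bits: 1 becomes 1,0 and 0 becomes 0,1."""
    encoded = []
    for bit in bits:
        encoded.extend((1, 0) if bit else (0, 1))
    return encoded

def manchester_decode(raw_bits):
    """Decode pairs of on-air bits back into data bits.

    Decoding stops at the first pair that is not valid Manchester
    (0,0 or 1,1), which in this sketch is treated as the end of the data.
    """
    pairs = {(1, 0): 1, (0, 1): 0}
    decoded = []
    for i in range(0, len(raw_bits) - 1, 2):
        pair = (raw_bits[i], raw_bits[i + 1])
        if pair not in pairs:
            break
        decoded.append(pairs[pair])
    return decoded

on_air = manchester_encode([1, 0, 1, 1])
print(on_air)
print(manchester_decode(on_air))
```

Because every data bit produces one transition, even a long run of identical data bits keeps the receiver's clock recovery fed.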

Packet Decoder


With the actual bits in hand, we began to study the packet structure. Based on which bits changed between different pods and different commands, we arrived at a structure that looks like this:



CRC8


Radio is far from an ideal transmission medium. There are many different sources of interference that cause the receiver to hear 1 when 0 is sent, and vice versa. It is important to know when this happened, so most protocols use a checksum, often called a CRC. The receiver calculates the CRC as data is received, and the last byte of the packet includes the CRC calculated by the transmitter. If they do not match, the receiver discards the packet and waits for retransmission.

The Omnipod protocol used a standard 8-bit CRC. When we found it, we thought we were very close to understanding the messages. How little we knew...
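The receiver-side check described above can be sketched as follows. The polynomial 0x07 and zero initial value are assumptions (a common 8-bit CRC; the text only says the CRC was a standard one), and the example packet bytes are made up.

```python
def crc8(data, poly=0x07, init=0x00):
    """Bitwise CRC-8, most significant bit first (parameters are assumptions)."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

def packet_is_intact(packet):
    """Treat the last byte as the sender's CRC over everything before it."""
    return len(packet) >= 2 and crc8(packet[:-1]) == packet[-1]

payload = bytes([0x1F, 0x01, 0x02, 0x03])          # made-up packet body
packet = payload + bytes([crc8(payload)])           # sender appends the CRC
damaged = bytes([packet[0] ^ 0x04]) + packet[1:]    # flip one bit in transit

print(packet_is_intact(packet), packet_is_intact(damaged))
```

A receiver following this scheme would drop the damaged packet and wait for the sender to retransmit.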

Messages, CRC16


Some messages are too large to fit in one packet, so they are split across several packets. We started to piece together the message format and noticed another set of bits at the end of each message that looked like a 16-bit CRC. But it was strange: 5 of the 16 bits were never set. We tried many different methods to work out how it was computed, but nothing worked.

This was the first big roadblock. We could keep working on other bits in the messages, but that would not help much in understanding what was being sent, and we would not be able to generate new packets, so progress slowed down.

Several months passed with practically no progress. Finally, in the winter of 2016, a group member going by @lorelai reported that she had successfully dumped the firmware from the larger ARM chip in the PDM and had begun the tedious process of disassembly: taking CPU instructions and turning them into human-readable code with meaningful variable and function names. She did an amazing job of working out the various routines used to transmit data.

I looked at one of the unnamed subroutines and noticed that it looked like a standard table-driven CRC calculation, and the table held the values for a standard 16-bit CRC. I wrote my own table-driven implementation, and it behaved like a regular CRC. Then I looked carefully at how their function was written. A normal CRC implementation looks like this:

  while (len--) {
    crc = (crc << 8) ^ crctable[((crc >> 8) ^ *c++)];
  }

Their implementation looked like this:

  while (len--) {
    crc = (crc >> 8) ^ crctable[((crc >> 8) ^ *c++)];
  }

Notice the difference? What should have been a bitwise left shift was somehow written as a right shift. This is a bug; there is no reason to cripple your own CRC algorithm, since doing so makes it worse at detecting corrupted messages.

We were back in business! We resumed work on decoding messages, recording PDM sessions for delivering a bolus [of medication], setting temporary basal rates [a temporary increase or decrease in the rate of insulin delivery; translator's note], suspending delivery, and so on.
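The two loops above can be compared side by side in a small Python sketch. The polynomial 0x8005 and the zero starting value are assumptions made for illustration (the text only says the table held values for a standard 16-bit CRC), and the function names are hypothetical.

```python
def make_crc16_table(poly=0x8005):
    """Build a 256-entry, MSB-first CRC-16 table (polynomial is an assumption)."""
    table = []
    for byte in range(256):
        crc = byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
        table.append(crc)
    return table

CRC_TABLE = make_crc16_table()

def crc16_standard(data, crc=0):
    """Textbook table-driven loop: the register is shifted left by 8."""
    for byte in data:
        # Mask to 16 bits, which a C uint16_t would do implicitly.
        crc = ((crc << 8) & 0xFFFF) ^ CRC_TABLE[(crc >> 8) ^ byte]
    return crc

def crc16_right_shift(data, crc=0):
    """Same loop with the shift direction flipped, as in the second snippet."""
    for byte in data:
        crc = (crc >> 8) ^ CRC_TABLE[(crc >> 8) ^ byte]
    return crc

for message in (b"\x01", b"\x01\x00", b"123456789"):
    print(message, hex(crc16_standard(message)), hex(crc16_right_shift(message)))
```

Starting from zero, both loops return the same table entry for a single byte; they diverge once the register holds a value whose shifted form depends on the direction, which is why the result could not be matched against any standard CRC.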

Nonce


All insulin delivery commands had a 4-byte chunk of data at the beginning of the message that looked like some form of cryptography. Again, we tried many different ways of interpreting and analyzing it in the context of the messages it was sent in, but it was not a CRC (sometimes we saw the same 4 bytes in different messages). And sometimes we saw the pattern repeat. It looked like it might be part of a scheme to prevent captured data from being replayed. In other protocols, a value with this function is called a nonce (a number used once).

One option we considered was to record a library of messages and replay them to issue specific commands. Even though the address of each pod was different, we now knew how to generate the CRC, so we could take an old copy of a command, put the new address in the message, and recalculate the CRC. Only the nonce prevented us from using this strategy. Regardless of the command, the pod accepted only the next nonce in the sequence, and we did not know how to generate it.

But wait! We had the disassembled PDM firmware, so we could just look there. We studied the PDM firmware, traced message generation through the code, and found where these four bytes should go. But instead of a routine that computes some cryptographic nonce, we found only the four characters INS. What kind of nonsense was that?! Evidently this part of the message had to be filled in later in the pipeline.

There was another chip in the PDM, closer to the radio. It was the same chip used in the pods, marked SC9S08ER48, for which no documentation could be found online. It was probably custom made for Insulet. Maybe we could extract the firmware from this chip? Unfortunately, the chip was locked, which prevented the firmware from being copied.



Work slowed down again... it looked like a real dead end. We had put a great deal of mental effort into this nonce and had no good leads on the math behind it. And the ER48 chip, which (possibly) held the secrets, was locked, and there was little publicly available information that would help crack it.

X-rays


Trying to understand ER48, some members of the Slack community suggested taking x-rays. It was really cool, but, unfortunately, did not open up any new opportunities.


General Snapshot


Detailed image

Decapping and Imaging


Dan Caron decided to contact a researcher, Dr. Sergey Skorobogatov of the University of Cambridge in the UK. Dan had read that he had experience extracting code from locked chips, and convinced him to look at our problem. Dr. Skorobogatov had done research on using SEM (scanning electron microscopy) to reverse engineer chips. He said it might be possible, but it would be costly, would require expensive equipment, and came with no guarantee of a result. Joe Moran, who had recently started using Loop after we met at the Nightscout hackathon in the fall of 2016, agreed to help with the project. He arranged for a Silicon Valley company, Nanolab Technologies, to decap and photograph the chips, and also generously funded the work of Nanolab and Dr. Skorobogatov (as well as his own pods).

Dr. Skorobogatov asked Nanolab to apply various imaging techniques to find out whether the protection could be defeated using known non-invasive or semi-invasive methods. This produced many images, some of them very beautiful. These are optical microscope images of the silicon die.


General view of the chip under an optical microscope


General view of the chip under an optical microscope

Images of specific areas of the die were also taken with a scanning electron microscope, at different voltages, with different surface preparation, and on various equipment.


SEM image of flash cells. Does not show data

Unfortunately, none of these images showed the actual contents of the flash memory.

Dr. Skorobogatov had one last-resort method. It was a patented method, and he needed permission from the university to use it. He ran an initial test and confirmed that he could read the data on this chip. But before going further, an NDA had to be signed, so negotiations followed over who would receive the contents of the extracted firmware.

In the end, the Nightscout Foundation signed the NDA, taking responsibility for preventing unauthorized disclosure of the memory extraction methods and results.

The result of this agreement and the work was an incredible article written by Dr. Sergey Skorobogatov, as well as the firmware code. The first extraction contained quite a few errors, but it was enough to get started. At the spring Nightscout hackathon, Joe asked the room whether anyone wanted to take on the disassembly. Nobody raised a hand. Turning processor instructions into something understandable is hard work, and very few people know how to do it. I tried to dig into the assembly using the CPU documentation, but I got very little done and became discouraged. Others optimistically asked for the firmware code expecting rapid progress, then realized the scale and complexity of the task and quietly dropped out.


Example of disassembling SC908 instructions

It turned out that Joe also had extensive experience with assembly language, and he took on this difficult task himself. In July, Dr. Skorobogatov completed a second memory extraction with far fewer errors. Throughout the summer, Joe Moran worked tirelessly, mapping out a huge number of processor instructions and gradually assembling them into an overall picture of the pod's operation in pseudocode.

Eventually Ken Shirriff, a hardware reverse engineering expert, joined us and significantly accelerated the process. Together, Joe and Ken translated enough code to find the function that encodes the nonce. That happened in September 2017.



RileyLink and Loop


We updated the openomni Python scripts, but now it was time to focus on RileyLink + iOS, so I started working on OmniKit and firmware updates for RileyLink. I believed we had the basics of the protocol and the rest was just details. Once again, I completely underestimated how much work still lay ahead.



I had to write new firmware that handles the pod's modulation and encoding. I also had to rewrite how the two chips on the RileyLink (RL) talk to each other so that zeros could be handled, since with Medtronic a zero was the special end-of-packet marker. Much of Loop had to be reworked to support multiple pumps, and new interfaces were needed for pairing, deactivation, and error handling. Fortunately, Nate Racklift had laid a solid foundation in Loop that made all of this possible.

Meanwhile, work continued on understanding the command formats. Everything was carefully documented in the openomni wiki, the most comprehensive documentation of the protocol. Joe, Evarist and Elke Jager did a really great job of decoding messages and updating the pages over time. Various members of the Slack channel contributed captures of PDM and pod packets to help the overall decoding effort.

Decoding was fun work, with many small victories as each component of each command was deciphered, and I really enjoyed this part, gradually adding code to Loop. In April 2018, I shared in Slack that, using only an iPhone + RL, I had paired a pod, primed it, inserted the cannula, programmed a basal schedule, and then delivered a 5-unit bolus.

The RL 2.0 firmware was completed in July 2018, and new boards were already shipping with it. The hope was that these boards could be used with Loop and Omnipod, but the existing 915 MHz antenna performed too poorly at 433 MHz to be effective.

Decoding and implementation progressed significantly over the summer, and Loop gradually approached a working state. Joe did an amazing thing by providing me with funding so that I could quit my day job and focus on this project, and eventually I joined the great Tidepool team. Of course, more was happening in the world of DIY and the regulation of medical technology than I will cover here, but it was a very interesting summer!

Screamers


As more features appeared in the driver, I connected it to Loop, enabling automatic adjustment of delivery over time. At this stage we often got "screamers": one of the pod's internal checks would detect a problem, and the pod would stop delivering insulin.

It seemed like a solvable problem, though. We kept finding small discrepancies between the packets Loop sent and the original PDM packets when sending commands manually, and I assumed that once we fixed them all, the screamers would stop.

Working Loop!


On October 3, 2018, Joe put on a pod controlled by Loop and became the first Omnipod Loop user, although he did not tell me right away because he knew I would worry. When he told me, I worried anyway. We had seen how the pod works and understood its functionality, and the main algorithm had been tested for a long time, but still...



A month later, at the Nightscout hackathon in November 2018, several more adventurous people decided to try it for themselves, becoming part of a small closed testing group that would grow to more than 30 people before the code was published.

Unfortunately, we still had pods screaming, often before completing their full three days of use, and we carefully compared the commands Loop sent with samples from the PDM. Elke was especially helpful here: he wrote a script to automatically check our commands against the originals. I began to worry that the pods were failing because communicating every five minutes put extra demands on the battery.


Leads tapped onto the pod's voltage regulator through holes drilled in the plastic back, held in place with superglue

So I began measuring the pod's supply voltage with an Arduino, logging the data to a local database for visualization. I compared the PDM with Loop.


Long-term change in the pod supply voltage

Unfortunately, this also turned out to be a dead end. Using the PDM and delivering a large amount of insulin, I could drive a pod to a lower voltage than a Loop-controlled pod reached at any point in its lifetime, and still could not make it scream. Voltage did not seem to be the problem; it had to be something else.


RileyLinks with 915 MHz (left) and 433 MHz coil (right) antennas

At some point I noticed that if a message exchange with the pod failed, the pod sometimes kept trying to complete the exchange, resending packets over and over. The testers' logs also showed a lot of failed exchanges, so I started experimenting with antennas. Both problems had to be related to link quality. I had planned to try different antennas and had ordered them from various places online, but I did not get around to testing them until it became a priority.

I had several flexible 433 MHz antennas that can be attached to the inside of the RL case. They performed better in some scenarios but not in others; they were too unreliable. When I got to the coil antenna, it performed well very consistently and at surprisingly long ranges. It was time to make a new case for RileyLink.

With the new antenna and some optimizations that reduced message traffic while still making adjustments every 5 minutes, screamers became very rare, probably comparable to normal use of pods with a PDM. Over the last 7,500 hours of real-world testing, 94% of pods ran to completion without failing.

Testing and Documentation


The testing team grew slowly. New users kept joining, and with fresh eyes they could point out which parts looked confusing. These testers put up with a lot of screaming pods and made a very big contribution to improving Loop with Omnipod. Above all, they sent in problem reports and logs.

These reports include a message log that can be analyzed with the tool Elke made. It shows whether any malformed commands are being sent, and it also lets us collect statistics on particular parts of Loop's interaction with the pods.

Marion Barker joined the testing group and added special reports and additional statistics on testing progress. We were able to use her statistics on successful versus failed pods to get a high-level view of our progress.

Finally, Katie DiSimone joined the testing group. She began a major restructuring of loopdocs.org to document using Loop with multiple devices. Anticipation for a version of Loop that worked with Omnipod was incredibly high, and it was clear that without good documentation we would be flooded with the same questions over and over.

New Loop Features


Integration with Omnipod required rethinking some interface elements and adding new controls. The pod does not report a battery level, and there is little the user could do about a low battery anyway, so displaying a battery level widget makes no sense. In addition, since the pump has no user interface of its own, the user needs a way to cancel a bolus quickly. The reservoir icon depicted the Medtronic reservoir, so we wanted to redo it. Thanks to Paul Vorgione for designing the pod icon, which now shows the reservoir level.



Acknowledgments


Thanks to all the people who helped along this long road, so that we could reach the goal we set for ourselves a long time ago. I know I have not mentioned everyone or every event. That is not possible in one article, and I only have my own experience to draw on. It is hard to imagine how many hours it took. If you added them all up, I am sure the figure would be shocking. And that is not to mention the work of creating the Omnipod itself, which, it seems to me, overshadows all these efforts. So thank you all. Many of these hours would otherwise have been spent with families. I really appreciate my wife's and children's understanding about the time I spent on this, and I want to thank them too.

Notes


I should mention Joakim Ornstedt as one of the participants in the openomni decoding effort, and as the creator of what was probably the first integration with Omnipod. He built a device that used optical character recognition (OCR) to extract data from the PDM, and operated the buttons of the physical PDM through another microcontroller. This approach is hard to scale, but it is very clever and avoids many of the problems we had to deal with in a solution based on reverse engineering. I really admire how he approached the problem and got the job done in a tiny fraction of the time it took to get the device working with Loop.
