Learning from the past

Blackberry, and later the iPhone, changed the mobile industry, not just in terms of user experience but in the way service providers (carriers) packaged and sold their services.

Data on mobile phones used to be extremely expensive. So expensive, in fact, that a user would want to use as little of it as possible. On Nokia smartphones, any application that attempted to connect to the Internet had to prompt the user for permission first.

When the Blackberry, and later the iPhone, launched, they needed constant data connections: the Blackberry to an enterprise email service, and the iPhone to an Apple data center. This constant data connection is what allowed push email on the Blackberry and notifications on the iPhone. Without this data link, some of the main features of both platforms become inoperable.

When carriers sold iPhone and Blackberry devices, they needed to shift the way they charged for data. What was once very costly became cheap and bundled with the handset. Today, of course, the major selling point of a mobile phone plan is how much data you get and how much it costs.

The resulting drop in data prices, together with the bundling of data, has given the illusion that mobile data is “free”, or so cheap it might as well be free. It isn’t. All this information runs across radio links, fiber cables, switches, routers and base stations, all of which need to be paid for and maintained.

This notion has nonetheless led to some less optimal, and more costly, software design, particularly in the IoT domain.

Connectivity Restrictions

Every carrier implements a NAT. This helps manage the limited supply of IPv4 addresses and gives the carrier a way to control data traffic on the network. The NAT restriction means that mobile devices cannot receive an incoming connection request from a server on the Internet. Instead, the device must initiate the connection. This makes sense from a billing point of view, as a user only gets charged for the data they use based on the applications that they install.

For an IoT device this can be cumbersome. There are often times when we need to call out to a remote device to make sure it is working.

Unscheduled Communication with Cellular IoT Devices

From both a power and a cost perspective, cellular IoT communication is generally kept to a minimum and scheduled at regular intervals. Careful scheduling of when IoT devices “call home” allows the workload on the server to be balanced out. This strategy is cost effective. However, because all devices on a cellular network operate behind a NAT, a server cannot initiate communication with an IoT device. The result is that devices can operate for hours, or perhaps days, before “calling home”. This latency of hours or days introduces issues for device management: it becomes impossible to probe a device for its state, so activities like troubleshooting, auditing, maintenance and unscheduled security patches become hard, if not impossible.
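
One way to spread scheduled “call home” times across a reporting interval is to give every device a stable offset derived from its identifier. The following is a minimal sketch of that idea, not a description of any particular platform; the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def call_home_offset(device_id: str, interval_s: int) -> int:
    """Map a device ID to a stable offset (seconds) inside the reporting interval."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % interval_s


def next_call_home(device_id: str, now_s: int, interval_s: int) -> int:
    """Return the next time (seconds since epoch, >= now_s) the device should connect."""
    offset = call_home_offset(device_id, interval_s)
    slot_start = (now_s // interval_s) * interval_s
    candidate = slot_start + offset
    # If this interval's slot has already passed, use the same slot in the next one.
    return candidate if candidate >= now_s else candidate + interval_s
```

Because the offset depends only on the device ID, the device and the server can each compute the schedule without exchanging any data, and devices with different IDs tend to land at different points in the interval.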

Different IoT devices will, of course, have different data throughput requirements. Video or streaming sensors could conceivably maintain a constant flow of data to a server. However, for a large number of cellular IoT devices the data volumes are low, and the need to manage data costs can have a direct impact on the profitability and overall success of the IoT solution.

Data Costs

Even when a device is not actively transmitting any sensor data, an open link (socket) with a server will incur a cost. Maintaining a communication link between an IoT device and a server requires sending keep alive messages. These small back and forth messages are sent across an existing link at a regular interval. As their name suggests they keep the connection alive. The client will send a message and the server of course will reply to acknowledge the client’s active connection.

Every byte of data that is transmitted incurs a cost. On a cellular IoT connection these costs originate from three sources:

  1. The data processing costs on the server.
    Every message needs to be processed and replied to, even the keep alive messages. This is an additional load. If you're dealing with hundreds or thousands of devices, this load adds up and can result in additional server requirements, or extra virtual machines in the cloud.
  2. Public Cloud or Data Center Network Traffic Costs.
    Public cloud providers and nearly every data center operator charge for data being sent into and out of their data centers and networks. It is possible to reduce this. For instance, if you use AWS’s IoT platform, the keep alive messages for MQTT connections are not charged for. However, if you want to connect all your devices via a VPN link, you’ll need to pay for the data.
  3. Cellular Traffic Costs.
    Every carrier or service provider charges for data traffic. As an example, Hologram offers a “pay as you go” service for $0.34 per MB, and Things Mobile offers a $0.10 per MB deal.

The Yearly Cost of a Keep Alive

I wrote a Reddit post outlining the yearly cost of the default keep alive messages used by MQTT. MQTT sends 202 bytes every 60 seconds. If we keep the connection open constantly between the IoT device and an MQTT server, that is roughly $10/yr on Things Mobile or $34/yr on Hologram in keep alives alone.
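
The arithmetic behind those figures can be reproduced with a short sketch. It uses the 202 bytes per 60 seconds and the per-MB prices quoted above, and assumes a 365-day year and that a billed “MB” is 1,048,576 bytes. That last point is an assumption; a carrier billing per 1,000,000 bytes would charge about 5% more. The function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
BYTES_PER_MB = 1024 * 1024  # assumption: the carrier bills per 2**20 bytes


def yearly_keep_alive_cost(price_per_mb, bytes_per_interval=202, interval_s=60):
    """Return (megabytes per year, cost per year) for keep alive traffic alone."""
    intervals_per_year = SECONDS_PER_YEAR / interval_s
    megabytes = intervals_per_year * bytes_per_interval / BYTES_PER_MB
    return megabytes, megabytes * price_per_mb


for carrier, price in [("Things Mobile", 0.10), ("Hologram", 0.34)]:
    mb, cost = yearly_keep_alive_cost(price)
    print(f"{carrier}: {mb:.1f} MB/yr, ${cost:.2f}/yr")
```

Under these assumptions the traffic comes to a little over 100 MB a year per device, which is where the figures of roughly $10 and $34 come from.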

We can, of course, save money by negotiating better data costs, but we can do better still by eliminating all of the unnecessary keep alives and dropping the data link completely when it is not needed. This in part explains why IoT providers employ the regular-interval strategy I mentioned above.

Supporting Unscheduled Communication

Unscheduled communication is usually required in a number of situations:

  • When things go wrong, such as a customer complaint or a failure
  • When a device needs to be audited, or for asset tracking
  • Emergency software updates (security, bug fixes, etc.)
  • When the customer needs immediate sensor data

In all of these situations, being able to reach out to the device is important. Without the ability to contact the device on an unscheduled basis, it could be hours or days before changes can be applied or data collected from a device. That leaves angry customers waiting long periods for details about their device, or sensors running insecure or incorrect software for days without an update. So it is important that this functionality is supported. There are only a limited number of ways of supporting this critical management operation:

  1. Keep the IoT device and the cloud server constantly connected.
    This is very expensive, both for the battery and for data usage. As an example, an MQTT connection configured with default settings will transmit roughly 100 megabytes of “keep alive” queries and responses every year. It also places a constant burden on the server, which needs to monitor and maintain connections to every one of the IoT devices.
  2. Increase the frequency with which the device “calls home”.
    This approach works, but it increases the data cost and still leaves some latency to deal with.
  3. Send a wake up SMS to the IoT device.
    Each SMS costs money, so this approach makes sense as long as the unscheduled communication is intermittent.
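
How intermittent is “intermittent”? A rough break-even between option 1 and option 3 can be sketched as below. The SMS price and the per-session data cost are hypothetical placeholders, not quotes from any carrier, and the function name is illustrative.

```python
def wake_sms_break_even(keep_alive_cost_per_year, sms_price, session_data_cost=0.0):
    """Wake-ups per year at which Wake on SMS costs as much as staying connected."""
    cost_per_wake = sms_price + session_data_cost
    if cost_per_wake <= 0:
        raise ValueError("each wake-up must have a positive cost")
    return keep_alive_cost_per_year / cost_per_wake


# Hypothetical inputs: $10/yr of keep alives, $0.05 per SMS, $0.01 of data per session.
print(wake_sms_break_even(10.0, 0.05, 0.01))
```

With these placeholder numbers the break-even is in the region of 167 wake-ups a year; below that, waking the device by SMS is the cheaper option.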

The SMS Solution

SMSs are not sent via a data link but over the underlying carrier network. They do not require IP addresses and are therefore not subject to the NAT restrictions that data links are. In fact, MMS (Multimedia Messaging Service, more commonly known as “Picture Messaging”) combined SMS and data links to deliver large data payloads to mobile devices. Essentially this system used the “Wake up on SMS” approach outlined above: an SMS is sent to the phone to get it to create a data link and collect the pending pictures.

Missing Developer Tooling

While this SMS pattern is known, it is seldom used to reduce cost. The core reason appears to be the lack of developer tooling. Most of the IoT system solutions offered to developers, such as those from AWS, Azure and Google Cloud, focus on data integration, because the cloud provider’s interest is in getting the IoT data into the cloud. Using the existing tooling makes sense: it reduces development time, and hence development costs, helps when integrating IoT systems together, and overall increases the speed to market.

The Goal

Creating a “Wake on SMS” solution requires changes to the device software to send, receive and validate SMSs, and changes on the server side to prompt the IoT device with an SMS when required. It also requires tracking on the server to record each IoT device and its phone number. That is a lot of additional work, and a lot of custom code. The goal, therefore, is to reduce this work and make the Wake on SMS feature as transparent as possible for IoT developers: something that can be integrated easily both on the device and with the protocols used by the major IoT cloud providers.
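
To make those moving parts concrete, here is a minimal sketch of the validation and tracking pieces. It is one possible design, not the solution this series arrives at: the message format, the HMAC tag, the five-minute freshness window, the registry layout, and every name in it are assumptions for illustration, and actually sending or receiving the SMS (for example through a modem or an SMS gateway) is left out.

```python
import hashlib
import hmac

MAX_AGE_S = 300  # assumption: reject wake requests older than five minutes

# Hypothetical server-side registry: device ID -> (phone number, shared key).
REGISTRY = {"sensor-17": ("+15550100", b"demo-key")}


def build_wake_sms(device_id: str, key: bytes, now_s: int) -> str:
    """Server side: build the SMS body 'WAKE:<device>:<timestamp>:<tag>'.

    Assumes device IDs never contain a colon.
    """
    payload = f"WAKE:{device_id}:{now_s}"
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
    return f"{payload}:{tag}"


def is_valid_wake_sms(body: str, device_id: str, key: bytes, now_s: int) -> bool:
    """Device side: accept only a fresh, correctly tagged request for this device."""
    parts = body.split(":")
    if len(parts) != 4 or parts[0] != "WAKE" or parts[1] != device_id:
        return False
    stamp = parts[2]
    if not (stamp.isascii() and stamp.isdigit()):
        return False
    if abs(now_s - int(stamp)) > MAX_AGE_S:
        return False
    # Rebuild the expected message and compare in constant time.
    expected = build_wake_sms(device_id, key, int(stamp))
    return hmac.compare_digest(body.encode("utf-8"), expected.encode("utf-8"))
```

On a valid message the device would open its data link and connect to the server as it does for a scheduled “call home”. The tag is truncated to 16 hex characters to keep the body short; that trades some security margin for message size and is a choice to revisit in a real deployment.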

Protocol Selection and Strategy

Protocols

There are a number of popular protocols that are used to transfer data and actuation instructions from the cloud out to an IoT device or gateway. These protocols can be roughly grouped into three sets:

  1. Message Queues
  2. Message Based
  3. Request Based

Strategy

Being a single person, I don’t have the resources to develop solutions for every one of the 6 identified protocols and each of their different implementations. Instead, I am going to focus on producing an MQTT-based Wake on SMS solution.