Connection Efficiency Use Cases

Proof of the impact of inefficient IoT Devices can be seen today. The following cases were recently experienced by GSMA Mobile Network Operator members and highlight why the requirements defined within this document are necessary:

Use of Unintelligent Error Handling Mechanisms

In this case, one of the Mobile Network Operator’s B2B customers had an installed base of approx. 375,000 geographically fixed IoT Devices (for use in the homes of consumers). These devices were located in 6 different European markets and normally communicated via fixed line Ethernet connections. In normal circumstances they periodically communicated with the customer’s server to report on their status, and these status reports had to be acknowledged by the customer’s server.

Recently the following sequence of events happened which caused massive disruption and loss of service for a large number of the Mobile Network Operator’s customers:

  1. On a particular day, the customer’s server suddenly and unexpectedly stopped acknowledging the status reports from the IoT devices.
  2. The devices treated this as a loss of connectivity over their Ethernet network connections and in an attempt to regain connectivity with the server they all started to ‘fall-back’ to a GSM/GPRS network connection.
  3. All the devices then switched on their GSM Communication Modules and attempted to send status messages via their local GSM/GPRS network, but again the acknowledgement messages were not received from the server.
  4. In this event the devices would reset the GSM Communication Module, forcing it to re-register to the local GSM network and they would try again to contact the server. Eventually all 375,000 devices ended up in an infinite loop with their GSM modems being rebooted every minute or so.
  5. As the number of devices which entered this ‘reboot’ loop grew, the signalling load within the core network of the devices’ home Mobile Network Operator grew to an unmanageable level. This resulted in one of the home network’s HLRs becoming overloaded with registration attempts, which in turn prevented all devices that use (U)SIMs provisioned in that HLR from registering to any GSM network.
  6. At this point the home Mobile Network Operator had a much wider issue to address. The Mobile Network Operator had to stabilize its core network signalling, and in this case it was forced to close down major roaming destinations such as Germany, France, Austria, Italy, Spain and the UK. This reduced the signalling load, and each network connection could then be re-established one by one to bring the devices trying to register to the network back in smaller, more manageable, numbers.

Overall, it took this Mobile Network Operator approximately 48 hours to completely resolve the problem, which was classified as a ‘critical’ event on their network. If the devices had implemented an intelligent ‘back-off’ mechanism (an intended deliverable of the Network efficiency project) for when loss of connectivity to the server was detected, this problem would not have occurred.
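A back-off mechanism of the kind referred to in this case can be sketched as follows. This is a minimal illustration only, not a mechanism defined in this document: the names `backoff_delays` and `report_status`, the 60 second base delay, the one hour cap and the use of a uniformly random wait are all assumptions made for the example.

```python
import random


def backoff_delays(base_s=60.0, cap_s=3600.0, attempts=8, rng=None):
    """Yield one randomized wait (in seconds) per failed attempt.

    Hypothetical helper: the upper bound doubles after every failure,
    up to cap_s, and the actual wait is drawn uniformly from [0, bound]
    so that a large fleet of devices does not retry in lock-step.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        bound = min(cap_s, base_s * (2 ** attempt))
        yield rng.uniform(0.0, bound)


def report_status(send, sleep, max_attempts=8):
    """Try to deliver a status report, backing off between failures.

    `send` returns True when the server acknowledges the report.
    `sleep` is injected so the sketch can be exercised without waiting.
    Returns True on success and False once the attempts are exhausted,
    at which point the device stays idle instead of resetting its modem.
    """
    for delay in backoff_delays(attempts=max_attempts):
        if send():
            return True
        sleep(delay)
    return False
```

On a real device `sleep` would be `time.sleep` (or a low-power timer) and `send` would wrap the status report exchange. The point of the sketch is that a missing acknowledgement leads to progressively longer, randomized waits rather than an immediate re-registration.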

Use of Insecure IoT Communications Modules

In this case, the Mobile Network Operator’s B2B customer had an installed base of 59 IoT devices used to monitor wind and solar power generation. All of the devices used the same make of Communications Modules.

In December 2013 a sudden increase in calls to Gambia, Latvia, Lithuania, UK and Falkland Islands occurred, all the calls being made by the 59 IoT devices. In total approx. 17,000 calls were made before the Mobile Network Operator discovered the fraud and implemented the necessary countermeasures.

Upon further investigation it was discovered:

  • All of the Communications Modules within the IoT Devices had been left configured with default usernames and passwords.
  • The hacker had discovered the temporary public IP addresses of the IoT Devices and then logged on to each device using the default username and password.
  • The hacker then configured the Communications Modules within the IoT Devices to use dynamic DNS addressing to give each device a permanent IP address.
  • The hacker then used these permanent IP addresses to connect to the IoT Devices from the 9th to 15th of December and instruct the devices to make calls.

As a result of this hack, the Mobile Network Operator and its customer incurred a financial cost estimated at 150,000 euros for the ~17,000 illegal calls made by the IoT Devices.

If the IoT device vendor had properly configured the security features provided by the Communications Modules within their IoT Devices this event would not have occurred.
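As an illustration of the kind of check that could be run before devices are shipped, the sketch below rejects credentials that match factory defaults. It is an assumption-laden example, not a description of any real Communications Module: the `FACTORY_DEFAULTS` entries, the function name `credential_problems` and the 12 character minimum length are all hypothetical.

```python
# Hypothetical factory defaults, for illustration only. A real list would
# come from the Communications Module vendor's documentation.
FACTORY_DEFAULTS = frozenset({
    ("admin", "admin"),
    ("admin", "password"),
    ("root", "root"),
})


def credential_problems(username, password, min_length=12):
    """Return the reasons why these credentials should not be deployed.

    An empty list means none of the (assumed) rules was violated.
    """
    problems = []
    if (username.lower(), password.lower()) in FACTORY_DEFAULTS:
        problems.append("factory default username/password")
    if len(password) < min_length:
        problems.append("password shorter than %d characters" % min_length)
    if password.lower() == username.lower():
        problems.append("password equals username")
    return problems
```

A provisioning script could refuse to release any device for which `credential_problems` returns a non-empty list.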

RADIUS Server Overload

After an SGSN outage, tens of thousands of IoT devices that belong to an IoT Service Provider re-register to the GPRS network.

There is no throttling activated on the receiving GGSN, so all requests to activate a PDP Context on the IoT Service Provider’s APN are processed.

The APN is configured to authenticate through a RADIUS server hosted by the IoT Service Provider which resides on the remote end of a VPN that terminates in the GGSN.

The RADIUS server is not scaling well and the IoT Service Provider has not added enough resources to the RADIUS server to cater for this peak of authentication requests.

The first thousand requests go through, but after that the RADIUS server starts to have problems responding in a timely manner.

In turn, the GGSN resends the authentication requests that have timed out, putting even more load on the RADIUS server.

Finally, the RADIUS server’s CPU utilization hits 100% and the GGSN starts to suffer from the vast number of PDP Context activation requests that cannot be authenticated and time out.

The IoT Devices do not have a back-off feature and send a new request to activate a PDP Context as soon as the previous one times out.

The Mobile Network Operator needs to disable all the IoT Devices’ (U)SIMs and re-activate them in batches in order for the RADIUS server to be able to authenticate the requests.

Lessons learned:

  • Mobile Network Operators should have a throttling mechanism on GGSNs per APN.
  • IoT Application Developers need to implement a back-off feature for such scenarios.
  • IoT Service Providers’ back-end engineers must communicate with their organization and request information about active (U)SIMs in order to have the appropriate resources available for RADIUS and back-end systems.
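The first lesson, throttling per APN, can be illustrated with a token bucket. The sketch below is only a model of the idea and does not represent any GGSN vendor’s configuration or interface: the class name `ApnThrottle`, its parameters and the APN name used in the usage note are hypothetical.

```python
class ApnThrottle:
    """Token bucket per APN (illustrative model only).

    Each APN may start at most `rate` PDP Context activations per second
    on average, with short bursts of up to `burst` activations.
    """

    def __init__(self, rate, burst):
        self.rate = float(rate)
        self.burst = float(burst)
        self._buckets = {}  # apn -> (tokens, time of last update)

    def allow(self, apn, now):
        """Return True if an activation request for `apn` may proceed."""
        tokens, last = self._buckets.get(apn, (self.burst, now))
        # Refill in proportion to the elapsed time, never above `burst`.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[apn] = (tokens - 1.0, now)
            return True
        self._buckets[apn] = (tokens, now)
        return False
```

With `ApnThrottle(rate=2, burst=5)`, a flood of requests for a hypothetical APN `iot.example` arriving at the same instant has only its first five requests admitted. The rest are rejected until tokens are refilled, which limits how many authentication requests are forwarded to the RADIUS server behind that APN.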

Fake IMEI case 

The existence of IoT devices with fake/incorrect IMEIs presents a problem to the Mobile Network Operator. The problem occurs because there are no regulations to check the IMEIs of devices passing customs clearance and as a result, devices with fake/incorrect IMEIs are easily spreading between different markets without any resistance.

Based on Mobile Network Operator experience, there are several typical scenarios of fake/incorrect IMEIs:

– Copied IMEI for a particular consignment of IoT Devices, where the chip which stores the IMEI was not properly coded by the manufacturer.

– Substituted IMEI for the IoT Device, taken from the IMEI range dedicated to a different type of device, with the consequence that the Network misidentifies the device type.

– Fake IMEI which has been re-flashed by the IoT Device Maker from its original value.
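Two of these scenarios can be partly screened for with simple checks, sketched below under stated assumptions. The 15th digit of an IMEI is a Luhn check digit, so malformed values can be rejected, and an IMEI copied across a consignment shows up as a duplicate. The function names are hypothetical, and neither check proves that an IMEI is genuine: a copied or substituted IMEI still passes the Luhn test, and detecting a substituted range would need a lookup of the allocated IMEI ranges, which is not shown.

```python
from collections import Counter


def luhn_valid(imei):
    """Return True if `imei` is 15 ASCII digits with a valid Luhn check digit."""
    if len(imei) != 15 or not (imei.isascii() and imei.isdigit()):
        return False
    total = 0
    # Walk from the check digit leftwards, doubling every second digit.
    for index, char in enumerate(reversed(imei)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def audit_imeis(imeis):
    """Return (malformed, duplicated) IMEIs found in a consignment list."""
    counts = Counter(imeis)
    malformed = sorted(i for i in counts if not luhn_valid(i))
    duplicated = sorted(i for i, n in counts.items() if n > 1)
    return malformed, duplicated
```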

3GPP standards non-compliance cases 

Non-compliance with 3GPP standards has been observed in the signalling flows of several devices, and even of entire types of devices.

The device capabilities reported to the Network differ from the real device behaviour. The following cases are the most typical:

– False information regarding supported frequencies has been sent to the Network, e.g. GSM 1900 instead of GSM 1800

– False information regarding the class of output radio power

These false capabilities put stress on the Network and cause abnormal behaviour in the interaction between the Network and the device.

Incorrect responses to technical parameters and requirements sent by the Network in system information messages:

– Periodic Location Updates performed much more often than the Network requires. The device ignores the Periodic Location Update interval set by the Network, which doubles or even triples the signalling load on the Network.

– Frequent restarts of the device, with the related signalling flows such as IMSI attach and GPRS attach, which increase the Network load. The restart mechanism is pre-programmed in the device application and may not be optimized for real Network conditions. For example, loss of the satellite connection to the device’s GPS module could be a criterion for the application to reboot the device. This could cause additional network load if, for example, a car with such a device installed is parked under a hangar roof.

– Inability of the device to attach to the Network: the device sends IMSI attach requests but misinterprets the Network’s standard signalling response, which causes the device to restart and consequently to send frequent attach requests.

Other Reported Examples 

  • Digital Picture Frame – If the device’s cloud based server is not available, the device starts to ping the server every 5 seconds to re-establish the network connection. When a Mobile Network Operator has thousands of such devices in their network doing the same thing, it results in a “denial of service” attack.
  • M2M Device – When configured with an invalid APN or a deactivated (U)SIM, the device still attempts to obtain a PDP context at a very aggressive rate, unnecessarily consuming network resources and, if deployed on a large scale, would congest or crash the network.
  • M2M Device Behaviour after Network Outages – After a network outage, when the network comes back up, a large number of devices will see the network and all attempt to access it at the same time. The network is unable to respond to all these simultaneous requests. This puts these devices into a state where they are continually attempting to access the network, potentially crashing the SGSN.
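The last example, many devices returning at the same instant after an outage, can be illustrated by letting each device wait a random time before its first attempt. The sketch below is a simple simulation under assumed values; the function names, the fleet size and the window length are hypothetical and are not taken from this document.

```python
import random
from collections import Counter


def reconnect_offsets(device_count, window_s, rng=None):
    """Give each device a random wait, in seconds, within [0, window_s]."""
    rng = rng or random.Random()
    return [rng.uniform(0.0, window_s) for _ in range(device_count)]


def peak_attempts_per_second(offsets):
    """Return the largest number of first attempts in any one-second slot."""
    per_second = Counter(int(offset) for offset in offsets)
    return max(per_second.values())
```

For example, `peak_attempts_per_second(reconnect_offsets(10000, 600))` can be compared with the 10,000 simultaneous attempts that occur when no device waits at all. Spreading the attempts over the window lowers the peak the network has to absorb, at the cost of a longer recovery time for individual devices.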