Esper Deep Cuts: Network Noncompliance

Keith Szot

At Esper, we specialize in building a platform and offering management tools for dedicated device solutions. But in this complex environment, our customers run into complex issues that require “above and beyond” expertise to get to the root causes and ultimately solve the problems in a commercially viable way. As an infrastructure provider for dedicated and fully managed solutions on both Android and iOS, we go much deeper than typical MDM vendors. 

This is the fourth blog in our Deep Cuts series of posts that provide case studies of actual situations we’ve encountered and solved for our customers. Names have been removed to protect the innocent (and, in some cases, the not-so-innocent). This one is an example of a not particularly complex problem, but it had a huge impact and is a lesson everyone should remember in case it happens to YOU!

Deep Cut #4: Irrational IP Infrastructure

Sometimes, a little thing can throw big things way off. Staging is the operation where batches of dedicated devices are prepared at a single location, then packaged for shipment and, ultimately, final deployment. 

For us, a big part of this operation is provisioning the device onto Esper’s device management platform. We’ve seen customers do this themselves (noting that I have actually gone onsite to personally help these customers with it — nothing better than being on the line getting your hands dirty). But once you get to higher volumes, it's common for customers to utilize a specialized third-party logistics (or 3PL) vendor to do the staging for them.

These 3PLs are all about throughput and efficiency. If they have thousands of devices, they want to move them through quickly. Staging is manual labor, so the more devices a technician can process in a given time, the more money the 3PL can make, and the better the per-device price it can offer the customer.

Esper’s provisioning is very efficient. Using our various technologies (our 6-tap versus competitor 6-tap, Seamless versus non-seamless, etc.), I’ve done SOP and time metric quantitative comparisons against the competition, and we are astonishingly faster. It makes sense, as most MDMs come from the world of managing employee devices where it doesn’t really matter as much — after all, you have the employee do it! But that’s not the case in the dedicated world, where staging can be a significant expenditure and bottleneck the process. Important note — provisioning requires Internet connectivity to talk to Esper’s Cloud system, get the apps to install, etc.

Since a 3PL's whole model is to apply human labor to a batch job, the worst thing that can happen is that they schedule scads of people for a staging batch and then run into an issue where they can’t stage. This results in people hanging around doing nothing but burning the 3PL’s cash.

Yes, it happened to one of our customer’s 3PLs, and the 3PL initially blamed Esper. Here is where the key lesson will lie for those looking at high volume with your 3PL. Let’s jump in!

Stoppage on the Line!

Our customer had a batch of tens of thousands of devices to get staged, which included provisioning to Esper. Based on the time metrics and human resource pool, the 3PL targeted staging several thousand devices a day. And given the volume, it required days of work by a dedicated team.

Since the 3PL had to secure the time of the corresponding number of humans, they were exposed by the sunk cost. Plus, the customer needed the devices as soon as possible so they could achieve delivery to their users and start monetizing the fleet.

The three parties did all the prep work, which included an on-site visit by Esper at the 3PL for a dry run to ensure the provisioning part of the operation was smooth and efficient with no gotchas. It was all good. 

On day one, the 3PL kicked off with a swarm of techs to get provisioning going. It was all going fine, but once they had provisioned the first thousand devices, provisioning suddenly stopped working. To them, Esper provisioning was hanging. Now, the techs were indeed just sitting around; they could do nothing. And Esper was to blame, or so it seemed.

Quick Investigation and Resolution

We were not on site, but since many factors impact provisioning, we have both a runbook and an on-device reporting mode for our provisioner, and we quickly put them to the task.

The first clue was that our cloud system was not seeing these devices at all. They were not on the Esper Cloud’s grid. Hmm…. The next step was to work remotely with the 3PL.

To the 3PL, the devices appeared to be connected to the Internet because the UI showed full Wi-Fi signal bars. We quickly ruled out the firewall, as the dry run went fine. So, we had the techs on the floor invoke reporting mode on the devices that were not provisioning.

From that, we determined that while these devices were connected to the Wi-Fi network, they did not have IP addresses.

That immediately led to the 3PL’s network infrastructure, specifically their DHCP Server. A DHCP Server is a network service that assigns IP addresses and other network parameters (such as the default gateway and DNS servers) to endpoint devices. It gives each device a unique internal IP address from a configured pool. The router then typically uses network address translation (NAT) to forward traffic to the public Internet, where all of those devices show up as a single IP address, and to route return traffic back to the proper internal IP address. A device that never receives an address can join the Wi-Fi network but cannot reach the Internet.

DHCP Servers can only assign as many IP addresses as their configured address pool holds. Consumer routers typically include a software DHCP Server with a pool of roughly 100-250 addresses. Once the DHCP Server has leased every address in the pool, it cannot issue any more until a lease expires or is released.

In addition, there is the DHCP lease, which is the amount of time a DHCP Server allows an endpoint to use an IP address. The lease time is typically configurable, with a common default of 24 hours. This means that if an endpoint obtains an IP address from a DHCP Server, the DHCP Server generally keeps that IP address reserved for 24 hours, even if the device is turned off and never put back on the network again. Cue the provisioning scenario. We are getting warm here, folks!

As expected, this 3PL was using a more robust DHCP Server implementation than a consumer router and initially resisted a deeper investigation. However, the data made a strong case from our point of view, so we pushed. 

The 3PL reluctantly brought in their network admin. Upon investigation, it turned out that this DHCP Server had a limitation of about 1,000 IP addresses. For comparison, a Windows Server running a DHCP Server can literally issue six figures of IP addresses!

Furthermore, the default lease time was 24 hours. Once the 3PL reached the ~1,000 device level, the DHCP Server exhausted all IP addresses and held on to those assignments for 24 hours. Even though the devices with the assigned addresses were back in their boxes, ready to be deployed and never again to be on that network, the DHCP server was still holding their IP addresses. The IP store was fresh out of addresses — come back tomorrow! That wasn’t going to work.
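To make the failure mode concrete, here is a minimal simulation sketch. The pool of about 1,000 addresses and the 24-hour lease come from the story above; the 375 devices per hour and the 8-hour shift are assumptions chosen to match "several thousand devices a day," and the function name is purely illustrative.

```python
from collections import deque

def first_failure(pool_size, lease_minutes, devices_per_hour, shift_hours):
    """Return the 1-based index of the first device that gets no address, or None."""
    interval = 60.0 / devices_per_hour   # minutes between new devices joining
    active = deque()                     # expiry times of leases still held
    total = int(devices_per_hour * shift_hours)
    for i in range(total):
        now = i * interval
        # Drop leases that have expired. Staged devices go back in the box,
        # so this model assumes they never renew or release a lease.
        while active and active[0] <= now:
            active.popleft()
        if len(active) >= pool_size:
            return i + 1                 # pool exhausted: this device gets no IP
        active.append(now + lease_minutes)
    return None

# Assumed throughput: 375 devices/hour over an 8-hour shift (3,000 per day).
print(first_failure(1000, 24 * 60, 375, 8))  # 1001
print(first_failure(1000, 60, 375, 8))       # None
```

With a 24-hour lease, nothing expires during the shift, so device 1,001 is the first to fail. With a 60-minute lease at the same assumed rate, only about 375 leases are held at any moment and the pool never runs out.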

DHCP Hotfixin’

At this point, we solved the issue quickly. Some quick math against the provisioning time and throughput led to a sensible setting for the IP address lease duration. With that change, the DHCP Server would reclaim each IP address soon after a device left the network and assign it again. This took care of the blocker, provisioning worked again, and the 3PL only lost a few hours.
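The post does not say which lease value was chosen, but the quick math can be sketched as follows. In steady state, the number of leases held is roughly the arrival rate times the lease duration, so the longest safe lease is the pool size divided by the throughput. The 375 devices per hour and the 80% headroom factor are assumptions for illustration only.

```python
def max_lease_minutes(pool_size, devices_per_hour, headroom=0.8):
    """Longest lease (minutes) that keeps held leases within headroom * pool_size."""
    # Steady state: leases held ~= arrival rate * lease duration.
    # Solve headroom * pool_size = devices_per_hour * lease_hours for the lease.
    lease_hours = headroom * pool_size / devices_per_hour
    return lease_hours * 60

# Assumed: ~1,000-address pool (from the story) and 375 devices per hour.
print(round(max_lease_minutes(1000, 375)))  # 128
```

Any lease comfortably below that figure, while still longer than the time a single device needs to finish provisioning, would keep the pool from running dry at this assumed rate.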

One reason this was cleared up so quickly is that our customer purchased our premium support. If you do that, we are on it like Donkey Kong! Otherwise, the process would have been much slower for the customer and 3PL, resulting in greater losses. In the end, they just lost that couple of hours.

It was surprising that the 3PL was not ready for this. This leads to the lessons for customers…

The Lessons to Remember

When evaluating a 3PL, there are some (perhaps) not-so-obvious due diligence questions to ask. Clearly, inspecting the 3PL’s network infrastructure is key. The firewall is definitely one area, and we addressed that up front in this case: we shared our firewall allow list with the customer and 3PL, the 3PL network admin made the necessary configuration changes, and we validated them by provisioning test devices.

But what we all missed was the DHCP server's capacity and lease duration relative to the staging throughput. So, if you plan to use a 3PL at higher daily throughputs, be sure to ask questions about IP address capacity and/or lease duration.

Another consideration is the bandwidth of the Internet connection. While the communication to the Esper Cloud for configuration instructions executed by our Esper agent on the device is relatively lightweight, the actual application installations may not be. If you have 100 devices on the staging tables trying to install a 200MB application load through a thin Internet connection, that can bog things down.
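As a rough back-of-the-envelope sketch of that scenario, the snippet below estimates how long a table of devices needs to pull its app load over one shared link. The 100 devices and 200MB come from the example above; the link speeds are assumptions, and the estimate ignores protocol overhead and other traffic, so real numbers will be worse.

```python
def batch_download_minutes(devices, app_load_mb, link_mbps):
    """Minutes for every device to download the app load over one shared link."""
    total_megabits = devices * app_load_mb * 8   # megabytes -> megabits
    return total_megabits / link_mbps / 60       # seconds -> minutes

# 100 devices x 200 MB each, over two assumed link speeds.
for mbps in (50, 1000):
    minutes = batch_download_minutes(100, 200, mbps)
    print(f"{mbps} Mbps: {minutes:.1f} minutes")
```

On the assumed 50 Mbps link, one table of 100 devices ties up the connection for the better part of an hour, which is the kind of stall that adds up quickly across a multi-day staging job.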

For this particular customer, we anticipated this bottleneck and sold Esper's Local Cache server. Local Cache stores the application files on the local network instead of serving them from the Esper Cloud, so installs run over the LAN and avoid the bottleneck of the Internet connection.

It’s worth noting that this is not allowed for Play Store applications if the customer is using Managed Google Play. A deep dive into this topic is for another day. If you have Play Store installs, you’ll need to make sure you have a high Internet bandwidth connection.

If you use consumer devices for dedicated device use cases, system updates can also get in the way. Think of phones sitting in boxes for months before you fire them up. Depending on how the OEM implemented their setup flow, the system update can take precedence and take the device offline while it installs and forces a reboot. Find that out first by inspecting the state of the firmware on a representative device from the batch, collaborating with the OEM, and smoke testing the flow.

Another interesting but often overlooked aspect is the device packaging. Oftentimes, devices are packaged in a way that makes extracting them slow, with a similar inefficiency when putting them back. Vertical market OEMs understand that and ship in packaging that is more efficient for staging. This matters because staging time is money: 5 extra minutes per device doesn’t sound like much, but for 10,000 devices, that’s about 833 hours. At USD $50 an hour, that’s roughly an extra $41,700. This also circles back to Esper’s provisioning value prop because we require fewer steps and, thus, less human attention. Time is indeed money.
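The packaging arithmetic is easy to rerun for your own fleet. This small sketch uses the figures from the paragraph above; the function name is just for illustration, and the hourly rate should be swapped for your 3PL's actual labor cost.

```python
def extra_staging_cost(devices, extra_minutes_per_device, hourly_rate_usd):
    """Return (extra labor hours, extra cost in USD) for per-device overhead."""
    hours = devices * extra_minutes_per_device / 60
    return hours, hours * hourly_rate_usd

hours, cost = extra_staging_cost(10_000, 5, 50)
print(f"{hours:.0f} hours, ${cost:,.0f}")  # 833 hours, $41,667
```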

If you’d like to learn more about our awesome provisioning flow and working with 3PLs, please reach out to us!

Keith is the Chief Evangelist at Esper, the geeky force-of-nature driving efforts to build a robust community of device manufacturers and software developers to connect with our customers.