Generative AI is all the rage in the tech world. While there are compelling use cases for GenAI in the dedicated device segment, bringing AI to the edge is not limited to GenAI. In fact, we’ve been helping a small set of customers with edge inference for years. But now, with the increasing sophistication of AI technology and the lower barriers to adopting it, we’ve seen a marked increase in our customers’ interest in rolling out AI PoCs for new use cases to build the business case for scale.
Why perform AI inference at the edge? Our use cases often need to be independent of internet availability, execute with minimal latency, and satisfy privacy and security requirements where the resulting data set must reside on the edge in a locked-down manner. An AI model on the edge is an artifact, and a key part of Esper’s value prop is artifact management at scale, which provides the enabling technology for developers and the right management experience for IT operators. That’s Esper.
With our edge AI expertise and the corresponding tooling we built to support edge AI applications, we created the Pando Solutions Accelerator. When customers engage with Esper via our Pando offering, we work with them on the full journey — from solution design to a working PoC — leveraging technology building blocks supplied by both Esper and the larger edge AI ecosystem. Customers bring the use case and the business value, and we collaborate to create the right user experience with the required technology infrastructure.
Here is one such journey with an Esper customer: We went from concept to PoC quickly and efficiently to meet a tradeshow deadline.
Patent Is Pending, But So Is The PoC!
One of Esper’s customers is a leader in providing end-to-end age verification services to retailers selling age-restricted items. This area is a huge risk and pain point for retailers, as selling age-restricted items to minors results in steep fines or even closure. The big retailers have the technology and business processes to address these risks, but SMB retailers, especially convenience stores, are much more exposed. This particular customer provides services to smaller retailers, helping them implement age verification economically, and their current solution leverages Esper to manage the devices it runs on.
In parallel, self-service kiosks are booming in the restaurant and retail segments. From that, our customer came up with a unique solution: a self-service kiosk for age-restricted items. To protect this novel idea, they are going through the patenting process, which means they had built a prototype sufficient for the initial filing. However, they wanted to take the next step and demonstrate the solution commercially.
Given our close relationship with the customer, the AI-powered self-service kiosk came up when we discussed how we could help propel their business. The customer had already done some experimentation, so they knew how they wanted to implement AI for this particular use case. But they were a bit stuck on achieving a demonstrable PoC in time for NACS (the National Association of Convenience Stores trade show), which was just a few months away.
Esper jumped in, applying our Pando Solutions Accelerator to take the concept from idea to PoC.
The Show Will Go On!
When we engaged with the customer, NACS was less than two months away. Under normal circumstances, getting a fully functional demonstration built and ready would be nearly impossible. But that’s where Pando comes into play: Esper had already built the technology underpinnings for AI model management and inference at the edge.
The first step was to define the use case end-to-end as envisioned. The demo needed an application that showed the transactional side of purchasing an adult beverage — in this case, various beer and wine options. This then leveraged the existing age verification implementation: scanning the barcode on the back of the ID to retrieve the licensee's birthdate and optionally checking a third-party system to verify the ID's validity.
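For context, US driver's licenses encode the cardholder's data in a PDF417 barcode that follows the AAMVA standard, with the date of birth carried in the "DBB" data element. A minimal Python sketch of that step might look like the following — the parsing is deliberately simplified, and the helper names are illustrative rather than the customer's actual implementation:

```python
from datetime import date, datetime

def parse_aamva_dob(barcode_text: str) -> date:
    """Extract the date of birth from decoded AAMVA PDF417 barcode text.

    AAMVA data elements are newline-separated: a three-letter element ID
    followed by its value. "DBB" carries the date of birth (MMDDCCYY on
    US licenses in recent AAMVA versions). Real data needs subfile and
    version handling; this sketch is deliberately simplified.
    """
    for line in barcode_text.splitlines():
        idx = line.find("DBB")
        if idx != -1:
            raw = line[idx + 3 : idx + 11]
            return datetime.strptime(raw, "%m%d%Y").date()
    raise ValueError("no DBB (date of birth) element found")

def is_of_age(dob: date, minimum_age: int = 21) -> bool:
    """Check whether the cardholder has reached the minimum purchase age."""
    today = date.today()
    birthday_passed = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (not birthday_passed) >= minimum_age
```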
The novel part is that a shopper could present their ID to the kiosk, have the system verify that the person standing there matches the photo on the ID, and check that the ID is not a fake — all without a human attendant or an internet connection. This is where AI and edge inference came into play.
Time for some context. Arguably, the current nexus of AI is the cloud — think of ChatGPT and Microsoft Copilot, which run in huge data centers filled with Nvidia GPUs. AI engineers train models in cloud environments. Those models tend to be very broad, and updating one is a simple container operation across data center instances with ultra-reliable, high-speed connections.
That is not what you typically find at a liquor store with poor internet connectivity and limited funds for IT capital expenditures. To make this a viable offering, the customer needed the solution to run on the typical kiosk endpoint hardware they expected to encounter in the field — an aging x86 CPU without an NPU (Neural Processing Unit), and thus minimal edge inference capability.
Rather than trying to run inference on the kiosk POS system, we dropped in a Linux-based Intel x86 edge device with an Nvidia GPU. Such a configuration provides the necessary TOPS for inference, supplying it to one or more customer-facing POS devices. The beauty of this approach is that an existing customer deployment can be made AI edge inference-ready without changing the current edge deployment — the inference box is just a drop-in.
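To illustrate the drop-in pattern — this is a sketch of the general idea, not Esper's actual interface; the gateway URL and response schema are assumptions — the POS endpoint simply ships captured images to the gateway over the local network and gets a verdict back:

```python
import requests  # standard HTTP client; assumes the gateway exposes an HTTP API

# Hypothetical gateway address -- in the real solution this is found via
# network discovery (covered below) rather than hard-coded.
GATEWAY_URL = "http://192.168.1.50:8080/verify"

def verify_face(id_photo_jpeg: bytes, live_frame_jpeg: bytes) -> bool:
    """Send the ID portrait and a live camera frame to the inference gateway.

    The gateway runs the face comparison model and returns a match decision,
    so the POS endpoint needs no local NPU or GPU. The response schema here
    is illustrative only.
    """
    resp = requests.post(
        GATEWAY_URL,
        files={"id_photo": id_photo_jpeg, "live_frame": live_frame_jpeg},
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()["match"]
```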
The customer knew exactly the use case they needed and the technology plumbing their current solution already provided. But they needed a new Android app experience for the demonstration. We helped out by building the Android app that would run on the block-and-tackle POS device used for the demonstration. All we needed from the customer were the user screen flows, which they readily supplied.
While we had the edge AI building blocks for creating the PoC, we still needed to cover some gaps. This leads us to some interesting technical challenges — some we had to overcome, and some Pando had already addressed.
Clearing the Hurdles Towards PoC
The edge kiosk was a typical Android x86 POS device managed by Esper in retail environments.
Since the crux of the use case was matching the photo on a valid ID to the person presenting it, the kiosk needed a camera. Most POS systems lack a camera entirely, let alone one that would work for this use case. Our demo system was no exception, so we had to find a camera that worked with the device's Android x86 OS build. Our customer took the lead in sourcing and testing different cameras and found one that fit the bill.
The POS device also dictated the APK we needed to build — it had to target the Android x86 ABI without GMS dependencies (the device ran AOSP), which was not a problem for us. Our customer provided the user experience flow as simple mockups, which made it easy to build the app to the use case specification.
The AI gateway performing the inference needed to be appropriately powered for the use case. Linux was the most sensible host OS, combined with an Nvidia GPU. We ended up going with the ASUS NUC 14 Performance running Intel x86. Through Pando, we extended Esper’s device management infrastructure to Linux, enabling onboarding and management of these Linux-based edge devices, including model management.
This brings us to the AI model — the solution needed a face comparison model. The cloud AI world is all about Docker-based models, and why not? It is a resource-rich environment where you can dial up whatever you need; COGS be damned! But on the edge, running Docker plus the required interprocess communication creates overhead that consumes system resources, which means a more expensive gateway device. Who wants to pay for a Blackwell at every store when you have 15,000 stores? Instead, a compiled binary model is the most efficient way to go.
We ended up going with a technology similar to Dlib, as it offered the right tradeoff between speed and accuracy, plus the ability to compile the model for Linux x86. We tuned the model for the use case and then compiled it.
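For a sense of what face comparison looks like in practice, here is a minimal sketch using the open-source dlib library itself — our production model and tuning differ; the pretrained model files and the 0.6 distance threshold are dlib's published defaults, used purely for illustration:

```python
import dlib
import numpy as np

# Standard dlib pipeline: face detector -> landmarks -> 128-d embedding.
# The .dat files are dlib's publicly available pretrained models.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
encoder = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

def face_embedding(rgb_image: np.ndarray) -> np.ndarray:
    """Return the 128-d embedding of the first face found in an RGB image."""
    faces = detector(rgb_image, 1)  # upsample once to help with small faces
    if not faces:
        raise ValueError("no face detected")
    landmarks = predictor(rgb_image, faces[0])
    return np.array(encoder.compute_face_descriptor(rgb_image, landmarks))

def same_person(id_photo: np.ndarray, live_frame: np.ndarray,
                threshold: float = 0.6) -> bool:
    """Compare the ID portrait and the live frame by embedding distance.

    dlib's published guidance is that a Euclidean distance below 0.6
    usually indicates the same person; a production system tunes this
    for its cameras and lighting.
    """
    distance = np.linalg.norm(face_embedding(id_photo) - face_embedding(live_frame))
    return distance < threshold
```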
As with any Esper-managed device, the devices continue to operate as configured on the local network even when that network is cut off from the internet. Since a trade show is a tough environment where internet connections are notoriously expensive, we built the solution to operate without the internet. It simply required staging before the show.
But that brought up a different problem: how would the POS device dynamically discover and connect to the AI inference gateway — a.k.a. network discovery? A couple of things drove this: the network environment we’d use at NACS was still up in the air when we needed to finish our part of the implementation, and we were experimenting with different edge gateway hardware.
So, rather than hard-code a static configuration, we implemented a discovery protocol in our Esper infrastructure so the endpoint could find the gateway device and set up the connection between the two to perform inference operations without intervention. We built the discovery so a single inference gateway can serve multiple endpoints across different host OSes. Conceptually, this means a single location needs only one inference gateway to deliver AI to all of its endpoints. Furthermore, the whole solution hinged on interoperability between two different OSes: the endpoint running Android and the gateway running Linux.
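The underlying pattern is straightforward. Here is a minimal Python sketch of broadcast-based discovery — not Esper's actual protocol; the port, probe message, and reply format are invented for illustration:

```python
import json
import socket

DISCOVERY_PORT = 49152     # hypothetical port, not Esper's actual protocol
PROBE = b"pando-gateway?"  # illustrative probe payload

def discover_gateway(timeout: float = 2.0) -> tuple[str, int] | None:
    """Broadcast a probe on the LAN and wait for a gateway to answer.

    A gateway listening on DISCOVERY_PORT replies with the port of its
    inference service; the endpoint then connects to it directly. Returns
    (gateway_ip, service_port), or None if no gateway answered in time.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.settimeout(timeout)
        sock.sendto(PROBE, ("255.255.255.255", DISCOVERY_PORT))
        try:
            data, addr = sock.recvfrom(1024)
        except socket.timeout:
            return None
        reply = json.loads(data)  # e.g. {"port": 8080}
        return addr[0], reply["port"]
```

Because discovery happens at runtime, the same endpoint build works whatever network the show floor ends up providing, and whichever gateway hardware we brought.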
Preparing for the Show
Coordinating with the customer for the show was interesting: they’re on the East Coast, while we’re on the West Coast. We shared app builds with them, held design reviews to get their input, iterated on the app, and made sure the experience was what they wanted to demonstrate.
It also meant we had to stress test the solution in our office lab, and through that, we ironed out all the gotchas. Then we prepared the kit for the show — along with training on assembling the pieces, and fallbacks for anything that might fail. It all went into a Pelican case that we carried to NACS. Because yes, we were also going to help the customer demo the solution in their NACS booth — we were booth staff.
How’d It Go?
The booth setup went great. The only challenge was that the trade show floor is a visually noisy environment, so we had to modify the setup slightly to ensure the solution worked robustly.
But it worked very well. Through this, our customer demonstrated their patent-pending AI-driven solution for self-service kiosks, generating many promising leads that will hopefully advance their AI push and help them break into new accounts with their core offering.
So, in less than two months, leveraging Esper’s Pando Solutions Accelerator, our customer was able to take their AI concept to PoC and show it to the world at NACS.
What idea do you have for applying edge AI to your business? We can help here at Esper!