Binary Telemetry Encoding for IIoT: Why JSON Is Killing Your Bandwidth [2026]
If you're sending PLC tag values as JSON from edge gateways to the cloud, you're wasting 80–90% of your bandwidth. On a cellular-connected factory floor with dozens of machines, that's the difference between a $50/month data plan and a $500/month one — and the difference between sub-second telemetry and multi-second lag.
This guide breaks down binary telemetry encoding: how to pack industrial data efficiently at the edge, preserve type fidelity across the wire, and design batch grouping strategies that survive unreliable networks.

The JSON Problem in Industrial Telemetry
Consider a simple temperature reading from a Modbus-connected chiller:
{
  "timestamp": 1709251200,
  "device_type": 1018,
  "serial": "0x00300039",
  "tags": [
    {"id": 1, "name": "Tank Temperature", "value": 42.5, "status": "ok"},
    {"id": 2, "name": "CQT 1 Approach Temp", "value": 18.3, "status": "ok"},
    {"id": 3, "name": "CQT 1 Chill In Temp", "value": 22.1, "status": "ok"}
  ]
}
That's ~320 bytes for three sensor readings. The actual data — three values that each fit in a 16-bit register — is 6 bytes. You're paying for 314 bytes of structural overhead on every single transmission.
On a typical plastics manufacturing floor running 12 machines with 30–40 tags each at 60-second intervals, that's roughly 14 MB/day in pure JSON overhead. Over cellular, that adds up fast.
The Real Cost Breakdown
| Component | JSON Size | Binary Size | Overhead |
|---|---|---|---|
| Field names ("timestamp", "device_type", etc.) | ~120 bytes | 0 bytes | ∞ |
| Brackets, quotes, colons | ~80 bytes | 0 bytes | ∞ |
| Numeric values as ASCII strings | ~40 bytes | 14 bytes | 2.8x |
| Status strings | ~30 bytes | 3 bytes | 10x |
| Total for 3 tags | ~320 bytes | ~25 bytes | 12.8x |
Binary encoding eliminates all structural overhead. Field names, quotes, brackets, and colons vanish. What remains is pure data with minimal framing.
Designing a Binary Wire Format
A good binary telemetry format for IIoT needs four properties:
- Self-describing enough to decode without an external schema
- Compact enough to fit cellular/LoRa constraints
- Groupable so multiple readings share context (timestamps, device IDs)
- Error-tolerant so partial corruption doesn't destroy entire payloads
The Anatomy of a Binary Telemetry Frame
Here's a proven structure used in production edge systems:
┌──────────────────────────────────────────────────┐
│ Command Byte (1 byte)                            │
│ Number of Groups (4 bytes, uint32)               │
├──────────────────────────────────────────────────┤
│ ┌─── Group 1 ──────────────────────────────────┐ │
│ │ Timestamp (4 bytes, uint32)                  │ │
│ │ Device Type (2 bytes, uint16)                │ │
│ │ Production Year (1 byte)                     │ │
│ │ Production Month (1 byte)                    │ │
│ │ Production Number (2 bytes, uint16)          │ │
│ │ Number of Values (4 bytes, uint32)           │ │
│ │ ┌─── Value 1 ──────────────────────────────┐ │ │
│ │ │ Tag ID (2 bytes, uint16)                 │ │ │
│ │ │ Status (1 byte)                          │ │ │
│ │ │ Array Size (S) (1 byte)                  │ │ │
│ │ │ Element Size (N) (1 byte)                │ │ │
│ │ │ Data (S×N bytes)                         │ │ │
│ │ └──────────────────────────────────────────┘ │ │
│ │ ┌─── Value 2 ──────────────────────────────┐ │ │
│ │ │ ...                                      │ │ │
│ │ └──────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────┘ │
│ ┌─── Group 2 ──────────────────────────────────┐ │
│ │ ...                                          │ │
│ └──────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────┘
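An encoder for this layout can be sketched in a few lines of Python using the struct module. The dict-based API and function name below are our own; the field order and big-endian byte order follow the diagram above:

```python
import struct

def pack_frame(command: int, groups: list) -> bytes:
    """Sketch of the frame layout above; big-endian ("network order") throughout."""
    out = struct.pack(">BI", command, len(groups))
    for g in groups:
        # Group header amortizes timestamp + device identity over all values
        out += struct.pack(">IHBBHI", g["timestamp"], g["device_type"],
                           g["prod_year"], g["prod_month"], g["prod_number"],
                           len(g["values"]))
        for tag_id, status, elem_size, data in g["values"]:
            # Per-value header: tag ID, status, array size, element size
            out += struct.pack(">HBBB", tag_id, status,
                               len(data) // elem_size, elem_size)
            out += data
    return out
```

A single-group frame carrying one int16 value comes out at 26 bytes: 5 for the frame header, 14 for the group header, 7 for the value.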
Why Groups Matter
The key insight is temporal grouping. Multiple tag values read at the same moment share a single timestamp and device identifier. Instead of repeating context with every value, you amortize it across an entire group.
In a typical chiller with 65+ temperature and pressure tags all polled in the same cycle, this saves ~10 bytes per tag — over 650 bytes per polling cycle.
Status Bytes Save Bandwidth Too
Notice the status byte (0x00 = OK, anything else = error code). When a tag read fails, you transmit just the 3-byte header (ID + status) instead of the full value. Failed reads are common on noisy factory floors — serial bus contention, PLC CPU overload, intermittent wiring. A binary format handles this gracefully without the awkward "value": null, "error": "timeout" constructs of JSON.
Data Type Packing: Getting the Bytes Right
Industrial data types map directly to fixed-width binary representations. Here's the canonical mapping used in most edge telemetry systems:
Type Encoding Table
| PLC Type | Wire Size | Byte Order | Example Value | Packed Bytes |
|---|---|---|---|---|
| BOOL | 1 byte | N/A | true | 01 |
| INT8 | 1 byte | N/A | 55 | 37 |
| UINT8 | 1 byte | N/A | 157 | 9D |
| INT16 | 2 bytes | Big-endian | -55 | FF C9 |
| UINT16 | 2 bytes | Big-endian | 32768 | 80 00 |
| INT32 | 4 bytes | Big-endian | -55 | FF FF FF C9 |
| UINT32 | 4 bytes | Big-endian | 4294967281 | FF FF FF F1 |
| FLOAT32 | 4 bytes | IEEE 754 BE | 1.55 | 3F C6 66 66 |
| FLOAT32 | 4 bytes | IEEE 754 BE | -1.55 | BF C6 66 66 |
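The byte columns in this table can be reproduced directly with Python's struct module, using the big-endian format codes (>h, >H, >i, >I, >f):

```python
import struct

# Big-endian ("network order") packing reproduces the table's byte columns
assert struct.pack(">h", -55) == bytes.fromhex("FFC9")              # INT16
assert struct.pack(">H", 32768) == bytes.fromhex("8000")            # UINT16
assert struct.pack(">i", -55) == bytes.fromhex("FFFFFFC9")          # INT32
assert struct.pack(">I", 4294967281) == bytes.fromhex("FFFFFFF1")   # UINT32
assert struct.pack(">f", 1.55) == bytes.fromhex("3FC66666")         # FLOAT32
assert struct.pack(">f", -1.55) == bytes.fromhex("BFC66666")        # FLOAT32
```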
Byte Ordering: The Silent Data Corruptor
Byte ordering mismatches are the #1 cause of garbage data in IIoT deployments. Here's why:
- Modbus registers are big-endian (MSB first) by specification
- EtherNet/IP (CIP) uses little-endian for REAL/DINT values
- Most ARM-based edge gateways are little-endian natively
- x86 cloud servers are little-endian
When you read a float from a Modbus holding register, the bytes arrive big-endian. If your edge gateway is little-endian and you memcpy those bytes into a float without swapping, you get nonsense. A temperature of 42.5°C becomes a meaningless denormal around 1.5e-41.
Best practice: Standardize on big-endian (network byte order) for the wire format, regardless of the source protocol. Convert at the edge before packing.
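A sketch of the conversion in Python (function name ours; note that some Modbus devices transmit the low word first, so always check the device manual before assuming word order):

```python
import struct

def modbus_registers_to_float(hi_reg: int, lo_reg: int) -> float:
    """Combine two 16-bit Modbus holding registers (high word first)
    into an IEEE 754 float32, independent of host endianness."""
    raw = struct.pack(">HH", hi_reg, lo_reg)   # keep network byte order
    return struct.unpack(">f", raw)[0]         # decode explicitly, never memcpy

# 42.5°C arrives as registers 0x422A, 0x0000
value = modbus_registers_to_float(0x422A, 0x0000)
# Misreading the same bytes as little-endian yields a tiny denormal, not 42.5
garbage = struct.unpack("<f", struct.pack(">HH", 0x422A, 0x0000))[0]
```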
Array Values: When One Tag Returns Many
Some PLC tags return arrays — a recipe with 8 ingredient percentages, or a batch of 4 temperature zones. The binary format handles this with the array_size and element_size fields:
Tag ID: 0x0023 (Recipe Values)
Status: 0x00 (OK)
Array Size: 8
Element Size: 4 (float32)
Data: 32 bytes (8 × 4-byte floats)
The decoder knows to read exactly S × N bytes of payload. No delimiters, no parsing, no ambiguity.
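A decoder sketch for one value entry (helper name and return shape are our own convention):

```python
import struct

def decode_value(buf: bytes, offset: int = 0):
    """Decode one value entry: tag ID, status, array size, element size, payload."""
    tag_id, status, s, n = struct.unpack_from(">HBBB", buf, offset)
    offset += 5
    data = buf[offset : offset + s * n]  # exactly S × N payload bytes, no delimiters
    return tag_id, status, s, n, data, offset + s * n

# Recipe example from above: 8 float32 elements, 32 payload bytes
entry = struct.pack(">HBBB", 0x0023, 0x00, 8, 4) + struct.pack(">8f", *range(8))
tag_id, status, s, n, data, end = decode_value(entry)
```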
Batch Collection Strategies
Raw tag values don't ship individually — they accumulate in batches. Two finalization triggers work together:
Size-Based Finalization
Set a maximum batch buffer size (e.g., 4,000 bytes). When accumulated data approaches this limit, seal the batch and queue it for transmission. This prevents any single batch from exceeding the MQTT payload limit or the available RAM on constrained devices.
Time-Based Finalization
Set a maximum collection window (e.g., 60 seconds). Even if the buffer isn't full, finalize and send. This guarantees a maximum data latency — critical for alarm values where you can't wait minutes for the buffer to fill.
The Hybrid Approach
In practice, both triggers run simultaneously:
while collecting:
    read next tag value
    add to current group
    if group timestamp changed:
        close current group
        start new group
    if batch_size > MAX_SIZE:
        finalize and transmit
    if elapsed_time > MAX_TIMEOUT:
        finalize and transmit
A well-tuned system uses a 4 KB buffer with a 60-second timeout. During normal operation, the timeout triggers first (30–40 tags generate ~800 bytes per cycle). During high-frequency events (1-second alarm monitoring), the size limit triggers first, preventing memory exhaustion.
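The dual-trigger logic can be sketched as a small Python class (class and method names are ours; a real edge agent would run tick() from its polling loop):

```python
import time

class Batcher:
    """Sketch of size + time dual finalization for a telemetry batch buffer."""
    def __init__(self, max_size=4000, max_age=60.0, transmit=print):
        self.max_size, self.max_age, self.transmit = max_size, max_age, transmit
        self.buf = bytearray()
        self.started = None  # when the current batch began accumulating

    def add(self, encoded_value: bytes):
        if self.started is None:
            self.started = time.monotonic()
        self.buf += encoded_value
        if len(self.buf) >= self.max_size:        # size trigger
            self.flush()

    def tick(self):
        """Call periodically; enforces the maximum-latency bound."""
        if self.started and time.monotonic() - self.started >= self.max_age:
            self.flush()                           # time trigger

    def flush(self):
        if self.buf:
            self.transmit(bytes(self.buf))
        self.buf.clear()
        self.started = None
```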
Change-Based Filtering: Don't Send What Hasn't Changed
Here's where binary encoding gets truly efficient. Many industrial values are stable for long periods — a setpoint stays at 450°F for an entire 8-hour shift, a pump status remains "running" for days.
Compare-and-Send Logic
For each tag, maintain a last_values cache. On each read cycle:
- Read the current value from the PLC
- Compare with the cached value
- Only add to the batch if the value changed (or if this is the first read)
Tags that change frequently (temperatures, pressures) still transmit every cycle. Tags that rarely change (setpoints, status flags) might transmit once per hour. The bandwidth savings compound dramatically:
| Scenario | Without Filtering | With Filtering | Savings |
|---|---|---|---|
| 40 tags, 60s interval, 8hr shift | 19,200 transmissions | ~4,800 transmissions | 75% |
| 65 tags, 60s interval, 24hr shift | 93,600 transmissions | ~15,000 transmissions | 84% |
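The compare-and-send cache reduces to a few lines of Python (class name and the compare flag are our own sketch, mirroring the per-tag configuration described above):

```python
class ChangeFilter:
    """Compare-and-send cache: suppress values that haven't changed."""
    def __init__(self):
        self.last = {}

    def should_send(self, tag_id, value, compare=True) -> bool:
        if not compare:                  # critical tags always transmit
            return True
        if self.last.get(tag_id) == value:
            return False                 # unchanged: suppress
        self.last[tag_id] = value        # first read or changed: cache and send
        return True
```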
Critical Tags: Skip the Compare
Some tags must transmit every cycle regardless — delivery temperatures during active process control, flow values during filling operations. Mark these as compare: false in your tag configuration. They always transmit, ensuring the cloud has a continuous time series for trending and anomaly detection.
Event-Driven Tags: Bypass Batching Entirely
Pump status changes, heater activation, vent opening — these are operational events that demand immediate delivery. Configure these tags with two flags:
- compare: true (only transmit on change)
- do_not_batch: true (send immediately, bypass the batch buffer)
The result: a pump going from OFF to ON at 14:32:07.342 arrives at the cloud within milliseconds, not 60 seconds later in the next batch. For alarm correlation, this timing precision is essential.
Calculated Tags: Edge-Side Bit Extraction
Many PLCs pack multiple boolean flags into a single 16-bit or 32-bit register. A status word might encode:
- Bit 0: Pump running
- Bit 1: Heater active
- Bit 2: Low flow alarm
- Bit 3: High temp alarm
- Bits 4–7: Operating mode (0–15)
Rather than transmitting the raw 16-bit value and parsing it in the cloud, efficient edge systems extract individual bits at the edge and transmit them as separate boolean tags.
How It Works
Define a "calculated tag" that references a source tag with a bit mask and shift count:
Source tag: Status Word (uint16, ID: 100)
Calculated tag: Pump Running (bool, ID: 200)
→ shift: 0, mask: 0x01
Calculated tag: Heater Active (bool, ID: 201)
→ shift: 1, mask: 0x01
Calculated tag: Low Flow Alarm (bool, ID: 202)
→ shift: 2, mask: 0x01
The edge device reads the status word once, then derives multiple boolean tags without additional PLC reads. Each calculated tag inherits the source tag's polling interval and compare behavior — but the comparison happens on the derived boolean value, not the raw word.
This is especially valuable for Modbus devices where each register read consumes bus time. Reading one register and extracting 8 booleans is 8x faster than reading 8 coils individually.
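The extraction itself is a shift and a mask (function name ours; the status word value below is a hypothetical reading):

```python
def extract_bit_field(source_value: int, shift: int, mask: int) -> int:
    """Derive a calculated tag from a packed status word."""
    return (source_value >> shift) & mask

status_word = 0b0101_0110  # hypothetical raw uint16 read from the PLC

pump_running   = extract_bit_field(status_word, 0, 0x01)  # bit 0
heater_active  = extract_bit_field(status_word, 1, 0x01)  # bit 1
low_flow_alarm = extract_bit_field(status_word, 2, 0x01)  # bit 2
operating_mode = extract_bit_field(status_word, 4, 0x0F)  # bits 4–7
```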
Practical Binary Encoding Example
Let's walk through encoding a real-world data frame. Suppose an HE Central Chiller (device type 1018) reports three values at timestamp 1609459200:
Tag 1: Tank Temperature = 42 (int16)
Tag 2: CQT 1 Approach Temp = 18 (int16)
Tag 40: Pump Status changed from 0→1 (bool, event-driven)
The batch frame:
F7            Command: tag values
00 00 00 01   1 group
5F EE 66 00   Timestamp: 1609459200
03 FA         Device type: 0x03FA (decimal 1018)
00            Production year
00            Production month
30 39         Production number: 12345
00 00 00 02   2 values (pump status sent separately)
00 01         Tag ID: 1 (Tank Temperature)
00            Status: OK
01            Array size: 1
02            Element size: 2 (int16)
00 2A         Value: 42
00 02         Tag ID: 2 (CQT 1 Approach Temp)
00            Status: OK
01            Array size: 1
02            Element size: 2 (int16)
00 12         Value: 18
Total: 33 bytes. The equivalent JSON would be ~250 bytes.
The pump status change (Tag 40) is sent immediately as its own micro-frame, bypassing the batch buffer entirely:
F7 00 00 00 01 5F EE 66 00 03 FA 00 00 30 39 00 00 00 01
00 28 00 01 01 01
25 bytes for an immediate pump status event. In JSON: ~120 bytes.
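As a sanity check, the batch frame can be assembled with a few struct calls (big-endian throughout; the variable names are ours):

```python
import struct

# Batch frame: 1 group, 2 int16 values, timestamp 1609459200, device type 1018
header = struct.pack(">BI", 0xF7, 1)                         # command + group count
group  = struct.pack(">IHBBHI", 1609459200, 1018, 0, 0, 12345, 2)
value1 = struct.pack(">HBBBh", 1, 0x00, 1, 2, 42)            # Tank Temperature
value2 = struct.pack(">HBBBh", 2, 0x00, 1, 2, 18)            # CQT 1 Approach Temp
frame  = header + group + value1 + value2                    # 33 bytes total
```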
Serial Number Encoding: Packing Identity Into 4 Bytes
Device identity in IIoT systems needs to be compact but unique. A common scheme packs the manufacturing identity into exactly 4 bytes:
Byte 0: Year (0x28 = 2010, 0x29 = 2011, 0x2A = 2012, ...)
Byte 1: Month (0x00 = Jan, 0x01 = Feb, ..., 0x0B = Dec)
Bytes 2-3: Unit (sequential manufacturing number, uint16)
Example: 0x28000050 = January 2010, Unit #80 (year byte 0x28, month byte 0x00, unit 0x0050)
This encoding supports 65,536 units per month — more than enough for any manufacturing line — while fitting in a single 32-bit field. No GUIDs, no strings, no parsing.
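Assuming the year byte is an offset from 1970 (0x28 = 40 → 2010, which matches the mapping above), packing and unpacking are two struct calls each (function names ours):

```python
import struct

BASE_YEAR = 1970  # assumption: year byte = calendar year - 1970 (0x28 -> 2010)

def pack_serial(year: int, month: int, unit: int) -> bytes:
    """Pack manufacturing identity into exactly 4 bytes."""
    return struct.pack(">BBH", year - BASE_YEAR, month - 1, unit)

def unpack_serial(raw: bytes):
    """Recover (year, month, unit) from a 4-byte serial."""
    y, m, unit = struct.unpack(">BBH", raw)
    return y + BASE_YEAR, m + 1, unit
```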
When to Use Binary vs. JSON
Binary encoding isn't always the right choice. Here's the decision matrix:
| Factor | Use Binary | Use JSON |
|---|---|---|
| Network | Cellular, LoRa, satellite | Ethernet, WiFi |
| Gateway RAM | < 64 MB | > 256 MB |
| Tag count per device | > 20 | < 10 |
| Polling interval | ≤ 60 seconds | > 5 minutes |
| Cloud ingestion | Custom parser available | Generic REST API |
| Protocol | MQTT (binary payload) | HTTP REST |
For most production IIoT deployments — especially those using cellular gateways on the factory floor — binary encoding is the clear winner.
How machineCDN Handles Binary Telemetry
machineCDN's edge architecture uses binary encoding by default for all PLC telemetry. Tag values from Modbus RTU, Modbus TCP, and EtherNet/IP sources are packed into compact binary frames with temporal grouping, change-based filtering, and event-driven bypass for critical tags.
The platform handles type-aware encoding (bool through float32), automatic batch finalization (size + time triggers), and calculated tag extraction — all running on resource-constrained edge gateways with as little as 32 MB of RAM.
For teams building their own edge telemetry pipeline, the principles in this guide apply regardless of platform. But if you want this solved out of the box, machineCDN's edge agent handles the encoding, buffering, and delivery so your team can focus on what the data means rather than how it moves.
Key Takeaways
- JSON overhead is 10–13x for typical industrial telemetry — switch to binary encoding for cellular/constrained networks
- Temporal grouping amortizes device context across multiple tag values in the same read cycle
- Change-based filtering eliminates 75–85% of redundant transmissions for stable process values
- Event-driven tags bypass batching for immediate delivery of critical status changes
- Calculated tags extract multiple booleans from status words at the edge, reducing PLC bus traffic
- Standardize on big-endian wire format to avoid byte-ordering nightmares across protocols
- Dual finalization triggers (size + time) balance latency against bandwidth efficiency
The factory floor generates data at the edge. How you encode and move that data determines whether your IIoT deployment scales to 10 machines or 10,000.