Fixing the SNZB-02DR2 OTA Issue: Improving Home Assistant Support for Telink OTA

Summary

In September 2025, the newly released SNZB-02DR2 worked as expected on SONOFF's own gateway, but OTA issues started to appear on Home Assistant. Reported symptoms included update check failures, upgrades reaching 100% without changing the installed firmware version, and missing firmware push notifications. The issue appeared in both Zigbee2MQTT and ZHA.
SNZB-02DR2 is built on a Telink SoC. After reviewing logs, firmware image structure, and platform behavior, the SONOFF team traced the issue to changes in the OTA encryption format used in the Telink SDK, which introduced a compatibility gap with existing OTA parsing logic. The fixes were pushed in two stages: first on the Zigbee2MQTT side for the Telink 0xf000 vendor-specific encrypted packaging, then on the ZHA / zigpy side for encrypted Telink OTA parsing. Community contributors later extended the work with OTA block size, schema, and provider updates.

Exposure and Symptoms

In September 2025, the newly released SNZB-02DR2 operated normally on SONOFF's own gateway, but OTA-related failures appeared on Home Assistant. The main symptoms were:

  • update check failures
  • OTA transfers timing out or stalling
  • upgrade progress reaching 100% while the firmware version remained unchanged
  • devices showing Firmware: Unknown
  • missing firmware push notifications when an update was expected

These symptoms appeared in both Zigbee2MQTT and ZHA. Based on logs, firmware images, and runtime behavior, the SONOFF team narrowed the issue down to OTA compatibility around the Telink SDK encryption format.

OTA Failure Types

OTA failures of this kind fall into three categories:

The first is wireless transport failure, typically seen as timeouts, stalls, or excessive retransmissions. Common causes include 2.4 GHz interference, weak signal strength, and long device-to-coordinator distance.

The second is OTA image parsing failure. In this case, update checks fail early, or the image transfers but later fails during parsing, slicing, or validation.

The third is version and configuration mismatch. Typical examples include containers still running older images, OTA provider settings not matching the installed version, or incorrect index metadata. The usual symptom is that the environment has been updated, but the fix does not appear to take effect.

Telink OTA Packaging and Parsing Misalignment

The root cause was traced to changes in the OTA encryption format used by the Telink SDK.

A standard Zigbee OTA sub-element can be simplified as:

Tag ID + Length + Payload

In the standard layout, the Length field describes only the size of the Payload that follows. It does not include the Tag ID or the Length field itself.

In Telink OTA encrypted packaging, especially in vendor-specific elements such as 0xf000, the layout contains an extra field:

Tag ID + Length + Tag Info + Payload

The inserted Tag Info shifts every byte that follows it, but a parser on the standard path still treats the bytes immediately after Length as Payload and uses Length to locate the next sub-element. From that point on, its offsets no longer line up with the actual data. That is what causes the offset mismatch.

If a parser still reads the image as standard ZCL OTA, offset calculation, length handling, and data slicing will be wrong. Typical results are:

  • offset or bounds errors during OTA image inspection
  • successful transfer of a corrupted in-memory image
  • rejection by the device during the final validation stage

The issue is therefore a parsing mismatch caused by the updated OTA encryption format in the Telink SDK.
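
The mismatch can be reproduced on synthetic data. The sketch below is an illustration under stated assumptions, not the code from either fix: it uses the 2-byte little-endian Tag ID and 4-byte little-endian Length of the standard Zigbee OTA sub-element, but the 8-byte Tag Info width, the choice that Length counts only the Payload, and the helper names parse_sub_elements and build_sub_element are all hypothetical.

```python
import struct

# Standard sub-element header: Tag ID (uint16) + Length (uint32), little-endian.
SUB_ELEMENT_HEADER = struct.Struct("<HI")


def parse_sub_elements(data, tag_info_sizes=None):
    """Split a buffer into (tag_id, tag_info, payload) tuples.

    tag_info_sizes maps a Tag ID to the number of extra bytes that sit
    between Length and Payload. With no mapping, every element is read
    with the standard layout.
    """
    tag_info_sizes = tag_info_sizes or {}
    elements = []
    offset = 0
    while offset < len(data):
        if offset + SUB_ELEMENT_HEADER.size > len(data):
            raise ValueError(f"truncated header at offset {offset}")
        tag_id, length = SUB_ELEMENT_HEADER.unpack_from(data, offset)
        offset += SUB_ELEMENT_HEADER.size
        info_size = tag_info_sizes.get(tag_id, 0)
        # Assumption: Length counts only the Payload, not the Tag Info.
        end = offset + info_size + length
        if end > len(data):
            raise ValueError(
                f"sub-element 0x{tag_id:04x} ends at {end}, buffer is {len(data)} bytes"
            )
        tag_info = data[offset:offset + info_size]
        payload = data[offset + info_size:end]
        elements.append((tag_id, tag_info, payload))
        offset = end
    return elements


def build_sub_element(tag_id, payload, tag_info=b""):
    """Build one synthetic sub-element for the demonstration."""
    return SUB_ELEMENT_HEADER.pack(tag_id, len(payload)) + tag_info + payload


# Synthetic image: one 0xf000 element with a hypothetical 8-byte Tag Info,
# followed by one ordinary element.
image = build_sub_element(0xF000, b"\xAA" * 16, tag_info=b"\x00" * 8)
image += build_sub_element(0x0000, b"\x01\x02\x03\x04")

for label, sizes in (("standard", None), ("tag-info aware", {0xF000: 8})):
    try:
        tags = [hex(tag) for tag, _, _ in parse_sub_elements(image, sizes)]
        print(label, "parsed tags:", tags)
    except ValueError as exc:
        print(label, "failed:", exc)
```

On this buffer the standard path reads the first element 8 bytes short, lands in the middle of the 0xf000 payload, interprets payload bytes as the next Tag ID and Length, and fails the bounds check. The tag-info aware path recovers both elements.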

Zigbee2MQTT: Diagnosis and Fix

issue #9963

Item: issue #9963
Repository: zigbee-herdsman-converters
Date: 2025-09-09
Role: first public report identifying OTA unpacking failures caused by Telink 0xf000 vendor-specific encrypted packaging

On September 9, 2025, issue #9963 (https://github.com/Koenkk/zigbee-herdsman-converters/issues/9963) was opened in zigbee-herdsman-converters under the title “Some questions about the OTA encryption method”.

The issue already showed the core failure: during OTA update checks, parsing reached tagID:0xf000, then failed with an out-of-range offset error. It also documented the SONOFF team's conclusions:
• the issue was related to OTA encryption packaging in the Telink SDK
• the image structure contained an extra Tag Info field
• the parsing fix could be verified by changing parseSubElement
• OTA completed successfully after the parser adjustment

PR #9984

Item: PR #9984
Repository: zigbee-herdsman-converters
Date: merged on 2025-09-13
Role: added Telink OTA parsing compatibility and fixed the packaging issue exposed in #9963

Based on the analysis in the issue, the SONOFF team submitted PR #9984 (https://github.com/Koenkk/zigbee-herdsman-converters/pull/9984), titled “add ota handle code with telink ota”.

The PR was merged on 2025-09-13, with merge commit 6cb56489bc9e9ea4c2df99f9499d61d9c10671f0.
The fix did not add a device-specific branch for a single product. It added parsing compatibility for Telink OTA packaging. The scope was therefore broader than one model and covered the compatibility impact introduced by the OTA encryption format changes in the Telink SDK.

Release 25.25.0

Item: release 25.25.0
Repository: zigbee-herdsman-converters
Date: 2025-09-13
Role: brought support for Telink encrypted OTAs into the Zigbee2MQTT mainline

Release PR #10005 (https://github.com/Koenkk/zigbee-herdsman-converters/pull/10005) for zigbee-herdsman-converters was merged on 2025-09-13, corresponding to version 25.25.0.

The release notes explicitly included:
• Support Telink encrypted OTAs (#9984)

This made Telink encrypted OTAs part of the Zigbee2MQTT mainline.

ZHA / zigpy Progression

Once the issue was diagnosed and fixed on the Zigbee2MQTT side, it was clear that the problem was not tied to one integration path. It was a broader compatibility gap in how Home Assistant handled Telink OTA.

ZHA relies on zigpy for OTA processing, so the root cause confirmed in Zigbee2MQTT also had to be addressed in the zigpy OTA parser.

PR #1734

Item: PR #1734
Repository: zigpy
Role: pushed Telink OTA parsing support into the ZHA / zigpy side

Based on the Telink OTA structure issue exposed in Zigbee2MQTT, the SONOFF team submitted PR #1734 (https://github.com/zigpy/zigpy/pull/1734) to zigpy. The proposal added dedicated parsing support for Telink OTA and propagated the related encryption flag.

PR #1736

Item: PR #1736
Repository: zigpy
Date: merged on 2026-01-03
Role: introduced an independent parser path for encrypted Telink OTA files

After the discussion around #1734, zigpy adopted a different implementation in PR #1736 (https://github.com/zigpy/zigpy/pull/1736), titled “Parse encrypted Telink OTA files”.

The PR was merged on 2026-01-03, with merge commit 78a0b4da658deef0e7e88ec3bad0c80f20a19ea9.

The main changes in #1736 were:
• direct detection and parsing of encrypted Telink OTA images
• removal of strict dependence on external metadata to identify Telink encrypted images
• native recognition of Telink OTA inside the parser

This allowed the parser to identify these images directly instead of relying only on external conditions.

zigpy 0.90.0 Release

Item: release 0.90.0
Repository: zigpy
Date: 2026-01-03
Role: first formal release with encrypted Telink OTA parsing support

zigpy 0.90.0 was released on 2026-01-03, and its release notes explicitly included:
• Parse encrypted Telink OTA files by @puddly in #1736

Release link: zigpy 0.90.0 (https://github.com/zigpy/zigpy/releases/tag/0.90.0)

0.90.0 is the key release where encrypted Telink OTA parsing became available in ZHA / zigpy.

Community Follow-up Work

0.90.0 solved the core parsing problem for encrypted Telink OTA, but it did not cover all compatibility details. The community continued with follow-up fixes.

OTA Block Size Fix

Item: PR #1781
Repository: zigpy
Date: merged on 2026-03-02
Role: fixed the mismatch between Telink OTA block size requests and the platform cap

In PR #1781 (https://github.com/zigpy/zigpy/pull/1781), community contributors addressed OTA block size compatibility.
The PR, titled “Increase OTA block size cap”, was merged on 2026-03-02. Some Telink devices request a 48-byte OTA block, while the earlier cap was only 40 bytes. Even with correct image parsing, the upgrade can still fail if the requested and delivered block sizes do not match.
This shows that Telink OTA compatibility is not only about image parsing. It also includes transport parameter matching.
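
The arithmetic behind that mismatch is small enough to write down. The sketch below assumes the server simply clamps the requested size to its cap with min(); that is an illustration, not zigpy's actual code. The helper names are hypothetical, and the raised cap of 64 bytes is a placeholder, since the value chosen in #1781 is not quoted here.

```python
def delivered_block_size(requested: int, cap: int) -> int:
    """Bytes per block the server sends if it clamps the request to its cap."""
    if requested <= 0 or cap <= 0:
        raise ValueError("block sizes must be positive")
    return min(requested, cap)


def sizes_match(requested: int, cap: int) -> bool:
    """True when the device gets exactly the block size it asked for."""
    return delivered_block_size(requested, cap) == requested


# A device asking for 48 bytes against the earlier 40-byte cap gets 40.
print(delivered_block_size(48, 40), sizes_match(48, 40))

# With a hypothetical higher cap, the same request is served as asked.
print(delivered_block_size(48, 64), sizes_match(48, 64))
```

Any cap at or above 48 removes the mismatch for these devices; smaller requests are unaffected because the clamp only ever reduces the size.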

OTA Schema and Provider Alignment

Item: PR #1782
Repository: zigpy
Date: merged on 2026-03-04
Role: aligned the zigpy-ota schema with the OTA provider path

PR #1782 (https://github.com/zigpy/zigpy/pull/1782) then upgraded the zigpy-ota schema to v2.

Even when the framework already supports Telink OTA parsing, users can still see unknown firmware, missing push notifications, or no effective change if OTA metadata, provider configuration, and the installed zigpy version are out of sync.

Taken together:
• 0.90.0 solved the core encrypted Telink OTA parsing issue
• later community changes completed block size, schema, and OTA distribution details
• ZHA support for Telink OTA matured in stages rather than in a single patch

Common Reasons the Fix Still Does Not Take Effect

“Supported upstream” does not mean the local environment will immediately behave correctly. The most common last-mile causes are below.

1. The runtime version is not the version the user expects

This is common in Docker deployments. A user may update files on the host or believe the code has already been updated, while the container is still running older package versions. If the runtime zigpy or related package has not changed, the upstream fix will not appear locally.

2. Only part of the stack has been updated

For example, Home Assistant may be upgraded while zigpy is still below the required version, or local OTA files and provider configuration may still follow the old path. In that case, the environment still behaves as if the fix is missing.

3. OTA provider or index metadata is still incorrect

Even when platform versions are correct, the platform may fail to identify or push firmware if manufacturerCode, imageType, fileVersion, sha, or file paths in index.json are inconsistent.
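
One way to catch this kind of inconsistency is to read the identifying fields straight from the firmware file and compare them with the index entry. The sketch below is a minimal example under assumptions: the file starts with a standard Zigbee OTA header (magic 0x0BEEF11E) with no vendor wrapper in front of it, the index entry is a dict that stores manufacturerCode, imageType, and fileVersion as integers, and the function names and sample values are hypothetical. Hash and path checks are left out.

```python
import struct

OTA_UPGRADE_FILE_ID = 0x0BEEF11E

# Start of the standard OTA header, little-endian:
# file identifier (uint32), header version, header length, field control,
# manufacturer code, image type (uint16 each), file version (uint32).
HEADER_PREFIX = struct.Struct("<IHHHHHI")


def read_ota_header(data: bytes) -> dict:
    """Extract the fields a server uses to match an image to a device."""
    if len(data) < HEADER_PREFIX.size:
        raise ValueError("file is too short to contain an OTA header")
    (file_id, _header_version, _header_length, _field_control,
     manufacturer_code, image_type, file_version) = HEADER_PREFIX.unpack_from(data, 0)
    if file_id != OTA_UPGRADE_FILE_ID:
        raise ValueError("file does not start with the Zigbee OTA identifier")
    return {
        "manufacturerCode": manufacturer_code,
        "imageType": image_type,
        "fileVersion": file_version,
    }


def find_mismatches(index_entry: dict, header: dict) -> list:
    """Return the field names where the index entry disagrees with the file."""
    return [key for key, value in header.items() if index_entry.get(key) != value]


# Synthetic header with made-up values, standing in for a real firmware file.
sample = HEADER_PREFIX.pack(OTA_UPGRADE_FILE_ID, 0x0100, 56, 0, 0x1234, 0x0001, 0x1001)
entry = {"manufacturerCode": 0x1234, "imageType": 0x0001, "fileVersion": 0x1000}
print(find_mismatches(entry, read_ota_header(sample)))
```

For a real check, the bytes would come from the firmware file referenced by the index entry, and a non-empty result points at the metadata field to correct.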

4. The device did not send a new OTA query

Server-side readiness does not mean the device will immediately request a new image. If the device does not send Query Next Image Request again, firmware push may still appear to be missing.

5. Wireless transport issues still exist

Even after parsing compatibility is fixed, weak signal, long distance, interference, and excessive retransmissions can still break OTA. These are separate transport problems and do not disappear when the parsing issue is fixed.

Upstream support is a necessary condition for OTA success. Whether the fix actually works in the field depends on the version chain, configuration chain, container state, and wireless conditions.

User-side Troubleshooting

If the SNZB-02DR2 environment has already been upgraded to versions that include the relevant fixes, but OTA is still failing, firmware is still not pushed, upgrades still stall, or the version still does not change after update, the following checks should be done in order:

1. Confirm the actual platform and component versions

• Zigbee2MQTT should bundle zigbee-herdsman-converters 25.25.0 or later

• ZHA should be running zigpy 0.90.0 or later

2. Confirm that the container image or runtime environment has actually been updated

3. Confirm that OTA provider settings, local OTA index data, and firmware metadata are consistent

4. Trigger a new OTA query from the device, for example by power-cycling the device so it sends Query Next Image Request

5. Check signal strength, distance, and 2.4 GHz interference
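
For the first two checks, the zigpy version that is actually loaded can be read from inside the running Python environment rather than inferred from host files. The sketch below is one possible approach using only the standard library. The helper names are hypothetical, and the version comparison is deliberately simple: it compares numeric components only and ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple:
    """Turn '0.90.0' into (0, 90, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted versions, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed(package: str, minimum: str) -> str:
    """Report whether the package installed in this environment is new enough."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package}: not installed in this environment"
    status = "OK" if meets_minimum(installed, minimum) else "too old"
    return f"{package} {installed} (needs >= {minimum}): {status}"


if __name__ == "__main__":
    print(check_installed("zigpy", "0.90.0"))
```

The script only reports on the interpreter that runs it, so it needs to be executed inside the Home Assistant container or virtual environment. meets_minimum can also compare a zigbee-herdsman-converters version string obtained separately against 25.25.0.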

Conclusion

The OTA fix around SNZB-02DR2 started with failures observed in real deployments. The root cause was then narrowed to a compatibility gap between OTA encryption format changes in the Telink SDK and existing parser behavior.

From Zigbee2MQTT #9963 -> #9984 -> 25.25.0, to ZHA / zigpy #1734 -> #1736 -> 0.90.0, and then to later community work on OTA block size, schema, and provider handling, the result is broader and more stable Telink OTA support across Home Assistant.

For device vendors, compatibility support is not only about providing a temporary workaround. It also requires pushing the issue into the upstream ecosystem so the fix becomes part of the platform itself.
