-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Willow Application Server #211
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
stintel
force-pushed
the
feature/was
branch
13 times, most recently
from
June 30, 2023 22:49
410a269
to
0fdc19c
Compare
stintel
force-pushed
the
feature/was
branch
5 times, most recently
from
July 14, 2023 13:08
f511bcf
to
2b6a9da
Compare
stintel
force-pushed
the
feature/was
branch
8 times, most recently
from
July 19, 2023 10:26
fc05edf
to
ff7ec43
Compare
kristiankielhofner
force-pushed
the
feature/was
branch
from
July 21, 2023 06:33
527150e
to
c7ea431
Compare
stintel
force-pushed
the
feature/was
branch
3 times, most recently
from
July 23, 2023 19:24
d7101f2
to
aa4ee65
Compare
And have it disable the power amplifier for events PAUSED, STOPPED, FINISHED and ERROR.
Fixes: dc3f443 ("main: add cancel button")
We're modifying the recording variable in one task, and we read it in another. Make the variable volatile, to be sure we don't read an optimized or cached value.
Don't send STOP on WAKE when we're already recording, just avoid sending START again, and let the STOP on WAKE END trigger everything. Don't stop streaming to WIS when reading from the recorder returns 0. This should avoid early cut-off when something can't keep up for some reason.
During a torture run with WIS over HTTPS we've seen the following crash: ***ERROR*** A stack overflow in task http_stream_writer has been detected. Increase the task stack size to avoid this.
During a torture run with heap poisoning enabled, the following crash was observed: Guru Meditation Error: Core 1 panic'ed (StoreProhibited). Exception was unhandled. Core 1 register dump: PC : 0x4038abca PS : 0x00060033 A0 : 0x8038b847 A1 : 0x3fcc9150 A2 : 0x3d800014 A3 : 0x0000000c A4 : 0x3d800038 A5 : 0x3d800274 A6 : 0x00060023 A7 : 0x00000000 A8 : 0x3d800014 A9 : 0x00000394 A10 : 0x3d9a28f4 A11 : 0x00000038 A12 : 0xabba1234 A13 : 0x00000003 A14 : 0x0000001e A15 : 0x3d80007c SAR : 0x0000001f EXCCAUSE: 0x0000001d EXCVADDR: 0xabba1240 LBEG : 0x400570e8 LEND : 0x400570f3 LCOUNT : 0x00000000 /opt/esp/idf/components/heap/heap_tlsf.c:206 /opt/esp/idf/components/heap/multi_heap.c:187 /opt/esp/idf/components/heap/multi_heap_poisoning.c:234 /opt/esp/idf/components/heap/multi_heap_poisoning.c:229 /opt/esp/idf/components/heap/heap_caps.c:175 /opt/esp/idf/components/heap/heap_caps.c:239 /opt/esp/idf/components/newlib/heap.c:24 /opt/esp/idf/components/json/cJSON/cJSON.c:243 /opt/esp/idf/components/json/cJSON/cJSON.c:1633 /opt/esp/idf/components/json/cJSON/cJSON.c:1674 /opt/esp/idf/components/json/cJSON/cJSON.c:1674 /opt/esp/idf/components/json/cJSON/cJSON.c:1116 /opt/esp/idf/components/json/cJSON/cJSON.c:1087 /opt/esp/idf/components/json/cJSON/cJSON.c:1173 /willow/main/endpoint/hass.c:58 /opt/esp/idf/components/esp_event/esp_event.c:145 /opt/esp/idf/components/esp_websocket_client/esp_websocket_client.c:155 /opt/esp/idf/components/esp_websocket_client/esp_websocket_client.c:555 Line 59 in hass.c is cJSON_Parse(resp). Let's add a few extra checks to make sure we do not call cJSON_Parse on garbage data.
We're regularly observing LVGL using 100% CPU with scrolling text. This seems to cause all kinds of random behaviour. Disable it for now.
We've observed the following crash: Guru Meditation Error: Core 0 panic'ed (Double exception). Core 0 register dump: PC : 0x4038424a PS : 0x00040936 A0 : 0x820e7897 A1 : 0x3d9a74b0 A2 : 0x00000000 A3 : 0x3d9a74f0 A4 : 0x00000000 A5 : 0x00000000 A6 : 0x00000000 A7 : 0x00000000 A8 : 0x820ed3f3 A9 : 0x3d9a7490 A10 : 0x00000003 A11 : 0x3d99ecd4 A12 : 0x00000000 A13 : 0x00000000 A14 : 0x3d9a74f0 A15 : 0x00000000 SAR : 0x00000019 EXCCAUSE: 0x00000002 EXCVADDR: 0xfffffff0 LBEG : 0x42148bf1 LEND : 0x42148bfc LCOUNT : 0x00000000 Backtrace: 0x40384247:0x3d9a74b0 0x420e7894:0x00000000 |<-CORRUPTED /opt/esp/tools/xtensa-esp32s3-elf/esp-2021r2-patch5-8.4.0/xtensa-esp32s3-elf/bin/xtensa-esp32s3-elf-addr2line -e build/willow.elf 0x40384247:0x3d9a74b0 0x420e7894:0x00000000 /opt/esp/idf/components/freertos/port/xtensa/xtensa_context.S:194 /opt/esp/idf/components/mbedtls/mbedtls/library/md.c:601 On this line, mbedtls_sha512_finish_ret is called. Let's disable SHA hardware acceleration, as we're suspecting this causes issues.
We've experienced the following crash during boot after disabling SHA hardware acceleration: ***ERROR*** A stack overflow in task websocket_task has been detected. Increase the task stack size to avoid this.
Found with clang-tidy: /willow/main/endpoint/openhab.c:70:9: warning: Null pointer passed as 1st argument to string length function [clang-analyzer-unix.cstring.NullArg]
Found with clang-tidy.
Found with clang-tidy: /willow/main/was.c:104:21: warning: 1st function call argument is an uninitialized value [clang-analyzer-core.CallAndMessage]
And define it to the same value used in sdkconfig.willow if it doesn't. Fixes the following error running clang-check: /willow/main/audio.c:655:42: error: use of undeclared identifier 'CONFIG_TASK_WDT_PANIC' [clang-diagnostic-error]
Fixes the following clang-tidy warnings: /willow/main/endpoint/hass.c:278:5: warning: Value stored to 'ret' is never read [clang-analyzer-deadcode.DeadStores] /willow/main/endpoint/hass.c:278:5: note: Value stored to 'ret' is never read /willow/main/endpoint/hass.c:319:5: warning: Value stored to 'ret' is never read [clang-analyzer-deadcode.DeadStores] /willow/main/endpoint/hass.c:319:5: note: Value stored to 'ret' is never read We print an error message in hass_set_http_auth so we don't need to print an error message in the callers. We might consider returning early in the caller when hass_set_http_auth returns false, or changing the function to void if we keep the current behaviour. Not sure which one would be best, so let's just fix the warnings for now.
The audio_board_lcd_init returns a void pointer so we need to cast it to esp_lcd_panel_handle_t.
Saying wake word when VAD_END/WAKEUP_END hasn't been trigger yet causes multiple session timers to start. Let's do like we do for VAD_START and do nothing when we're already recording.
Terminating the audio pipeline will destroy the tasks for the elements in the pipeline, so the tasks have to be recreated every time we stream to WIS. This isn't very efficient, and it doesn't appear to be needed. We also suspect this could leak memory, and sometimes causes this error: E (16:45:01.400) AUDIO_PIPELINE: audio_pipeline_resume failed After this error, the device is no longer able to stream to WIS.
The ESP_AUDIO_PREFER_MEM mode seems to destroy and recreate its element tasks on every run. This is not very efficient, and as we have plenty of free memory, we can perfectly use the ESP_AUDIO_PREFER_SPEED mode instead. TTS log of 2nd wake before this change: I (02:13:12.437) WILLOW/AUDIO: Using WIS TTS URL 'https://wis.local:19000/api/tts?format=WAV&speaker=CLB&text=0' I (02:13:12.450) AUDIO_PIPELINE: audio_pipeline_unlinked W (02:13:12.454) AUDIO_PIPELINE: There are no listener registered I (02:13:12.461) AUDIO_PIPELINE: link el->rb, el:0x3d9f7c78, tag:IN_http, rb:0x3db20950 I (02:13:12.470) AUDIO_PIPELINE: link el->rb, el:0x3d9f8484, tag:DEC_auto, rb:0x3db20988 I (02:13:12.478) AUDIO_PIPELINE: link el->rb, el:0x3db1968c, tag:Audio_forge, rb:0x3db132cc I (02:13:12.488) AUDIO_ELEMENT: [IN_http-0x3d9f7c78] Element task created I (02:13:12.495) AUDIO_ELEMENT: [IN_http] AEL_MSG_CMD_RESUME,state:1 I (02:13:12.503) AUDIO_ELEMENT: [DEC_auto-0x3d9f8484] Element task created I (02:13:12.509) AUDIO_ELEMENT: [DEC_auto] AEL_MSG_CMD_RESUME,state:1 I (02:13:13.285) AUDIO_ELEMENT: [Audio_forge-0x3db1968c] Element task created I (02:13:13.285) AUDIO_ELEMENT: [Audio_forge] AEL_MSG_CMD_RESUME,state:1 I (02:13:13.289) AUDIO_ELEMENT: [OUT_iis-0x3da17d48] Element task created I (02:13:13.297) AUDIO_ELEMENT: [OUT_iis] AEL_MSG_CMD_RESUME,state:1 I (02:13:13.308) AUDIO_ELEMENT: IN-[IN_http] AEL_IO_DONE,0 I (02:13:13.310) AUDIO_ELEMENT: IN-[DEC_auto] AEL_IO_DONE,-2 I (02:13:13.785) AUDIO_ELEMENT: IN-[Audio_forge] AEL_IO_DONE,-2 I (02:13:13.933) AUDIO_ELEMENT: IN-[OUT_iis] AEL_IO_DONE,-2 I (02:13:14.340) AUDIO_ELEMENT: [http_stream_writer] AEL_MSG_CMD_PAUSE TTS log of 2nd wake after this change: I (02:18:19.707) WILLOW/AUDIO: Using WIS TTS URL 'https://wis.local:19000/api/tts?format=WAV&speaker=CLB&text=0' I (02:18:19.718) AUDIO_ELEMENT: [IN_http] AEL_MSG_CMD_RESUME,state:1 I (02:18:19.725) AUDIO_ELEMENT: [DEC_auto] AEL_MSG_CMD_RESUME,state:1 I (02:18:20.513) AUDIO_ELEMENT: [Audio_forge] AEL_MSG_CMD_RESUME,state:1 I (02:18:20.514) AUDIO_ELEMENT: [OUT_iis] AEL_MSG_CMD_RESUME,state:1 I (02:18:20.525) AUDIO_ELEMENT: IN-[IN_http] AEL_IO_DONE,0 I (02:18:20.526) AUDIO_ELEMENT: IN-[DEC_auto] AEL_IO_DONE,-2 I (02:18:20.996) AUDIO_ELEMENT: IN-[Audio_forge] AEL_IO_DONE,-2 I (02:18:21.146) AUDIO_ELEMENT: IN-[OUT_iis] AEL_IO_DONE,-2 I (02:18:21.616) AUDIO_ELEMENT: [http_stream_writer] AEL_MSG_CMD_PAUSE This also appears to work around the following observed behavior with task heap debugging using esp_dump_per_task_heap_info: Task: main -> CAP_8BIT: 1392004 CAP_32BIT: 0 Task: esp_periph -> CAP_8BIT: 25296 CAP_32BIT: 0 Task: wifi -> CAP_8BIT: 105068 CAP_32BIT: 0 Task: Pre-Scheduler allocs -> CAP_8BIT: 4068 CAP_32BIT: 0 Task: ipc0 -> CAP_8BIT: 30028 CAP_32BIT: 0 Task: tiT -> CAP_8BIT: 2908 CAP_32BIT: 0 Task: sys_evt -> CAP_8BIT: 100 CAP_32BIT: 0 Task: websocket_task -> CAP_8BIT: 724 CAP_32BIT: 0 Task: websocket_task -> CAP_8BIT: 25708 CAP_32BIT: 0 Task: i2s_stream_read -> CAP_8BIT: 2048 CAP_32BIT: 0 Task: at_read -> CAP_8BIT: 10584 CAP_32BIT: 0 Task: feed_task -> CAP_8BIT: 8192 CAP_32BIT: 0 Task: ipc1 -> CAP_8BIT: 24 CAP_32BIT: 0 Task: btm_rrm_t -> CAP_8BIT: 100 CAP_32BIT: 0 Task: esp_timer -> CAP_8BIT: 208 CAP_32BIT: 0 Task: media_task -> CAP_8BIT: 41456 CAP_32BIT: 0 Task: recorder_task -> CAP_8BIT: 24 CAP_32BIT: 0 Task: http_stream_wri -> CAP_8BIT: 4904 CAP_32BIT: 0 Task: �������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������� Tasks like the latter would appear randomly, sometimes with a short sequence of binary data for task name, but regularly with multiple lines (100+) full of it, eventually slowing things down so badly that the TWDT triggers a restart.
We're still able to lock up the display. When this happens, the audio being sent to WIS is often incomplete. Enable SPIRAM_RODATA and SPIRAM_FETCH_INSTRUCTIONS to hopefully finally fix this.
We only stop the audio pipeline to stream to WIS when the session timer expires. VAD_END and WAKE_END will potentially trigger afterwards. If this happens when a new stream to WIS was already started, all kinds of weirdness can occur. Force stop the recorder when the session timer expires to avoid this.
We were still able to lock up the display, with both our custom font and montserat. Enable LCD_RGB_ISR_IRAM_SAFE to test if this helps.
During several torture runs, the following error appeared sporadically: E (23:45:27.704) esp-tls-mbedtls: read error :-0x7180: E (23:45:27.705) TRANSPORT_BASE: esp_tls_conn_read error, errno=Success Disabling AES hardware acceleration seems to fix those.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.