There is a transcription / text alignment bug. The transcription converts spoken numbers (e.g. "two hundred and seventy five thousand dollars") into formatted symbols ("$275,000"), which breaks word-level timing alignment: seven spoken words collapse into a single token, so the text no longer matches the audio word for word.
I want the transcript to reflect the spoken words for alignment purposes, but the captions to display the formatted version. Currently there's no way to do both.
I've already tried re-transcribing and re-aligning, without success.
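
To make the request concrete, here is a minimal sketch of one way to keep both forms: align against the verbatim words, then attach the formatted text as a display layer that borrows its timing from the words it covers. Everything in it is an assumption for illustration, not the API of any particular tool: the names `SpokenWord`, `CaptionToken` and `build_caption_tokens` are hypothetical, and so is the `(first_index, last_index, display_text)` span format that a number-formatting pass would need to produce.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SpokenWord:
    """One verbatim word with the timing produced by alignment."""
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class CaptionToken:
    """What the caption shows, plus the spoken words it stands for."""
    display: str
    start: float
    end: float
    spoken: List[str]


def build_caption_tokens(
    words: List[SpokenWord],
    spans: List[Tuple[int, int, str]],
) -> List[CaptionToken]:
    """Merge spoken words into caption tokens.

    Each span is (first_index, last_index_inclusive, display_text) and is
    assumed to come from a separate formatting pass. Words outside any
    span are shown as spoken.
    """
    by_first = {first: (last, display) for first, last, display in spans}
    tokens: List[CaptionToken] = []
    i = 0
    while i < len(words):
        if i in by_first:
            last, display = by_first[i]
            group = words[i:last + 1]
            # The formatted token starts with the first covered word
            # and ends with the last one.
            tokens.append(CaptionToken(display, group[0].start, group[-1].end,
                                       [w.text for w in group]))
            i = last + 1
        else:
            w = words[i]
            tokens.append(CaptionToken(w.text, w.start, w.end, [w.text]))
            i += 1
    return tokens


# Example with made-up timings: each word lasts 0.4 s and starts every 0.5 s.
text = "it sold for two hundred and seventy five thousand dollars"
words = [SpokenWord(t, i * 0.5, i * 0.5 + 0.4) for i, t in enumerate(text.split())]
tokens = build_caption_tokens(words, [(3, 9, "$275,000")])
print([t.display for t in tokens])
```

With this layout the transcript used for alignment stays verbatim, while the caption renderer reads `display` and takes its start and end times from the first and last spoken word of each span.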