Detect silence before transcribing | Voters

Thomas Reintjes

Only transcribe parts of audio files that contain a meaningful signal. This comes in handy especially with the new multitrack feature, which otherwise is very expensive to use when you're charged for transcribing all the silence...

March 14, 2018

Dan Reyes-Cairo

marked this post as open

Hi all, we appreciate your continued feedback and support for this issue. While we'd love to get to this sooner rather than later, we haven't yet been able to incorporate it into our roadmap as originally intended 3 years ago, so we're moving this out of review for the time being, without an ETA for arrival.

No further details to share at this time, but do continue to let us know your thoughts and upvote if this is something you'd like to see incorporated in the future.

Sat P

Dan Reyes-Cairo: This is very disappointing. It's one of the major issues that everyone faces who ends up with two tracks for each podcast. Descript could easily implement a compromise solution. Mark both tracks as one if they both have the same length. For example, I always clean and rough edit both in Audition first, then I upload them. It would be pretty obvious that if both tracks are 43m:02s, for example, that they are for the same episode. Just have a rule that if one track is even 01s longer, the system rejects it. You don't need any highly paid AI engineers or an army of coders to implement something like this.

Thomas Latter

Sat P: Not quite an elegant solution, I'm afraid. Descript would need some code to detect meaningful signal, so that you're not just trying to cheat the transcription limits by uploading multiple files as part of a sequence. And once you have code to detect meaningful silence, you might as well use that for the calculation rather than rely on track length.
Having said that, I can't imagine the code for calculating how much silence there is in a track is hard to write, since this is a trivial task in any DAW.
My current solution is to mix and master my tracks separately before importing them into Descript as a single track. It just means I essentially ignore their 'sequence' feature. But it's a workaround for now.
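
To illustrate the point about measuring silence, here is a minimal sketch of one common approach: split the track into short frames and count only the frames whose RMS level exceeds a threshold. This is not Descript's method. The function name, the 20 ms frame size, and the 0.01 threshold are all assumptions for illustration, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math

def non_silent_seconds(samples, sample_rate, threshold=0.01, frame_ms=20):
    """Return the total duration (seconds) of frames whose RMS exceeds threshold.

    Hypothetical helper: threshold and frame_ms are illustrative defaults,
    not values taken from any real product.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    active_samples = 0
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        # Root-mean-square level of this frame
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms > threshold:
            active_samples += len(frame)
    return active_samples / sample_rate

# Example: 1 s of digital silence followed by 1 s of a 50 Hz tone at 1 kHz sampling
sample_rate = 1000
silence = [0.0] * sample_rate
tone = [0.5 * math.sin(2 * math.pi * 50 * n / sample_rate) for n in range(sample_rate)]
billable = non_silent_seconds(silence + tone, sample_rate)
```

In this example only the second half of the track counts as signal. Real recordings have background noise, so a practical threshold would need tuning, and very short pauses between words would probably be merged into the surrounding speech rather than dropped.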

George Mallone

Dan Reyes-Cairo: i sometimes make screencasts where i go over some educational material and there are long pauses or gaps. i could stop and start the recording repeatedly and stitch together multiple recordings but that's annoying. OTOH it's also annoying to burn up transcription hours on silence. having some sort of remove silence rough cut tool would be nice. i know the Reaper audio-editing application has a feature that lets you automatically remove silences from a video (i've used that before) but i'm trying to keep the workflow simple. this one feature would take my experience of the Descript app from "decent" to "GREAT, STUPENDOUS AND AMAZING VALUE"

Stacey Axelrod

New Descript user here and this is my #1 issue. I record interviews on Zencastr which outputs two separate files. I then sync them back together as a sequence in Descript. But I was very surprised to find out that the gaps of silence in both tracks used up my transcription time. With a big backlog of interviews to transcribe, I have unfortunately reached my 30 hour transcription limit much quicker than I expected. Once I figured out what was happening, I tried to use the "Shorten Word Gaps" feature before transcribing, but it does not work on non-transcribed files.

Sat P

Absolutely agree! I reluctantly upload both podcast audio tracks as one, because the system deducts 2 x 60 mins even though the audio is for the same podcast episode.

Jeremy Au

I’m doing an interview-style podcast. Let’s say the show runs for 60 min. I record it remotely with Squadcast, which gives me 2 WAV tracks: 1 for the interviewer and 1 for the interviewee. When I put them into Descript, they are automatically recognized as a sequence where we take turns speaking.
However, when I want to use the white glove transcript, it charges me 60 min × $2/min × 2 tracks = $240.
This is twice what it needs to be, as only one speaker is ever speaking at a time.
This is probably an error: the transcription seems to be charged from each audio file and not from the Sequence.
Otherwise I, as the interviewer who only speaks for 5 min in the whole show, am going to be overcharged by over $100, which means it doesn’t make sense for me to use the feature.
Could you fix this by charging it as $2/min for the sequence length, rather than the combined length of the two tracks?
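
The arithmetic in this comment can be sketched as follows, using the figures quoted above (60 minutes, $2/min, 2 tracks). The function name is hypothetical, and the sequence length is assumed to equal the longest track, which holds only if the tracks are aligned at the start.

```python
def transcription_cost(minutes_billed, rate_per_minute):
    """Cost in dollars for a given number of billed minutes (hypothetical helper)."""
    return minutes_billed * rate_per_minute

track_minutes = [60, 60]   # interviewer track, interviewee track
rate = 2.0                 # dollars per minute, as quoted in the comment

# Billing each track separately: every minute of every file is charged
per_track_cost = transcription_cost(sum(track_minutes), rate)

# Billing the sequence: assumes tracks start together, so length = longest track
per_sequence_cost = transcription_cost(max(track_minutes), rate)

overcharge = per_track_cost - per_sequence_cost
```

With these numbers, per-track billing comes to $240 and per-sequence billing to $120, a difference of $120.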

Jeremy Au

The other alternative is what others have recommended: only transcribe meaningful signal, which should effectively cut the cost as described above.

Ben Gregson

Jeremy Au: this may have changed since you commented, but I believe they now charge $2.50 an hour, not per minute, if you want to go over the credits in your subscription.

Mitch Hollis

Is this still under consideration? This would make a huge difference for us and would mean we could actually use your tool with all of our podcasts rather than just a few episodes!

Chad McAllister

But please leave time codes based on the original audio, taking into account leading silence or silence in the audio.
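
One way this request could be met, sketched under assumptions: if only the non-silent segments are transcribed back to back, each timestamp in the trimmed audio can be mapped to the original timeline by walking the list of kept segments. The function name and the segment format (start and end times in seconds, in original-audio time) are hypothetical.

```python
def to_original_time(trimmed_t, kept_segments):
    """Map a time in the silence-trimmed audio back to the original audio.

    kept_segments: ordered list of (start, end) pairs, in seconds of original
    audio, that were kept and concatenated. Hypothetical format for illustration.
    """
    offset = 0.0  # trimmed-audio time at which the current segment begins
    for start, end in kept_segments:
        length = end - start
        if trimmed_t <= offset + length:
            return start + (trimmed_t - offset)
        offset += length
    raise ValueError("timestamp is beyond the end of the trimmed audio")

# Example: 5 s of leading silence, speech from 5-10 s, silence, speech from 20-30 s
kept = [(5.0, 10.0), (20.0, 30.0)]
first_word = to_original_time(2.0, kept)    # falls in the first kept segment
later_word = to_original_time(7.0, kept)    # falls in the second kept segment
```

Here a word at 2.0 s in the trimmed audio maps to 7.0 s in the original, and one at 7.0 s maps to 22.0 s, so leading and mid-track silence is accounted for.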

Andrew Mason

marked this post as under consideration

Jack Saturn

Andrew Mason: Very interested in this as well. This is a phenomenal service, but while testing I was disappointed to discover that 75 minutes of a 2-track conversation used up 2.5 hours of my available time.

Andrew Mason

Yes you are right. We will do this soon.

Thomas Reintjes

Andrew Mason: 👍👍

Jason Lustig

Andrew Mason: So when we use multi track transcribing, let’s say it’s 60 minutes x 2 tracks. Are we charged for 120 minutes?

Jeremy Au

Jason Lustig: exactly! I wrote my experience above as well

AY-AY-Ron

Andrew Mason: Has this been addressed yet? Any progress you can share? :-)

Ben Gregson

Andrew Mason: hi, it's been nearly three years, any progress on this?