2024-05-08 10:15:01 +00:00
Azure AI Transcribing
=====================
This project demonstrates and documents my attempts to build a video transcription service
using Azure AI.
## Creating the Azure AI Resource
- I created a plain *Azure AI Services* resource from the Azure Portal
- I chose Sweden Central as the location and Standard S0 as the pricing tier
- I am using an Azure account attached to a Visual Studio license, so I have roughly 150€ of free credits
- Here's how the portal looked afterwards
![Azure AI resource](note-resources/azure-ai-resources-list-after-creation.png)
- All audio files need to be in a specific format; I am using ffmpeg to convert them to 16 kHz, mono, 16-bit PCM WAV
```shell
ffmpeg -i [INPUT].mp3 -acodec pcm_s16le -ac 1 -ar 16000 [OUTPUT].wav
```
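To convert a whole folder in one go, the same command can be wrapped in a loop. This is only a minimal sketch: the helper names `to_wav_name` and `convert_all` are made up for illustration, and it assumes the sources are `.mp3` files and that the converted files should land in a `Data` folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an input file to its .wav name inside a target folder.
to_wav_name() {
  local src="$1" out_dir="${2:-Data}"
  local base
  base="$(basename "$src")"
  # Strip only the last extension, so "clip.v2.mp3" becomes "clip.v2.wav".
  printf '%s/%s.wav\n' "$out_dir" "${base%.*}"
}

# Hypothetical helper: convert every .mp3 in a folder with the flags shown above.
convert_all() {
  local src_dir="${1:-.}" out_dir="${2:-Data}"
  local f
  mkdir -p "$out_dir"
  for f in "$src_dir"/*.mp3; do
    [ -e "$f" ] || continue   # no matches: the glob stays literal, so skip it
    # -n tells ffmpeg never to overwrite an existing output file.
    ffmpeg -n -i "$f" -acodec pcm_s16le -ac 1 -ar 16000 "$(to_wav_name "$f" "$out_dir")"
  done
}
```

After sourcing the snippet, `convert_all ./recordings Data` would convert everything in a (hypothetical) `recordings` folder.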
## Running this app
- First get your Key and Region from the Azure Portal
- Then set the environment variables and run the app
```shell
export SPEECH_KEY=your_key
export SPEECH_REGION=your_region
dotnet run
```
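If either variable is missing, the failure may only show up once the app tries to talk to Azure. A small pre-flight check can catch that earlier. This is a sketch, not part of the app: `check_speech_env` is a hypothetical helper name, and it only verifies that the two variables are non-empty, not that the key is valid.

```shell
# Hypothetical pre-flight check: fail early if either variable is missing or empty.
check_speech_env() {
  local missing=0 name value
  for name in SPEECH_KEY SPEECH_REGION; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could be used as `check_speech_env && dotnet run`.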
## Using this app
- You need to put some .wav files in the `Data` folder
- The app then lists those files and lets you choose which one to transcribe
- The app then displays the transcription
### Experimentation results & thoughts
- I tried transcribing three files of varying lengths
- There seems to be a limit on how long the audio file can be
- It might also be that the silence detection is too strict
- The documentation says as much:
> This example uses the RecognizeOnceAsync operation to transcribe utterances of up to 30 seconds, or until silence is detected.
- The results that I do get are very readable
- Transcribing entire files required a different approach in the code
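
To see in advance which files are likely to be cut off by the 30-second limit quoted above, the durations can be checked with `ffprobe` (which ships with ffmpeg). This is a rough sketch with hypothetical helper names. Recognition can also stop earlier when silence is detected, so duration alone is only an indicator.

```shell
# Hypothetical helper: exit 0 when a duration in seconds exceeds a limit.
# The 30-second default comes from the documentation quote above.
exceeds_limit() {
  awk -v d="$1" -v limit="${2:-30}" 'BEGIN { exit !(d + 0 > limit + 0) }'
}

# Read a file's duration in seconds with ffprobe.
audio_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Hypothetical helper: list the .wav files in a folder that are longer than the limit.
list_long_files() {
  local f
  for f in "${1:-Data}"/*.wav; do
    [ -e "$f" ] || continue   # no matches: the glob stays literal, so skip it
    if exceeds_limit "$(audio_duration "$f")"; then
      echo "$f"
    fi
  done
}
```

Running `list_long_files Data` would print the files that a single-shot recognition call is unlikely to transcribe in full.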