
Azure AI Transcription
======================

This project demonstrates and documents my attempts to build a video transcription service
using Azure AI.

## Creating the Azure AI Resource
- I created a plain *Azure AI Services* resource from the Azure Portal
- I chose Sweden Central as the location and Standard S0 as the pricing tier
- I am using an Azure account attached to a Visual Studio license, so I have roughly 150€ of free credits
- Here's how the portal looked afterwards:

![Azure AI resource](note-resources/azure-ai-resources-list-after-creation.png)

- All audio files need to be in a specific format (16 kHz, mono, 16-bit PCM WAV); I am using ffmpeg to convert them

```shell
ffmpeg -i [INPUT].mp3 -acodec pcm_s16le -ac 1 -ar 16000 [OUTPUT].wav
```
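
To convert several files in one go, the same command can be wrapped in a loop. This is a minimal sketch, not part of the project: the `convert_all` function name, the `FFMPEG` override variable, and the source directory are assumptions made for illustration. Only the ffmpeg flags come from the command above.

```shell
# Convert every .mp3 in a source directory to 16 kHz, mono, 16-bit PCM .wav files.
# Usage: convert_all <source_dir> <target_dir>   e.g. convert_all ./input ./Data
convert_all() {
    src_dir="$1"
    out_dir="$2"
    mkdir -p "$out_dir"
    for f in "$src_dir"/*.mp3; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        name="$(basename "$f" .mp3)"
        # FFMPEG is a hypothetical override (e.g. FFMPEG=echo for a dry run)
        "${FFMPEG:-ffmpeg}" -nostdin -i "$f" \
            -acodec pcm_s16le -ac 1 -ar 16000 "$out_dir/$name.wav"
    done
}
```

Running it with `FFMPEG=echo` prints the commands instead of executing them, which is handy for checking the paths first.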

## Running this app

- First, get your Key and Region from the Azure Portal
- Then set them as environment variables and run the app

```shell
    export SPEECH_KEY=your_key
    export SPEECH_REGION=your_region

    dotnet run
```
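
If either variable is missing, the app has nothing to authenticate with, so it can help to check them before starting. The helper below is a hypothetical sketch (the `require_env` name is my own and not part of the project); it only checks that the variables are non-empty, not that the key is valid.

```shell
# Fail early with a readable message if a required variable is unset or empty.
require_env() {
    for name in "$@"; do
        eval "value=\${$name:-}"   # indirect lookup of the variable called $name
        if [ -z "$value" ]; then
            echo "Missing environment variable: $name" >&2
            return 1
        fi
    done
}

# Usage: require_env SPEECH_KEY SPEECH_REGION && dotnet run
```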

## Using this app
- Put some `.wav` files in the `Data` folder
- The app lists those files and lets you choose which one to transcribe
- The app then displays the transcription
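
Before running the app, it may be worth confirming that the files in `Data` really are in the expected format. The sketch below uses `ffprobe` (shipped with ffmpeg) to read the first audio stream's codec, sample rate, and channel count. The `check_wav` function and the `FFPROBE` override variable are assumptions for illustration, not part of the project.

```shell
# Report whether a .wav file is 16 kHz, mono, 16-bit PCM.
# FFPROBE is a hypothetical override; it defaults to ffprobe.
check_wav() {
    info="$("${FFPROBE:-ffprobe}" -v error -select_streams a:0 \
        -show_entries stream=codec_name,sample_rate,channels \
        -of default=noprint_wrappers=1 "$1")" || return 2
    # ffprobe prints one key=value pair per line; each must match exactly
    for expected in codec_name=pcm_s16le sample_rate=16000 channels=1; do
        if ! printf '%s\n' "$info" | grep -qx "$expected"; then
            echo "$1: expected $expected" >&2
            return 1
        fi
    done
    echo "$1: ok"
}

# Usage: for f in Data/*.wav; do check_wav "$f"; done
```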

### Experimentation results & thoughts
- I tried transcribing three files of varying lengths
- There seems to be a limit on how long the audio file can be
- It might also be that the silence detection is too strict
  - The documentation says as much:
    > This example uses the RecognizeOnceAsync operation to transcribe utterances of up to 30 seconds, or until silence is detected.
- The results that I do get are very readable
- Transcribing entire files needed a different code solution