diff --git a/AzureAi.Transcriber/README.md b/AzureAi.Transcriber/README.md
new file mode 100644
index 0000000..e62725e
--- /dev/null
+++ b/AzureAi.Transcriber/README.md
@@ -0,0 +1,46 @@
+Azure Ai Transcribing
+=====================
+
+This project is meant to demonstrate and document my attempts to create a video transcribe service
+using Azure AI.
+
+## Creating Azure Ai Resource
+- I created just plain *Azure AI Services* resource from Azure Portal
+- I chose Sweden Central as location and standard S0 as pricing tier
+- I am using Visual Studio license attached Azure account so I have roughly 150€ of free credits
+- here's how the portal looked after
+
+![Azure AI resource](note-resources/azure-ai-resources-list-after-creation.png)
+
+- all audio files need to be in specific format, I am using ffmpeg to convert
+
+```shell
+ffmpeg -i [INPUT].mp3 -acodec pcm_s16le -ac 1 -ar 16000 [OUTPUT].wav
+```
+
+## Running this app
+
+- first get your Key and Region data from Azure Portal
+- then set environment variables
+
+```shell
+ export SPEECH_KEY=your_key
+ export SPEECH_REGION=your_region
+
+ dotnet run
+```
+
+## Using this app
+- You need to fill out the `Data` folder with some .wav files
+- The app will then display those files and you can choose which to transcribe
+- The app will then display the transcription
+
+### Experimentation results & thoughts
+- I used three different files of varying lengths to try to transcribe each
+- It seems there is a limit as to how long the audio file can be
+- It might be that the silence detection is too strict
+  - Yeah documentation says as much
+> This example uses the RecognizeOnceAsync operation to transcribe utterances of up to 30 seconds, or until silence is detected.
+- The results that I do get are very readable
+- 
+
diff --git a/AzureAi.Transcriber/wwwroot/app.css b/AzureAi.Transcriber/wwwroot/app.css
index 7ac30e2..659573e 100644
--- a/AzureAi.Transcriber/wwwroot/app.css
+++ b/AzureAi.Transcriber/wwwroot/app.css
@@ -1,41 +1,11 @@
+*, *:before, *:after {
+    box-sizing: border-box;
+}
+
 html, body {
     font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif;
 }
 
-a, .btn-link {
-    color: #006bb7;
-}
-
-.btn-primary {
-    color: #fff;
-    background-color: #1b6ec2;
-    border-color: #1861ac;
-}
-
-.btn:focus, .btn:active:focus, .btn-link.nav-link:focus, .form-control:focus, .form-check-input:focus {
-    box-shadow: 0 0 0 0.1rem white, 0 0 0 0.25rem #258cfb;
-}
-
-.content {
-    padding-top: 1.1rem;
-}
-
-h1:focus {
-    outline: none;
-}
-
-.valid.modified:not([type=checkbox]) {
-    outline: 1px solid #26b050;
-}
-
-.invalid {
-    outline: 1px solid #e50000;
-}
-
-.validation-message {
-    color: #e50000;
-}
-
 .blazor-error-boundary {
     background: url() no-repeat 1rem/1.8rem, #b32121;
     padding: 1rem 1rem 1rem 3.7rem;