<!DOCTYPE html> <html class="writer-html5" lang="en" data-content_root="../"> <head> <meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> <title>App: Local Whisper Speech-To-Text (stt_whisper2) — Nextcloud latest Administration Manual latest documentation</title> <link rel="stylesheet" type="text/css" href="../_static/pygments.css?v=fa44fd50" /> <link rel="stylesheet" type="text/css" href="../_static/css/theme.css?v=19f00094" /> <link rel="stylesheet" type="text/css" href="../_static/copybutton.css?v=76b2166b" /> <link rel="stylesheet" type="text/css" href="../_static/dark_mode_css/general.css?v=c0a7eb24" /> <link rel="stylesheet" type="text/css" href="../_static/dark_mode_css/dark.css?v=70edf1c7" /> <link rel="stylesheet" href="../_static/custom.css" type="text/css" /> <!--[if lt IE 9]> <script src="../_static/js/html5shiv.min.js"></script> <![endif]--> <script src="../_static/jquery.js?v=5d32c60e"></script> <script src="../_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script> <script src="../_static/documentation_options.js?v=c6e86fd7"></script> <script src="../_static/doctools.js?v=888ff710"></script> <script src="../_static/sphinx_highlight.js?v=dc90522c"></script> <script src="../_static/clipboard.min.js?v=a7894cd8"></script> <script src="../_static/copybutton.js?v=f281be69"></script> <script src="../_static/dark_mode_js/default_light.js?v=c2e647ce"></script> <script src="../_static/dark_mode_js/theme_switcher.js?v=358d3910"></script> <script src="../_static/js/theme.js"></script> <link rel="index" title="Index" href="../genindex.html" /> <link rel="search" title="Search" href="../search.html" /> <link rel="next" title="App: Recognize" href="app_recognize.html" /> <link rel="prev" title="App: Local large language model (llm2)" href="app_llm2.html" /> </head> <body class="wy-body-for-nav"> <div class="wy-grid-for-nav"> <nav 
data-toggle="wy-nav-shift" class="wy-nav-side"> <div class="wy-side-scroll"> <div class="wy-side-nav-search" > <a href="../contents.html"> <img src="../_static/logo-white.png" class="logo" alt="Logo"/> </a> <div role="search"> <form id="rtd-search-form" class="wy-form" action="../search.html" method="get"> <input type="text" name="q" placeholder="Search docs" aria-label="Search docs" /> <input type="hidden" name="check_keywords" value="yes" /> <input type="hidden" name="area" value="default" /> </form> </div> </div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu"> <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../index.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="../release_notes/index.html">Release notes</a></li> <li class="toctree-l1"><a class="reference internal" href="../release_schedule.html">Maintenance and release schedule</a></li> <li class="toctree-l1"><a class="reference internal" href="../installation/index.html">Installation and server configuration</a></li> <li class="toctree-l1"><a class="reference internal" href="../configuration_server/index.html">Nextcloud configuration</a></li> <li class="toctree-l1"><a class="reference internal" href="../apps_management.html">Apps management</a></li> <li class="toctree-l1"><a class="reference internal" href="../configuration_user/index.html">User management</a></li> <li class="toctree-l1"><a class="reference internal" href="../configuration_files/index.html">File sharing and management</a></li> <li class="toctree-l1"><a class="reference internal" href="../file_workflows/index.html">Flow</a></li> <li class="toctree-l1"><a class="reference internal" href="../groupware/index.html">Groupware</a></li> <li class="toctree-l1"><a class="reference internal" href="../office/index.html">Office</a></li> <li class="toctree-l1"><a class="reference internal" href="../reference/index.html">Reference 
management</a></li> <li class="toctree-l1 current"><a class="reference internal" href="index.html">Artificial Intelligence</a><ul class="current"> <li class="toctree-l2"><a class="reference internal" href="overview.html">Overview</a></li> <li class="toctree-l2"><a class="reference internal" href="app_assistant.html">Nextcloud Assistant</a></li> <li class="toctree-l2"><a class="reference internal" href="app_translate2.html">App: Local Machine translation 2 (translate2)</a></li> <li class="toctree-l2"><a class="reference internal" href="app_llm2.html">App: Local large language model (llm2)</a></li> <li class="toctree-l2 current"><a class="current reference internal" href="#">App: Local Whisper Speech-To-Text (stt_whisper2)</a><ul> <li class="toctree-l3"><a class="reference internal" href="#requirements">Requirements</a></li> <li class="toctree-l3"><a class="reference internal" href="#installation">Installation</a><ul> <li class="toctree-l4"><a class="reference internal" href="#supplying-alternate-models">Supplying alternate models</a></li> </ul> </li> <li class="toctree-l3"><a class="reference internal" href="#scaling">Scaling</a></li> <li class="toctree-l3"><a class="reference internal" href="#app-store">App store</a></li> <li class="toctree-l3"><a class="reference internal" href="#repository">Repository</a></li> <li class="toctree-l3"><a class="reference internal" href="#known-limitations">Known Limitations</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="app_recognize.html">App: Recognize</a></li> <li class="toctree-l2"><a class="reference internal" href="app_context_chat.html">App: Context Chat</a></li> <li class="toctree-l2"><a class="reference internal" href="app_summary_bot.html">App: Summary Bot (Talk chat summarize bot)</a></li> <li class="toctree-l2"><a class="reference internal" href="app_api_and_external_apps.html">AppAPI and External Apps</a></li> <li class="toctree-l2"><a class="reference internal" 
href="ai_as_a_service.html">AI as a Service</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../webhook_listeners/index.html">Webhook Listeners</a></li> <li class="toctree-l1"><a class="reference internal" href="../windmill_workflows/index.html">Windmill Workflows</a></li> <li class="toctree-l1"><a class="reference internal" href="../configuration_database/index.html">Database configuration</a></li> <li class="toctree-l1"><a class="reference internal" href="../configuration_mimetypes/index.html">Mimetypes management</a></li> <li class="toctree-l1"><a class="reference internal" href="../maintenance/index.html">Maintenance</a></li> <li class="toctree-l1"><a class="reference internal" href="../issues/index.html">Issues and troubleshooting</a></li> <li class="toctree-l1"><a class="reference internal" href="../gdpr/index.html">GDPR-compliance</a></li> </ul> </div> </div> </nav> <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" > <i data-toggle="wy-nav-top" class="fa fa-bars"></i> <a href="../contents.html">Nextcloud latest Administration Manual</a> </nav> <div class="wy-nav-content"> <div class="rst-content style-external-links"> <div role="navigation" aria-label="Page navigation"> <ul class="wy-breadcrumbs"> <li><a href="../contents.html" class="icon icon-home" aria-label="Home"></a></li> <li class="breadcrumb-item"><a href="index.html">Artificial Intelligence</a></li> <li class="breadcrumb-item active">App: Local Whisper Speech-To-Text (stt_whisper2)</li> <li class="wy-breadcrumbs-aside"> <a href="https://github.com/nextcloud/documentation/edit/master/admin_manual/ai/app_stt_whisper2.rst" class="fa fa-github"> Edit on GitHub</a> </li> </ul> <hr/> </div> <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article"> <div itemprop="articleBody"> <section id="app-local-whisper-speech-to-text-stt-whisper2"> <h1>App: Local Whisper 
Speech-To-Text (stt_whisper2)<a class="headerlink" href="#app-local-whisper-speech-to-text-stt-whisper2" title="Link to this heading"></a></h1> <p id="ai-app-stt-whisper2">The <em>stt_whisper2</em> app is one of the apps that provide Speech-To-Text functionality in Nextcloud and act as a media transcription backend for the <a class="reference internal" href="app_assistant.html#ai-app-assistant"><span class="std std-ref">Nextcloud Assistant app</span></a>, the <em>talk</em> app and <a class="reference internal" href="overview.html#stt-consumer-apps"><span class="std std-ref">other apps making use of the core Speech-To-Text API</span></a>. The <em>stt_whisper2</em> app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request; please talk to your account manager about the possibilities.</p> <p>This app supports input and output in languages other than English if the underlying model supports the language.</p> <p>This app uses <a class="reference external" href="https://github.com/SYSTRAN/faster-whisper">faster-whisper</a> under the hood. Output quality will differ depending on which model you use. We recommend the following models:</p> <blockquote> <div><ul class="simple"> <li><p>OpenAI Whisper large-v2 or large-v3 (multilingual)</p></li> <li><p>OpenAI Whisper medium.en (English only)</p></li> </ul> </div></blockquote> <p>Whisper large-v3 supports about 100 languages and shows outstanding performance in about 10 of them. 
For more details, see the <a class="reference external" href="https://cdn.openai.com/papers/whisper.pdf">OpenAI Whisper paper</a>.</p> <section id="requirements"> <h2>Requirements<a class="headerlink" href="#requirements" title="Link to this heading"></a></h2> <ul> <li><p>Minimum Nextcloud version: 28</p></li> <li><p>This app is built as an External App and thus depends on AppAPI v2.3.0</p></li> <li><p>Nextcloud AIO is supported</p></li> <li><p>GPU usage is currently not supported</p></li> <li><p>CPU Sizing</p> <blockquote> <div><ul class="simple"> <li><p>The more cores you have and the more powerful the CPU, the better; we recommend 10-20 cores</p></li> <li><p>The app will use all available cores by default, so it is usually better to run it on a separate machine</p></li> <li><p>4GB of RAM for the app</p></li> </ul> </div></blockquote> </li> </ul> </section> <section id="installation"> <h2>Installation<a class="headerlink" href="#installation" title="Link to this heading"></a></h2> <ol class="arabic simple" start="0"> <li><p>Make sure the <a class="reference internal" href="app_assistant.html#ai-app-assistant"><span class="std std-ref">Nextcloud Assistant app</span></a> is installed</p></li> <li><p><a class="reference internal" href="app_api_and_external_apps.html#ai-app-api"><span class="std std-ref">Install AppAPI and set up a Deploy Daemon</span></a></p></li> <li><p>Install the <em>stt_whisper2</em> “Local Speech-To-Text” ExApp via the “External Apps” page in the Nextcloud web admin user interface</p></li> </ol> <section id="supplying-alternate-models"> <h3>Supplying alternate models<a class="headerlink" href="#supplying-alternate-models" title="Link to this heading"></a></h3> <p>This app allows supplying alternate models in the <code class="docutils literal notranslate"><span class="pre">/nc_app_llm2_data</span></code> directory of the Docker container. 
You can use any <a class="reference external" href="https://huggingface.co/Systran"><em>faster-whisper</em> model by Systran on Hugging Face</a> in the following way:</p> <ol class="arabic simple"> <li><p>Cloning the respective repository with git</p></li> <li><p>Copying the folder with the git repository to <code class="docutils literal notranslate"><span class="pre">/nc_app_llm2_data</span></code> inside the Docker container</p></li> <li><p>Restarting the Whisper ExApp</p></li> <li><p>Selecting the respective model in the Nextcloud AI admin settings</p></li> </ol> </section> </section> <section id="scaling"> <h2>Scaling<a class="headerlink" href="#scaling" title="Link to this heading"></a></h2> <p>It is currently not possible to scale this app; we are working on this. Based on our calculations, an instance has a rough capacity of 4h of transcription throughput per minute (measured with 8 CPU threads on an Intel(R) Xeon(R) Gold 6226R). It is unclear how close this number is to real-world usage, so we appreciate real-world feedback on this.</p> </section> <section id="app-store"> <h2>App store<a class="headerlink" href="#app-store" title="Link to this heading"></a></h2> <p>You can also find this app in our app store, where you can write a review: <a class="reference external" href="https://apps.nextcloud.com/apps/stt_whisper2">https://apps.nextcloud.com/apps/stt_whisper2</a></p> </section> <section id="repository"> <h2>Repository<a class="headerlink" href="#repository" title="Link to this heading"></a></h2> <p>You can find the app’s code repository on GitHub, where you can report bugs and contribute fixes and features: <a class="reference external" href="https://github.com/nextcloud/stt_whisper2">https://github.com/nextcloud/stt_whisper2</a></p> <p>Nextcloud customers should file bugs directly with our customer support.</p> </section> <section id="known-limitations"> <h2>Known Limitations<a class="headerlink" href="#known-limitations" title="Link to this heading"></a></h2> 
<ul class="simple"> <li><p>We currently do not support live transcription</p></li> <li><p>We currently only support languages supported by the underlying Whisper models</p></li> <li><p>The Whisper models perform unevenly across languages, and may show lower accuracy on low-resource and/or low-discoverability languages or languages where there was less training data available. The models also exhibit disparate performance on different accents and dialects of particular languages, which may include higher word error rate across speakers of different genders, races, ages, or other demographic criteria.</p></li> <li><p>Language models are likely to generate false information and should thus only be used in situations that are not critical. It’s recommended to only use AI at the beginning of a creation process and not at the end, so that outputs of AI serve as a draft, for example, and not as the final product. Always check the output of language models before using it.</p></li> <li><p>Make sure to test whether the language model you are using meets the quality requirements of your use case</p></li> <li><p>Language models notoriously have a high energy consumption. If you want to reduce load on your server, you can choose smaller models or quantized models in exchange for lower accuracy</p></li> <li><p>Customer support is available upon request; however, we can’t solve false or problematic output, most performance issues, or other problems caused by the underlying model. 
Support is thus limited only to bugs directly caused by the implementation of the app (connectors, API, front-end, AppAPI)</p></li> <li><p>Due to technical limitations that we are in the process of mitigating, each task currently incurs a time cost of between 0 and 5 minutes in addition to the actual processing time</p></li> </ul> </section> </section> </div> </div> <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer"> <a href="app_llm2.html" class="btn btn-neutral float-left" title="App: Local large language model (llm2)" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a> <a href="app_recognize.html" class="btn btn-neutral float-right" title="App: Recognize" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a> </div> <hr/> <div role="contentinfo"> <p>© Copyright 2024 Nextcloud GmbH.</p> </div> </footer> </div> </div> </section> </div> <div class="rst-versions" data-toggle="rst-versions" role="note" aria-label="Versions"> <span class="rst-current-version" data-toggle="rst-current-version"> <span class="fa fa-book"> Read the Docs</span> v: latest <span class="fa fa-caret-down"></span> </span> <div class="rst-other-versions"> <dl> <dt>Versions</dt> <dd><a href="https://docs.nextcloud.com/server/28/admin_manual">28</a></dd> <dd><a href="https://docs.nextcloud.com/server/29/admin_manual">29</a></dd> <dd><a href="https://docs.nextcloud.com/server/stable/admin_manual">stable</a></dd> <dd><a href="https://docs.nextcloud.com/server/latest/admin_manual">latest</a></dd> </dl> <dl> <dt>Downloads</dt> </dl> <dl> <dt>On Read the Docs</dt> <dd> <a href="///projects//?fromdocs=">Project Home</a> </dd> <dd> <a href="///builds//?fromdocs=">Builds</a> </dd> </dl> </div> </div> <script> jQuery(function () { SphinxRtdTheme.Navigation.enable(true); }); </script> </body> </html>