GitHub - curtgrimes/webcaptioner: A former speech-to-text app using the Web Speech API. (2024)

Web Captioner

A former speech-to-text app by Curt Grimes.

Web Captioner was a speech-to-text service using the Web Speech API that existed at webcaptioner.com from 2017 until October 31, 2023.

No longer maintained

⚠️ This project is no longer maintained. On October 31, 2023, I simultaneously sunset the project and released its source here. If you were a user of Web Captioner, your support over the years has been greatly appreciated!

I recommend exploring alternatives to Web Captioner that are better maintained and more fully featured, including the built-in speech-to-text tools available on many of today's devices.

Releasing this source code may lead to derivative versions of Web Captioner over which I have no control; use them at your own discretion.

Features

  • Provides a web interface for using the Web Speech API for speech-to-text conversion. Support is focused on Google Chrome only.
  • Provides a unique link to captions for display on other devices or in other applications (like streaming applications that can embed a webpage). Requires a Redis server.
  • Integrates with vMix, YouTube, FAB Subtitler, Dropbox, Zoom, and OBS Studio. Some integrations are incomplete or broken.
  • Sends captions to Chromecast devices.
  • Lets you configure the speech recognition language (but does not translate between languages).
  • Lets you configure the appearance of captions displayed on a screen.
  • Lets you configure word replacements for frequently misunderstood words.
  • Lets you export and restore configuration settings.
  • Provides keyboard shortcuts for starting and stopping captions.
  • Heuristics for "undoing" censorship applied by Google Chrome's implementation of the Web Speech API.
  • Converts the output from the Web Speech API into an SRT file (experimental).
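To give a feel for the experimental SRT export feature, here is a minimal sketch of converting timed caption segments into SRT cue text. The segment shape (`{ startMs, endMs, text }`) and function names are assumptions for illustration, not Web Captioner's actual internal code:

```javascript
// Sketch: format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm).
function msToSrtTime(ms) {
  const pad = (n, w) => String(n).padStart(w, "0");
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms % 1000, 3)}`;
}

// Sketch: number each segment and join the cues with blank lines,
// following the SubRip (.srt) layout.
function segmentsToSrt(segments) {
  return segments
    .map((seg, i) =>
      [
        i + 1,
        `${msToSrtTime(seg.startMs)} --> ${msToSrtTime(seg.endMs)}`,
        seg.text,
      ].join("\n")
    )
    .join("\n\n");
}
```

In a real exporter the start and end times would come from timestamps recorded as the Web Speech API emitted results.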

Running the project

The bulk of the Web Captioner project (which was previously available at webcaptioner.com/captioner) is a Vue 2 / Nuxt (version 2) app located under the ./app folder. There are earlier iterations that do not use Nuxt.

⚠️ Warning: This project has outdated dependencies. Its most active development was from 2017 through 2020, so many dependencies are now outdated. Make sure you understand the security implications of running it as-is. I do not provide support.

Prerequisites

  1. Node 12.2.0 - This project runs successfully with Node 12.2.0. (If you have nvm, you can run nvm install 12.2.0.) I have not verified compatibility with other Node versions; it may or may not work with them.
  2. Python 2 - To build some dependencies you will need Python 2 available in your environment. (If you are on a recent version of macOS that does not have Python 2 installed, here is one way to install Python 2 on macOS.)

Local development server

To run the Nuxt.js development server:

  1. Clone this repository
  2. Make sure you meet the prerequisites above
  3. cd ./app
  4. Run npm i
  5. Copy the file .env.sample to .env and fill in environment variables as necessary. (A good portion of the project runs without filling in any.)
  6. Run npm run dev
  7. Load the website in Google Chrome at the path printed to the console.
  8. Configure Chrome to treat your dev server as a secure origin so that it will allow you to grant the page microphone permission:
    1. Type chrome://flags/#unsafely-treat-insecure-origin-as-secure in the address bar
    2. Copy and paste the dev server origin (something that looks like http://192.168.1.200:8080) into the text box.
    3. Click "Enable" next to the text box.
    4. Save and relaunch the browser.

Docker

For those who are interested, there is an older Dockerfile (not used in more recent iterations of Web Captioner) that could serve as a starting point for a more robust one.

These commits from a much earlier point in the codebase, where I was doing some Docker work, might be helpful:

  1. cad70d18 - Dockerize app and add deployment to AWS (Dec 25, 2017)
  2. 1ba6ceb4 - Add docker-compose file (Dec 26, 2017)

Redis server

The "share captions by link" feature requires a Redis server, though it could be reworked not to. If you have a Redis server available, set REDIS_URL in the .env file.

If you are not using this feature, you don't need the Redis server.

Firebase

This project uses Firebase for user authentication (Firebase Authentication) and data storage (Cloud Firestore). Firebase can be configured with the FIREBASE_* environment variables listed in .env.sample.

Firestore had a "users" collection, and each user record had these properties:

  • settings - values related to a user's preferences. The user could edit these things through the Web Captioner UI.
  • privileges - some toggles for experimental features that I would change manually through the Firebase console on a per-user basis.

See firestore-collection-example.json for an example of what a user record looked like.
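As a rough sketch of that record shape, the object below shows the two documented top-level properties; the nested field names are invented for illustration (see firestore-collection-example.json for the real shape):

```javascript
// Sketch of a Firestore "users" record. Only the settings/privileges
// top-level split is documented; the nested fields are hypothetical.
const exampleUserRecord = {
  settings: {
    fontFamily: "sans-serif", // hypothetical user-editable preference
    language: "en-US",        // hypothetical user-editable preference
  },
  privileges: {
    experimentalSrtExport: false, // hypothetical per-user feature toggle
  },
};

// Minimal shape check: a valid record has both documented properties.
function isValidUserRecord(record) {
  return Boolean(
    record &&
      record.settings &&
      typeof record.settings === "object" &&
      record.privileges &&
      typeof record.privileges === "object"
  );
}
```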

Contributing

This project is no longer maintained and is not accepting PRs.

You are welcome to fork this project and use it according to the MIT license.

Development History

This project is no longer maintained and was open sourced on October 31, 2023.

I have chosen to provide the full commit history for those who are interested in seeing the course of Web Captioner's development over the years. Changes related to the static site part of the repo have been removed for simplicity.

Some high-level changes that stand out to me when I read through the commit history:

  1. eee704fa (Mar 18, 2017) - The first commit, which used Google's Web Speech API Demonstration page as a starting point.
  2. 2f96e233 (Jun 21, 2017) - Starting to add Bootstrap and some other UI improvements.
  3. fc566bd9 (Jul 2, 2017) - Starting to add a blog and other pages on the static site part of the codebase (the static site has been removed from this source code release for simplicity).
  4. cad70d18 (Dec 25, 2017) - Dockerize the app and set up some auto deployment to AWS.
  5. f7ab9f88 (Mar 24, 2018) - I began adding (and learning 😉) Vue.
  6. 4e0ca506 (Apr 9, 2018) - Added the ability for settings to be saved to local storage.
  7. d7cd5926 (Apr 10, 2018) - Began working on Chromecast integration (got it working at bfcab4a1).
  8. 7056ef46 (May 6, 2018) - Starting vMix integration implementation.
  9. 51419d1c (Jun 23, 2018) - Add start of an experiments section.
  10. 0f04ba61 (Jun 27, 2018) - Add ability to change fonts.
  11. 06fc0e98 (Aug 31, 2018) - Begin using Nuxt.
  12. fea294d6 (Sep 1, 2018) - Start adding the ability to call a webhook with caption data.
  13. 8e43c17a (Sep 2, 2018) - Add ability to export and restore settings.
  14. 83cdb0c0 and d8f226cb (Sep 11, 2018) - Add the start of a typing mode which was never completely finished.
  15. fe756717 (Sep 12, 2018) - Begin some work to make the Web Captioner interface support languages other than English.
  16. 00b5b2e8 (Oct 1, 2018) - Initial work supporting the "share captions" feature/experiment.
  17. 59855af0 (Oct 29, 2018) - Add Dropbox integration.
  18. 28ba1a76 (Nov 23, 2018) - Add a heuristic for attempting to "undo" censorship applied by Chrome's implementation of the Web Speech API.
  19. 28db2c81 (Apr 26, 2019) - Start to add Firebase for signing in and saving user settings.
  20. b6728eb9 (Apr 30, 2019) - Add ability to have vanity links in the "share captions" feature/experiment.
  21. ac6259eb (Oct 26, 2019) - Add the start of some work for converting the output from the Web Speech API into an SRT file. Further improved in 1ceaed2c and 08b40b10.
  22. cb69766b (Jun 14, 2020) - Add start of Zoom integration.
  23. 00844325 (Jun 23, 2020) - Add experiment where it speaks back what is captioned.
  24. 703dd15a (Sep 3, 2020) - Start of the work for the "channels" feature to support different integrations.
  25. b2bbc2a4 (Sep 3, 2020) - Start of YouTube integration.
  26. 9a477bca (Sep 7, 2020) - Start of OBS integration.
  27. 90e43765 (Sep 8, 2020) - Start of FAB Subtitler integration.
  28. fae5f2e4 (Jun 3, 2021) - Add ability to scroll up during live transcription without being snapped back to the bottom.

Copyright and License

Code and documentation copyright 2017-present by Curt Grimes. Code released under the MIT License.
