
tpro

Transcript Processing! tpro takes transcripts produced by speech-to-text (STT) services and converts them to standardized formats.

demo

Installation and Usage

Non-pip Requirement: Stanford NER JAR

  • download and unzip the Stanford NER package
  • put these files in /usr/local/bin/:
    • stanford-ner.jar
    • classifiers/english.all.3class.distsim.crf.ser.gz
  • you might have to update Java on Linux
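
To confirm that both files ended up where the steps above put them, a small check like the following can help. This is only a sketch: `missing_ner_files` is a hypothetical helper, not part of tpro, and it assumes nothing beyond the two paths listed above.

```python
from pathlib import Path

# The two files the steps above place under /usr/local/bin/.
REQUIRED_NER_FILES = (
    "stanford-ner.jar",
    "classifiers/english.all.3class.distsim.crf.ser.gz",
)


def missing_ner_files(base_dir="/usr/local/bin"):
    """Return the required Stanford NER files that are not present under base_dir.

    Hypothetical helper for illustration; tpro itself does not ship it.
    """
    base = Path(base_dir)
    return [name for name in REQUIRED_NER_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_ner_files()
    if missing:
        print("Missing Stanford NER files:", ", ".join(missing))
    else:
        print("Stanford NER files found.")
```

An empty list means both files are in place; anything else names the files still to be copied.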

Pip

$ pip install tpro

Usage

$ tpro --help

Usage: tpro [OPTIONS] TRANSCRIPT_DATA_PATH OUTPUT_PATH
            [amazon|gentle|speechmatics|google] [universal|vo]

Options:
  -p, --print-output    pretty print the transcript, breaks pipeability
  --language-code TEXT  specify language, defaults to en-US.
  --help                Show this message and exit.
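
For scripted use, the invocation shown in the help text can be assembled and run from Python. This is a minimal sketch, assuming `tpro` is installed and on the PATH. `build_tpro_command` and `run_tpro` are hypothetical helper names, not part of tpro, and the only arguments and flags used are the ones that appear in the help output above.

```python
import subprocess

# Choices copied from the `tpro --help` usage line.
SERVICES = ("amazon", "gentle", "speechmatics", "google")
OUTPUT_FORMATS = ("universal", "vo")


def build_tpro_command(transcript_path, output_path, service, output_format,
                       language_code=None, print_output=False):
    """Build the argv list for a tpro call (hypothetical helper, not part of tpro)."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format!r}")

    command = ["tpro"]
    if print_output:
        command.append("--print-output")
    if language_code is not None:
        command += ["--language-code", language_code]
    # Positional arguments follow the order given in the usage line.
    command += [str(transcript_path), str(output_path), service, output_format]
    return command


def run_tpro(*args, **kwargs):
    """Run tpro with the built command; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_tpro_command(*args, **kwargs), check=True)
```

For example, `build_tpro_command("transcript.json", "out.json", "amazon", "universal")` yields `['tpro', 'transcript.json', 'out.json', 'amazon', 'universal']`. The file names here are placeholders.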

STT Services

Planned

Output Formats

Planned

  • Draft.js JSON
  • Word (.doc, .docx)
  • text files
  • SRT (subtitles)