Speech-recognition app to convert MP3 to text?
Does anyone know of an application that can convert audio to text? I'm running Ubuntu 12.04 LTS.
software-recommendation speech-recognition
edited Jul 9 '12 at 15:07
Eliah Kagan
asked Jul 9 '12 at 11:33
Kopano
I assume it is spoken text. Which language is that text in?
– Martin Ueding
Jul 9 '12 at 11:33
The speech is in simple English.
– Kopano
Jul 9 '12 at 14:33
4 Answers
You can use CMUSphinx. Contrary to what another answer suggests, Julius is not suitable here because it requires models, and models for large-vocabulary speech recognition are not available for Julius.
You can use pocketsphinx to convert an audio file. The following two commands should do the work. First convert the file to the required format, then run recognition on it:
ffmpeg -i file.mp3 -ar 16000 -ac 1 file.wav
Then run pocketsphinx:
pocketsphinx_continuous -infile file.wav 2> pocketsphinx.log > result.txt
The result will be stored in result.txt.
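If you have more than one file, the two commands above could be wrapped in a small shell function. This is only a sketch: the helper names `wav_name_for` and `transcribe_mp3` and the output file naming are made up for illustration, and it assumes `ffmpeg` and `pocketsphinx_continuous` are on your PATH with a usable default model.

```shell
# Sketch: run the two commands from this answer on each MP3 given as argument.
# wav_name_for and transcribe_mp3 are hypothetical helper names.

# Derive the intermediate WAV name: talk.mp3 -> talk.wav
wav_name_for() {
    printf '%s.wav\n' "${1%.*}"
}

transcribe_mp3() {
    local mp3="$1"
    local wav base
    wav=$(wav_name_for "$mp3")
    base="${mp3%.*}"
    # Same conversion as above: 16 kHz sample rate, mono.
    # -y lets ffmpeg overwrite an existing WAV without asking.
    ffmpeg -y -i "$mp3" -ar 16000 -ac 1 "$wav" || return 1
    # Recognizer log goes to <name>.log, recognized text to <name>.txt.
    pocketsphinx_continuous -infile "$wav" 2> "$base.log" > "$base.txt"
}

# Nothing happens when the script is run without arguments.
for f in "$@"; do
    transcribe_mp3 "$f"
done
```

Saved as, say, `transcribe.sh`, it would be called as `bash transcribe.sh lecture1.mp3 lecture2.mp3` and leave `lecture1.txt` and `lecture2.txt` next to the inputs.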
edited 24 mins ago
Pablo Bianchi
answered Feb 20 '14 at 20:24
Nikolay Shmyrev
Also, as an addition to this answer, there's a cool demo of both speech recognition and voice command tools here: youtube.com/…
– Daithí
Jan 8 '15 at 10:22
How do you add an acoustic model to the system?
– jarno
Feb 8 '15 at 13:38
You just download it and unpack it; there is no such thing as "add to the system".
– Nikolay Shmyrev
Feb 8 '15 at 13:56
@NikolayShmyrev Where should I unpack it so that pocketsphinx_continuous finds it?
– jarno
Feb 8 '15 at 14:14
Well, I installed the packages pocketsphinx-utils, pocketsphinx-hmm-en-hub4wsj and pocketsphinx-lm-en-hub4 from the universe repository of Ubuntu 14.04. Then
pocketsphinx_continuous -infile file.wav -hmm en_US/hub4wsj_sc_8k -lm en_US/hub4.5000.DMP 2> pocketsphinx.log
worked. Maybe they are not the optimal packages, but they were the best matches I could find in the repositories.
– jarno
Feb 8 '15 at 15:05
If you are looking to convert speech to text, you could try opening the Ubuntu Software Center and searching for Julius.
Description
"Julius" is a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder software for speech-related researchers and developers.
Another option, which isn't in the Software Center, is Simon:
... is an open-source speech recognition program and replaces the mouse and keyboard.
Reference Links
http://julius.sourceforge.jp/en_index.php
http://sourceforge.net/projects/speech2text/
http://simon-listens.org/index.php?id=122&L=1
answered Jul 9 '12 at 11:54
CoalaWeb
I know this is old, but to expand on Nikolay's answer and hopefully save someone some time in the future: to get an up-to-date version of pocketsphinx working, you need to compile it from the GitHub or SourceForge repository (I'm not sure which is kept more up to date). Note that -j8 tells make to run up to 8 jobs in parallel; if you have more CPU cores you can increase the number.
git clone https://github.com/cmusphinx/sphinxbase.git
cd sphinxbase
./autogen.sh
./configure
make -j8
make -j8 check
sudo make install
cd ..
git clone https://github.com/cmusphinx/pocketsphinx.git
cd pocketsphinx
./autogen.sh
./configure
make -j8
make -j8 check
sudo make install
cd ..
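Rather than hard-coding `-j8`, the job count could be taken from the machine. A small sketch, assuming the `nproc` utility from GNU coreutils is available and falling back to a single job if it is not; `jobs_for_make` is a name made up for this example:

```shell
# Sketch: choose make's parallel job count from the CPU count.
# jobs_for_make is a hypothetical helper name.
jobs_for_make() {
    # nproc prints the number of available processing units;
    # fall back to 1 if it is missing or fails.
    nproc 2>/dev/null || echo 1
}

# Print the command that could replace "make -j8" in the steps above.
echo "make -j$(jobs_for_make)"
```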
Then, from https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English/
download the newest versions of cmusphinx-en-us-....tar.gz and en-70k-....lm.gz, and unpack them:
tar -xzf cmusphinx-en-us-....tar.gz
gunzip en-70k-....lm.gz
Then you can finally proceed with the steps from Nikolay's answer:
ffmpeg -i book.mp3 -ar 16000 -ac 1 book.wav
pocketsphinx_continuous -infile book.wav \
    -hmm cmusphinx-en-us-8khz-5.2 -lm en-70k-0.2.lm \
    2>pocketsphinx.log >book.txt
Sphinx works all right. I wouldn't rely on it to make a readable version of the text, but it's good enough that you can search it if you're looking for a particular quote. That works especially well with a Xapian-based search tool such as Recoll (http://www.lesbonscomptes.com/recoll/), which accepts wildcards and doesn't require exact search expressions.
Hope this helps.
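Even without an indexer, a quick look through the transcript is possible with plain grep. The sketch below is not a replacement for a fuzzy search tool; it only shows the idea of leaving room for misrecognized words. `find_quote` is a hypothetical helper, and `book.txt` is the transcript produced by the command above.

```shell
# Sketch: look for a half-remembered phrase in the transcript.
# find_quote is a hypothetical helper name.
find_quote() {
    local pattern="$1" transcript="$2"
    # -i ignores case, -n prints line numbers, -E enables extended regex
    # so that ".*" can stand in for words the recognizer got wrong.
    grep -i -n -E -- "$pattern" "$transcript"
}

# Example call: find_quote 'call me .* ishmael' book.txt
```

The exit status is grep's own, so it is non-zero when nothing matches.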
edited Jan 17 '18 at 10:43
nickcrabtree
answered Apr 25 '17 at 5:01
Jonathan Perry-Houts
Everything works like a charm, but in my case I had to run the following commands to fix this error:
pocketsphinx_continuous: error while loading shared libraries: libpocketsphinx.so.3: cannot open shared object file: No such file or directory
export LD_LIBRARY_PATH=/usr/local/lib
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
– Vijay Dohare
Sep 19 '17 at 11:30
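The two export lines in the comment above overwrite whatever those variables already contained. A slightly more careful sketch adds the directory only if it is missing and keeps the existing entries; `prepend_path` is a hypothetical helper name, not part of any tool. On Ubuntu, running `sudo ldconfig` after `sudo make install` may also clear the error, since it refreshes the shared library cache.

```shell
# Sketch: add a directory to a colon-separated path value only if absent.
# prepend_path is a hypothetical helper name.
prepend_path() {
    local dir="$1" current="$2"
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;        # already listed
        "::")       printf '%s\n' "$dir" ;;            # value was empty
        *)          printf '%s\n' "$dir:$current" ;;   # put new dir first
    esac
}

export LD_LIBRARY_PATH="$(prepend_path /usr/local/lib "${LD_LIBRARY_PATH:-}")"
export PKG_CONFIG_PATH="$(prepend_path /usr/local/lib/pkgconfig "${PKG_CONFIG_PATH:-}")"
```

To make the setting persistent, the two export lines could go into `~/.profile` after the function definition.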
You can use the speechpad.pw transcription panel.
See the video of the transcription panel in use.
answered Jul 10 '16 at 20:37
alexei
That looks cool, although I don't think it answers the question, which was how to get a transcription of an existing file. That being said, I just tried Sphinx and it failed miserably... the transcription was 99.9% wrong.
– Alexis Wilke
Nov 10 '17 at 18:47