| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
Speech-Synthesis
Hi,
Only a few people have used ring modulation to simulate speech....
I'd like to try it too, but unfortunately I have no idea how it works.
Before I waste time testing, I'll ask here: does anyone know how it's done?
Btw: can anyone tell me what the voice in AGEMIXER'S song "Da Shit (Eastwood Jack's Red..)" says?
|
|
| |
algorithm
Registered: May 2002 Posts: 702 |
Not entirely sure on the techy side regarding ring modulation, but it involves playing something on, for example, the first channel and applying ring modulation on the second channel. The pitch of the note playing on the first channel then affects the modulation effect on the second channel (along those lines). Not many musicians have utilised this effect (maybe Rob Hubbard, e.g. the Thundercats soundtrack). There was a demo by Padua (one of the Torture series) which had the Wacko Jacko soundtrack ('Black & White') with the 'Ow' synth effect in there. The sound effects in Myth were quite amazing as well (sword clangs, etc.) |
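A rough sketch of the mechanism described above, as a Python model rather than C64 code. It assumes the commonly documented SID behaviour (as modelled in emulators such as reSID): the triangle output is folded by the top bit of the oscillator's 24-bit phase accumulator, and with ring modulation enabled that bit is first XORed with the top bit of the modulating voice's accumulator. The function names and the PAL clock constant are illustrative choices, not something from this thread.

```python
PAL_CLOCK_HZ = 985248          # assumed PAL C64 system clock
ACC_MASK = 0xFFFFFF            # each oscillator has a 24-bit phase accumulator


def freq_register(hz, clock_hz=PAL_CLOCK_HZ):
    """Convert a pitch in Hz to the 16-bit SID frequency register value."""
    return round(hz * (1 << 24) / clock_hz) & 0xFFFF


def triangle(acc, mod_acc=None):
    """12-bit triangle output; pass mod_acc to ring-modulate with another voice."""
    msb = acc & 0x800000
    if mod_acc is not None:
        msb ^= mod_acc & 0x800000          # ring mod: XOR in the modulator's top bit
    folded = (acc ^ ACC_MASK) if msb else acc
    return (folded >> 11) & 0xFFF


def render(carrier_hz, modulator_hz, cycles, step=16):
    """Sample the ring-modulated carrier every `step` clock cycles."""
    f_car, f_mod = freq_register(carrier_hz), freq_register(modulator_hz)
    out = []
    for n in range(0, cycles, step):
        car_acc = (f_car * n) & ACC_MASK   # accumulator advances by freq each cycle
        mod_acc = (f_mod * n) & ACC_MASK
        out.append(triangle(car_acc, mod_acc))
    return out
```

Changing `modulator_hz` while the carrier stays put changes where the triangle gets flipped, which is the "pitch of one channel affects the other" effect in miniature.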
| |
Zyron
Registered: Jan 2002 Posts: 2381 |
There are a couple of SIDs in HVSC where JCH tried to make the SID say Triangle using ringmod. |
| |
algorithm
Registered: May 2002 Posts: 702 |
Any speech synth examples using waveforms (not $d418)? |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
Try Agemixer's Freestyler remix for a really amazing example. I was shocked when I heard it for the first time. |
| |
Steppe
Registered: Jan 2002 Posts: 1510 |
Strobosphere II by Agemixer also comes to mind. |
| |
algorithm
Registered: May 2002 Posts: 702 |
Wow, that was really amazing. That's the first time I have ever heard someone attempt to make the SID 'sing'. Any more examples? |
| |
Hate Bush
Registered: Jul 2002 Posts: 453 |
Jammer used it twice, but I don't think the tunes ("Not Worse than Huus" and "Opium II intro") were ever released. I tried as well - the C64 was supposed to say "What? Brasil!" in the last seconds of my (otherwise lame) tune "Transmitting Live from Brasil", and "Freeze!" in (also lame) tune "Gunjah".
Ari beats us both in speech synthesis, hands down, though ;) |
| |
T.M.R Account closed
Registered: Dec 2001 Posts: 749 |
Have a listen to "Orcus_giv_it_up" by JCH, that's probably the best example of ring mod "singing" i've heard... =-) |
| |
Soren
Registered: Dec 2001 Posts: 547 |
The problem with speech synthesis using SID waveforms and ringmod is that you get the best result by using filters, and as we know there are way too many SID revisions with different filter cutoff ranges. I did a test last year though and made an example that "says" VIRUZ, with a deep vocoder-like sound, but then again it relies too much on the filter of my 6581R4 SID. But I had a lot of fun doing vocoder-like sounds on my good old trusty Roland XP-50 synth, using more or less the same technique. Well, apart from using 2 filters instead of 1.
:-)
|
| |
yago
Registered: May 2002 Posts: 332 |
Viznut did also some speech-synthesizing (i did not yet research how he did it..)
Who Cares
|
| |
Krill
Registered: Apr 2002 Posts: 2839 |
yago: viznut simply replays sampled phonemes through $d418, afaik. this thread here is focussing on pseudo-speechsynth using sid voices :D |
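For contrast with the ringmod approach, here is a minimal Python sketch of the $d418 idea mentioned above: the low four bits of the SID's volume register are used as a crude 4-bit DAC, so sample data has to be reduced to values 0..15. The helper name and the float input range are assumptions for illustration; on the C64 itself a playback loop would write each value to $d418 at a steady rate.

```python
import math


def to_d418_nibbles(samples):
    """Quantise floats in [-1.0, 1.0] to 4-bit values (0..15) for the volume register."""
    nibbles = []
    for s in samples:
        s = max(-1.0, min(1.0, s))                    # clamp out-of-range input
        nibbles.append(min(15, int((s + 1.0) * 8)))   # 16 levels, top edge clamped
    return nibbles


# One cycle of a sine wave, 16 samples long, as 4-bit playback data.
sine = [math.sin(2 * math.pi * i / 16) for i in range(16)]
digi = to_d418_nibbles(sine)
```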
| |
Hate Bush
Registered: Jul 2002 Posts: 453 |
Jeff: which filter did you use? While working on my Primary Star '05 msx compo entry (which contains its title spoken out by two channels of SID) I noticed that high pass gives the best results, especially if the cutoff frequencies change suddenly.
Also pulse width (of the modulated signal) seems to be important, 'coz it helps in recreating vowels, and more. |
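To make the knobs mentioned here concrete, this is a small Python sketch of how filter and pulse-width settings map onto SID registers, following the standard register layout (11-bit cutoff in $d415/$d416, resonance and routing in $d417, filter mode and volume in $d418, 12-bit pulse width in the third and fourth register of each voice). The function names and the dictionary-of-writes return format are illustrative assumptions. Which cutoff values actually sound right differs between SID revisions, as Soren notes above.

```python
MODE_BITS = {"lp": 0x10, "bp": 0x20, "hp": 0x40}   # $d418 bits 4-6


def filter_writes(cutoff, resonance, voices, mode, volume=15):
    """Return {register: value} for an 11-bit cutoff and 4-bit resonance.

    voices is a bitmask (bit 0 = voice 1) of voices routed through the filter.
    """
    cutoff &= 0x7FF
    return {
        0xD415: cutoff & 0x07,                           # cutoff, low 3 bits
        0xD416: cutoff >> 3,                             # cutoff, high 8 bits
        0xD417: ((resonance & 0x0F) << 4) | (voices & 0x07),
        0xD418: MODE_BITS[mode] | (volume & 0x0F),
    }


def pulse_width_writes(width, voice=1):
    """Return {register: value} for a 12-bit pulse width on voice 1..3."""
    base = 0xD400 + 7 * (voice - 1)                      # 7 registers per voice
    width &= 0xFFF
    return {base + 2: width & 0xFF, base + 3: width >> 8}


# A sudden cutoff jump between two frames, high-pass on voice 2.
frame_a = filter_writes(0x100, 12, 0b010, "hp")
frame_b = filter_writes(0x600, 12, 0b010, "hp")
```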
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
Quote: The problem with speech synthesis using SID waveforms and ringmod is that you get the best result by using filters, and as we know there are way too many SID revisions with different filter cutoff ranges. I did a test last year though and made an example that "says" VIRUZ, with a deep vocoder-like sound, but then again it relies too much on the filter of my 6581R4 SID. But I had a lot of fun doing vocoder-like sounds on my good old trusty Roland XP-50 synth, using more or less the same technique. Well, apart from using 2 filters instead of 1.
:-)
But 8580 filters should work fine?
I'm really wondering how it's possible to simulate spoken words very quickly. I really don't want to spend weeks on maybe one word....
|
| |
Graham Account closed
Registered: Dec 2002 Posts: 990 |
http://www.arl.wustl.edu/~jaf/lpc/
okok the SID cannot do LPC, but at least it has a filter with variable cutoff frequency, so maybe methods similar to LPC could be applied. |
| |
Soren
Registered: Dec 2001 Posts: 547 |
Randall: I think I use either Bandpass or a combined filter.
Reason: the "voice" is quite deep, so I didn't want it to lose too much of the bass. But I used 2 voices as well :)
Btw, keep up the good job you're doing in composing ;-)
Interesting stuff!
|
| |
Soren
Registered: Dec 2001 Posts: 547 |
nata: well, it should work on 8580 too, but with different filter cutoff values. But prepare yourself for some work, as it takes some time to do. At least if you want it good and detailed. |
| |
HCL
Registered: Feb 2003 Posts: 716 |
@nata: I really don't want to spend weeks on maybe one word....
What do you mean?!? Why NOT? Just buy yourself some scene spirit! |
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
But how do I start?
I really need a few more tips.
Maybe there are some docs out there on generating speech..... |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Quote: But how do I start?
I really need a few more tips.
Maybe there are some docs out there on generating speech.....
Goattracker 1, 1speed crappy example. I've used this in a 2speed example too, but that's going to be used within the next 10 years in a Focus demo.
http://www.sinapism.net/goattrackersongs/robotvoice2.sng
It sounds good on a 6581 in sidplay2, rather high-filtered on 8580, but heck, IT'S A TEST, the best excuse for crap.
You can try using 2, 4 or 8 speed for a better result. |
| |
SIDWAVE Account closed
Registered: Apr 2002 Posts: 2238 |
I know it doesn't really say any word you can understand, but the speech in "Little Computer People" is quite funny.
You can keep him talking on the phone forever, by pressing ctrl+p |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Quote: @nata: I really don't want to spend weeks on maybe one word....
What do you mean?!? Why NOT? Just buy yourself some scene spirit!
:)
When is that demo ready that was supposed to be released at LCP? I finished 2 tunes at least 2 weeks before the deadline, so your complaining in advance did have some effect on me, but not on yourself.. scenespirit, right. ;) |
| |
Ben Account closed
Registered: Feb 2003 Posts: 163 |
@Hein: I hear someone gargle 'Jezus.. Jezus.. Waarom.. Jezus?' Is that correct? |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Quote:
@Hein: I hear someone gargle 'Jezus.. Jezus.. Waarom.. Jezus?' Is that correct?
haha.. you played it backwards, eh? |
| |
HCL
Registered: Feb 2003 Posts: 716 |
@Hein: I Wish i could just hand over a bunch of effects, and let the musician finish the demo. Is it a deal? ;) |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
btw, does anyone have a general picture of how S.A.M. works? |
| |
algorithm
Registered: May 2002 Posts: 702 |
I programmed a front end based on S.A.M. a long time ago (the program was known as 'Dalek Speech').
I did not analyse the routine in detail (in fact, I barely analysed it): a pointer pointed to an area of memory containing the allophone text, and a jmp or jsr call to the routine would process the speech.
Speech synthesis using allophone-based speech is very straightforward. You can just sample the allophones using your own voice, and a routine reads the allophone information and plays the matching samples. Alternatively, standard text can be converted to allophones first, and each allophone then points to the required sample in memory. I prefer using the allophone system directly for more precise results.
|
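A minimal sketch of the lookup scheme described in this post, in Python for readability. The allophone names and the tiny sample lists are made up for illustration; in practice each entry would hold a recording of that sound, and the loop would run inside an interrupt-driven player on the C64.

```python
# Hypothetical allophone table: name -> list of 4-bit sample values.
ALLOPHONES = {
    "HH": [8, 9, 8, 7],
    "EH": [8, 12, 15, 12, 8, 4, 0, 4],
    "LL": [8, 10, 8, 6],
    "OW": [8, 13, 8, 3],
    "PAUSE": [8, 8, 8, 8],
}


def speak(allophones):
    """Concatenate the stored samples for a sequence of allophone names."""
    output = []
    for name in allophones:
        output.extend(ALLOPHONES[name])   # raises KeyError for unknown allophones
    return output


hello = speak(["HH", "EH", "LL", "OW"])
```

Plain concatenation like this gives no control over pitch or blending between sounds, which is the trade-off against a parametrized synthesiser.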
| |
Hein
Registered: Apr 2004 Posts: 933 |
Quote: @Hein: I Wish i could just hand over a bunch of effects, and let the musician finish the demo. Is it a deal? ;)
Why would you want a musician to finish your demo? |
| |
chatGPZ
Registered: Dec 2001 Posts: 11114 |
i disassembled and cleaned up (a little) SAM a while back, my plan was to have it render the speech sample into a temporary buffer instead of doing output directly. this didn't quite work out because:
1) SAM does _not_ use sampled pieces which are then put together. there are 4 or 5 (i don't remember) small routines which take a few parameters causing them to output different sounds
2) those routines do not work on a fixed timebase (ie samplerate), which is the magic behind the high quality achieved in SAM - it practically outputs samples at a very high samplerate (sometimes one d418 write every few cycles)
oh and SAM isn't as simple as algorithm suggests (even if it could work that way)... since all those routines are parametrized in SAM you can generate a LOT of variation in speech, even make it sing if you have a lot of patience :)
|
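One way to picture the buffer-rendering problem described in this post is as a resampling step: if the synthesis routines emit volume writes at irregular cycle counts, a fixed-rate buffer can be filled by holding the most recent value at each sample tick. The Python sketch below assumes the writes are available as (cycle, value) pairs and a PAL clock of 985248 Hz; both are illustrative assumptions, not a description of how SAM itself is structured.

```python
import math


def render_to_buffer(writes, sample_rate, clock_hz=985248, initial=8):
    """Zero-order-hold resample of timestamped writes to a fixed sample rate.

    writes: list of (cycle, value) pairs sorted by cycle.
    Returns one value per output tick, through the first tick at or after
    the last write.
    """
    if not writes:
        return []
    cycles_per_sample = clock_hz / sample_rate
    n_samples = math.ceil(writes[-1][0] / cycles_per_sample) + 1
    buffer, current, idx = [], initial, 0
    for n in range(n_samples):
        tick = n * cycles_per_sample
        # Apply every write that happened at or before this sample tick.
        while idx < len(writes) and writes[idx][0] <= tick:
            current = writes[idx][1]
            idx += 1
        buffer.append(current)
    return buffer
```

Writes that fall between two ticks are overwritten before they are ever sampled, which hints at why a fixed sample rate would lose some of the detail of the original output.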
| |
Stryyker
Registered: Dec 2001 Posts: 465 |
I played around with some SAM, not sure what version or anything, in Half Demo. I just put the speech commands as used in BASIC into memory, pointed to the RAM and played.
I also found out it wouldn't play so nicely if I had the screen on, and I was very much a beginner at the time. No idea what I was doing. |
| |
Zyron
Registered: Jan 2002 Posts: 2381 |
SAM rules. Every year he shows up at Floppy to present the compo-entries :) |
| |
Ben Account closed
Registered: Feb 2003 Posts: 163 |
Quote: SAM rules. Every year he shows up at Floppy to present the compo-entries :)
You nearly seduced me to write a demo with the name "Bloemkolen" (Gunnar's favorite exotic dish) to participate in the compo... |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Quote: You nearly seduced me to write a demo with the name "Bloemkolen" (Gunnar's favorite exotic dish) to participate in the compo...
JT wrote a song about boerenkool, just listen to 'Tomcat'. It's singing 'we eten boerenkool' |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Quote: I played around with some SAM, not sure what version or anything, in Half Demo. I just put the speech commands as used in BASIC into memory, pointed to the RAM and played.
I also found out it wouldn't play so nicely if I had the screen on, and I was very much a beginner at the time. No idea what I was doing.
But that makes it possible to make SAM rap over a SID song, or what? Fill up the 256 bytes of text memory, triggered by the SID song routine, refresh the 256 bytes, jmp to the SAM play routine.. or am I talking like the typical dummy? |
| |
algorithm
Registered: May 2002 Posts: 702 |
From the description that Groepaz wrote of the SAM player routine, it would probably be too awkward, if at all feasible, to do other things while the sample is played back.
The description I gave regarding the allophone to speech sample lookup was not anything to do with the SAM speech synthesiser. Using this method (eg NMI interrupt playing sample sections, reading next allophone, pointing to sample lookup and playing etc etc) would allow other stuff to happen (eg SID music etc) and the voice would probably be more natural (although less control unless the playback routine was tweaked)
|
| |
algorithm
Registered: May 2002 Posts: 702 |
Just rummaged through one of the disks and found an ancient crap program written by me a long time ago. The program is a front end for the SAM synthesiser (Dalek Speech), give it a try. I was just reading the scroller and text for the program. What lame things I came out with (one example: "Written 100% in machine code, not assembler"). Classic lameness. It's shameful to write something along those lines. (Mind you, most of my stuff at that time, and especially before, was even worse.)
http://www.filelodge.com/files/2500/DalekSpeech2.zip
|
| |
Jucke
Registered: Feb 2002 Posts: 34 |
Quote: SAM rules. Every year he shows up at Floppy to present the compo-entries :)
Zyron: Just before Floppy 2001 we were discussing who was to talk in the mic and present the entries during the compo, but we all were pretty nervous of doing that. A few nights later I found some documentation of S.A.M and was playing around with it, when the idea struck me to use him as the voice of the compo. Surely one of our heaviest party features.. (: if anyone is the voice of the c64, it surely is S.A.M. |
| |
carlsson
Registered: Nov 2002 Posts: 41 |
S.A.M. is also the voice of Apple II, Atari 8-bit and original IBM PC, if I remember correctly. :-) |
| |
Hein
Registered: Apr 2004 Posts: 933 |
An example from the Japanese synth master Tomita. Found it a while ago on the internet.
http://www.sinapism.net/goattrackersongs/tomitaspeech.zip |
| |
WVL
Registered: Mar 2002 Posts: 886 |
Quote: An example from the Japanese synth master Tomita. Found it a while ago on the internet.
http://www.sinapism.net/goattrackersongs/tomitaspeech.zip
#45 sounds more like farting to me :) |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Quote: #45 sounds more like farting to me :)
you're right. I think it IS. |
| |
TDJ
Registered: Dec 2001 Posts: 1879 |
Quote: you're right. I think it IS.
Just as long as he's not driving a moped while farting, I'm fine with it. |
| |
_V_ Account closed
Registered: Jan 2002 Posts: 124 |
It's nice to see a lot of sceners are thinking about speech synthesis in SID tunes... 1x-speedwise, Freestyler by Agemixer is the best example yet, I think. I have been experimenting a little myself a while back, and you can do some nice ringmod singing in 2x-speed. It's quite hard though, you can't just make single letter patterns and then link them together. Every word requires a new pattern, depending on the sentence and intonation. Still needs a lot of work and as Jeff mentioned, it will probably sound not too good outside of the SID chip you created the tune for. Don't have the time at the moment. |
| |
dalezy
Registered: Apr 2002 Posts: 475 |
i totally lack patience to come up with some recognizable ringmod-speech. i've studied jch's triangle- and bonzai-worktunes until my nose bled, yet reproducing something alike turned out into an impossible mission for me. |
| |
Steppe
Registered: Jan 2002 Posts: 1510 |
@_V_: Agemixer's Freestyler is triplespeed. A very good 1x example is the Strobosphere - Halfway There piece. And did anyone notice the speech in Randall's latest release, Ooh Crikey? |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
IMHO strobosphere is unrecognizable, I've been listening to the tune for 5 mins or more when suddenly I realized that what I thought of a lame sound fx/instrument is trying to say strobosphere. And that happened with the knowledge in mind that it says somewhere "strobosphere"........ |
| |
Steppe
Registered: Jan 2002 Posts: 1510 |
It clearly says "Strobosphere - Awaaaaay Theeeee"! :-) |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
clearly ? must be worse than edison's first working phonograph prototype =) |
| |
celticdesign
Registered: Oct 2005 Posts: 148 |
wow,
freestyler sounds pretty nice.
also the strobosphere sounds kewl..
are there any others?
|
| |
A Life in Hell Account closed
Registered: May 2002 Posts: 204 |
Quote: clearly ? must be worse than edison's first working phonograph prototype =)
Interestingly, I only found it understandable on a real 8580 - neither the emulated one nor a 6581 worked for me. |
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
Quote: wow,
freestyler sounds pretty nice.
also the strobosphere sounds kewl..
are there any others?
Strobosphere 2 does not use ringmod for speech synthesis |
| |
Steppe
Registered: Jan 2002 Posts: 1510 |
Quote: Strobosphere 2 does not use ringmod for speech synthesis
I think it sounds quite similar on a real and an emulated 8580 (talking about the latest Sidplay2 here). I never bothered to check it out on the 6581, but boy, what a difference! Now I can understand why Oswald took this for a lame sound effect. :-) |
| |
stunt Account closed
Registered: Jul 2006 Posts: 48 |
I think speech synthesis can best be implemented by studying the way human speech works.
1. The vocal cords -> oscillator
2. The human head and body function as -> resonators and thereby generate formants (resonant frequencies that stay the same no matter what pitch is played/sung). These can be made using the other 2 oscillators and determine the character of the voice: male, female, chipmunk etc.
3. The mouth is like a low-pass filter that determines which vowel sounds: try it out.. make all the vowels while making the opening of your mouth wider or smaller: you will hear that you pass through all the vowels automatically when you start making sound with your mouth wide open and then gently close it: A - O - U - I ...
4. Consonants are probably best made by using noise
also check this:
http://en.wikipedia.org/wiki/Speech_synthesis
Stunt
|
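The model in this post (oscillator, resonators for formants, noise for consonants) is the classic source-filter picture. A minimal Python sketch of it follows, using a pulse train as the vocal cords and two-pole digital resonators for the formants. Note that here the vowel is set by the formant resonators rather than by a single low-pass filter as in point 3. The formant frequencies and bandwidths are rough, textbook-style values chosen for illustration, the function names are hypothetical, and none of this is SID-specific.

```python
import math
import random


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator: boosts energy around freq_hz (one formant)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    b = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    c = -r * r
    a = 1.0 - b - c                      # normalise the gain at 0 Hz to 1
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def pulse_train(pitch_hz, n, sample_rate):
    """Voiced source: one impulse per pitch period."""
    period = int(sample_rate / pitch_hz)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def noise(n, seed=0):
    """Unvoiced source for consonant-like sounds."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def vowel(formants, pitch_hz=110, n=4000, sample_rate=8000):
    """Run the voiced source through one resonator per formant, in series."""
    signal = pulse_train(pitch_hz, n, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal


AH = [(700, 110), (1200, 120)]   # assumed rough formants for an "a"-like vowel
EE = [(300, 60), (2300, 150)]    # assumed rough formants for an "i"-like vowel
ah_samples = vowel(AH)
```

Swapping `AH` for `EE` changes the vowel colour while the pitch stays the same, which is the "formants stay the same no matter what pitch is sung" point from the list.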
| |
cadaver
Registered: Feb 2002 Posts: 1153 |
For nata, I guess anything less than the text-to-ringmod utility will not do, though. Maybe it's already been done, but kept secret? Hmmm? :-I |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Another speech example I did. Although not very appropriate for the demo it's used in, it fits the obvious music:
http://www.sinapism.net/goattrackersongs/TransForm_Speechtest.s..
it sounds a bit scandinavian, though. (as pointed out by Agemixer) |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
At 0:25 it says "go!" in this tune ;-)) At least it should have, but I didn't tweak it for too long.. just enough to sound like a low quality sample ;-).
http://www.c64.org/HVSC/VARIOUS/A-F/CreaMD/Go.sid
I once tried hard to make the sound of an old rusty door.. it was helluva hard and sounded more like a cow's moo... ;-)
Pacman eating pills and the ghost sounds in the background in "The Revenge of Big Emulator" were quite easy though.. ;-)) Okay, but that was coin-op to Commie, not human speech.
http://www.c64.org/HVSC/VARIOUS/A-F/CreaMD/Catch_That_Pacman_II..
anyway.. it sounds like a nice challenge.. do you think it's worth making a compo around it? ;-))) Like..
make the best rendition of a sentence... e.g. "C64 scene rulez!" ;)) Or better.. make a tune around that sentence too ;-) |
| |
dalezy
Registered: Apr 2002 Posts: 475 |
without having any knowledge and without wanting to shamelessly promote a crappy tune, but this one does make use of ringmodded vocals: http://www.c64.sk/files/sidcompo5/error_23.sid
ah .. what gives, that guy sucks at doing vox. a certain jch-depacking-tool might give an insight there tho. |
| |
Style
Registered: Jun 2004 Posts: 498 |
Roman, that pacman tune is the greatest thing ever! There's even a very Armalyte sounding instrument in there somewhere :) |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
Quote: Roman, that pacman tune is the greatest thing ever! There's even a very Armalyte sounding instrument in there somewhere :)
Thanx. I had fun doing it. ;-) |
| |
Hein
Registered: Apr 2004 Posts: 933 |
I like your Kung Fu Master tune. Inspiring sounds, esp. the bass drum is cool.
I think the idea of speechsynth compo is nice.. Hope there are people willing to spend an evening on 1 word. :) |
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
Quote: For nata, I guess anything less than the text-to-ringmod utility will not do, though. Maybe it's already been done, but kept secret? Hmmm? :-I
:D
Feel free to release such a tool. ;) |
| |
cadaver
Registered: Feb 2002 Posts: 1153 |
The C64 scene mafia would take me for a swim with concrete-boots, so can't do that :) You know, there are still things (such as certain editors, AND the secret mailswapping circle) the public isn't allowed to know about... |
| |
Style
Registered: Jun 2004 Posts: 498 |
people still mailswap? seriously?
|
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
Quote: The C64 scene mafia would take me for a swim with concrete-boots, so can't do that :) You know, there are still things (such as certain editors, AND the secret mailswapping circle) the public isn't allowed to know about...
Ahaa! :]] |
| |
Hate Bush
Registered: Jul 2002 Posts: 453 |
what truly sucks in c64 speech synthesis is the fact you have to sacrifice two channels for it, also the filter if you want it really intelligible.
from what i know, two north party v10 music compo entries will contain singer simulation (with lyrics) ;) |
| |
Hein
Registered: Apr 2004 Posts: 933 |
A singing cow:
http://www.sinapism.net/goattrackersongs/mooo.sid
aka TDJ's moped that won't start. |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
Quote: I like your Kung Fu Master tune. Inspiring sounds, esp. the bass drum is cool.
I think the idea of speechsynth compo is nice.. Hope there are people willing to spend an evening on 1 word. :)
Hey thanx! ;-) Btw. that moooin' sound is quite okay.. and now how about screeching dooors ;-)
Speech-synthesis compo.. who else is interested? ;-) |
| |
Hein
Registered: Apr 2004 Posts: 933 |
You can combine 2 compos, one where sidcreators make a speech, the other where listeners have to guess what the speech is saying. |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
;-))) I would rather have a hardcore speech-synthesis oriented music compo. Either thematic.. (everyone having to use the same sentence) Or free.. (but that could lead to "cheating" with simple words like... "go" ;-))). I also find a compo covering real songs + speech synthesis quite interesting. That "freestyler" by Agemixer has big potential. I tested it on my colleague today (stole some work time for shameless C64 promo ;-)) and he was quite amused by that cover. ;-) |
| |
chatGPZ
Registered: Dec 2001 Posts: 11114 |
ah didn't check this thread for a while... someone who is also looking into SAM contacted me, we shared what we got...and he made some nice progress with understanding how the thing works, maybe soon we can release something nice.
unfortunately though, i see no way of applying the methods used in SAM to ordinary sid voices...yet :=P |
| |
Hate Bush
Registered: Jul 2002 Posts: 453 |
hein: trrrransssssffoooorrm, great stuff. completely different approach. |
| |
MRT Account closed
Registered: Sep 2005 Posts: 149 |
Quote: You can combine 2 compos, one where sidcreators make a speech, the other where listeners have to guess what the speech is saying.
:-D LOL! True! |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
groepaz, so how about that SAM release? |
| |
chatGPZ
Registered: Dec 2001 Posts: 11114 |
oswald: patience.... the plan is to actually somewhat document the code to show how it works, make a decently relocatable version (that part is almost done, but monitor coding bites :=P), add alternative sound output to a sample buffer (also almost done) and last but not least transform the thing into readable and portable C (that's halfway done). it will take some time, but i think it will be worth the wait :) and of course i will reserve the right to use the work/time that went into it in some kind of production first before throwing it at the general public for abuse :=))) |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
great, I want to see a c64 demo sing :) |
| |
JackAsser
Registered: Jun 2002 Posts: 1989 |
@oswald: This one already sings: http://noname.c64.org/csdb/release/?id=37348. Maybe you missed that one? :D |
| |
ptoing
Registered: Sep 2005 Posts: 271 |
That one is just samples tho, no ringmods or sumsuch. |
| |
algorithm
Registered: May 2002 Posts: 702 |
I know this should be in a different thread.. But any examples of instruments (eg grand piano, electric guitar) on the c64? |
| |
Hate Bush
Registered: Jul 2002 Posts: 453 |
prime example of that, hammond organ:
http://gallium.prg.dtu.dk/HVSC/C64Music/VARIOUS/M-R/Moog/Hammon.. |
| |
Hate Bush
Registered: Jul 2002 Posts: 453 |
remembered one more. acoustic guitar (definitely not bad, mostly thanks to good strumming simulation):
http://gallium.prg.dtu.dk/HVSC/C64Music/DEMOS/Sundrench_Tidal_E.. |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Last one is nice, he forgot to tune his guitar. Sounds quite acoustic indeed. |
| |
Linus
Registered: Jun 2004 Posts: 638 |
http://hafnium.prg.dtu.dk/HVSC/C64Music/Follin_Tim/Music_Demo_V..
(subtune #27)
one's gotta love the organ, the flute and the percussions. brilliant jethro tull cover.
http://hafnium.prg.dtu.dk/HVSC/C64Music/Follin_Tim/Gauntlet_III..
(subtune #1)
amazing strings in the beginning. |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Let's not forget Ghouls n Ghosts; scream queen, howling wolf, anxious breathing inside hideout and all that. Tim's the king of storytelling sids. |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
jack, way too low quality there. The compromise is badly balanced between effects and digi. Doing it the right way is SAM imho. Visuals should be as low on cpu as possible, needs creativity to get it right. |
| |
chatGPZ
Registered: Dec 2001 Posts: 11114 |
indeed, my head is aching already :=) |
| |
Puterman Account closed
Registered: Jan 2002 Posts: 188 |
Awful or awesome, well, some people talk, others write code. :) |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
Quote: Let's not forget Ghouls n Ghosts; scream queen, howling wolf, anxious breathing inside hideout and all that. Tim's the king of storytelling sids.
I love Tim's "vacuum cleaner guitar" soundtrack in speccy game called Chronos. I doubt something like this can be covered on sid (at least not as powerfully as original) ;-)) |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
Puterman, and some ppl troll around forums. |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Quote: Puterman, and some ppl troll around forums.
You're number 5 on the list of active forum trolls. \o/ |
| |
TDJ
Registered: Dec 2001 Posts: 1879 |
Quote: You're number 5 on the list of active forum trolls. \o/
*proud to be #1* |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
http://www.c64.sk/temp/chronos-tapestry-of-time.zip
Here is that Chronos tune by Tim Follin (1.4 MB zipped MP3) .. don't be scared, it's a really weird sound (it's just the simple one-channel beeper of the ZX Spectrum, similar to a PC speaker), but it's a really interesting heavy metal sound anyway. Is it coverable on C64, what do you think? ;-)
|
| |
Puterman Account closed
Registered: Jan 2002 Posts: 188 |
Oswald: We should all do what we do best. |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
Creamd, that sound is the result of heavy limitations, and not really intended imho. on c64 u can do pretty much anything with digis in better quality than a 1 bit speaker... |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
Quote: Creamd, that sound is the result of heavy limitations, and not really intended imho. on c64 u can do pretty much anything with digis in better quality than a 1 bit speaker...
Oswald, I know why it sounds like that. I'm asking whether this tune can be covered on C64 using the 3 oscillator channels (not d418 4-bit digi) and all the effects available, keeping all its original "drive and balls" inside the tune! ;-) |
| |
Steppe
Registered: Jan 2002 Posts: 1510 |
CreaMD, check the endtune of the Animatron demo, there are some similarly fuzzy sounds in there. I think Linus can shed more light on how he did it. |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
Quote: CreaMD, check the endtune of the Animatron demo, there are some similarly fuzzy sounds in there. I think Linus can shed more light on how he did it.
Checked. Those sounds are quite easy to recreate, but will Linus make the tune, that's the question ;-). Some of the sounds had the potential to sound like the original, but the question is whether they would sound as good in melodical parts. I could try it too, but I don't think I have as much patience nowadays as I had in the past. |
| |
Linus
Registered: Jun 2004 Posts: 638 |
Quote:Those sounds are quite easy to recreate, but will Linus make the tune, thats the question ;-).
Well, if somebody can provide a midi file i'll give it a try. I am too lazy to convert it just by listening :)
Quote:Some of the sounds had the potential to sound like original, but question is if they would sound as good in melodical parts
They prolly won't sound as good in melodical parts. But then again it depends on the speed. 200hz minimum I'd say. |
| |
Linus
Registered: Jun 2004 Posts: 638 |
is there a midifile of that brilliant tune out there? google *is* my friend but i didn't find any.
anyone? |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
Quote: is there a midifile of that brilliant tune out there? google *is* my friend but i didn't find any.
anyone?
nope ;-( but there is an original Speccy file somewhere (I have it somewhere on my disk), but I don't know if it's useful.
roman |
| |
chatGPZ
Registered: Dec 2001 Posts: 11114 |
mmmmh. i must be missing something, to me this tune just sounds like a fuzzy digitrack. no more no less. |
| |
Hein
Registered: Apr 2004 Posts: 933 |
yes, I compared it to other speccy tracks as well.. this kind of crappydigithroughalousyspeaker music is not my favourite in the list of speccy tunes. (Possibly it isn't digi, I do not know the technical aspects of Speccy 48K/128K tunes, but it sounds bad nevertheless) |
| |
SIDWAVE Account closed
Registered: Apr 2002 Posts: 2238 |
Hein, is that Transform speech sid for release/finished ? |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
I would tend to believe that it is not a digitrack, considering Tim Follin's musical insanity. It's just 6 KB. But maybe it is; it could still be put together from some low quality guitar samples. Even that "quasi-drum" sounds sampled (that's for the 128K Speccy version; in the 48K version it sounds a bit like the MP3 I posted before).
http://www.c64.sk/temp/microspeccy-plus-chronos.zip (+ music player) (btw. sounds a bit better there) |
| |
Hein
Registered: Apr 2004 Posts: 933 |
Many tunes have this 'small pw' sort of sound. Maybe it's a very short wavesample.
Rambones, that tune is released, yes, in this forum. (But the speech is used in Trans*Form as well) |
| |
Laxity
Registered: Aug 2005 Posts: 459 |
Quote: Many tunes have this 'small pw' sort of sound. Maybe it's a very short wavesample.
Rambones, that tune is released, yes, in this forum. (But the speech is used in Trans*Form as well)
Do you mean similar to the GameBoy with its 16 or 32 (?) bytes wave sample thing? |
| |
Hein
Registered: Apr 2004 Posts: 933 |
mmm, 16 bytes. That's really not much. Possibly the reason why its pitch is so high. Mysteries, mysteries, spooky Spectrum. |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
European_5-A-Side.sid - tune #2 crowd cheering, #3 whistle ;-)
http://www.c64.org/HVSC/Barrett_Steve/European_5-A-Side.sid |
| |
CreaMD
Registered: Dec 2001 Posts: 3034 |
Just recalled JCH's Give it up ;-)
http://www.c64.org/HVSC/JCH/Orcus_giv_it_up.sid
Was that Orcus game actually finished? Must check. |
| |
Tch Account closed
Registered: Sep 2004 Posts: 512 |
Quote: Just recalled JCH's Give it up ;-)
http://www.c64.org/HVSC/JCH/Orcus_giv_it_up.sid
Was that Orcus game actually finished? Must check.
Or his "Bonzai" attempts.. ;)
"Give it up" is pretty cool. |
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
I'm wondering how KB's speech-synth (included in farbrausch_v2) works. Sure it's a PC tool, but who knows what can be done with the SID...
|
| |
Krill
Registered: Apr 2002 Posts: 2839 |
Speech synth can be done using SID voices, no doubt about that. And, Nata, just use teh intarweb, and you'll find lots of papers dealing with the topic of speech synthesis. |
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
Quote: Speech synth can be done using SID voices, no doubt about that. And, Nata, just use teh intarweb, and you'll find lots of papers dealing with the topic of speech synthesis.
Sorry, but I do not want to create Speech-Sounds with SID. (at least not by hand) :P |
| |
Krill
Registered: Apr 2002 Posts: 2839 |
Whatever, but those papers explain how KB's speech synth works, which you wondered about. |
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
Quote: Whatever, but those papers explain how KB's speech synth works, which you wondered about.
Link? |
| |
trident
Registered: May 2002 Posts: 74 |
Quote: Link?
http://www.google.com |
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
Quote: http://www.google.com
:D ;) |
| |
Krill
Registered: Apr 2002 Posts: 2839 |
Also, those papers are so not about doing it by hand. That's why it's called synthesis, and not fiddling-with-ringmod-and-waveforms-until-it-somehow-sounds-close-to-understandable. :)
The magic term for google there is "formant speech synthesis," btw., with emphasis on formant. |
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
In fact, it would be interesting to see a high-quality speech-synth tool that makes it quite easy to create spoken words.
Benefit: It could be used quickly for demos without music (or something in between).
Of course, many, many, many C64 people had the same idea a long time ago... |
| |
chatGPZ
Registered: Dec 2001 Posts: 11114 |
Quote: In fact, it would be interesting to see a high-quality speech-synth tool that makes it quite easy to create spoken words.
what's wrong with S.A.M.? |
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
Quote: Quote: In fact, it would be interesting to see a high-quality speech-synth tool that makes it quite easy to create spoken words.
what's wrong with S.A.M.?
Can you tell me a quality demo that uses SAM? |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
SAM uses up all CPU power, so you can't show any fx while it's talking -> no demo uses it. |
| |
Bamu® Account closed
Registered: May 2005 Posts: 1332 |
Quote: SAM uses up all CPU power, so you can't show any fx while it's talking -> no demo uses it.
Indeed!
There must be another way to create quality speech without losing cpu power. |
| |
MagerValp
Registered: Dec 2001 Posts: 1055 |
Yes, ring mod, which was discussed a few posts ago...
|
| |
chatGPZ
Registered: Dec 2001 Posts: 11114 |
Quote: SAM uses up all CPU power, so you can't show any fx while it's talking -> no demo uses it.
my reversed version renders into a buffer so you can play the sample while other stuff is going on... however, this means lower quality (SAM originally uses very tight timing, meaning a pretty high sample rate). oh well... guess cleaning it up and releasing it will be the next thing after my disk-transfer-mayhem =P |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
groepaz, you've been promising that for a year or two. |
| |
chatGPZ
Registered: Dec 2001 Posts: 11114 |
it's no secret that i am lazy =D
however, it's really almost done now... someone else has made a proper (even fully working) translation to ANSI C, and my reversed version also has no more (known) bugs.... so yes, it will definitely be the next thing after the remaining 1000 disks are transferred. |
| |
Jammer
Registered: Nov 2002 Posts: 1289 |
guys, this c-64 'speech synthesis' is made with sync not ring-mod ;) |
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
Some detailed info about Strobosphere 2:
- It derives from "Strobosphere". It is single-speed.
- I reconstructed some instruments for cleaner and tougher sound+melody, having darker and more trancy feeling than V1
- I was asked for a tune for "Halfway there" by Dekadence
- The middle tranquilizing part was too boring, so i decided to surprise a bit by adding filter speech saying "StRRobosFeerrr - HHaaaweiii tHeeeerrr" to promote the demo |
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
The speech in Strobosphere 2 was achieved by:
* A simple, narrow "triggering" pulse with bass frequency and high sustain
* A lowpass filter with highest resonance, using few sweep lists with
accurate timing.
* The "trigger pulse" should be narrow enough to give power only to the
filter, reducing that unwanted "bass boom".
* The "R" sound was achieved with either $41/$40-waveloop (=with ADSR) or
$41/$01-waveloop (=with waveform), can't remember exactly.
* Made up the word-starting instruments+notes, split to 7 parts (IIRC,
why do i remember it was just 4):
"S", "trobo", "S", "fere", "Halfway", "There"
* Note that the speech is ONLY cut in places where those stronger,
"hitting" consonants should take place! :)
* Finally added a slow slide-down to each word part. That's it!
That vocal phrase was all made up in less than an hour. |
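As a rough illustration of the recipe above, here is a minimal Python sketch that only computes the SID register values involved: a narrow pulse on voice 1, routed into the lowpass filter at maximum resonance, with the cutoff stepped through a sweep list. The register addresses are the standard SID ones; the note frequency, pulse width and sweep values are made-up placeholders, not Agemixer's actual instrument data.

```python
# Sketch only: per-call SID register values for a "narrow pulse into
# resonant lowpass" sound. All instrument numbers are hypothetical.

def voice1_setup(freq=0x0450, pulse_width=0x040):
    """Narrow pulse on voice 1 with full sustain and the gate on."""
    return {
        0xD400: freq & 0xFF,         # frequency low byte
        0xD401: freq >> 8,           # frequency high byte
        0xD402: pulse_width & 0xFF,  # pulse width low byte
        0xD403: pulse_width >> 8,    # pulse width high nibble (12-bit value)
        0xD405: 0x00,                # attack 0, decay 0
        0xD406: 0xF0,                # sustain 15, release 0
        0xD404: 0x41,                # pulse waveform + gate
    }

def filter_frame(cutoff, resonance=15, volume=15):
    """Lowpass on voice 1; cutoff is the 11-bit SID cutoff value."""
    return {
        0xD415: cutoff & 0x07,            # cutoff bits 0-2
        0xD416: (cutoff >> 3) & 0xFF,     # cutoff bits 3-10
        0xD417: (resonance << 4) | 0x01,  # resonance + route voice 1 to filter
        0xD418: 0x10 | volume,            # lowpass mode + master volume
    }

# Hypothetical sweep list: one cutoff value per player call.
SWEEP = [0x100, 0x180, 0x240, 0x300, 0x2C0, 0x200]
frames = [filter_frame(cutoff) for cutoff in SWEEP]

for regs in frames:
    print(" ".join(f"${addr:04X}=${val:02X}" for addr, val in regs.items()))
```

A real sweep list would have to be tuned by ear per vowel, and the cutoff curve differs a lot between 6581 and 8580 chips.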
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
There is a GOOD reason why i haven't dared to release the lyrics for Da Shit Eastwood Jacks... Why do you think the whole thing is censored to death? :D I could have been disqualified from Assembly 2004!
For the same reason i'm not copypasting the lyrics to the forum... :) |
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
Da Shit Eastwood Jack's Red Cocks, Green Hens and Nat Hubs:
Lyrics:
http://skalaria.japo.fi/Da-Shit-Eastwood-Jacks-lyrics.txt
(first turn your browser to View-> character encoding = UTF-8. On firefox, press Alt.) |
| |
Oswald
Registered: Apr 2002 Posts: 5017 |
freestyler is still the best imho :) |
| |
Jammer
Registered: Nov 2002 Posts: 1289 |
Well, I find it kinda obsolete and complicated in comparison with $57 sync-mod, considering its results ;) |
| |
algorithm
Registered: May 2002 Posts: 702 |
As a quick test and some fun, I have used just a single sid channel on the "diner" speech example.
Result as below
https://www.dropbox.com/s/l9am1u8vhmpkl8g/diner1chan.prg?dl=0
Certainly would require more channels to recreate speech more accurately. Below is using 3 sid channels
https://www.dropbox.com/s/l3eih48rnfbw682/diner3chans.prg?dl=0
if using real c64 or emulator, use new sid. Will sound crackly on old sid (although can be rectified by updating and interpolating value in d418) |
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
Jammer: Hey, the 6581-8580.com mirrors seem to be down. Curious. Is there a chance there's an mp3 of the famous "Hot Mommas" recorded on your SID? :) (haven't xferred that sid to a real C64 yet, so i have no idea)
Which other tunes would you prefer to be heard, using your $57 wave speech technique?
Haven't tried it myself, but $57 looks like too many variables - how can you predict how the voice behaves? Sounds like thousands of choices though (waveforms*notes*notes*detuning*pulses?) How can you imagine that combined sync+mod waveform? Or did you just try manually until you got the wanted waveform..? |
| |
Jammer
Registered: Nov 2002 Posts: 1289 |
On real SID $57 sounds a bit thinner, still it works properly ;) I cannot transfer it right now, but I'll record it for sure. Here are couple of my tunes with $57, however 'Hot Mommas' is my best speech effort so far:
- http://hvsc.perff.dk/MUSICIANS/J/Jammer/For_Jazzcat_2k6.sid - relatively the weakest example but 'Let's go' can be heard with a bit of imagination ;)
- http://hvsc.perff.dk/MUSICIANS/J/Jammer/Galaxy_Bounce_Edit.sid - simulation of original sampled shouts
- http://hvsc.perff.dk/MUSICIANS/J/Jammer/Sweet_Infection.sid attempt at 'Sweet Infection' phrase, you have to wait over 2 minutes, unfortunately ;)
- http://hvsc.perff.dk/MUSICIANS/J/Jammer/Mr_Marvellous.sid - my first reasonable effort in this field
- http://hvsc.perff.dk/MUSICIANS/J/Jammer/HVSC.sid - tune for HVSC's 10-year anniversary with my short article on the technique
Apart from more advanced techniques like Algo's player or Radwar's example, $57 gives IMHO the best and most audible results, yet it sounds more like a robotic vocoder than proper speech (well, in 'Hot Mommas' I've managed to make it a bit less robotic thanks to hipass/band at the beginning) ;) $57 is quite simple to use - you just treat the base pitch as ... pitch :D and the modulation pitch as the formant. $57 gives the best results in the 6th and 7th octaves, but it's more a matter of tweaking. Still, you can achieve clear vowel sounds such as oo, eh, ae etc. ;) In noise frames you either cease modulation or set it to an appropriate value that matches the noise sound. I'm pretty sure you'll make even better use of it, especially since I haven't added dynamic filtering to speech so far ;) |
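To show why "base pitch as pitch, modulation pitch as formant" works, here is a small Python sketch of hard sync between two 24-bit phase accumulators. It is a simplified model under stated assumptions: it uses a plain triangle instead of the combined pulse+triangle waveform, it ignores ring modulation, and the increment values are arbitrary examples. The synced oscillator is reset whenever the top bit of the other oscillator's accumulator rises, so the output always repeats at the pitch oscillator's rate, while the second increment only changes the shape of each cycle.

```python
# Simplified hard-sync model: not a full SID emulation.
ACC_BITS = 24
ACC_MASK = (1 << ACC_BITS) - 1
MSB = 1 << (ACC_BITS - 1)

def triangle(acc):
    """Fold the top 12 accumulator bits into a 12-bit triangle value."""
    top = acc >> 12
    if top & 0x800:
        top ^= 0xFFF
    return (top << 1) & 0xFFF

def hard_sync(pitch_inc, formant_inc, steps):
    """Return triangle samples of an oscillator synced to a pitch oscillator."""
    master = slave = 0
    out = []
    for _ in range(steps):
        prev_msb = master & MSB
        master = (master + pitch_inc) & ACC_MASK
        slave = (slave + formant_inc) & ACC_MASK
        if (master & MSB) and not prev_msb:
            slave = 0  # rising top bit of the pitch oscillator resets the other one
        out.append(triangle(slave))
    return out

# Hypothetical increments: the pitch oscillator wraps every 256 steps.
vowel_a = hard_sync(pitch_inc=65536, formant_inc=300000, steps=1024)
vowel_b = hard_sync(pitch_inc=65536, formant_inc=500000, steps=1024)
print(vowel_a[127:135])
print(vowel_b[127:135])
```

Both outputs have the same period (the pitch), but a different waveform within each period, which is what changes the perceived vowel colour.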
| |
Mixer
Registered: Apr 2008 Posts: 422 |
Just a bit of reasoning out loud here:
$57 is a waveform of pulse and triangle; it uses the previous voice's oscillator frequency as the sync modulation source.
Idea: Instead of the voice sync one could do a timer IRQ sync and reset the test bit (do the sync) at any desired frequency to generate the formant frequency for speech.
Ringmod still uses the previous voice's output.
Using timer syncmod and voice ringmod together allows 2 different frequency modulations on the same voice.
It would be interesting to know if it is possible to generate any/better/worse vowels with this method.
(yeah, gotta test this some night) |
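For reference, the waveform values mentioned in this thread ($57, $51, $41, $40, $01) are just bit patterns in the SID voice control register. A minimal Python sketch that decodes them, using the standard bit layout of that register (the function name is made up for illustration):

```python
# Bit layout of the SID voice control register ($D404, $D40B, $D412).
CONTROL_BITS = [
    (0x01, "gate"),
    (0x02, "sync"),
    (0x04, "ring"),
    (0x08, "test"),
    (0x10, "triangle"),
    (0x20, "sawtooth"),
    (0x40, "pulse"),
    (0x80, "noise"),
]

def decode_control(value):
    """Return the names of all bits set in a control register value."""
    return [name for bit, name in CONTROL_BITS if value & bit]

for value in (0x57, 0x51, 0x41, 0x40, 0x01):
    print(f"${value:02X}: {' + '.join(decode_control(value))}")
```

So $57 is pulse + triangle with ring mod, sync and gate all set, while $51 is the same combined waveform without the modulation bits.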
| |
Jammer
Registered: Nov 2002 Posts: 1289 |
Right but I assume that we are talking about solutions within popular editors, not custom pieces of code ;) |
| |
Mixer
Registered: Apr 2008 Posts: 422 |
Jammer, you're right on that.
Time to upgrade them editors, so the new stuff is available for all musicians.
If only the general attitude would change from calls per frame thinking to free timing/bpm. Most tunes won't be heard in demos or games that require per frame sync anyway so "calls per frame" thinking is unfounded.
However, this discussion is about speech, pardon my derailing. |
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
Maybe I will include a small assembler in AgentRacker then hehehe... |
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
Mixer: Exactly - that's why... - if there are no variable-speed trackers around with ultra flexibility, i have to work on my own AgentRacker. |
| |
SIDWAVE Account closed
Registered: Apr 2002 Posts: 2238 |
check this short video
https://www.youtube.com/watch?v=LMfNzQFMaeA&index=1&list=UUgOH6..
how do i begin in the right way ?
my goal is to make it say something..
like "sidwave" would be a nice thing.
how to start properly, make an S ?
and how to proceed ?
i can poke around endlessly with notes and waveforms,
but i hit good stuff only at random.
in the video, i especially like the really low scratch-bass sound. but i don't even know what triggers it..
?
Jammer help! |
| |
Jammer
Registered: Nov 2002 Posts: 1289 |
I hear you're quite close in this example. Try $51 for the carrier waveform. First of all - the pitch difference between the carrier (first channel) and the modulator (the following channel) cannot be too vast - in practice you cannot make the carrier very low - aim rather at octaves 3 and 4. If you have a C-2 carrier and a C-6 modulator, it won't do the trick yet :) Achieving an exact phoneme is kind of trial and error, unfortunately. The higher the carrier is, the higher the modulator has to be as well to make the sound prominent. I haven't tested 6581 here, unfortunately, and I assume that $57 sounds different there and you might make nice use of saturation to achieve some details.
To make S sound you... put a noise basically :D:D:D |
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
Hmm, is there some limit for a reply in CSDb? I think my msg visited cyberspace, no idea if it wanted to go that way.. |
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
Trying again.. shortly:
Easy stuff! :)
It's easier to start with humming - make the carrier voice out of that.
Then, on the channel to the "right of the carrier", add your vocal pitches, starting with "A": try out all the half notes and pick the one that's nearest. Do the same for the AEIOUY vocals - and also some consonants like J and M.
Consonants might be harder: Make a new instrument to hit a new "hard consonant" - should be very easy with SDI and its amount of HR settings! Should make a big difference!
And easy with SDI: you can see both channels side by side.. I used Sync, and there you can't see your tones in a WYSIWYG way.
You need to tie subsequent vocal tones.
For S, F, and such tones you definitely need noise (just like Jammer said). |
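The "try all the half notes and pick the nearest" step can also be sketched numerically, assuming you already have a target frequency in mind for the second channel. The Python snippet below finds the closest equal-tempered note and its SID frequency register value for a PAL machine (clock 985248 Hz, register = Hz * 2^24 / clock). The note naming (A-4 = 440 Hz) and the target values are assumptions for illustration; trackers may number octaves differently, and in practice this is done by ear.

```python
import math

PAL_CLOCK = 985248  # PAL C64 system clock in Hz
NOTE_NAMES = ["C-", "C#", "D-", "D#", "E-", "F-", "F#", "G-", "G#", "A-", "A#", "B-"]

def nearest_note(target_hz):
    """Return (name, frequency in Hz, SID register value) of the closest half note."""
    semis = round(12 * math.log2(target_hz / 440.0))  # semitones away from A-4
    hz = 440.0 * 2 ** (semis / 12)
    index = semis + 57  # A-4 sits 57 semitones above C-0
    name = f"{NOTE_NAMES[index % 12]}{index // 12}"
    sid_value = round(hz * 2 ** 24 / PAL_CLOCK)
    return name, hz, sid_value

# Hypothetical target frequencies, chosen only to exercise the function.
for target in (262.0, 440.0, 700.0, 1200.0):
    name, hz, sid_value = nearest_note(target)
    print(f"{target:7.1f} Hz -> {name} ({hz:.1f} Hz), SID ${sid_value:04X}")
```

Note that the 16-bit register tops out a little below 4 kHz on PAL, so higher targets cannot be reached with a single oscillator.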
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
The basic idea is real singing. Sing some easy, long vocal and try to produce as similar a voice as possible with SDI instruments. It is easier than you think (with SDI).
One good vocal example is Retro Gold Love (lousy consonants):
MUSICIANS/A/Agemixer/007-Retro_Gold_Love_T01.sid_CSG8580R5.mp3
Then about multispeed - I find it a particular "must" for singing, because the vocal speed must be faster in resolution than for most usual instruments - this mostly applies to consonants.
Then there's one very important secret that applies to human singing - i don't want to spoil it - you will find that simple secret anyway if you are into vocals, so if you know... don't tell! :) |
| |
Soren
Registered: Dec 2001 Posts: 547 |
The only time I fiddled with it on c64, I am pretty sure I did it like I did on a synth of mine - saw wave+bandpass filter, resonance, ringmod... for the vowels.
I am not a fan of this kind of speech synth on c64 - it usually just doesn't sound good enough to me.
It's more fun for me to create other types of instruments ;) |
| |
2bt
Registered: Jun 2021 Posts: 9 |
I was fiddling with the C port of SAM a while ago (https://github.com/vidarh/SAM). If you run it with -debug, it will print out lots of interesting information. The last part is a nice table with all the data that is used by the backend to generate the audio. This is the table for "hello".
flags  ampl1  freq1  ampl2  freq2  ampl3  freq3  pitch
------------------------------------------------------
7C     0      14     0      73     0      93     57
0      4      16     3      70     2      92     56
0      13     18     11     66     4      91     55
0      13     18     11     66     4      91     55
0      13     18     9      60     4      94     55
0      13     17     6      48     2      102    56
0      11     16     4      36     1      110    56
0      11     16     4      36     1      110    56
0      11     16     4      36     1      110    56
0      11     16     4      35     1      106    56
0      13     17     6      33     1      97     56
0      15     18     9      30     0      88     55
0      15     18     9      30     0      88     55
0      15     18     9      30     0      88     55
0      15     18     9      30     0      88     55
0      15     18     9      30     0      87     55
0      15     16     8      30     0      85     56
0      13     15     6      29     0      83     57
0      13     13     5      29     0      81     58
0      11     12     4      28     0      80     58
0      11     12     4      28     0      80     58
0      0      0      0      0      0      0      0
0      0      0      0      0      0      0      0
0      0      0      0      0      0      0      0
0      0      0      0      0      0      0      0
0      0      0      0      0      0      0      0
...
Each row represents a time slice. The columns are the flags (which indicate noise), the amplitude and frequency of three formants, and the voice pitch. The final audio signal is the sum of these three formant oscillators, and they are synced with the pitch.
It turns out that the third formant isn't really that important. Neither is amplitude, really. What's left maps well to the SID. Voice 1 for the pitch and voices 2 and 3 for the first and second formant. Sync voice 2 to voice 1 and sync voice 3 to voice 2 (bummer, since syncing voice 3 to voice 1 is not possible, but that is ok).
I managed to render a 1x SID that sounds a lot like SAM (http://langnerd.de/sid/sam-test.sid). It has some rough edges, but I think it shows that the SID is totally capable of rendering understandable speech without samples. In my estimate, we have only really scratched the surface of what's possible. |
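A minimal Python sketch of the reduction described above: parse the debug table, drop the amplitudes and the third formant, and keep one (pitch, formant 1, formant 2, noise) tuple per time slice for voices 1, 2 and 3. This is not 2bt's actual converter; the function names are hypothetical, treating any non-zero flags value as "noise" is an assumption, and the values stay in SAM's internal units because the scaling to SID frequency registers is not given in the post.

```python
# A few rows copied from the table above, including the header lines.
SAM_DEBUG = """\
flags ampl1 freq1 ampl2 freq2 ampl3 freq3 pitch
------------------------------------------------
7C 0 14 0 73 0 93 57
0 4 16 3 70 2 92 56
0 13 18 11 66 4 91 55
0 0 0 0 0 0 0 0
"""

COLUMNS = ("ampl1", "freq1", "ampl2", "freq2", "ampl3", "freq3", "pitch")

def parse_frames(text):
    """Parse debug rows into dicts; header and separator lines are skipped."""
    frames = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 8:
            continue
        try:
            frame = {"flags": int(parts[0], 16)}  # flags are printed in hex
            frame.update(zip(COLUMNS, map(int, parts[1:])))
        except ValueError:
            continue  # not a data row (e.g. the column header)
        frames.append(frame)
    return frames

def voice_plan(frames):
    """Keep only what maps to the SID: (pitch, formant 1, formant 2, noise)."""
    return [(f["pitch"], f["freq1"], f["freq2"], f["flags"] != 0) for f in frames]

for row in voice_plan(parse_frames(SAM_DEBUG)):
    print(row)
```

Each tuple would then drive voice 1 (pitch), voice 2 synced to voice 1 (formant 1) and voice 3 synced to voice 2 (formant 2), once per player call.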
| |
Frantic
Registered: Mar 2003 Posts: 1627 |
Quote:we have only really scratched the surface of what's possible
Most definitely. Actually, the player code in defMON started out as part of a project to play "voice data" similar to what you have done (nicely!) here. Would be interesting to see what people who are skilled in this particular area can come up with. I think some quite unbelievable things would be possible. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1378 |
Nice work, 2bt.
I must admit, I was already thinking the original SAM port to the c64 has a lot of scope for optimisation/quality improvement, even just as a soft synth.
Certainly still many things to try :) |
| |
Jammer
Registered: Nov 2002 Posts: 1289 |
Speaking of 'traditional' synth speech - bandpass filtering helps a lot with emphasizing phonemes :) As JCH tested once - it might even save one channel with quite satisfying results. Still, that leaves no filter for anything else, unless it can share the setting:
https://www.youtube.com/watch?v=tZOnXX5p7Yo |
| |
morphfrog
Registered: Mar 2002 Posts: 32 |
When speaking about speech synthesis, we can't fail to mention Viznut's series of VIC-20 demos like Robotic Liberation, with singing! Just wow. |
| |
Agemixer
Registered: Dec 2002 Posts: 38 |
Hi guys, while composing "BassOnAVajo" i noticed SidWizard can drive the resonance value each call! This could be of significant use for creating more realistic one-channel speech.
Put to good use, i believe it COULD be the most significant multispeed parameter for vocals... but i could be wrong.. haven't tested it more than roughly, and it seems to differ less between SID versions than other filter parameters.
Along with other waveform / pitch driving parameters + ADSR triggering ofcourse.
Who is teh first one? Happy testing! :D |
| |
Jammer
Registered: Nov 2002 Posts: 1289 |
Quoting Agemixer: SidWizard can drive the resonance value each call!
So can Goat Tracker ;) |
| |
Linus
Registered: Jun 2004 Posts: 638 |
Quote: Hi guys, while composing "BassOnAVajo" i noticed SidWizard can drive the resonance value each call! This could be of significant use for creating more realistic one-channel speech.
Put to good use, i believe it COULD be the most significant multispeed parameter for vocals... but i could be wrong.. haven't tested it more than roughly, and it seems to differ less between SID versions than other filter parameters.
Along with other waveform / pitch driving parameters + ADSR triggering ofcourse.
Who is teh first one? Happy testing! :D
Since the resonance on SID chips is not very strong I am not sure it can be utilized as much as you might think, tho. It would be a whole different story if it could self oscillate, of course. |
| |
2bt
Registered: Jun 2021 Posts: 9 |
Jammer's Happy Tree Friends is very impressive. I presume it's all handcrafted sounds, is that correct?
For some time I have been trying to generate something intelligible directly from speech audio. It's a hard problem, and my results are actually rather disappointing. It looks like the model SAM uses, i.e. one voice for the pitch plus two synced voices for formants, is about as good as it's gonna get. The challenge is finding the best parameters (frequencies of these three voices) given a speech signal to recreate. It's kind of a bummer that the SID doesn't offer direct volume control here.
Has anyone else tried generating speech from speech audio? |
| |
JackAsser
Registered: Jun 2002 Posts: 1989 |
Quote: Jammer's Happy Tree Friends is very impressive. I presume it's all handcrafted sounds, is that correct?
For some time I have been trying to generate something intelligible directly from speech audio. It's a hard problem, and my results are actually rather disappointing. It looks like the model SAM uses, i.e. one voice for the pitch plus two synced voices for formants, is about as good as it's gonna get. The challenge is finding the best parameters (frequencies of these three voices) given a speech signal to recreate. It's kind of a bummer that the SID doesn't offer direct volume control here.
Has anyone else tried generating speech from speech audio?
The trick imo is to display the text on screen synched. Then you’ll get away with almost anything. |
| |
2bt
Registered: Jun 2021 Posts: 9 |
Haha, I guess that's true. But I really want to aim higher than that! |
| |
chatGPZ
Registered: Dec 2001 Posts: 11114 |
Quote:The trick imo is to display the text on screen synched
this :) |
| |
acrouzet
Registered: May 2020 Posts: 80 |
That SAM speech is impressive! Now I'm thinking about how a 2xSID tracker that would let you add a SID vocal track might work. |