This is pretty neat. Might be useful for generating voiced dialogue. I imagine it's not that easy to get quality output, depending on how complex / emotive you want the dialogue to be. I wonder if it can be adapted to emulate ToEE's NPCs so we can fill in the additional unvoiced dialogue. Sounds like something @Gaear would want to play with. I saw a similar AI thing for portraits, but unfortunately couldn't get it to generate ToEE-like portraits.
I haven't seen this ^ but Artbreeder can now generate endless unique portraits, for everything, based on existing portraits. Amazing how far technology has come.
Been following this. It's an impressive project, still being actively developed, and you can even grab it on Steam now: https://store.steampowered.com/app/1765720/xVASynth_v2/ About generating voices based on ToEE sound files (AKA training voice models): this capability isn't publicly available right now, since the toolset to do that is too complicated for the normal user. Dev's working on a streamlined tool called xVATrainer, however. It's pretty far along now, seems like it should be available in the next few months. Will keep an eye on it. On a personal note, back when I tried my hand at modding ToEE quests (the Traders mod etc), one of the things that held me back was that I didn't want to add too many lines to voiced NPCs, since the unvoiced lines would contrast heavily with the original content. Not that I plan to revisit that, but it'd be so cool if that problem was finally solved once and for all.
xVATrainer is progressing more rapidly than I thought! Steam release is tentatively slated for April 8.
In a rare case of under promise, over deliver, release is set to April 1 https://store.steampowered.com/app/1922750/xVATrainer/ Check out the showcase video: Looks very cool. Deep learning at your fingertips!
It's out! Sadly, it looks like my GPU is not up to the task... I guess it's pretty old by now. I asked around if there's a cloud solution and apparently someone already set up a Colab notebook: https://colab.research.google.com/drive/1YqNFvFZRAUbZ1xAFJNXVTGxn1C6tOfVQ?usp=sharing I'm trying that out now, training a voice model based on Burne. Also got the Colab Pro subscription, it's just $10 a month (and the shekel is strong now, hehe). We'll see how it turns out. (Note that the dataset preparation still has to be done in the xVATrainer framework, but it's fairly straightforward.)
A few notes in case anyone else is interested:
1. Keep the sound clips shorter than 10s. You can use the diarization tool for that (e.g. put all the >500kB files in the input folder and run it on them). It will split 'em up for you. I restarted the training from scratch because some of the lines were 17s long; after shortening them, the training time was around 50% shorter, and that's despite adding even more lines to the dataset.
2. You don't have to parse the dlg files. The auto-transcribe tool is pretty good, so you can make do with it plus manual correction here and there. It doesn't have to be perfect anyway.
3. Training takes a really long time. It depends on hardware in general, but you're looking at 2-3 days minimum, and that's with a high-end GPU (this is well known and not particular to my setup/dataset or anything like that). Training in Colab Pro, you get a P100 with 16GB of VRAM. FastPitch model stages 1-3 took me 13 hours, and the notorious stage 4 is still running (currently about 2 days in, and it looks like it could take another day or two to finish). This is why I wouldn't recommend doing it on your PC, unless you have a really beastly GPU, like RTX 3090 or whatever with over 20GB VRAM.
4. If there's enough demand for ToEE voice models, I could arrange a one-month subscription for Colab Pro+, which has even higher-end hardware (40GB GPU), so I could probably train a whole bunch of voices in a short span of time. But I guess we should wait and see how Burne turns out first. I saw others arranging a similar setup on the xVA Discord, perhaps we could hitch a ride on that as well.
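For note 1, the >500kB rule of thumb is only a rough proxy for clip length, so here is a minimal sketch that checks the actual duration instead. It assumes the clips are uncompressed WAV files sitting in one folder; the function names and the folder layout are made up for illustration and are not part of xVATrainer. It only lists the clips that are too long, the splitting itself is still left to the diarization tool.

```python
import wave
from pathlib import Path

MAX_SECONDS = 10.0  # the clip length limit from note 1


def clip_duration_seconds(path):
    """Return the length of a WAV file in seconds (frames / sample rate)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def find_long_clips(folder, max_seconds=MAX_SECONDS):
    """Return (filename, duration) pairs for WAV clips longer than max_seconds.

    The result is sorted longest first, so the worst offenders show up on top.
    """
    long_clips = []
    for path in Path(folder).glob("*.wav"):
        duration = clip_duration_seconds(path)
        if duration > max_seconds:
            long_clips.append((path.name, duration))
    return sorted(long_clips, key=lambda item: item[1], reverse=True)
```

Calling `find_long_clips("path/to/clips")` (hypothetical path) gives you the list of files to feed to the diarization tool; anything not in the list is already short enough.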
It's good that I abandoned this, because it looks like it's going to be obsolete: https://beta.elevenlabs.io/ AI technology has really come a long way in such a short period of time. I think we're not far off from an AI that could take a PnP module PDF and completely generate a computer game.
The AI to create a module, you to turn in your list of demands, and the AI to delete itself in disgust.