My latest Starbound Renegade single has been released! It’s called Just a Sloth on a Mystical Journey and I used AI tools to help me create it.
If you haven’t noticed, AI tools like ChatGPT have been taking the world by storm over the last few months (for better or worse). I got a chance to tinker around with Midjourney back in November, to much horror and delight. And I finally got around to trying ChatGPT, to my utter amazement.
The topic of artificial intelligence is very trendy and controversial right now. I’ve been following the culture war between artists and big tech over whether AI is a good thing or a bad thing. As an artist, I see both sides of the issue, and there’s a lot of grey area. I was really against it at first, for many reasons. But I wanted to try it out and get firsthand experience. It was mind-blowing (and hilarious at times), but also creepy and dystopian. I’m still not sure how I feel about it. Only time will tell. But the cat’s out of the bag, and AI is going to be part of our lives from now on.
Here’s a huge list of AI tools for you to check out later.
Before I get into how I made it, here’s a few early images I generated using text-to-image generators like Midjourney and Stable Diffusion.
Slideshow of AI Generated Images
How did I make this song?
I heard about ChatGPT and finally decided to check it out. My first prompt was “write me some lyrics for a pop punk tune about a sloth on a mystical journey.” What it spit out was quite impressive honestly. It was my first experience with ChatGPT and I was blown away. I immediately started changing some of the lyrics to make it more my own. I never would have even thought to make this song, but now here I am with an idea to start with.
Ok let’s see where this goes.
So then I asked ChatGPT if it could write a chord progression for me, and voilà, it did. It suggested I use G, C, and D with some variations. Ok, easy enough. Why not, let’s do it. What could go wrong?
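For the musically curious: G, C, and D are the I, IV, and V chords in the key of G major, about as classic pop punk as it gets. Here’s a little Python sketch (my own illustration, not anything ChatGPT generated) of how those triads map to MIDI note numbers:

```python
# Spell out the G, C, and D major triads as MIDI note numbers.
# MIDI note 60 is middle C; each semitone up adds 1.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def major_triad(root_midi):
    """Root, major third (+4 semitones), and perfect fifth (+7 semitones)."""
    return [root_midi, root_midi + 4, root_midi + 7]

# The I, IV, V chords in G major, rooted around middle C.
progression = {"G": major_triad(55), "C": major_triad(60), "D": major_triad(62)}

for chord, notes in progression.items():
    names = [NOTE_NAMES[n % 12] for n in notes]
    print(f"{chord}: {notes} -> {names}")
```

Those are roughly the note numbers you’d end up dropping into a piano roll.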
So I went into FL Studio and started arranging my track in the key of G and went for something really simple and catchy. I played a melody on top that would eventually become the vocal part. But I don’t really like to sing so I needed to find something else to sing for me. AI perhaps??
I searched around and found some VSTs that could simulate vocals (I’ve used AlterEgo before and it was ok). But I came across a new one I’d never heard of before called Synthesizer V. Wow, this was incredible! You just type in the lyrics and it sings them for you. The voices were actually better than I expected. The cool part was that I could export my lead melody’s MIDI data and place my lyrics onto it, then tweak the cadence until it was just how I liked it. I even added a second vocalist for harmonies. Dang, this was fun. I exported the vocals as audio files so I could import them into my DAW.
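Under the hood, what I was doing with that MIDI data is conceptually simple: every note event in the melody gets a syllable attached to it. Here’s a toy Python sketch of the idea. The notes and lyrics are made up for illustration, and this has nothing to do with Synthesizer V’s actual file format:

```python
# Toy sketch: pair lyric syllables with melody note events.
# Each note is (MIDI pitch, start beat, length in beats) -- invented values.
melody = [
    (67, 0.0, 0.5),  # G4
    (69, 0.5, 0.5),  # A4
    (71, 1.0, 1.0),  # B4
    (67, 2.0, 1.0),  # G4
]
syllables = ["Just", "a", "sloth", "yeah"]

# One syllable per note, in order -- the same alignment you do by hand
# in a vocal synth's piano-roll editor.
notes_with_lyrics = [
    {"pitch": p, "start": s, "length": l, "lyric": syl}
    for (p, s, l), syl in zip(melody, syllables)
]

for n in notes_with_lyrics:
    print(f'beat {n["start"]}: "{n["lyric"]}" on MIDI {n["pitch"]}')
```

From there, adjusting the cadence is just nudging those start times and lengths around.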
Days flew by as I composed and arranged the rest of my song. Once I was happy with it, I needed to create a music video to post it to YouTube. So I figured since this was an AI inspired song, why not use AI tools to help me produce a video.
Music Video with AI Tools
I spent a week researching AI animation and trying out a bunch of different tools: Disco Diffusion, Deforum, etc. This was kind of awkward because I wasn’t that familiar with Python scripts or working with Google Colab. But I was having a lot of fun making this song, and I was learning things I never would have explored otherwise.
My first attempt at making an AI-powered animation was using Disco Diffusion on Google Colab and typing in a different prompt every 100 frames. It would generate a scene based on my input and then interpolate all the frames in between using its AI model. It took forever. It was rendering about one frame per minute, and it had 2400 frames to do. I figured I’d just let it sit all night and wake up to a finished render the next morning.
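The back-of-the-envelope math on that render, by the way (assuming a steady one frame per minute, which is roughly what I was seeing):

```python
# Rough render-time math for the Disco Diffusion animation.
frames = 2400
minutes_per_frame = 1                       # what Colab was giving me, give or take

render_minutes = frames * minutes_per_frame  # 2400 minutes of GPU time
render_hours = render_minutes / 60           # 40 hours of rendering...
video_seconds = frames / 12                  # ...for 200 seconds of 12fps footage

print(f"{render_hours:.0f} hours of rendering for {video_seconds:.0f}s of video")
```

Forty hours of GPU time for barely over three minutes of footage.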
Not so fast!
Google Colab kept timing out and disconnecting from inactivity, and eventually it would lock me out for several hours. I even tried paying for the plus version, which granted me a faster GPU and shorter render times, but it would still time me out. I kept repeating this process over the course of several days until I got fed up and looked for something else.
I found a service called RunPod that lets you rent GPUs to run AI tools on, much like Colab but cheaper. I could have run this locally on my own computer, but my graphics card is not that great, and you need at least 8-16 GB of VRAM to do this stuff. That’s what Google Colab, RunPod, and Vast.ai are for: you “borrow” GPUs in the cloud and run these AI programs on them.
But these tools are not for everyone. They’re quite techy and buggy. I kept running into errors and having to ask Google (and ChatGPT!) for answers. I’m just not that familiar with code, so it took a lot of trial and error, plus some lurking on various Discord servers, to finally figure things out. I managed to finish it, but not before shelling out another $10.
AI Animation Results
I finished my animation and it looked… very weird. Kind of scary. Most of my prompts were about sloths doing mystical things in the style of Frank Frazetta and Brian Froud, two of my favorite artists. But it would render them with malformed bodies, multiple eyeballs and disfigured limbs. It was dark and grotesque, but still kind of cool in its own way. Just not the vibe of my song. I think I’ll still use it for something, maybe a more ambient slothwave track.
Another issue was that it was only 1280×768 and 12 frames per second. I would need to find a way to upscale it to at least 1080p and 30 frames per second. Could AI do that too?? Of course it could. I managed to find tools like Cupscale and Flowframes to do this. Cupscale could batch upscale (enlarge without losing quality) all 2400+ frames to HD and Flowframes would add more frames in between what I already had so it would play back smoother. Both use AI technology to do this and the results are insanely good.
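To put numbers on that conversion (the frame counts come from the clip specs above; exact results depend on how each tool rounds):

```python
# Converting the 2400-frame clip from 1280x768 @ 12fps toward 1080p @ 30fps.
src_frames, src_fps = 2400, 12
duration_s = src_frames / src_fps              # 200 seconds of footage

target_fps = 30
target_frames = int(duration_s * target_fps)   # 6000 frames at 30fps
invented_frames = target_frames - src_frames   # 3600 in-between frames to invent

width_scale = 1920 / 1280                      # 1.5x to reach full HD width
upscaled_height = int(768 * width_scale)       # 1152px tall -- a small crop
                                               # gets it down to a 1920x1080 frame
print(duration_s, target_frames, invented_frames, upscaled_height)
```

In other words, the interpolation tool has to dream up more new frames than the render originally had, and the upscaler multiplies every pixel dimension by half again. Wild that this works at all.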
But I ultimately didn’t use that video that took me so long to make. It just didn’t fit the vibe of the song so I had to think of another solution.
Using AI with my own art
Aha! What if I used my own sloth painting that I did a few months ago? Back in early 2022, when I was playing around with Procreate on my new iPad, I made a cool painting of a sloth hanging in a tree. He was cute and cuddly and actually fit the vibe of the song very well. Could I use that?
But it was just a still image. And also it was in portrait mode, not landscape. But, hey, maybe there is an AI tool that could help me with THIS task?
Sure enough, there was.
There’s a technique called “outpainting” that lets you extend the background of any image using artificial intelligence. It can match the style of your existing artwork too. I was skeptical, so I did some research and found a few tools, one being Invoke AI. I set it up in RunPod and toyed with it for a while, but I wasn’t getting the results I was looking for.
I found another called getimg.ai and this worked amazingly. It wasn’t perfect by any means (see his claw foot lol), but it added content to the left and right of my painting while matching my art style. It extended the branches and even added leaves that weren’t there, but actually look great in the image. I did some minor tweaking and used Cupscale to increase the size to 1920×1080. Now we’re talking.
But it’s still just an image. How could I make it come to life for a video? All I needed was some subtle animation. I found a few tools for this job.
Sidebar: I attempted other things, like trying to create a consistent AI character model from one of the generated sloth images. I ended up making this little animation using a short “driving video” of my own facial expressions and the Thin Plate Spline Motion Model notebook on Colab.
Ultimately I didn’t use it because I kept running into errors and getting timed out of Google Colab. Seems like a cool idea for another time.
Music Visualizer for Extra Spice
One clever tool I found is a music visualizer app called Vizzy.io. How have I never seen this before? I’d been making simple “audio react” videos with Adobe After Effects or even FL Studio’s ZGameEditor Visualizer plugin (see example). They did the job fine, but Vizzy is on a whole other level. And it’s free, without watermarks. As they say, if the product is free…
Anyway, I threw my new sloth painting into a background layer on Vizzy and added the audio spectrum and other effects. Uploaded my audio track and watched everything animate in reaction to the music. So easy, I barely had to do anything. You can browse other artists’ creations and use them as starting points or to get ideas. I messed around with it for hours, but it was still missing something. What could it be?
My wonderful sloth painting was still just a still image. Couldn’t I add some movement to it? AI to the rescue again lol.
There’s a tool I found called LeiaPix Converter (based on Depthy) that instantly takes any image you give it and makes it 3D (sorta). It creates what’s called a “depth map” from your image and uses that to make a simple animation that really makes it pop off the screen. Very impressive for very little work on my part! I tweaked some settings and exported a three-second mp4 that I could import into Vizzy as my looping background.
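The depth-map trick is simpler than it sounds: pixels the model thinks are closer to the camera get shifted more as a virtual camera sways back and forth. Here’s a bare-bones Python sketch of that parallax idea, using tiny made-up depth values. This is just my illustration of the concept, not the converter’s actual algorithm:

```python
import math

# A single row of pixels with per-pixel depth in [0, 1]; 1.0 = closest
# to the camera. Real tools estimate a full depth map from the image.
depth = [0.1, 0.2, 0.9, 0.8, 0.3]

def parallax_offsets(depth_row, t, amplitude=4.0, period=3.0):
    """Horizontal shift (in pixels) for each pixel at time t seconds.
    Closer pixels (higher depth) move more, creating the 3D illusion.
    The sway repeats every `period` seconds, so the clip loops cleanly."""
    sway = math.sin(2 * math.pi * t / period)   # -1..1 over a 3s loop
    return [d * amplitude * sway for d in depth_row]

# At t=0.75s the sway peaks, so shifts are at their maximum.
print(parallax_offsets(depth, t=0.75))
```

Do that shift per frame over a three-second loop and a flat painting suddenly appears to have depth.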
Unfortunately, the video I got from LeiaPix was only 1024×576 and not the best quality. But hey, you know what I was thinking? Upscale time! I threw the clip into Cupscale, set the width to 1920 pixels, and a few moments later I had an HD version that looked much better. What is this sorcery!?
I now had a high quality “3D” motion background to go with my animated audio spectrum. Added some fx, camera shakes, and subtle particles to give it some “floaties” and BOOM – Done. Almost… I’m a perfectionist and needed to try a bunch of other stuff just in case. Hours, days later… This is what I finished with:
I am really happy with how it all turned out. It started with asking a simple question to ChatGPT that I didn’t really think it could answer. I had no idea how robust it would be. I ran with it on a whim and ended up making a fun song and learning a ton of new stuff about AI in the process.
When I wasn’t working on the track, I was busy reading and listening to hours of podcasts and YouTube videos on the AI art debate. Is it ethical? Is it good or bad for artists? Are we headed to a cyberpunk dystopia ruled by AI bots? Perhaps.
I’m honestly very skeptical and suspicious about AI, and about who will be in control of it, using it to engineer and steer society in ways we can’t even fathom right now. One of my favorite tech skeptics is Alison McDowell, who has countless hours of content on YouTube breaking it all down. Not so much the specific tools I mentioned, but the bigger picture.
I have a lot of thoughts I could go into regarding AI and its philosophical implications, but I’ll save that for another time. I enjoyed playing with these new toys and getting firsthand experience with what will surely revolutionize the creative field going forward.
I’ll be keeping my eyes on it in 2023, that’s for sure. Until then, I’m just a sloth on a mystical journey. Peace out.