Saturday, December 30, 2023

The Process - Part 1

When people think of AI-powered tools, they think of ChatGPT, Bard, and all of those other virtual assistants. All you have to do is ask them a question, and they will give you an answer or an approach to a solution.

A simple Hello World app goes from this...

to a fully functional JAV cataloger!

AI can even generate art and images for you. There are numerous websites dedicated to AI-generated pornography (they all pretty much use Stable Diffusion, in case you're wondering).

This is AI-generated using Stable Diffusion
NOT using a celebrity as a personal LoRA model.
(I will be really impressed if you can figure out this celebrity.)

AI can now even translate from one language into another for you. So when I mention that I use Whisper AI to create subtitles for this blog, people assume that I just throw in a JAV title and call it a day.

But like any tool, ALL AI-POWERED TOOLS STILL REQUIRE WORK. Every AI tool is built on a trained model (chatbots like ChatGPT use Large Language Models, or LLMs, while image generators like Stable Diffusion use diffusion models), and a model only knows what it was trained on. For example, an AI porn generator is trained on explicit images. If you gave an explicit prompt to a model that was not trained on such images, it would not know what to do.

Whisper has the same problem. It was not trained on JAV, so much of the Japanese that Whisper is familiar with is the kind found in content on YouTube.

Even worse, AI tools are not 100% accurate. When a model confidently produces output that is simply wrong, the mishap is called a hallucination.

A near-perfect AI-generated image of a NOT celebrity.
However, I'd still consider it a hallucination because it got the lips wrong.

When Whisper hallucinates, it will repeat
the last line it remembers for eternity.

Hallucinations are absolute nonsense, and they can really clutter the subtitles and ruin the workflow. I generally run a first pass through Whisper, but its hallucinations can sometimes completely ruin watchability and comprehension.
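To make the "repeats the last line for eternity" problem concrete, here is a minimal sketch of one way to trim those loops after the fact. It assumes the subtitles are available as a list of dicts with "start", "end", and "text" keys (the shape of the segments Whisper returns); the function name collapse_repeats and the max_repeats threshold are made up for illustration, and this is not necessarily how the subtitles on this blog are cleaned.

```python
def collapse_repeats(segments, max_repeats=2):
    """Keep at most max_repeats consecutive segments with identical text."""
    cleaned = []
    run_text = None   # text of the current run of identical lines
    run_length = 0    # how many times in a row that text has appeared
    for seg in segments:
        text = seg["text"].strip()
        if text == run_text:
            run_length += 1
        else:
            run_text = text
            run_length = 1
        # Anything past the allowed number of repeats is treated as a loop.
        if run_length <= max_repeats:
            cleaned.append(seg)
    return cleaned


# Hypothetical sample data: one line stuck in a loop.
looping_segments = [
    {"start": 0.0, "end": 2.0, "text": "Good morning."},
    {"start": 2.0, "end": 4.0, "text": "Thank you."},
    {"start": 4.0, "end": 6.0, "text": "Thank you."},
    {"start": 6.0, "end": 8.0, "text": "Thank you."},
    {"start": 8.0, "end": 10.0, "text": "Thank you."},
    {"start": 10.0, "end": 12.0, "text": "See you later."},
]

cleaned_segments = collapse_repeats(looping_segments)
print([seg["text"] for seg in cleaned_segments])
```

A filter like this only catches exact repeats. A real line that is legitimately said three times in a row would also get trimmed, so the result still needs a human look.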

So it is not as simple as "drag and drop" and the subtitles create themselves. On a desktop with 12 GB of video RAM using CUDA, it takes Whisper twenty minutes to process a 2-hour JAV title. Before I discovered the CUDA method, it would take Whisper 12 HOURS to process ONE title. You can see why that workflow wasn't suitable for a blog: 12 hours per title is not sustainable or feasible.
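For anyone wondering what "the CUDA method" looks like in practice, here is a rough sketch using the open-source openai-whisper Python package with PyTorch. The model size ("medium"), the function name transcribe_on_gpu, and the choice of task="translate" are assumptions for illustration, not a record of the exact setup used here. Whisper also needs ffmpeg installed to read video files. The arithmetic at the bottom just works out the numbers quoted above.

```python
def transcribe_on_gpu(video_path, model_name="medium"):
    """Run Whisper on a media file, using CUDA when a GPU is available."""
    # Imported here so the arithmetic below runs even without these packages.
    import torch
    import whisper  # the openai-whisper package

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = whisper.load_model(model_name, device=device)
    # language="ja" skips language detection; task="translate" asks for
    # English output instead of a Japanese transcript (an assumption here).
    return model.transcribe(video_path, language="ja", task="translate")


def speedup(slow_minutes, fast_minutes):
    """How many times faster the fast run is than the slow run."""
    return slow_minutes / fast_minutes


cpu_minutes = 12 * 60    # 12 hours per title before CUDA
gpu_minutes = 20         # 20 minutes per title with CUDA
video_minutes = 2 * 60   # a 2-hour title

print(speedup(cpu_minutes, gpu_minutes))   # GPU run vs. the old 12-hour run
print(video_minutes / gpu_minutes)         # video length vs. processing time
```

By those figures, the CUDA run is 36 times faster than the 12-hour run, and the 20-minute pass handles a 2-hour video at six times playback speed.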

Thus, STEP ONE: Process the JAV title through Whisper.

This gives an overall impression of the video. Sometimes it's pretty accurate, but most of the time it's all over the place. At this stage, however, the goal is not an accurate translation of the video. The goal is to see where most of the conversations take place.
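As a rough illustration of "seeing where the conversations take place", the sketch below adds up how many seconds of speech fall into each five-minute block of the video. It assumes segments shaped like Whisper's output (dicts with "start" and "end" in seconds); the function names, the five-minute bucket size, and the sample data are all made up for the example, and eyeballing the subtitle file works just as well.

```python
from collections import Counter


def speech_seconds_per_bucket(segments, bucket_minutes=5):
    """Total seconds of speech, grouped by the time bucket a segment starts in."""
    bucket_seconds = bucket_minutes * 60
    totals = Counter()
    for seg in segments:
        bucket = int(seg["start"] // bucket_seconds)
        totals[bucket] += seg["end"] - seg["start"]
    return totals


def busiest_buckets(segments, bucket_minutes=5, top=3):
    """Return (start_minute, speech_seconds) for the most talkative buckets."""
    totals = speech_seconds_per_bucket(segments, bucket_minutes)
    return [(bucket * bucket_minutes, seconds)
            for bucket, seconds in totals.most_common(top)]


# Hypothetical first-pass output: a little talk early on, more at 30 minutes.
first_pass_segments = [
    {"start": 12.0, "end": 15.0},
    {"start": 40.0, "end": 46.0},
    {"start": 330.0, "end": 332.0},
    {"start": 1810.0, "end": 1820.0},
    {"start": 1825.0, "end": 1840.0},
]

print(busiest_buckets(first_pass_segments))
```

With this sample data, the block starting at the 30-minute mark comes out on top, which is where the manual work would be concentrated.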

Continued in Part 2
