[HeyGen in-depth explanation] Features, fees, usage precautions, etc. Simultaneous multilingual interpretation by AI is coming soon...

2024/09/18 10:16 (Updated 2025/08/01 16:28)

Editors of Iolite

Written by Noriaki Yagi

Table of Contents

—What is HeyGen?
—How to use HeyGen

—What is HeyGen?

One day, a video circulating on X (formerly Twitter) caught my attention: it had been translated into another language with an application called HeyGen. When you upload a video of yourself speaking Japanese to HeyGen, it is translated into the specified language within a few minutes.

I was impressed by technology that can translate speech into another language in a voice that sounds just like the speaker's own, and the tool also appears to move the speaker's mouth in the output video to match the translated language.

The future of simultaneous translation of multiple languages that we have dreamed of may be just around the corner. In this issue of "Editor-in-Chief Focus," I would like to take a deeper look at the currently popular "HeyGen."

◉ "Editor-in-Chief Focus"

The editor-in-chief of "Iolite," a business magazine that covers topics on next-generation technology and finance and economics, follows the hot topics and the forefront of the latest news.

—AI-powered video generation platform "HeyGen"

"HeyGen", which moved from beta to official release in July 2022, is an AI-powered video generation platform for generating avatars and creating videos.

A web-based application available in a browser, HeyGen has been growing at a rate of 50% per month since its launch. It is one of the most notable services among AI-related services.
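To get a feel for what growth of 50% per month means, the short Python sketch below compounds that rate over several months. The starting base of 1,000 users and the function name `compound_growth` are assumptions made purely for illustration; they are not HeyGen figures.

```python
def compound_growth(start: float, monthly_rate: float, months: int) -> float:
    """Return the value of `start` after compounding `monthly_rate` for `months` months."""
    return start * (1 + monthly_rate) ** months


# Hypothetical starting base of 1,000 users (an assumption, not a HeyGen figure).
for months in (1, 6, 12):
    print(months, round(compound_growth(1_000, 0.50, months)))
```

If such a rate were sustained for a full year, the base would grow roughly 130-fold, which shows how quickly monthly growth of this size compounds.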

Its main offerings are Talking Photo, a function that redraws the speaker's mouth to match the audio (lip sync), and a service that automatically translates the text you enter and has the selected avatar speak it. The number of users is growing day by day, and as of August 2023 the site was receiving 3.8 million visitors per month.

It also provides video production services for corporations, and HeyGen's services are used by world-famous companies such as Accenture, Amazon, and NVIDIA, as well as educational institutions such as Columbia University.

—Development team

But how was an application released in 2022 able to increase its user base so explosively in such a short period? In fact, CEO Joshua Xu previously worked on AI development at Snapchat, which once ranked number one among the social networks chosen by American teenagers.

In addition, the fact that it was selected as one of the best artificial intelligence software of 2023 by the software review platform "Tekpon" shows that it has solid technical capabilities and extensive knowledge of AI.

Regarding Tekpon's selection, Joshua Xu said, "The results reflect the team's relentless efforts, and being listed on Tekpon has further motivated us to continue innovating and always provide excellent value to our customers."

—Characteristics of HeyGen

So what are the characteristics of HeyGen, whose user base is growing so rapidly?

[Features]

-Over 100 avatars to choose from

-A wide variety of video templates

-Videos can be created in over 40 languages

-Efficient creative production using ChatGPT and Canva

Over 100 avatars to choose from

You can choose from a variety of nationalities, genders, and costumes, and you can even have an illustration of Shakespeare or the Mona Lisa read the text you enter. For an additional fee, you can also upload a photo of a face and use it as an avatar, creating your own custom avatar.

A wide variety of video templates

Video templates are divided into categories such as advertising, social media, news, and education, and you can choose your favorite design from over 100 types. Some templates are designed for vertical video, which is in high demand on social media. There is no time limit on downloading the videos you create, and there is also a function for sharing videos with other users.

You can create videos in over 40 languages

You can choose from over 40 languages, and it also has filters that allow you to narrow down the options by gender, age, and tone of voice, so you can choose the voice that best suits your video concept. You can also record your own voice and create a voice clone, just like with a custom avatar.

You can combine your voice with an avatar of your choice, or create a digital clone by pairing your voice clone with a custom avatar made from a photo of your face.

HeyGen also offers a service that translates videos of about five minutes into multiple languages with a single click. At the moment it takes several minutes to output a video, but in the future simultaneous multilingual translation may become possible.

You can create content efficiently using ChatGPT and Canva

You can also create content efficiently by combining HeyGen with existing services. For example, you can have ChatGPT write the text to be read aloud and have HeyGen read it, or use HeyGen's AI avatars in designs made with Canva, a free and easy-to-use design service. Depending on how you combine it with existing AI tools, HeyGen can make creative production very efficient.

—Side business ideas: whether or not it can be used commercially

To get straight to the point, HeyGen can be used commercially. However, the HeyGen logo will be displayed on videos created and downloaded with the free plan. As I will explain later, it seems that the logo will no longer be displayed when you sign up for the Creator Plan or higher.

The following are some ways to use the features available at the time of writing (October 2023) for business purposes.

[Usage ideas]

- Use in educational content

- Create content using your own or your company's IP

- Personalized video distribution

Use in educational content

According to a report by Forrester Research, a US research company, one minute of video conveys as much information as 1.8 million words of text, or about 3,600 typical web pages. The 1.8 million figure is counted in English words. In Japanese, it can be assumed to correspond to more than 3 million characters.

This report was published in 2014, so the amount of information on a modern web page has probably increased, but either way there is a large gap between the amount of information conveyed by text and by video. That density is what makes video a good fit for educational content.
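The figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the three numbers from the text (1.8 million words, 3,600 pages, and 3 million Japanese characters as a lower bound); the variable names are my own, and the results are simply what those numbers imply, not additional data from the report.

```python
WORDS_PER_MINUTE_OF_VIDEO = 1_800_000  # English words, as quoted in the text
WEB_PAGES = 3_600                      # "typical web pages", as quoted in the text
JAPANESE_CHARS = 3_000_000             # lower bound given in the text

# Words per page implied by the two English-language figures.
words_per_page = WORDS_PER_MINUTE_OF_VIDEO / WEB_PAGES

# Japanese characters per English word implied by the lower bound.
chars_per_word = JAPANESE_CHARS / WORDS_PER_MINUTE_OF_VIDEO

print(f"{words_per_page:.0f} words per page")
print(f"at least {chars_per_word:.2f} Japanese characters per English word")
```

In other words, the comparison assumes a web page of about 500 words, and the Japanese estimate assumes that each English word corresponds to somewhat more than one and a half Japanese characters.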

Create content using your own or your company's IP

As mentioned above, HeyGen lets you upload videos of yourself speaking, or upload images and use them as avatars. It can also have Shakespeare or the Mona Lisa speak in multiple languages, so you could use your own IP to mass-produce videos like a VTuber, or even give lectures remotely using your own digital clone.

Personalized video distribution

You can change the avatar to match the distribution destination and expected audience of the video you have created, and distribute personalized content. For example, you could use a lively avatar for an advertising video for an energy drink, and an avatar that suits the age group in which eyesight begins to deteriorate for an advertising video for glasses, so viewers would be able to imagine using it themselves.

Looking further ahead, services may well appear that let you swap the presenter of video content you watch at a fixed time every day, such as the news, for a speaker of your choice.

—Pricing Plans

The Talking Photo function is available from the Creator Plan upward, and the Business Plan appears to include API access and priority video processing.

—Competing Services (HeyGen vs D-ID)

There are several other applications that allow images and avatars to talk, and one of them that is said to be a competitor is "D-ID". There was a video that actually compared videos generated by HeyGen and D-ID, so I will post it with the reference source.

GSaab Graphics
AI avator
HeyGen vs D - ID | Talkative AI vs. DID: Talking Photos

Personally, the video generated using D-ID seemed to have a slight distortion in the center of the avatar, and the video generated by HeyGen looked more natural.

—How to use HeyGen

From here, we will explain how to create videos using HeyGen's AI avatars for free.

Click "Get started for free" in the center of the HeyGen official website.

Register with your email address, Google account, SNS account, etc. on the account registration screen.

After completing your account registration, you will be taken to a screen where you can select a video template. Select an avatar of your choice.

Click "Create with AI Studio" to generate a video.

On the video production screen, in addition to changing the avatar that appears in the video to one of your choice, you can enter text and convert it into multiple languages.

To convert the language, click the text input tab below the video.

When you select your language, you can select various voice tones within the selected language.

Once you have selected your avatar, entered your text, and selected your language, click "Submit" in the upper right corner and wait for processing to finish; the result is a lip-synced AI video.

—HeyGen's Challenges

Questions about legal rights and ethical use are currently being raised about generative AI. As an AI-powered video generation platform, HeyGen faces the same issues.

[Remaining issues]

-Issues regarding ownership of audio data (intellectual property rights)

-Issues regarding ethical use: Hikakin also warns

Issues regarding ownership and copyright of audio data (intellectual property rights)

Audio and text are protected by copyright. Unless a work is old enough that its copyright has expired, you should be careful about actions such as having someone else's text read aloud.

In addition, recording a voice actor's voice and creating a clone with HeyGen, or posting a reading made with a voice clone on social media, may violate copyright law.

Whether copyright infringement has occurred is a specialist question that depends on whether "reliance" and "similarity" are recognized, so if there is any possibility of infringement, it is a good idea to consult an expert.

If the purpose is personal use, the act itself is often not regarded as illegal, but you should still be careful when loading voices or images into HeyGen to generate voice clones or custom avatars.

Ethical use issue: Hikakin also warns

TikTok, a service for creating and sharing short videos, has a text-to-speech function on its video creation screen. One of the voices you can select is that of Hikakin, a top YouTube creator. The voice was built from about 1,000 sentences that Hikakin recorded himself, and it reads the text entered in a video aloud in a voice close to the real thing.

However, Hikakin posted a video warning that this function had been used commercially, without permission, to advertise diet supplements. It is a warning against the unauthorized commercial use of an automated text-to-speech function. The more familiar generative AI becomes, the more carefully the unauthorized use of people's likenesses and voices will need to be prevented.

—Summary

We gave a detailed explanation of HeyGen, a video generation platform that uses AI to create videos. As long as you have text to read, it is a convenient tool that allows you to create high-quality videos.

Although copyright and ethical-use issues remain, it is undoubtedly a convenient tool that even beginners in video creation can use easily, and it can even translate audio. Not only individuals but also corporations are increasingly using it in business, and as the number of users grows, new functions are likely to be added to improve the service.

Image: Shutterstock, HeyGen

Profile

◉Noriaki Yagi

While attending university, he worked in the food and beverage industry. Drawing on that experience, he launched a restaurant consulting business and a staffing business in the amusement field, and became its representative. At the same time, he became active on social media with the aim of establishing his own brand. After reaching a total of 10,000 followers, he used that increased recognition to launch his own apparel brand. He joined J-CAM Co., Ltd. in September 2021. After working on YouTube and Twitter, he became editor-in-chief in April 2022. In March 2023, he launched "Iolite".
