AI

‘Why is AI booming?’ ‘What can it make?’ Surprisingly little-known stories from the field. --AI Engineer

2023/05/29 Editors of Iolite

Why is ‘generative AI technology’ booming now? What are its possibilities and dangers?

Generative AI technology, as represented by ChatGPT, has recently attracted a lot of attention. This is because generative AI can produce a wide range of creative output: not only text, but also illustrations, program code and 3D models.

How will generative AI, which is said to be ‘taking many people's jobs’, change our lives and businesses?

Today, I would like to ask you a number of questions about AI technologies that have been the talk of the town recently. To begin with, why has AI become such a hot topic in the last few months or so?

B: The trigger was the release of image-generating AI services such as ‘Midjourney’ and ‘Stable Diffusion’ in July and August 2022. These services can automatically generate fitting images from just a few keywords.

These are innovative services that let people who cannot draw at all create, in seconds, images that look as if they were drawn by a professional. However, my impression is that image-generating AI attracted attention only from a small group of people who were interested in IT.

It was probably with ChatGPT, released in November last year, that generative AI reached a wider audience. The service became a hot topic because it answers users' questions in natural sentences, as if they had been written by a human being.

ChatGPT quickly spread around the world, and it was reported to have reached 100 million monthly active users in January this year.

A: Since then, generative AI services have been launched one after another all over the world. Recently, search engine providers have successively released services that utilise generative AI.

Microsoft has released the ‘New Bing’, based on the large language model ‘GPT-4’, the technology behind ChatGPT, and Google has released ‘Bard’. Both generate answers to user questions based on internet searches.

B: What is booming now is not so much ‘AI technology’ as ‘generative AI technology’. AI technology has been under development for decades, and many people will have heard the claim that ‘AI will surpass humans’ before.

For example, AI that looks at images and analyses their content already existed. Systems that outperform humans in specific fields, such as shogi AI, have also emerged.

Generative AI is booming in 2023 because AI can now generate things that are as good as, or better than, what humans produce. AI is entering fields that were previously regarded as ‘creative activities that only humans can do’: text and images, and even novels, programming, 3D models and music.

Until now, AI technology has been good at recognising data, but not at producing output at the same level as humans. In particular, generating natural, human-like sentences, one of the tasks of natural language processing, was difficult for AI. The Transformer made this possible.

The Transformer was originally published in a paper by Google researchers in the course of research into machine translation, but it has since been applied to text generation, resulting in generative AI services that produce sentences that look as if they were written by a human.

When ChatGPT became a hot topic, I tried asking it a few questions, and some of the answers were off the mark. Seeing that, I got the impression that, AI or not, it makes mistakes and is not very reliable.

A: Text generation AI such as ChatGPT learns from large amounts of text data and, based on that, generates the answer that is ‘likely to be appropriate’. Therefore, if there is not enough training data on the subject of the question, the answer may be wrong.

In fact, ChatGPT was trained on data up to September 2021, so it sometimes gave completely wrong answers about current affairs from 2022 onwards.
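As a rough illustration of what ‘learning from text and generating a likely answer’ means, here is a minimal Python sketch of a toy word-pair (bigram) model. It is not how ChatGPT works internally, which is based on the Transformer and vastly more data; the tiny corpus and the function names `train_bigrams` and `most_likely_next` are made up for this example.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None  # not in the training data: the model has no basis to answer
    return followers.most_common(1)[0][0]

# Hypothetical miniature "training data" (an assumption for illustration)
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigrams(corpus)

print(most_likely_next(model, "the"))  # a word the model has seen followers for
print(most_likely_next(model, "dog"))  # a word that never appeared in training
```

In this toy setting, a word that never appeared in the training text gives the model nothing to go on, which mirrors the point above about questions that the training data does not cover.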

B: A common misconception about generative AI is that it can ‘tell you the right answers to things that humans don't know’. Generative AI produces its output from the large amount of training data that already exists, so naturally it knows nothing about information that is not in the training data.

Incidentally, the problems of being unable to answer a question and of being out of touch with current information are largely eliminated in ‘Bard’ and the ‘New Bing’, which base their answers on the vast number of search results on the internet.

There are downsides, such as AI taking away jobs, but it can also be used to streamline operations and generate new business.

Generative AI can be used for a wide range of purposes, including the creation of new businesses.

A: Many different types of generative AI services have been created, and they can do a very wide range of things, so let's start with AI that generates text. As the name suggests, text generation AI can generate sentences as if they were written by a human.

For example, it can write blog posts, emails, reports and so on, as long as you give it an appropriate topic and instructions. A growing number of people on the web are already using AI to mass-produce blog posts and build their own websites. Drafting emails and reports with AI could also be useful in business.

B: Another use is to ask questions to the generative AI on matters that you don't really understand. Microsoft and Google have already released generative AI services, and the emergence of generative AI is changing internet search significantly.

Until now, you entered keywords into a search engine and looked for the answer you wanted among the large number of websites that appeared. Now, however, generative AI gives you an answer that extracts only the essentials from the search results.

This kind of use is not only for ‘looking up something I don't know’, but also for things like ‘planning a day out’. For example, if you ask the ‘New Bing’, ‘Which restaurant should we go to for dinner for two people with a budget of 10,000 yen? Please name three restaurants in Tokyo’, it will give you three specific restaurant names.

Personally, I believe that sentence generation AI is also a revolution in internet search.

A: You can also use generative AI to write and review program code. For example, if you tell it to ‘write code for an application like Windows Notepad’, the generative AI will output code that can be used as is.

People commonly learn programming by looking at code that someone else has written and imitating it, but with generative AI, you can have an answer presented to you in an instant.

B: Recently, Google has launched an AI-based development assistant called Duet AI for Google Cloud. With this service, you can chat with the AI and tell it your requirements, allowing you to create applications without writing any code. If you input code you have written, it can also point out bugs and suggest fixes.

Writers and programmers are really starting to lose their jobs to AI. So what can image-generating AI do?

A: As the name suggests, it is a technology that generates a variety of images. You can specify a wide range of styles, and it can create realistic, photo-like images or anime-style moe pictures. Many image generation AI services already exist, and you can now choose a service according to the style of image you want to create.

B: Midjourney and Stable Diffusion, mentioned above, are representative services. These two services do not support Japanese input, but services that do support Japanese input are being created one after another.

Services such as ‘Niji Journey’ and Microsoft's ‘Bing Image Creator’ are available in Japanese and should be easy to use.

A: With image generation AI, you can have AI create illustrations and images on your behalf. For example, you can create website banners, social media icons and images to insert into presentation materials. Images that you used to buy from stock photo websites can also be created by image generation AI.

B: Image generation AI has also developed dramatically over the past few months. At the beginning of this year, it was said that ‘AI is not good at depicting human hands’, and when we actually had it draw a human, the number of fingers and the shape of the hand looked strange.

To put into words why this happens: when drawing, the AI does not understand that ‘humans have five fingers’. It learns from many images as patterns, along the lines of ‘the end of a human hand branches out, but depending on the image, sometimes five fingers are visible, sometimes only three, and sometimes the hand does not appear at all because only the upper half of the body is shown’.

In other words, there was not enough training data for it to learn that ‘humans have five fingers’, so the hands it drew differed from reality. However, within a few months the technology was updated to overcome these problems, and it can now draw hands that look natural.

A: For those who work in illustration and photography, AI has become a very real threat. Recently, some people have been using generative AI to create realistic images of people and selling them as photo books. Some people are also selling illustrations created with generative AI. However, this has come under fire on the internet due to copyright and other issues.

We will come back to copyright and other issues later. What other things are created by generative AI?

A: Services that automatically generate music have also been created. For example, on 11 May, Google released a music generation AI service called ‘MusicLM’. Given an instruction such as ‘a calm ballad to help you concentrate on your studies’, the music generation AI creates and plays suitable music.

By having a text generation AI such as ChatGPT write the lyrics and combining them with voice synthesis technology such as Vocaloid, anyone can easily create songs that sound as if a human were singing.

B: Services that automatically generate 3D models are also emerging, such as Point-E from OpenAI, the developer of ChatGPT, and NVIDIA's Magic3D. With these, you describe what you want to create in text and they output a corresponding 3D model.

These services greatly reduce the work required to develop digital services that use 3D.

A: Video generation AI is also on the rise. For example, the Stable Animation SDK, one of the animation generation services, creates animations from text, image and video input.

However, although video generation AI impresses me (‘It's amazing that it can make such videos automatically’), at the moment the quality is not as good as that of videos made by humans, and many things in the output look unnatural. Nevertheless, I expect the technology to develop quickly and eventually make better videos than humans do.

In the future, I think it is highly likely that AI-created animations, YouTube videos and music videos will appear.

B: To summarise what I have said so far, digital content that has been created by humans up to now will almost entirely come to be created by AI. It is often said that AI will take away the jobs of writers, programmers and illustrators, but the impact of generative AI is not limited to specific jobs.

Everyone who deals with digital data of any kind will be affected by AI. Of course, there are downsides to AI, such as losing jobs, but it can also be used to improve operational efficiency and create new business opportunities.

▶A joint study conducted by Nomura Research Institute and Associate Professor Osborne of Oxford University, which analysed 601 occupations in Japan, estimated that in 10 to 20 years' time, artificial intelligence or robots could replace the occupations held by 49% of the Japanese workforce.

Generative AI outputs what is probabilistically likely to be correct; it cannot predict or produce everything with 100% accuracy.

It seems to me that with AI, you can do anything. This is a rather down-to-earth question, but could you make money by having AI predict stock prices?

A: I don't think so. In the first place, AI ‘gives answers and outputs that are probabilistically likely to be correct, based on its training data’. Stock price movements lie in the future, so no correct answer exists yet. AI can ‘make plausible predictions of stock price movements’, but it cannot predict prices perfectly.

B: Generative AI outputs what looks probabilistically like a correct answer, not necessarily the correct answer itself.

Perhaps in the future, people will be soliciting investment funds by saying things like ‘You can definitely make money with automated trading by AI!’ However, it is important to understand that ‘AI cannot predict everything 100%’. AI cannot derive the correct answer to something for which there is no correct answer.
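The idea of a ‘probabilistically plausible’ output can be sketched in a few lines of Python. This is only an illustration, not how any real trading system or ChatGPT works; the probabilities in `next_move` and the function name `sample_answer` are invented for the example.

```python
import random

def sample_answer(probabilities, rng):
    """Pick one candidate at random, weighted by its estimated probability."""
    candidates = list(probabilities)
    weights = [probabilities[c] for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical probabilities a model might assign to tomorrow's price move
next_move = {"up": 0.55, "down": 0.40, "flat": 0.05}

rng = random.Random(0)  # fixed seed so that the run is repeatable
draws = [sample_answer(next_move, rng) for _ in range(1000)]
share_up = draws.count("up") / len(draws)

print(f"'up' was chosen in {share_up:.0%} of 1000 draws")
```

Even the most likely option here, ‘up’, comes out only around half the time, so acting on it would often be wrong. The output is plausible, not certain.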

I see. Then, some people are concerned about the misuse of AI, but in what specific ways could it be misused?

A: Deepfakes made with generative AI are already a problem in many places. Generative AI can be used to create, from a single image, a video that makes it look as if the person is speaking. Using this to spread false information, with videos that appear to show a celebrity or politician speaking, could lead to serious incidents in the future.

Images alone can also be misused to make viewers believe in a disaster or war damage that never happened. I am sure that the creation of fake images of celebrities and the use of images of beautiful women to deceive people will also increase in the future.

Another problem is that we have not yet been able to give generative AI a human-like sense of ethics. The AI only produces output based on its training data, so depending on the question, it may express thoughts and opinions that are ethically problematic.

B: Whether some uses are a bad thing or not is itself debated. The debate about the use of image-generating AI is gaining momentum especially in the illustration industry. People who have drawn illustrations with their own skills have expressed frustration with those who use generative AI to create and sell illustrations.

Some people argue that this is an infringement of copyright, don't they?

A: Yes, that's right. The debate on copyright and generative AI, not just for illustrations, is widespread worldwide. There are two main points of contention: copyright in the training data of the generative AI and copyright in the output data.

For example, if an illustrator's illustrations are used for training and illustrations similar to his or her style can easily be created, he or she could lose all of his or her work. So it is not surprising that some illustrators say, ‘I don't want my illustrations to be used as training data without permission’, and think that using them as training data is an infringement of copyright.

However, at least in Japan, using copyrighted material as training data is, in principle, not legally an infringement of copyright.

B: Whether the output data constitutes copyright infringement in Japan is very unclear. This will not be settled until more judicial precedents have accumulated, but the general view at present is that unless the output is very similar to a specific work, it is not considered copyright infringement.

So even if an illustrator's work is used as training data, it would be difficult to say that it is an infringement of copyright unless the output illustration is almost like a copy of a specific illustration?

A: Yes, that is correct. However, illustration sales services can independently take measures such as ‘banning the creation and sale of AI illustrations’, and there are currently moves to restrict the use of AI illustrations in this way.

B: However, Japan is currently one of the most tolerant countries in the world when it comes to copyright and generative AI. In the UK, for example, the use of other people's copyrighted material as training data is ‘only allowed for non-commercial purposes’, whereas in Japan it is basically allowed without such restrictions.

A: Japan is very permissive when it comes to creating something using generative AI, so it is a somewhat difficult environment for creators, but on the other hand, it is also an environment that facilitates the creation of new businesses.

What are some of the businesses that have been created using generative AI? Please also tell us about the businesses that are likely to emerge in the future.

A: Text generation AI is beginning to be used in a wide range of businesses. There are already several AI-authored books on the market, as well as websites where AI is writing articles. There is also a move towards using generative AI for language learning, with a number of services being released that allow users to chat with an AI in English and have it correct any errors.

B: I think that educational services using text generation AI and chatbots will increase in the future. For example, I think there would be demand for an exam-preparation chatbot that asks questions and explains the answers according to the user's school year and deviation score (hensachi).

A: A service could also be created that does not teach you anything, but simply talks with you. Generative AI gives natural answers, as if you were talking to a human being, so it could be used to vent or to ask for advice.

B: Some people are already playing with ChatGPT by giving it personality and setting instructions, and having it answer in the role of a specific character. For example, if you can create a beautiful woman using image-generating AI and combine it with voice-generating AI to have a conversation, you could create something like ‘AI Girlfriend’.

A: As for image-generating AI, some enterprising people have already started selling AI-generated images, such as illustrations of women and photo books, and some of these have become top-selling items on sales websites.

B: If video-generating AI develops further, the number of AI-generated video channels on YouTube and elsewhere will increase. Some AI VTubers that combine 3D models and chatbots have already been released, and they are gaining popularity.

A: Thanks to the rise of generative AI services, what used to be possible only for people with specific skills is now possible for anyone. I'm sure there will continue to be discussions about copyright and other issues, but I feel that the opportunity to create a business using generative AI is right now.

So, finally, what are some of the key points we should know about using a generative AI service?

A: You should know that with generative AI, the quality of the output depends on the prompt.

What is a prompt?

A: A prompt is an instruction that tells the AI what to output. For example, with ChatGPT, a question such as ‘What is blockchain?’ is a prompt. With image generation AI, a prompt is text that describes the image you want to create, such as ‘Japanese landscape’.

B: Even with ChatGPT, typing ‘What is blockchain?’ and typing ‘Tell me how blockchain works in a way that a primary school student can understand’ produce different answers. Basically, the more detailed the prompt, the easier it is for the user to get the answer they are looking for.

A: It's the same with image generation: rather than typing in just ‘Japanese landscape’, it's better to specify when and where in Japan the scene is and what is in it, so that the image comes out as you imagine it.
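One way to picture the difference between a vague and a detailed prompt is a small Python helper that assembles the details into a single line of text. The function `build_image_prompt`, its parameters and the example values (Kyoto, autumn and so on) are all hypothetical; comma-separated keywords are just one common convention, and each image generation service has its own preferred prompt style.

```python
def build_image_prompt(subject, place=None, season=None, time_of_day=None, details=()):
    """Join the subject and any optional details into one comma-separated prompt."""
    parts = [subject]
    for extra in (place, season, time_of_day):
        if extra:  # skip details that were not specified
            parts.append(extra)
    parts.extend(details)
    return ", ".join(parts)

# A vague prompt: the AI has to guess almost everything
vague = build_image_prompt("Japanese landscape")

# A detailed prompt: where, when, and what is in the scene (invented example values)
detailed = build_image_prompt(
    "Japanese landscape",
    place="Kyoto",
    season="autumn",
    time_of_day="early morning",
    details=["temple garden", "red maple leaves", "light mist"],
)

print(vague)
print(detailed)
```

The second prompt leaves far less to chance, which is the point being made here: the person writing the prompt, not the AI, has to supply the specifics.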

B: People tend to think that with generative AI, it's easy to create what you want to create, but that's not actually the case. You need to be able to verbalise what you want and communicate that to the AI.

Prompts are so important that the term prompt engineering has been coined to refer to creating prompts and making sure they produce good quality output. There are even ‘prompt marketplaces’ that sell prompts for good image output.

A: It is good to try various things yourself to create good prompts, but it is also recommended to learn from other people's success stories. For example, there are a lot of ‘prompts that output images of beautiful women’ on the internet, so it is a good idea to copy them and add the necessary keywords in your own way.

Simply using a generative AI service and seeing what it outputs is enjoyable in itself, so if you haven't used one yet, please give it a try.


Profile.

◉Mr A.

AI software developer. He is involved in the development of services that utilise generative AI.

◉Mr B.

Developing small businesses using generative AI; IT writer.


