From where ChatGPT stands today to the future of VR and the metaverse woven by generative AI.
ChatGPT has taken the world by storm as a conversational text-generating AI. What kind of future does an expert in the field think AI will create?
--Thank you for joining us today. First of all, from an expert's point of view, what do you think is the current state of AI?
Furukawa: First of all, speaking of image recognition AI, there was an image recognition contest in the US in 2012, from which the method known as deep learning spread, and image recognition accuracy increased dramatically. By 2015, it had already surpassed human recognition accuracy.
In the text domain, in June 2020, OpenAI, the creators of ChatGPT, announced GPT-3, the predecessor to ChatGPT. From around 2020, large-scale language models, known as LLMs, began to appear, and GPT-3 had a massive 175 billion parameters.
From that stage, the quality had already reached the point where the output was indistinguishable from human-written text. Technically, quality has been improving significantly since the late 2010s, and of course accuracy is still improving, but this is not a case of performance suddenly jumping in the last six months.
Two major factors have contributed to the current excitement around ChatGPT. One is that the general public can now easily access it through a conversational chat interface. The other is that the language barrier has been removed, as people can type in Japanese and get a response in Japanese.
Earlier interactive AIs could also accept Japanese input, but responses were still smoother in English.
The use of interactive AI as a virtual assistant permeates business.
--Will interactive AI create new jobs and businesses in the future?
Furukawa: It is often said that, with interactive AI, everyone can have an assistant or partner at work: an excellent subordinate, colleague or senior colleague. I think this is probably one of the fundamental ways it can be used in business.
The difference from a human partner, junior or senior colleague is this. Suppose a senior colleague at a new workplace gives you work materials and says, ‘If you don't understand something, just ask me’. But when you go to ask a question, the senior is not there, or is on the phone, or is concentrating on work with a stern look, and is hard to approach.
At such times, you can easily put detailed questions to an AI that has learned the work materials. You can talk to the AI at any time and as many times as you like, without feeling guilty or embarrassed. This will also promote information sharing among team members, which will result in less rework and lower costs.
Interactive AI is also described as ‘the personification of books’. It can be used not only as a single AI covering the entire business, but also as separate AIs, each with specific knowledge in a particular area of expertise. For example, to find out what kind of tweets would be effective for PR on Twitter and less likely to cause a firestorm, you would ask an assistant AI that is knowledgeable about social media operations and also understands your previous tweets.
I think the existence of virtual assistants with this kind of in-house domain knowledge will expand as a new business across industries.
--Interactive AI has become widely known with the advent of ChatGPT, but I get the impression that it is still not well known except among people who are well versed in IT and digital fields.
Furukawa: That's right. When Sam Altman, chief executive of OpenAI, visited Japan in April this year, he said at a meeting of the Liberal Democratic Party that ‘the number of users in Japan will exceed one million’. Since the working population in 2022 was 67 million, that is about 1.5% in percentage terms.
In terms of the so-called ‘chasm theory’, that probably falls within the first 2.5% of adopters, so I think we are still in the very earliest days.
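The comparison above is simple arithmetic and can be sketched in a few lines of Python. This is only an illustration: the user and population figures are the ones quoted in the interview, while the segment thresholds are the standard cumulative shares from Rogers' diffusion-of-innovations model (2.5% innovators, 13.5% early adopters, 34% early majority, 34% late majority, 16% laggards), which is an assumption about what ‘chasm theory’ refers to here. The function names are hypothetical.

```python
# Cumulative upper bounds of Rogers' adopter categories (assumed model).
SEGMENTS = [
    (0.025, "innovators"),
    (0.16, "early adopters"),
    (0.50, "early majority"),
    (0.84, "late majority"),
    (1.00, "laggards"),
]


def adoption_share(users: int, population: int) -> float:
    """Return users as a fraction of the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return users / population


def adopter_segment(share: float) -> str:
    """Map a cumulative adoption share (0..1) to an adopter category."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    for upper_bound, name in SEGMENTS:
        if share <= upper_bound:
            return name
    return SEGMENTS[-1][1]


# Figures quoted in the interview: 1 million users, 67 million workers.
share = adoption_share(1_000_000, 67_000_000)
print(f"{share:.2%} -> {adopter_segment(share)}")  # 1.49% -> innovators
```

With the quoted figures the share comes out a little under 1.5%, which is below the 2.5% boundary of the first category.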
Microsoft has announced Microsoft 365 Copilot, which will build more and more generative AI functions into software such as Word and Excel.
In this way, people will be exposed to AI through Microsoft's products. In that context, I think people will probably use it without even being aware of ChatGPT or anything like it. It will blend in as a useful function, something like ‘If I put in a short sentence, it makes it longer’ or ‘If I put in a long passage, it extracts only the important parts’.
In Japan, I think 20% to 30% of listed companies use Microsoft products, so I think it will spread like air through existing products at a surprisingly early stage.
Think of ChatGPT as a ‘drunk with a lot of knowledge’: a place for processing information rather than collecting it.
--You used the phrase ‘personification of books’, but I think that, from the general user's point of view, interactive AI gives a strong impression of being an ‘easy-to-use wiki’. Is this the inherent advantage of interactive AI?
Furukawa: It is still in its infancy, so we are still in a phase where we don't even know what kind of image the general public has of it.
On top of that, as I mentioned earlier, ChatGPT is an assistant. When you type a question into the browser and ChatGPT answers it, that is like ChatGPT taking a test.
Naturally, in order to take the test, ChatGPT has studied for it. The way it studies is simply to go through the internet and cram the contents of every URL into its head, one after another, from top to bottom.
It is just cramming things into its head; it is not learning knowledge systematically, like a dictionary, so it can make mistakes.
When you think about it in this way, it's like the image of a ‘drunk who has so much knowledge’. He's very clever, but he's drunk, so sometimes he says the right things and sometimes he says things that don't make sense. I think it's good to have that kind of image as a premise.
To be more precise, ChatGPT is not a ‘place to collect information’, but a ‘place to process information’.
When you are trying to compose some email text, you can ask ChatGPT to write a polite email for you. But that's no different from using Google search to look up email templates. So it's the processing part that's important.
What we mean by processing is, for example, changing the tone of an email template depending on the time, place and occasion (TPO). In other words, rewriting and rephrasing the text. It's a very good place to process the material that you already have.
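As a rough illustration of this kind of ‘processing’, the sketch below builds the instruction text you might give to a chat model when asking it to change the tone of an existing template. It deliberately calls no API and only assembles a string; the function name `build_rewrite_prompt`, its parameters and the wording of the instruction are all hypothetical and not taken from the interview.

```python
def build_rewrite_prompt(draft: str, tone: str, occasion: str) -> str:
    """Assemble an instruction asking a chat model to rewrite `draft`.

    The material (the draft) stays the same; only the requested tone
    and occasion change between calls.
    """
    return (
        f"Rewrite the email below in a {tone} tone, suitable for {occasion}.\n"
        "Keep every fact unchanged and do not add new information.\n\n"
        f"---\n{draft.strip()}\n---"
    )


# One template, processed for two different situations.
template = "Please send the report by Friday."
situations = [
    ("formal", "a first message to a client"),
    ("casual", "a quick note to a teammate"),
]
for tone, occasion in situations:
    print(build_rewrite_prompt(template, tone, occasion))
    print()
```

The resulting text would be pasted into, or sent to, whichever chat model is being used.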
--Do you think the spread of AI will have a positive impact on the Japanese economy?
Furukawa: There are many perspectives, but there are two main ones. First, the position of OpenAI and ChatGPT in LLMs is like that of Apple's App Store in the smartphone era. In other words, the platform is in foreign hands.
So to begin with, rather than competing under rules set by others, I think it is important, while it is still early days, to move to the rule-making side.
Recently, Softbank and LINE have announced that they will jointly create a Japanese GPT. I very much welcome the move by companies with both data and capital to take part in developing the LLM itself. The first premise to hold on to is that we should not simply ride on foreign platforms.
Having started with that big picture, the second perspective is that I think AI is good in the sense that it can act as a kind of external pressure that forces change.
Personally, I don't think AI itself will change anything drastically, and I think it is largely positioned as ‘AI as a means of DX’.
As for DX in Japan, if you split ‘digital transformation’ into ‘digital’ and ‘transformation’, digitalisation means, for example, converting paper documents into data and uploading them to the internet.
Transformation is about changing the business model itself and the profit structure of the company itself.
In the DX White Paper 2023 published by the Information-technology Promotion Agency (IPA) in March, it is very clear which parts of DX Japan is currently doing well and which it is not. In improving operational efficiency and productivity, in other words in the area of digitalisation, Japan is almost as successful as the US.
On the other hand, the areas where Japan is falling short are precisely the areas of transformation. Specifically, this means ‘creating new products’, ‘creating services’ and ‘changing business models’. Digitalisation has been achieved, but not transformation. This is the current challenge.
Against this backdrop, the arrival of AI and LLMs has given people a sense of ‘how easy it is to create text’. There are departments where costs can be drastically reduced by using LLMs, and products to which new, irreplaceable value can be added.
Generative AI can compensate for the missing parts of DX, accelerating the process of transformation.
From this perspective, I think it could be a catalyst for economic growth in Japan as a whole.
Text to VR generative AI is already on its way to creating a metaverse world at low cost.
--What areas are you focusing on now that we are ‘post-ChatGPT’?
Furukawa: I mentioned earlier that ChatGPT is good at processing, but when you categorise so-called generative AI products, you use terms of the form ‘something to something’, such as ‘Text to Image’.
In the end, the final form is to combine all of these, and ultimately to go as far as using so-called brain-machine interfaces to read synaptic patterns and generate things just by thinking.
That's a long way off, but in the nearer term, most of the attention is currently on language models and text. Next come images, and with GPT-4 you can already use images as input.
What this means is that, for example, camera images can be fed into an LLM in real time for analysis. If the expression on a person's face or the pitch of their voice changes, it can be analysed to see whether they might be lying.
This is so-called multimodal AI: text is currently the most advanced, then come images, and then video, as continuous data, can be analysed. After video comes so-called 3D, and after 3D it will lead to things like AR and VR.
I like VR myself, but I do wonder how many times ‘the first year of VR’ will be declared, and at the moment generative AI is overtaking it in terms of buzz. I think this ultimately comes down to the lack of killer content. Of course, there are problems with the hardware itself, but if there is killer content, the hardware will sell well.
It may be rude to say there is no killer content, but I think the reason there is so little is simply that production costs are high. Even for a casual smartphone game, it costs 2 or 3 billion yen to make one, and you don't know if it will be successful.
When you produce a VR game, you use software like Unity or Blender, but it takes a lot of labour and time. If generative AI can take over that work and make ‘Text to 3D’ or ‘Text to VR’ production possible, the cost of producing VR content will drop significantly.
As the cost goes down and the quantity increases, naturally, good quality content will emerge amongst it. That will become killer content and hardware will sell well.
So what I personally have high expectations for at the moment are the fields of ‘Text to 3D’ and ‘Text to VR’. I myself am actually aiming for that business. It has a business impact as well. Creating text from text is not that different in terms of the amount of information.
On the other hand, the leverage of information is by far the greatest when you can create something in 3D simply by entering text. That's why I think ‘Text to VR’ is the hottest area at the moment.
This will definitely lead to the metaverse, where a world can easily be created from text. And we can already see the path to this, so I think this is a very hot area.
This is an easy-to-read, in-depth book with expert commentary on ChatGPT, an interactive sentence generation AI that is attracting a lot of attention for its ability to generate natural sentences while having highly flexible conversations.
Despite its compact, quick-read format, the book provides a thorough explanation of ChatGPT's capabilities, potential and technical background, as well as concrete examples of actual usage and business applications.
Profile.
◉Woichi Furukawa.
Born in Kagoshima Prefecture in 1992. Graduated from the University of Tokyo, Faculty of Engineering. Director and Chief Technology Officer of Digital Recipe Inc. He is the founder of Slideflow, which creates websites from PowerPoint presentations, and Catchy, an AI writing system using GPT-3. His book, ‘Reading Ahead! IT x Business Lecture: ChatGPT The Future Created by Interactive AI’ has sold more than 60,000 copies.
Interview Iolite FACE vol.10 David Schwartz, Hirata Michie
PHOTO & INTERVIEW Nakamura Shido
Special feature: "Unlocking the Future: The Arrival of the AI Era," "The Ishiba Cabinet is in chaos with hopes and fears intersecting. What will happen to Japan's Web 3.0 in the future?" "Learn about the tax knowledge necessary for cryptocurrency trading! Explaining the basics and techniques that can be used even now"
Interview: SHIFT AI Kiuchi Shota, Digirise's Chaen Masahiro, Bybit's Ben Zhou, Monex Group Inc.
Zero Office Head/Monex Crypto Bank Bandai Atsushi and Asami Hiroshi, Kaoria Accounting Office Representative and Active Tax Accountant Fujimoto Gohei
Series Tech and Future Sasaki Toshinao...etc.
MAGAZINE
Iolite Vol.11
January 2025 issue, released on 2024/11/28