
AvatarAI

An interactive API project

Description:

AvatarAI is a prototype I built during an internship, intended for possible future integration into other software to make Q&A between users and an AI agent more interactive. By integrating the ChatGPT and GooeyAI APIs, I created an animated PNG avatar that provides a dynamic and immersive user experience. Rather than giving standard AI responses, the avatar speaks in a realistic manner, with distinctive personality traits, styles, and even popular slang.

Role:

  • Internship Project

Date:

  • May to June 2023

Skills:

  • API Integration (ChatGPT, Gooey AI)

  • Avatar Animation

  • Conversational AI Development

  • User Experience (UX) Design

  • Testing & Iteration

Note: This video contains audio.

Inspiration

I've always been fascinated by the potential of AI and the myriad ways it can be utilized to improve and enrich our lives. This project was particularly inspired by the idea of creating an AI interaction that goes beyond the conventional, one that could mimic human conversation with uncanny realism. The challenge of creating an AI with distinctive, human-like personality traits was an exciting proposition that drove this project forward.

Image: ChatGPT AI chat

Image: D-ID AI chat

Development process

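Exploring the Gooey API's Capabilities

Development began with exploring the Gooey API's features and drafting the initial UI layout on Glitch. The goal was to create an engaging, interactive platform that lets users interact with the AI easily.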

Testing Gooey AI's animation capabilities and initial UI placement

Exploration with ChatGPT

Next, the focus shifted to experimenting with the GPT-3 model's output to incorporate specific personality traits and styles. The aim was to make the AI's responses more human-like, with adjustments for sarcasm, use of slang, and even mimicking personalities such as Anthony Jeselnik. This step involved understanding and exploiting GPT-3's versatility.

Testing Prompt

"Respond to the question like Anthony Jeselnik with VERY dark humor, EXTRA FUNNY, EXTRA sarcastic, stand up comedy style, using POPULAR slang words."

Input

"How are you doing today?"

Result

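GPT-3 response example 1: Oh, I'm great, my friend. I mean, who doesn't love being surrounded by constant reminders of our own mortality? Life is nothing but an endless parade of pain and disappointment. But hey, I guess I can't complain, right?

GPT-3 response example 2: Since you're so eager to know, how many times have you cried today? Oh, and don't forget, how many existential crises have you had since breakfast? Let's keep things light and cheerful, shall we?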
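As a rough illustration of how the test prompt and input above could be combined into a single request, here is a minimal sketch in JavaScript. It only builds a request body in the message format used by OpenAI's Chat Completions API and sends nothing. The function name buildChatRequest, the choice to put the persona prompt in the "system" role, and the model name are assumptions for illustration, not details taken from the project.

```javascript
// Persona prompt copied from the test prompt in this section.
const JESELNIK_PROMPT =
  "Respond to the question like Anthony Jeselnik with VERY dark humor, " +
  "EXTRA FUNNY, EXTRA sarcastic, stand up comedy style, using POPULAR slang words.";

// Build a request body in the Chat Completions message format.
// Assumption: the model name is a placeholder; the write-up only says "GPT-3".
function buildChatRequest(personaPrompt, userQuestion, model = "gpt-3.5-turbo") {
  return {
    model,
    messages: [
      { role: "system", content: personaPrompt }, // sets personality and style
      { role: "user", content: userQuestion },    // the question to answer
    ],
  };
}

const request = buildChatRequest(JESELNIK_PROMPT, "How are you doing today?");
console.log(JSON.stringify(request, null, 2));
```

Keeping the persona text separate from the user's question means the same question can be replayed against different personalities without rewriting it.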

Integration and Final Coding

The final stage revolved around integrating the ChatGPT and GooeyAI APIs and synchronizing them to drive an animated PNG avatar. Overcoming various technical challenges during this stage strengthened my problem-solving abilities and expanded my technical skills.
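The two-step flow described here (first a text reply, then an animation built from that reply) might be sketched as follows. This is not the project's actual code: answerWithAvatar, generateReply, and animateAvatar are hypothetical names, and the two services are replaced with stubs so the sketch runs without network access or API keys. In a real build, those stubs would wrap the ChatGPT and Gooey AI calls.

```javascript
// Hypothetical pipeline: question -> text reply -> animated avatar clip.
// `generateReply` and `animateAvatar` stand in for the real API calls,
// which are not shown in this write-up.
async function answerWithAvatar(question, persona, { generateReply, animateAvatar }) {
  // Step 1: get the text reply in the chosen persona's style.
  const replyText = await generateReply(persona.prompt, question);
  // Step 2: only once the text exists, request the matching animation.
  const clipUrl = await animateAvatar(persona.avatarImage, replyText);
  return { replyText, clipUrl };
}

// Stub services (assumptions) so the sketch is runnable offline.
const stubServices = {
  generateReply: async (prompt, question) => `(${prompt}) reply to: ${question}`,
  animateAvatar: async (image, text) =>
    `https://example.invalid/clips/${image}?chars=${text.length}`,
};

const persona = { prompt: "sarcastic", avatarImage: "avatar.png" };
const pending = answerWithAvatar("How are you doing today?", persona, stubServices);
pending.then((result) => console.log(result));
```

Passing the services in as arguments keeps the ordering logic testable on its own and makes it easy to swap either provider later.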

Additionally, I implemented a "CHANGE" button to allow users to switch the "personality" or "individual" they are interacting with.

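One simple way a "CHANGE" button could cycle through personalities is to keep a list of personas and advance an index with wrap-around. The sketch below shows only that logic. The persona names, prompts, and the nextPersonaIndex helper are hypothetical and are not taken from the project.

```javascript
// Hypothetical persona list; names and prompts are illustrative only.
const PERSONAS = [
  { name: "Dark comedian", prompt: "Respond with very dark, sarcastic humor." },
  { name: "Friendly guide", prompt: "Respond warmly and plainly." },
];

// Return the index of the persona after `currentIndex`,
// wrapping back to the first persona after the last one.
function nextPersonaIndex(currentIndex, personas = PERSONAS) {
  return (currentIndex + 1) % personas.length;
}

// Simulate two presses of the "CHANGE" button.
let current = 0;
current = nextPersonaIndex(current); // moves to index 1
current = nextPersonaIndex(current); // wraps back to index 0
console.log(PERSONAS[current].name);
```

In a browser, the same function could be called from the button's click handler, with the selected persona's prompt used for the next question.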

Reflection

Learning Experience
This project significantly deepened my understanding of AI and machine learning principles. I gained hands-on experience with API integration and honed my prompt engineering skills. I also explored the capabilities of language models like ChatGPT and improved my full-stack development skills by working with JavaScript and Node.js.

Project Insights

  • Creation of AI Speech Avatar Prototype: The effective combination of various APIs led to the development of the company's first-ever AI-generated speech avatar prototype. This opens new avenues for user interaction and engagement and diversifies the company's AI product line.

  • Enhanced Conversational Personality: The adjustments I made in prompt engineering refined the AI's conversational style to exhibit more customized characteristics, such as sarcasm, deviating from the typical bot-like responses. This unique approach has the potential to elevate user satisfaction and engagement over time.

  • Scalability and Iteration: By working hands-on across both the front end and the back end, I laid groundwork that makes the system easier to scale and to change in small steps. This should make it simpler to add new features and keep the system adaptable and stable over the long term.
