ChatGPT is a conversational AI system that has attracted a great deal of attention recently.
Several functional requirements must be considered when designing it, such as creating, updating, viewing, and deleting conversations. Additionally, letting users rate a response with a thumbs up or thumbs down provides feedback that can help retrain the model.
Text-based inputs in English are assumed, and inputs go through a sanitization phase to remove profanity and detect insults.
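The sanitization step can be sketched as a simple filter. This is a minimal illustration assuming a static blocklist (`BLOCKED_WORDS` is hypothetical); a production system would use a maintained profanity/insult classifier instead.

```python
import re

# Hypothetical blocklist; a real system would use a maintained
# profanity/insult classifier rather than a static word list.
BLOCKED_WORDS = {"badword1", "badword2"}

def sanitize(message: str) -> str:
    """Mask blocked words and collapse whitespace before the
    message reaches the model."""
    def mask(match: re.Match) -> str:
        word = match.group(0)
        return "*" * len(word) if word.lower() in BLOCKED_WORDS else word

    cleaned = re.sub(r"[A-Za-z0-9']+", mask, message)
    return " ".join(cleaned.split())
```

In practice this filter would run in the conversation service before the message is forwarded to the model.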
Non-functional requirements, such as latency, security, and scalability, are also essential. Server response latency can be high because back-end processing (model inference) is expensive. Login flows and rate limiting can mitigate DDoS attempts aimed at the system, ensuring its safety.
Moreover, the rate limiter must itself be scalable, enabling the system to serve many users simultaneously without a hitch. A scalable database that can handle this storage must be designed, with NoSQL being a good fit.
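The rate limiting mentioned above is often implemented with a token bucket. Below is a minimal in-process sketch; a scalable deployment would keep the bucket state in a shared store such as Redis, keyed per user, rather than in memory.

```python
import time

class TokenBucket:
    """Per-user token-bucket rate limiter: allows `rate` requests
    per second with bursts of up to `capacity` requests."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A gateway would hold one bucket per user ID and reject requests (HTTP 429) when `allow()` returns `False`.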
The average message size is estimated at 100 bytes; 200 million messages per day then translate to 20 GB per day, roughly 7.3 terabytes per year, and about 73 terabytes over ten years.
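The back-of-the-envelope arithmetic works out as follows (using decimal units, i.e. 1 TB = 10^12 bytes):

```python
MESSAGE_SIZE_BYTES = 100
MESSAGES_PER_DAY = 200_000_000

daily_bytes = MESSAGE_SIZE_BYTES * MESSAGES_PER_DAY   # 20 GB per day
yearly_tb = daily_bytes * 365 / 1e12                  # ~7.3 TB per year
ten_year_tb = yearly_tb * 10                          # ~73 TB over ten years
```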
The high-level design of the system should include a conversation service that manages the conversation between users and the ChatGPT model, the core part of the system. Before a message reaches the model, it is sanitized and analyzed to ensure it meets the stipulated standards.
The model's responses are sent back and stored in a database, which keeps each conversation. Finally, a thumbs-down rating signals that the model may need retraining, and a risk model checks the legitimacy of user ratings (for example, filtering out spam or coordinated abuse).
The conversation service is a REST API, with operations such as creating, deleting, and viewing a conversation and sending a message. Each conversation has a unique ID, and each message within a conversation has its own ID.
The user can rate each message with a thumbs up or thumbs down.
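The operations above can be sketched as an in-memory service. This is an illustrative skeleton, not the actual API: the class and method names are hypothetical, and a real deployment would expose them as REST endpoints (e.g. `POST /conversations`, `POST /conversations/{id}/messages`) backed by the database.

```python
from dataclasses import dataclass, field
from typing import Optional
from uuid import uuid4

@dataclass
class Message:
    text: str
    author: str
    message_id: str = field(default_factory=lambda: str(uuid4()))
    rating: Optional[str] = None  # "up", "down", or None

@dataclass
class Conversation:
    conversation_id: str = field(default_factory=lambda: str(uuid4()))
    messages: list = field(default_factory=list)

class ConversationService:
    """In-memory sketch of the conversation service's operations."""

    def __init__(self):
        self.conversations = {}

    def create_conversation(self) -> Conversation:
        conv = Conversation()
        self.conversations[conv.conversation_id] = conv
        return conv

    def send_message(self, conversation_id: str, text: str, author: str) -> Message:
        msg = Message(text=text, author=author)
        self.conversations[conversation_id].messages.append(msg)
        return msg

    def rate_message(self, conversation_id: str, message_id: str, rating: str) -> None:
        for msg in self.conversations[conversation_id].messages:
            if msg.message_id == message_id:
                msg.rating = rating

    def delete_conversation(self, conversation_id: str) -> None:
        del self.conversations[conversation_id]
```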
The data is stored in a NoSQL database, where the conversation table contains various conversation IDs, while each message contains an ID, text, author, and parent.
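The document shapes might look like the following (field names are hypothetical). The `parent` field links each message to the message it replies to, so a conversation forms a tree, which is useful when a user edits a prompt or regenerates a reply.

```python
# Hypothetical NoSQL document shapes for the two collections.

conversation_doc = {
    "conversation_id": "c-123",
    "user_id": "u-42",
    "created_at": "2023-01-01T00:00:00Z",
}

message_doc = {
    "message_id": "m-2",
    "conversation_id": "c-123",
    "text": "Hello! How can I help?",
    "author": "assistant",
    "parent": "m-1",   # the user message this reply answers
    "rating": None,    # "up" / "down" once the user rates it
}
```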
A Transformer model is used, which predicts natural sequences of words one token at a time. The model is trained on internet data such as websites, books, and Wikipedia to provide semantically meaningful and grammatically correct replies.
The model can use multiple decoding strategies, such as greedy decoding, top-k sampling, or nucleus (top-p) sampling, often combined with a temperature parameter, to select the next token.
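These decoding strategies can be sketched over a raw logit vector. This is a minimal standalone illustration over a toy vocabulary, not a real model's decoding loop:

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(logits):
    """Pick the single highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

def top_k_sample(logits, k, temperature=1.0, rng=random):
    """Sample only from the k highest-scoring tokens."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top], temperature)
    return rng.choices(top, weights=probs)[0]

def nucleus_sample(logits, p, temperature=1.0, rng=random):
    """Sample from the smallest token set whose cumulative probability >= p."""
    probs = softmax(logits, temperature)
    order = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)
    nucleus, cum = [], 0.0
    for i in order:
        nucleus.append(i)
        cum += probs[i]
        if cum >= p:
            break
    weights = [probs[i] for i in nucleus]
    return rng.choices(nucleus, weights=weights)[0]
```

Greedy decoding is deterministic but repetitive; top-k and nucleus sampling trade a little determinism for more natural, varied text.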
A dataset of question-and-answer pairs is initially used to train the model.
However, since training the model on every possible question-and-answer combination is impossible, a reward model is designed to select the best responses.
The chatbot is then fine-tuned with reinforcement learning against this reward model: it is rewarded for giving appropriate responses and penalized for inappropriate ones.
The chatbot design is also expected to support different input and output formats, including images, audio, and video. Reward modeling ensures that even without an exhaustive question-and-answer dataset, the model can still provide accurate responses in natural language.
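One common way a reward model is used at inference time is best-of-n reranking: generate several candidate responses and keep the one the reward model scores highest. The sketch below is purely illustrative; the `reward` heuristic is a hypothetical stand-in for a trained neural reward model.

```python
def reward(response: str) -> float:
    """Hypothetical stand-in for a trained reward model:
    prefer fuller responses, penalize refusals (toy heuristic)."""
    score = min(len(response.split()), 20) / 20.0
    if "sorry, I can't help" in response:
        score -= 0.5
    return score

def pick_best(candidates) -> str:
    """Best-of-n reranking: keep the candidate the reward
    model scores highest."""
    return max(candidates, key=reward)
```

During RLHF training, the same reward signal is what the policy is optimized against instead of being used only to rerank.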
The reward model considers the emotion and tone of the question and answer and continuously trains itself to improve accuracy.
Designing ChatGPT requires considering functional requirements such as creating, updating, viewing, and deleting conversations and non-functional requirements such as latency, security, and scalability.
The system uses NoSQL for database storage and a Transformer model for ChatGPT, trained on internet data. The chatbot is built using supervised fine-tuning, a reward model, and reinforcement learning, ensuring that it provides accurate responses in natural language and supports different input and output formats.