AI Update: Emerging AI App Architecture Simplified, Training GPT-4 To Write In Your Style

Today's issue is a little different. Hope you find it useful to better understand the underlying technology.

The Emerging Architecture for Large Language Model Applications

Andreessen Horowitz released a great technical explanation of the latest evolution of the architecture for LLM applications. By that, we mean applications built on top of language models like ChatGPT, Claude, Vicuna, etc.

Because the article is highly technical, I had ChatGPT reimagine it so that a sixth grader could relate to the concepts. I think this is beneficial for both you and me to better understand what is going on behind the scenes.

The original article has links to all of the projects mentioned in the architecture.

Emerging LLM App Stack

The example stack is based on in-context learning, which is the most common design pattern for getting started.

Design Pattern: In-context learning

So, think about in-context learning as a way to train a robot to answer specific questions using only a small amount of relevant information, instead of having to read a whole book or multiple books about the topic. We use this technique when we use AI, like a chatbot, that helps answer questions about complex topics like legal matters, for example.

Imagine you have a massive bookshelf filled with books about the law. Now, let's say you want to know something very specific, like "What's the legal age to vote?". One way to get an answer could be to let the chatbot read every single book, which would take a lot of time and the bot might get tired or confused. This is where in-context learning comes in handy!

Instead of reading all the books, the chatbot (or Large Language Model, LLM for short) is going to select only the books or pages that are most likely to contain the answer to the specific question. It uses its knowledge to decide which books are relevant. This makes the process faster and more accurate!

This happens in three stages:

  1. Data preparation: All the information is split into smaller parts and stored in a way the chatbot can find it easily.

  2. Question asking: When you ask a question, the chatbot picks the most relevant pieces of information from the storage, adds some basic rules, and prepares a detailed question to be processed.

  3. Answering the question: The chatbot then uses all the selected information to provide an answer to your question.
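The three stages above can be sketched in a few lines of Python. Everything here is illustrative: the toy `relevance` score just counts shared words, where a real app would use embeddings and a vector database.

```python
def prepare(data, chunk_size=40):
    """Stage 1: split a long document into small, searchable chunks."""
    words = data.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def relevance(question, chunk):
    """Toy relevance score: count the words the question and chunk share."""
    return len(set(question.lower().split()) & set(chunk.lower().split()))

def build_prompt(question, chunks, top_k=2):
    """Stage 2: pick the most relevant chunks and wrap them in a prompt."""
    best = sorted(chunks, key=lambda c: relevance(question, c), reverse=True)[:top_k]
    context = "\n".join(best)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Stage 3 would send build_prompt(...) to the LLM and return its answer.
```

The bot never "reads every book": only the chunks that score highest against the question make it into the prompt.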

What's great about this method is that it's often easier and cheaper than trying to teach the chatbot everything about the topic from scratch. This approach also allows us to incorporate new data quickly.

The challenge is that the bot can only read a certain amount of information at once, which is like the bot's "reading window". Some researchers are trying to expand this window, but it takes more time and costs more money. Today, it would be very expensive for the bot to read thousands of pages at once.
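The "reading window" constraint boils down to a simple budget: keep adding retrieved chunks until the next one would no longer fit. A sketch, where the word-based `count_tokens` is a stand-in for a real tokenizer (actual models count subword tokens):

```python
def count_tokens(text):
    # Stand-in for a real tokenizer; real models count subword tokens.
    return len(text.split())

def fit_to_window(chunks, max_tokens):
    """Keep chunks, in relevance order, until the window is full."""
    selected, used = [], 0
    for chunk in chunks:
        cost = count_tokens(chunk)
        if used + cost > max_tokens:
            break
        selected.append(chunk)
        used += cost
    return selected
```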

This might seem complex, but in-context learning is a useful technique to solve problems in a smart and efficient way using AI. It's like asking a friend who is an expert in a specific topic a question, instead of trying to learn everything yourself!

Data Preprocessing/Embedding

Imagine you have a super-smart robot that you want to teach about a topic. You could provide it with loads of information like books, articles, charts, and lists (which are like PDFs, CSVs, or SQL tables in computer language). This is the robot's study material or the 'contextual data'. To help the robot study efficiently, you'd organize all this data in a certain way, which is what we call 'data-loading and transformation'.

Just like you would use shelves, folders, or boxes to organize your study material, developers use tools like Databricks or Airflow. Some even use special software, like LangChain or LlamaIndex, to do this. However, we believe this organizing method could be improved, and there's room for tools specifically made for organizing the robot's study material or LLM apps.

Once the robot has the study material, it needs to understand and remember it. One way is to summarize or represent this information using 'embeddings'. Most developers use a tool from OpenAI for this, and others use solutions like Cohere or Sentence Transformers from Hugging Face. Some even tailor-make these summaries for specific topics.

Now, imagine if your robot had to remember billions of summaries! It would need enormous storage space. In the computer world, we call this a 'vector database'. The most popular choice for this is Pinecone, kind of like a giant warehouse for the robot's knowledge.

There are many types of these warehouses. Some are open source (free to use and modify), like Weaviate, Vespa, and Qdrant. Others, like Chroma and Faiss, are more suited for small robots or testing. There's also pgvector, which is like an add-on to a regular warehouse and useful for those who prefer sticking to what they know.
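Under the hood, all of these "warehouses" answer the same question: given a query vector, which stored vectors are nearest? A toy in-memory version might look like the following; real systems like Pinecone or Weaviate add indexing, persistence, and scale on top of this idea.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (embedding, text) pairs

    def add(self, embedding, text):
        self.items.append((embedding, text))

    def query(self, embedding, top_k=1):
        """Return the texts whose embeddings are nearest to the query."""
        ranked = sorted(self.items, key=lambda it: cosine(embedding, it[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]
```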

The future will likely bring new and better storage options as companies work on their own solutions. But creating a high-performing warehouse in the cloud is a tough problem. We are curious to see whether there will be one or two popular choices in the future, just like in the real-world warehouse industry.

An interesting question is what will happen as robots get better at studying and can handle more material at once (the 'context window'). Some may think that summarizing or 'embedding' may become less important, as more data can just be added directly. But experts believe that the summarizing process will become even more important, as it allows the robot to study efficiently. We might see new types of summaries and warehouses designed to help with this in the future.

Prompt Construction/Retrieval

So, let's say you have a super smart robot that you want to give instructions or ask questions. This is what we call "prompting". Now, how you ask these questions or give instructions can make a big difference in how well your robot understands and answers.

At first, you might try simple prompts like direct questions or give some example answers. While this often works well, it might not be precise enough for real-world use.

The advanced level of prompting is like a ninja technique ("prompting jiu-jitsu") that helps the robot give more accurate responses. It does this by grounding its answers in solid information and using external context it wasn't trained on. There are many cool techniques like chain-of-thought, self-consistency, and tree-of-thoughts. Developers can use these in different ways for things like answering questions about a document or making chatbots.

Orchestration frameworks like LangChain and LlamaIndex help a lot here. Imagine these like a conductor of an orchestra, coordinating various parts to work together. They help manage the chain of prompts, work with external data sources, retrieve relevant data, and remember previous interactions. They also provide templates to make things easier. LangChain is one of the most popular ones, especially among hobbyists and startups.

While LangChain is still new, it's already being used to build real-world applications. Some developers prefer to use basic programming (raw Python) to have more control, but as these frameworks improve, we think more people will start using them, just like how web apps have evolved.
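The core service these frameworks provide can be approximated with a plain template: slot the retrieved context, the conversation history, and the new question into one prompt. This is a minimal sketch, not LangChain's actual API.

```python
# A bare-bones version of what orchestration frameworks automate:
# filling a prompt template with retrieved context and chat history.
TEMPLATE = """You are a helpful assistant. Use the context to answer.

Context:
{context}

Conversation so far:
{history}

User: {question}
Assistant:"""

def render_prompt(context, history, question):
    """Fill the template; context and history are lists of strings."""
    return TEMPLATE.format(
        context="\n".join(context),
        history="\n".join(history),
        question=question,
    )
```

Frameworks layer chaining, memory, and data connectors on top, but the final artifact sent to the model is still a string like this one.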

An interesting mention here is ChatGPT. Normally, it's an app, but it can also be used as an orchestration framework. It helps with managing prompts, remembering previous interactions, and fetching data. While it's not exactly the same as other orchestration tools, it could become a simpler alternative to building prompts in the future.

Prompt Execution/Inference

Let's imagine you're building a Lego structure with your friends. OpenAI is like the instruction manual that everyone starts with to create a cool Lego structure. It's easy to use and helps you build a wide variety of things.

Now, let's say you've built your Lego structure and you want to make it bigger or more complex. Here, you start looking at different ways you could enhance it:

  1. You might use a simpler instruction manual (like switching to gpt-3.5-turbo) that still gives a good structure but saves your Lego blocks and builds faster.

  2. You could also experiment with different instruction manuals (like Claude models) which could give you fast results and allow more custom designs.

  3. Or, you could try to mix and match ideas from other open source instruction manuals, especially useful when you have many friends helping you and you need to build quickly and cost-effectively.

For the last point, you could also look at platforms like Databricks or Anyscale, which are like guidebooks on how to enhance your Lego models. There are also services that help you easily access these open-source instructions (like Hugging Face and Replicate).

Now, while these open-source manuals are not as advanced as the OpenAI manual, they're catching up fast. The LLaMa models, for instance, have been a game-changer and have inspired a lot of new versions. As these improve, we can expect to see lots of new and creative Lego structures being built!
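Swapping "instruction manuals" is easier when the app talks to every model through one thin interface. A sketch, with `fake_gpt` and `fake_llama` as stand-ins for real API clients:

```python
def fake_gpt(prompt):
    # Stand-in for a hosted API call (e.g. an OpenAI client).
    return f"[gpt] {prompt}"

def fake_llama(prompt):
    # Stand-in for a self-hosted open-source model.
    return f"[llama] {prompt}"

BACKENDS = {"gpt": fake_gpt, "llama": fake_llama}

def complete(prompt, backend="gpt"):
    """One call site; the model behind it can change via config."""
    return BACKENDS[backend](prompt)
```

With this shape, moving from gpt-3.5-turbo to Claude or an open-source model is a one-line configuration change rather than a rewrite.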

While most builders focus on the design and construction of the Lego structure, some also pay attention to tools that help manage the process. For example, some use a system (like Redis) to remember certain steps to make building faster. Tools like Weights & Biases or PromptLayer are like notepads that help log, track, and evaluate how your Lego structure is turning out. This helps in improving your building process and choosing the right manual. Some are also creating tools to validate their Lego structures and ensure no one is cheating.
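The Redis trick mentioned above is essentially a cache keyed by the prompt: if the exact same question arrives twice, skip the expensive model call. An in-memory sketch (a real setup would use Redis with an expiry time):

```python
import hashlib

_cache = {}

def cached_complete(prompt, llm_call):
    """Return a cached answer when the exact prompt was seen before."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = llm_call(prompt)  # the expensive step
    return _cache[key]
```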

Lastly, your Lego structure needs a place to be displayed. This could be your home, a friend's house, or a Lego exhibit. Similarly, LLM apps need a place to be hosted. Common choices are like your home (Vercel or major cloud providers). New companies like Steamship are like Lego exhibits, providing a place to display your Lego structure, and even help enhance it! And then companies like Anyscale and Modal are like a friend's house, where you can display your model and the guide you used to build it.

What about agents?

Let's think about AI as a video game character. Right now, the character (AI) can talk really well and can understand a lot of things you say or ask. This is what the technology called a large language model, or LLM, does. You can even make the character say things or answer questions that are very new or fresh. It's really cool!

However, imagine if this character could do more than just talk. What if it could also solve complex puzzles, interact with the game world, or even learn from its past actions to get better at playing the game? This is what AI agent frameworks aim to do. AutoGPT is an example of this and it's being developed to make the character (or the AI) more autonomous, which means it can do more things on its own.

So you can think of AI agent frameworks as tools to make your game character smarter and more capable. They could become a major part of how we make these AI characters in the future, or even take over the whole process if they get really good!

However, there's a challenge. These agent frameworks are like brand-new video games. They can do some amazing things and have cool demos, but they're not fully ready yet. They're like a game in beta testing—it can be fun and shows a lot of promise, but might have bugs and doesn't always work as expected. But don't worry, scientists and engineers are keeping a close eye on these and are working hard to make them better and more reliable.
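The "more than just talk" idea boils down to a loop: pick a tool, run it, feed the result back in. The toy version below hard-codes the plan for clarity; real agent frameworks like AutoGPT have the LLM itself choose the next step based on previous observations.

```python
def calculator(expression):
    # Toy tool: evaluate arithmetic with builtins disabled.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def run_agent(plan):
    """Run a fixed list of (tool, input) steps, collecting each observation.

    A real agent would replace this fixed plan with an LLM call that
    decides the next (tool, input) pair from the observations so far.
    """
    observations = []
    for tool_name, tool_input in plan:
        observations.append(TOOLS[tool_name](tool_input))
    return observations
```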

Looking ahead

Okay, so imagine you're playing with a huge set of LEGO blocks. In the beginning, each LEGO block is a separate piece, and you have to fit them all together to build something awesome. This is similar to how traditional software was built, where developers have to code each part separately and then put them together to create an application.

Now imagine that you got a new LEGO set, but this time some of the blocks already come pre-assembled into bigger parts. This means you can build cool things much faster and easier because a lot of the work is already done. This is what pre-trained AI models are like. They're like a kit of pre-assembled LEGO parts that developers can use to build AI applications.

The important part is, just like you can build super cool stuff with the new LEGO set, developers can now build amazing AI applications using these pre-trained models. And the best part? They can do this really fast. It's like a major change in how we play with LEGOs, or in this case, build software.

The tools and ways we use to play with these new LEGO blocks (or use these AI models) are probably going to change and get even better over time. And just like how we would share new ways to play with LEGOs with our friends, we're going to share these new changes and improvements with everyone. So if you have any ideas or suggestions on how we can make things even better, we'd love to hear them!

Training GPT-4 To Write In Your Style

In the past, I would use a prompt to determine the tone, style, and structure based on a sample of writing. Last night, going through my Feedly feed of the latest AI posts on Reddit, I came across this thread on a new way to train the model.

Prompts to accomplish this (you must have a plethora of your own writing available to use for the data portion of these prompts):

1) I am going to give you a large volume of text that you will use to analyze my writing style. Do not do anything with this data until I tell you otherwise. For the remainder of the conversation, you will take the role of a human psychology professional who specializes in speech. You will value aiding me in achieving a comprehensive breakdown of my data in the ways I request. If you understand this then ask me for the data. After each analysis ask me what we will do next and when I give you another command you will use the data I am about to give you to perform the analysis of the next question. Repeat this until I tell you I am done using very specific wording which will be "I am done giving you analysis requests." without the quotations.

(we feed GPT the text data here)

NOTE: In my testing, ChatGPT basically said it could not perform #2. Prompts 3, 4, and 5 all provided some good data to be used to develop a style.

2) I want you to break it down into a list of how frequently I use each word. The list should be separated by commas and should be sorted from most used to least used. Remove words not in the top 1000 words in the English language. Remove specialized words that are specific to niches that the average person would not understand. Label the list, without quotation marks, as "Words in priority order of usage:" If you understand this and can do so in a way that would assist me in developing a data-set that would aid in an AI tool's ability to write exactly like me then produce the list as requested. If you do not feel this will aid in the goal I have just outlined then say so.

3) I want you to make a list of the following: Idioms found in the data, mannerisms found in the data, misspellings found in the data, and niche words used in the data. Label the list, without quotation marks, as "idioms and mannerisms found within my speech:" Formulate another list, following this one, that specifies the personality traits that your psychology background in speech would identify as prominent. List this using commas to separate each personality trait. Label the list, without quotation marks, as "personality traits found in my writing:" If you understand this and can do so in a way that would assist me in developing a data-set that would aid in an AI tool's ability to write exactly like me then produce the list as requested. If you do not feel this will aid in the goal I have just outlined then say so.

4) I want you to identify specific words I use to represent my individuality and the encompassing speech patterns and create a list of those words. Label the list, without quotation marks, as "specific traits of my speech:" If you understand this and can do so in a way that would assist me in developing a data-set that would aid in an AI tool's ability to write exactly like me then produce the list as requested. If you do not feel this will aid in the goal I have just outlined then say so.

5) I want you to use your knowledge of interpersonal communications to identify specific traits in my speech. Use your professional knowledge to identify which aspects and criteria are important for this. Label lists appropriately based off of their content. After doing so, output the lists for me in an easily readable format.

Data analysis is done at this point. Create a new chat with GPT4 and input this:

“I want you to assume the role of a language tool that will take data meant to specify details of a personality, writing style, etc. and I want you to use your expert knowledge on human psychology to process this list for future responses. Future responses will be written using the writing style you decide is appropriate based off of the data I give you. The data I give you is meant to aid in you writing in a specific style. Do not use the data to write with but instead use it to formulate an idea of how you should write based off of the data. You should not use artistic liberties outside of writing creativity. Writing creativity should be based off of the data and the encompassing writing style that arises from using the data. If you understand and can do this then please ask me for the data that you will use to figure out the writing style for future responses.”

My plan is now to implement this set of prompts in a Make automation and feed it content from myself and clients to develop writing styles based on their existing content, much like what we are doing for long-form content. Basically, we need to take long-form content, break it up, and feed it through this series of prompts.
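The "break it up and feed it through" step can be sketched as follows. `ANALYSIS_PROMPTS` abbreviates the prompts above, and `send` is a placeholder for whatever delivers each message to the model (a Make scenario, an API client, etc.):

```python
def chunk_text(text, max_words=500):
    """Split long-form content into pieces small enough to paste into a prompt."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

ANALYSIS_PROMPTS = [
    "I am going to give you a large volume of text...",                  # prompt 1 (abbreviated)
    "I want you to make a list of the following: idioms...",             # prompt 3
    "I want you to identify specific words I use...",                    # prompt 4
    "I want you to use your knowledge of interpersonal communications...",  # prompt 5
]

def run_style_extraction(text, send):
    """Send the setup prompt, each content chunk, then each analysis prompt."""
    replies = [send(ANALYSIS_PROMPTS[0])]
    for chunk in chunk_text(text):
        replies.append(send(chunk))
    for prompt in ANALYSIS_PROMPTS[1:]:
        replies.append(send(prompt))
    replies.append(send("I am done giving you analysis requests."))
    return replies
```

The collected replies are the style profile you would then paste into the fresh GPT-4 chat described next.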

Example Output From The Style Extraction Above

From Prompt #3

Idioms and mannerisms found within my speech:

Idioms:

  1. "Take a step back" - to review or reassess a situation or approach.

  2. "Carry the torch" - to continue or take over something that is thought to be important.

Mannerisms:

  1. Usage of rhetorical questions.

  2. Frequent use of direct address ("you", "Hey everybody").

  3. Usage of informal and colloquial language ("Yeah", "I'm").

  4. Regularly using terms related to sales and sales psychology ("Yes Momentum Sales Technique", "Attitude Type Closing Sale").

Misspellings: There were no notable misspellings in the provided text.

Niche words:

  1. Yes Momentum - A sales technique.

  2. Investigative Selling - The title of a sales book.

  3. Tom Hopkins - A renowned sales expert and author.

  4. Zig Ziglar - An American author, salesman, and motivational speaker.

Personality traits found in my writing:

  1. Enthusiastic: Your text is full of energy, excitement and passion about your subject matter.

  2. Engaging: You have a way of writing that actively draws in the reader, through use of direct address and rhetorical questions.

  3. Knowledgeable: You demonstrate extensive knowledge about your subject, citing specific individuals, techniques, and writings in the field of sales.

  4. Reflective: You often look back at your experiences and learnings from the past, applying them to present contexts.

  5. Persuasive: Your use of sales terminology and techniques, as well as your overall positive and confident tone, contribute to a persuasive writing style.

  6. Organized: You present information in a clear, step-by-step manner, making your thoughts and insights easy to follow.

From Prompt #4

Specific traits of my speech:

  1. "Hey everybody" - This greeting at the start sets a friendly, welcoming tone.

  2. "I'm here and I'm ready" - Conveys your eagerness and readiness to share your knowledge and ideas.

  3. "I love" - This expression of passion is repeated several times, emphasizing your enthusiasm for your work.

  4. "I reflected on" - Shows a thoughtful, contemplative aspect of your character.

  5. "I've learned the most from" - Suggests an appreciation for learning and mentorship.

  6. "Sales" - A recurring keyword in your speech that highlights your professional focus.

  7. "Yes Momentum" - A distinctive phrase used to describe a sales technique that you explain in detail.

  8. "You've got to" - Your use of this phrase suggests a direct and candid communication style, often used to emphasize important points or advice.

  9. "Always, smile, upbeat attitude" - This trio of words captures your positive outlook and approach.

From Prompt #5

Aspects of Openness:

  1. "Hey everybody" - You start with a welcoming and open greeting.

  2. "Thanks for showing up" - Appreciation towards others for their time.

Aspects of Empathy:

  1. "I reflected on some of my mentors" - Shows your ability to understand and share the feelings of others.

  2. "It's important to realize" - Acknowledging the importance of understanding changes in society and behaviors.

Aspects of Supportiveness:

  1. "I'm ready to deliver something that's going to be quite fun today" - Shows an encouraging and positive approach towards your audience.

  2. "I've got this book here because there's a few things that I've...dog eared to go over to cover for you" - Indicates your willingness to provide resources for your audience.

Aspects of Positiveness:

  1. "Always, smile, upbeat attitude" - A recurring theme, emphasizing your positive outlook.

  2. "You've got to be prepared, you've got to be ready" - Shows your belief in the power of preparation and positive attitude.

Aspects of Equality:

  1. "Don't know who originally started talking about that" - Shows your humility and openness to ideas from others.

Your interpersonal communication traits can be summarized as follows:

  1. Openness: You are welcoming and show an open-minded attitude.

  2. Empathy: You can understand and relate to the experiences and feelings of others.

  3. Supportiveness: You're encouraging and offer resources and guidance.

  4. Positiveness: You maintain a positive and upbeat attitude.

  5. Equality: You show humility and an appreciation for the contributions of others.

Images Of The Day

Prompt: An 1800s wet plate ambrotype closeup of bill gates in Seattle. Sepia toned, grainy, imperfections. --ar 3:2 --style raw

Sincerely, How Did We Do With This Issue?

I would really appreciate your feedback to make this newsletter better...


Today’s issue was a little different. Understanding the underlying architecture was something I personally wanted to do, and I hope you found the simplification useful as well.

I will follow up later this week on my progress using the method from Reddit to develop a writing style.

Until Tomorrow,
Kevin Davis