Table of Contents
1. ChatGPT: 30 Year History
YouTube: ChatGPT - 30 Year History - How AI Learned to Talk
Recurrent neural network:
- Serial Order: A Parallel Distributed Processing Approach, 1986. First paper on sequential learning.
- Maps a symbol sequence to a symbol sequence; trained by predicting the next symbol (see the sketch after this list).
- Can also be used to generate sequences -> trajectory patterns.
- Finding Structure in Time, 1990: a larger network trained on language.
- Observed word-boundary detection and word clustering -> semantics?
- Generating Text with Recurrent Neural Networks, 2011, pushes the experiment further.
- Word compression in language understanding.
- The Unreasonable Effectiveness of Recurrent Neural Networks, 2015 (Andrej Karpathy’s blog post): the first “large” language models, building on the Hinton/Sutskever line of work.
- Learning to Generate Reviews and Discovering Sentiment, 2017 (Ilya Sutskever et al., OpenAI), trained on Amazon reviews (a larger dataset).
- A sentiment neuron emerges just from training to predict the next word.
- Model size limits performance.
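A minimal sketch of the recipe these RNN papers share, assuming PyTorch: embed symbols, run a tanh recurrence, and train on next-symbol prediction. The toy string, sizes, and step count are illustrative assumptions, not details from the video.

```python
# Minimal character-level RNN trained to predict the next symbol (sketch).
import torch
import torch.nn as nn

text = "the quick brown fox jumps over the lazy dog "
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
ids = torch.tensor([stoi[c] for c in text])

class CharRNN(nn.Module):
    def __init__(self, vocab, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.rnn = nn.RNN(hidden, hidden, batch_first=True)  # tanh recurrence, Elman-style
        self.head = nn.Linear(hidden, vocab)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.head(h)  # logits for the next symbol at every position

model = CharRNN(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x, y = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)  # inputs vs. next-symbol targets

for step in range(200):  # tiny loop, just enough to overfit this toy string
    logits = model(x)
    loss = nn.functional.cross_entropy(logits.reshape(-1, len(chars)), y.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"final next-char loss: {loss.item():.3f}")
```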
Transformer (encoder-decoder):
- Large RNNs are impractical to train.
- Attention Is All You Need, 2017: attention as adaptive connections (sketched below).
- Leads to shallower, wider networks that are practical to train.
- Looks at everything all at once, with no need for internal memory.
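The “adaptive connection” is scaled dot-product attention: every position mixes information from every other position in one matrix operation, so no recurrent memory is needed. A minimal NumPy sketch with toy shapes (single head, no masking or projections):

```python
# Scaled dot-product attention from "Attention Is All You Need" (sketch).
import numpy as np

def attention(Q, K, V):
    """Every position attends to every other in one shot: no recurrent state."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])        # (T, T) pairwise similarities
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)      # softmax over the whole sequence
    return weights @ V                             # adaptive mix of all positions

T, d = 5, 8                      # sequence length, model width (toy values)
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))
print(attention(Q, K, V).shape)  # (5, 8): each position sees all 5 at once
```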
GPT-1 : Improving Language Understanding by Generative Pre-Training, 2018. Uses a transformer for next-word prediction; trained on ~7,000 books; shows zero-shot behaviors.
GPT-2 : Language Models are Unsupervised Multitask Learners, 2019. Same approach; data from links shared on Reddit (WebText); a much larger network. But it still drifts into nonsense after many sentences.
GPT-3 : same approach, 100x bigger network (175 billion parameters), trained on web-scale internet data. In-context learning during inference (see the generation sketch below).
ChatGPT : shapes the network to better follow human instructions (instruction tuning and RLHF).
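All of these models share one inference loop: predict a distribution over the next token, sample it, append it, repeat. In-context learning simply means the prompt conditions every later prediction. A hedged sketch of that loop; `toy_logits` is a stand-in for a real trained transformer, not an actual GPT API:

```python
# Autoregressive generation loop (sketch with a placeholder model).
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 16

def toy_logits(context):
    """Placeholder for a trained model; real GPTs return logits over ~50k tokens."""
    return rng.normal(size=VOCAB) + np.bincount(context, minlength=VOCAB)

def generate(prompt, steps=10, temperature=0.8):
    tokens = list(prompt)                  # the prompt conditions every step below
    for _ in range(steps):
        logits = toy_logits(tokens) / temperature
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()               # softmax -> next-token distribution
        tokens.append(int(rng.choice(VOCAB, p=probs)))  # sample, append, repeat
    return tokens

print(generate([1, 2, 3]))
```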
2. ChatGPT - Higher Level Discussions
ChatGPT is no Stochastic Parrot. But it also Claims that 1 is Greater than 1, 2023. Game changer, but not real intelligence.
- intelligent “behavior” != “intelligence”.
- ChatGPT is much more than a stochastic parrot. It can generate novel propositional content and respond to arbitrary questions and scenarios coherently and informatively.
- Testing with more questions:
- From the example of the “happy Italian”, we found that ChatGPT can describe modus ponens but is not able to apply it properly (the rule itself is shown below). ChatGPT needs correct human guidance, as in the LLM discussion of P != NP.
- ChatGPT cannot handle spatial layout and ordering relations.
- ChatGPT works poorly on math problems. (Apparently LLM4Science does a better job.)
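For reference, modus ponens, the rule probed by the “happy Italian” example, is the inference “from P implies Q and P, conclude Q”. In Lean it is literally one function application:

```lean
-- Modus ponens: from a proof of P → Q and a proof of P, conclude Q.
theorem modus_ponens (P Q : Prop) (h : P → Q) (hp : P) : Q := h hp
```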
- What is understanding: one understands something when one knows how to correct and improve it. For example, writing an essay on Descartes’ Meditations is not to summarize what has already been said, but to:
- take the electronic text of one of the Meditations and try to improve its translation into English (thus one learns to check the original);
- clarify the less clear passages with a more accessible paraphrase (thus one sees if one has really understood the text);
- try to criticise or refine the arguments, changing or strengthening them (thus one realizes that others have tried to do the same, and that it is not so easy);
- and while doing all this, learn the nature, internal structure, dynamics and mechanisms of the content on which one is working.
- LLMs are fragile.
- LLMs are not stochastic parrots. LLMs synthesize texts in new ways, restructuring the contents on which they have been trained, not providing simple repetitions or juxtapositions. LLMs are like tricksters: they gobble data in astronomical quantities and regurgitate (what looks to us like) information.
- Other issues (consider DALL-E, text to image generator):
- copyright and reproduction rights of the data used in training.
- mental health problems caused by harmful content.
- security considerations.
- financial and environmental costs of new systems.
- AI-to-AI bridges: Socratic Models, Wolfram Alpha & ChatGPT.
Large Language Model for Science: A Study on P vs. NP.
New Theory Suggests Chatbots Can Understand Text
- It’s important that AI scientists reach consensus on risks, much as climate scientists have a rough consensus on climate change, in order to shape good policy. Conversation: Geoff Hinton with Andrew Ng.
- Random graphs, which give rise to unexpected behaviors once they cross certain thresholds, could be a way to model the behavior of LLMs (see the simulation sketch below).
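The threshold behavior is classical for Erdős-Rényi graphs G(n, p): the largest connected component stays tiny until p crosses roughly 1/n, then a giant component appears abruptly. A self-contained simulation with toy parameters:

```python
# Giant-component threshold in an Erdos-Renyi random graph (sketch).
import random

def largest_component(n, p, rng):
    """Union-find over random edges; returns the largest component's size."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:           # include edge (i, j) with probability p
                parent[find(i)] = find(j)
    sizes = {}
    for i in range(n):
        r = find(i)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values())

rng = random.Random(0)
n = 500
for c in (0.5, 1.0, 1.5, 2.0):             # edge probability p = c / n
    print(f"c={c}: largest component = {largest_component(n, c / n, rng)}")
# The size jumps sharply once c passes 1: an emergence-like threshold.
```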
3. Algorithm Papers
LLaVA: Large Language and Vision Assistant, 2023. Open-source LLM trained with visual instruction tuning; works better for visual instruction following.