The end of summer is here, and it's time for another update from the Flightdeck. Did you know you can get these updates in your inbox every two weeks? Sign up for the newsletter below.
Added support for logging
We have added support for error and debug logging to the Seaplane platform (after many of you asked!). Previously, you had to work with us directly (in other words, reach out to me on Slack) to get your latest debug log. Not the greatest developer experience, but in the words of Paul Graham: “Do things that don’t scale.”
But no longer! You can now access your log stream (STDOUT and STDERR) directly from your shell using the new plane command.
While it may seem like a small feature, it’s an important step toward full self-service usage of Seaplane, which is coming in early 2024!
Docusaurus Chatbot Demo
Two weeks ago, we showed you Seaplane GPT, a chatbot with knowledge of our documentation pages. We have since open-sourced the project on GitHub. If you or someone you know has a Docusaurus-based developer portal, check it out and build a GPT-style chatbot with knowledge of your product in minutes.
In this new section of our newsletter, we highlight interesting industry articles, research, and other items we think are worth delving into.
The transformer killer (yet again)
Microsoft's RetNet has emerged as a promising innovation, named the transformer killer by some. It aims to address some of the key challenges faced by existing models without overwhelming complexity.
RetNet focuses on achieving a balance between three crucial aspects:
- Training Parallelism: RetNet combines parallel training from Transformers with a new recurrent inference approach. This not only optimizes GPU usage but also speeds up training.
- Inference Cost and Memory Complexity: RetNet introduces a retention mechanism that substantially reduces inference costs and memory requirements compared to Transformers. It linearizes memory complexity, offering a more efficient alternative.
- Performance: Remarkably, RetNet delivers language modeling performance on par with or even better than Transformers, thanks to its innovative retention mechanism.
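To make the memory argument concrete, here is a rough back-of-the-envelope sketch. All sizes here (head dimension, head count, fp16 storage) are illustrative numbers we picked ourselves, not figures from the paper: a Transformer's KV cache grows linearly with the number of generated tokens, while RetNet's recurrent state stays a fixed size.

```python
# Rough memory comparison during autoregressive decoding (illustrative numbers,
# single layer). A Transformer caches keys and values for every past token, so
# its cache grows linearly with sequence length; RetNet keeps one fixed-size
# d x d state matrix per head, independent of how many tokens came before.

d_head = 64          # head dimension (illustrative)
n_heads = 32         # number of heads (illustrative)
bytes_per_float = 2  # fp16

def transformer_kv_cache_bytes(seq_len):
    # keys + values: 2 tensors of shape (seq_len, d_head) per head
    return 2 * seq_len * d_head * n_heads * bytes_per_float

def retnet_state_bytes():
    # one (d_head x d_head) state matrix per head, regardless of seq_len
    return d_head * d_head * n_heads * bytes_per_float

for seq_len in (1_000, 10_000, 100_000):
    print(f"seq_len={seq_len:>7}: "
          f"transformer KV cache {transformer_kv_cache_bytes(seq_len):>12,} B, "
          f"retnet state {retnet_state_bytes():>9,} B")
```

At 100,000 tokens the cache is hundreds of megabytes per layer while the recurrent state is unchanged from token one; that is what "linearized memory complexity" buys you.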
At the core of RetNet is a multi-scale retention mechanism, replacing multi-head attention from Transformers. This change streamlines language modeling without introducing excessive computational demands.
RetNet employs a parallel representation for training, similar to Transformers but with thoughtful optimizations for efficiency. During inference it switches to a recurrent representation that produces the same outputs as the parallel form but with significantly reduced memory demands.
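The equivalence of the two representations can be sketched in a few lines of NumPy. This is a toy single-head retention with random matrices standing in for learned projections, no gating or normalization, and a decay factor γ — a simplified sketch of the mechanism, not the full multi-scale version from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 8      # sequence length, head dimension (toy sizes)
gamma = 0.9      # per-head decay factor

# Random projections standing in for learned Q/K/V weight matrices.
X = rng.normal(size=(T, d))
Q = X @ rng.normal(size=(d, d))
K = X @ rng.normal(size=(d, d))
V = X @ rng.normal(size=(d, d))

# Parallel form (training): a decay-masked, attention-like matrix product.
# D[n, m] = gamma**(n - m) for m <= n, and 0 for future positions.
n_idx, m_idx = np.indices((T, T))
D = np.where(n_idx >= m_idx, gamma ** (n_idx - m_idx), 0.0)
out_parallel = (Q @ K.T * D) @ V

# Recurrent form (inference): one d x d state matrix, updated token by token,
# so memory is constant in sequence length.
S = np.zeros((d, d))
out_recurrent = np.zeros((T, d))
for n in range(T):
    S = gamma * S + np.outer(K[n], V[n])   # fold token n into the state
    out_recurrent[n] = Q[n] @ S            # read out with the current query

assert np.allclose(out_parallel, out_recurrent)  # identical outputs
```

The assertion passes because unrolling the recurrence gives exactly the decay-weighted sum the parallel form computes: the same math, evaluated two ways, which is why RetNet can train like a Transformer and decode like an RNN.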
Keep an eye on RetNet. It promises to enable faster, more memory-efficient language models that can match or even exceed the performance of current state-of-the-art models.
You can read more about it in the official paper: Retentive Network: A Successor to Transformer for Large Language Models.
For a more accessible explanation, we recommend the excellent blog post by Shantanu Chandra, “Retentive Networks (RetNet) Explained: The much-awaited Transformers-killer is here.”