• Does the DIFF Transformer make a Diff?
    Nov 9 2024

    This paper introduces Differential Transformer, a novel transformer architecture designed to improve the performance of large language models. The key innovation lies in its differential attention mechanism, which calculates attention scores as the difference between two separate softmax attention maps. This subtraction effectively cancels out irrelevant context (attention noise), enabling the model to focus on crucial information. The authors demonstrate that Differential Transformer outperforms traditional transformers in various tasks, including long-context modeling, key information retrieval, and hallucination mitigation. Furthermore, Differential Transformer exhibits greater robustness to order permutations in in-context learning and reduces activation outliers, paving the way for more efficient quantization. These advantages position Differential Transformer as a promising foundation architecture for future large language model development.

    Read the research here: https://arxiv.org/pdf/2410.05258
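
    Below is a minimal single-head sketch of the differential attention idea in PyTorch. It assumes a fixed scalar lambda for readability; the paper uses a learnable, reparameterized lambda plus per-head normalization, so treat this as an illustration of the subtraction trick rather than the authors' exact implementation. The projection modules Wq, Wk, Wv are ordinary nn.Linear layers supplied by the caller.

    import torch
    import torch.nn.functional as F

    def differential_attention(x, Wq, Wk, Wv, lam=0.8):
        # x: (batch, seq_len, d_model); Wq and Wk project to 2*d_head, Wv to d_head
        q1, q2 = Wq(x).chunk(2, dim=-1)   # two independent query maps
        k1, k2 = Wk(x).chunk(2, dim=-1)   # two independent key maps
        v = Wv(x)
        scale = q1.shape[-1] ** -0.5
        a1 = F.softmax(q1 @ k1.transpose(-2, -1) * scale, dim=-1)
        a2 = F.softmax(q2 @ k2.transpose(-2, -1) * scale, dim=-1)
        # Subtracting the second map cancels attention mass shared by both maps
        # (the "attention noise"), sharpening focus on the relevant tokens.
        return (a1 - lam * a2) @ v

    For example, with d_model = 512 and d_head = 64, Wq = nn.Linear(512, 128), Wk = nn.Linear(512, 128), and Wv = nn.Linear(512, 64) together form one differential attention head.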

    8 mins
  • Automating Scientific Discovery: ScienceAgentBench
    Nov 8 2024

    This paper introduces ScienceAgentBench, a new benchmark for evaluating language agents designed to automate scientific discovery. The benchmark comprises 102 tasks extracted from 44 peer-reviewed publications across four disciplines, encompassing essential tasks in a data-driven scientific workflow such as model development, data analysis, and visualization. To ensure scientific authenticity and real-world relevance, the tasks were validated by nine subject matter experts. The paper presents an array of evaluation metrics for assessing program execution, results, and costs, including a rubric-based approach for fine-grained evaluation. Through comprehensive experiments on five LLMs and three frameworks, the study found that the best-performing agent, Claude-3.5-Sonnet with self-debug, could solve only 34.3% of the tasks using expert-provided knowledge. These findings highlight the limitations of current language agents in fully automating scientific discovery, emphasizing the need for more rigorous assessment and future research on improving their capabilities for data processing and for utilizing expert knowledge.

    Read the paper: https://arxiv.org/pdf/2410.05080
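
    As a purely illustrative sketch of the execution-based scoring the summary describes (run a generated program, record whether it executes, then apply a task-specific check to its output), something along these lines captures the shape of the pipeline; the helper names and task format here are hypothetical, not the benchmark's actual interface.

    import subprocess

    def score_program(program_path, output_ok):
        """Run one candidate program; report execution success and a result score."""
        proc = subprocess.run(["python", program_path], capture_output=True,
                              text=True, timeout=600)
        executed = proc.returncode == 0
        # output_ok stands in for a task-specific check or rubric on the program's output
        result = output_ok(proc.stdout) if executed else 0.0
        return {"executed": executed, "result_score": result}

    # Toy usage: a "task" whose check simply looks for a reported metric in stdout.
    report = score_program("generated_analysis.py", lambda out: float("r2=" in out))
    print(report)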

    10 mins
  • Prune This! PyTorch and Efficient AI
    Nov 7 2024

    Both sources explain neural network pruning techniques in PyTorch. The first source, "How to Prune Neural Networks with PyTorch," provides a general overview of the pruning concept and its various methods, along with practical examples of how to implement different pruning techniques using PyTorch's built-in functions. The second source, "Pruning Tutorial," focuses on a more in-depth explanation of pruning functionalities within PyTorch, demonstrating how to prune individual modules, apply iterative pruning, serialize pruned models, and even extend PyTorch with custom pruning methods.

    Read this: https://towardsdatascience.com/how-to-prune-neural-networks-with-pytorch-ebef60316b91

    And the PyTorch tutorial: https://pytorch.org/tutorials/intermediate/pruning_tutorial.html
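
    A short sketch of the module-level workflow both sources walk through, using PyTorch's built-in torch.nn.utils.prune utilities (the specific layers and sparsity amounts here are just placeholders):

    import torch
    from torch import nn
    import torch.nn.utils.prune as prune

    model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

    # L1 unstructured pruning: zero out the 30% smallest-magnitude weights of the first layer.
    prune.l1_unstructured(model[0], name="weight", amount=0.3)
    print(model[0].weight_mask.mean())        # fraction of weights kept (~0.7)

    # Make the pruning permanent, removing the mask/reparametrization before serialization.
    prune.remove(model[0], "weight")

    # Global unstructured pruning across several layers at once.
    prune.global_unstructured(
        [(model[0], "weight"), (model[2], "weight")],
        pruning_method=prune.L1Unstructured,
        amount=0.2,
    )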

    8 mins
  • AlexWho? Going Deeper with Deep CNNs
    Nov 6 2024

    The source is a chapter from the book "Dive into Deep Learning" that explores the historical development of deep convolutional neural networks (CNNs), focusing on the foundational AlexNet architecture. The authors explain the challenges faced in training CNNs before the advent of AlexNet, including limited computing power, small datasets, and lack of crucial training techniques. They discuss how AlexNet overcame these obstacles by leveraging powerful GPUs, large-scale datasets like ImageNet, and innovative training strategies. The chapter also delves into the architecture of AlexNet, highlighting its similarities to LeNet, and comparing its advantages in terms of depth, activation function, and model complexity control. Finally, the authors emphasize the importance of AlexNet as a crucial step towards the development of the deep networks used today, showcasing its impact on the field of computer vision and deep learning.

    Read more: https://d2l.ai/chapter_convolutional-modern/alexnet.html
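
    For reference, a compact PyTorch rendition of the AlexNet layout the chapter presents (the d2l version uses a single-channel 224x224 input and 10 output classes rather than the original 3-channel ImageNet setup):

    import torch
    from torch import nn

    alexnet = nn.Sequential(
        nn.Conv2d(1, 96, kernel_size=11, stride=4, padding=1), nn.ReLU(),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Flatten(),
        # ReLU activations and dropout are two of AlexNet's key changes over LeNet
        nn.Linear(6400, 4096), nn.ReLU(), nn.Dropout(p=0.5),
        nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(p=0.5),
        nn.Linear(4096, 10),
    )

    print(alexnet(torch.randn(1, 1, 224, 224)).shape)   # torch.Size([1, 10])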

    12 mins
  • Predicting the Future from the Past: Sequential RNN Stuff
    Nov 5 2024

    This text is an excerpt from the "Dive into Deep Learning" book, specifically focusing on the processing of sequential data. The authors introduce the challenges of working with data that occurs in a specific order, like time series or text, and how these sequences cannot be treated as independent observations. They delve into autoregressive models, where future values are predicted based on past values, and highlight the common problem of error accumulation when predicting further into the future. The text discusses the concept of Markov models, where only a limited history is needed to predict future events, as well as the importance of understanding the causal structure of the data. The excerpt then provides a practical example of using linear regression for autoregressive modeling on synthetic time series data and demonstrates the limitations of simple models for long-term prediction.

    Read more: https://d2l.ai/chapter_recurrent-neural-networks/sequence.html
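
    A minimal sketch of that autoregressive setup: synthetic noisy sine-wave data, feature rows built from the previous tau observations, and a linear model trained to predict the next value (the hyperparameters below are illustrative, not the book's exact choices):

    import torch
    from torch import nn

    T, tau = 1000, 4
    time = torch.arange(1, T + 1, dtype=torch.float32)
    x = torch.sin(0.01 * time) + torch.normal(0, 0.2, (T,))        # noisy sine wave

    # Row t of features holds x[t], ..., x[t+tau-1]; the corresponding label is x[t+tau].
    features = torch.stack([x[i: T - tau + i] for i in range(tau)], dim=1)
    labels = x[tau:].reshape(-1, 1)

    model = nn.Linear(tau, 1)
    loss_fn = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for epoch in range(10):                        # train on the first 600 time steps
        optimizer.zero_grad()
        loss_fn(model(features[:600]), labels[:600]).backward()
        optimizer.step()

    # One-step-ahead prediction uses true past values and works well; k-step-ahead
    # prediction must feed the model's own outputs back in, which is where the
    # error accumulation discussed in the chapter comes from.
    onestep_preds = model(features[600:])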

    10 mins
  • Google's Secrets to Getting People to Adopt A.I.
    Nov 4 2024

    This excerpt from "Mental Models," a chapter in the "People + AI Guidebook," focuses on the importance of understanding and managing user mental models when designing AI-powered products. The authors discuss how to set expectations for adaptation, onboard users in stages, plan for co-learning, and account for user expectations of human-like interaction. By carefully considering these factors, product designers can ensure that users form accurate mental models and have a positive experience with AI-powered products.

    Read more here: https://pair.withgoogle.com/chapter/mental-models/

    9 mins
  • LLM Tokenizers, from HF's NLP Course
    Nov 1 2024

    This excerpt from Hugging Face's NLP course provides a comprehensive overview of tokenization techniques used in natural language processing. Tokenizers are essential tools for transforming raw text into numerical data that machine learning models can understand. The text explores various tokenization methods, including word-based, character-based, and subword tokenization, highlighting their advantages and disadvantages. It then focuses on the encoding process, where text is first split into tokens and then converted to input IDs. Finally, the text demonstrates how to decode input IDs back into human-readable text.

    Read more: https://huggingface.co/learn/nlp-course/en/chapter2/4
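
    The encode/decode round trip the chapter walks through looks roughly like this with the transformers library (the checkpoint and example sentence are chosen to match the course's BERT example):

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

    sequence = "Using a Transformer network is simple"
    tokens = tokenizer.tokenize(sequence)            # split into subword tokens, e.g. 'Trans', '##former'
    ids = tokenizer.convert_tokens_to_ids(tokens)    # map token strings to vocabulary indices
    decoded = tokenizer.decode(ids)                  # map indices back to readable text

    print(tokens)
    print(ids)
    print(decoded)    # "Using a Transformer network is simple"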

    12 mins
  • PyTorch vs TensorFlow: Who Wins at CNNs?
    Nov 1 2024

    This research paper examines the efficiency of two popular deep learning libraries, TensorFlow and PyTorch, in developing convolutional neural networks. The authors aim to determine if the choice of library impacts the overall performance of the system during training and design. They evaluate both libraries using six criteria: user-friendliness, available documentation, ease of integration, overall training time, overall accuracy, and execution time during evaluation. The paper proposes a novel methodology for comparing these libraries by eliminating external factors that could influence the comparison and focusing solely on the six chosen criteria. The study finds that while both libraries offer similar capabilities, PyTorch is better suited for tasks that prioritize speed and ease of use, while TensorFlow excels in tasks demanding accuracy and flexibility. The authors conclude that the choice of library has a significant impact on both design and performance and that the presented criteria can assist users in selecting the most appropriate library for their specific needs.

    Read more: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9699128/pdf/sensors-22-08872.pdf
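
    This is not the paper's harness, but a hedged illustration of how two of its criteria, overall training time and execution time during evaluation, can be measured on the PyTorch side; an equivalent tf.keras loop would produce the corresponding numbers for TensorFlow.

    import time
    import torch
    from torch import nn

    model = nn.Sequential(nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                          nn.Flatten(), nn.Linear(16 * 32 * 32, 10))
    data = torch.randn(256, 3, 32, 32)                 # dummy CIFAR-sized batch
    labels = torch.randint(0, 10, (256,))
    optimizer = torch.optim.Adam(model.parameters())
    loss_fn = nn.CrossEntropyLoss()

    start = time.perf_counter()
    for _ in range(10):                                # a few training steps
        optimizer.zero_grad()
        loss_fn(model(data), labels).backward()
        optimizer.step()
    train_time = time.perf_counter() - start

    start = time.perf_counter()
    with torch.no_grad():                              # evaluation-only forward pass
        model(data)
    eval_time = time.perf_counter() - start
    print(f"train: {train_time:.3f}s  eval: {eval_time:.3f}s")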

    12 mins