ModernBERT: A Leap Forward in Encoder-Only Models
ModernBERT emerges as a groundbreaking successor to the iconic BERT model, marking a significant leap forward in the domain of […]
ModernBERT: A Leap Forward in Encoder-Only Models Read More »
The Qwen2.5-1M series are the first open-source Qwen models capable of processing up to 1 million tokens. This leap in […]
Qwen2.5-1M: Million-Token Context Language Model Read More »
DeepSeek-R1 represents a significant advancement in the field of LLMs, particularly in enhancing reasoning capabilities through reinforcement learning (RL). This […]
DeepSeek-R1: How Reinforcement Learning is Driving LLM Innovation Read More »
Tabular data, the backbone of countless scientific fields and industries, has long been dominated by gradient-boosted decision trees. However, TabPFN […]
TabPFN: A Foundation Model for Tabular Data Read More »
Developed by researchers at Google Research, T5 (Text-to-Text Transfer Transformer) [paper] employs a unified text-to-text framework to facilitate various NLP […]
T5: Exploring Google’s Text-to-Text Transformer Read More »
Prompt engineering plays a crucial role in LLM performance. However, manual prompt engineering is a laborious and domain-specific process, demanding […]
PromptWizard: LLM Prompts Made Easy Read More »
Large Concept Models (LCMs) [paper] represent a significant evolution in NLP. Instead of focusing on individual words or subword tokens, […]
Large Concept Models (LCM): A Paradigm Shift in AI Read More »
This article summarizes the content of the source, “The Efficiency Spectrum of Large Language Models: An Algorithmic Survey,” focusing on […]
Pushing the Boundaries of LLM Efficiency: Algorithmic Advancements Read More »