TikTok Researchers Introduce SWE-Perf: The First Benchmark for Repository-Level Code Performance Optimization

As large language models (LLMs) advance in software engineering tasks—ranging from code generation to bug fixing—performance optimization remains an elusive frontier, especially at the repository level. To bridge this gap, researchers from TikTok and collaborating institutions have introduced SWE-Perf—the first benchmark specifically designed to evaluate the ability of LLMs to optimize code performance in…

A new paradigm for AI: How ‘thinking as optimization’ leads to better general-purpose models

Researchers at the University of Illinois Urbana-Champaign and the University of Virginia have developed a new model architecture that could lead to more robust AI systems with more powerful…

Computer and network-attached storage: Capacity optimization and backup expansion

Motivations for bolstering backup: Last March, I documented my travails striving to turn a 2018-era x86-based Apple Mac mini into a workable system in spite of its diminutive, non-upgradeable 128 GByte internal SSD. Eight months later (last November), I discussed how, motivated by the lightning-induced demise last summer of one of my network-attached storage (NAS)…

A Step-by-Step Coding Guide to Efficiently Fine-Tune Qwen3-14B Using Unsloth AI on Google Colab with Mixed Datasets and LoRA Optimization

Fine-tuning LLMs often requires extensive resources, time, and memory—challenges that can hinder rapid experimentation and deployment. Unsloth AI streamlines this process by enabling fast, efficient fine-tuning of state-of-the-art models like Qwen3-14B with minimal GPU memory, leveraging advanced techniques such as 4-bit quantization and LoRA (Low-Rank Adaptation). In this tutorial, we walk through a practical implementation…
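The memory savings of LoRA come from a simple idea that can be sketched without any framework: the pretrained weight matrix W stays frozen, and only a low-rank update B @ A (scaled by alpha/r) is trained, then optionally merged back into W. A minimal numpy sketch under illustrative shapes—this is the general LoRA math, not Unsloth's or Qwen3's actual API:

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 64, 64, 8, 16   # illustrative sizes; rank r << d

W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))                 # zero-init so the adapter starts as a no-op

def lora_forward(x):
    # frozen base path plus the scaled low-rank update; only A and B are trained
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)

# with B = 0 the adapted layer exactly matches the base layer
assert np.allclose(lora_forward(x), W @ x)

# after training, the adapter can be merged into a single matrix for inference
W_merged = W + (alpha / r) * (B @ A)
assert np.allclose(W_merged @ x, lora_forward(x))
```

Because only A and B (2·r·d parameters instead of d²) receive gradients, optimizer state shrinks dramatically—which is what lets tools like Unsloth fit 14B-parameter fine-tuning into a single Colab GPU when combined with 4-bit quantization of the frozen weights.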

NVIDIA AI Researchers Introduce FFN Fusion: A Novel Optimization Technique that Demonstrates How Sequential Computation in Large Language Models (LLMs) can be Effectively Parallelized

Large language models (LLMs) have become vital across domains, enabling high-performance applications such as natural language generation, scientific research, and conversational agents. Underneath these advancements lies the transformer architecture, where alternating layers of attention mechanisms and feed-forward networks (FFNs) sequentially process tokenized input. However, with an increase in size and complexity, the computational burden required…
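The linear-algebra core of fusing consecutive FFN blocks can be sketched directly: if two FFNs are computed in parallel and their outputs summed (the approximation that replaces their sequential composition), stacking the up-projection matrices and concatenating the down-projection matrices yields a single wider FFN that produces the identical result in one pair of matmuls. A hypothetical numpy sketch with made-up sizes, not NVIDIA's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
d, h = 32, 128  # model width and FFN hidden width (illustrative)

def relu(z):
    return np.maximum(z, 0.0)

# two FFN blocks, each computing x -> W2 @ relu(W1 @ x)
W1a, W2a = rng.standard_normal((h, d)), rng.standard_normal((d, h))
W1b, W2b = rng.standard_normal((h, d)), rng.standard_normal((d, h))

x = rng.standard_normal(d)

# parallel-sum form: both blocks applied to the same input, outputs added
parallel = W2a @ relu(W1a @ x) + W2b @ relu(W1b @ x)

# fused form: stack up-projections, concatenate down-projections,
# so one wide FFN computes both blocks in a single pass
W1_fused = np.vstack([W1a, W1b])   # shape (2h, d)
W2_fused = np.hstack([W2a, W2b])   # shape (d, 2h)
fused = W2_fused @ relu(W1_fused @ x)

assert np.allclose(parallel, fused)
```

The equivalence holds because the elementwise activation acts independently on the stacked hidden vector; the practical gain is that one large matmul saturates GPU hardware far better than several small sequential ones.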
