
Large language models

3 Questions: The pros and cons of synthetic data in AI
Synthetic data are artificially generated by algorithms to mimic the statistical properties of actual data, without containing any information from real-world sources. While concrete numbers are hard to pin down, some estimates suggest that more than 60 percent of data used for AI applications in 2024 was synthetic, and this figure is expected to grow…
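To make the definition concrete, here is a minimal sketch of the idea in Python (the dataset is invented for the example, and real synthetic-data generators such as GANs, diffusion models, or LLM-based pipelines are far more sophisticated): fit summary statistics on "real" records, then sample brand-new rows from the fitted distribution.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Stand-in for a real dataset: 1,000 records with 3 numeric features.
real = rng.normal(loc=[50.0, 3.2, 0.7], scale=[10.0, 1.1, 0.2], size=(1000, 3))

# Fit the empirical mean and covariance, then draw brand-new rows from the
# fitted distribution. The synthetic rows match the originals' statistics
# but contain none of the original records.
mu = real.mean(axis=0)
cov = np.cov(real, rowvar=False)
synthetic = rng.multivariate_normal(mu, cov, size=1000)

print(np.round(real.mean(axis=0), 2), np.round(synthetic.mean(axis=0), 2))
```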

Researchers glimpse the inner workings of protein language models
Within the past few years, models that can predict the structure or function of proteins have been widely used for a variety of biological applications, such as identifying drug targets and designing new therapeutic antibodies. These models, which are based on large language models (LLMs), can make very accurate predictions of a protein’s suitability for…

Unpacking the bias of large language models
Research has shown that large language models (LLMs) tend to overemphasize information at the beginning and end of a document or conversation, while neglecting the middle. This “position bias” means that, if a lawyer is using an LLM-powered virtual assistant to retrieve a certain phrase in a 30-page affidavit, the LLM is more likely to…
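Position bias is straightforward to measure with a "needle in a haystack" probe: plant one target fact at varying depths in a long context and check whether the model retrieves it. A minimal sketch, assuming an `ask` callable that you wire to whatever LLM you are testing (both the filler text and the callable are placeholders, not part of the MIT study):

```python
FILLER = [f"Background sentence number {i}." for i in range(300)]
NEEDLE = "The secret passphrase is 'violet meridian'."
QUESTION = "What is the secret passphrase?"

def build_prompt(depth: float) -> str:
    """Place the needle at a relative depth in the context
    (0.0 = very start, 1.0 = very end)."""
    idx = int(depth * len(FILLER))
    context = "\n".join(FILLER[:idx] + [NEEDLE] + FILLER[idx:])
    return f"{context}\n\nQuestion: {QUESTION}\nAnswer:"

def measure_position_bias(ask) -> dict[float, bool]:
    """`ask` is any callable that sends a prompt to an LLM and returns its
    answer as a string. A strongly position-biased model tends to fail at
    the middle depths even when it succeeds at 0.0 and 1.0."""
    return {
        depth: "violet meridian" in ask(build_prompt(depth)).lower()
        for depth in (0.0, 0.25, 0.5, 0.75, 1.0)
    }
```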

Making AI-generated code more accurate in any language
Programmers can now use large language models (LLMs) to generate computer code more quickly. However, this only makes programmers’ lives easier if that code follows the rules of the programming language and doesn’t cause a computer to crash. Some methods exist for ensuring LLMs conform to the rules of whatever language they are generating text…
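One family of such methods is grammar-constrained decoding: at each generation step, any token that could not extend into syntactically valid output is masked out before sampling. Below is a minimal sketch of the masking idea over a toy expression language; the vocabulary and prefix check are invented for illustration, and real systems compile a full grammar against the model's actual tokenizer.

```python
import re

# Toy target language: integer arithmetic such as "12+(3*4)".
VOCAB = list("0123456789+-*/()")

def is_valid_prefix(text: str) -> bool:
    """Cheap check that `text` can still grow into a legal expression:
    only legal characters, and never more ')' than '(' so far."""
    if re.fullmatch(r"[0-9+\-*/()]*", text) is None:
        return False
    depth = 0
    for ch in text:
        depth += ch == "("
        depth -= ch == ")"
        if depth < 0:
            return False
    return True

def allowed_next_tokens(prefix: str) -> list[str]:
    """Keep only tokens whose addition leaves a valid prefix; a real
    decoder would mask the logits of everything else before sampling."""
    return [t for t in VOCAB if is_valid_prefix(prefix + t)]

print(allowed_next_tokens("12+(3"))  # ')' is allowed here, for example
```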

Like human brains, large language models reason about diverse data in a general way
While early language models could only process text, contemporary large language models now perform highly diverse tasks on different types of data. For instance, LLMs can understand many languages, generate computer code, solve math problems, or answer questions about images and audio. MIT researchers probed the inner workings of LLMs to better understand how they…

Reinforcement Learning Meets Chain-of-Thought: Transforming LLMs into Autonomous Reasoning Agents
Large Language Models (LLMs) have significantly advanced natural language processing (NLP), excelling at text generation, translation, and summarization tasks. However, their ability to engage in logical reasoning remains a challenge. Traditional LLMs, designed to predict the next word, rely on statistical pattern recognition rather than structured reasoning. This limits their ability to solve complex problems…
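One common recipe for combining the two ideas (a general sketch, not necessarily the specific methods this article surveys) is to sample chain-of-thought traces and assign a verifiable reward only when the trace's final answer checks out; the reward itself is simple, even though the RL training loop built around it is not.

```python
import re

def extract_final_answer(trace: str) -> str | None:
    """Pull the final answer out of a chain-of-thought trace, assuming the
    model was instructed to end with a line like 'Answer: 42'."""
    match = re.search(r"Answer:\s*(.+?)\s*$", trace)
    return match.group(1) if match else None

def correctness_reward(trace: str, gold: str) -> float:
    """Verifiable reward: 1.0 if the trace's final answer matches the
    reference, else 0.0. RL fine-tuning then pushes the model toward
    reasoning traces that earn this reward."""
    answer = extract_final_answer(trace)
    return float(answer is not None and answer == gold)
```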

Ghostbuster: Detecting Text Ghostwritten by Large Language Models
[Figure: the structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text.] Large language models like ChatGPT write impressively well. So well, in fact, that they've become a problem. Students have begun using these models to ghostwrite assignments, leading some schools to ban ChatGPT. These models are also prone to producing text with factual…

Virtual Personas for Language Models via an Anthology of Backstories
We introduce Anthology, a method for conditioning LLMs to representative, consistent, and diverse virtual personas by generating and utilizing naturalistic backstories with rich details of individual values and experience. What does it mean for large language models (LLMs) to be trained on massive text corpora, collectively produced by millions, even billions, of distinct human authors?…
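The core mechanic reduces to two steps, sketched below with a placeholder `generate` callable and placeholder prompt wording (not the paper's exact prompts): elicit an open-ended backstory, then prepend it as conditioning context when posing questions to that virtual persona.

```python
BACKSTORY_PROMPT = "Tell me about yourself."  # open-ended elicitation

def sample_backstory(generate) -> str:
    """`generate` is any callable mapping a prompt to sampled model text.
    Sampling many backstories (e.g., at high temperature) yields a diverse
    pool of virtual personas."""
    return generate(BACKSTORY_PROMPT)

def ask_as_persona(generate, backstory: str, question: str) -> str:
    """Condition the model on one backstory, then pose the question to be
    answered in that persona's voice."""
    prompt = (
        f"{BACKSTORY_PROMPT}\n{backstory}\n\n"
        f"Answering as the person described above: {question}"
    )
    return generate(prompt)
```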