
Artificial Intelligence (AI)

Making airfield assessments automatic, remote, and safe
In 2022, Randall Pietersen, a civil engineer in the U.S. Air Force, set out on a training mission to assess damage at an airfield runway, practicing “base recovery” protocol after a simulated attack. For hours, his team walked over the area in chemical protection gear, radioing in geocoordinates as they documented damage and looked for…

Meet PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC
Multi-modal Large Language Models (MLLMs) have demonstrated remarkable capabilities across various domains, propelling their evolution into multi-modal agents for human assistance. GUI automation agents for PCs face particularly daunting challenges compared to smartphone counterparts. PC environments present significantly more complex interactive elements with dense, diverse icons and widgets often lacking textual labels, leading to perception…

Requirement Gathering Secrets: Master the Art of Building Successful Projects!
Introduction: Successful projects start with clear and well-defined requirements. Requirement gathering stands as the cornerstone of any successful project. It’s about delving deep into the core of what clients and stakeholders genuinely need, aligning expectations and project outcomes from the get-go. In this blog, we’ll explore effective strategies to master this process, ensuring your projects…

10 Best AI Avatar Generators (March 2025)
AI avatar generators have become useful tools for streaming and other forms of AI content creation, such as enhancing presentations, automating video production, or establishing a unique on-screen persona. These platforms enable creators to generate high-quality virtual presenters, complete with realistic facial expressions, synchronized voiceovers, and multilingual capabilities. Whether you are a live streamer, a…

Introducing Gemini Robotics and Gemini Robotics-ER, AI models designed for robots to understand, act and react to the physical world.
Research. Published 12 March 2025. Authors: Carolina Parada. Introducing Gemini Robotics, our Gemini 2.0-based model designed for robotics. At Google DeepMind, we’ve been making progress in how our Gemini models solve complex problems through multimodal reasoning across text, images, audio and video. So far, however, those abilities have been largely confined to the digital realm…

Streamlining data collection for improved salmon population management
Sara Beery came to MIT as an assistant professor in MIT’s Department of Electrical Engineering and Computer Science (EECS) eager to focus on ecological challenges. She has fashioned her research career around the opportunity to apply her expertise in computer vision, machine learning, and data science to tackle real-world issues in conservation and sustainability. Beery…

A Step by Step Guide to Build an Interactive Health Data Monitoring Tool Using Hugging Face Transformers and Open Source Model Bio_ClinicalBERT
In this tutorial, we will learn how to build an interactive health data monitoring tool using Hugging Face’s transformer models, Google Colab, and ipywidgets. We walk you through setting up your Colab environment, loading a clinical model (like Bio_ClinicalBERT), and creating a user-friendly interface that accepts health data input and returns interpretable disease predictions. This…

AI-Driven Customer Experience: Transforming Business Models
In the rapidly evolving landscape of modern business, Artificial Intelligence (AI) is not just a buzzword—it’s a transformative force reshaping the very foundations of customer experience. As businesses strive to meet the ever-increasing expectations of their clientele, AI emerges as a game-changing ally, enabling personalization at an unprecedented scale and unlocking new frontiers in customer…

The Road to Better AI-Based Video Editing
The video/image synthesis research sector regularly outputs video-editing architectures, and over the last nine months, outings of this nature have become even more frequent. That said, most of them represent only incremental advances on the state of the art, since the core challenges are substantial. However, a new collaboration between China and Japan this week…

FunSearch: Making new discoveries in mathematical sciences using Large Language Models
Research. Published 14 December 2023. Authors: Alhussein Fawzi and Bernardino Romera Paredes. By searching for “functions” written in computer code, FunSearch made the first discoveries in open problems in mathematical sciences using LLMs. Update: In December 2024, we published a report on arXiv showing how our method can be used to amplify human performance in…