
Kalyan Veeramachaneni

3 Questions: The pros and cons of synthetic data in AI
Synthetic data are artificially generated by algorithms to mimic the statistical properties of actual data, without containing any information from real-world sources. While concrete numbers are hard to pin down, some estimates suggest that more than 60 percent of data used for AI applications in 2024 was synthetic, and this figure is expected to grow…

A new way to test how well AI systems classify text
Is this movie review a rave or a pan? Is this news story about business or technology? Is this online chatbot conversation veering off into giving financial advice? Is this online medical information site giving out misinformation? These kinds of automated conversations, whether they involve seeking a movie or restaurant review or getting information about…

An anomaly detection framework anyone can use
Sarah Alnegheimish’s research interests reside at the intersection of machine learning and systems engineering. Her objective: to make machine learning systems more accessible, transparent, and trustworthy. Alnegheimish is a PhD student in Principal Research Scientist Kalyan Veeramachaneni’s Data-to-AI group in MIT’s Laboratory for Information and Decision Systems (LIDS). Here, she commits most of her energy…