Shredded results: The Transmitter identified at least 90 publications that cite a version of the questionable dataset through a search on Google Scholar.
Photograph by Richard Drury

Exclusive: Springer Nature retracts, removes nearly 40 publications that trained neural networks on ‘bonkers’ dataset

The dataset contains images of children’s faces downloaded from websites about autism, which sparked concerns at Springer Nature about consent and reliability.

By Calli McMurray
8 December 2025

Scientific publisher Springer Nature has begun to retract dozens of papers that relied on a dataset fraught with ethical and reliability concerns, The Transmitter has learned. Five papers have been retracted since 16 November, and 33 more retractions are planned, says Tim Kersjes, Springer Nature’s head of research integrity, resolutions.

The papers attempted to train neural networks to distinguish between autistic and non-autistic children in a dataset containing photos of children’s faces. Retired engineer Gerald Piosenka created the dataset in 2019 by downloading photos of children from “websites devoted to the subject of autism,” according to a description of the dataset’s methods, and uploaded it to Kaggle, a site owned by Google that hosts public datasets for machine-learning practitioners.

The dataset contains more than 2,900 photos of children’s faces, half of which are labeled as autistic and the other half as not autistic.

After learning about a paper that cites the dataset, “I went and downloaded the dataset, and I was completely horrified,” says Dorothy Bishop, emeritus professor of developmental neuropsychology at the University of Oxford. “When I saw how it was created, I just thought, ‘This is absolute bonkers.’”

Without identifying each child in the dataset, there is no way to confirm that any of them do or do not have autism, Bishop says.

The children pose at different angles under different lighting and have different expressions, Bishop says, which adds noise to the dataset. “Even if there were facial differences, it would be making them much harder to detect, because you’ve got so much variability that’s got nothing to do with the child’s condition.”

Because the images were downloaded from various websites, it is doubtful that the children or their families gave consent for them to be used in research, says Gail Alvares, principal research fellow at the Kids Research Institute Australia. “Just because you have provided an image to the internet does not mean that you necessarily provide consent for that image to be used for research purposes.”

A Kaggle user broached these same concerns in a comment, and Piosenka responded that he had not violated privacy restrictions because all images were publicly available, adding “how can one be more ethical than to try to foster early detection and treatment of Autism in children. You sir are way off base.” Piosenka did not respond to The Transmitter’s requests for comment by email and phone.

The dataset first came to Springer Nature’s attention last month through separate investigations into two papers, Kersjes says. The research integrity team was about to start investigating one “article of concern” when Guillaume Cabanac, professor of computer science at the University of Toulouse, alerted the team to the other one, which contained tortured phrases (strange phrases used in place of established ones), a possible sign that the text was generated by artificial intelligence.

Both papers used the photo dataset that Piosenka assembled.

The Springer Nature team concluded that the dataset is unreliable and that it collected images without ethical approval or consent, Kersjes says. “This significant methodological issue undermined the results and conclusions of the publications.”

Kaggle had removed the dataset because it violated the site’s terms of service, according to a comment from Piosenka posted 10 May 2022 on the site, but Piosenka later uploaded the files to Google Drive. The Springer Nature team also found two datasets posted by other Kaggle users that appear to be replications of the original.

After The Transmitter contacted Piosenka and Google for comment, the datasets from the other users disappeared from Kaggle, as did comments posted by Piosenka that shared the link to the dataset on Google Drive. Google did not respond to the request for comment.

Kersjes and his team audited Springer Nature publications to find any others that had used a version of the dataset, and they plan to retract a total of 38 papers, conference proceedings and book chapters, and remove all but one of these. Removing the publications means that the retraction note and details such as the title, authors and DOI will remain visible, but the work itself cannot be accessed. The team is also contacting other publishers to alert them about the dataset.

“This is very, amazingly proactive,” Bishop says.

The Transmitter identified at least 90 publications that cite a version of the dataset through a search on Google Scholar; 25 of those appear in journals published by the Institute of Electrical and Electronics Engineers. “IEEE is aware of this issue, and we are investigating,” an IEEE spokesperson told The Transmitter.

In 2023, the publisher Wiley retracted two papers that used the dataset. “These two papers were retracted as part of an unrelated investigation. We have recently been made aware of concerns regarding this dataset and are examining whether other papers in our portfolio rely upon it,” a Wiley spokesperson told The Transmitter last week.

Facial features cannot be used to diagnose autism, Bishop and Alvares say. Autism is a complex, heterogeneous condition, and the gold standard for diagnosis relies on clinical assessment of behavioral traits.

However, the relationship between facial features and having an autism diagnosis is a valid line of research that peaked in popularity about five years ago, Alvares says. A fetus’s face and brain develop around the same time, which led to the hypothesis that differences in facial features could reflect differences in brain development.

Research into this question requires a large, tightly controlled sample of images—given with consent—of children with and without a clinically confirmed autism diagnosis, Alvares says. The Kaggle dataset “in no way represents any realm of an accurate database that could be used for research purposes.”
