
Writing science that humans and machines can read
Large language models are now routinely used to search, summarize and synthesize the literature at scales impossible for any individual researcher—yet scientific publishing has not adapted to that reality.
A lot has been said about the pros and cons of using large language models (LLMs) for scientific writing—whether they will rot our brains or make us superhuman. But this framing misses a more immediate question: Who are we actually writing for?
Scientific papers are still written for humans, and they should remain so. Reading deeply, struggling with difficult ideas and synthesizing across studies are essential parts of scientific training and discovery. But papers are increasingly being processed by machines as well. LLMs are now routinely used to search, summarize and synthesize the literature at scales impossible for any individual researcher. We are entering a world where papers are written in one format but increasingly consumed in another, yet scientific publishing has not adapted to that reality.
Anyone who has submitted papers across journals knows how arbitrary scientific formatting can be. The same paper may be reformatted three or four times before publication, despite containing the same scientific content. Publishers designed these formats for human reading and their own workflows, not for reliable large-scale synthesis by machines.
Rather than resist this shift or rely entirely on post hoc interpretation by artificial-intelligence (AI) systems, we should design scientific communication with both humans and machines in mind. I propose a new layer of scientific output: author-verified, machine-readable summaries that sit alongside the traditional paper.
This layer is especially critical in complex and interdisciplinary fields such as neuroscience. The field spans molecular biology, anatomy, neurophysiology, behavior, computer science, engineering and beyond. No one can keep up with all of it, yet modern neuroscience requires holistic understanding.
We have already recognized this problem at the level of data. Major efforts such as the Allen Brain Map, Neurodata Without Borders and the DANDI archive aim to make datasets findable, accessible and reusable through standardization and metadata. These initiatives are essential, but they also highlight a gap. We have focused on making data, not scientific papers, machine readable.
Papers are written as narratives, with selective emphasis and discipline-specific language. Abstracts are concise but inclined toward headline results. Methods and results are detailed but inconsistent across studies. Discussions provide synthesis but are inevitably selective in focus and breadth. This makes large-scale synthesis difficult. Either humans extract the information manually, which is slow, or LLMs attempt to do it automatically, which is prone to error.
In my own work leading the MetaBeeAI project, we have explored using LLMs to extract structured information from scientific papers. We initially focused on papers describing the effects of neurotoxic pesticides on bees, spanning molecular toxicology, neurobiology, behavior and ecology. We found that LLMs are very good at summarizing key findings, but they struggle in ways that matter for scientific synthesis. One recurring issue is that they do not reliably distinguish between what a paper studies and what it mentions—for example, species that were compared but not tested. LLMs also often fail to accurately pair methods and findings in papers with multiple underlying experiments. These errors are subtle but important, and they propagate into downstream analyses.
Terminology is another problem. Different fields use different words for similar concepts, and even basic categories such as “control” or “treatment” are not always described consistently. LLMs can handle this to some extent, but performance drops when trying to integrate across many papers. In short, extracting facts in a standardized format that can be compared across different types of studies is a challenge.
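As a rough illustration of the terminology problem, the sketch below maps field-specific labels onto a shared vocabulary before studies are compared. The synonym table and the names `SYNONYMS`, `build_lookup` and `normalize` are hypothetical, invented here for illustration; a real mapping would have to be curated by domain experts. Labels the table does not recognize return `None` so that a person can review them instead of the code guessing.

```python
# Hypothetical synonym table: canonical category -> labels seen in papers.
# These groupings are assumptions for illustration, not a curated vocabulary.
SYNONYMS = {
    "control": {"control", "untreated", "vehicle", "sham"},
    "treatment": {"treatment", "treated", "exposed", "dosed"},
}


def build_lookup(synonyms):
    """Invert the table so each known label points to its canonical category."""
    return {
        term.lower(): canonical
        for canonical, terms in synonyms.items()
        for term in terms
    }


LOOKUP = build_lookup(SYNONYMS)


def normalize(label):
    """Return the canonical category for a label, or None if it is unknown."""
    return LOOKUP.get(label.strip().lower())


if __name__ == "__main__":
    for raw in ["Vehicle", " exposed ", "reference colony"]:
        print(repr(raw), "->", normalize(raw))
```

Returning `None` for unrecognized labels keeps the ambiguity visible, which matters more here than coverage: a silent wrong match would propagate into every downstream comparison.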
This difficulty has consequences for which studies are visible to AI tools used to synthesize evidence and develop models. Studies that fall outside standard paradigms are harder to find and integrate. LLMs can amplify this bias because they rely on consistent patterns in language and data.
Neuroscience has long advanced by drawing insights from diverse animal systems: for example, neural activity in cephalopods, sleep in fruit flies, decision-making in honeybees, plasticity in crayfish and circuit function in worms. These studies span levels of biological organization and use different methods, vocabulary and experimental traditions. Their datasets may not be directly comparable, but together they provide a broader picture of what nervous systems can achieve and how they adapt through evolution or under stress. If we could better integrate the information locked inside decades of scientific writing, we would honor not only the work of the scientific community, but also the animal lives and public funding that made this research possible.
My suggestion is simple: Every paper should include an author-verified, machine-readable summary. This summary would not replace the abstract but sit alongside it and be freely available online. The goal would be to capture, in a structured way, what was studied, how it was tested, and the core numerical results, with direct links to datasets where available.
The process would be straightforward. An LLM would generate a first pass, extracting key elements from the paper into a structured format. The authors would then check and correct the output, ensuring that it accurately reflects the study and its limitations. This review is important not only for reliability but because the process itself forces researchers to engage explicitly with how their work is represented and interpreted. In many cases, that reflection would be valuable in its own right. Readers and downstream users could also inspect these summaries directly rather than relying on opaque AI-generated interpretations of the literature.
There are precedents for this kind of approach. Structured reporting formats have improved reproducibility by standardizing how methods are described. In genomics, shared standards and metadata frameworks, such as HGVS Nomenclature and MINSEQE, have enabled large-scale data integration. More broadly, in the biosciences, STAR (structured, transparent, accessible reporting) Methods has demonstrated how standardizing methodology and protocols can enhance experimental reproducibility. A machine-readable summary would extend this idea to the level of the paper itself.
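To make the proposal concrete, here is a minimal sketch of what such a summary might look like as a data structure. Everything in it is an assumption for illustration: the field names, the `author_sign_off` check and the placeholder values are invented here and do not describe an existing standard or the MetaBeeAI format. The sketch separates species that were tested from those only mentioned, ties each result to the experiment that produced it, and records author verification as an explicit flag.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Experiment:
    experiment_id: str
    species_tested: list   # species actually tested in this experiment
    method: str


@dataclass
class Result:
    experiment_id: str     # links the finding to the method that produced it
    measure: str
    value: float
    unit: str


@dataclass
class PaperSummary:
    title: str
    experiments: list
    results: list
    species_mentioned_only: list = field(default_factory=list)
    dataset_links: list = field(default_factory=list)
    limitations: list = field(default_factory=list)
    author_verified: bool = False   # False until the authors have checked it


def author_sign_off(summary):
    """Mark a draft as verified, refusing results with no declared experiment."""
    known = {e.experiment_id for e in summary.experiments}
    orphans = [r.measure for r in summary.results if r.experiment_id not in known]
    if orphans:
        raise ValueError(f"results without a declared experiment: {orphans}")
    summary.author_verified = True
    return summary


# Placeholder values only; in the proposed workflow an LLM would draft these
# fields and the authors would correct them before signing off.
draft = PaperSummary(
    title="Placeholder title",
    experiments=[Experiment("exp1", ["placeholder species A"], "placeholder assay")],
    results=[Result("exp1", "placeholder measure", 0.0, "placeholder unit")],
    species_mentioned_only=["placeholder species B"],
)
verified = author_sign_off(draft)

if __name__ == "__main__":
    print(json.dumps(asdict(verified), indent=2))
```

Serializing to JSON is one plausible choice among several; the point is only that the same fields appear in the same places for every paper, so that tools can read them without reinterpreting the prose.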
In the longer term, these summaries could evolve beyond metadata into structured representations of scientific relationships themselves: explicitly linking interventions to outcomes, circuits to behaviors, or genes to phenotypes. Scientific papers already contain these relationships implicitly in prose, but prose is ambiguous and difficult for machines to interpret consistently. Making these relationships more explicit and machine readable could fundamentally change how knowledge is integrated across neuroscience, especially across fields that use different species, methods and terminology.
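One simple way to picture such explicit relationships is as subject, predicate and object statements, each tagged with the paper it came from. The sketch below is hypothetical: the `Relation` type, the `outcomes_of` helper and the placeholder identifiers are assumptions made for illustration, not an established ontology or schema.

```python
from collections import defaultdict
from typing import NamedTuple


class Relation(NamedTuple):
    subject: str     # e.g. an intervention, circuit or gene
    predicate: str   # how the subject relates to the object
    obj: str         # e.g. an outcome, behavior or phenotype
    source: str      # identifier of the paper making the claim


def outcomes_of(relations, subject):
    """Group everything linked to a subject, keeping the supporting papers."""
    grouped = defaultdict(set)
    for r in relations:
        if r.subject == subject:
            grouped[r.obj].add(r.source)
    return {obj: sorted(sources) for obj, sources in grouped.items()}


# Placeholder statements; real entries would come from author-verified summaries.
relations = [
    Relation("intervention:A", "affects", "outcome:X", "paper:1"),
    Relation("intervention:A", "affects", "outcome:X", "paper:2"),
    Relation("intervention:A", "affects", "outcome:Y", "paper:2"),
    Relation("gene:G", "associated_with", "phenotype:P", "paper:3"),
]

if __name__ == "__main__":
    print(outcomes_of(relations, "intervention:A"))
```

Keeping the source on every statement means a synthesis built from these records can always be traced back to the papers behind it.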
It would also make science more accessible and efficient. At the moment, LLMs repeatedly process the same papers—accessing, reading and summarizing them again and again, with real computational and environmental costs. Structured, open summaries would render much of this unnecessary. AI systems could draw directly from these summaries rather than reinterpreting the full text each time. More importantly, AI systems would have access to a consistent, author-verified representation of each study, improving provenance and trust. Researchers could inspect exactly what information was extracted and how it was structured, rather than treating literature synthesis as an opaque process.
We do not need to choose between writing for humans and writing for machines. We can design for both. Scientific papers should remain rich human documents designed for interpretation, debate and learning. But alongside them, we should build structured, transparent and author-verified representations that allow machines to integrate knowledge more reliably at scale.
Right now, we are drifting toward a system in which science is increasingly interpreted by machines, with all the distortions that entails. If we want science to be more open, reusable and easier to integrate, we need to take that new reality seriously.