From Code to ArXiv: What It Actually Takes to Publish as a Practitioner

This is not a victory lap. The paper is written, the benchmarks are collected, the LaTeX compiles cleanly, and the methodology has been validated across nine production deployments. But as of this writing, the SPOQ paper has not been published on ArXiv because I cannot get past the endorsement requirement. I am an industry practitioner without a university affiliation, and ArXiv requires new submitters to be endorsed by an existing author in the relevant subject category. I do not have that endorser yet.

I want to write honestly about the full journey because most “how I published my paper” posts skip the parts that are actually hard. They describe the writing process, the formatting challenges, the revision cycles. They rarely mention the institutional barriers that can block publication entirely, regardless of the quality of the work. If you are an engineer considering writing a research paper, you deserve the complete picture, including the part where the system does not work the way you expect it to.

So here is the real story: why I wrote the paper, how I wrote it, what tools I used, where I am stuck, and what I would tell another practitioner who is thinking about walking this same path.

Why would an engineer write an academic paper?

SPOQ started as a practical solution to real development problems, and at some point the methodology became rigorous enough that formal documentation felt like a responsibility rather than an aspiration. Practitioners doing novel work owe it to the broader community to contribute what they have learned to the shared knowledge base.

The motivation was never academic prestige or career advancement in the traditional sense. I built SPOQ because I needed it. While leading teams at JPMorgan Chase, and later as founding engineer at Notary Everyday, I kept running into the same bottleneck: AI coding agents were individually capable but collectively uncoordinated. Running them one at a time left massive throughput on the table. Running them in parallel without structure produced integration chaos. I needed a methodology that bridged the gap, and when I could not find one that worked, I built my own.

The transition from “internal methodology” to “research paper” happened gradually. As I deployed SPOQ across more projects, I started collecting structured data: wave execution times, rework rates, dependency graph shapes, validation scores. The data told a consistent story. When I shared early results with other engineers, the response was consistently some variation of “this should be a paper.” At first I dismissed the suggestion. I am not an academic. I do not have a PhD. I did not study at a research university. But the more I thought about it, the more I realized that those objections were about credentials, not about whether the work had value. The work had value. The data supported it. Writing the paper was the right thing to do.

There is also a broader principle at stake. The field of AI-assisted software engineering is evolving rapidly, and much of the most interesting work is happening in industry rather than academia. Practitioners are building novel systems, discovering new patterns, and solving problems that academic researchers have not yet formalized. If that knowledge stays locked inside company Slack channels and internal wikis, the entire field moves slower. Publishing is how practitioners contribute to the record. It is how individual discoveries become shared understanding.

What does the journey from working code to formal paper look like?

The journey started with a methodology that worked in practice, moved through formalization of concepts like wave-based dispatch and dual validation gates, included structured benchmarks from nine real deployments, and culminated in a LaTeX document that is currently at version 0.4.0. The process was messier and slower than I expected.

Writing code and writing a paper are fundamentally different activities, even when they describe the same system. Code is executable. If it compiles and the tests pass, you know it works. A paper has no compiler. There is no test suite for clarity, no linter for logical coherence. You write a section, reread it the next day, realize it assumes knowledge the reader does not have, and rewrite it from scratch. This happened to me repeatedly with the methodology description, which went through at least six complete rewrites before it reached a state where a reader without prior SPOQ exposure could follow the logic.

The formalization process forced me to confront ambiguities in my own thinking that I had been glossing over during implementation. What exactly constitutes a “wave”? I had been using the term informally for months, but when I sat down to write a formal definition, I realized there were edge cases I had never explicitly resolved. What happens when a task in Wave 1 fails validation but other Wave 1 tasks succeed? Do all of Wave 1’s dependents block, or only the dependents of the failed task? Writing the paper forced me to answer these questions precisely, and the answers fed back into the implementation, making the actual system better.
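To make the edge case concrete, here is a minimal sketch of one way to derive waves from a dependency graph and to work out what a single failed task blocks. It assumes the narrower policy, where only the transitive dependents of the failed task are blocked. This post does not say which policy SPOQ actually adopted, so treat that as an illustration, not as the paper's definition. The task names and the functions `assign_waves` and `blocked_by_failure` are hypothetical.

```python
from collections import defaultdict


def assign_waves(deps):
    """Group tasks into waves.

    deps maps each task to the tasks it depends on. A task with no
    dependencies lands in wave 1; otherwise it lands one wave after
    its deepest dependency.
    """
    wave_of = {}

    def depth(task, stack=()):
        if task in stack:
            raise ValueError(f"dependency cycle at {task!r}")
        if task not in wave_of:
            parents = deps.get(task, ())
            wave_of[task] = 1 + max(
                (depth(p, stack + (task,)) for p in parents), default=0
            )
        return wave_of[task]

    for task in deps:
        depth(task)

    waves = defaultdict(list)
    for task, wave in wave_of.items():
        waves[wave].append(task)
    return {w: sorted(ts) for w, ts in sorted(waves.items())}


def blocked_by_failure(deps, failed):
    """Return every task that transitively depends on a failed task.

    Assumed policy: siblings of the failed task, and their dependents,
    keep running.
    """
    dependents = defaultdict(set)
    for task, parents in deps.items():
        for parent in parents:
            dependents[parent].add(task)

    blocked, frontier = set(), list(failed)
    while frontier:
        current = frontier.pop()
        for child in dependents[current] - blocked:
            blocked.add(child)
            frontier.append(child)
    return blocked


# Hypothetical project: two independent roots feeding a final test task.
example_deps = {
    "schema": [],
    "auth": [],
    "api": ["schema"],
    "ui": ["schema"],
    "session": ["auth"],
    "e2e": ["api", "ui", "session"],
}
example_waves = assign_waves(example_deps)
example_blocked = blocked_by_failure(example_deps, {"auth"})
```

In this example, a validation failure in `auth` blocks `session` and `e2e`, while `api` and `ui` can still proceed because they depend only on `schema`. Under the broader policy, every Wave 2 task would wait instead.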

The benchmarking process required retroactive data collection from earlier deployments combined with prospective measurement on newer ones. For each of the nine deployments, I documented the dependency graph shape, the number of waves, the total tasks, the wave execution times, the rework rates, and the speedup factor compared to estimated sequential execution. Collecting this data systematically was tedious but essential. Without it, the paper would be just another opinion piece. With it, the claims are grounded in observable evidence. The difference between “I think this works” and “here is the data showing that it works” is the difference between a blog post and a research contribution.
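As a rough illustration of the speedup calculation, the sketch below assumes that estimated sequential time is the sum of all task durations and that a wave takes as long as its slowest task. This post does not give the paper's exact definition, so both assumptions are mine. The durations are invented for the example and are not SPOQ benchmark data.

```python
def speedup_factor(task_minutes_by_wave):
    """Estimate speedup of wave-based execution over sequential execution.

    task_minutes_by_wave is a list of waves, each a list of task
    durations in minutes. Assumptions (not from the paper): sequential
    time is the sum of every task, and each wave lasts as long as its
    slowest task.
    """
    if not task_minutes_by_wave or any(not wave for wave in task_minutes_by_wave):
        raise ValueError("every wave needs at least one task duration")
    sequential = sum(sum(wave) for wave in task_minutes_by_wave)
    parallel = sum(max(wave) for wave in task_minutes_by_wave)
    return sequential / parallel


# Invented durations: three waves with 3, 2, and 1 tasks.
illustrative_waves = [[30, 20, 10], [40, 40], [15]]
illustrative_speedup = speedup_factor(illustrative_waves)  # 155 / 85
```

With these made-up numbers the sequential estimate is 155 minutes and the wave-based estimate is 85 minutes, a speedup of roughly 1.8x.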

The concept of Human-as-an-Agent, which I call HaaA, emerged during the writing process itself. In practice, I had always treated human developers as participants in the SPOQ workflow, but I had not formalized their role. Writing the paper forced me to articulate how humans fit into the agent hierarchy: not as supervisors watching from outside the system, but as first-class agents with their own capabilities, response times, and task assignments. This framing clarified the design philosophy and made the overall methodology more coherent.

How did LaTeX tooling support the paper writing process?

LaTeX provided the professional typesetting and academic formatting that the paper required, and the discipline of working within its constraints forced clearer thinking about structure, figures, and citations. Starting with LaTeX from day one was one of the best decisions I made.

I had used LaTeX peripherally during my time at the Naval Nuclear Power Training Command for technical documentation, but writing a full research paper was a different experience entirely. The learning curve was steep in the first two weeks. Setting up the document class, configuring the bibliography with BibTeX, getting figures to float correctly, and managing cross-references between sections each took longer than I expected. But once the infrastructure was in place, the workflow became remarkably productive. I could focus on content while LaTeX handled the presentation.

The build tooling deserves specific mention. I set up a custom compilation pipeline using latexmk that handles the multiple passes required for cross-references and bibliography resolution. The pipeline runs on every save, producing a fresh PDF within seconds. This tight feedback loop made iterating on the paper feel similar to iterating on code, which was critical for maintaining momentum. I also version-controlled the LaTeX source in Git, giving me a complete history of every revision. Being able to diff two versions of a section and see exactly what changed proved invaluable during the revision process.
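For readers who want a starting point, here is a minimal sketch of a rebuild-on-save loop driven from Python. It is not the pipeline described above, whose details are not given in this post. It assumes latexmk is installed and on the PATH, and the file name `paper.tex` and the function names are placeholders. The flags are standard latexmk options: `-pdf` builds a PDF, `-pvc` keeps latexmk running and recompiles when source files change, and `-interaction=nonstopmode` stops the TeX engine from pausing for input on errors.

```python
import subprocess


def latexmk_command(main_file="paper.tex", watch=True):
    """Build the latexmk argument list for compiling main_file to PDF."""
    cmd = ["latexmk", "-pdf", "-interaction=nonstopmode"]
    if watch:
        # -pvc: stay running and rebuild whenever a source file changes.
        cmd.append("-pvc")
    cmd.append(main_file)
    return cmd


def watch_paper(main_file="paper.tex"):
    """Run latexmk in continuous mode; blocks until interrupted."""
    subprocess.run(latexmk_command(main_file, watch=True), check=True)
```

Calling `watch_paper()` from the directory that holds the source keeps the PDF current while you edit. latexmk itself decides how many passes are needed for cross-references and the bibliography.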

Managing figures and tables in LaTeX taught me something unexpected about academic writing: visual presentation matters far more than I initially assumed. My first draft included dense paragraphs describing the wave execution model. A colleague suggested I add a diagram. The diagram communicated in one figure what three paragraphs had failed to convey clearly. I ended up adding several figures, including the dependency DAG visualization, the wave dispatch timeline, and the validation score distribution. Each one replaced prose that was accurate but difficult to parse quickly. The lesson was humbling. As an engineer, I default to text. Academic papers require a different balance between textual and visual communication.

One tooling choice I would make differently is citation management. I started with manual BibTeX entries, typing each citation by hand. By the time I had 30 references, maintaining consistency across entries became tedious and error-prone. A proper reference manager like Zotero or Mendeley would have saved hours of busywork. If you are starting a paper, set up your citation management infrastructure before you write a single paragraph. You will thank yourself when the reference list grows past twenty entries.

What is the ArXiv endorsement requirement and why does it matter?

ArXiv requires new submitters to be endorsed by an existing author who has published in the relevant subject category, and this requirement exists to maintain quality. But it creates a real barrier for industry practitioners who lack university affiliation and the natural network of ArXiv-active researchers that comes with it. The SPOQ paper is complete and submission-ready, currently blocked entirely on finding an endorser.

The endorsement system works as follows. When you create an ArXiv account and attempt to submit a paper, the system checks whether you are already endorsed to submit in your chosen subject categories. For SPOQ, the relevant categories are cs.SE (Software Engineering), cs.AI (Artificial Intelligence), and cs.MA (Multiagent Systems). If you are not endorsed, you must find an existing ArXiv author in one of those categories who is willing to endorse you. The endorser is not reviewing the paper or vouching for its conclusions. They are simply confirming that you are a legitimate researcher submitting a genuine contribution. But finding that endorser requires a network that industry practitioners typically do not have.

I want to be clear about what this means practically. The paper is finished. It has been through multiple rounds of revision. The benchmarks are solid. The methodology is documented in sufficient detail for replication. The open-source implementation is available under an MIT license. Everything that depends on my effort and competence is complete. The only remaining barrier is a social one: I need someone with existing ArXiv standing to vouch that I am a real person doing real work. The irony is not lost on me.

I have reached out to several researchers whose work intersects with SPOQ’s domain. The responses have been uniformly polite and uniformly unhelpful. Some never replied. Others expressed interest in the work but declined to endorse because they do not know me personally. One explained that endorsement carries reputational risk and they only endorse colleagues they have collaborated with directly. I understand their position. The system incentivizes caution. But the cumulative effect is that practitioners without academic connections face a barrier that has nothing to do with the quality of their research.

This is not a complaint about ArXiv specifically. The platform provides an incredible service to the research community, and the endorsement system serves a legitimate purpose in filtering spam and low-quality submissions. My frustration is with the gap between the system’s design assumptions and the reality of where novel work originates. The system assumes that people doing publishable research are embedded in academic institutions where endorsers are readily available. That assumption was reasonable twenty years ago. It is increasingly inaccurate in 2026, when some of the most novel work in AI and software engineering is happening at startups, in open-source projects, and at companies that do not publish papers as part of their business model.

Why does this barrier deserve attention?

Industry practitioners produce novel work that never reaches the academic record because of gatekeeping mechanisms designed for a different era, and the result is a research landscape that systematically underrepresents practical engineering innovation. The gap between what practitioners know and what gets published grows wider every year.

Think about the selection bias this creates. Academic papers on AI-assisted software engineering are overwhelmingly written by researchers who study these systems from outside. They design controlled experiments, measure specific variables, and publish findings in peer-reviewed venues. This work is valuable. But it represents only one perspective on the field. The perspective of people who build and operate these systems in production is largely absent from the formal literature. Practitioners share their knowledge through blog posts, conference talks, and open-source documentation, but these formats do not carry the same weight or discoverability as published papers.

The consequences are tangible. When a researcher surveys the literature on multi-agent AI coordination, they find academic papers with controlled experiments and theoretical frameworks. They do not find the practitioner’s perspective because that perspective was never published in a format that the survey methodology would capture. The resulting review is accurate but incomplete, and the incompleteness compounds over time as subsequent researchers build on the same partial foundation.

Conference papers have similar barriers with institutional affiliation expectations. Many AI and software engineering conferences require at least one author to have a university affiliation. Some require the submitting author to have a verifiable academic email address. These requirements are not always explicit in the call for papers, but they surface during the submission process. The message, whether intentional or not, is clear: industry practitioners are welcome to attend and pay registration fees, but the podium is reserved for those with academic credentials.

I am raising this issue not because I expect the system to change overnight, but because I think it is important to name it clearly. If you are a practitioner who has done novel work and considered writing a paper, you should know about this barrier before you invest months in the writing process. It does not mean you should not write the paper. But you should go in with eyes open about what the publication pathway actually looks like for someone without a university email address.

What would I ask of readers?

If you are an ArXiv author in cs.SE, cs.AI, or cs.MA and you find the SPOQ methodology interesting, I would welcome a conversation about endorsement. The paper and methodology are available at spoqpaper.com for review. This is a genuine ask from a practitioner who believes the work has value for the research community.

I want to be transparent about what endorsement involves. According to ArXiv’s own documentation, endorsement is not a review of the paper. It is not an approval of the conclusions or methodology. It is a statement that the endorser believes the submitter is a member of the scientific community whose work is appropriate for the archive. The endorser does not assume responsibility for the paper’s content. They are simply confirming that the submission is a good-faith research contribution rather than spam or promotional material.

If you are not an ArXiv author but know someone who is, I would be grateful for an introduction. The SPOQ paper covers multi-agent AI coordination with wave-based task dispatch, dual validation gates, and the Human-as-an-Agent model. It includes benchmark data from nine production deployments. The methodology is backed by an open-source implementation with an MIT license. Anyone working in multi-agent systems, software engineering automation, or AI-assisted development would have the relevant domain expertise to evaluate whether the work merits a place on ArXiv.

I recognize that asking for help publicly feels uncomfortable. Engineers are trained to solve problems independently, and admitting that you are stuck can feel like admitting weakness. But this particular problem is not solvable through more effort or better engineering. It requires a human connection that I have not yet been able to make through cold outreach. If this post reaches someone who can help bridge that gap, the vulnerability will have been worth it.

What advice would I give other practitioners considering publishing?

Start writing before you think the work is ready, because the writing process itself clarifies the methodology. Use LaTeX from the beginning rather than converting later. Build your benchmarks and data collection into your normal workflow. And do not let the endorsement barrier discourage you from completing the paper, because the writing alone makes the work better even if publication takes longer than you expect.

The most valuable piece of advice I can offer is to begin the paper while the methodology is still evolving. I waited too long. By the time I started writing, I had already made dozens of design decisions that I could no longer fully reconstruct from memory. If I had been writing concurrently with development, each decision and its rationale would have been captured in real time. Instead, I had to reverse-engineer my own reasoning from commit histories and chat logs. Starting the paper early does not mean publishing early. It means using the writing process as a tool for thinking, which is arguably its most important function.

On the tooling front, use LaTeX from the start. I have spoken with other practitioners who wrote their papers in Google Docs or Notion and then attempted to convert to LaTeX for submission. The conversion process is painful. Formatting, citations, cross-references, figure placement: all of these need to be rebuilt from scratch. LaTeX has a learning curve, but investing in that curve upfront saves you from a much more expensive conversion process later. There are excellent templates available for most target venues, and tools like Overleaf provide a browser-based LaTeX environment that reduces the setup burden.

Build your data collection into your normal workflow from the beginning. When I deployed SPOQ on the first few projects, I did not systematically track execution metrics because I was focused on making the methodology work, not on measuring it. By the time I realized I needed benchmark data for the paper, several early deployments had concluded without structured data capture. I had to rely on rough estimates and incomplete logs for those deployments, which weakened the empirical foundation of the paper. If I had set up a simple metrics pipeline from the start, even just logging wave execution times and rework counts to a CSV file, I would have nine clean data sets instead of nine data sets of varying completeness.
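The kind of simple metrics pipeline described above could look something like the sketch below, which appends one row per completed wave to a CSV file. The column names, the `log_wave` function, and the file layout are assumptions made for illustration. They are not the schema used for the SPOQ benchmarks.

```python
import csv
from pathlib import Path

# Hypothetical column set: one row per completed wave.
FIELDS = ["deployment", "wave", "tasks", "execution_seconds", "rework_count"]


def log_wave(path, deployment, wave, tasks, execution_seconds, rework_count):
    """Append one wave's metrics to a CSV file, writing a header if the file is new."""
    path = Path(path)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(
            {
                "deployment": deployment,
                "wave": wave,
                "tasks": tasks,
                "execution_seconds": execution_seconds,
                "rework_count": rework_count,
            }
        )
```

A call such as `log_wave("metrics.csv", "project-a", 1, 4, 312.5, 0)` at the end of each wave is enough. Because the file is append-only, nothing has to be designed up front, and the data is there when you need it.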

Finally, do not let the publication barriers stop you from finishing the paper. Even if ArXiv endorsement never materializes, the completed paper serves multiple purposes. It is a comprehensive reference document for your methodology. It forces clarity in your thinking that benefits the implementation. It demonstrates research capability to potential collaborators, employers, and clients. And it positions you to take advantage of publication opportunities as they arise, whether through ArXiv, conference submissions, or alternative venues like SSRN or institutional repositories. The paper has value independent of where it ends up being hosted. The writing is the work. Publication is distribution.

I will update this post when the publication status changes. In the meantime, the full paper and open-source implementation are available at spoqpaper.com. If the work resonates with you, I would value hearing about it. And if you can help with the endorsement question, I would value that even more.
