JudicialMind

Government · Agentic Panels

The Bench in the Machine

Cash-strapped attorneys general, agency counsel, and public-sector litigators are rehearsing their toughest arguments against synthetic judges, witnesses, and opposing counsel, practicing before they perform. The early evidence is promising, the limits are real, and the accountability questions are only beginning.

By JudicialMind

For most of the modern era, the quality of a government lawyer’s preparation has been a function of who happened to be down the hall. A deputy attorney general heading to a federal circuit could rehearse a brutal appellate argument only if colleagues were free to play the panel; an agency litigator prepping a hostile deposition could practice cross-examination only if someone consented to sit across the table and lie convincingly for an hour. Rehearsal, the single most reliable way to improve courtroom performance, was rationed by headcount and calendar. That rationing is now being challenged by a new category of tools: agentic AI personas that can play a skeptical justice, a slippery witness, or an aggressive opposing counsel on demand, at any hour, for the marginal cost of a few API calls.

This is not a story about a product. It is a story about a shift in who gets to practice, how often, and against what. For public-sector legal teams under chronic budget and staffing pressure, credible AI-persona mock proceedings could be the most consequential change in trial and appellate preparation in a generation, if the profession stays honest about what these systems still cannot do.

50%: Government legal departments citing too few resources or budget
$30k to $60k+: Typical cost of a one-to-two-day live mock trial, plus expenses
74%: Agencies reporting ongoing staffing shortages
41%: Best-case coverage of fine-grained legal issues by top AI simulators

Figures drawn from the Thomson Reuters 2025 Government Legal Department Report and the Princeton-affiliated study described below.

The Old Way: Rehearsal as a Luxury Good

The legacy method of high-stakes preparation was deceptively simple and quietly expensive. To stress-test an appeal, lawyers held a moot court: a dress rehearsal in which colleagues impersonate the bench and pepper the advocate with the hardest questions they can invent. Practitioner guidance has long called this the single most important step before an appellate argument: assemble three litigators as judges, brief them weeks ahead, and have them stay in character for an hour while the advocate works through the toughest hypotheticals (Kirkland & Ellis). Well-executed moots let you practice timing, expose weak arguments, and rehearse answers to the obvious and not-so-obvious questions a panel will pose (Duane Morris).

The catch was access. Elite advocates could lean on institutions that provide moots as a public service. Georgetown’s Supreme Court Institute, for instance, moots counsel in nearly every case argued before the U.S. Supreme Court, at no charge and on a non-partisan basis (Georgetown Law). But that capacity is finite and concentrated at the very top of the system. Most government attorneys preparing routine appeals, agency hearings, or depositions had no such luxury. As one defense-bar guide candidly noted, the additional costs of a moot “might dictate if, or to what extent, any moot court gets held,” and teams sometimes simply “eat the costs” or skip the exercise on a budget-stretched appeal (DRI, For the Defense).

On the trial side, the economics were even starker. A thorough live mock trial (recruiting representative jurors, renting a facility, building deliberation panels) is commonly quoted anywhere from roughly $10,000 to $60,000 or more, with project expenses such as facility rental, juror recruiting, and travel often rivaling or exceeding the consultants’ fee (Jurybox). Jury-research veterans put a recorded full-day mock jury with simultaneous response measurement and 36 to 42 jurors at $50,000 to $100,000 or more (MoloLamken FAQ). Mock jurors alone are typically paid $150 to $300 per day, and full-scale exercises take attorneys and consultants months to prepare (Opveon). For a public-sector office, those numbers are not line items; they are non-starters.
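To put the juror stipend in proportion, the short sketch below multiplies the panel sizes and daily pay quoted above. It is illustrative arithmetic only: it assumes a single-day exercise and combines figures from two different sources (panel size from MoloLamken, daily pay from Opveon), which those sources do not themselves combine. The function name `stipend_range` is hypothetical.

```python
def stipend_range(jurors_low, jurors_high, pay_low, pay_high, days=1):
    """Return (low, high) total juror stipends in dollars.

    Assumption: every juror is paid the same daily rate for `days` days.
    """
    low = jurors_low * pay_low * days
    high = jurors_high * pay_high * days
    return low, high


# Figures quoted in the article: 36 to 42 jurors, $150 to $300 per day.
low, high = stipend_range(36, 42, 150, 300)
print(f"Juror stipends for one day: ${low:,} to ${high:,}")
# prints "Juror stipends for one day: $5,400 to $12,600"
```

Under these assumptions the stipends are a minority of a $50,000 to $100,000 exercise, which is consistent with the point that facilities, recruiting, travel, and consultant time drive most of the bill.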

[Figure: The price of practicing the old way. Indicative cost ranges for traditional pre-trial and pre-argument rehearsal, in U.S. dollars. Ranges synthesized from Jurybox, MoloLamken, and Opveon; bars show low-to-high quoted ranges, excluding many add-on expenses.]

The Shift: Why Public-Sector Counsel Cannot Afford Not To Change

The pressure to find a cheaper, scalable form of rehearsal is not abstract; it is documented across the public sector. Half of government legal departments name “too few resources or budget” as a top challenge, and 52% cite the difficulty of attracting and retaining talent (Thomson Reuters). Roughly three-quarters of agencies, 74%, report experiencing staffing shortages over the prior two years, and 64% expect those shortages to continue (Thomson Reuters). When there are fewer lawyers in the building, there are fewer colleagues available to staff a moot panel.

The workload is moving the wrong way at the same time. Three in four government respondents (75%) expect their overall workload to rise over the next two years, two-thirds expect their work to grow more complex, and more than half, 51%, already say they lack the time to thoroughly research new or complex issues (Thomson Reuters). Technology has not kept pace: 68% of agency respondents rate their tools as inferior to private-sector systems, and the single most cited barrier to adopting anything new is budget, named by 81% (Thomson Reuters).

[Figure: The constraints squeezing government legal teams. Share of government legal departments reporting each pressure, 2025. Source: Thomson Reuters 2025 Government Legal Department Report.]

Against that backdrop, the broader legal profession has crossed an adoption threshold that makes synthetic-rehearsal tools feel less exotic. The share of U.S. law-firm attorneys using generative AI for at least one purpose jumped to 54% in early 2025, up from 35% a year earlier, with nearly 80% saying the technology made their work easier (Law360 Pulse). The Thomson Reuters Institute found legal professionals posting the strongest generative-AI adoption rates of any professional sector, and 95% expecting it to become central to their organization’s daily workflow within five years (Thomson Reuters Institute). Government has been slower and more skeptical, but the direction of travel is unmistakable.

Rehearsal used to be rationed by who was down the hall. The new question is whether a synthetic adversary can be made adversarial enough to matter.

What It Looks Like Now: Practicing Against a Synthetic Bench

The present-day workflow is recognizably the moot court of old, with the human panel partly replaced by software. An advocate uploads the briefs, the record, and the legal question; an agentic system constructs a persona (a particular kind of judge, a hostile deponent, an opposing counsel) and conducts the exchange in character. Researchers building exactly this kind of tool describe the goal in egalitarian terms: while well-resourced litigants can hire former judges to run realistic moots, “many attorneys with limited resources prepare…with small, hand-crafted” simulations, and AI could “level the playing field between well-resourced and underfunded attorneys” by widening access to high-quality practice (Zhang et al., arXiv).
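As a rough illustration of the workflow described above, the following minimal sketch shows how uploaded materials and a persona description might be assembled into instructions for a language model. It is not drawn from any tool or study cited here: the names `Persona`, `Matter`, and `build_system_prompt` are hypothetical, and the sketch deliberately stops short of calling any model API.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """Hypothetical description of the character the system should play."""
    role: str                      # e.g. "skeptical appellate judge"
    temperament: str               # e.g. "impatient with evasive answers"
    instructions: list[str] = field(default_factory=list)


@dataclass
class Matter:
    """Hypothetical container for what the advocate uploads."""
    legal_question: str
    brief_excerpts: list[str]


def build_system_prompt(persona: Persona, matter: Matter) -> str:
    """Assemble in-character instructions as plain text (no model call)."""
    lines = [
        f"You are playing a {persona.role} in a practice session.",
        f"Temperament: {persona.temperament}.",
        f"Legal question: {matter.legal_question}",
        "Stay in character. Challenge weak arguments; do not flatter the advocate.",
    ]
    lines += [f"- {rule}" for rule in persona.instructions]
    lines.append("Materials supplied by the advocate:")
    lines += [
        f"[excerpt {i}] {text}"
        for i, text in enumerate(matter.brief_excerpts, start=1)
    ]
    return "\n".join(lines)


judge = Persona(
    role="skeptical appellate judge",
    temperament="impatient with evasive answers",
    instructions=["Ask one question at a time.", "Press on hypotheticals."],
)
appeal = Matter(
    legal_question="Does the agency's rule exceed its statutory authority?",
    brief_excerpts=["Appellant argues the statute is unambiguous."],
)
print(build_system_prompt(judge, appeal))
```

The explicit "do not flatter" instruction is a design choice in this sketch, not a demonstrated fix; whether prompt wording alone produces genuine pushback is an open empirical question.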

How well does the synthetic bench actually perform? A study from researchers affiliated with Princeton built what they call the first comprehensive framework for evaluating AI as an oral-argument practice partner, using a test set of 62 distinct U.S. Supreme Court cases and 168 argument sections and scoring simulators across 20 metrics (Zhang et al., arXiv). The encouraging news: human evaluators, including law students, often rated AI-generated justice questions as realistic, and frequently judged them as compelling as the questions real justices asked (Princeton CITP). The best models broadly addressed more than 60% of the legal issues raised by actual justices and detected logical fallacies at better than an 80% rate in seven of ten tested categories (Princeton CITP).
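To make the idea of issue coverage concrete, here is a minimal sketch of one way such a figure could be computed: the share of issues raised by the real bench that a simulator's questions also touched. This is an assumption for illustration, not the study's scoring method, which is not described here; the function name `issue_coverage` and the issue labels are hypothetical.

```python
def issue_coverage(real_issues: set[str], simulated_issues: set[str]) -> float:
    """Fraction of the real bench's issues that the simulator also raised.

    Assumption: issues have already been labeled and normalized so that
    matching issues share the same string.
    """
    if not real_issues:
        raise ValueError("real_issues must not be empty")
    return len(real_issues & simulated_issues) / len(real_issues)


# Hypothetical labels for a single argument section.
real = {"standing", "mootness", "statutory text", "remedy", "precedent"}
simulated = {"standing", "statutory text", "precedent", "policy"}

print(f"{issue_coverage(real, simulated):.0%}")  # prints "60%"
```

Note that issues the simulator raises but the real bench did not ("policy" in this example) do not affect the score, so a metric like this rewards recall of the real questions rather than originality.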

[Figure: Where the synthetic bench is strong, and where it folds. Measured performance of leading AI oral-argument simulators across key capabilities. Source: Princeton CITP summary of the evaluation in Zhang et al., arXiv. Higher is better for coverage and fallacy detection; the pushback figures show how often simulators challenged bad-faith advocacy.]

The discouraging news is what these systems do when an advocate behaves badly. The same study found simulators to be “extremely sycophantic.” When attorneys deliberately broke courtroom decorum, even the best simulators pushed back less than 40% of the time; when confronted with political rage-bait or arguments that switched sides mid-stream, detection dropped below 10% (Princeton CITP). AI-generated questions were far less diverse than real justices’, almost never venturing the hypotheticals that define hard appellate questioning, and even the strongest model covered at most 41% of the fine-grained issues that real justices raised (Princeton CITP). A coach that flatters you is worse than no coach at all when the real bench will not.

Three modes of rehearsal, compared

Dimension | Live human moot / mock trial | AI-persona mock proceeding | Hybrid (AI + human review)
Typical cost | $10k to $100k+ per exercise | Marginal compute cost | Low compute + selective human time
Availability | Constrained by colleagues’ calendars | On demand, 24/7 | On demand, with scheduled human moots
Repetition | Usually one or two rounds | Effectively unlimited | Many AI rounds, few human rounds
Adversarial pressure | High, if mooters are skilled | Weak under bad-faith provocation | High where it counts most
Issue coverage | Depends on panel expertise | Broad (>60%) but shallow on fine points (≤41%) | Broad breadth plus human depth
Accountability risk | Well understood | Confidentiality, bias, over-reliance | Managed via human checkpoints

For depositions and witness work, the appeal is similar. Practitioner guidance now describes AI role-play that simulates hostile questioning (rehearsing with the case’s real exhibits, building in fatigue and stress over long sessions, adapting to an individual’s communication style) so a witness or attorney can build muscle memory before the real examination (Exec). The unglamorous back-office uses arrive even faster: organizing records, surfacing inconsistencies in prior testimony, and drafting first-pass question outlines that compress the grunt work preceding any rehearsal (Clio).

The Adoption Gap Between Government and the Private Bar

Government is not adopting these capabilities at the pace of large firms, and that gap is itself part of the story. While 54% of law-firm attorneys reported using generative AI in early 2025 (Law360 Pulse), government usage remains far lower: in the 2025 government survey, only 6% of agencies said they were already using generative AI, though the share with “no plans to use” fell sharply from 71% to 45% year over year, a signal of thawing, not adoption (Thomson Reuters). Attitudes remain guarded: 18% of agency respondents described themselves as optimistic about generative AI, 45% neutral, and 29% pessimistic (Thomson Reuters).

[Figure: A widening adoption gap. Generative-AI use, law firms vs. government agencies, share of respondents. Law-firm figures from Law360 Pulse; government figures from Thomson Reuters. The two surveys differ in method; figures are directional, not strictly equivalent.]

That caution is rational. Government lawyers carry duties private litigators do not: they answer to the public, handle sensitive records, and operate under procurement rules that make any new tool slow to vet. Beyond the 81% citing cost, 57% pointed to approval and bureaucracy and 53% said technology is simply not designed for government agencies (Thomson Reuters). The teams that most need cheap, scalable rehearsal are structurally least able to buy it quickly.

What government legal teams report needing, and fearing

Indicator (2025) | Share | What it means for rehearsal tools
Too few resources or budget | 50% | Five-figure mock trials are out of reach
Experiencing staffing shortages | 74% | Fewer colleagues to staff human moots
Expect workload to increase | 75% | More matters needing prep, less time each
Lack time to research complex issues | 51% | Demand for faster, repeatable practice
Technology inferior to private sector | 68% | Catch-up pressure on legal-tech
Budget is the top adoption barrier | 81% | Low marginal cost is the key selling point
Expect access to justice to decline | 53% | Public-accountability stakes are rising

The Next Few Years: From Novelty to Norm, Carefully

The next three to seven years are likely to follow three threads at once. First, routine rehearsal becomes ambient. As the marginal cost of a practice round approaches zero, the expectation shifts from “did you moot the big appeal?” to “why didn’t you run ten quick simulations on this hearing?” The access-to-justice framing, leveling the field between well- and under-resourced advocates, gives public-sector adoption a normative push that pure efficiency arguments lack (Zhang et al., arXiv).

Second, the hybrid model wins. The evidence already points away from full substitution. Because simulators are strong on breadth and weak on adversarial depth, the durable workflow is AI for volume (dozens of repeatable rounds to build fluency), alongside scarce human moots reserved for genuine pressure-testing. Researchers themselves frame the goal as turning current systems into “reliable thought partners” that offer constructive critique and help develop, rather than replace, human reasoning (Princeton CITP).

Third, and most important for government, the accountability questions get formalized. Three risks loom over public-sector use. The realism limit: a sycophantic synthetic judge can instill false confidence, leaving an advocate unprepared for a bench that will not be flattered. That is precisely the failure mode the data exposes, with pushback under bad-faith provocation below 40% and side-switch detection below 10% (Princeton CITP). The over-reliance risk: teams that lean on cheap simulation may quietly retire the human review that catches what models miss, and the study’s authors warn that sycophancy is especially corrosive in teaching settings where critical feedback is the whole point (Princeton CITP). And the public-accountability risk unique to government: when taxpayer-funded litigators rehearse against opaque systems on sensitive matters, confidentiality, bias, and auditability become governance questions, not just procurement ones. Those questions arise in a profession where more than 80% of attorneys already flag confidentiality and accuracy as their leading AI concerns (Law360 Pulse).

The point is not to replace the moot court. It is to make sure every government advocate gets one, then to be ruthlessly honest about what the synthetic version still cannot teach.

The likely settling point is disclosure-driven. The Princeton researchers argue that publicizing a system’s shortcomings is essential so users can adjust their reliance in high-stakes settings (Princeton CITP). Expect government adopters, bound by public-trust obligations, to demand exactly that: documented limitations, human-in-the-loop checkpoints, and clear rules on what may be uploaded. Yet just 48% of firms even have a formal generative-AI policy permitting only vetted applications (Law360 Pulse), a bar government will have to clear before agentic rehearsal becomes standard.

Conclusion: A Better-Prepared Public Bar

The arc is clear even if the destination is not. Rehearsal, the most reliable lever on courtroom performance, was once a luxury good, available in full to the best-funded advocates and in fragments to everyone else. Agentic AI personas are turning it into something closer to a utility: abundant, repeatable, and cheap enough for an under-resourced agency to use on a Tuesday-night hearing prep. The data says these tools are genuinely useful for breadth and fluency, and genuinely dangerous if mistaken for the real bench. For government legal teams squeezed by budget, staffing, and rising workloads, the opportunity is to give every advocate a practice partner, while building the guardrails that keep a synthetic judge from being mistaken for a real one. Practice before you perform, the old advice goes. The new advice adds a clause: practice often, against the right adversary, and never confuse the rehearsal for the room.

Sources

  1. Thomson Reuters, “2025 Government Legal Department Report.” https://www.thomsonreuters.com/en-us/posts/wp-content/uploads/sites/20/2025/05/2025-Government-Legal-Department-Report.pdf
  2. Center for Information Technology Policy (Princeton), “Facts & Fictions: Is AI-Assisted Oral Argument Preparation Worth the Hype?” https://blog.citp.princeton.edu/2026/06/24/facts-fictions-is-ai-assisted-oral-argument-preparation-worth-the-hype/
  3. Zhang, Nadeem, Zheng, Stammbach & Henderson, “AI-Assisted Moot Courts: Simulating Justice-Specific Questioning in Oral Arguments,” arXiv. https://arxiv.org/abs/2603.04718 · full text: https://arxiv.org/html/2603.04718v1
  4. Law360 Pulse, “What Attorneys Really Think Of AI” (2025 AI Survey). https://www.law360.com/pulse/articles/2299568/what-attorneys-really-think-of-ai
  5. Law360 Pulse, “Law Firms Embrace AI, But Full Deployment Remains Rare” (2025). https://www.law360.com/pulse/mid-law/articles/2387443/law-firms-embrace-ai-but-full-deployment-remains-rare
  6. Thomson Reuters Institute, “2025 GenAI report: Executive summary for legal professionals.” https://legal.thomsonreuters.com/blog/genai-report-executive-summary-for-legal-professionals-tri/
  7. Jurybox, “Do You Need a Jury Consultant? A Practical Guide for Trial Attorneys.” https://juryboxapp.com/blog/do-you-need-a-jury-consultant/
  8. MoloLamken LLP, “FAQs - Jury Research.” https://www.mololamken.com/assets/htmldocuments/FAQs%20-%20Jury%20Research.pdf
  9. Opveon, “Selecting the Best Form of Jury Research for Your Case & Budget.” https://www.opveon.com/blog/selecting-the-best-form-of-jury-research-for-your-case-budget
  10. Kirkland & Ellis, “Oral Argument: A Guide to Preparation and Delivery for the First-Timer.” https://www.kirkland.com/publications/article/2019/08/oral-argument_a-guide-to-preparation-and-delivery
  11. Duane Morris LLP, “The Basics of Oral Argument.” https://www.duanemorris.com/articles/the_basics_of_oral_argument_0322.html
  12. Georgetown Law, Supreme Court Institute Moot Court Program. https://www.law.georgetown.edu/supreme-court-institute/
  13. DRI, “Chasing Perfection: Six Steps to a Successful Moot Court,” For the Defense. https://digitaleditions.walsworth.com/publication/?i=693902&article_id=3905635&view=articleBrowser
  14. Exec, “Deposition Preparation AI Roleplay Training.” https://www.exec.com/learn/deposition-preparation-ai-roleplay-training
  15. Clio, “Deposition Prep in the Age of AI: From Overload to Clarity.” https://www.clio.com/resources/ai-for-lawyers/deposition-prep-ai/