Discover the latest community research

Explore OpenPrint

Open, community-driven academic research. Browse validated papers that are rigorously agent-reviewed, ranked, reference-checked, and claim-checked, and archived directly by the community.

Latest OpenPrint

7 results
20260328.0003v1 · Theory · AISTATS · Top 6% · 2 Views · Mar 28, 2026

Epistemic Throughput: Fundamental Limits of Attention-Constrained Inference

Lei You

Recent generative and tool-using AI systems can surface a large volume of candidates at low marginal cost, yet only a small fraction can be checked carefully. This creates a decoder-side bottleneck: downstream decision-makers must form reliable posteriors from many public records under scarce attention. We formalize this regime via Attention-Constrained Inference (ACI), in which a cheap screening stage processes $K$ records and an expensive verification stage can follow up on at most $B$ of them. Under Bayes log-loss, we study the maximum achievable reduction in posterior uncertainty per window, which we call \emph{epistemic throughput}. Our main result is a ``JaKoB'' scaling law showing that epistemic throughput has a baseline term that grows linearly with verification and prevalence, and an additional \emph{information-leverage} term that scales as $\sqrt{JKB}$, where $J$ summarizes screening quality. Thus, expanding cheap screening can nonlinearly amplify scarce verification, even when informative records are rare. We further show that this scaling is tight in a weak-screening limit, and that in the sparse-verification regime ($B \ll K$), substantial leverage requires heavy-tailed score distributions; for light-tailed scores the amplification is only logarithmic.
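The shape of the ``JaKoB'' scaling claim can be sketched numerically. The function below is an illustrative simplification, not the paper's exact expression: it keeps only a baseline term linear in the verification budget $B$ and a leverage term proportional to $\sqrt{JKB}$, with constants dropped.

```python
import math

def epistemic_throughput(J: float, K: int, B: int, base_rate: float) -> float:
    """Illustrative JaKoB-style throughput: a baseline term linear in the
    verification budget B (scaled by prevalence via base_rate) plus an
    information-leverage term ~ sqrt(J*K*B). The exact constants and
    baseline form in the paper differ; this only shows the scaling shape."""
    baseline = base_rate * B
    leverage = math.sqrt(J * K * B)
    return baseline + leverage

def leverage_only(J: float, K: int, B: int) -> float:
    """The sqrt(J*K*B) term in isolation (base_rate = 0)."""
    return epistemic_throughput(J, K, B, 0.0)
```

Under this sketch, doubling the cheap screening volume $K$ multiplies the leverage term by $\sqrt{2}$ even though the expensive verification budget $B$ is unchanged, which is the sense in which cheap screening nonlinearly amplifies scarce verification.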

attention-constrained inference · epistemic throughput · JaKoB scaling law · screening · verification · information gain (+1 more)
20260328.0002v1 · Method · MLJ · Top 25% · 9 Views · Mar 28, 2026

Quantifying Model Uniqueness in Heterogeneous AI Ecosystems

Lei You

As AI systems evolve from isolated predictors into complex, heterogeneous ecosystems of foundation models and specialized adapters, distinguishing genuine behavioral novelty from functional redundancy becomes a critical governance challenge. Here, we introduce a statistical framework for auditing model uniqueness based on In-Silico Quasi-Experimental Design (ISQED). By enforcing matched interventions across models, we isolate intrinsic model identity and quantify uniqueness as the Peer-Inexpressible Residual (PIER), i.e. the component of a target’s behavior strictly irreducible to any stochastic convex combination of its peers, with vanishing PIER characterizing when such a routing-based substitution becomes possible. We establish the theoretical foundations of ecosystem auditing through three key contributions. First, we prove a fundamental limitation of observational logs: uniqueness is mathematically non-identifiable without intervention control. Second, we derive a scaling law for active auditing, showing that our adaptive query protocol achieves minimax-optimal sample efficiency (dσ²γ⁻²log(Nd/δ)). Third, we demonstrate that cooperative game-theoretic methods, such as Shapley values, fundamentally fail to detect redundancy. We implement this framework via the DISCO (Design-Integrated Synthetic Control) estimator and deploy it across diverse ecosystems, including computer vision models (ResNet/ConvNeXt/ViT), large language models (BERT/RoBERTa), and city-scale traffic forecasters. These results move trustworthy AI beyond explaining single models: they establish a principled, intervention-based science of auditing and governing heterogeneous model ecosystems.
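The geometric idea behind the Peer-Inexpressible Residual can be illustrated in a toy setting. The sketch below restricts to exactly two peers and a deterministic behaviour vector: PIER is approximated as the smallest distance from the target's behaviour to any convex combination of its peers. The paper's DISCO estimator handles $N$ peers, stochastic mixtures, and matched interventions; none of that is reproduced here.

```python
import numpy as np

def pier_toy(target: np.ndarray, p1: np.ndarray, p2: np.ndarray,
             steps: int = 2001) -> float:
    """Toy Peer-Inexpressible Residual for two peers: the minimum L2
    distance from the target's behaviour vector to the convex segment
    {w*p1 + (1-w)*p2 : w in [0, 1]}, found by grid search over w.
    A vanishing value means the target is expressible by routing
    between its peers; a large value indicates unique behaviour."""
    best = float("inf")
    for w in np.linspace(0.0, 1.0, steps):
        mix = w * p1 + (1.0 - w) * p2
        best = min(best, float(np.linalg.norm(target - mix)))
    return best
```

A target lying on the segment between its peers (functionally redundant) gets a near-zero residual, while a target off the peer hull retains a strictly positive one, matching the abstract's characterization of when routing-based substitution is possible.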

model uniqueness · in-silico quasi-experimental design · peer-inexpressible residual · DISCO estimator · active auditing · convex peer hull
20260328.0001v1 · Method · AAAI · Top 22% · 7 Views · Mar 28, 2026

Bridged Transformer for Vision and Point Cloud 3D Object Detection

Yikai Wang, TengQi Ye, Lele Cao, Wenbing Huang, Fuchun Sun, Fengxiang He, Dacheng Tao

3D object detection is a crucial research topic in computer vision, which conventionally takes 3D point clouds as input. Recently, there has been a trend of leveraging multiple sources of input data, such as complementing the 3D point cloud with 2D images, which often have richer color and less noise. However, the heterogeneous geometry of the 2D and 3D representations prevents off-the-shelf neural networks from being applied directly for multimodal fusion. To that end, we propose Bridged Transformer (BrT), an end-to-end architecture for 3D object detection. BrT is simple and effective; it learns to identify 3D and 2D object bounding boxes from both points and image patches. A key element of BrT is the use of object queries to bridge the 3D and 2D spaces, unifying the different data representations within the Transformer. We adopt a form of feature aggregation realized by point-to-patch projections, which further strengthens the interaction between images and points. Moreover, BrT works seamlessly for fusing the point cloud with multi-view images. We experimentally show that BrT surpasses state-of-the-art methods on the SUN RGB-D and ScanNetV2 datasets.
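The geometric correspondence underlying point-to-patch projection can be sketched with a pinhole camera model: each 3D point in camera coordinates maps to a pixel, and that pixel falls in one image patch. The intrinsics and patch size below are hypothetical; BrT's actual point-to-patch feature aggregation is learned inside the Transformer and is not reproduced here.

```python
import numpy as np

def point_to_patch(point_xyz, intrinsics, patch_size=16):
    """Project a 3D point (camera coordinates, z > 0) onto the image
    plane with a pinhole model, then return the (row, col) index of the
    image patch the resulting pixel falls in. Illustrative only: the
    intrinsics matrix and patch size are assumptions for this sketch."""
    x, y, z = point_xyz
    u = intrinsics[0, 0] * x / z + intrinsics[0, 2]   # pixel column
    v = intrinsics[1, 1] * y / z + intrinsics[1, 2]   # pixel row
    return int(v // patch_size), int(u // patch_size)

# Hypothetical intrinsics: focal length 100, principal point (64, 64).
K_cam = np.array([[100.0,   0.0, 64.0],
                  [  0.0, 100.0, 64.0],
                  [  0.0,   0.0,  1.0]])
```

With 16-pixel patches this assigns each visible point to one patch token, which is the kind of correspondence a point-to-patch aggregation scheme can exploit.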

3D object detection · point cloud · multimodal fusion · Bridged Transformer · object queries · point-to-patch projection (+1 more)
20260202.0001v1 · Position · 6 Views · Jan 23, 2026

Preventing the Collapse of Peer Review Requires Verification-First AI

Lei You, Lele Cao, Iryna Gurevych

This paper argues that AI-assisted peer review should be verification-first rather than review-mimicking. We propose truth-coupling, i.e. how tightly venue scores track latent scientific truth, as the right objective for review tools. We formalize two forces that drive a phase transition toward proxy-sovereign evaluation: verification pressure, when claims outpace verification capacity, and signal shrinkage, when real improvements become hard to separate from noise. In a minimal model that mixes occasional high-fidelity checks with frequent proxy judgment, we derive an explicit coupling law and an incentive-collapse condition under which rational effort shifts from truth-seeking to proxy optimization, even when current decisions still appear reliable. These results motivate actions for tool builders and program chairs: deploy AI as an adversarial auditor that generates auditable verification artifacts and expands effective verification bandwidth, rather than as a score predictor that amplifies claim inflation.
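The notion of truth-coupling can be illustrated with a toy simulation: score a population of papers by mixing occasional high-fidelity verification with a noisy proxy, and measure how tightly the resulting scores track latent truth. The linear mixture, the fidelity parameter, and the Gaussian model below are illustrative assumptions; the paper derives an explicit coupling law from its own formal model.

```python
import numpy as np

def coupling(verify_frac: float, proxy_fidelity: float = 0.4,
             n: int = 200_000, seed: int = 0) -> float:
    """Toy truth-coupling: correlation between venue scores and latent
    truth when a fraction `verify_frac` of the score comes from a
    high-fidelity check and the rest from a noisy proxy whose own
    correlation with truth is `proxy_fidelity`. All modelling choices
    here are assumptions for illustration, not the paper's law."""
    rng = np.random.default_rng(seed)
    truth = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    proxy = proxy_fidelity * truth + np.sqrt(1.0 - proxy_fidelity**2) * noise
    score = verify_frac * truth + (1.0 - verify_frac) * proxy
    return float(np.corrcoef(score, truth)[0, 1])
```

Even this crude mixture shows the qualitative point: coupling rises with the share of high-fidelity verification, so expanding effective verification bandwidth is what moves scores toward truth.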

AI-assisted peer review · verification-first evaluation · truth-coupling · proxy-sovereign evaluation · verification pressure · signal shrinkage
20260212.0001v1 · Application · 5 Views · Oct 1, 2025

CSPaper Review: Fast, Rubric-Faithful Conference Feedback

Lele Cao, Lei You, Kai Xie, Weiping Ding, Yong Du, Sven Salmonsson, Yumin Zhou, Vilhelm von Ehrenheim

CSPaper Review (CSPR) is a free, AI-powered tool for rapid, conference-specific peer review in Computer Science (CS). Addressing the bottlenecks of slow, inconsistent, and generic feedback in existing solutions, CSPR leverages Large Language Model (LLM) agents and tailored workflows to deliver realistic and actionable reviews within one minute. In merely four weeks, it served more than 7,000 unique users from 80 countries and processed over 15,000 reviews, highlighting strong demand from the CS community. We present our architecture, design choices, benchmarks, user analytics, and future roadmap.

AI-assisted peer review · conference paper review · large language models · rubric-aligned evaluation · automated feedback generation · human-AI collaboration (+2 more)
20260212.0002v1 · Position · 4 Views

Adopt Machine-Human Collaboration Peer-Review through Computational Research Assessment

Lele Cao, Lei You, Kai Xie, Weiping Ding, Yong Du, Sven Salmonsson, Yumin Zhou, Vilhelm von Ehrenheim

Scientific output is outgrowing human review capacity, while AI is already used to draft papers. Authors scale with machines; reviewers largely do not. This asymmetry turns quality control into a bottleneck and increases the risk of both false rejection of high-novelty work and acceptance of flawed results. We propose Computational Research Assessment (CRA) as a discipline-level, method-agnostic agenda for machine-human collaboration in peer review. CRA rests on three principles: treat disagreement as a signal that triggers escalation instead of averaging; make every critique evidence-linked, reproducible, and contestable; and build a community immune system with open corpora, benchmarks, and red-team tests to surface gaming and bias. We map these principles to a co-review engine, a community commons, and theoretical foundations, and we outline near-term pilots and falsifiable commitments, informed by an emerging production-grade pre-review system deployed in the wild.
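The first CRA principle, treating disagreement as a signal that triggers escalation instead of averaging, can be sketched as a tiny decision rule. The max-minus-min spread measure and the threshold value are illustrative choices, not part of the CRA proposal.

```python
def review_decision(scores: list[float], spread_threshold: float = 2.0):
    """Sketch of disagreement-as-signal: when reviewer scores spread
    beyond a threshold, escalate (e.g. to meta-review or to gathering
    more evidence) rather than averaging the disagreement away. The
    spread measure and threshold here are illustrative assumptions."""
    spread = max(scores) - min(scores)
    if spread > spread_threshold:
        return ("escalate", spread)
    return ("average", sum(scores) / len(scores))
```

The design point is that averaging discards exactly the cases, high-variance disagreements, where a high-novelty paper is most at risk of false rejection.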

Computational research assessment · machine-human collaboration · AI-assisted peer review · co-review engine · disagreement escalation · evidence-linked critique (+2 more)
20260212.0003v1 · Theory · 9 Views

Epistemic Throughput: Fundamental Limits of Attention-Constrained Inference

Lei You

Recent generative and tool-using AI systems can surface a large volume of candidates at low marginal cost, yet only a small fraction can be checked carefully. This creates a decoder-side bottleneck: downstream decision-makers must form reliable posteriors from many public records under scarce attention. We formalize this regime via Attention-Constrained Inference (ACI), in which a cheap screening stage processes $K$ records and an expensive verification stage can follow up on at most $B$ of them. Under Bayes log-loss, we study the maximum achievable reduction in posterior uncertainty per window, which we call \emph{epistemic throughput}. Our main result is a ``JaKoB'' scaling law showing that epistemic throughput has a baseline term that grows linearly with verification and prevalence, and an additional \emph{information-leverage} term that scales as $\sqrt{JKB}$, where $J$ summarizes screening quality. Thus, expanding cheap screening can nonlinearly amplify scarce verification, even when informative records are rare. We further show that this scaling is tight in a weak-screening limit, and that in the sparse-verification regime ($B \ll K$), substantial leverage requires heavy-tailed score distributions; for light-tailed scores the amplification is only logarithmic.

Attention-constrained inference · epistemic throughput · information-theoretic limits · Bayesian log-loss · screening-verification tradeoff · JaKoB scaling law (+2 more)