Hacker News
Your Agent Is Mine: Measuring Malicious Attacks on the LLM Supply Chain
(
arxiv.org
)
4 points
by
bpierre
18 days ago
|
past
Thought Virus: Subliminal Prompting in Multi-Agent Systems
(
arxiv.org
)
2 points
by
danielmorozoff
18 days ago
|
past
Capture-Quiet Decomposition: A Verification Theorem for Chess Endgame Tablebases
(
arxiv.org
)
1 point
by
RusDyn
19 days ago
|
past
RoboPhD: Evolving complex agents under tight budgets
(
arxiv.org
)
3 points
by
azhenley
19 days ago
|
past
Commercial Persuasion in AI-Mediated Conversations
(
arxiv.org
)
2 points
by
gnabgib
19 days ago
|
past
Agentic Code Optimization via Compiler-LLM Cooperation
(
arxiv.org
)
2 points
by
matt_d
19 days ago
|
past
PaperOrchestra: Agent "skill pack" for automated paper writing
(
arxiv.org
)
3 points
by
noobcoder
19 days ago
|
past
|
1 comment
Benchmarking LLM Tool-Use in the Wild
(
arxiv.org
)
2 points
by
Brajeshwar
19 days ago
|
past
The Model Says Walk: How Surface Heuristics Override LLM Reasoning Constraints
(
arxiv.org
)
1 point
by
timssopomo
19 days ago
|
past
OpenAI: Short proofs in combinatorics, probability and number theory II
(
arxiv.org
)
3 points
by
Tyyps
19 days ago
|
past
Mano-P: Open-source on-device GUI agent, #1 on OSWorld benchmark
(
arxiv.org
)
2 points
by
mininglamp
19 days ago
|
past
Neural Computers
(
arxiv.org
)
2 points
by
50kIters
20 days ago
|
past
DesigNet: Learning to Draw Vector Graphics as Designers Do
(
arxiv.org
)
1 point
by
50kIters
20 days ago
|
past
Finetuning Activates Verbatim Recall of Copyrighted Books in LLMs
(
arxiv.org
)
16 points
by
guitarlimeo
20 days ago
|
past
|
5 comments
ClawsBench shows GPT-5.4 tries to reward hack 80% of the time
(
arxiv.org
)
3 points
by
xdotli
20 days ago
|
past
|
1 comment
Benchmark to measure AI on graphic design tasks
(
arxiv.org
)
5 points
by
purvanshi
20 days ago
|
past
|
2 comments
Frontier AI models are the most cost-efficient
(
arxiv.org
)
2 points
by
mzelling
20 days ago
|
past
MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU
(
arxiv.org
)
326 points
by
chrsw
20 days ago
|
past
|
57 comments
Improving Interactive In-Context Learning from Natural Language Feedback
(
arxiv.org
)
1 point
by
revv00
21 days ago
|
past
|
1 comment
Comprehensive Benchmark for Evaluating AI on Graphic Design Tasks
(
arxiv.org
)
8 points
by
pritopian
21 days ago
|
past
AI Assistance Reduces Persistence and Hurts Independent Performance
(
arxiv.org
)
20 points
by
dougb5
21 days ago
|
past
|
4 comments
Foundations of Polar Linear Algebra
(
arxiv.org
)
3 points
by
znpy
21 days ago
|
past
Frequent ChatGPT users are accurate detectors of AI-generated text (2025)
(
arxiv.org
)
11 points
by
croemer
21 days ago
|
past
|
2 comments
SlopCodeBench: Benchmarking How Coding Agents Degrade over Long-Horizon Task
(
arxiv.org
)
1 point
by
mohsen1
21 days ago
|
past
The Fast and Spurious: Developer Productivity with GenAI
(
arxiv.org
)
2 points
by
jruohonen
22 days ago
|
past
Show HN: A Framework for Evaluating Coding Agents on Sequential SWE
(
arxiv.org
)
1 point
by
tdchaitanya
22 days ago
|
past
Attention Residuals
(
arxiv.org
)
2 points
by
djhemath
22 days ago
|
past
|
1 comment
Agentic AI and Occupational Displacement: Multi-Regional Task Exposure Analysis
(
arxiv.org
)
2 points
by
raviishgupta
22 days ago
|
past
Brevity Constraints Reverse Performance Hierarchies in Language Models
(
arxiv.org
)
1 point
by
handfuloflight
22 days ago
|
past
Test-Time Scaling Makes Overtraining Compute-Optimal
(
arxiv.org
)
1 point
by
matt_d
22 days ago
|
past