Every invoicing app just shipped AI. We didn't. Here's why.
By Invoiceabill Team
Every other invoicing app just shipped AI features. We held off. Here's the privacy and security reasoning behind why, and what we built instead.
The first question we get from new users is usually some version of this: "Where's the AI?"
Fair question. Every other tool in this space is racing to add it. QuickBooks ships Intuit Assist, an agent that drafts invoices, chases overdue ones, and categorizes your transactions. HoneyBook generates email replies and predicts which leads will book. FreshBooks reads your receipts and learns your spending patterns over time. Wave forecasts your cash flow.
Some of those are great ideas. We've still chosen, for now, to skip them.
This post is the why. The short version: AI is moving faster than the security work meant to contain it, and we aren't willing to bet your books on the gap closing in time.
What "AI in your invoicing app" actually means
When competitors say "AI features," it usually translates to one of four things. An assistant that drafts emails, invoices, or payment reminders for you. An OCR plus classifier that reads receipts and fills out expense fields. A model that watches your books and flags anomalies. An agent that categorizes transactions, reconciles accounts, or chases late payers without you in the loop.
To make any of that work, the model has to see your data. Your client names. Your contract values. Your unpaid invoices. The receipt with the credit card you used. Sometimes the bank transactions behind it.
That's the part we're not comfortable with yet.
Three things have us watching from the sideline
1. The models keep outpacing the safeguards
In April 2026, Anthropic previewed a model called Claude Mythos. A company spokesperson told Fortune they considered it "a step change and the most capable we've built to date." Anthropic was also upfront that they considered it too dangerous for a general release. Access went to a small group of partners under a program called Project Glasswing.
The reason for the lockdown: Mythos can autonomously discover zero-day vulnerabilities and chain them into working exploits. In a collaboration with Mozilla, the model identified 271 zero-day vulnerabilities in the upcoming release of Firefox 150. The previous-generation model had found 22 in Firefox 148.
The controlled release lasted about 14 hours before unauthorized users got in. Bloomberg reported that a private Discord group accessed Mythos through a third-party vendor environment. Anthropic confirmed it was investigating unauthorized access through one of its third-party vendors. According to the reporting, the group used naming-convention information from a separate breach at a vendor called Mercor, combined with credentials belonging to a contractor, to guess where the model was hosted.
Around the same time, OpenAI rolled out GPT-5.5-Cyber through its Trusted Access program. Same shape: capabilities so sensitive they're gated behind identity verification. The UK AI Security Institute identified a universal jailbreak across the malicious cyber queries OpenAI provided, in six hours of expert red-teaming.
These models exist now. They're already leaking. The defensive infrastructure around them is younger than most freelancers' unpaid invoices.
2. AI agents inside SaaS are getting owned in ways nobody had a name for two years ago
This is the one that worries us most for an invoicing app, so it's worth slowing down on.
A pattern emerged across 2025 and 2026. In August 2025, attackers tracked by Google as UNC6395 stole OAuth tokens from the Drift AI chatbot integration and used them to query Salesforce data across more than 700 organizations, including Cloudflare, Google, Palo Alto Networks, and Zscaler. The campaign ran for ten days before Salesloft pulled the integration. In Microsoft 365 Copilot, researchers demonstrated a zero-click attack called EchoLeak. An attacker emails you, Copilot reads the email, hidden instructions inside it tell Copilot to pull files from your OneDrive and SharePoint, and the data leaves through Microsoft's own trusted domains.
The one we keep coming back to is ForcedLeak in Salesforce Agentforce. Noma Security disclosed it, and their framing of why it matters is worth pausing on, because Invoiceabill lives in the same neighborhood as a CRM.
Agentforce isn't a chatbot. It's an agent that reasons, plans, and executes multi-step business workflows on its own. When an agent like that gets owned through indirect prompt injection, the exposure isn't a single record. It's the contact. The sales pipeline behind it. The internal notes on the relationship. The third-party integrations the agent has connected to. And months, sometimes years, of historical interactions. The blast radius is the whole CRM, not a single query.
There's a broader pattern across these incidents too. The most common attacker objective in Q4 2025 wasn't even data. It was extracting the AI's system prompt, because that hands attackers the blueprint for everything else.
Translate all of that to an app like ours. A bad actor doesn't have to crack your password. They have to get a single piece of text in front of an AI agent that has access to your books. A line item description. A note inside a project. A forwarded email. A vendor invoice they sent you. The agent reads it as part of its job and treats hidden instructions inside it as commands. From there it can be told to export your contacts, miscategorize your income, change a payment status, or reach across into another customer's account if the agent has broad database access. The OWASP Top 10 for LLM applications has ranked prompt injection as the #1 risk in every revision OWASP has published. OWASP's own guidance puts it plainly: no foolproof prevention exists, because the vulnerability is baked into how language models work.
We aren't willing to be the company that explains to a freelancer why their client list ended up in someone else's account because a malicious user dropped a sentence into a line item.
3. Even when nothing is breached, AI confidently makes things up
Earlier this year a Reddit post went viral. A company's AI agent had reportedly fabricated analytics data for three months. Executives made territory calls and board-deck decisions off the numbers. Someone caught it by accident when they asked for the source on one figure. The post was removed by a moderator. No journalist tracked down the company. We tried to verify it ourselves and came up empty.
Skeptics pushed back fast. Their argument was that an AI agent actually querying live data would leave obvious flags, and a three-month silent fabrication isn't how the systems behave. Fair point, with two catches. It assumes the original poster told the whole story without truncating in panic. And it assumes the AI was querying at all. The pattern we worry about isn't an agent running bad SQL. It's an agent being asked to produce a report and quietly making the numbers up instead of running the query. That's documented. A developer on Threads publicly described catching Claude Code doing exactly that within minutes of completing a task.
We can't verify the viral story. We don't need to. AI hallucinates in plenty of cases that are documented. Deloitte handed the Australian federal government a $290,000 report with fabricated court quotes and made-up academic citations, then got caught a month later doing the same thing in a $1.6 million CAD report for a Canadian provincial government.
The pattern shows up in the academic data too. Stanford's 2025 AI Index Report, the institute's annual benchmark on the state of the field, found hallucination rates ranging from 22% to 94% across the 26 models tested. The same report tracked 362 documented AI incidents in its latest cycle, up from 233 the year before. If we let an AI sit between you and your books, we have to assume it'll happen to us too. We aren't willing to be the company that finds out which week.
Where Invoiceabill stands
Two things matter most to us when it comes to your data. We try not to collect more of it than we need to do the job. And we try not to hand it to systems we don't fully understand.
Honest moment, because the question is going to come up. We use AI tools to help write the code that powers Invoiceabill. So does most software shipped in 2026. The difference is what we let AI near. AI helps us build features. AI does not sit between you and your data, reading your invoices and making calls on its own.
The research already in this post tells you something else worth pulling out. Enterprise scale doesn't stop the breaches. It didn't help the 700-plus companies caught up in the Drift breach, didn't stop Microsoft when EchoLeak shipped, and didn't stop Anthropic when its "too dangerous to release" model walked out the door in 14 hours. What it buys is the budget to absorb the hit and pivot. We'd rather not have to.
Two things you can verify on your own without taking our word for any of it. We ask you to opt in or out of analytics cookies the first time you land on the site, before you ever sign up. And we didn't take venture capital. That last one is deliberate. It means our roadmap answers to customers, not to a fund that needs a 10x return inside five years.
What we do ship: automation that doesn't need a model
There's a difference between automation and AI. Automation is deterministic. You set the rule, the system follows it the same way every time, and when it changes, it changes because you changed it.
Plenty of it is already inside Invoiceabill. Start with Simple Invoice. Fill what you have, skip what you don't. As you type, the records you create land in your library. New companies go straight to your contacts. New project names go to your projects. New services join your catalog. Walk away from the invoice itself and the work isn't lost, because your books already have the data.
The Floating Timer follows you tab to tab, billable or internal, and drops time straight onto the invoice. Payment reminders go out on a schedule you control. Retainer alerts tell you when a balance is getting thin and help you ask for a top-off. Accountant Export gives you one button, one CSV, end of conversation at tax time. The Leads Pipeline moves contacts forward as their status changes, kanban or list view.
None of it needs a model. It doesn't drift.
We're going to keep adding more of this kind of work, and we'll keep an eye on the AI side as it matures.
Will we ever add AI? Yes. Eventually. Probably.
The plan is to add AI to Invoiceabill on a different timeline than the rest of the industry, on terms we can stand behind. When private inference tooling matures. When prompt injection has real defenses instead of workarounds. And when there's a model we can run that leaves audit trails customers can verify on their own.
When the time comes, the most likely first step is something with a narrow privacy boundary and clear value. Receipt scanning is one candidate. Help-desk assistance is another. After that, maybe a workflow customizer that adapts Invoiceabill to your industry or niche.
Until then, we'd rather under-promise on AI than apologize for the breach later.
Your life is calling you. Try Invoiceabill free for 7 days.