The Object Storage Default — Why Cloud Architects Get This Wrong

The previous articles in this series have examined what happens when agents bypass the application layer, write without transactional guarantees, and operate against data that cannot enforce its own integrity. Each of those arguments converges on the same principle: governance must live where the caller cannot bypass it.

This article applies that principle to a question that predates agents but is made considerably more urgent by them: where does binary content belong?

There is a piece of received wisdom in cloud architecture that has hardened into doctrine: binary content belongs in object storage. Images, documents, attachments, media — put them in S3, Azure Blob, or GCS, store a reference in the database, and move on. It appears in reference architectures, in certification materials, in the default recommendations of every major cloud vendor. Most architects absorb it early and never examine it again.

It produces insecure, unnecessarily complex applications. And in a world where AI agents are reading and writing against your data estate, the consequences of getting this wrong are considerably more serious than they were when humans were the only consumers.

The Security Problem Nobody Talks About

The argument against object storage as a default rarely leads with security. It should.

Databases provide fine-grained, policy-enforced access control. Row-level security, column masking, data redaction, audit logging at the record level — these are mature capabilities that allow you to define precisely what each user, role, or application identity can see and do. The policy lives inside the database boundary and is enforced at the point of access, regardless of how the caller arrives.

Object storage provides none of this. Access control is coarse-grained by design: bucket policies, IAM roles, pre-signed URLs. Policies can be narrowed by key prefix or object tag, but they know nothing about the record an object belongs to, so within a bucket there is no meaningful equivalent of row-level security. An identity with read access to a bucket can read everything in it. An identity with write access can write anything to it, overwrite existing content, or corrupt it, with no constraint enforcement, no trigger, and no policy standing between the caller and the data.

For applications where all callers are humans operating through a controlled application layer, this is manageable. The application enforces the fine-grained rules; the object store just holds the bytes.

That model breaks when agents are the callers.

An agent with read access to an object store bucket can read every document in it — across all customers, all versions, all classifications — constrained only by whatever coarse bucket-level policy is in place. In a RAG pipeline retrieving context for agent reasoning, this is not a theoretical exposure. It is the default behaviour. The agent retrieves whatever the bucket contains that matches the query, with no awareness of whether the caller should have access to those specific documents for that specific purpose.
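The difference between bucket-level and record-level scoping can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a real policy engine: SQLite has no row-level security, so the filter is written into the query by hand to show what a database policy would enforce on the caller's behalf. The table, its columns, and both function names are hypothetical.

```python
import sqlite3


def build_store() -> sqlite3.Connection:
    """Create an in-memory document table with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        " id INTEGER PRIMARY KEY,"
        " tenant TEXT NOT NULL,"
        " classification TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO documents (tenant, classification, body) VALUES (?, ?, ?)",
        [
            ("acme", "public", "acme pricing overview"),
            ("acme", "restricted", "acme pricing exceptions"),
            ("globex", "public", "globex pricing overview"),
        ],
    )
    return conn


def retrieve_bucket_style(conn, term):
    """Bucket-style read: everything that matches the query comes back."""
    rows = conn.execute(
        "SELECT body FROM documents WHERE body LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


def retrieve_context(conn, term, tenant, allowed):
    """Record-level read: tenant and clearance are part of the predicate."""
    marks = ",".join("?" for _ in allowed)
    rows = conn.execute(
        "SELECT body FROM documents"
        " WHERE body LIKE ? AND tenant = ?"
        f" AND classification IN ({marks})"
        " ORDER BY id",
        (f"%{term}%", tenant, *allowed),
    )
    return [row[0] for row in rows]
```

Calling `retrieve_bucket_style(conn, "pricing")` returns all three documents across both tenants, while `retrieve_context(conn, "pricing", "acme", ["public"])` returns only the one document that caller is entitled to see.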

Write access is worse. An agent that can write to object storage can overwrite or corrupt documents with no transactional control and no audit trail beyond access logs. There is no equivalent of a database check constraint or row-level policy to prevent it. Combined with the agent’s non-deterministic behaviour — the failure modes that earlier articles in this series have examined in detail — this is precisely the ungoverned state mutation that makes agentic architectures dangerous.

The series argument has been consistent: governance must live where the caller cannot bypass it. Object storage is architecturally incapable of providing that governance. Moving binary content into the database moves it inside the policy boundary — subject to the same access controls, the same audit trail, and the same transactional guarantees as everything else.
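What "inside the policy boundary" means for a write can be sketched as follows, again using sqlite3 as a stand-in for a production database. The schema, the trigger, and the `store_contract` helper are assumptions made for illustration. The point is only that the document, its integrity rule, and its audit row succeed or fail together.

```python
import hashlib
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create a table whose binary content is constrained and write-once."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE contracts (
            id      INTEGER PRIMARY KEY,
            deal    TEXT NOT NULL,
            sha256  TEXT NOT NULL,
            content BLOB NOT NULL CHECK (length(content) > 0)
        );
        CREATE TABLE audit_log (
            contract_id INTEGER NOT NULL,
            action      TEXT NOT NULL
        );
        -- Any attempt to overwrite stored content aborts the statement.
        CREATE TRIGGER contracts_no_overwrite
        BEFORE UPDATE OF content ON contracts
        BEGIN
            SELECT RAISE(ABORT, 'contract content is immutable');
        END;
        """
    )
    return conn


def store_contract(conn: sqlite3.Connection, deal: str, content: bytes) -> int:
    """Insert the document and its audit row in a single transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        cur = conn.execute(
            "INSERT INTO contracts (deal, sha256, content) VALUES (?, ?, ?)",
            (deal, hashlib.sha256(content).hexdigest(), content),
        )
        conn.execute(
            "INSERT INTO audit_log (contract_id, action) VALUES (?, 'created')",
            (cur.lastrowid,),
        )
    return cur.lastrowid
```

An empty payload is rejected by the CHECK constraint and leaves no audit row behind. A later `UPDATE` of `content` is rejected by the trigger, whoever the caller is.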

The Complexity Nobody Counts

The standard justification for object storage leads with cost. The complexity cost rarely appears in the same conversation, and it should, because it is real and it compounds.

Object storage is not a drop-in replacement for a database column. It is a separate system, with a separate API, a separate access control model, a separate backup process, a separate restore procedure, and separate monitoring requirements. Every one of those seams is a place where things can fail, diverge, or be misconfigured.

I have seen a quieter version of this in a curated knowledge repository I helped build. The repository held the structure, metadata, categories, and links between architecture material, but the documents themselves lived in a separate WebDAV repository. At the time that felt reasonable. The knowledge system could organise the material, and the document repository could store the files.

The problem arrived later, when the WebDAV repository was retired. The links could not simply be redirected at the host level, because the new repository used a different URL structure deep inside the path. The knowledge repository was full of links that no longer resolved cleanly to the documents they represented.

The migration took weeks. We had to run multiple passes of URL matching, and even then the translation could not be fully automated. Each knowledge contributor had to check their own subject area because only they could recognise whether a migrated link still pointed to the right document.

Then it happened again. The document store was later moved into SharePoint, with another change in structure and addressing. By then, many of the people who understood the content had moved on. The first migration had been painful but possible because the contributors still had enough context to verify the links. The second time, that institutional memory had gone. The knowledge store was still valuable, but maintaining the relationship between its records and the external documents was no longer feasible. It was killed off.

That is the kind of complexity that rarely appears in the original architecture decision. The document is treated as external content, but the business meaning lives in the relationship between the record and the document. Break that relationship, and the content has not merely moved; part of the knowledge model has broken.

Backup consistency is the most underappreciated risk. A database backup and an object store backup are separate operations on separate schedules with separate restore procedures. For content that is semantically part of a record — a contract attached to a deal, a document attached to a customer profile, an image that is part of a product record — split storage means split recovery. A restore that brings the database to a point in time but not the object store to the same point leaves records pointing to content that no longer corresponds to the state of the data. For SaaS vendors this is not just an operational concern — it is a product guarantee issue. The customer’s restore is your problem.
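The split-recovery problem can be made concrete with a small simulation. Everything here is hypothetical: the `Record` shape, the object keys, and the idea that each record stores a SHA-256 digest of the content it expects. The sketch restores the "database" to a later point in time than the "object store" and lists the records that no longer match.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str
    object_key: str
    sha256: str  # digest of the content this record expects to find


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def find_inconsistencies(records, object_store):
    """Return (record_id, reason) for each record the restored store cannot satisfy."""
    problems = []
    for rec in records:
        blob = object_store.get(rec.object_key)
        if blob is None:
            problems.append((rec.record_id, "missing"))
        elif digest(blob) != rec.sha256:
            problems.append((rec.record_id, "stale"))
    return problems


# Database restored to 12:00: deal-1 points at the signed contract,
# and deal-2 was created during the morning.
db_at_noon = [
    Record("deal-1", "contracts/deal-1.pdf", digest(b"contract signed")),
    Record("deal-2", "contracts/deal-2.pdf", digest(b"new deal")),
]

# Object store restored to 09:00: only the earlier draft of deal-1 exists.
store_at_nine = {"contracts/deal-1.pdf": b"contract draft"}

problems = find_inconsistencies(db_at_noon, store_at_nine)
```

Here `problems` reports deal-1 as stale and deal-2 as missing. With the content held in the same database as the records, a single point-in-time restore cannot produce either state.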

For ISVs and multi-cloud deployments, the complexity compounds further. Many object stores advertise S3 compatibility, but they diverge in practice: multipart upload behaviour, metadata handling, consistency models, and lifecycle policy syntax all differ between AWS S3, Azure Blob Storage, Google Cloud Storage, and Oracle Object Storage. An ISV shipping across multiple clouds and on-premises cannot assume a consistent object storage API. It must write and maintain an abstraction layer for each target, accept a lowest-common-denominator capability set, or discover at the worst possible moment that a specific deployment behaves differently from the others. Keeping binary content in the database removes this problem: the database API is the same regardless of deployment target.

Every external system an ISV integrates with also expands the compliance surface: SOC 2 and ISO 27001 audit scope, and GDPR data residency obligations. Object storage across multiple clouds multiplies that surface. Content inside the database stays within a boundary that is already in scope.

The Storage Cost Argument Doesn’t Hold Up

Object storage is cheaper per gigabyte than database storage. This is the primary justification, and it is true at face value. It does not survive scrutiny.

First, the comparison ignores database compression and deduplication. Modern databases apply compression transparently to stored content, including binary objects. On Oracle Autonomous Database, Advanced Compression is included in the service — compression, deduplication, and encryption for binary content stored as SecureFiles, with no additional licensing cost. On standard Oracle Enterprise Edition it requires the Advanced Compression option. The point is that the raw storage cost comparison, gigabyte for gigabyte, is not the honest comparison. Compressed database storage versus uncompressed object storage is a materially different calculation, and the gap is smaller than the headline numbers suggest.
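The arithmetic behind that point is simple enough to sketch. The prices and the compression ratio used in the usage note are placeholders chosen for easy arithmetic, not vendor list prices or measured results. Real ratios depend heavily on content type, and already-compressed formats such as JPEG or MP4 will shrink very little.

```python
def monthly_cost(raw_gb: float, price_per_gb: float, compression_ratio: float = 1.0) -> float:
    """Monthly storage cost after compression (a ratio of 2.0 halves the stored size)."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_gb / compression_ratio * price_per_gb


def price_gap(db_price: float, object_price: float, compression_ratio: float) -> float:
    """How many times more the database costs than an uncompressed object
    store for the same raw data."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return (db_price / compression_ratio) / object_price
```

With a hypothetical database price of 0.10 per GB, an object store price of 0.02 per GB, and a 2:1 compression ratio, the headline gap of five times becomes an effective gap of two and a half times. The gap is still real, but it is not the figure the per-gigabyte price list suggests.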

Second, the comparison ignores operational cost. Object storage requires separate tooling, separate management, separate monitoring. At scale those operational costs are real and they do not appear in the per-gigabyte price.

Third, and most importantly, the comparison is being applied to workloads that do not justify it. The economics and performance case for object storage is strongest at high volume and large file sizes — video libraries, image-heavy consumer platforms, document repositories measured in petabytes. Most enterprise applications are nowhere near that scale. The cost advantage that exists at the upper end of the market is being used to justify a default applied to workloads where it barely registers.

The Scale Argument Is Assumed, Not Demonstrated

Related to cost is scale. Specialist object storage is designed for high-volume, high-throughput workloads. Databases were not originally optimised for streaming large binary objects. At sufficient scale the performance characteristics diverge.

This is true, but it applies to a specific class of workload that most enterprise applications do not have. Logos, thumbnails, attachments, office documents, CMS assets, short videos, signed contracts: none of these generate the access patterns that motivated object storage. The argument is valid at the extremes, yet it is being applied as a general rule.

The question is not whether there exist workloads at sufficient scale to justify object storage. There clearly are. The question is whether that justification should be the starting assumption or the conclusion of an analysis. Currently it tends to be the starting assumption, applied without examination to applications that will never come close to the scale it was designed for.

The Content That Never Should Have Left the Database

It is worth being precise about the range of content that belongs in the database by default.

Text held in the application — customer comments, notes, support tickets, descriptions — has always belonged there. It is relational data in all but name. The case for externalising it has never existed.

JSON documents, configuration objects, structured content — modern databases handle these natively and well. There is no meaningful argument for external storage.

Small to medium binary content — logos, thumbnails, icons, signatures, small images that are incidental to the core function of the application — belongs in the database. The access patterns are record-oriented, the sizes are manageable, and splitting them out adds a seam that delivers no return. This holds for greenfield single-cloud deployments as much as for anything else.

Content management systems are a specific case worth noting. Many were originally designed for on-premises deployment, where object storage was not a viable option. Binary content in the database was the correct architectural decision at the time, and the governance and consistency properties it provides — unified backup, unified access control, no synchronisation overhead — are genuine advantages that do not disappear because cloud object storage now exists.

Office documents, PDFs, contracts — the case for keeping these in the database is strong, particularly where governance and integrity matter. Which brings us to the sharpest example.

Vital Documents and the Integrity Gap

Versioned legal contracts, regulatory submissions, audit evidence, court-admissible records — these have integrity requirements that object storage was not designed to meet.

Object storage can version files and restrict access, and some services can lock objects against deletion for a retention period (S3 Object Lock is one example). What it does not offer natively is cryptographic proof that a document has not been tampered with since it was stored, or a mechanism for chaining documents into a tamper-evident sequence.

Oracle Blockchain Tables, available in Oracle Autonomous Database, provide exactly this. Binary content stored as a BLOB column in a Blockchain Table is subject to cryptographic chaining — each row contains a hash of the previous row’s data, making tampering detectable. Rows cannot be deleted within a defined retention period, and optionally never. The immutability is enforced at the database engine level, not by access control policy that can be overridden by a sufficiently privileged identity.
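The general idea of cryptographic chaining can be shown in a few lines. This sketch illustrates the technique only and is not Oracle's implementation. The row layout, the all-zero genesis value, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first row


def row_hash(prev_hash: str, content: bytes) -> str:
    """Hash a row's content together with the previous row's hash."""
    h = hashlib.sha256()
    h.update(bytes.fromhex(prev_hash))
    h.update(content)
    return h.hexdigest()


def append(chain: list, content: bytes) -> None:
    """Add a row whose hash commits to everything stored before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"content": content, "prev": prev, "hash": row_hash(prev, content)})


def verify(chain: list) -> int:
    """Return the index of the first row that fails verification, or -1 if intact."""
    prev = GENESIS
    for index, row in enumerate(chain):
        if row["prev"] != prev or row["hash"] != row_hash(prev, row["content"]):
            return index
        prev = row["hash"]
    return -1
```

Changing the content of any row makes `verify` fail at that row. If an attacker also recomputes that row's hash, verification fails at the next row instead, because that row still records the original hash as its predecessor.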

For a legal contract where the integrity of the document could determine the outcome of litigation, or for a regulatory submission where any question of tampering could trigger a compliance investigation, the database is not just a reasonable choice. It is the architecturally superior one. Object storage offers no equivalent.

The Mental Model That Caused This

The object storage default did not emerge from careful analysis of workload requirements. It emerged from two related failures of thinking.

The first was the assumption that databases are relational stores — designed for rows and columns, and therefore inappropriate for binary content. This was never entirely accurate, and it became increasingly inaccurate as converged databases matured. Oracle has supported native binary storage via SecureFiles for well over a decade. Other major databases — PostgreSQL, SQL Server on-premises — offer partial implementations of binary storage within the transactional boundary, though none with the maturity and operational completeness of Oracle’s implementation. The category error is treating “relational database” as a synonym for “structured data only”, when the industry moved well past that limitation years ago.

The second was cloud vendor incentives. Object storage is a high-margin, high-retention service. AWS, Azure, and GCP all have strong commercial reasons to recommend it as a default, and their reference architectures reflect those incentives. That does not make the guidance dishonest — the workloads they design for often justify it — but it does mean the guidance should be read with an understanding of where it comes from. A reference architecture optimised for a hyperscale web platform is not automatically the right starting point for an enterprise line-of-business application.

The combination produced a default that was never really justified for most of the applications it was applied to. Repetition hardened it into received wisdom, and it is now being inherited by architectures that add AI agents as consumers, without anyone examining whether its security and governance properties are adequate for non-deterministic, autonomous actors operating at scale.

They are not.

The One Genuine Use Case

Object storage is the right answer for specific workloads: high-volume, large-format media where the economics and access patterns genuinely demand it. Video libraries. Image-heavy consumer platforms. Document repositories operating at petabyte scale. CDN delivery of static assets.

These are real workloads. They justify specialist storage. They are not most enterprise applications, and they do not justify a default applied universally without analysis.

What This Means in Practice

The architectural principle is straightforward: object storage requires justification. It should be the conclusion of an analysis, not the starting assumption.

For a new application — including greenfield, single-cloud, modern-stack applications — the default question is not “where does this binary content go?” but “is there a specific reason this content cannot stay in the database?” The reasons that answer that question in favour of object storage are genuine but bounded: very large files, very high volumes, CDN delivery requirements, access patterns that genuinely demand object storage semantics.
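One way to make that question explicit is to write it down as a check that has to return a reason before object storage is chosen. The sketch below is an assumption-laden illustration: the profile fields mirror the criteria named above, and the thresholds are deliberately left as inputs because the article does not define numeric cut-offs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentProfile:
    largest_file_mb: float
    total_volume_tb: float
    needs_cdn_delivery: bool = False
    needs_object_semantics: bool = False


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical cut-offs, to be set per organisation after measurement.
    very_large_file_mb: float
    very_high_volume_tb: float


def reasons_for_object_storage(profile: ContentProfile, limits: Thresholds) -> list:
    """List the specific justifications that apply; an empty list means none do."""
    reasons = []
    if profile.largest_file_mb >= limits.very_large_file_mb:
        reasons.append("very large files")
    if profile.total_volume_tb >= limits.very_high_volume_tb:
        reasons.append("very high volume")
    if profile.needs_cdn_delivery:
        reasons.append("CDN delivery")
    if profile.needs_object_semantics:
        reasons.append("object storage access patterns")
    return reasons


def default_store(profile: ContentProfile, limits: Thresholds) -> str:
    """The database is the default unless at least one reason applies."""
    return "object storage" if reasons_for_object_storage(profile, limits) else "database"
```

The design choice is that the function returns reasons rather than a score. Object storage is selected only when a named justification exists, which keeps the decision reviewable.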

For content that does not meet those criteria — which is most of the binary content in most enterprise applications — the database is the correct default. It provides governance, transactional integrity, backup consistency, fine-grained access control, and a unified operational model. It eliminates the seam that object storage creates, and it eliminates the security exposure that seam introduces when agents are the callers.

Received wisdom delivers insecure applications. Every architectural default deserves to be examined — especially when the consumers of that architecture are no longer exclusively human.

The next article extends this argument into AI-specific storage: the vector store and the agent memory store. If the object storage default creates a security and governance gap, the disconnected vector store creates the same gap in the infrastructure built specifically to serve AI agents — which makes it, if anything, more consequential.