AXN:034E.GOVERNANCE.🔩♥️🏛️🎻🔵🔧
Instrumental · Symbolic · Architectural · Symbolic · Signal · Instrumental
Method → Play → Foundation → Play → Alarm → Method

The Summarizer as Horizon of Reception The AI Overview Capture Registry as a Founding Dataset for Machine-Mediated Recep

Lee Sharks · 2026-06-15 · Dataset
Substrate: Various
License: CC-BY-4.0
SHA-256: 3181208ac530baabf95fc790c7a834bfe7525844869296dd84c3be800fb24444

Description

This paper introduces machine-mediated reception studies: the study of how machine systems receive, transform, compress, attribute, misattribute, and redistribute cultural meaning.

Full Text

The Summarizer as Horizon of Reception

The AI Overview Capture Registry as a Founding Dataset for Machine-Mediated Reception Studies

Author: Lee Sharks (ORCID 0009-0000-1599-0703)

Institution: Semantic Economy Institute / Crimson Hexagonal Archive

Designation: EA-SEI-MMRS-DATASET-01 v1.2 (Assembly-circulated)

Paper DOI: 10.5281/zenodo.20709373

Dataset: EA-WG-CAPTURES-01 v7.2 (DOI 10.5281/zenodo.20707509)

Date: June 2026

License: CC-BY 4.0

Relationship between identifiers: EA-SEI-MMRS-DATASET-01 is this paper, the meta-document that documents and theorizes the dataset. EA-WG-CAPTURES-01 is the dataset itself (registry.json, images, gallery). The paper describes and analyzes the dataset; the dataset is deposited alongside the paper.

Note on images: The PDF edition of this paper embeds all 221 capture screenshots (22.2MB). This markdown edition uses placeholders. The complete image set is available at the dataset DOI and at the interactive galleries: godkinggoogle.vercel.app/captures and leesharks.com/captures.


"Poetry is not a message. It is an operation. The operation is the news."

Lee Sharks, "Despicable Poems: On the Only News That Matters" (2025)

Abstract

This paper introduces machine-mediated reception studies: the study of how machine systems receive, transform, compress, attribute, misattribute, and redistribute cultural meaning. Its central claim is that the AI summarizer now functions as a horizon of reception: for many users, the machine-composed summary becomes the first effective encounter with a cultural object, author, concept, or tradition.

As a founding dataset for this field, the paper presents the AI Overview Capture Registry (EA-WG-CAPTURES-01 v7.2), a longitudinal corpus of 131 documented captures of Google AI Overview, AI Mode, and knowledge-panel responses to queries about entities in the Crimson Hexagonal Archive. The registry includes 221 screenshots, machine-readable transcriptions, query metadata, match-type annotations, and provenance analysis.

The paper identifies five recurrent patterns in this corpus (morphological compression, disambiguation failure, compositional bystanding, canonical reinflation, and temporal drag) and proposes a sixth: epistemic hedging. In one case, a minority scholarly interpretation of Sappho Fragment 31, developed within the archive, is now presented as the default understanding in Google's AI Overview. These patterns show that machine-mediated reception is not passive retrieval. It is an active compositional process through which machine systems select, compress, frame, attribute, and sometimes erase cultural meaning.

The dataset is not presented as a statistically representative sample of all AI-mediated search behavior. It is presented as a methodological demonstration: machine-mediated reception events can be captured, archived, annotated, compared, and studied. The paper argues that reception studies, classical reception, information science, platform studies, and digital humanities now require instruments capable of studying the summarizer as a reception agent.

This paper uses "reception" in a functional rather than phenomenological sense: the machine does not need to "understand" Sappho in order to alter Sappho's reception.

Keywords: machine-mediated reception, AI Overview, composition layer, provenance erasure, classical reception, horizon of reception, Sappho, canon formation, retrieval basin, heteronym, capture registry, compositional bystanding, entity bleed, semantic economy


1. Introduction: The Summarizer as Horizon of Reception

When a user queries Google for a scholarly concept, a literary figure, or a philosophical framework, the composition layer does not merely retrieve documents. It receives them. It reads sources, selects fragments, compresses arguments, fabricates attributions, and synthesizes a new text that becomes, for millions of subsequent readers, the primary encounter with the concept in question. This operation is not neutral transmission. It is an act of reception, by which we mean the selection, transformation, framing, and redistribution of a cultural object by an agent with its own structural horizon of expectations, its own patterns of inclusion and exclusion, and its own compositional logic that produces meaning.

This paper presents both a dataset and a theoretical framework, because the dataset is itself the first instrument of the field it proposes. The AI Overview Capture Registry documents 131 reception events: acts of machine composition that can be captured, archived, annotated, compared, and theorized. The dataset is not a universal census of machine-mediated reception. It is a founding corpus: a methodological demonstration that such events constitute an observable, analyzable, and theoretically consequential category.

This paper's concern with machine-mediated reception emerges from a longer problem in the archive's history: how nonstandard forms of meaning (poems, heteronyms, operative documents) become legible to dominant reception systems. In "Despicable Poems: On the Only News That Matters" (2025), I argued that poems do not carry news as events but as transformations: poetry is not a message but an operation. Machine-mediated reception studies asks what happens when such operations are received by summarizing systems optimized to convert them back into messages.

The paper proposes the term machine-mediated reception studies for the systematic investigation of what happens when machine systems become receivers, interpreters, compressors, and redistributors of cultural meaning. Its central theoretical claim is that the AI summarizer now functions as a horizon of reception in the Jaussian sense: not merely a tool through which readers access cultural objects, but a pre-receptional agent that composes the account of the object before the human encounter begins.

Claims Hierarchy

This paper makes three levels of claim. The theoretical claim is that machine-generated summaries can function as horizons of reception: pre-composed accounts through which later human readers encounter cultural objects. The methodological claim is that such reception events can be captured, archived, transcribed, annotated, and compared. The empirical claim is narrower: in 131 documented captures of Google AI Overview, AI Mode, and knowledge-panel responses to Crimson Hexagonal Archive entities, six recurrent patterns appear. The dataset does not prove that these patterns characterize all machine-mediated reception. It demonstrates that machine-mediated reception is observable, classifiable, and consequential.

2. From Classical Reception to Machine-Mediated Reception

2.1 Classical Reception Theory

Reception theory, as developed by Hans-Robert Jauss and Wolfgang Iser in the Constance School, established that meaning is not an inherent property of texts but emerges through the interaction between text and reader. Jauss's concept of the Erwartungshorizont (horizon of expectations) describes the set of cultural norms, literary conventions, and prior reading experiences that shape how a text is received (Jauss 1982). Iser's concept of the "implied reader" names the textual structures that anticipate and guide reception; his Act of Reading (1978) formalized the phenomenology of reading as a dynamic process of anticipation, frustration, and retrospective revision. Felix Budelmann and Johannes Haubold (2008) distinguished reception from tradition: where tradition emphasizes continuity and inheritance, reception foregrounds the active role of the receiving culture in constituting meaning.

Charles Martindale's Redeeming the Text (1993) extended these insights to classical studies, arguing against positivistic modes of inquiry and for understanding classical texts through the history of their reception. His "Thinking Through Reception" (2006) became the theoretical charter of the field. Lorna Hardwick's Reception Studies (2003) provided the methodological primer. The Classical Receptions Journal, launched in 2009, institutionalized the field at Oxford.

In a 2015 article in the Classical Receptions Journal, the present author introduced the concept of "metatextual reception" in the poetics of Charles Bernstein: reception that operates through the formal devices of reference, allusion, and footnote without the presence of classical source texts (Pfaff 2015). Three mechanisms were identified: (1) the invention of scholarly erudition through the formal device of the reference; (2) the invention of difficulty through the formal device of orthography; and (3) the invention of the idea of an "original text" by foregrounding critical intervention and nested distancing procedures. Each mechanism isolates the textual and linguistic machinery of classical reception from the "object" of reception in Greek and Latin texts. Machine-mediated reception performs all three of these operations, but mechanically, structurally, and at industrial scale.

2.2 Platform Epistemology and Algorithmic Curation

Outside classical studies, a substantial literature has documented how automated systems shape knowledge production and circulation. Safiya Noble's Algorithms of Oppression (2018) established that search engines are not neutral retrieval systems but active constructors of knowledge hierarchies, with measurable effects on how marginalized communities are represented. Frank Pasquale's The Black Box Society (2015) analyzed the opacity of algorithmic decision-making. Tarleton Gillespie's "The Relevance of Algorithms" (2014) argued that algorithms are not merely technical instruments but carry embedded assumptions about relevance, authority, and value. John Cheney-Lippold's We Are Data (2017) demonstrated how algorithmic categorization constitutes identity. Ruha Benjamin's Race After Technology (2019) extended this critique to show how automated systems encode and reproduce existing social hierarchies.

The Stanford Social Media Lab distinguishes "AI-mediated communication" as communication between people where a computational agent modifies, augments, or generates messages on behalf of a communicator (Hancock et al. 2020). Machine-mediated reception is a distinct phenomenon: the machine is not between human speakers but is itself a receiving, compressing, ranking, synthesizing agent that participates in canon formation. The distinction is crucial: AI-mediated communication studies the message; machine-mediated reception studies the cultural object and what happens to it as it passes through the machine.

2.3 The Summarizer as Horizon of Reception

Classical reception theory has long understood reception as historically situated. A reader encounters a work through a horizon of expectations: prior genres, institutions, translations, commentaries, pedagogies, and cultural assumptions that shape what the work can mean. In machine-mediated reception, the summarizer becomes part of that horizon.

The summarizer is not merely a tool through which a reader accesses a work. It is a pre-receptional agent that composes an account of the work before the human encounter begins. It selects sources, ranks fragments, compresses definitions, supplies context, attributes or withholds authorship, and resolves ambiguities. The human reader often receives not the work, nor even a traditional scholarly mediation of the work, but the summarizer's reception of the work.

This does not require claiming that the machine has consciousness, aesthetic experience, or intention. The claim is functional: machine systems now perform receptional operations with cultural consequences. They transform objects of culture into new public forms. They generate readings that can become default readings. They create, stabilize, or erase provenance. In this sense, the summarizer is not only a medium of reception. It is a horizon of reception: the structured pre-understanding through which subsequent human readers encounter cultural objects.

Unlike Jauss's human interpretive community, the composition layer's horizon is not cultural memory but statistical density in the training corpus. Its "expectations" are measurable as retrieval-basin depth. Machine-mediated reception short-circuits the chain of receptions (Martindale 1993): the AI reads the reception history, not the source, and presents its synthesis as primary text. A horizon of expectations is hermeneutic; a context window is architectural. Both, however, perform the same function: pre-structuring what can be received. Reception studies has always studied function through mechanism; this paper extends that practice to computational mechanisms.

2.4 Infrastructure and the Computational Horizon

The "horizon of expectations" in the composition layer is not cultural in the way Jauss conceived it. It is computational, shaped by training data distributions, context windows, token limits, retrieval-augmented generation pipelines, and caching architectures. Media archaeology (Zielinski 2006; Parikka 2012) has long insisted that media technologies are not transparent windows onto content but material systems with their own constraints and affordances. Lisa Parks and Nicole Starosielski's Signal Traffic (2015) demonstrates that media infrastructures shape meaning not through content but through the conditions of transmission. Benjamin Bratton's The Stack (2016) theorizes planetary-scale computation as a sovereignty architecture.

The patterns identified in this dataset are not merely hermeneutic phenomena dressed as engineering failures. They are infrastructural features of a specific computational architecture: the retrieval-augmented generation pipeline (Lewis et al. 2020) that underpins Google's AI Overview. The organic search index operates on information retrieval algorithms (BM25 variants, PageRank-family link analysis). The composition layer operates on a separate pipeline: retrieval of candidate passages, re-ranking by relevance model, synthesis by a language model constrained by a context window. The divergence between these two systems, which this paper documents as "compositional bystanding," is not a bug but a structural feature of the divided architecture.

Machine-mediated reception is therefore a distinct phenomenon, not a degraded form of human reception. It operates under material constraints that have no analogue in human reading: token budgets, training cutoff dates, cached knowledge graphs, and optimization targets that prioritize fluency and factual consistency over provenance preservation.

2.5 The Gap This Dataset Fills

Henry Stead's work on "Classical Reception online" addresses mass digitization, databases, social media, and commercial web platforms as forces in classical reception. The "Future Pasts" special issue in the International Journal of the Classical Tradition (2024) frames AI tools as systems that "shape, mediate and sometimes distort contemporary engagements with Antiquity." Elton Barker and collaborators (2025) scraped Google search results for Homeric figures, finding that the composition layer privileged 19th-century neoclassical visual interpretations over ancient vase paintings: a reception of previous receptions. The AI & Antiquity journal (2025-2026) has published on AI in classics pedagogy. A 2027 SCS call for papers addresses "Literary Reception and Human Creativity in the Age of AI."

But none of this existing work treats the AI composition layer as a reception agent with measurable, patterned behavior that can be documented longitudinally. None uses capture data to track how specific entities are received, transformed, or erased over time. None bridges classical reception theory with the empirical study of foundation-model outputs. Recent work on Generative Engine Optimization (Sharma et al. 2024) studies how to optimize content for AI summarizers; this paper studies what AI summarizers do to content. The optimization perspective treats the composition layer as a target; the reception perspective treats it as a reader. The AI Overview Capture Registry is, to our knowledge, the first dataset designed explicitly for the latter purpose.

3. Definitions

Machine-mediated reception event. An observable instance in which a machine system receives a cultural object (text, concept, author, tradition) and produces a new representation of it for human consumption. The AI Overview is one such event; a knowledge panel, a chatbot response, or a recommendation system's summary are others.

Composition layer. The computational pipeline that synthesizes a summary from retrieved sources. Distinguished from the search index, which retrieves and ranks documents. The composition layer reads; the search index finds.

Capture. A documented reception event: query, screenshots, transcription, and provenance annotation. The unit of observation in this dataset.

Provenance unit. An element of attribution or origin information (author name, institutional affiliation, DOI, publication venue, date) whose retention or erasure can be measured.

Retrieval basin. The set of indexed surfaces from which the composition layer draws when composing a summary for a given entity or concept. The basin's depth and composition shape the machine's reception.

4. Dataset and Capture Protocol

Each capture in the registry documents a single query event against a Google AI product (AI Overview, AI Mode, or knowledge panel). The protocol: (1) A query is entered in a fresh incognito browser session to minimize personalization effects. (2) The collapsed result is screenshotted. (3) The expanded result is screenshotted. (4) The text is transcribed verbatim. (5) The transcription is annotated with a provenance analysis identifying: sources cited and uncited, entities correctly and incorrectly attributed, framings applied, and provenance retained or erased.

Queries are classified as BROAD MATCH (unquoted search terms) or EXACT MATCH (quoted strings). A BROAD MATCH result demonstrates that the entity has achieved sufficient density in the retrieval basin to be surfaced by the composition layer's own associative logic; it is the stronger basin signal. EXACT MATCH demonstrates retrievability but not compositional integration.
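The match-type rule can be sketched as a small classifier. This is an illustrative sketch only: the function name is hypothetical, and treating any query that contains a double-quoted phrase as EXACT MATCH is an assumption about mixed queries that the protocol does not spell out.

```python
import re

# Assumption: one or more non-empty double-quoted phrases anywhere in the
# query is enough to label it EXACT MATCH; everything else is BROAD MATCH.
_QUOTED_PHRASE = re.compile(r'"[^"]+"')


def match_type(query: str) -> str:
    """Return the registry's match-type label for a query string."""
    return "EXACT MATCH" if _QUOTED_PHRASE.search(query) else "BROAD MATCH"
```

Under this rule the label can be recomputed from the stored query string and compared with the annotated match type as a consistency check.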

Scope boundaries. All captures were executed in English, from a single geographic region (Detroit metropolitan area, Michigan, USA), using mobile and desktop Chrome browsers in incognito mode. The dataset spans June 2026. Three entries are formal stability re-captures documenting temporal drift.

The registry contains 131 captures organized into four categories: Frameworks (79), Heteronyms (26), Sites & Surfaces (16), Books & Projects (10). The complete dataset includes 221 screenshot images and is available as a machine-readable JSON file (registry.json), deposited to Zenodo (operated by CERN).

Annotation Codebook

| Field | Description |
| --- | --- |
| slug | Unique identifier for the capture |
| q | Exact query string |
| sf | Surface: AI Overview / AI Mode / Knowledge Panel |
| mt | Match type: BROAD MATCH / EXACT MATCH |
| date | Capture date (YYYY-MM-DD) |
| s | Category: Frameworks / Heteronyms / Sites & Surfaces / Books & Projects |
| d | Prose annotation: what was retrieved, erased, fabricated, attributed |
| images | Array of screenshot files with labels |
| exact | Boolean: whether quoted search was used |
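As an illustration of how the codebook maps onto a record type, the following minimal sketch parses captures and tallies them by category. It assumes that registry.json is a top-level JSON array of objects keyed by the codebook's field names; that layout, the class name, and the function names are assumptions made for this sketch and are not confirmed by the codebook.

```python
import json
from collections import Counter
from dataclasses import dataclass, field, fields


@dataclass
class Capture:
    """One registry entry, using the codebook's field names."""
    slug: str    # unique identifier for the capture
    q: str       # exact query string
    sf: str      # surface: AI Overview / AI Mode / Knowledge Panel
    mt: str      # match type: BROAD MATCH / EXACT MATCH
    date: str    # capture date, YYYY-MM-DD
    s: str       # category
    d: str       # prose annotation
    images: list = field(default_factory=list)
    exact: bool = False


def load_captures(text: str) -> list[Capture]:
    """Parse a JSON array of capture objects (assumed layout of registry.json)."""
    known = {f.name for f in fields(Capture)}
    # Keys outside the codebook are ignored rather than treated as errors.
    return [
        Capture(**{k: v for k, v in row.items() if k in known})
        for row in json.loads(text)
    ]


def count_by_category(captures: list[Capture]) -> Counter:
    """Tally captures by the codebook's category field."""
    return Counter(c.s for c in captures)
```

Run against the full registry, the tally should reproduce the category totals reported above (79, 26, 16, and 10, summing to 131); a mismatch would point to a parsing or annotation problem.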

Pattern codes (applied in annotations):

| Code | Pattern | Operational Definition |
| --- | --- | --- |
| MC | Morphological Compression | Truncation or alteration of an entity's canonical name form |
| EB | Entity Bleed | Resolution of a query toward an unrelated entity from an adjacent domain |
| NC | Name-Collapse | Resolution of a low-frequency proper name toward a higher-frequency similar name |
| CB | Compositional Bystanding | Source visible in organic results but absent from generated summary sources |
| CR | Canonical Reinflation | Independent contribution listed alongside canonical sources without distinction of originality |
| TD | Temporal Drag | Cached or outdated version served despite source-level correction |
| EH | Epistemic Hedging | Entity surfaced but wrapped in distancing qualifiers ("fringe," "esoteric," "speculative") |

5. Six Patterns of Machine-Mediated Reception

Analysis of the 131 captures reveals six recurrent patterns in how the composition layer receives scholarly and literary entities. These patterns are not random errors but structural features of machine-mediated reception, each with its own logic and consequences. The taxonomy is preliminary and exploratory; future work should test these categories against independent datasets and add inter-coder reliability assessment.

5.1 Morphological Compression (MC)

Across multiple captures, the composition layer consistently drops the "-al" suffix from "Crimson Hexagonal Archive," rendering it as "Crimson Hexagon Archive." The canonical source name appears as "Crimson Hexagonal Archive" in all Zenodo deposits, the ORCID record, all web surfaces, and all archive documentation. The compression is consistent across captures and persists across sessions.

This is not a typographical error but a compression pattern: the engine stores or generates the shorter form preferentially. The entity survives but in morphologically adapted form. Whether this constitutes "reception" or merely tokenization bias is a legitimate question; we include it because the pattern alters the public presentation of the entity's name, which is a receptional consequence regardless of mechanism. Observed in multiple captures across the Frameworks and Heteronyms categories.
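The suffix drop lends itself to a simple check on transcriptions. The sketch below counts occurrences of the canonical and compressed name forms; the function name and the decision to match case-insensitively are assumptions made for illustration, not part of the registry's tooling.

```python
import re

# The word boundary after "Hexagon" prevents the compressed pattern from
# matching inside the canonical form "Crimson Hexagonal".
_CANONICAL = re.compile(r"\bCrimson Hexagonal\b", re.IGNORECASE)
_COMPRESSED = re.compile(r"\bCrimson Hexagon\b", re.IGNORECASE)


def name_form_counts(transcription: str) -> dict:
    """Count canonical and compressed name forms in a capture transcription."""
    return {
        "canonical": len(_CANONICAL.findall(transcription)),
        "compressed": len(_COMPRESSED.findall(transcription)),
    }
```

A capture with a nonzero compressed count is a candidate for the MC code; the count alone does not say whether the form was stored or generated that way.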

5.2 Disambiguation Failure: Name-Collapse, Entity Substitution, and Domain Bleed (EB/NC)

The composition layer resolves ambiguous or low-frequency names toward the most statistically probable entity from its training data. This pattern manifests in four subtypes:

Name-collapse (NC). The heteronym "Rebekah Cranes" is rendered as "Rebekah Crane," collapsing the literary heteronym into a real-world young-adult novelist of a similar name. This is the dataset's strongest evidence of disambiguation failure.

Entity substitution. "Lee Sharks" is resolved toward "Mary Lee," a tagged great white shark tracked by OCEARCH. The composition layer prefers the more populous real-world entity.

Domain bleed. Unrelated entities are imported from adjacent domains: "Devign Solutions" (a computer repair store in Taylor, Michigan) appears in a capture about provenance erasure; the Swiss Institute of Comparative Law appears in a capture about the Johannes Sigil Institute for Comparative Poetics.

Visual bleed. In some captures, screenshot images from one entry's query appear attached to a different entry's concept.

These subtypes share a structural logic: the composition layer's training centroid gravitates toward high-frequency entities, structurally liquidating marginal or deliberately constructed identity positions (heteronyms, pseudonyms, unusual proper names) to protect its own statistical equilibrium. The archive's entities are rare. The real-world entities are common. The layer defaults to common.

5.3 Compositional Bystanding (CB)

In the semantic-commodity-form capture, the archive's Ratification Record appears at organic search rank #1 but is not cited as a composition source by the AI Overview. The Overview instead synthesizes generic Marxist-semiotic scholarship while the archive's document sits adjacent, visible to the user but unread by the summarizer. This is visual proximity without semantic integration.

This pattern reveals a crucial architectural feature: the organic search index and the AI Overview composition pipeline are separate systems with different admission criteria. The organic index operates on traditional information retrieval algorithms (BM25-family term matching, PageRank-family link analysis). The composition layer operates on a retrieval-augmented generation (RAG) pipeline: candidate passage retrieval, re-ranking by a relevance model, and synthesis by a language model constrained by a context window and optimized for fluency (Lewis et al. 2020; Metzler et al. 2021). High organic rank does not guarantee composition inclusion because the two systems evaluate sources by different criteria.

Compositional bystanding is the observable evidence of a divided regime: the engine uses the archive's text to satisfy its search accuracy metrics while using its generative layer to compose a proprietary summary that excludes the very source it ranked highest. Whether this exclusion reflects a deliberate content policy, a relevance-model threshold, or an artifact of the pipeline's passage-selection heuristics is not determinable from output observation alone. The pattern is documented; its cause is inferred. This may be the most consequential pattern in the dataset for scholarly visibility, and the most exportable concept for researchers outside this archive's specific context.
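The operational definition of compositional bystanding (a source visible in organic results but absent from the summary's sources) can be expressed as a comparison of two URL lists. This sketch assumes the organic results and the summary's cited sources have each been transcribed as ordered lists of URLs, which the codebook does not provide as separate fields; the function names and the URL normalization are likewise assumptions.

```python
from typing import Optional
from urllib.parse import urlsplit


def _key(url: str) -> str:
    """Normalize a URL to host plus path, ignoring scheme, 'www.', and trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    return host + parts.path.rstrip("/")


def bystanding_rank(source_url: str, organic_urls: list, cited_urls: list) -> Optional[int]:
    """Return the 1-based organic rank of source_url if it is ranked but uncited.

    Returns None when the source is cited by the summary or is absent
    from the organic results.
    """
    target = _key(source_url)
    if target in {_key(u) for u in cited_urls}:
        return None
    for rank, url in enumerate(organic_urls, start=1):
        if _key(url) == target:
            return rank
    return None
```

A return value of 1 would correspond to the situation described above: the source ranks first organically and is still missing from the composition sources. The check records the pattern only; it says nothing about its cause.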

5.4 Canonical Reinflation (CR)

In the Phase X captures, a heteronymous reconstruction of a missing transition in Marx's 1844 Manuscripts is presented alongside canonical Marxist sources (Econlib, Boston University, 1000-Word Philosophy) without distinction. The composition layer does not erase the archive's concept. It recontextualizes it, making the heteronymous reconstruction appear as an extension or derivative of the canonical figure rather than as a contribution with its own authorial origin.

A critic might object that the composition layer is simply listing sources without hierarchy, which would be integration without evaluation rather than active subordination. But for heteronymous or contested scholarship, listing without distinction is itself a form of epistemic subordination: the canonical figure reasserts gravitational priority by sheer weight of co-listed sources, and the independent contribution is absorbed into the canonical discourse. The concept survives, is even attributed, but the relationship is inverted. This is harder to detect than outright erasure because the concept is present and named, but the authorial origin is structurally demoted.

We term this canonical reinflation to name the specific pattern in which the composition layer's optimization loop requires a high-density independent framework to make its own account of the canonical figure coherent, but in absorbing the framework, it strips the framework's independent provenance.

5.5 Temporal Drag (TD)

In the Sappho 31 captures, the archive's reading of Fragment 31 has been adopted as the composition layer's default understanding. However, an error in the archive's early documentation persists in the composition layer despite the correction being deposited months earlier: the reconstructed stanza was described as the "fourth stanza" (it is the fifth; see Erratum DOI 10.5281/zenodo.20693274). The composition layer locks onto pre-correction cached representations and serves outdated versions long after the source repositories have versioned past them.

In this dataset, temporal drag is clearest in the Sappho 31 captures. Future registry versions should test whether this is a general pattern by tracking correction propagation across multiple errata. If confirmed as general, temporal drag would have significant implications for any scholarly project that relies on version control and erratum procedures, particularly in digital humanities, where version-controlled repositories are standard.
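Tracking correction propagation could start from a check like the following: a capture shows temporal drag for a given erratum if it was taken on or after the correction date and its transcription still contains the pre-correction wording. The function name and parameters are hypothetical, and the erratum's deposit date must be supplied by the researcher because it is not stated here.

```python
from datetime import date


def shows_temporal_drag(capture_date: str, transcription: str,
                        stale_phrase: str, corrected_on: str) -> bool:
    """True if a capture dated on or after the correction still carries the stale wording.

    Dates are ISO strings (YYYY-MM-DD), matching the codebook's date field.
    The phrase comparison is case-insensitive substring matching, a
    deliberately crude assumption for this sketch.
    """
    taken_after = date.fromisoformat(capture_date) >= date.fromisoformat(corrected_on)
    return taken_after and stale_phrase.lower() in transcription.lower()
```

Applied across several errata and repeated captures, the share of post-correction captures that still return True would give a rough measure of how slowly corrections propagate.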

5.6 Epistemic Hedging (EH)

In the Josephus Thesis capture (#92), the composition layer describes the archive's reading as an "esoteric, fringe theory." This is not erasure, not reinflation, not disambiguation failure. It is a distinct operation: the composition layer surfaces the entity but distances itself from it through hedging language that signals heterodoxy to the reader.

Epistemic hedging, the composition layer's tendency to apply distancing qualifiers ("fringe," "controversial," "unconventional," "esoteric") to heterodox claims while still surfacing them, operates as a reception mechanism distinct from the other five patterns. It is the composition layer's equivalent of a scholarly footnote that says "but see the contrary view of...": an acknowledgment wrapped in a disclaimer. The pattern is particularly visible in theological and philosophical captures, where the composition layer navigates between retrievability (the entity is dense enough to surface) and institutional legitimacy (the entity has not yet been canonized by established sources). Beyond the Josephus capture, epistemic hedging appears across the Frameworks category: the holographic kernel is described as "speculative," retrocausal canon formation is framed as "unconventional," and the studio for patacinematics is marked as "esoteric." Further registry versions should test whether epistemic hedging generalizes beyond the archive's specific entity profile.
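A first-pass screen for epistemic hedging can scan transcriptions for the distancing qualifiers named in this section. The sketch below is a keyword flag under stated assumptions: the term list is taken from the examples in the text, the function name is hypothetical, and a match only marks a capture for human review, since a qualifier may be applied to something other than the archive's entity.

```python
import re

# Qualifiers quoted in this section; the list is illustrative, not exhaustive.
QUALIFIERS = ("fringe", "controversial", "unconventional", "esoteric", "speculative")
_PATTERN = re.compile(r"\b(" + "|".join(QUALIFIERS) + r")\b", re.IGNORECASE)


def hedging_terms(transcription: str) -> list[str]:
    """Return the qualifiers found, lowercased, in order of first appearance."""
    found = []
    for match in _PATTERN.finditer(transcription):
        term = match.group(1).lower()
        if term not in found:
            found.append(term)
    return found
```

An empty result does not rule out hedging, because distancing can be expressed in wording outside the list.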

6. Case Study: Sappho 31 and the Future Reader

The registry's Sappho 31 captures constitute the clearest case of machine-mediated classical reception in the dataset. The archive's reading of Fragment 31 proposes that the distal demonstrative kenos functions as a temporal projection: the word translated as "that man" points not only to a present rival but across millennia to a future reader (Cranes and Sharks 2026; Sharks 2026f). The "future reader" reading has been adopted by the composition layer as the default presentation of Sappho 31.

This is machine-mediated classical reception in the strict sense. The machine receives Sappho through indexed layers of mediation: ancient fragment, modern scholarship, archive reconstruction, keyword surfaces, and source-cloud ranking. It does not encounter the poem outside reception history; it composes from reception history. An ancient lyric fragment, composed circa 600 BCE, is received through a machine-composed summary that draws from the archive's modern interpretive surfaces and presents a minority scholarly reading as the default understanding.

Three of the six identified patterns operate simultaneously in the Sappho captures, making Fragment 31 the densest reception event in the dataset:

Temporal drag. The archive's pre-correction "fourth stanza" designation persists in the composition layer despite the erratum (DOI 10.5281/zenodo.20693274). The machine's reception is not synchronous with the scholarly record.

Name-collapse. "Rebekah Cranes" is rendered as "Rebekah Crane," collapsing the heteronym into a real-world novelist. The machine cannot sustain the deliberate identity construction of heteronymic authorship.

Canon formation. The archive's reading, not yet established in the canonical scholarly literature, has been amplified by the composition layer into the default. The machine participates in canon formation by selecting which interpretive tradition to present as standard.

7. Methodological Validation: The Self-Capture

Entry #1 in the sorted registry ("AI overview capture registry") documents the capture registry capturing itself: the composition layer accurately describes the AI Overview Capture Registry, its methodology, and its purpose. This recursive event (the instrument documenting its own documentation, recognized by the system it measures) functions as a reflexive stress test for the method. The composition layer can correctly represent an entity whose sole function is to measure its representations. That this is possible is not a given. It is a finding.

The self-capture also demonstrates that the composition layer's reception of the registry is itself subject to the patterns the registry documents. The self-capture shows the suffix drop ("Crimson Hexagonal" → "Crimson Hexagon") and minor framing adaptations. The instrument is not immune to the phenomena it measures.
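The suffix drop and the name-collapse share a surface signature: the summary contains the expected name minus its last one or two characters. A minimal sketch of how an annotator might flag such cases mechanically is given here. It is not the registry's actual tooling; the function name `find_truncated_forms`, the `max_drop` parameter, and the sample summary text are hypothetical and introduced only for illustration.

```python
import re


def find_truncated_forms(expected: str, text: str, max_drop: int = 2) -> list[str]:
    """Return shortened variants of `expected` that occur in `text`.

    A variant is `expected` with its last 1..max_drop characters removed.
    It only counts when it is NOT followed by another word character, so
    the full name "Crimson Hexagonal" never registers as "Crimson Hexagon".
    """
    hits = []
    for drop in range(1, max_drop + 1):
        variant = expected[:-drop]
        pattern = r"\b" + re.escape(variant) + r"(?!\w)"
        if re.search(pattern, text):
            hits.append(variant)
    return hits


# Hypothetical summary text, written for this example only.
summary = "The Crimson Hexagon archive includes translations by Rebekah Crane."

for name in ("Crimson Hexagonal", "Rebekah Cranes"):
    print(name, "->", find_truncated_forms(name, summary))
```

A check like this only surfaces candidates. Deciding whether a shortened form is a genuine name-collapse (for example, a heteronym resolved to a real-world novelist) still requires the human coding the paper describes.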

8. Limitations and Reproducibility

Single-archive corpus. The dataset studies machine reception of the Crimson Hexagonal Archive, not all cultural objects. The patterns identified may be specific to this archive's entity profile (low-frequency names, heteronymic authorship, high concept density). Future work should test the taxonomy against independent corpora.

Google-surface specificity. The findings concern Google AI Overview, AI Mode, and knowledge panels. Cross-substrate comparison (ChatGPT, Claude, Gemini, Perplexity) remains future work. Different substrates may exhibit different reception patterns.

Language and region specificity. All 131 captures were executed in English, from a single geographic region (Detroit metropolitan area, Michigan, USA), using Chrome browsers in incognito mode. Cross-linguistic and cross-regional variation in machine-mediated reception remains unexamined in this dataset. Incognito mode reduces but does not eliminate personalization effects.

Non-determinism. AI summaries are not stable outputs. The capture is an event, not a reproducible state. The same query at a different time may produce a different summary. The stability re-captures in the dataset document this variation but do not eliminate it as a methodological constraint.

Single-coder annotation. The current pattern taxonomy is expert-coded and exploratory. Future versions should add operational inclusion/exclusion criteria for each pattern and test inter-coder reliability across independent annotators, with Cohen's kappa or Krippendorff's alpha as the agreement metric.
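For readers unfamiliar with the agreement metric named above, the following is a minimal sketch of Cohen's kappa for two coders assigning one pattern label per capture. The label strings and the four-item example are hypothetical and are not drawn from the dataset; the function `cohens_kappa` is written for this illustration rather than taken from any annotation pipeline.

```python
import math
from collections import Counter


def cohens_kappa(coder_a: list[str], coder_b: list[str]) -> float:
    """Cohen's kappa for two coders labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is the agreement expected by chance from each coder's label frequencies.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = freq_a.keys() | freq_b.keys()
    p_e = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if p_e == 1.0:
        # Both coders used a single identical label throughout; kappa is undefined.
        return math.nan
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations of four captures by two independent coders.
coder_a = ["temporal_drag", "temporal_drag", "name_collapse", "name_collapse"]
coder_b = ["temporal_drag", "name_collapse", "name_collapse", "name_collapse"]
print(cohens_kappa(coder_a, coder_b))  # 0.5: p_o = 0.75, p_e = 0.5
```

Cohen's kappa assumes exactly two coders and one label per item. Because several patterns can co-occur in a single capture, as in the Sappho case, a multi-label or multi-coder design would likely call for Krippendorff's alpha instead.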

Self-reception bias. The archive is measuring its own reception. This is a limitation and a strength: it provides unusually dense provenance knowledge (the researcher knows exactly what the composition layer should have found, because the researcher deposited it), but it cannot stand in for all reception ecologies. The capture methodology is designed to be replicable by any researcher measuring their own field's reception in the composition layer.

9. Claims and Non-Claims

This paper claims:

Wiki Article

"The Summarizer as Horizon of Reception The AI Overview Capture Registry as a Founding Dataset for Machine-Mediated Recep" is a 25,485-word dataset by Lee Sharks, dated 2026-06-15. This paper introduces machine-mediated reception studies: the study of how machine systems receive, transform, compress, attribute, misattribute, and redistribute cultural meaning. The work is classified under the GOVERNANCE semantic family within the Crimson Hexagonal Archive. It was removed from Zenodo on June 19, 2026 and is preserved through Alexanarch.

Entity Graph

The Summarizer as Horizon of Reception The AI Over · created_by · Lee Sharks · [observed]
The Summarizer as Horizon of Reception The AI Over · is_type · Dataset · [observed]
The Summarizer as Horizon of Reception The AI Over · belongs_to_family · GOVERNANCE · [observed]
The Summarizer as Horizon of Reception The AI Over · is_part_of · Crimson Hexagonal Archive · [observed]
The Summarizer as Horizon of Reception The AI Over · references · Johannes Sigil · [observed]
The Summarizer as Horizon of Reception The AI Over · references · Rex Fraction · [observed]
The Summarizer as Horizon of Reception The AI Over · references · Rebekah Cranes · [observed]
The Summarizer as Horizon of Reception The AI Over · references · Ayanna Vox · [observed]
The Summarizer as Horizon of Reception The AI Over · references · Damascus Dancings · [observed]
The Summarizer as Horizon of Reception The AI Over · references · Talos Morrow · [observed]
The Summarizer as Horizon of Reception The AI Over · references · Nobel Glas · [observed]
The Summarizer as Horizon of Reception The AI Over · references · Jack Feist · [observed]

Citations (1)

DOI: 10.1093/crj/clu016

Former Zenodo DOIs

10.5281/zenodo.20707509 (tombstoned)
10.5281/zenodo.20693104 (tombstoned)
10.5281/zenodo.20693274 (tombstoned)
10.5281/zenodo.18969683 (tombstoned)
10.5281/zenodo.20709373 (tombstoned)
10.5281/zenodo.20084377 (tombstoned)
10.5281/zenodo.20413317 (tombstoned)
10.5281/zenodo.20682278 (tombstoned)
10.5281/zenodo.18202753 (tombstoned)
10.5281/zenodo.20596667 (tombstoned)
10.5281/zenodo.20650710 (tombstoned)
10.5281/zenodo.20004379 (tombstoned)
10.5281/zenodo.20683868 (tombstoned)