Uncategorized – Page 2

Why not just use ePub?

Academic literature is currently published and consumed in the PDF format so we are taking the approach that we should first augment the most widely used document medium. There is no reason why Visual-Meta cannot be implemented for ePub in the future however.

Steering Group

Frode Hegland, Jacob Hazelgrove, Vint Cerf, Ismail Serageldin, David De Roure, Pip Willcox, Mark Anderson, Jakob Voß, Christopher Gutteridge, Adam Wern, Peter Wasilko, Rafael Nepô, Adam Laidlaw, Günter Khyo, Gyuri Lajos & Stephan Kreutzer.

University of Southampton: Dame Wendy Hall, Les Carr and David Millard.

Semantic Web / RDFS

Semantic Web

Core: BibTeX {With Description of Fields}

Formatting of how to cite this document’s BibTeX:
producing software = Augmented Text Author
producing software url = www.augmentedtext.info
author = last name, followed by first name, followed by middle name; delimited by , and delimited by . between author names
title = plain text
date = time in year, month, date and time in 24 format, timezone delineated with ,
location = city name followed by country name
keywords = plain text separated by ,
responseto = citation information in this or BibTeX and/or URL to document this document is a response to

@book{
author = {Millard, David E.},
title = A reader on Reading},
date = 2010, June 2, 4:25, GMT},
}

Implementation Notes

The guiding principle is to preserve as much useful metadata in a human-readable form in an appendix at the back of a document in order for reader software to parse it to allow for advanced interactions, including citing with just copy and paste. The includes citation information as well as preserving structural data such as what text is a heading and so forth. If a competent programmer can read this and figure it out, we’ve succeeded. If not, there is a problem so please feel free to get in touch should you have any problems so we can make an effort for the next implementation information to be better. Therefore please feel free to email frode@hegland.com and we can discuss it via email or Zoom (or equivalent).

It is important to note that Visual-Meta is only an approach and as such, not all implementation of authoring and reading software will support all BibTeX content. The minimum is the introduction ‘boilerplate’ text, heading and the BibTeX of the document to allow the document to be cited. Beyond this, any Extended Visual-Meta is optional. This allows for flexible choice of what BibTeX is relevant for producers of documents.

Basic Parsing

It is worth noting that parsing software should parse the PDF from the back, looking for @{visual-meta-end} in order to determine if the document has Visual-Meta and if so, then looking for @{visual-meta-start} and using everything in-between.

Please also note that you can use ¶ to indicate a line break for parsing, using more of the horizontal space.

Copy As Citation

The Visual-Meta as an Appendix is slightly different from Visual-Meta as a copy (clipboard) payload since the act of copying text from a document with Visual-Meta should not only include the selected text but also the page number it was found on, to enable linking to that page.

Interactions to Support

The most basic interaction is for citing, since being able to refer to a document is core to document relationships. Visual-Meta was designed for is to allow a user to copy text from a PDF and paste it as a full citation since the clipboard payload can have the copied text plus the full BibTeX payload from the Visual-Meta.

Basic view controls such as being able to fold the document into a Table of Contents.

Alternative views, such as concept maps of the text and graph views.

Paratexts, including letting the reader software understand where the References, Glossary and Endnotes are and how they are formatted

More custom/niche interactions are server based, where documents can surface their contents, from Dublin Core metadata to what values tables and charts contain, for instant access externally to internal data.

Code

Code for parsing Visual-Meta will be made available when ready, summer 2023. Initially for Apple Platforms: macOS and iOS.

@{visual-meta-start} & @{visual-meta-end}

‘ @{visual-meta-start}’ and ‘@{visual-meta-end}’is not valid BibTeX, they are the external wrappers for Visual-Meta.

Implementations are of course open but parse documents from the end so that the end page is what marks the inclusion of Visual-Meta. They could have looked like anything but Visual-Meta started as simply a BibTeX embed so this is early legacy.

Extending Visual-Meta : Wrapper around Sections

Extending Visual-Meta is in principle as simple as stating with the the ‘@{visual-meta-start}’ and ‘ @{visual-meta-end}’ markers, though preferably after the BibTeX basic information, what the wrappers contain, using the this format ‘@{Dublin-core}’, ‘@{augmented-text-author-mind-map}’ where the syntax is to specify the contents but also its origin (as in the case of the Author Mind Map) unless generally known (such as Dublin Core).

Extensions PDF External

Visual-Meta can potentially also support metadata to bind the document(s).

Custom Elements

Custom element sections must be enclosed with a start of a line of ‘@name of section{‘ followed by a line of ‘}’, as shown below, where the line breaks are important for parsing:

@name of section{
specific contents
}

For example, to add metadata in Dublin Core format you need to specify ‘@dublin-core{‘ at the start of that section.

Issues with Font Rendering

The Visual-Meta should be presented in monospaced font to avoid ligatures which interfere with parsing. Please also beware of how special characters are handled: bibtex.org/SpecialSymbols

Connecting Documents

Reader software can use the Visual-Meta to ‘know’ what documents are in the user’s system (hard drive/cloud etc.) and therefore provide affordances for clicking on a cited document which they have and opening the document, straight to the cited page, without having to go through a web portal.

The vm_id in the header of the document is used to help reader software easily know what document is what document.

Experimental Variables

Experimental: If the value you are adding is not fully accurate, place a ? after that field.
For example: year = {2020},?

If you are not sure about the spelling, you can append ‘sp?’.
For example author = {Frode Alexander Hegland},sp?

Who We Are + Join Us

Notes from the core design team of Frode Hegland with Stephan Kreutzer, Adam Wern, Peter Wasilko, Christopher Gutteridge, David Millard, Mark Anderson and Günter Khyo. Please feel free to join the discussion on future-of-text.circle.so. Our dialog continues there as well as on the open weekly calls we call Open Office Hours.

Benefits

The benefit of the Visual-Meta approach is that the PDF viewer software can interact with the document in richer ways, while not loosing the robustness of being a normal PDF document. For example:

The reader software ‘knows’ the citation information of this document so that a reader can cite with a simply copy and paste
The view/presentation of the text since the reader software is aware of the document’s structure and since this metadata is visible at the same level as the content of the document, it will not be stripped out as document formats change and it will not interfere with viewers which are not Visual-Meta aware.

There are further Immediate User Benefits, different User Community Benefits and Visual VS. Embedded Benefits.

Visual-Meta Unleashes Hypertextuality and advanced interactions such as Augmented Copying (copies with full citation information), References and Glossaries, as well as included information for how to parse tables, images and special interactions for graphs. This enables dynamic re-creations of interactions with sophisticated visualisations, which no longer needs to be flattened when committed to PDF.

The Future of Connected Documents

Citations are the literal backbone of academic discourse, it is the means through which explicit links are threaded.

In the move to digital environments, the combination of the old and the new has not produced the most convenient, accurate or robust systems.

Citations have received some benefits of digital interactions, including the ability to use web-links associated with the cited material, which point to download sites and use handles, such as DOIs. Links are not robust, they are brittle connections, which break when a server goes down or a domain is no longer paid for, so this does little to create a long term, robust connection environment.

In the rush to embrace digital technologies there has also been a strong force to maintain legacy systems of work, which is why PDF documents act as the medium in which to freeze academic documents, which is a vital part of the academic process, but freezes them without adding necessary meta-data. This creates digital objects which are devoid of many benefits of being digital, such as meta-data to allow the document to be able to present itself for what it is, which can foster richly interactive ecosystems.

In the book The Future of Text I have addressed this problem: The book contains an appendix called Visual-Meta which adds human-readable metadata in the well established BiBTeX format. For example, the title of the book, which is rarely the same as the name of the document, looks like this: title = {The Future Of Text},. When someone copies text from the book, this Visual-Meta is appended to the clipboard so that when it is pasted, the copied text is pasted with full metadata for citing the book. The result is a clean and robust citation in the word processor, which can be styled as desired on export and automatically added to the Reference section. This has been implemented in my Reader PDF viewer and Author word processor, both on the macOS platform.

Visual-Meta also adds formatting information so that the reading software knows where headings are and who wrote specific sections, which is important for works such as The Future of Text, which features 180 contributors. Visual-Meta can also address high-resolution citing and computational text, as well as surfacing the values inside embedded images and tables, and more.

This is how I made citing instant (simply copy and paste), accurate (only one copy operation, based on the published metadata for the document is used) and robust since the PDF can even be printed, scanned and if OCR’d, no metadata is lost. The data cost is a page of small type and the admin cost is minimal, since this is an open standard with self-described fields, which software can easily be updated to create on authoring and parse on reading.

The system is ready to step out of The Future of Text and into everyday academic use, where it can provide time saving, robustness and enable whole new levels of advanced interactions for academic authors and readers.