Ingestible Package Guide
Introduction
This guide will walk you through creating, ingesting, and publishing an article package in Ambra. For instructions for setting up the Ambra stack, please see the Getting Started Guide.
The JATS standard
JATS is a standardized markup for journal articles. When ingesting your article into Ambra, you will have to provide an XML version of your manuscript that complies with JATS.
Rhino supports JATS 1.1d2 and 1.1d3.
The JATS standard will tell you which tags to use for an abstract, author list, references, etc. Find the standard here.
You can find example articles for the 1.1d3 version of JATS here.
The Article Package
The article package is a zip file that contains all of the files that make up the article content. Each zip entry should be at the root level; the zip archive should not contain any subdirectories. There are a few required files, which are detailed below.
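As a minimal sketch of the flat layout described above, the following Python snippet zips every file in a working directory at the archive root and then checks that no entry sits in a subdirectory. The function names and the idea of a single staging directory are assumptions for illustration; they are not part of Ambra or Rhino.

```python
import zipfile
from pathlib import Path


def build_article_package(source_dir, zip_path):
    """Zip every regular file in source_dir at the archive root.

    Subdirectories are skipped, and the archive itself is skipped if it
    happens to live inside source_dir. Returns the entry names written.
    """
    source = Path(source_dir)
    target = Path(zip_path).resolve()
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.iterdir()):
            if path.is_file() and path.resolve() != target:
                # arcname=path.name drops the directory part of the path,
                # so the entry is stored at the root of the zip.
                archive.write(path, arcname=path.name)
                names.append(path.name)
    return names


def has_only_root_entries(zip_path):
    """Return True if no entry name contains a directory separator."""
    with zipfile.ZipFile(zip_path) as archive:
        return all("/" not in name for name in archive.namelist())
```

Running `has_only_root_entries` on a package built by another tool is a quick way to catch an accidental wrapper folder before ingestion.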
manifest.xml
The manifest is an XML file that tells Rhino what is in the article package. It must be named manifest.xml.
All files present in the article package zip must be represented in the manifest XML, and the names must match.
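One way to check this rule before uploading is to compare the zip's entry names with the entry attributes in the manifest. The sketch below uses only the Python standard library; the function names are hypothetical, and it assumes that entry attributes appear only on representation and file nodes, as in the manifest example in this guide.

```python
import xml.etree.ElementTree as ET
import zipfile


def manifest_entries(manifest_xml):
    """Collect the entry attribute of every representation and file node."""
    root = ET.fromstring(manifest_xml)
    return {
        node.get("entry")
        for node in root.iter()
        if node.tag in ("representation", "file") and node.get("entry")
    }


def compare_package_to_manifest(zip_path):
    """Return (zip files not in the manifest, manifest entries not in the zip)."""
    with zipfile.ZipFile(zip_path) as archive:
        zip_names = set(archive.namelist())
        listed = manifest_entries(archive.read("manifest.xml"))
    return sorted(zip_names - listed), sorted(listed - zip_names)
```

Both returned lists should be empty for a package that satisfies the rule above.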
Validate the layout of your manifest.xml against the manifest DTD, which is kept in Rhino at /src/main/resources/manifest.dtd. The DTD also contains an example manifest, as it would look for a PLOS article.
Including manifest.dtd in your article package is optional. If you choose to include it, it must be listed in the ancillary section of manifest.xml.
Manifest XML Example
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE manifest SYSTEM "manifest.dtd">
<manifest>
  <articleBundle>
    <article uri="info:doi/my-article-id">
      <representation entry="my-article-id_xml" key="my-article-id_xml" mimetype="application/xml" type="manuscript"/>
      <representation entry="my-article-id_pdf" key="my-article-id_pdf" mimetype="application/pdf" type="printable"/>
    </article>
    <object type="figure" uri="info:doi/my-article-id_g001">
      <representation entry="my-article-id_g001.tif" key="my-article-id_g001.tif" mimetype="image/tiff" type="original"/>
      <representation entry="my-article-id_g001_medium.png" key="my-article-id_g001_medium.png" mimetype="image/png" type="medium"/>
      <representation entry="my-article-id_g001_large.png" key="my-article-id_g001_large.png" mimetype="image/png" type="large"/>
      <representation entry="my-article-id_g001_inline.png" key="my-article-id_g001_inline.png" mimetype="image/png" type="inline"/>
      <representation entry="my-article-id_g001_small.png" key="my-article-id_g001_small.png" mimetype="image/png" type="small"/>
    </object>
    <object type="graphic" uri="info:doi/my-article-id_e001">
      <representation entry="my-article-id_e001.tif" key="my-article-id_e001.tif" mimetype="image/tiff" type="original"/>
      <representation entry="my-article-id_e001_inline.png" key="my-article-id_e001_inline.png" mimetype="image/png" type="inline"/>
    </object>
    <object type="supplementaryMaterial" uri="info:doi/my-article-id_s001">
      <representation entry="my-article-id_s001.docx" key="my-article-id_s001.docx" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" type="supplementary"/>
    </object>
    <object type="table" uri="info:doi/my-article-id_t001">
      <representation entry="my-article-id_t001.tif" key="my-article-id_t001.tif" mimetype="image/tiff" type="original"/>
      <representation entry="my-article-id_t001_medium.png" key="my-article-id_t001_medium.png" mimetype="image/png" type="medium"/>
      <representation entry="my-article-id_t001_large.png" key="my-article-id_t001_large.png" mimetype="image/png" type="large"/>
      <representation entry="my-article-id_t001_inline.png" key="my-article-id_t001_inline.png" mimetype="image/png" type="inline"/>
      <representation entry="my-article-id_t001_small.png" key="my-article-id_t001_small.png" mimetype="image/png" type="small"/>
    </object>
  </articleBundle>
  <ancillary>
    <file entry="manifest.xml" key="my-article-id_manifest.xml" mimetype="application/xml"/>
    <file entry="extra.xml" key="extra.xml" mimetype="application/xml"/>
  </ancillary>
</manifest>
Required tags
The example above, as well as the manifest DTD, define all required tags.
- The xml and DOCTYPE tags are required and can be copied verbatim from the example.
- manifest must be used as the top-level container tag.
- articleBundle contains everything used to display the article.
  - article defines the article URI (DOI).
    - Must contain a representation tag for the XML and printable versions of your article.
  - object is a general container tag for graphics and supplementary material. It also requires representation tags, but the requirements differ based on object type. Graphics will be covered in more detail later in the guide.
  - representation requirements:
    - entry - value is identical to the filename
    - key - a unique identifier for the asset which will be posted to the Content Repo
    - mimetype - MIME type of the asset
    - type - varies based on object type, covered in “Article Assets” below.
- ancillary contains any extra files, represented in file nodes. Both manifest.xml and manifest.dtd must be listed here (if present).
  - file nodes require entry, key, and mimetype attributes.
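The attribute requirements listed above can be checked mechanically. The following is a rough sketch using Python's standard library, not a substitute for validating against manifest.dtd; the function and constant names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Required attributes per node name, taken from the list above.
REQUIRED_ATTRIBUTES = {
    "representation": ("entry", "key", "mimetype", "type"),
    "file": ("entry", "key", "mimetype"),
}


def missing_attributes(manifest_xml):
    """Return (tag, entry, attribute) for each required attribute that is absent or empty."""
    root = ET.fromstring(manifest_xml)
    problems = []
    for node in root.iter():
        for attribute in REQUIRED_ATTRIBUTES.get(node.tag, ()):
            if not node.get(attribute):
                problems.append((node.tag, node.get("entry"), attribute))
    return problems
```

An empty list means every representation and file node carries the attributes named above; it says nothing about whether the values themselves are correct.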
Manuscript.xml
The manuscript XML contains the text for your article.
There are standard tags to use for the abstract, author list, citations, etc. Please consult the JATS standard or refer to the example article packages here.
Each object asset from the manifest should be referenced by DOI somewhere in the manuscript.
eISSN
The eISSN defined in your manuscript XML must match a journal eISSN defined in your Ambra database.
Example:
<issn pub-type="epub">1932-6203</issn>
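To catch an eISSN mismatch before ingestion, the value can be read out of the manuscript and compared with the journals you have configured. In this sketch, `journal_eissns` is a hypothetical set standing in for the eISSNs stored in your Ambra database; how you obtain it is left open. It also assumes the manuscript uses no XML namespace on the issn element.

```python
import xml.etree.ElementTree as ET


def extract_eissn(manuscript_xml):
    """Return the text of the first <issn pub-type="epub"> element, or None."""
    root = ET.fromstring(manuscript_xml)
    for issn in root.iter("issn"):
        if issn.get("pub-type") == "epub":
            return (issn.text or "").strip()
    return None


def eissn_is_known(manuscript_xml, journal_eissns):
    """Return True if the manuscript's eISSN is one of the configured journal eISSNs."""
    return extract_eissn(manuscript_xml) in journal_eissns
```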
JATS deviations
There are two known issues when rendering a JATS article in Wombat:
- The <!DOCTYPE tag should not be included.
  - If included, the article will not render, and Wombat will throw a DTD-not-found exception.
- The <copyright-statement> tag is not rendered. Use the <license-p> tag instead.
PLOS is actively working to resolve these issues.
Printable
The printable representation is a print-ready version of your article, generally a PDF.
Article Assets (figures, tables, and supplementary material)
An article can have any number of article assets included, as long as they are defined in the manuscript AND the manifest.
Each included article asset requires at least one resized copy. The copy, or copies, will be one of the following types:
Article Asset Types
- figure - A general image. Requires original, large, medium, small, and inline representations.
- table - Used to show tabular data in a graphical format. Requires original, large, medium, small, and inline representations.
- graphic - Used for images shown in-line with text, such as mathematical formulae, icons, and logos. Requires original and inline representations.
- supplementaryMaterial - Used for supplementary material such as videos or other media. Requires only the supplementary type.
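The per-type requirements above translate directly into a lookup table. The following sketch reports, for each object in a manifest, which required representation types are absent. The names `REQUIRED_TYPES` and `missing_representations` are illustrative assumptions, not part of Rhino.

```python
import xml.etree.ElementTree as ET

ALL_SIZES = {"original", "large", "medium", "small", "inline"}

# Required representation types per object type, as listed above.
REQUIRED_TYPES = {
    "figure": ALL_SIZES,
    "table": ALL_SIZES,
    "graphic": {"original", "inline"},
    "supplementaryMaterial": {"supplementary"},
}


def missing_representations(manifest_xml):
    """Map each object URI to the sorted list of representation types it lacks."""
    root = ET.fromstring(manifest_xml)
    problems = {}
    for obj in root.iter("object"):
        required = REQUIRED_TYPES.get(obj.get("type"), set())
        present = {rep.get("type") for rep in obj.findall("representation")}
        missing = required - present
        if missing:
            problems[obj.get("uri")] = sorted(missing)
    return problems
```

Objects with an unrecognized type are ignored here; a stricter check could report them instead.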
Figure and Table Size Types
- original - The original image at its maximum resolution.
- large - Used in the stand-alone image view and the image viewer. Should be able to fit on a standard computer screen.
- medium - Used on the homepage. Should be less than half the size of the large version.
- inline - Used within the article body. Should be slightly smaller than the medium version.
- small - Used in issues and the current issue on the homepage. Should be less than half the size of the medium version.
The sizes for these images are not strict and tweaking may be necessary.
These requirements are also defined in the example XML above.
Ingesting an article into Rhino
Creating a Content Repo bucket
Rhino must be linked to a Content Repo bucket to store article files. The bucket’s name is configured in the rhino.yaml file.
- Visit the Content Repo root page where you’ll see a browser interface.
- Click on “Create a bucket” within the buckets section.
- You’ll see a bucket creation form. Enter the configured name (e.g., corpus).
- Click the “Try it out!” button.
Uploading an article
- Visit the Rhino root page where you’ll see a Swagger interface.
- Click on ingestible-zip-controller.
- Click on zipUpload or anywhere on the green bar.
- Click the “Browse…” button and upload your article package zipfile.
- Click the “Try it out!” button.
This will ingest the article into Rhino and save the data to the database and Content Repo.
Ambra is designed with versioning in mind. This means that when you ingest an article, it is staged in the ArticleIngestion table but not yet published. To view the article, you need to add a new revision indicating that the ingested version is the one you want to publish.
Adding an article revision
- Visit the Rhino root page where you’ll see a Swagger interface.
- Click on article-crud-controller.
- Click on writeRevision.
- Enter the DOI. Any slash characters (/) in the DOI must be escaped as ++ (for example, 10.0000/my-article becomes 10.0000++my-article).
- Enter the ingestion number.
- Click the “Try it out!” button.
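The DOI escaping rule in the steps above is a plain string substitution. A tiny helper such as the following (the name `escape_doi` is hypothetical) avoids typing the escaped form by hand:

```python
def escape_doi(doi):
    """Replace every slash in a DOI with '++', the form expected in the DOI field."""
    return doi.replace("/", "++")


print(escape_doi("10.0000/my-article"))  # prints 10.0000++my-article
```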
By default, each new revision you create will receive incrementing revision numbers, starting at 1. Input an old revision number to overwrite it instead. If you want to change an article without displaying a revision history, just overwrite revision number 1.
Viewing the article
Navigate to the article in Wombat: http://localhost:<$WOMBAT_PORT>/wombat/<$SITE_NAME>/article?id=<$DOI>
For example: http://localhost:8123/wombat/Desktop/article?id=my-article-id
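The URL pattern above can be assembled programmatically, for example when checking several articles after ingestion. This sketch simply fills in the template as written; the function name and the `host` parameter are assumptions, and no extra encoding is applied to the DOI.

```python
def wombat_article_url(port, site_name, doi, host="localhost"):
    """Build the Wombat article URL from the pattern shown above."""
    return f"http://{host}:{port}/wombat/{site_name}/article?id={doi}"


print(wombat_article_url(8123, "Desktop", "my-article-id"))
# prints http://localhost:8123/wombat/Desktop/article?id=my-article-id
```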