A copy of these instructions is included in the download RefBank.zip in the file README.txt.

SYSTEM REQUIREMENTS

  • Java Runtime Environment 1.5 or higher, Sun/Oracle JRE recommended
  • Apache Tomcat 5.5 or higher (other servlet containers should work as well, but have not been tested yet)
    If you are running Tomcat with a Server JRE 1.7 or higher, you have to enable Java 1.6 compatibility mode, as otherwise some required classes are excluded from the class path.
    This works as follows:
    • Linux/Unix: in /etc/init.d/tomcat , add the parameter -Djava.specification.version=1.6 to the JAVA_OPTS="..."; line
    • Windows: include the parameter -Djava.specification.version=1.6 wherever you set other parameters like the maximum memory as well
  • A database server, e.g. PostgreSQL (drivers included for version 8.2) or Microsoft SQL Server (drivers included)
    Alternatively, you can use the Apache Derby embedded database (included)
    (using Apache Derby is the default configuration, so you can test RefBank without setting up a database)

SETTING UP A RefBank NODE

  • Download RefBank.zip into Tomcat's webapps folder (an exploded archive directory, zipped up for your convenience; WAR deployment is impractical, as updates would overwrite the configurations you make)
    Alternatively, you can check out the project from Git, build the ZIP file using Ant, and then deploy RefBank.zip to your Tomcat
  • Create a RefBank sub folder in Tomcat's webapps folder.
  • Un-zip the exploded archive directory into the RefBank folder.
    If you have WebAppUpdater (builds with idaho-core) installed, you can also simply type bash update RefBank in the console.
  • Put the RefBank.zip archive you downloaded into the webapps/RefBank/ folder for others to download.
  • Now, it's time for some configuration:
    • To enable the RefBank node to store parsed references in the file system and connect to other RefBank nodes, give the web application the permission to create and manipulate files and folders within its deployment folder and to establish outgoing network connections (there are two ways to achieve this):
      • The simple, but very coarse way is to disable Tomcat's security manager altogether (not recommended)
      • The more fine-grained way is to add the permission statement below to Tomcat's security configuration (recommended); the security configuration resides inside Tomcat's conf folder, which is located on the same level as the webapps folder; the actual configuration file to add the permission to is either catalina.policy directly in the conf folder, or 04webapps.policy in the conf/policy.d folder, whichever is present; if both files are present, either will do:
        grant codeBase "file:${catalina.base}/webapps/RefBank/WEB-INF/lib/-" {
        	permission java.net.SocketPermission "*.*", "connect";
        	permission java.io.FilePermission "WEB-INF/-", "read,write,delete,execute";
        };
    • Adjust the config.cnfg files in the WEB-INF/<xyz>Data folders:
      • In the config file in WEB-INF/rbkData/ , enter a (presumably) globally unique RefBank domain name, which should identify the institution the RefBank node runs in and, if the institution runs multiple RefBank nodes, also distinguish the node being set up from the other ones already running (if you set up the node for testing or experimentation purposes, please choose a domain name ending in "-Test" , ".test" , "-Dev" , "-Development" , ".dev" , or something similar)
      • In the same file, enter the preferred access URL for the node, i.e., the (preferred) URL for accessing the node from the WWW
      • In the same file, enter the administration passcode for the RefBank node being set up
      • To secure form based reference upload with ReCAPTCHA to avoid spamming, obtain a ReCAPTCHA API key pair and put it in the config file in the WEB-INF/uploadData/ folder
      • To secure script based reference upload with an access key to avoid spamming, specify an access key in the config file in the WEB-INF/uploadData/ folder
    • If not using an embedded database, create a database for RefBank in your database server, e.g. RefBankDB
    • Adjust the web.cnfg file in the WEB-INF folder:
      • Adjust the JDBC settings to access the database created for RefBank
        (by default configured to use Apache Derby in embedded mode)
      • Set the stringPoolNodeName setting to the name assigned to the RefBank servlet in the web.xml (which is RefBank if you did not change it), so dependent local servlets can connect to it directly via Java method invocations instead of through the local network loopback adapter, which improves performance
        (if you do not change the web.xml file, you need not change this setting, either)
      • Set the stringPoolNodeUrl setting to the access URL you configured above, or to a localhost URL; in any case, the URL used should point to the RefBank servlet directly for better performance, even if the preferred external access URL is one proxied through a local Apache web server or the like
        (the default setting assumes Tomcat running on port 8080, you need to change this only if your Tomcat is running on a different port)
  • To make your RefBank node credit your institution, do the following:
    • Put your own institution logo in the images folder
    • Customize the files footer.html and popupFooter.html in the WEB-INF folder to include your institution name and logo by replacing
      yourLogo.gif with the name of your logo image file,
      yourUrl.org with the link to your institution,
      Your Institution Name with the name of your institution,
      YourInstitutionAcronym with the acronym of your institution
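
The placeholder replacement in footer.html and popupFooter.html can also be scripted. The following is a minimal Python sketch, not part of RefBank; the function name, the sample footer markup, and all replacement values are made up for illustration, only the four placeholder strings come from the instructions above:

```python
# The four placeholders named in the instructions above, in a fixed order.
PLACEHOLDERS = (
    "yourLogo.gif",
    "yourUrl.org",
    "Your Institution Name",
    "YourInstitutionAcronym",
)


def customize_footer(html, logo_file, url, name, acronym):
    """Return the footer HTML with the four placeholders replaced."""
    for placeholder, value in zip(PLACEHOLDERS, (logo_file, url, name, acronym)):
        html = html.replace(placeholder, value)
    return html


# Hypothetical footer content; the real footer.html may look different.
sample_footer = (
    '<a href="http://yourUrl.org"><img src="images/yourLogo.gif" '
    'alt="YourInstitutionAcronym"/> Your Institution Name</a>'
)
customized = customize_footer(
    sample_footer, "exampleLogo.png", "example.org", "Example Institute", "EI"
)
```

To apply this to the real files, read each file from the WEB-INF folder, pass its content through customize_footer, and write the result back.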

LINKING THE RefBank NODE TO THE NETWORK

  • Access the web application through a browser (the search form should show up)
  • Follow the Administer This Node link at the bottom of the page
  • Enter the passcode configured above to access the administration page
  • Enter the access URL of another RefBank node (maybe the one the zip file was downloaded from, simply by replacing the RefBank.zip file name with rbk , resulting in http://<refBankHostDownloadedFromIncludingPort>/RefBank/rbk , for instance) into the Connect to other Nodes form and click the Add Node button
    ==> A list of other nodes shows up, labeled Connected Nodes
  • Configure replication of data in the Connected Nodes table and click the Update Nodes button to submit it
    ==> afterwards, the web application might be busy for a while importing the references from the other nodes via the replication mechanism
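
To check from a script that a node is reachable and which nodes it knows, the ping, name, and nodes actions of the node API (described further down) can be used. The following is a minimal Python sketch; the helper names and the node URL http://localhost:8080/RefBank/rbk are assumptions for illustration, and the sample response only mimics the documented XML shape:

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_nodes(xml_text):
    """Return (name, accessUrl) pairs from a <nodes> response."""
    root = ET.fromstring(xml_text)
    return [(node.get("name"), node.get("accessUrl")) for node in root.iter("node")]


def fetch_nodes(base_url, action="nodes"):
    """Call action=nodes (or ping, name) on a node and parse the response."""
    with urllib.request.urlopen(f"{base_url}?action={action}", timeout=10) as resp:
        return parse_nodes(resp.read().decode("utf-8"))


# Sample response in the shape documented for action=nodes (values made up).
sample = (
    '<nodes><node name="Example-Test" '
    'accessUrl="http://localhost:8080/RefBank/rbk" /></nodes>'
)
known_nodes = parse_nodes(sample)
```

Against a running node, fetch_nodes("http://localhost:8080/RefBank/rbk", "ping") should return an empty list, since the ping response is an empty nodes element.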

CUSTOMIZING THE LAYOUT

The servlets generate the search and upload forms as well as the search results and reference detail views dynamically from multiple files residing in the WEB-INF folder or one of its sub folders:

  • refBank.html is the template for the main pages
  • The bodies of header.html , navigation.html , and footer.html are inserted in the template where the <includeFile file="filename.html"> tags are in the template
  • The CSS styles for all these files are in refBank.css , refBank.2.css , and refBank.3.css , each representing a different layout variant
    (the refBank.3.css layout is active in the default configuration)
  • refBankPopup.html is the template for the reference detail views
  • The bodies of popupHeader.html , popupNavigation.html , and popupFooter.html are inserted in the template where the <includeFile file="filename.html"> tags are in the template
  • The CSS styles for all these files are also in refBank.css , refBank.2.css , and refBank.3.css
  • The search or upload form is inserted where the <includeForm/> tag is in the template
  • The search or upload result is inserted where the <includeResult/> tag is in the template
    • The search form content comes from WEB-INF/searchData/searchFields.html ; the actual form tag is created by the servlet
    • The CSS styles for the search form and results are in WEB-INF/searchData/refBankSearch.css , WEB-INF/searchData/refBankSearch.2.css , and WEB-INF/searchData/refBankSearch.3.css , corresponding to the respective variants of refBank.css
    • The upload form comes from WEB-INF/uploadData/uploadFields.html ; the actual form tag is created by the servlet
    • The reCAPTCHA widget is inserted where the <includeReCAPTCHA/> tag is in uploadFields.html
    • The CSS styles for the upload form and results are in WEB-INF/uploadData/refBankUpload.css , WEB-INF/uploadData/refBankUpload.2.css , and WEB-INF/uploadData/refBankUpload.3.css , corresponding to the respective variants of refBank.css
  • onnNodeAdminPage.html is the template for the administration page
  • The respective CSS styles are in onnNodeAdminPage.css , onnNodeAdminPage.2.css , and onnNodeAdminPage.3.css , corresponding to the respective variants of refBank.css
  • To customize general page layout, change the refBank.html and refBankPopup.html files and the respective stylesheets
    • This can be as simple as customizing the respective CSS styles (can be tested on statically saved post-generation HTML pages)
    • This can include changing the file names or the placement of the <includeFile .../> tags; when changing the file names or adding <includeFile .../> tags, make sure that the referenced files exist (requires the web application to run for testing)
    • This can include changing the placement of the <includeForm/> and <includeResult/> tags; make sure, however, that these tags remain in the template page, as otherwise the functional parts of the pages cannot be inserted (requires the web application to run for testing)
  • To customize page header, navigation, or footer, customize the respective HTML files and the respective stylesheets
    • This can be as simple as customizing the respective CSS styles (can be tested on statically saved post-generation HTML pages)
    • This can include adding new <includeFile .../> tags; when doing this, make sure that the referenced files exist (requires the web application to run for testing)
  • Do not add header, navigation, or footer content to the refBank.html or refBankPopup.html files directly, but use the respective inserted files instead (requires the web application to run for testing)
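
The following Python sketch illustrates the template mechanism described above. It is not the servlets' actual implementation (they are written in Java); it only mimics the documented behavior: the body of each included file replaces its includeFile tag, and the form and result replace the includeForm and includeResult tags. All function names and sample file contents are made up for illustration:

```python
import re

# Matches <includeFile file="..."> with or without a self-closing slash.
INCLUDE_FILE = re.compile(r'<includeFile\s+file="([^"]+)"\s*/?>')
BODY = re.compile(r"<body[^>]*>(.*?)</body>", re.DOTALL | re.IGNORECASE)


def body_of(html):
    """Return the content of the <body> element, or the whole text if none."""
    match = BODY.search(html)
    return match.group(1).strip() if match else html.strip()


def render(template, files, form_html="", result_html=""):
    """Fill a template; `files` maps included file names to their HTML."""
    # A missing file raises KeyError, which is why referenced files must exist.
    page = INCLUDE_FILE.sub(lambda m: body_of(files[m.group(1)]), template)
    page = page.replace("<includeForm/>", form_html)
    return page.replace("<includeResult/>", result_html)


template = (
    '<html><body><includeFile file="header.html">'
    "<includeForm/><includeResult/></body></html>"
)
files = {"header.html": "<html><body><h1>RefBank</h1></body></html>"}
page = render(template, files, form_html="<form></form>", result_html="<p>0 results</p>")
```

In this sketch, removing the includeForm or includeResult tag from the template silently drops the form or result from the page, which is the failure mode the notes above warn about.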

THE RefBank NODE API

RefBank data servlet ( /RefBank/rbk ):

  • GET (response content depends on action parameter):
    • action=admin (also as /refBank/rbk/admin ): retrieve login form for the RefBank node administration HTML page (used in browser, not part of API)
    • action=nodes (also as /refBank/rbk/nodes ): retrieve list of other RefBank nodes known to this one
      • additional parameters: none
      • response (MIME type text/xml , encoding UTF-8 ):
        <nodes>
          <node name="name of RefBank node" accessUrl="preferred access URL of node" />
          <node ... />
        </nodes>
    • action=ping (also as /refBank/rbk/ping ): ping node
      • additional parameters: none
      • response (MIME type text/xml, encoding UTF-8):
        <nodes />
    • action=name (also as /refBank/rbk/name ): retrieve data of this RefBank node
      • additional parameters: none
      • response (MIME type text/xml , encoding UTF-8 ):
        <nodes>
          <node name="name of RefBank node" accessUrl="preferred access URL of node" />
        </nodes>
    • action=feed : retrieve the reference update feed, ordered by increasing update time
      • additional parameters:
        • updatedSince : minimum update time for references to include in the feed, formatted as UTC HTTP timestamp
      • response: compact feed of references updated since the specified timestamp (MIME type text/xml , encoding UTF-8 )
        <refSet>
          <ref id="reference ID" canonicalId="ID of canonical reference to set" deleted="deleted flag, true or false" createTime="UTC timestamp reference was first added to RefBank" updateTime="UTC timestamp reference was last updated" localUpdateTime="UTC timestamp reference was last updated on this RefBank node" parseChecksum="MD5 hash of parsed version, if available" />
          <ref ... />
        </refSet>
    • action=rss : retrieve an RSS feed announcing recently added references, ordered by decreasing upload time
      • additional parameters:
        • top : number of references to include in the feed (defaults to 100 if not specified)
      • response: an RSS feed announcing the latest additions (MIME type application/rss+xml , encoding UTF-8 )
    • action=count : retrieve the number of references stored in the node
      • additional parameters:
        • since : the UTC timestamp since which to count the references (optional, defaults to 0)
        • format : the format to represent the response (optional, defaults to the native XML representation if omitted)
      • response: the number of references stored in the node (MIME type text/xml , encoding UTF-8 )
        <refSet count="number of references" since="argument since" />
    • action=get : resolve RefBank internal identifiers
      • additional parameters:
        • id : the identifier(s) to resolve, can be multi-valued
        • format : the format to represent the parsed versions of references in (optional, defaults to the native MODS XML representation if omitted)
      • response: the reference(s) with the specified identifier(s) (MIME type text/xml , encoding UTF-8 )
        <refSet>
          <ref id="reference ID" canonicalId="ID of canonical reference to set" deleted="deleted flag, true or false" createTime="UTC timestamp reference was first added to RefBank" createUser="name of the user to first add reference to RefBank" createDomain="name of RefBank node reference was first added to" updateTime="UTC timestamp reference was last updated" updateUser="name of the user to last update reference" updateDomain="name of RefBank node reference was last updated at">
            <refString><plain reference string></refString>
            <refParsed><parsed version of reference (if available), as MODS XML or in format specified by format parameter></refParsed>
          </ref>
          <ref>...</ref>
        </refSet>
    • action=find : search references
      • additional parameters:
        • query : full text query against reference strings, can be multi-valued
        • combine : or or and , controls whether multiple full text queries are combined disjunctively ( or ) or conjunctively ( and , the default)
        • type : type of reference, only finds references with parsed version available
        • user : contributing user
        • author : query against author attribute of references, only finds references with parsed version available
        • title : query against title attribute of references, only finds references with parsed version available
        • date : query against year of publication attribute of references, only finds references with parsed version available
        • origin : query against origin of references (journal + volume number, publisher or location, as well as volume title), only finds references with parsed version available
        • format=concise : exclude parsed version of references from response
        • format : the name of the format for representing the parsed version of the references (defaults to the native MODS XML if not specified)
        • limit : the maximum number of references to include in the search result (0, the default, means no limit)
        • sco : set to sco to restrict search results to references not marked as duplicates of others
      • response: the references matching the specified search criteria (MIME type text/xml , encoding UTF-8 )
        <refSet>
          <ref id="reference ID" canonicalId="ID of canonical reference to set" deleted="deleted flag, true or false" createTime="UTC timestamp reference was first added to RefBank" createUser="name of the user to first add reference to RefBank" createDomain="name of RefBank node reference was first added to" updateTime="UTC timestamp reference was last updated" updateUser="name of the user to last update reference" updateDomain="name of RefBank node reference was last updated at" parseChecksum="MD5 hash of parsed version, if available and format set to concise">
            <refString><plain reference string></refString>
            <refParsed><parsed version of reference (if available), as MODS XML or in format specified by format parameter></refParsed>
          </ref>
          <ref ...>...</ref>
        </refSet>
    • action=apiStats : retrieve statistics on the usage of the node, in particular for the data handling actions
      • additional parameters:
        • format : the name of the XSLT stylesheet to use for transforming the result (defaults to the native XML if not specified)
      • response: the API call statistics (MIME type text/xml , encoding UTF-8 )
        <apiStats total="total number of API calls" feed="number of calls to feed action" rss="number of calls to RSS feed action" find="number of calls to find action" get="number of calls to get action" update="number of calls to update action" count="number of calls to count action" stats="number of calls to API statistics"/>
  • POST : requests from RefBank node administration HTML page, infrastructure replication, or meta data updates for existing references:
    • /RefBank/rbk/update : for deleting or un-deleting existing references or updating canonical reference ID
      • request headers to set:
        • user : the user to credit for the update
      • request body:
        <refSet>
          <ref id="reference ID" canonicalId="ID of canonical reference to set" deleted="deleted flag to set, true or false"/>
        </refSet>
      • response (MIME type text/xml , encoding UTF-8 ):
        <refSet>
          <ref id="reference ID" canonicalId="ID of canonical reference" deleted="deleted flag, true or false" createTime="UTC timestamp reference was first added to RefBank" createUser="name of the user to first add reference to RefBank" createDomain="name of RefBank node reference was first added to" updateTime="UTC timestamp reference was last updated" updateUser="name of the user to last update reference" updateDomain="name of RefBank node reference was last updated at" parseChecksum="MD5 hash of parsed version, if available">
            <refString><plain reference string></refString>
          </ref>
        </refSet>
    • /RefBank/rbk/admin : process input from the RefBank node administration HTML page (used in browser, not part of API)
    • /refBank/rbk/nodes : retrieve list of other RefBank nodes known to this one
      • request body: none
      • response (MIME type text/xml , encoding UTF-8 ):
        <nodes>
          <node name="name of RefBank node" accessUrl="preferred access URL of node" />
          <node ... />
        </nodes>
    • /refBank/rbk/ping : ping node
      • request body: none
      • response (MIME type text/xml , encoding UTF-8 ):
        <nodes />
    • /refBank/rbk/name : retrieve data of this RefBank node
      • request body: none
      • response (MIME type text/xml , encoding UTF-8 ):
        <nodes>
          <node name="name of RefBank node" accessUrl="preferred access URL of node" />
        </nodes>
    • /refBank/rbk/introduce : introduce a new RefBank node to this one, retrieve list of other known RefBank nodes
      • request body (parameters):
        • name : the name of the RefBank node introducing itself
        • accessUrl : the preferred access URL of the RefBank node introducing itself
      • response (MIME type text/xml , encoding UTF-8 ):
        <nodes>
          <node name="name of RefBank node" accessUrl="preferred access URL of node" />
          <node ... />
        </nodes>
  • PUT : upload new or update existing references:
    • request headers to set:
      • Data-Format : the upload data format, xml or txt (auto-detection is attempted if not specified)
      • User-Name : the user to credit for uploaded references (defaults to 'Anonymous' if not specified)
    • request body: the references as plain text or wrapped in XML, corresponding to the format specified in the header; to be encoded in UTF-8
      • Data-Format=txt : one plain reference string per line, adds new references, un-deletes ones that are re-added and were flagged as deleted
      • Data-Format=xml :
        <refSet>
          <ref>
            <refString><plain reference string></refString>
            <refParsed><parsed reference in MODS XML, if available></refParsed>
          </ref>
          <ref>...</ref>
        </refSet>
    • response: update statistics (MIME type text/xml , encoding UTF-8 ), in particular the uploaded or otherwise updated references, with attributes indicating whether they were updated or altogether newly added to RefBank:
      <refSet created="number of references newly added to RefBank" updated="number of references updated, less newly created ones">
        <ref id="reference ID" canonicalId="ID of canonical reference" deleted="deleted flag, true or false" createTime="UTC timestamp reference was first added to RefBank" updateTime="UTC timestamp reference was last updated" parseChecksum="MD5 hash of parsed version, if available" parseError="explanation why parsed version was rejected, if any" created="true or false, indicating whether reference was newly added to RefBank" updated="true or false, indicating whether reference existed and was updated">
          <refString><plain reference string, as stored in RefBank></refString>
        </ref>
        <ref ...>...</ref>
      </refSet>
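
As an illustration of the GET actions of the data servlet, the following Python sketch builds request URLs for action=find and action=feed and parses a refSet response. The helper names, the base URL, and the sample reference are assumptions made for this example; the sketch also assumes that "UTC HTTP timestamp" means the usual HTTP date format (e.g. Thu, 01 Jan 1970 00:00:00 GMT), which the description above does not spell out:

```python
import urllib.parse
import xml.etree.ElementTree as ET
from email.utils import formatdate

BASE_URL = "http://localhost:8080/RefBank/rbk"  # assumed local node


def build_find_url(base_url, queries, combine="and", limit=0, concise=False):
    """Build a GET URL for action=find; `queries` may hold several queries."""
    params = [("action", "find")] + [("query", q) for q in queries]
    params += [("combine", combine), ("limit", str(limit))]
    if concise:
        params.append(("format", "concise"))
    return base_url + "?" + urllib.parse.urlencode(params)


def build_feed_url(base_url, since_epoch_seconds):
    """Build a GET URL for action=feed, with updatedSince as an HTTP date."""
    stamp = formatdate(since_epoch_seconds, usegmt=True)
    params = {"action": "feed", "updatedSince": stamp}
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_ref_set(xml_text):
    """Return one dict per <ref>: its attributes plus the plain reference string."""
    refs = []
    for ref in ET.fromstring(xml_text).iter("ref"):
        entry = dict(ref.attrib)
        entry["refString"] = ref.findtext("refString")
        refs.append(entry)
    return refs


find_url = build_find_url(BASE_URL, ["Darwin", "1859"])
feed_url = build_feed_url(BASE_URL, 0)

# Made-up response in the documented <refSet> shape.
sample = (
    '<refSet><ref id="abc" canonicalId="abc" deleted="false">'
    "<refString>Darwin C (1859) On the Origin of Species.</refString>"
    "</ref></refSet>"
)
refs = parse_ref_set(sample)
```

The resulting URLs can be fetched with any HTTP client, for instance urllib.request.urlopen, and the response body passed to parse_ref_set.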

RefBank search servlet ( /RefBank/search ):

  • GET : retrieve search form, perform search, or retrieve styled or formatted reference, depending on parameters:
    • id : identifier of reference, yields reference specific response if set, depending on several other parameters:
      • format : reference format, for use with other software
      • style : reference string style, for use in bibliography of a publication
      • isFramePage : send reference specific popup page instead of reference proper?
      • combinations of the id and style and format parameters return different results:
        • id + format=PaRsEtHeReF + isFramePage=true : reference specific popup page, with reference opened for manual parsing in embedded IFrame (used in browser, not part of API)
        • id + format=EdItReFsTrInG + isFramePage=true : reference specific popup page, with reference string opened for manual editing in embedded IFrame (used in browser, not part of API)
        • id + style or format + isFramePage=true : reference specific popup page, with reference in specified style or format showing in embedded IFrame (used in browser, not part of API)
        • id + style : return the reference with specified ID in the specified style (MIME type text/html , encoding UTF-8 )
        • id + format : return the reference with specified ID in the specified data format (MIME type text/plain , encoding UTF-8 )
        • id=MiNoRuPdATe , no style or format: return HTML form for POST callbacks from search result page (used in browser, not part of API)
    • canonicalStringId : identifier of canonical representation, if set returns HTML page listing duplicate references (used in browser, not part of API)
    • query : full text query against reference strings
    • type : type of reference, only finds references with parsed version available
    • user : contributing user
    • author : query against author attribute of references, only finds references with parsed version available
    • title : query against title attribute of references, only finds references with parsed version available
    • date / year : query against year of publication attribute of references, only finds references with parsed version available (used synonymously)
    • origin : query against origin of references (journal + volume number, publisher or location, as well as volume title), only finds references with parsed version available
    • idType + idValue : query against some external identifier attribute of references (e.g. DOI or ISBN), only finds references with parsed version available
    • any of query, author, title, date/year, origin, or idType + idValue set: response is HTML page listing matching references (used in browser, not part of API)
    • no parameters at all: response is HTML page with search form (used in browser, not part of API)
  • POST (used in browser, not part of API): receive update callbacks from search result page
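
For the two API uses of the search servlet (id + style and id + format), a request URL can be assembled as in the following Python sketch. The function name, the servlet URL, the reference ID, and the style and format names are placeholders; the names of the styles and formats actually available on a node are not listed here:

```python
import urllib.parse


def reference_url(search_url, ref_id, style=None, data_format=None):
    """Build a URL returning one reference in a given style or data format."""
    # This sketch covers the two API cases only, so exactly one must be given.
    if (style is None) == (data_format is None):
        raise ValueError("specify exactly one of style or data_format")
    params = {"id": ref_id}
    if style is not None:
        params["style"] = style  # response documented as text/html
    else:
        params["format"] = data_format  # response documented as text/plain
    return search_url + "?" + urllib.parse.urlencode(params)


SEARCH_URL = "http://localhost:8080/RefBank/search"  # assumed local node
styled = reference_url(SEARCH_URL, "abc123", style="ExampleStyle")
formatted = reference_url(SEARCH_URL, "abc123", data_format="ExampleFormat")
```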

RefBank upload servlet ( /RefBank/upload ):

  • GET (used in browser, not part of API):
    • /RefBank/upload : retrieve upload form
    • /RefBank/upload/<upload-ID>/action : status info for running uploads
  • POST (used in browser, not part of API):
    • /RefBank/upload : receive text area or file upload from browser
  • PUT : receive upload via script:
    • request headers to set:
      • Data-Format : name of the reference data format used in request body (one of the formats selectable in the upload form)
      • User-Name : the name of the user to credit for the contributed references
      • Access-Key : the upload access key; needs to match configured key for servlet to accept the upload, ignored if no key is configured
    • request body: references in format indicated in header
    • response: upload result statistics (MIME type text/plain , encoding UTF-8 )
      RECEIVED: <number of references received>
      ERRORS: <number of references that contain errors>
      CREATED: <number of references newly added to RefBank>
      UPDATED: <number of references that were not newly added, but had their parsed version updated>
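
A script based upload via PUT could look like the following Python sketch. It prepares the request with the three documented headers and parses the plain text statistics of the response. The helper names, the servlet URL, the format name ExampleFormat, and the access key are placeholders, and the sample response merely follows the documented layout:

```python
import urllib.request


def build_upload_request(upload_url, body_text, data_format, user_name, access_key=None):
    """Prepare (but do not send) a PUT request for the upload servlet."""
    headers = {"Data-Format": data_format, "User-Name": user_name}
    if access_key is not None:
        headers["Access-Key"] = access_key
    return urllib.request.Request(
        upload_url, data=body_text.encode("utf-8"), headers=headers, method="PUT"
    )


def parse_upload_stats(text):
    """Parse 'KEY: number' lines of the response into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value.strip())
    return stats


request = build_upload_request(
    "http://localhost:8080/RefBank/upload",  # assumed local node
    "Darwin C (1859) On the Origin of Species.",
    data_format="ExampleFormat",  # placeholder, see the upload form for real names
    user_name="ExampleUser",
    access_key="secret",  # placeholder, must match the configured key
)
stats = parse_upload_stats("RECEIVED: 3\nERRORS: 0\nCREATED: 2\nUPDATED: 1")
```

The request can be sent with urllib.request.urlopen(request); the decoded response body is then passed to parse_upload_stats.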

RefBank data index servlet ( /RefBank/data ):

  • GET : retrieve a list of reference attribute values present in the parsed references in RefBank, for use as a gazetteer
    • parameters:
      • type : the type of data to retrieve
    • response: list of data element values of requested type (MIME type text/plain , encoding UTF-8 )
      • type=persons : names of persons, i.e., authors and editors
      • type=journals : names of journals
      • type=publishers : names of publishers
      • type=origins : names of journals and publishers
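
The following Python sketch shows how the data index servlet might be queried for a gazetteer. The helper names and the servlet URL are placeholders, and the parser assumes the plain text response holds one value per line, which is not stated explicitly above:

```python
import urllib.parse

# The four values documented for the type parameter.
VALID_TYPES = ("persons", "journals", "publishers", "origins")


def build_data_url(data_url, data_type):
    """Build a GET URL for the data index servlet."""
    if data_type not in VALID_TYPES:
        raise ValueError(f"unknown type: {data_type}")
    return data_url + "?" + urllib.parse.urlencode({"type": data_type})


def parse_gazetteer(text):
    """Split a response into entries, assuming one value per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


journals_url = build_data_url("http://localhost:8080/RefBank/data", "journals")
entries = parse_gazetteer("Zootaxa\nZooKeys\n\n")  # made-up sample response
```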