Future-proofing MELISR

Toward a robust and standards-compliant herbarium management system for the National Herbarium of Victoria

Executive summary

MELISR is the collections database of the National Herbarium of Victoria. MELISR has not undergone any major development in the last 8+ years, in order not to interrupt data entry for Australia's Virtual Herbarium (AVH). Consequently, there are now many outstanding issues that need a major redevelopment effort to resolve. These include unreliability of the back-up system, issues with data capture and data integrity, and issues with data delivery to AVH.

In order for MELISR to optimally fulfill the business needs of the Plant Sciences and Biodiversity Division (PS&B), MELISR also will need to incorporate the loans and exchange database and the MEL herbarium census, as well as efficiently interface with nomenclatural or taxonomic databases such as VicList, APNI and the Australian Plant Census (APC), and be able to communicate with mapping software such as ArcGIS. As not all these objectives can be met under the current database management system, KE Texpress, other database options have been investigated.

It is concluded that the Royal Botanic Gardens (RBG) does not currently have the expertise to develop its own herbarium management system. In any case, this option would take the greatest development effort and lead to the largest loss of productivity during development, while there are pre-built collections management systems available, some of them at little or no cost.

KE EMu is the successor of Texpress and still uses Texpress as its back-end database, and therefore would represent a logical upgrade path. However, it is very expensive and has a history of implementation problems at other herbaria. As EMu is designed for all kinds of museum collections, customisation cost also would be maximal.

BRAHMS is especially developed for herbarium management, is free, and would probably require the least customisation to accommodate the MELISR data structure. However, BRAHMS uses a soon to be phased out database management system, and it is unclear how BRAHMS will develop after that. Because of the rather weak back-end database there are issues with robustness, scalability and extensibility.

Specify is developed for natural history collections, including herbaria and natural history musea. Specify has a highly structured, standards-compliant data model and was found to be usable and robust. It is also the only system that is completely open source and hence can be adapted to our future needs. Implementation of Specify requires only minimal changes to existing Information Services (IS) infrastructure. It is recommended that the RBG adopts Specify as its new herbarium management system.

Converting to a new collections management system will involve significant commitment of time and resources to retrain existing staff who use MELISR. In order to minimise loss of productivity during the implementation phase it is recommended the new herbarium management system be implemented prior to initiating any further large-scale data capture projects, such as databasing the foreign collection or undertaking a specimen imaging program. On a broader scale, it is a good time to upgrade our database system as TDWG standards are now at a mature stage and are being adopted globally.

Background

The MELISR database supports the primary business of PS&B. MELISR has been running on the Texpress DBMS since its inception in the early 1990s. The Texpress DBMS (produced by KE software) is simple and efficient and at the time was adopted by most Australian herbaria. However, with increasing requirements for data sharing between herbaria and the emergence of global biodiversity data standards, the limitations of the Texpress DBMS are becoming apparent.

The MELISR database design was largely borrowed from other database systems in existence (namely CANB and NSW) leading to some parts being less than optimal for MEL's requirements. Despite these shortcomings, MELISR has not had any major development work done for the last 8+ years in order to minimise disruption to the Australia's Virtual Herbarium (AVH) databasing project. There are many outstanding issues with the database that will need a considerable development investment to resolve. It is therefore prudent to consider whether this effort would not be better directed toward upgrading MELISR to another database platform.

The MELISR database is a specimen database only; its simplistic structure cannot incorporate all the specimen-related or name-related information and curatorial tools used in the herbarium. In addition to MELISR, the herbarium maintains the Loans and Exchange database (an MS Access database), the Census database (MS Access), the VicList database (MS Access), a table of scheduled taxa (Texpress and MySQL) and an Authors of taxonomic names database (MS Access). The limited querying capability of Texpress, and the inability of Texpress to directly interface with the software required to present MEL's specimen data over the internet, have made it necessary to establish a duplicate of MELISR using the open source MySQL database. This is an inefficient solution that requires extra room on the server, resources to keep the two databases in synch and duplication of effort when implementing changes to MELISR.

An important application of MELISR data is the production of distribution maps. The inability of Texpress to interoperate with GIS software makes mapping MELISR data very laborious.

The label printing program within Texpress is primitive, and cannot be tailored to meet our needs without engaging programming assistance from KE Software. As there is often a conflict between the requirement to capture data and the requirement to print data, the inability to customise label printing in MELISR results in certain data (e.g. quarantine messages or acknowledgements of funding support) needing to be edited in to or out of records either before or after printing. A more sophisticated, and easily-customisable, printing system would allow the requirements for curatorial and specimen-related label data to be managed more consistently and more efficiently.

The reporting program in Texpress also leaves much to be desired. New reports within MELISR need to be individually programmed, and several curatorial reporting requirements are handled by other programs (such as the MySQL copy of MELISR, the Census database and the Loans and Exchange database) due to the deficiencies of the Texpress system. Loan information is only recorded in MELISR records for the duration of the loan, making it impossible to track the loan history of a specimen. The absence of loan history data for individual specimens means that the MEL collections cannot be accurately audited to meet reporting requirements (such as those of the bi-annual AVH Board report).

Our past experience with the Texpress system has demonstrated that it lacks robustness, making it susceptible to data loss. In March 2006, a low-level hardware error resulted in the loss of several records. The cause of the problem was difficult to pinpoint and remedy, and resulted in a three week disruption to data entry and retrieval at a time when ten staff were employed specifically to undertake data entry. As well as resulting in significant loss of productivity, this problem highlighted the inadequacy of the Texpress backup system; not all the lost records could be recovered and restored.

In addition to the technical deficiencies of the system, the current data structure is too simplistic to allow for accurate capture of taxonomic information, determination history and geographical information, which reduces the utility of our specimen database and compromises the integrity of our data. Texpress imposes constraints on the way that data can be structured, with all information for each object (or specimen) recorded in the one database record. This creates single records containing large numbers of fields, with the same information entered numerous times across the different records, resulting in inefficient data entry. As well as improving the way that specimen data is recorded, the database needs to be able to track data validation efforts and allow for better documentation of changes to records. Appendix A lists some of the improvements that need to be made to existing fields in MELISR, as well as suggested new fields that would improve the searchability and standards-compliance of the database. Many of these suggestions have been requested from staff and external clients. The key areas requiring improvement are expanded upon below.

One critical area that requires improvement is the way that taxonomic information is entered and stored in MELISR. The flat data structure of the Texpress system means that taxonomic names -- which should ideally stand alone -- cannot be separated from the specimen-related data (such as determination annotations and hybrid information) associated with individual records. Individual components of names are currently entered by the database operator from look-up tables. While this partially restricts the content of the taxonomic name fields, the tables are easily edited by any user with data entry privileges, so there is much scope for errors in data entry and inconsistencies in the application of names. The inadequacy of MELISR in dealing with uncertain determinations and cultivar, hybrid and informal names, combined with the inability to adequately restrict the content of the taxonomy fields, reduces the quality and reliability of the name data associated with our database records.

The current approach to recording taxonomic names makes it necessary to maintain a separate list of names that reflects the content of MEL's collection (the Census database). This represents a considerable duplication of effort, which could be eliminated by using a comprehensive herbarium management system based on an authoritative list of names, rather than a collection of separate databases each with their own name lists. A major benefit of having MELISR linked to such lists is that it would allow searching by synonyms, which is a powerful tool both for specimen curation and for data interrogation.

As well as improving the way that current names are recorded in MELISR, it would be valuable from both a taxonomic and curatorial point of view to record the determination history of a specimen. Currently, there is no facility in MELISR for recording original determinations and subsequent re-determinations. Although additional fields could be added to the new system, they would suffer the same shortcomings as the existing taxonomic name fields, thus determination histories would be better handled by a relational database with an underlying taxonomy table.

Another major weakness of the current MELISR database is that specimen records from the core collection (the specimen component of the State Botanical Collection) cannot be distinguished from records from non-core collections such as the Victorian Reference Set, the Horticultural Reference Set and the Victorian Conservation Seedbank. It is important that these collections are kept separate on the database so that data-retrieval for loan enquiries, electronic data requests and specimen retrieval reflects the location and accessibility of the specimens. Database records from the Victorian Conservation Seedbank may not be associated with a vouchered specimen, which undermines the integrity of MELISR as a collections database based on verifiable specimens. While it makes sense to use the same database to record data for these distinct collections, the structure of the Texpress system is too simple to reflect the different purpose, location and accessibility issues associated with these collections.

The shortcomings with the current version of MELISR outlined above could be overcome by employing a comprehensive, relational herbarium management system that encompasses a specimen database, loans and exchange database, scheduled taxa database and herbarium census. The taxonomic tables that would be the basis of the herbarium management system could also encompass the VicList database, thus reducing duplication of effort in taxonomic name management at the RBG and streamlining curatorial practices associated with keeping VicList data up to date.

While many of the desired improvements to MELISR require a more sophisticated database management system, some improvements to the handling of primary collecting data could be implemented in the existing Texpress system. However, given the considerable shortcomings of the existing system and the effort required to replicate any changes in the duplicate version of MELISR, this development effort would be more judiciously applied to incorporating MELISR into a new herbarium management system, rather than adding further workarounds to the existing database.

Options

Keep existing Texpress system

The Texpress application currently used for MELISR is no longer widely used and is likely to become obsolete. The only technical support available is from KE software and this is unlikely to be available in the future.

The standard data exchange protocols used to link in to global biodiversity initiatives (such as BioCASE and TAPIR) cannot interface directly with Texpress. Keeping the existing Texpress system for our Collections database puts the RBG at risk of not being able to participate in emerging biodiversity information initiatives such as GBIF, ALA and EOL and poses a risk to the organisation's reputation as a quality data custodian.

The loss of corporate knowledge associated with using a near-obsolete system is a great risk to the organisation. The administration and maintenance of the Texpress system requires a large amount of specialised skills and experience that cannot be easily replaced. The archaic nature of Texpress means that skills developed in any other modern database system will not be transferable.

There may also be financial ramifications for the RBG if we persist with Texpress, given the potential for licensing and support costs to increase due to the small number of users. Texpress development costs are also very high, as has been experienced in the past, and these costs are likely to rise in the future. The RBG currently pays $6300 per annum for 25 user licenses. Although this cost could be reduced to $4800 for 15 licenses, the cost of purchasing new licenses when needed is much higher than the amount saved this way. Any redevelopment in Texpress will require purchasing the TexAPI package at a cost of $18,000 plus $2315 per annum for licenses.

Change to a custom-built collections management system

One option is to develop a custom collections management system. The greatest benefit of this option is that it would be specifically tailored to our needs. Along with this benefit come risks associated with using a system that is not widely used, and thus doesn't have a global community of users and developers. The development of such a system would be expensive in terms of staff time, and would require that staff time is taken away from other duties.

At the implementation stage, there is the risk of exceeding the anticipated development cost and the cost of data migration from Texpress to the new system. Also, the temporary loss of productivity during the changeover period would be greatest with this option. All future support and development would most likely have to come from within the organisation. The RBG do not have the expertise to develop a custom-built front end in-house.

Change to existing collections management system

The third option is to transfer MELISR to an existing collections management system. This is potentially an expensive option, but the cost of different collections management systems varies enormously. The major benefit of this option is that it will allow us to select a proven system that has a global community of users and developers. It will also be less expensive in terms of staff time than developing a custom-built system.

Risks associated with this option include high development/customisation, migration, and training costs. These costs can be minimised by choosing a system that most closely meets our needs, choosing a database management system we are already familiar with, and choosing a system with a user-friendly interface. There is also the risk of data loss during migration, which may be reduced by keeping good back-ups and by rigorous testing of the data model of the new application. We also need to minimise the risk that the new software will not be supported in the future and make sure the database can be modified and extended to meet future business needs of the RBG.

3.1. KE EMu

KE EMu is a Windows-based collections management application, designed for all kinds of musea, that uses Texpress as the back-end database. While this is a powerful system and represents a logical upgrade path, it has serious limitations. It is very expensive and has a history of implementation problems with regard to herbaria and botanic gardens (e.g. NSW and BM). It is also closed source, meaning that any customisations will have to be performed by KE Software and will be costly.

Table 1. Features of KE EMu


Operating system
- Front-end	MS Windows
- Back-end	Linux
Open source	KE EMu is completely closed source; any customisation will have to be performed by KE Software.
Back-end database	KE EMu uses KE Texpress as its back-end database. This is the same database MELISR currently uses, but with some extensions.
User-friendly interface	Yes
Customisability	KE EMu is fully customisable, but this customisability comes at a cost as it has to be carried out by KE Software. As EMu is designed for use by all kinds of musea customisation needed will be more than in the other applications considered.
Scalability	EMu scales well.
Extensibility	EMu is fairly self-contained; extension is possible, but will have to be custom-designed by KE.
Interoperability	Native interoperability is poor
Support	Support is probably good, but expensive. One would expect some support will come with the licenses. However, a large part of the problems that other herbaria have had with EMu is likely to have been caused by communication problems between herbarium people and application developers.
Startup costs	$130,000 (25 licenses plus initial customisation)
Ongoing costs	$120,000 pa (25 licenses)
url	http://www.kesoftware.com/content/view/512/356/lang,en/

3.2. BRAHMS

BRAHMS is a specialised herbarium collections management system developed by the Oxford University Herbaria and used by many herbaria all over the world, for instance the National Herbarium of The Netherlands (L, WAG). BRAHMS was found to be rich in features, but lacking in usability, robustness and extensibility. BRAHMS has just undergone a major redevelopment, but is still built around the same DBMS, Microsoft Visual FoxPro, which is not an enterprise database. BRAHMS will only run on Microsoft Windows platforms.

Table 2. Features of BRAHMS


Operating system
- Front-end	MS Windows
- Back-end	MS Windows
Open source	BRAHMS is closed source
Back-end database	BRAHMS uses Microsoft Visual Foxpro. FoxPro is a legacy DBMS no longer actively developed by Microsoft and will not be supported after 2014.
User-friendly interface	No
Customisability	BRAHMS has limited customisability. However, the application is specifically designed for herbaria, so only little customisation will be necessary to accommodate the MELISR data model.
Scalability	BRAHMS scales poorly, mostly due to its rather weak back-end database.
Extensibility	BRAHMS is fairly self-contained. However, it is very feature-rich and is specifically designed for herbarium management, so the data model should be sufficient to accommodate all necessary fields at least for the near future.
Interoperability	BRAHMS has built-in operability with some other applications, such as ArcView and DIVA-GIS. The file format in which it saves its data (.DBF) can be read by some Windows applications. An extension for online publishing is available. The National Herbarium of the Netherlands is a member of EDIT and BioCASE and delivers data to GBIF, so dynamic data delivery through a BioCASE provider must be possible.
Support	Support for BRAHMS is provided by the BRAHMS Project at the Oxford University Herbaria. As herbarium taxonomists are involved in the project, there should be no communication problems. A support contract costs $US600 pa.
Startup costs	--
Ongoing costs	$US600 pa for support (optional)
url	http://dps.plants.ox.ac.uk/bol/home/default.aspx

3.3. Specify

Specify (University of Kansas) is designed for herbaria and zoological musea and was found to be usable and robust. The Specify software is currently being completely rewritten and will be released as open source. The new version, Specify6, is built in Java using Hibernate to abstract the database layer and will therefore run on many database management systems, including MySQL, PostgreSQL, and Microsoft SQL Server, which all already run on RBG servers (MySQL is used for the web interface of all botanical databases and for the AVH interface, as well as advanced querying, of MELISR; MySource Matrix, the Content Management System for the RBG website uses PostgreSQL; Hummingbird, the records keeping software uses MS SQL Server). Specify6 will be released 27 February 2009. Specify has a world-wide user community and is currently used by 112 institutes, including 34 herbaria. Development of Specify has been supported by the US National Science Foundation for the last twenty years. Judging from the proceedings of the 2008 TDWG annual conference, Specify is very much at the forefront of collections management systems.

Specify has a strongly structured data model that is DarwinCore compliant (GBIF uses DarwinCore) and therefore most likely also ABCD compliant. Specify6 contains 138 tables and 1658 fields. While some fields in MELISR that are specific to MEL will not be already in the Specify data model, there are several blank fields which can be used for these. Given that the database layer is abstracted from the front end and the database management systems that can serve as back end are very powerful, we expect excellent scalability and extensibility, as well as interoperability with other applications (e.g. GIS, electronic flora, image storage).

Specify optionally comes with a fully customisable (using CSS only) web interface. The Specify front end, which includes all the forms and reports (including labels), is fully customisable, without requiring programming. If in future we want to make changes to the data model, the associated changes in the front end would require programming in Java. However, given the growing user community we expect that changes in the data model necessitated by outside factors would be taken care of in new minor versions of Specify.

Table 3. Features of Specify


Operating system
- Front-end	MS Windows, Mac OS 10 or Linux
- Back-end	MS Windows, Mac OS 10 or Linux
Open source	Specify is open source. Source code for everything except the label printing application can be obtained from the developers. Specify6 is entirely written in Java, which means that in future we will actually be able to make changes to the source code if necessary.
Back-end database	Because the front-end user interface is separate from the back-end database, Specify6 can use most of the major database management systems, such as MySQL, PostgreSQL, MS SQL Server or Oracle. Specify6 by default uses MySQL.
User-friendly interface	Yes
Customisability	Specify 6's Graphic User Interface is entirely customisable, with the possibility to choose fields, change the format or type of fields and even change field names (similar to forms in MS Access).
Scalability	Because of the very powerful back-end database systems Specify6 will scale very well.
Extensibility	Specify has good extensibility. While extension of the data model at the back end is easy and only requires knowledge of SQL, the associated changes in the front end require more knowledge of Java than is currently available at the RBG. However, the Specify data model is very rich, with many customisable fields, so should easily be able to accommodate all MELISR fields at least in the near future.
Interoperability	Specify comes with a web-interface (which we may not use) and a DiGIR provider (which we definitely will not use). MySQL interoperates very well with PHP for dynamic web applications and, through the MySQL ODBC, with MS Access and ArcGIS.
Support	Specify is free and offers free support to registered users. While priority support is given to US institutes, Specify is happy to provide support to non-US institutes as resources allow. Part of the support is migration of data into Specify. With Specify5.2 there was a waiting time of 2--3 months between registration and migration. We expect waiting times to be longer once Specify6 is released, as existing Specify users will need to have their data migrated as well.
Startup costs	--
Ongoing costs	--
url	http://www.specifysoftware.org/Specify; http://specify6.specifysoftware.org/ (temporary Specify6 website)

Recommendation

Given the Texpress system is nearing obsolescence and changing to a custom-built system is an inefficient and expensive option, we recommend changing to an existing collections management system. Based on our comparison of existing collections management systems we recommend upgrading to Specify. We finally recommend a new collections database management system be implemented prior to initiating any further large-scale data capture projects, such as databasing the foreign collection or imaging of herbarium specimens.

Of the three collections management systems considered KE EMu was discarded as an option early, because of its high implementation, customisation and licensing cost and because of the problems other herbaria have experienced with it. BRAHMS was found to be feature-rich and, because it was designed especially for herbaria, BRAHMS' data model is currently probably the most compatible with the structure of MELISR. However, we are concerned about the back-end database management system BRAHMS uses, and that because of its limited scalability and extensibility, we will not be able to adapt BRAHMS to meet the RBG's future needs.

Specify has the ability to employ a very powerful database management system and can therefore make use of the DBMS' back-up and security facilities. It has a highly structured data model that can include most MELISR fields as is, and all fields after some modification. Specify comes with a very user-friendly, fully customisable front end and with extensions such as a web interface and DiGIR provider. The worldwide diverse user and development community guarantees that Specify will adapt to future needs better than the other systems. Also from an infrastructure perspective Specify fits best, as all required infrastructure is already in place at the RBG (nevertheless a more detailed analysis of infrastructure requirements will be part of the project planning).

We would like to emphasise that while upgrading to a new collection management system is urgent in order to safeguard the quality and integrity of our collections data and the RBG's reputation as a quality data custodian, the process is not going to be painless. The implementation of a new collection management system will require a large time commitment, especially from the Programmer, Information Services and Collections Information Officer, and will affect all MELISR users. In order to ensure data integrity it will not be possible to run the old and new systems concurrently and therefore MELISR will not be accessible for a period of time during the implementation phase. Also the loans and exchange administration system will not be accessible during this period as it will be included in the new collections management system.

The temporary loss of productivity during the implementation period will be more than made up for by improved efficiency and increased productivity once the new herbarium management system has been implemented. On a broader scale, this is a good time to upgrade our collections database system, as international data standards have come of age and only minor changes are expected in the near future.

Appendix B describes an implementation roadmap that aims to ensure the implementation period is as short as possible and to minimise loss of productivity during this period.

Table 5. Summary of features of the different herbarium management systems considered. The current KE Texpress collections management system is included for comparison.

	KE Texpress	KE EMu	BRAHMS	Specify
Operating system
- Front-end	Linux shell emulator	Windows	Windows	Windows, Linux or Mac OS X
- Back-end	Linux	Linux	Windows	Windows, Linux or Unix
Open source	No	No	No	Yes
Back-end database	Texpress	Texpress	Microsoft Visual Foxpro	MySQL, PostgreSQL, MS SQL Server, Oracle or any other server-side DBMS
User-friendly interface	No	Yes	No	Yes
Customisability	Poor	Good, but very expensive	Good	Good
Scalability	Poor	Good	Poor	Excellent
Extensibility	Poor	Good, but very expensive	Poor	Good
Interoperability	Poor	Poor, but with ample inbuilt functionality	Good	Good
Support	Very limited^1^	Good	Good	Good
Startup costs	N/A + $18,000 (TexAPI)	$130,000	--	--
Ongoing costs	$6,300 pa + $2,315 (TexAPI)	$102,000 pa	$US600 pa^2^	--

^1^ Support for Texpress is very limited as Texpress as a stand-alone application is being phased out and replaced by EMu.

^2^ $US600 pa is for support, there are no licensing costs.

Appendix A --- Summary of suggested improvements to MELISR

Field/s	Suggested improvements
Taxonomy fields	Link specimen names to an authoritative table of taxonomic names. Information related to individual specimens (determination annotations, qualifiers and hybrid information) to be stored with individual records, rather than with names.
	Improve handling of higher level taxonomies, particularly for fungi and algae
Determinations	Include original determination and subsequent determination history.
Collector/Add. coll.	Create separate fields for recording verbatim label data and standardised data, which would link to an authoritative list of collectors.
Collecting date	Add a memo date field to record non-standard collecting dates, e.g. 13-17 June; late March; Spring; Christmas etc.
Additional collector	Add new fields to record collecting numbers of additional collectors.
Geocode	Allow for recording of geocode as originally provided (DMS or decimal).
	Add new fields for recording AMG references, and enable autoconversion of AMG to geocode.
	Add a new field for recording error measure when provided by collector.
	Improve handling of geocode source data.
Cultivated data	Improve handling of locality data for cultivated records. Currently, provenance data is entered in the Notes field, and cultivating locality details are entered in the locality fields (minus geocode). Need to record both cultivated and provenance locality data in a way that allows them to be queried and mapped (or excluded from queries or mapping) on request.
	Add new fields to cater for Plant Occurence and Status Scheme values.
Unit relationship field	Add new fields to record the range of relationships between herbarium sheets. Currently, the only way of recording a relationship between one or more herbarium sheets is to multisheet them. Need to convert the multisheet field to a unit relationship field that allows for other types of relationships to be recorded (e.g. cultivated seedling and wild-collected parent plant; uncertain links between foreign specimens).
Duplicates/Specimen Received from/Original herbarium (for images)	Apply a restricted vocabulary to these fields to prevent the inclusion of non-standard entries.Protologue
	Separate the publication title and the page and date citation into two distinct fields so that publication title can be linked to an authoritative list of names (e.g. BPH/TL-2) to avoid incorrect and inconsistent entries.
Precision	Check that our precision code values are sensible and add a built-in guide to help data entry personnel to use them correctly.
Depth	Stop decimal places from being automatically appended to values in this field as it gives an inaccurate impression of the precision of the measurement.
Notes	Divide this field into two (or more) categories to allow the collector's notes to be recorded separately from annotations on the specimen and curatorial or explanatory notes added by the database operator.
Managed habitats	Improve plant occurrence fields to better document specimens collected from managed habitats, e.g. those that have been self-established in a Botanic gardens context, and should not be treated either as wild-collected or as cultivated.
Validation level	Add new fields to represent the level of validation of a database record (includes identification, geocode, distribution, predictive distribution).
Images	Differentiate between images of the sheet and images of plant in its habitat etc. Also need to be able to record when we have produced a digital image of a specimen to send to another institution.
	Add a field to record file paths or URLs for digital images.
Vic. Ref. Set	Improve flagging of Vic. Ref. Set specimens. Vic. Ref. Set specimens are currently listed as duplicates, despite the Vic. Ref. Set not being an official, accessible collection. It would be better to flag these records in a different way than duplicate specimens are flagged.
Type status determination	Move type status determination data. This information is currently stored in Extra Info., but would be better stored as part of the determination history, with type status of the determination recorded.
Verbatim label field	Add a new field to record verbatim label data for foreign-language labels.
	Allow for unicode characters to be captured so that foreign-language data can be recorded more accurately (not possible in Texpress).
Original language field	Add a new field to record the original language that a label is written in (if non-English). This will allow ease of searching in the event that we want to query for language (e.g. for batch translation of labels).
Global gazetteer	Link to a global gazetteer for ease of geocoding foreign collections. e.g. GEONet Names Server files
Library catalogue no.	Add fields to enter call numbers for additional information stored in the library, e.g. letters, photos, colour transparencies etc.
Quarantine notes	Add a new field to enter quarantine notes that are not printed on any labels or exported, and are only used by curation staff. Currently, these messages must be deleted prior to labels being printed, then re-entered into the record.
Destructive sampling	Add a new field to record when material has been removed for destructive sampling.
Ethnobotanical information	Add a new field to flag the presence of ethnobotanical data associated with a record.
Indigenous name	Add a new field to record indigenous plant names when provided by the collector.

Appendix B --- Implementation roadmap

The major stages required to implement this project are outlined below:

Project planning

Registration with Specify;
Development of a detailed implementation plan, by September 2009.

Comprehensive needs analysis

Consultation with MELISR users regarding the current MELISR database structure to determine what fields need changing, and what new elements are required;
Mapping of fields in MELISR against the HISPID5 (ABCD) standard to ensure compliance;
Preparation of draft MELISR data entry manual.

Data preparation

Mapping of MELISR fields against Specify data model;
Performance of major quality assurance work on non-compliant fields in MELISR to make data migration to Specify as smooth as possible.

Implementation and testing

Installation of Specify on MEL server and work stations;
Data migration;
Comprehensive testing of the new system and revision of the MELISR data entry manual.

Training

Training of MELISR users in the use of the new system. This will need to be undertaken incrementally, starting with those staff whose work is most reliant on the database.

Configuration of provider software

Configuration of TAPIR and BioCASE providers for data delivery to AVH, the ALA and the GBIF.

Appendix C --- Glossary

Term	Description
ABCD	Access to Biological Collections Data -- a comprehensive TDWG standard for access to and exchange of primary biodiversity data
ALA	Atlas of Living Australia -- a project funded under the Australian Government's National Collaborative Research Infrastructure Strategy (NCRIS) to develop a biodiversity data management system for Australia's biological knowledge
ArcGIS	a group of geographic information system (GIS) software product lines produced by ESRI
AVH	Australia's Virtual Herbarium -- an online botanical information resource that provides access to data associated with scientific plant specimens in Australia's major herbaria
BioCASE	Biological Collections Access Service for Europe -- a transnational network of European biological data providers
BioCASE provider	A data exchange protocol developed by BioCASE. The BioCASE provider abstracts data from a database and turns it into standard format, such as ABCD or DarwinCore. MEL and AD (Adelaide) use the BioCASE provider to deliver data to AVH. Other similar protocols are TAPIR and DiGIR.
BM	Acronym of The Natural History Museum, London (British Museum)
BRAHMS	Botanical Research And Herbarium Management System
CSS	Cascading Style Sheets -- a stylesheet language used to apply formatting to web pages
customisable	able to be customised to our particular needs, without the need for extensive programming
DarwinCore	a standard designed to facilitate the exchange of information about the geographic occurrence of species and the existence of specimens in collections
DBMS	database management system -- examples of database management systems are: MS Access, MySQL, MS SQL Server
DiGIR	Distributed Generic Information Retrieval -- a client/server protocol for retrieving information from distributed resources
EMu	Electronic Museum -- collections management software developed by KE Software
EoL	Encylopedia of Life -- a project to create an online reference source and database for the 1.8 million named and known species on earth
extensible	able to be extended and adapted: an extensible database allows for expansion of the data structure, i.e. additional tables and fields
future proofing	the selection of physical media and data formats that best ensure the continued accessibility of data into the future. This process involves anticipating future developments and ensuring that only well-documented formats, standards and specifications are used to store and describe data.
GBIF	Global Biodiversity Information Facility -- an international organisation that focuses on making scientific data on biodiversity available via the Internet using web services. The data are provided by many institutions from around the world; GBIF's information architecture makes these data accessible and searchable through a single portal. Data available through the GBIF portal are primarily distribution data on plants, animals, fungi, and microbes for the world, and scientific names data.
GIS	geographic information system -- captures, stores, analyses, manages, and presents data that refers to or is linked to location. In a more generic sense, GIS applications are tools that allow users to create interactive queries (user created searches), analyse spatial information, edit data and maps, and present the results of all these operations.
HISPID	Herbarium Information Standards and Protocols for the Interchange of Data -- a standard format for the interchange of electronic herbarium specimen information, initially developed by the Australian herbaria and later adopted as a TDWG standard. The current version -- HISPID5 -- is ABCD compliant.
Java	an object-oriented programming language
MEL	acronym of The National Herbarium of Victoria
MELISR	MEL Information System Register -- the National Herbarium of Victoria's specimen database
Microsoft SQL Server	a relational database management system, used as the back-end database of Specify5
MySource Matrix	an open source content management system (CMS) written in PHP, used for the new RBG website
MySQL	a powerful, open source, relational database management system
NSW	The National Herbarium of New South Wales
PostgreSQL	an object-relational database management system
PS&B	Plant Sciences and Biodiversity Division, Royal Botanic Gardens (Melbourne)
RBG	Royal Botanic Gardens
roadmap	a plan for change execution
robust	able to withstand pressures or changes in procedure or circumstance
scalable	able to handle growth without having to replace the existing platform or architecture
Specify	research software application, database and network interface for biological collections information
TAPIR	TDWG Access Protocol for Information Retrieval -- a computer protocol designed for the discovery, search and retrieval of distributed data over the internet
TDWG	Taxonomic Database Working Group (also referred to as Biodiversity Information Standards)
Texpress	an object-oriented multi-user database management system developed by KE Software
usability	the efficiency with which a user can perform tasks in a given application
VicList	Census of the Vascular Plants of Victoria, an up-to-date list of the species and infraspecific taxa of vascular plants occurring in Victoria

← Herbarium interactions Specify to Darwin Core mapping →

# Future-proofing MELISR

# Executive summary

# Background

# Options

# Keep existing Texpress system

# Change to a custom-built collections management system

# Change to existing collections management system

# 3.1. KE EMu

# 3.2. BRAHMS

# 3.3. Specify

# Recommendation

# Appendix A --- Summary of suggested improvements to MELISR

# Appendix B --- Implementation roadmap

# Appendix C --- Glossary