Report of Working Group

From Online Dictionary of Crystallography

Revision as of 08:55, 17 July 2006 by BrianMcMahon

The Dictionary Working Group of the Commission on Crystallographic Nomenclature (CCN) was formed during the 20th IUCr Congress in Florence to provide guidance on the establishment and conduct of a project undertaken under the aegis of the Commission, with the approval of the IUCr Executive Committee and the involvement of other Commissions and appropriate bodies of the IUCr, to provide online definitions of terms used in the practice of crystallography.

The first stage of the action of the Working Group was two-fold: on the one hand, to define the nature and scope of the Dictionary, and, on the other hand, to develop an appropriate tool for its implementation.

The purpose of this report is to present the state of the project after nearly one year's experience and to give the Working Group's proposals on these two points and on the financial implications.

Nature and scope

Motivation

Many definitions of crystallographic terms are scattered in the International Tables but there is, at present, no place where they are systematically compiled, as is the case, for instance, for the chemical terms defined in the various compendia published by IUPAC (the 'gold', 'red', 'blue', 'purple', 'silver' books). The many questions received by the Commission on Crystallographic Nomenclature related to matters of definitions and nomenclature show that there is a real need for such a compendium for crystallography. The idea was received enthusiastically by the Executive Committee in Florence and the Working Group was set up to implement a pilot project for a dictionary of crystallographic terms.

Medium

It is proposed that the project should be executed initially solely as an online project, because of the flexibility of the online medium, the absence of any limit on the number of entries, and the possibility of hyperlinks to IUCr and other web resources. The present form of the project follows the Wikipedia pattern and makes use of the mediawiki software (see Technical Considerations). It was implemented by the Research and Development Officer, Brian McMahon.

It will always be possible at a later stage to consider a physical book with a CD containing all the hyperlinks, if it appears that there is a need for such a product.

Scientific scope

Broadly speaking, the project should be confined to the subject of crystallography, the area of science over which the IUCr has authority. Terms selected for inclusion should have a clear crystallographic implication, and terms from connected disciplines (mathematics, physics, chemistry, mineralogy, biology, computational data processing, etc.) should be included insofar as they relate to crystallography, e.g. crystallographic group. Names of chemical or biological substances or minerals should not be included at the present stage, but terms such as albite twin law should. References to computer programs per se should not be included, but there might be instances when one becomes essential, e.g. SHELX. Names of people should only be included if they relate to crystallographic concepts, e.g. Bragg's law, Ewald sphere. Two-word items such as “X-ray interferometry” should be entered as such; a search on “interferometry” will automatically retrieve them. Equations, tables and figures are included where necessary (see, for instance, the entries Bragg's law and arithmetic crystal classes).

The Working Group considers that translations of terms into languages other than English should be given, but that the definitions themselves should not be translated. The pilot demonstrates many translations into French, Spanish, Italian, German and Russian. Because it is impossible to collect a comprehensive set of translations at any one time, an advantage of the WiKi approach is the ability to extend the list of translations at any time.

The granularity of definitions

The Working Group recommends a reference product that is a blend between “dictionary” and “encyclopaedia”: a list of terms with short definitions and cross-links to other entries in the work, with at times longer developments. These longer developments are presented on a separate page that one accesses via a hyperlink (see, for instance, the page arithmetic crystal classes). Hyperlinks are also provided to other web resources of the IUCr (Teaching Pamphlets, CIF dictionaries, International Tables, Journals). For instance, in the entry reciprocal lattice, hyperlinks are given to the corresponding pamphlet on the IUCr web site (open access) and to the appropriate chapters of IT Volumes A, B, C and D; for these it is for the Executive Committee to decide (after recommendation from the Finance Committee) whether such links will be free access or not. As other examples, the entry CIF has links to Journal articles (available to subscribers or by purchase of individual articles) and the entry Bragg's law to 50 Years of X-ray Diffraction (free access). Hyperlinks to other web sites such as the IUPAC web sites or educational web sites can also be provided, if appropriate (see, for instance, the entry absolute structure).

The general pattern of a typical page is:

  • translation of the term in other languages
  • main definition
  • examples or applications or special cases
  • history
  • list of links to other entries or to IUCr or other web pages
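
As a rough illustration of this pattern, a page could be modelled as a small record with one field per part. The sketch below is only a way of picturing the layout; the class name `Entry`, its field names and the sample text are assumptions made for the example, not part of the pilot or of the mediawiki software.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """One dictionary page, mirroring the five-part pattern listed above."""

    headword: str
    translations: Dict[str, str] = field(default_factory=dict)  # language -> term
    definition: str = ""
    examples: List[str] = field(default_factory=list)
    history: str = ""
    links: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the page as plain text, one line per populated part."""
        lines = [self.headword]
        for language, term in sorted(self.translations.items()):
            lines.append(f"  {language}: {term}")
        lines.append(f"Definition: {self.definition}")
        for example in self.examples:
            lines.append(f"Example: {example}")
        if self.history:
            lines.append(f"History: {self.history}")
        for target in self.links:
            lines.append(f"See also: {target}")
        return "\n".join(lines)


# Illustrative content only; not the wording of the actual dictionary entry.
bragg = Entry(
    headword="Bragg's law",
    translations={"French": "Loi de Bragg"},
    definition="Condition for diffraction of a wave by a family of lattice planes.",
    links=["reciprocal lattice", "Ewald sphere"],
)
print(bragg.render())
```

Parts that are empty (here, examples and history) are simply omitted from the rendered page, which matches the idea that not every entry needs every section.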

Structure of the work

The work will be structured in several ways to assist navigation. The terms are entered alphabetically and can be retrieved alphabetically, but the WiKi software allows an ordering by categories and subcategories. Each entry can be attached to one or more categories (and subcategories). At the time of writing, categories are being assigned to entries on an ad hoc basis in an attempt to determine suitable structuring mechanisms. A click on a category provides links to all the entries related to that category. The present list of categories is given on the Main Page. As an example the subcategory Twinning has been introduced in the category Fundamental crystallography.
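
The category mechanism can be pictured as an index from category names to entries, with subcategories pointing to a parent. The following is a minimal sketch under stated assumptions: the function names are invented for the example, and only the pairing of Twinning with Fundamental crystallography comes from the text; the assignment of individual entries to categories is hypothetical.

```python
from collections import defaultdict

# Assumed assignments of entries to categories, for illustration only.
ENTRY_CATEGORIES = {
    "albite twin law": ["Twinning"],
    "arithmetic crystal classes": ["Fundamental crystallography"],
    "reciprocal lattice": ["Fundamental crystallography"],
}
# Subcategory -> parent category (this pairing is the one given in the text).
PARENT = {"Twinning": "Fundamental crystallography"}


def build_index(entry_categories):
    """Invert entry -> categories into category -> set of entries."""
    index = defaultdict(set)
    for entry, categories in entry_categories.items():
        for category in categories:
            index[category].add(entry)
    return index


def entries_in(category, index, parent, include_subcategories=False):
    """List entries attached to a category, optionally descending into subcategories."""
    found = set(index.get(category, set()))
    if include_subcategories:
        for child, mother in parent.items():
            if mother == category:
                found |= set(entries_in(child, index, parent, True))
    return sorted(found)


INDEX = build_index(ENTRY_CATEGORIES)
print(entries_in("Twinning", INDEX, PARENT))
print(entries_in("Fundamental crystallography", INDEX, PARENT, include_subcategories=True))
```

Because an entry carries a list of categories rather than a single one, it can appear under several headings without being duplicated.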

There are several advantages to having categories and subcategories. One is to allow searches on areas of interest, for instance if you are looking for a particular type of twinning but don’t remember its exact name. Another is to make the work of preparing the dictionary easier by assigning editors and subeditors to categories and subcategories. Their duty would be to oversee the definitions and to check that there are no obvious omissions.

Note that the WiKi software allows searches on headwords, but also full-text searching of the entire corpus, so that the user has available a large number of query-based information-retrieval strategies.
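
The difference between the two kinds of search can be sketched with plain case-insensitive substring matching. This is a deliberate simplification and not how the WiKi software's search engine works internally; the function names and the page bodies below are placeholders invented for the example.

```python
# Hypothetical headword -> body text; the bodies are placeholders.
PAGES = {
    "X-ray interferometry": "Placeholder text for this entry.",
    "reciprocal lattice": "Placeholder text for this entry.",
    "Bragg's law": "Condition for diffraction by a family of lattice planes.",
}


def search_headwords(query, pages):
    """Return headwords that contain the query (case-insensitive)."""
    needle = query.strip().lower()
    return sorted(head for head in pages if needle in head.lower())


def search_full_text(query, pages):
    """Return headwords whose title or body contains the query."""
    needle = query.strip().lower()
    return sorted(
        head
        for head, body in pages.items()
        if needle in head.lower() or needle in body.lower()
    )


print(search_headwords("interferometry", PAGES))
print(search_headwords("lattice", PAGES))
print(search_full_text("lattice", PAGES))
```

In this sketch a headword search on "lattice" finds only the entry whose title contains the word, whereas the full-text search also returns an entry that merely mentions it in its body.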

Level of definitions and audience

The primary goal of the dictionary is to be a reference for authors and referees of IUCr Journals and for research professionals in general: it will give the “official” IUCr-accepted meaning of terms. As such it will also be useful to students and to the general public.

Organization of contributors

The Editorial Board should consist of the members of the CCN, with representatives from the other Commissions as consultants for the various fields of crystallography. It is clear that, as Editors of the various IUCr publications, the members of the CCN are the people whose duty it is to say how crystallographic terms should be used.

Efficiency, however, requires that the work should be done under the supervision of a Main Editor or Editor-in-Chief and a small number of appointed Editors (and subeditors) for the various categories (and subcategories), chosen preferentially from among the CCN members and consultants.

The initial experience of the Working Group has been, however, that even the greatest enthusiasts for the project are so busy that they find it difficult to spend the time necessary to make substantial contributions. The authoring privilege has been extended recently to the rest of the CCN. Early indications are that, again, the rate of accretion of new definitions is slower than we would like to see. It is likely that individuals will need to be recruited and charged with populating specific topic areas with content if one wants the project to proceed at a reasonable pace. This may involve some financial incentive.

Presentation

It is expected that the resource would appear as a single web site. However, it should also act as a companion to International Tables and to the Journals, as well as to educational resources such as the Teaching Pamphlets and any new educational initiatives arising from the Teaching Commission. As the Online Dictionary of Crystallography would be an important and useful service to researchers, students and authors, it is desirable that it should be open access, bearing in mind that most definitions have links to IT Volumes, which are not open access. This last point may encourage people to subscribe to International Tables Online.

Financial implications

The project as initially envisaged will rely heavily on volunteer labour and existing hardware resources. The current pilot implementation shares the same hardware as the main IUCr web site (although it is managed as a separate virtual server, and so can easily be moved to its own server machine if required). Some additional software development will be required (e.g. implementation of a reliable backup strategy, modifications to the style to conform with other IUCr web components); but so long as these are not time-critical, they can be absorbed within the existing workload of the R&D department. Significant software developments (such as creation of a hard-copy edition) would need to be assessed and costed separately. Note that hardware costs in the event of a migration to a separate server would be modest (e.g. of the order of GBP 1000 would suffice for a powerful dedicated machine).

Technical editing costs are ruled out at this stage (it is assumed that the invited contributors will have a high degree of literacy, and that there will be a measure of self-regulation as contributors edit each other's entries to correct minor spelling and typographic errors). Since each entry will be presented as a separate web page, minor inconsistencies of style and presentation will not be so important as they would be in a hard-copy publication. Conversely, however, the decision to produce a hard-copy publication would be likely to involve more rigorous technical editing, with subsequent added costs.

The Finance Committee should monitor the possible need for payment of editorial honoraria. It is expected that the project will require an Editor-in-Chief responsible for its overall shape and direction (at present this role is filled by the project initiator, Professor Authier). The roles of such an Editor-in-Chief will also cover the possible appointment of subsidiary editors to supervise the collection of definitions in topic areas where they have particular expertise, and the commissioning of definitions or sets of definitions to address topics not currently covered. The number and roles of secondary editors will depend in part on the readiness of the volunteer pool of contributors to identify deficiencies and provide needed definitions without prompting. The experience of the Wikipedia project suggests that this is possible in principle, but the early experience of the pilot suggests that significant effort will be needed in the early stages to build a critical mass of content that will inspire more active involvement by volunteer contributors.

Technical Considerations

A major goal of the pilot project was to identify a software platform capable of supporting collaborative work on an online dictionary by the distributed authorship that the project required. Ideally the software chosen would also act as a dissemination mechanism, i.e. the contributors would be working directly on the pages that readers would view.

WiKi software

The approach put forward at the Florence Congress and enthusiastically received by the Nomenclature Commission was the use of a 'WiKi'. A WiKi (from the Hawaiian for 'quick' or 'fast') is a web-centric content management system designed to be lightweight and to encourage rapid development of web sites by a collaboration of authors and editors. The public Wikipedia project is an example of a very large work of this sort (at the moment over one and a quarter million encyclopaedic entries in the English-language edition, written and edited by tens of thousands of users). Two WiKi software implementations were investigated: MoinMoin, which is used in-house for technical documentation by the IUCr editorial staff, and mediawiki, which is used in the Wikipedia project. Although MoinMoin had certain advantages in ease of set-up, maintenance and use, it proved to be too limited in its ability to handle images, mathematics, and complex page layouts. After a few months' development on the MoinMoin platform, the content was transferred successfully to a mediawiki implementation, which will form the basis for future developments.

mediawiki

The first mediawiki implementation (http://www.mediawiki.org) was set up in December 2005, and a reimplementation with updated software and appropriate access control mechanisms was launched in late January 2006.

The main advantages of this implementation are:

  • native support for uploading of images and other non-text files
  • native support for TeX-based processing of suitably marked-up mathematics content
  • support for raw HTML markup, allowing the construction of complex tables and relatively complex page layout
  • support for a simple markup that is easy for a new author to learn, and is suitable for simple text-only entries
  • layered and extensible access rights, allowing different classes of user: 'reader', 'author', 'editor' and 'systems administrator'
  • support for categories
  • automated section numbering
  • numerous admin functions (collection of statistics, autoindexing of categories and of the entire site, identification of broken internal links etc.)
  • support for automated rights metadata (the current pilot is advertising Creative Commons rights to copy, distribute, display, and perform the work, and to make derivative works)

Its main disadvantages are:

  • significantly greater administrative overhead than MoinMoin (although much of this is one-off setup or introduction of new features)
  • poor support for page templates (although templated data fields and transclusion will be useful features in the longer term)
  • poor local documentation

mediawiki offers many features that are suitable for the Online Dictionary project: the ability to create and edit entries, store version histories, exercise editorial control to freeze definitions if necessary, and annotate and discuss articles, together with internal hyperlinking, indexing and search engines. It is also suitable as a dissemination platform. It offers good support for maths and images, both of which are considered essential for an effective crystallography dictionary. It is therefore proposed to base the public Online Dictionary service on this software platform.
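
The layered access rights and the editorial freeze mentioned above can be pictured as an ordered set of roles, each inheriting the rights of the one below it. The sketch below is a conceptual model only: it is not mediawiki's configuration or API, and the action names and the mapping of actions to roles are assumptions chosen for illustration.

```python
from enum import IntEnum


class Role(IntEnum):
    """The four user classes named in the text, ordered from least to most privileged."""

    READER = 0
    AUTHOR = 1
    EDITOR = 2
    SYSADMIN = 3


# Assumed minimum role per action; not taken from the pilot's actual settings.
MINIMUM_ROLE = {
    "read": Role.READER,
    "edit": Role.AUTHOR,
    "freeze": Role.EDITOR,
    "configure": Role.SYSADMIN,
}


def may(role, action, frozen=False):
    """Return True if `role` may perform `action` on a page.

    A frozen page can only be edited by an editor or above (an assumption).
    """
    if action == "edit" and frozen:
        return role >= Role.EDITOR
    return role >= MINIMUM_ROLE[action]


print(may(Role.AUTHOR, "edit"))               # author editing an open page
print(may(Role.AUTHOR, "edit", frozen=True))  # author editing a frozen definition
```

Ordering the roles numerically keeps the model extensible: a new class of user can be slotted in by giving it a rank, without rewriting the checks.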

APPENDIX: Membership of the Working Group

The initial membership of the Working Group established in Florence consisted of:

  • Andre Authier (Chair)
  • John Helliwell
  • Bill Clegg
  • Paola Spadon
  • I. David Brown
  • Brian McMahon

Giovanni Ferraris (Chair of the Commission on Inorganic and Mineral Structures), Massimo Nespolo (Chair of the Commission on Mathematical and Theoretical Crystallography) and Peter Strickland (Managing Editor of IUCr publications, as observer) subsequently joined the group. Howard Flack also provided sample entries and useful feedback.