AnIML Workshop "Tools for Applying the Analytical Information Markup Language (AnIML)"
at PittCon 2008, March 4 2008 in New Orleans, LA
Abstract: ASTM Subcommittee E13.15 on Analytical Data, in collaboration with the IUPAC Subcommittee on Electronic Data Standards, is developing AnIML to provide mechanisms for representing, organizing, and storing analytical result data and metadata from any analytical technique. Raw, processed, annotated, simulated, and, in some cases, legacy data can be interchanged and archived in AnIML format. Applying software tools to AnIML datasets demonstrates additional power and utility of the AnIML approach. Because it is based on XML (Extensible Markup Language), AnIML data can be readily manipulated, converted to other formats, and examined using standard XML tools. Since metadata are included with result data, generic applications can be created that can visualize the data in any AnIML file. The flexible design of AnIML precludes use of the simple validation procedures commonly built into browsers and parsers; however, a custom scheme is being developed that provides for much more sophisticated validation of AnIML files. Digital signature, verification, and data tracking tools can ensure the data and document integrity necessary for the use of AnIML in regulated industries. Format conversion tools can transform legacy data, such as ANDI and JCAMP-DX datasets, into AnIML format. This Workshop will examine how relatively simple software tools can enhance the utility of AnIML-formatted data beyond data interchange and archiving while improving the usefulness and availability of analytical result data. Additional information may be found on the AnIML website: animl.sourceforge.net.
- 1:00 pm Introductory Remarks - Gary W Kramer
- 1:35 pm Fundamentals of the Analytical Information Markup Language (AnIML) BURKHARD A SCHAEFER, BSSN Software (Paper 1390-1)
- 2:00 pm How to Export Legacy Chromatography Data to AnIML DALE O'NEILL, Agilent Technologies (Paper 1390-2)
- 2:25 pm Validating AnIML Data Files GARY W KRAMER, NIST (Paper 1390-3)
- 3:05 pm Using AnIML as a Data Archiving Format in Academia STUART J CHALK, University of North Florida (Paper 1390-4)
- 3:30 pm Using AnIML in Regulated Environments MAREN FIEGE, Waters GmbH (Paper 1390-5)
- 3:55 pm Open Source Software for AnIML JAMIE MCQUAY, Scimatic Software (Paper 1390-6)
- 4:20 pm Native AnIML LCMS Processing and Viewing MARK F BEAN, GSK (Paper 1390-7)
Paper 1390-1: FUNDAMENTALS OF THE ANALYTICAL INFORMATION MARKUP LANGUAGE (ANIML)
Burkhard A. Schaefer, BSSN Software, Postfach 411145, Mainz 55068, Germany
The Analytical Information Markup Language (AnIML) is a standardization effort of the E13.15 Sub-Committee of the American Society for Testing and Materials (ASTM). AnIML defines an XML-based format for documentation of laboratory experiments and their results. It is suitable for many different analytical measurement techniques.
To achieve this, AnIML provides a generic data container that permits the storage of arbitrary analytical data. The concept of Technique Definitions permits the formal specification of constraints for using this data container. This way, a definition can prescribe how the data for specific measurement techniques should be captured in the data file.
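The relationship between a generic data container and a Technique Definition can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the series names, the units, and the dictionary layout are hypothetical and do not reproduce the actual AnIML schema or any real technique definition. The sketch only shows the idea that the container accepts arbitrary data while a separate definition states what a given technique expects to find in it.

```python
from dataclasses import dataclass, field


@dataclass
class SeriesSpec:
    """One constraint in a hypothetical technique definition."""
    name: str
    unit: str
    required: bool = True


@dataclass
class TechniqueDefinition:
    """Hypothetical stand-in for a technique definition: a named list of constraints."""
    name: str
    series: list = field(default_factory=list)

    def check(self, container):
        """Return a list of problems found in a generic container (empty if none)."""
        problems = []
        for spec in self.series:
            entry = container.get(spec.name)
            if entry is None:
                if spec.required:
                    problems.append(f"missing required series '{spec.name}'")
                continue
            if entry.get("unit") != spec.unit:
                problems.append(
                    f"series '{spec.name}' has unit {entry.get('unit')!r}, "
                    f"expected {spec.unit!r}"
                )
        return problems


# Illustrative definition: the series names and units are assumptions.
uv_vis = TechniqueDefinition(
    "UV/Vis",
    [SeriesSpec("Wavelength", "nm"), SeriesSpec("Absorbance", "AU")],
)

# The generic container itself places no restriction on what is stored.
good = {
    "Wavelength": {"unit": "nm", "values": [200.0, 201.0]},
    "Absorbance": {"unit": "AU", "values": [0.12, 0.15]},
}
bad = {"Wavelength": {"unit": "cm", "values": [200.0, 201.0]}}

print(uv_vis.check(good))
print(uv_vis.check(bad))
```

Keeping the constraints outside the container is what lets the same storage structure serve many techniques: supporting a new technique means writing a new definition, not changing the container.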
In the first part, this paper will provide an introduction to the AnIML architecture and highlight some of its design fundamentals.
The second part describes the AnIML features that facilitate the design of generic tools to manipulate AnIML documents. Even though AnIML was designed with human-readability in mind, it makes more sense to create, modify and view AnIML documents using appropriate software tools. That way, the data can be presented in a fashion that is more convenient to the user (who is typically a domain expert, rather than a computer programmer). We demonstrate what such tools could look like and how they can be integrated into existing laboratory software environments.
(Presentation not currently available)
Paper 1390-2: HOW TO EXPORT LEGACY CHROMATOGRAPHY DATA TO ANIML
Dale O'Neill, Agilent Technologies, 6612 Owens Dr., Pleasanton, CA 94588
According to a report by UC Berkeley's School of Information Management and Systems, more data will be created over the next several years than in the previous 300 years combined. These data are held in various proprietary formats, some well structured and others unstructured; some are stored on disk and others in databases. Because of this mass of data and the many proprietary formats, regulations are now surfacing that require vendors to keep their data retrievable for the next 10 to 30 years. Some SOPs require the data to remain accessible for up to 100 years. This has created many headaches in the industry. With many vendors continuing to create new applications and new data formats while not giving adequate support to old applications and old data formats, the problem will only continue to grow.
To help solve this problem, the Analytical Information Markup Language (AnIML) has been developed. AnIML is an XML standard for analytical chemistry data. It is a collaborative effort among many groups and individuals and is sanctioned by ASTM. The main idea behind AnIML is to create a standard format for interchanging and archiving data, independent of the measurement technique that was used.
In this presentation I will show how chromatography data is represented in AnIML: where the data fits into the AnIML core, and how viewers can be confident of finding and displaying the most pertinent information. I will also discuss archiving and exporting legacy data to AnIML.
Paper 1390-3: VALIDATING ANIML DATA FILES
Gary W. Kramer, National Institute of Standards and Technology, 100 Bureau Drive, Bldg. 227; Room A-163, Gaithersburg, MD 20899-8312
The Analytical Information Markup Language (AnIML) is being created to facilitate the interchange and archiving of analytical chemistry data by creating structured data containers using the eXtensible Markup Language (XML). Content document structure is specified by the general AnIML schema, the AnIML technique schema, and methodology-specific technique definition files. The use of the technique definition files affords AnIML great flexibility and extensibility. Validation is the process of ensuring that an AnIML data document conforms to sets of requirements promulgated by an authority. At the lowest level are the XML syntax rules detailed by the World Wide Web Consortium (W3C) for XML document well-formedness. XML validators or validating parsers check an XML document against the structure and rules called out in the associated content model document: the Document Type Definition (DTD) or XML schema.
Because AnIML incorporates technique-definition documents to stylize data to the customary representations expected in a given analytical genre and further allows for extensions to tailor data representations to the needs of organizations and individuals, AnIML data documents will require custom validation tools. The drawback of having to create such tools is offset by the opportunity to increase the scope of the checking tools to include semantic validation of the data-element contents themselves. The base level for AnIML data files will be validation to the ASTM AnIML standards requirements. However, customized verification tools offer the possibility of incorporating imperatives from higher authorities, such as manufacturing quality assurance provisions or regulatory requirements, into the validation scheme.
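The layering described above, syntax checking first and semantic checking of element contents afterwards, can be sketched with the Python standard library. This is a simplified illustration under stated assumptions, not the AnIML validation scheme itself: the element names `Document`, `Series`, and `Value`, the `name` and `unit` attributes, and the rule that wavelengths must be positive numbers are all hypothetical. Schema validation against the actual AnIML schema is left out because the standard library parser does not perform it.

```python
import xml.etree.ElementTree as ET


def check_document(xml_text):
    """Return a list of problems; an empty list means both levels passed."""
    # Level 1: well-formedness. The parser rejects malformed XML outright.
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    # Level 2: semantic checks on the data-element contents (hypothetical rules).
    problems = []
    for series in root.iter("Series"):
        name = series.get("name", "(unnamed)")
        for value in series.findall("Value"):
            text = (value.text or "").strip()
            try:
                number = float(text)
            except ValueError:
                problems.append(f"{name}: '{text}' is not a number")
                continue
            if name == "Wavelength" and number <= 0:
                problems.append(f"{name}: {number} must be positive")
    return problems


GOOD = """<Document>
  <Series name="Wavelength" unit="nm"><Value>200.0</Value><Value>201.0</Value></Series>
</Document>"""

BAD_SYNTAX = "<Document><Series></Document>"

BAD_CONTENT = """<Document>
  <Series name="Wavelength" unit="nm"><Value>-5</Value><Value>abc</Value></Series>
</Document>"""

for label, doc in [("good", GOOD), ("bad syntax", BAD_SYNTAX), ("bad content", BAD_CONTENT)]:
    print(label, check_document(doc))
```

Rules from higher authorities, such as quality assurance provisions, could be added as further checks in the second level without touching the syntax check.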
Paper 1390-4: USING ANIML AS A DATA ARCHIVING FORMAT IN ACADEMIA
Stuart J. Chalk, University of North Florida, Department of Chemistry and Physics, 1 UNF Drive, Jacksonville, FL 32224
A discussion of how the AnIML specification can be used in the context of performing academic research will be presented. Focus areas will include mechanisms for getting data into AnIML format, how to add metadata to AnIML files, and tools to visualize, process and disseminate data stored in AnIML files. Finally, the issues of data integrity, security and intellectual property will be addressed.
Paper 1390-5: USING ANIML IN REGULATED ENVIRONMENTS
Maren Fiege, Waters GmbH, Europaallee 27-29, Frechen, NRW 50226, Germany
Scientists worldwide are moving from paper-based processes to electronic data. While the advantages of electronic data are clear, there are also some new challenges to ensure data longevity and validity. These include the need to comply with regulations such as 21 CFR Part 11 and GMP as well as protection of intellectual property.
This talk will outline how the ASTM AnIML data standards will help scientists meet these needs by providing a secure, compliant, and long-term stable format.
Paper 1390-6: OPEN SOURCE SOFTWARE FOR ANIML
Jamie McQuay, Scimatic Software, 436-550 Front St. West, Toronto, Ontario M5V3N5, Canada
The ASTM Subcommittee E13.15 is currently developing the Analytical Information Markup Language (AnIML). This standard will provide a defined data format, making the contents of a file accessible to any AnIML-aware software application.
Software tools must be provided to assist in the acceptance of a new data standard by the scientific community. These tools must demonstrate what is possible with the new file format as well as provide functionality the user will need for the adoption of the AnIML standard.
The AnIML Tools project is an open source effort that aims to provide the community with the tools that are required for AnIML-formatted data. These tools, written in Microsoft C# by Scimatic Software, are currently being developed alongside the AnIML standardization process. The first release of these tools will be an application that allows the user to open, graph and validate AnIML-formatted files. The core functionality of this viewer is developed as a separate component to allow easy movement of this functionality into other AnIML projects.
This talk will give an overview and demonstration of the current open source tools that are available from the AnIML Tools project. Future development for this project will also be discussed along with a request for more community involvement.
Paper 1390-7: NATIVE ANIML LCMS PROCESSING AND VIEWING
Mark F. Bean, GSK, Up12-210, 1250 South Collegeville Rd, Collegeville, PA 19426
Adventures in representing LCMS data in XML will be recounted by an experienced C#.Net developer and LCMS practitioner, active in the ASTM working party since its inception. Last year we showed a sophisticated, vendor-independent LCMS data viewer using AnIML converted from Microsoft typed datasets. This year, drawing on progress in the realization of LC, UV, and MS technique definitions for AnIML, we will explore using AnIML as the native file format. This will include, if possible, conversion of entire LCMS data files, creation of LCMS reports, and representation of graphical results in AnIML. The talk will include discussion of various AnIML XML files, as well as a dynamic vendor-independent LCMS data viewer capable of re-extracting spectra or extracted ion chromatograms on demand from the original files. We will discuss challenges in the creation and parsing of these XML files.
(Presentation not currently available)