Faster and Better Metadata Authoring using CEDAR's Value Recommendations

Presenting Author Information 

Name

Marcos Martinez-Romero

Institution

Stanford University

BD2K Grant Number

U54 AI117925

PI

Mark A. Musen

Email

marcosmr@stanford.edu

Phone Number

6504228878

Additional Author Information 

Names and affiliations of additional authors

(one per line)

Martin J. O’Connor, Stanford University
Maryam Panahiazar, Stanford University
Debra Willrett, Stanford University
Attila L. Egyedi, Stanford University
John Graybeal, Stanford University
Mark A. Musen, Stanford University

Is there an additional contact person?

Yes

Name of additional contact

Martin O’Connor

Email address of additional contact

martin.oconnor@stanford.edu

Additional information 

Please choose the topic that best fits your abstract (posters will be grouped according to your selection). Detailed session descriptions can be found in the Abstract Guidelines.

Software, Analysis, & Methods Development

Please consider my abstract for a (See Presentation Guidelines)

Demo only (includes poster, power, table)

Abstract Information

Poster presentations may be submitted electronically in order to reach a wider audience and be available after the All Hands Meeting. Do you plan to submit your poster as a digital submission in addition to bringing a physical copy?

Yes

Abstract Title

Faster and Better Metadata Authoring using CEDAR's Value Recommendations

Abstract Description

In biomedicine, good metadata is crucial for finding experimental datasets, understanding how experiments were performed, and reusing data to conduct new analyses. Despite the growing number of efforts to define guidelines and standards for describing biomedical experiments, the impediments to creating accurate, complete, and consistent metadata are still considerable. Authoring good metadata is a tedious and time-consuming task that biomedical scientists tend to avoid.

The Center for Expanded Data Annotation and Retrieval (CEDAR) is developing novel methods and tools to simplify the process by which investigators annotate their experimental data with metadata. The CEDAR Workbench (cedar.metadatacenter.net) is a set of Web-based tools for the acquisition, storage, search, and reuse of metadata templates. As a step towards decreasing authoring time while increasing metadata quality, we have enhanced the CEDAR Workbench with value recommendation capabilities. 

Our system identifies common patterns in the CEDAR metadata repository, and generates real-time suggestions for filling out metadata acquisition forms. These suggestions are context-sensitive, meaning that the values predicted for a particular field are generated and ranked based on previously entered values. Our value recommendation approach supports both free-text values and terms from ontologies and controlled terminologies. We discuss some of the challenges that have arisen while implementing our approach, and our strategies for making this capability useful to the end users of CEDAR. We demonstrate CEDAR's intelligent authoring capabilities using metadata from the Gene Expression Omnibus (GEO), and show how the technology that we are developing leverages existing metadata to make the authoring of high-quality metadata a manageable task.
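The idea of context-sensitive suggestions can be illustrated with a minimal sketch. It is not CEDAR's actual implementation: the function name `recommend_values`, the flat dictionary record format, and the sample records are all assumptions made for illustration. The sketch ranks candidate values for a target field by how often they occur among stored records that agree with the values already entered in the form.

```python
from collections import Counter

def recommend_values(records, target_field, context, top_n=3):
    """Rank candidate values for target_field (hypothetical helper).

    records: list of dicts, each a previously stored metadata record.
    context: dict of field -> value already entered in the current form.
    Returns up to top_n (value, score) pairs, where score is the share of
    context-matching records that contain that value.
    """
    # Keep only records that have the target field and agree with
    # every value the user has already entered.
    matching = [
        r for r in records
        if target_field in r
        and all(r.get(field) == value for field, value in context.items())
    ]
    counts = Counter(r[target_field] for r in matching)
    total = sum(counts.values())
    # most_common() is empty when nothing matches, so no division by zero.
    return [(value, count / total) for value, count in counts.most_common(top_n)]

# Made-up sample records, loosely styled after GEO sample metadata.
records = [
    {"organism": "Homo sapiens", "tissue": "liver", "disease": "hepatitis"},
    {"organism": "Homo sapiens", "tissue": "liver", "disease": "hepatitis"},
    {"organism": "Homo sapiens", "tissue": "liver", "disease": "cirrhosis"},
    {"organism": "Homo sapiens", "tissue": "lung", "disease": "asthma"},
    {"organism": "Mus musculus", "tissue": "liver", "disease": "cirrhosis"},
]

suggestions = recommend_values(
    records, "disease", {"organism": "Homo sapiens", "tissue": "liver"}
)
for value, score in suggestions:
    print(f"{value}: {score:.2f}")
```

With these sample records, a user who has already entered "Homo sapiens" and "liver" would see "hepatitis" ranked above "cirrhosis" for the disease field. A production system would also need to handle ontology terms, near-duplicate free-text values, and very large repositories, none of which this sketch attempts.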

 

Release Date: 
November 29, 2016
Blurb: 
CEDAR's Value Recommendations improve metadata entry (BD2K AHM poster)
Author List: 
Martin J. O’Connor, Maryam Panahiazar, Debra Willrett, Attila L. Egyedi, John Graybeal, Dr. Mark A. Musen
Artifact Type: 
Last Updated: 
Nov 23 2016 - 11:52am