The CEDAR project is in close alignment with several biomedical data interoperability projects, and will interact with these projects throughout CEDAR's performance period.
The partner projects on the CEDAR team are run by our partners at Oxford University, Northrop Grumman, Yale University, and Stanford University. In addition, we are closely connected to our collaborators within the Center for Biomedical Informatics Research (BMIR) at Stanford University's School of Medicine, and we will integrate many CEDAR elements with existing BMIR tools such as NCBO's BioPortal and Protégé/WebProtégé.
Partner Projects on the CEDAR Team
BioSharing, ISA Tools, and Nature Publishing Group’s Scientific Data
The BioSharing initiative works with groups of biomedical scientists, service providers, standards communities, and journals to define, curate, and register community standards. Biosharing.org provides a repository of dozens of community-endorsed metadata templates and data-sharing guidelines. We will enhance this content by defining relationships between relevant standards and representing them in machine-readable form.
We rely on the Oxford group’s experience with ISA Tools, a metadata framework built around the Investigation–Study–Assay representation. Recently, the Nature Publishing Group announced plans to publish Scientific Data submissions online along with the corresponding metadata in ISA-Tab format. These metadata will provide an additional source of information for CEDAR. We will use the ISA-Tab format to export experimental metadata for use by journal publishers and other repository developers. We will also use the ISA-Tab models as a key input to the general CEDAR study model, which will serve as a basis for metadata descriptions of studies.
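As a rough illustration of the tab-delimited export described above, the following sketch writes a few sample records as a simplified study table. The column names follow the general ISA-Tab pattern (Source Name, Characteristics[...], Sample Name), but the records, the function name, and the reduced column set are assumptions made for illustration only. A complete ISA-Tab archive also contains investigation and assay files, which are not shown here.

```python
import csv
import io


def export_study_table(samples):
    """Serialize sample records as a simplified, ISA-Tab-style study table."""
    # Hypothetical, reduced column set; real study files carry more columns.
    columns = ["Source Name", "Characteristics[organism]", "Sample Name"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for sample in samples:
        writer.writerow([sample["source"], sample["organism"], sample["name"]])
    return buffer.getvalue()


# Invented example records, not taken from any real study.
samples = [
    {"source": "donor-1", "organism": "Homo sapiens", "name": "sample-1"},
    {"source": "donor-2", "organism": "Homo sapiens", "name": "sample-2"},
]
study_table = export_study_table(samples)
print(study_table)
```

Keeping the export as a plain function over simple records makes it easy to swap in a fuller column set once the general study model is defined.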
ImmPort
ImmPort is a major data repository managed by a large group at Northrop Grumman Corporation (NGC) in collaboration with investigators at Stanford University. The ImmPort team is working constantly to improve the metadata for its datasets. Currently, they rely on metadata templates curated by the Human Immunology Project Consortium (HIPC) Data Standards committee, headquartered at Yale, and the BioSharing initiative.
We will rely on ImmPort as a provider of knowledge about metadata workflow and organization, given their extensive experience developing these pipelines. The ImmPort metadata models will be used, in combination with the ISA models above, to establish a new CEDAR general study model. The ImmPort metadata will then be tested against the new model, and CEDAR will evaluate its metadata creation processes and results in the context of the ImmPort users and workflows.
Human Immunology Project Consortium (HIPC)
HIPC was established in 2010 by the NIAID Division of Allergy, Immunology, and Transplantation as part of the overall NIAID focus on human immunology. Through this program, well-characterized human cohorts are studied using a variety of modern analytic tools. The information gained from the HIPC program will provide a comprehensive understanding of the human immune system and its regulation, with many novel applications in human populations.
The HIPC program is creating centralized research resources and a comprehensive, centralized database for use by the scientific community. This knowledge base will also serve as a foundation for the future study of immune-mediated diseases in humans, such as allergy, asthma, transplant rejection, autoimmune diseases, and a variety of inflammatory diseases.
Stanford Digital Repository
The Stanford Digital Repository (SDR) is a rapidly growing archive that includes experimental data contributed by Stanford faculty members from all departments of the university. The repository stores not only experimental data generated through faculty research, but also thousands of scanned books, dissertations, musical recordings, images, and videos.
We plan to work with the SDR to make our technology available to all faculty members at Stanford who wish to annotate and archive their data, and to evaluate the value of the CEDAR approaches. We will train the SDR team in our technology and work with them as they apply CEDAR components. We will also monitor the use of CEDAR technology within SDR by Stanford faculty and staff to assess the usability of our approach within this new framework.
Partner Biomedical Informatics Research Division Projects
CEDAR relies on the close collaboration of these partners in the Biomedical Informatics Research Division of Stanford's School of Medicine (the same division that leads CEDAR).
The National Center for Biomedical Ontology (NCBO)
The NCBO manages a repository of all the world’s publicly available biomedical ontologies and terminologies—now more than 390 in number. The NCBO BioPortal resource makes these ontologies and terminologies available via a Web browser and Web services through a common interface. The NCBO Annotator service takes as input natural-language text and returns as output ontology terms to which the text refers.
Our project relies on the BioPortal ontology repository and the NCBO Annotator—both of which are hardened and well-tested software systems.
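The text-in, terms-out behavior of the Annotator can be sketched as follows. The endpoint URL, the query parameters (text, apikey, ontologies), and the response fields (annotatedClass, annotations) are assumptions based on the public BioPortal REST API documentation and should be verified against the current service. The sample reply is hand-written so that the sketch runs without network access; it is not real service output.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint of the NCBO Annotator REST service; verify before use.
ANNOTATOR_URL = "https://data.bioontology.org/annotator"


def build_annotator_url(text, api_key, ontologies=None):
    """Build the request URL for annotating a piece of free text."""
    params = {"text": text, "apikey": api_key}
    if ontologies:
        # Optionally restrict matching to a list of ontology acronyms.
        params["ontologies"] = ",".join(ontologies)
    return ANNOTATOR_URL + "?" + urlencode(params)


def extract_terms(response_body):
    """Return (class IRI, matched text) pairs from an Annotator-style JSON reply."""
    terms = []
    for result in json.loads(response_body):
        class_iri = result["annotatedClass"]["@id"]
        for annotation in result.get("annotations", []):
            terms.append((class_iri, annotation["text"]))
    return terms


# Hand-written sample reply (placeholder IRI), used so the sketch runs offline.
sample_reply = json.dumps([
    {
        "annotatedClass": {"@id": "http://example.org/ontology/Melanoma"},
        "annotations": [
            {"from": 1, "to": 8, "matchType": "PREF", "text": "MELANOMA"}
        ],
    }
])

url = build_annotator_url("Melanoma is a skin cancer", "YOUR_API_KEY", ["NCIT"])
terms = extract_terms(sample_reply)
print(url)
print(terms)
```

In practice, the URL would be fetched with an HTTP client and a valid API key, and the reply passed to extract_terms.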
Protégé
Protégé is the most widely used ontology-development system in the world. The most recent generation of Protégé software runs on the Web, and has been widely adopted in the biomedical community. The WHO, for example, is using Protégé to build and maintain ICD-11 as well as other international terminologies. Much of our work to ease the filling out of metadata templates goes well beyond the current capabilities of the Protégé system, but CEDAR will gain a major advantage by building on this solid foundation.
An important feature of Protégé is the ability to take as input the classes in ontologies and to generate as output a graphical user interface for acquiring instances of those classes. We will model our project on these capabilities directly. Users will be able to select metadata templates (such as MIAME) from an online library (an extension to the NCBO BioPortal). The CEDAR technology will automatically create the Web-based interface for filling out the template with experimental metadata.
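The idea of generating an entry form from a template can be sketched as follows. The template structure, the field names, and the render_form function are hypothetical and do not reflect the actual CEDAR or Protégé data models; the sketch only shows how a declarative list of fields can drive a Web form.

```python
from html import escape

# Hypothetical template: a name plus a list of field descriptions.
template = {
    "name": "Example experiment template",
    "fields": [
        {"id": "organism", "label": "Organism", "required": True},
        {"id": "array_design", "label": "Array design", "required": False},
    ],
}


def render_form(template):
    """Render an HTML form with one text input per template field."""
    lines = ["<form>", f"  <h2>{escape(template['name'])}</h2>"]
    for field in template["fields"]:
        field_id = escape(field["id"], quote=True)
        required = " required" if field["required"] else ""
        lines.append(f'  <label for="{field_id}">{escape(field["label"])}</label>')
        lines.append(
            f'  <input type="text" id="{field_id}" name="{field_id}"{required}>'
        )
    lines.append("</form>")
    return "\n".join(lines)


form_html = render_form(template)
print(form_html)
```

Because the form is derived entirely from the template, adding a field to the template changes the interface without any change to the rendering code.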