CRUX Project

A CRowdsourced data infrastructure to curate, discover, and recommend Unpublished XRD data and analytical results.

Project Summary

This project, CRUX, is a crowdsourced data infrastructure to curate, discover, share, and recommend unpublished X-ray diffraction (XRD) data and analytical results. CRUX promotes underutilized, high-quality materials science data by allowing the sharing and exploration of unpublished datasets with state-of-the-art knowledge harvesting and machine learning (ML) techniques. CRUX provides a materials knowledge graph (KG) model, automatic data integration, and an exploratory query engine that supports "Why" and "What-if" analysis for XRD data. CRUX will offer an open, collaborative, and sustainable platform that facilitates the exchange of unpublished XRD data, unlocks new research problems (e.g., prediction of materials compositions with multi-phase data), and inspires novel designs of ML pipelines for data-driven materials science. CRUX will make materials data resources available to a broad community including materials scientists, data analysts, developers, and the general public.



CRUX-KB (Fraction)

The processed model-data bipartite graphs for Peak Finding (PKZoo), Image Classification (KIZoo), and Text Classification (HFZoo) are available on Hugging Face. PKZoo (below) includes 195,089 nodes of 9 types and 493,561 edges of 13 types.



CRUX-Onto

The Factual Knowledge layer is merged with the Materials Design Ontology. Visualized by WebVOWL.



CRUX-Q

Query Example:

"Find the datasets for sample 'CaCO3-TiO2' provided by 'NC-State', a task with the name 'peak_finding', and the models implemented with the library 'peakutils'."

Visualized Result (fraction):

Visualized by Neo4j.
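
The query example above could be expressed as a parameterized Cypher query built from Python. This is only a minimal sketch: the node labels (Dataset, Sample, Provider, Task, Model, Library), the relationship types, and the `name` property are assumptions made for illustration and may not match the actual CRUX-KB schema; `build_example_query` is a hypothetical helper, not part of CRUX.

```python
from typing import Any, Dict, Tuple


def build_example_query(
    sample: str, provider: str, task: str, library: str
) -> Tuple[str, Dict[str, Any]]:
    """Return a Cypher query string and its parameters.

    NOTE: every label, relationship type, and property name below is a
    hypothetical placeholder, not the real CRUX-KB schema.
    """
    query = (
        # Datasets measured on the given sample and contributed by the provider.
        "MATCH (d:Dataset)-[:OF_SAMPLE]->(:Sample {name: $sample}),\n"
        "      (d)-[:PROVIDED_BY]->(:Provider {name: $provider}),\n"
        # The task that was run on those datasets.
        "      (t:Task {name: $task})-[:USES_DATASET]->(d),\n"
        # Models that address the task and are implemented with the library.
        "      (m:Model)-[:SOLVES]->(t),\n"
        "      (m)-[:IMPLEMENTED_WITH]->(:Library {name: $library})\n"
        "RETURN d, t, m"
    )
    # Values are passed as parameters rather than spliced into the query text.
    params = {
        "sample": sample,
        "provider": provider,
        "task": task,
        "library": library,
    }
    return query, params


query, params = build_example_query(
    sample="CaCO3-TiO2",
    provider="NC-State",
    task="peak_finding",
    library="peakutils",
)
print(query)
```

Under these assumptions, the returned query and parameters could be handed to a Neo4j client, for example through the `run` method of a session in the official Neo4j Python driver. Keeping the values as parameters lets the same pattern be reused for other samples, providers, tasks, or libraries.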