Materials Platform for Data Science
Evgeny Blokhin 1,2* and Pierre Villars 3
www.mpds.io
1 Materials Platform for Data Science, Sepapaja 6, 15551, Tallinn, Estonia
2 Tilde Materials Informatics, Straßmannstraße 25, 10249 Berlin, Germany
3 Materials Phases Data System, Unterschwanden 6, 6354, Vitznau, Switzerland
*Email: [email protected]
Rationale
Data-intensive applications place high demands on the quality of their starting data ("garbage in, garbage out"). Moreover, weak algorithms with more data usually beat better algorithms with less data. Since 1998, within the PAULING FILE project, we have been manually extracting, cross-checking, annotating, and categorizing materials data from publications, books, conference proceedings, etc. We now present all these data online at the Materials Platform for Data Science (www.mpds.io), which is also accessible through a REST programming interface.

Data types and formats
270 000 processed publications
65 000 phase diagrams
410 000 crystalline structures
800 000 physical property sets
150 000 distinct phases
Formats: DOI, BibTeX, JSON, CIF; CALPHAD and RDF are planned.

Graphical interface
Chemical elements: any of 100 elements (Al, Al-O-Fe, Ru-Am, etc.)
Chemical formula: any valid formula (Al2O3, O3Al2, O16Al7, etc.)
Crystalline classes: 7, in 20 spellings (cubic, cub, hex, etc.)
Physical properties (MPDS hierarchy): 2100 (Poisson ratio, figure of merit, phonons, etc.)
Materials classes (umbrella term): 4000 (binary, perovskites, ab initio, etc.)

Programming interface
API: individual researchers, software integrations, simulation platforms, analytic platforms, other databases, other applications.
In: { "elements": "Sr-Ti-O", "sgs": "Pm-3m", "classes": "perovskite, superconductor", "props": "structural properties" }
Out: { "error": null, "out": [ { "properties": "values" } ], "npages": 1, "page": 0, "count": 1 }
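The In/Out exchange of the programming interface can be illustrated with a minimal Python sketch. It is not the platform's client code. The endpoint URL, the `q` and `page` query parameters, and the helper names `build_request_url`, `parse_response`, and `fetch_all` are assumptions made for illustration; only the search fields and the response keys (`error`, `out`, `npages`, `page`, `count`) come from the poster. The transport is passed in as a function so that the sketch runs without network access.

```python
import json
from urllib.parse import urlencode

# Assumption: the poster does not state the endpoint; this URL is illustrative.
API_ENDPOINT = "https://api.mpds.io/v0/download/facet"


def build_request_url(criteria, page=0, endpoint=API_ENDPOINT):
    """Encode the search criteria as JSON in a `q` parameter (assumed layout)."""
    query = {"q": json.dumps(criteria, sort_keys=True), "page": page}
    return endpoint + "?" + urlencode(query)


def parse_response(text):
    """Decode the response envelope and return (records, npages, count)."""
    payload = json.loads(text)
    if payload.get("error") is not None:
        raise RuntimeError("API error: {}".format(payload["error"]))
    return payload["out"], payload["npages"], payload["count"]


def fetch_all(criteria, fetch_page):
    """Collect records from all pages.

    `fetch_page` is a caller-supplied function mapping a URL to the response
    text. Pages are taken to be zero-based, as "page": 0 in the Out example
    suggests.
    """
    records, page, npages = [], 0, 1
    while page < npages:
        url = build_request_url(criteria, page)
        out, npages, _count = parse_response(fetch_page(url))
        records.extend(out)
        page += 1
    return records


# Usage with the In/Out example, served from a canned string instead of HTTP.
criteria = {
    "elements": "Sr-Ti-O",
    "sgs": "Pm-3m",
    "classes": "perovskite, superconductor",
    "props": "structural properties",
}
sample = (
    '{"error": null, "out": [{"properties": "values"}], '
    '"npages": 1, "page": 0, "count": 1}'
)
records = fetch_all(criteria, lambda url: sample)
```

A real `fetch_page` would perform an HTTP GET and would probably have to supply access credentials; both details are outside what the poster specifies.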
Software architecture
Components shown: web browser, GUI application, web server, platform application, provider, database engine, database storage, API layers, external infrastructures, Twitter robot @mpdsio.

Entity relationship model
Relation types shown: has part; is fully determined by; is represented by; is related as XYZ.