Autonomous Material Discovery Agent
Aetheria is an agentic AI system at the forefront of scientific discovery, designed to autonomously propose, simulate, and refine novel material compositions with desired characteristics.
Project's Core Objective
Our primary mission is to develop an agentic AI system that autonomously proposes novel material compositions and structures, predicts their properties through computational simulation, and iteratively refines its proposals until it finds materials with the desired characteristics.
Target Material Focus
Novel Stable Binary Inorganic Compound
This project will initially focus on discovering binary inorganic compounds, that is, compounds made of two distinct elements, and will keep only candidates predicted to be stable enough for practical application.
Desired Property
Bandgap of ~1.5 eV
The bandgap is a key semiconductor property. A value near 1.5 eV corresponds to an absorption edge around 830 nm, in the near infrared, which makes it highly relevant for applications in photovoltaics and optoelectronics.
Project Context & Landscape
Understanding the current state of agentic AI for scientific discovery, and the challenges it faces, is crucial for shaping our approach. This section provides insights into that evolving landscape.
Core Agentic Components
Aetheria's autonomous capabilities are built upon a foundation of interconnected agentic components, each with a specific role in the discovery process.
Perception (Information Gathering)
This component focuses on enabling the agent to "read" and understand diverse sources of scientific information. It's about how Aetheria gathers the necessary data to inform its hypotheses and decisions.
Scientific Literature: Agents need to parse and comprehend scientific papers (abstracts, methods, results).
Approach: Use open-source LLMs (for example, fine-tuned Llama 3, Mistral, or Gemma) with retrieval-augmented generation (RAG) over publicly available scientific articles (arXiv, PubMed Central) to enable "reading."
Material Databases: Accessing existing material properties and structures.
Approach: Use open-access databases such as the Materials Project, the Open Quantum Materials Database (OQMD), or the Crystallography Open Database (COD) via their APIs or downloadable datasets.
Simulation Results: Parsing outputs from computational chemistry/materials simulations.
Approach: Define a standardized format (e.g., JSON, YAML) for simulation outputs that the agent can reliably parse.
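As an illustration of the standardized-format idea, here is a minimal sketch of a JSON record and a strict parser for it. The field names (formula, bandgap_ev, energy_above_hull_ev) and the placeholder formula "AB2" are assumptions made for this example, not a schema defined by the project.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationResult:
    """One parsed simulation record (field names are hypothetical)."""
    formula: str
    bandgap_ev: float
    energy_above_hull_ev: float


# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {
    "formula": str,
    "bandgap_ev": (int, float),
    "energy_above_hull_ev": (int, float),
}


def parse_simulation_output(text: str) -> SimulationResult:
    """Parse one JSON record, failing loudly on missing or mistyped fields."""
    record = json.loads(text)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if name not in record:
            raise ValueError(f"missing field: {name}")
        value = record[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"bad type for field: {name}")
    return SimulationResult(
        formula=record["formula"],
        bandgap_ev=float(record["bandgap_ev"]),
        energy_above_hull_ev=float(record["energy_above_hull_ev"]),
    )


# "AB2" is a placeholder formula, not a real material.
example = '{"formula": "AB2", "bandgap_ev": 1.48, "energy_above_hull_ev": 0.0}'
print(parse_simulation_output(example))
```

Rejecting malformed records at the parsing boundary keeps bad simulation output from silently reaching the agent's decision logic.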
Simulated Discovery Progress
This conceptual chart illustrates how Aetheria's discovery process might iteratively approach the target bandgap. Each point represents a material hypothesis and its simulated outcome.
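The iterative approach toward the target bandgap can be sketched as a propose, simulate, and refine loop. Everything below is a toy under stated assumptions: mock_simulate_bandgap is an invented formula with no physical meaning, the single parameter x stands in for a real composition or structure, and the perturbation search is a simple stand-in rather than Aetheria's actual strategy.

```python
import random

TARGET_BANDGAP_EV = 1.5


def mock_simulate_bandgap(x: float) -> float:
    """Toy stand-in for a simulation: maps x in [0, 1] to a bandgap in eV.

    The formula is invented for illustration and has no physical meaning.
    """
    return 0.5 + 2.5 * x * x


def discover(steps: int = 50, seed: int = 0, tolerance_ev: float = 0.01):
    """Propose, simulate, and keep the candidate closest to the target."""
    rng = random.Random(seed)
    best_x = rng.random()
    best_gap = mock_simulate_bandgap(best_x)
    history = [(best_x, best_gap)]  # every hypothesis and its simulated outcome
    step_size = 0.25
    for _ in range(steps):
        if abs(best_gap - TARGET_BANDGAP_EV) <= tolerance_ev:
            break
        # Propose a perturbed candidate, clamped to the valid range.
        candidate = min(1.0, max(0.0, best_x + rng.uniform(-step_size, step_size)))
        gap = mock_simulate_bandgap(candidate)
        history.append((candidate, gap))
        if abs(gap - TARGET_BANDGAP_EV) < abs(best_gap - TARGET_BANDGAP_EV):
            best_x, best_gap = candidate, gap
        else:
            step_size *= 0.9  # narrow the search after an unsuccessful proposal
    return best_x, best_gap, history


best_x, best_gap, history = discover()
print(f"{len(history)} hypotheses tried; best bandgap {best_gap:.3f} eV")
```

Each entry in history corresponds to one point of the kind the chart describes: a hypothesis paired with its simulated bandgap.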
Project Roadmap: Next Steps
Building a robust Autonomous Material Discovery Agent is an evolving journey. Here are the key next steps to build out this prototype.
Define Narrow Scope for Prototype
Specify the precise material types (e.g., simple binary oxides) and the exact properties to discover (e.g., only bandgap and stability). Initially, stand in for expensive computations with mock or surrogate models rather than running full density functional theory (DFT) calculations.
Choose Core Tools/Frameworks
Finalize the choices for agent orchestration (AutoGen), LLM integration (Ollama), data handling (pandas, NumPy, pymatgen), machine learning (scikit-learn), and a basic database (SQLite).
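To make the SQLite choice concrete, here is a minimal sketch of a candidate store using Python's built-in sqlite3 module. The table name, column names, and helper functions are hypothetical, chosen only for this example, and the formulas are placeholders rather than real materials.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the candidate store; ':memory:' keeps it in RAM."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS candidates (
            id INTEGER PRIMARY KEY,
            formula TEXT NOT NULL,
            bandgap_ev REAL NOT NULL,
            iteration INTEGER NOT NULL
        )
        """
    )
    return conn


def record_candidate(conn, formula: str, bandgap_ev: float, iteration: int) -> None:
    """Insert one simulated candidate inside a transaction."""
    with conn:
        conn.execute(
            "INSERT INTO candidates (formula, bandgap_ev, iteration) VALUES (?, ?, ?)",
            (formula, bandgap_ev, iteration),
        )


def closest_to_target(conn, target_ev: float = 1.5, limit: int = 3):
    """Return (formula, bandgap_ev) rows ordered by distance from the target."""
    cursor = conn.execute(
        "SELECT formula, bandgap_ev FROM candidates "
        "ORDER BY ABS(bandgap_ev - ?) LIMIT ?",
        (target_ev, limit),
    )
    return cursor.fetchall()


conn = open_store()
# Placeholder formulas and made-up bandgaps, for illustration only.
for i, (formula, gap) in enumerate([("X1Y1", 0.9), ("X1Y2", 1.55), ("X2Y1", 2.4)]):
    record_candidate(conn, formula, gap, i)
print(closest_to_target(conn, limit=2))
```

Keeping every hypothesis in a table, rather than only the current best, lets later iterations query the full search history.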
Start with Perception Module
Begin by enabling the agent to read from material databases (e.g., Materials Project API) and parse simple scientific paper abstracts for relevant keywords and data points.
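A first pass at abstract parsing could be as simple as a regular expression plus keyword spotting. This is a minimal sketch: the keyword list, the pattern, and the function name are assumptions for this example, and the sample abstract is made up. The heuristic only catches a bandgap value when it is the first number after the words "band gap" or "bandgap", so it will miss many real phrasings.

```python
import re

# Matches e.g. "band gap of 1.52 eV" or "bandgap near 2.1 eV".
# \D{0,40}? allows up to 40 non-digit characters before the number.
BANDGAP_PATTERN = re.compile(
    r"band\s*-?\s*gap\D{0,40}?(\d+(?:\.\d+)?)\s*eV",
    re.IGNORECASE,
)

# Hypothetical keyword list for the prototype's narrow scope.
KEYWORDS = ("binary", "oxide", "semiconductor", "photovoltaic")


def extract_abstract_facts(abstract: str) -> dict:
    """Pull candidate bandgap values (in eV) and known keywords from text."""
    lowered = abstract.lower()
    return {
        "bandgaps_ev": [float(m.group(1)) for m in BANDGAP_PATTERN.finditer(abstract)],
        "keywords": [k for k in KEYWORDS if k in lowered],
    }


# Made-up abstract text, for illustration only.
sample = (
    "We report a binary oxide semiconductor with a direct band gap of 1.52 eV. "
    "A second phase shows a bandgap near 2.1 eV."
)
print(extract_abstract_facts(sample))
```

An LLM-based reader could later replace the regular expression while keeping the same output shape.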
Iterative Development
Adopt an agile approach: build out each agentic component iteratively, then test and refine the agent's behavior and performance at each stage.