Platform Architecture

Architecture and overall data workflow in Alhazen

The general architecture of the Alhazen toolset is based on the following high-level workflow:

Fig. 1: Overall system architecture of the Alhazen platform.

The system applies LLM-based technology to scientific knowledge gathered from resources available on the web. It uses a local database to store the downloaded information as Collections (named corpora), Expressions (references to scientific information entities), Items (copies of the information entities themselves), and Fragments (excerpts of the information entities, indexed back into the original Items). See the documentation on the CEIFiNS Database for additional detail.
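To make the relationships between these four entity types concrete, here is a minimal sketch in Python. It is not the actual Alhazen schema: the class fields, the ID strings, and the use of character offsets to index a Fragment back into its Item are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Fragment:
    """An excerpt of an Item, located by character offsets (an assumed indexing scheme)."""
    item_id: str
    start: int
    end: int


@dataclass
class Item:
    """A local copy of an information entity, e.g. the text of a paper."""
    item_id: str
    expression_id: str
    text: str

    def make_fragment(self, start: int, end: int) -> Fragment:
        # The Fragment stores only offsets, so it always points back into this Item.
        return Fragment(self.item_id, start, end)

    def resolve(self, fragment: Fragment) -> str:
        # Recover the excerpt text from the offsets held by the Fragment.
        return self.text[fragment.start:fragment.end]


@dataclass
class Expression:
    """A reference to a scientific information entity (hypothetical fields)."""
    expression_id: str
    title: str
    item_ids: List[str] = field(default_factory=list)


@dataclass
class Collection:
    """A named corpus that groups Expressions."""
    name: str
    expression_ids: List[str] = field(default_factory=list)


# Usage: one Collection holding one Expression, with one Item and one Fragment.
paper = Expression("expr-1", "An example paper")
full_text = Item("item-1", paper.expression_id, "Methods. We sampled cells.")
paper.item_ids.append(full_text.item_id)
corpus = Collection("example-corpus", [paper.expression_id])
excerpt = full_text.make_fragment(0, 8)
```

Storing offsets rather than copied text keeps each Fragment traceable to its source Item, which is the property the description above emphasizes.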

This initial system provides function calls for running either a single logical query or a collection of queries against Europe PMC and building a local database from the results.
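As a rough illustration of the single-query and multi-query cases, the sketch below builds requests for the Europe PMC REST search service and groups the returned records by collection name. The function names are hypothetical and are not Alhazen's own API. The endpoint URL, the `query`, `format`, and `pageSize` parameters, and the `resultList.result` response layout reflect the public Europe PMC web service as I understand it, and should be checked against its documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Europe PMC REST search service.
EPMC_SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(query, page_size=25):
    """Encode one logical query as a search URL."""
    params = {"query": query, "format": "json", "pageSize": page_size}
    return EPMC_SEARCH_URL + "?" + urlencode(params)


def fetch_json(url):
    """Download and decode a JSON response (performs a network call)."""
    with urlopen(url) as response:
        return json.load(response)


def parse_results(payload):
    """Keep a few identifying fields from each returned record."""
    records = payload.get("resultList", {}).get("result", [])
    return [
        {"id": r.get("id"), "source": r.get("source"), "title": r.get("title")}
        for r in records
    ]


def build_collections(named_queries, fetch=fetch_json):
    """Run each named query and return {collection name: list of records}."""
    return {
        name: parse_results(fetch(build_search_url(query)))
        for name, query in named_queries.items()
    }
```

A single query is then the one-entry case, for example `build_collections({"my corpus": "cryo-electron tomography"})`. Passing `fetch` as a parameter lets the same code run against a canned response without network access.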

This data can then be queried and analyzed with LLM-based tools and methods, which record the results of those analyses in the database as Notes. The Notes can in turn be analyzed to generate Summaries.
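The two-stage flow from stored text to Notes and from Notes to Summaries can be sketched as follows. This is an illustration under stated assumptions, not the platform's implementation: the `Note` and `Summary` fields are hypothetical, and `llm` stands in for any callable that maps a prompt string to a response string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Note:
    """Result of one analysis, attached to the entity it describes (hypothetical fields)."""
    about_id: str
    content: str


@dataclass
class Summary:
    """Result of analyzing a set of Notes together (hypothetical fields)."""
    note_count: int
    content: str


def analyze(texts: Dict[str, str], prompt: str, llm: Callable[[str], str]) -> List[Note]:
    """Stage 1: run the LLM once per stored text and record each answer as a Note."""
    return [Note(text_id, llm(prompt + "\n\n" + text)) for text_id, text in texts.items()]


def summarize(notes: List[Note], prompt: str, llm: Callable[[str], str]) -> Summary:
    """Stage 2: run the LLM over the collected Notes to produce one Summary."""
    joined = "\n".join(note.content for note in notes)
    return Summary(len(notes), llm(prompt + "\n\n" + joined))
```

Keeping Notes as stored records, rather than passing raw LLM output straight into the summarization step, means each Summary can be traced back to the individual analyses it was built from.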