WP2 Architectural design

Design and build a database to hold neutron monitor data and make this data available to users.

The prototype database (db01.nmdb.eu) is currently running on a single-core AMD64 machine with 1 GB RAM and a 300 GB software RAID partition for storage. All stations send their data to this machine and all data is downloaded from it; no NMDB mirrors have been set up yet. During the prototype phase, the layout of the tables will be checked for completeness. If any changes are required, they will be discussed at the midterm meeting and implemented for the final database in early 2009. A single-server setup for both storing and retrieving data may become a bottleneck if large amounts of data are retrieved regularly. There are several ways this might be improved; we are trying to implement and test them using additional or virtual servers.

Separate servers for storing and retrieving data

With one server that only receives data, and a mirror of that data on a second server that is used for queries, a query that takes a long time to complete does not block write access.
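As a rough illustration of this split, the sketch below picks a host based on whether a statement writes or reads. The mirror host name and the list of write verbs are assumptions made for the example; only db01.nmdb.eu appears in this document, and its role as the write-only server is itself assumed here.

```python
# Minimal sketch of read/write splitting.
# ASSUMPTION: db01.nmdb.eu acts as the write-only server.
INGEST_HOST = "db01.nmdb.eu"
# ASSUMPTION: hypothetical query mirror, not an existing NMDB host.
QUERY_HOST = "mirror01.example.org"

# Statements that modify data go to the ingest server (illustrative list).
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE"}


def pick_host(sql: str) -> str:
    """Return the host that a SQL statement should be sent to."""
    words = sql.strip().split(None, 1)
    verb = words[0].upper() if words else ""
    return INGEST_HOST if verb in WRITE_VERBS else QUERY_HOST
```

A station upload such as `pick_host("INSERT INTO ...")` would go to the ingest host, while any `SELECT` would go to the mirror, so a slow download never holds up incoming data.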

Separate servers for current and archival data

Queries that download long, high-resolution time series take a long time to complete and can block the server for write access or for real-time downloads. With one server for archival data and a separate server for current data (the last 3-12 months), real-time queries are not blocked, since they use the server that holds only short time series.
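The routing decision for this option could look roughly like the following sketch. The 180-day window is an assumed value inside the 3-12 month range mentioned above, and the server labels are placeholders.

```python
from datetime import datetime, timedelta

# ASSUMPTION: the "current" server keeps 180 days of data
# (an arbitrary value within the 3-12 month range).
CURRENT_WINDOW = timedelta(days=180)


def pick_server(query_start: datetime, now: datetime) -> str:
    """Return 'current' if the requested range starts inside the
    window held by the current-data server, else 'archive'."""
    if query_start >= now - CURRENT_WINDOW:
        return "current"
    return "archive"
```

A request starting a month ago would be answered by the current-data server, while a request reaching back several years would be sent to the archive server and so cannot delay real-time access.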

Load balancing / Cluster of servers

By setting up a cluster of servers, downloads could be redirected to the server with the smallest load, so that more queries could be performed without interfering with each other. This setup requires an extra server for the load balancing.
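The selection step performed by the load balancer can be sketched as below. How the load figures are measured is not specified in this document; the example simply assumes each server reports one number where smaller means less busy, and the server names are hypothetical.

```python
from typing import Mapping


def least_loaded(loads: Mapping[str, float]) -> str:
    """Return the name of the server with the smallest reported load.

    ASSUMPTION: `loads` maps a server name to a single load figure;
    what that figure measures is left open here.
    """
    if not loads:
        raise ValueError("no servers available")
    # min() over the keys, compared by their load value.
    return min(loads, key=lambda name: loads[name])
```

For example, `least_loaded({"a": 0.9, "b": 0.2, "c": 0.5})` selects `"b"`.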

The best approach for NMDB would probably be a combination of these solutions: one server that is only used to store data (possibly with regional servers that combine data from several stations and send it to the central server). From this server the data is replicated to read-only mirrors that contain the full data set. Additionally, there would be mirrors that hold only the data from the last few months for real-time applications.
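One way to picture the replication step in this combined layout is the sketch below, which decides which mirrors should receive a given record. The 90-day window stands in for "the last few months" and is an assumption, as are the mirror names used in the usage note.

```python
from datetime import datetime, timedelta
from typing import List, Sequence

# ASSUMPTION: "the last few months" is taken as 90 days for illustration.
RECENT_WINDOW = timedelta(days=90)


def replication_targets(
    record_time: datetime,
    now: datetime,
    full_mirrors: Sequence[str],
    recent_mirrors: Sequence[str],
) -> List[str]:
    """Return the mirrors that should receive a record.

    Full mirrors receive every record; recent-data mirrors receive
    only records that fall inside RECENT_WINDOW.
    """
    targets = list(full_mirrors)
    if record_time >= now - RECENT_WINDOW:
        targets.extend(recent_mirrors)
    return targets
```

With `full_mirrors=["full1"]` and `recent_mirrors=["recent1"]`, a record from last week would be replicated to both mirrors, whereas a record from several years ago would go to `full1` only.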

Further ideas