Health systems are among the most prolific generators of massive databases. Health system managers, physicians and users are also avid consumers of the knowledge derived from this accumulated experience. Surprisingly, modern data mining techniques cannot be applied directly to this huge amount of data. There are two main reasons for this: the data are distributed across sites, and there are legal and professional reasons for not sharing clinical databases.

MIND aims to design software tools that perform data mining while avoiding these limitations. The project will develop distributed genetic algorithms that extract global knowledge by performing local search at each site and migrating only the results between sites. MIND preserves the privacy of the results and overcomes the problems that arise from security policies in server communications and from maintaining coherence between different database structures. Figure 1 shows the physical implementation of this approach.
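To make the idea of local search plus migration concrete, the following is a minimal sketch of an island-model genetic algorithm, not MIND's actual design. Everything specific in it is an assumption made for illustration: the rule encoding (a bit mask over binary features), the accuracy-based fitness, the ring migration topology, and all function and parameter names. The point it illustrates is that each site evaluates candidates only against its own records, and only candidate rules, never records, travel between sites.

```python
import random
from itertools import product

def fitness(rule, records):
    """Fraction of a site's records that the rule classifies correctly.
    Hypothetical encoding: predict 1 when every feature selected by the mask is 1."""
    correct = 0
    for features, label in records:
        predicted = int(all(f for f, used in zip(features, rule) if used))
        correct += predicted == label
    return correct / len(records)

def evolve(population, records, rng, mutation_rate=0.1):
    """One generation of local search on a single site: elitism,
    binary tournament selection, one-point crossover, bit-flip mutation."""
    def pick():
        a, b = rng.sample(population, 2)
        return a if fitness(a, records) >= fitness(b, records) else b
    children = [max(population, key=lambda r: fitness(r, records))]  # keep the best
    while len(children) < len(population):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
        children.append(child)
    return children

def migrate(islands, local_data):
    """Ring migration: each site's best rule replaces the worst rule on the
    next site. Only rules cross site boundaries; records stay where they are."""
    bests = [max(pop, key=lambda r, d=data: fitness(r, d))
             for pop, data in zip(islands, local_data)]
    for i, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda j: fitness(pop[j], local_data[i]))
        pop[worst] = bests[i - 1]  # index -1 wraps around, closing the ring

def run(local_data, n_features, pop_size=12, generations=30, migrate_every=5, seed=0):
    """Evolve one population per site and return each site's best rule."""
    rng = random.Random(seed)
    islands = [[tuple(rng.randint(0, 1) for _ in range(n_features))
                for _ in range(pop_size)] for _ in local_data]
    for gen in range(1, generations + 1):
        islands = [evolve(pop, data, rng) for pop, data in zip(islands, local_data)]
        if gen % migrate_every == 0:
            migrate(islands, local_data)
    return [max(pop, key=lambda r, d=data: fitness(r, d))
            for pop, data in zip(islands, local_data)]

# Toy data (invented for this sketch): label is 1 when features 0 and 2 are both 1.
records = [(f, int(f[0] and f[2])) for f in product((0, 1), repeat=4)]
sites = [records[0::2], records[1::2]]  # two sites, each holding half the records
best_rules = run(sites, n_features=4)
```

In this sketch the migration interval and ring topology are arbitrary choices; a real deployment would also have to carry the migrating rules over whatever communication channel the sites' security policies allow.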



Figure 1. Physical implementation of the MIND approach.