High-fidelity MT training data is always important, and even more so for medical subjects. This is a must-have corpus for anyone seeking pharma-related data.
It covers product features, dosage and usage recommendations, laboratory analysis and clinical trials for several types of medicines. Parts of the corpus describe the symptoms, diagnosis and treatment of various maladies, patient profiles and common side effects of the medicines used in treatment. Furthermore, the data contains references to various medical regulatory bodies and established regulations.
We created this corpus together with the Universitat Autònoma de Barcelona, based on a bilingual query corpus carefully aligned and validated by the university. This high-quality query corpus resulted in highly relevant and clean data: "Without a doubt, TAUS Matching Data contributed greatly to increasing both the size and quality of the university corpus", in the words of the Universitat Autònoma de Barcelona.