High-fidelity MT training data is always important, and even more so for medical subjects. This is a must-have corpus for anyone seeking pharma-related data.
It covers product features, dosage and usage recommendations, laboratory analysis and clinical trials for several types of medicines. Parts of the corpus describe the symptoms, diagnosis and treatment of various maladies, patient profiles and common side effects of the medicines used in treatment. The data also contains references to various medical regulatory bodies and established regulations.
We created this corpus together with the Universitat Autònoma de Barcelona, based on a bilingual query corpus carefully aligned and validated by the university. This high-quality query corpus resulted in highly relevant and clean data: "Without a doubt, TAUS Matching Data contributed greatly to increasing both the size and quality of the university corpus", in the words of the Universitat Autònoma de Barcelona.
Testimonial from Universitat Autònoma de Barcelona
The Universitat Autònoma de Barcelona has worked with TAUS on a project to gather
data from TAUS's Data Cloud platform. The process consisted of the university supplying TAUS with a corpus of
approximately 40K English-to-Spanish strings, made up of data from the European Medicines Agency
(EU) and from the FDA database, aligned by a student. The texts compiled in this corpus were
summaries of product characteristics and summaries for the public of several medicines.
TAUS used this corpus to search its Data Cloud for similar data in the language pair concerned and
returned output data with appropriate score ranges for similarity and proximity. The
university then assessed the quality of the output based on the field and degree
of specialisation of the data.
After this review, the university concluded that the data were of high quality and relevance to the
pharmaceutical and technical field of the original corpus. Without a doubt, TAUS Matching Data
contributed greatly to increasing both the size and quality of the university corpus.
The Universitat Autònoma de Barcelona is very grateful to TAUS for supporting the academic
research being carried out by one of its students in the field of machine translation, more precisely
in the training of a machine translation engine specialised in pharmaceuticals. We hope to be able
to collaborate with them again soon, given the good treatment received and TAUS's
commitment to the quality of its services.
English - Spanish