Matching Data as a Service
Matching Data is a clustered search technology applied to the TAUS Data Cloud repository
and to web-crawled data. Given an example data set, Matching Data returns matches
according to relevance at the segment level, across files and domains.
With this methodology, developers of MT engines can create high-fidelity data sets
tuned to their own domains.
This new approach is based on DatAptor, a joint research project between
the University of Amsterdam, TAUS, Intel and EC DGT.
How it works
1. Search sample submission
   The data buyer provides a search sample and criteria (domain, languages).
2. Data matching
   TAUS runs an algorithm to identify the best-matching data in the TAUS Data repository.
3. Selection creation
   Data selections with different matching rates are created (Compact, Medium, Large).
4. Selection review
   The data buyer reviews the selections and chooses the best-fitting one(s).
5. Payment and download
   The data are ready for download after payment.
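
To make steps 2 and 3 concrete, here is a minimal Python sketch of segment-level matching followed by tiered selection. It is an illustration only: the page does not describe the TAUS algorithm, so the token-overlap score, the function names (`tokens`, `relevance`, `build_selections`), the cutoff values, and the reading of "matching rate" as a minimum relevance score (with Compact as the strictest tier) are all assumptions.

```python
from typing import Dict, List, Set


def tokens(segment: str) -> Set[str]:
    """Lowercased whitespace tokens; a deliberately simple stand-in."""
    return set(segment.lower().split())


def relevance(segment: str, sample_vocab: Set[str]) -> float:
    """Fraction of the segment's distinct tokens that also occur in the sample."""
    seg = tokens(segment)
    if not seg:
        return 0.0
    return len(seg & sample_vocab) / len(seg)


def build_selections(sample: List[str],
                     repository: List[str],
                     cutoffs: Dict[str, float]) -> Dict[str, List[str]]:
    """Score every repository segment against the sample, then build one
    selection per cutoff (hypothetical tiers, not the TAUS definitions)."""
    sample_vocab: Set[str] = set()
    for s in sample:
        sample_vocab |= tokens(s)
    # Score each segment individually, regardless of which file it came from.
    scored = [(relevance(seg, sample_vocab), seg) for seg in repository]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return {
        name: [seg for score, seg in scored if score >= cutoff]
        for name, cutoff in cutoffs.items()
    }


# Toy data: one sample segment and three candidate segments.
sample = ["restart the server after installing the update"]
repository = [
    "restart the server after the update",
    "the server must restart after the upgrade",
    "bake the cake for forty minutes",
]
# Assumed cutoffs: a higher cutoff gives a smaller, closer-matching selection.
cutoffs = {"Compact": 0.8, "Medium": 0.5, "Large": 0.1}

selections = build_selections(sample, repository, cutoffs)
for name, segments in selections.items():
    print(name, len(segments))
```

In this sketch each larger selection contains the smaller ones, so a buyer reviewing them trades closeness of match for volume. A production system would likely use a stronger relevance model than token overlap.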