Matching Data as a Service

Matching Data is a clustered search technology applied to the TAUS Data Cloud repository and to web-crawled data. It takes an example data set (a query corpus) and returns matches by relevance at the segment level, across files and domains.
With this methodology, developers of MT engines can create high-fidelity data sets tuned to their own domains.
The approach is based on DatAptor, a joint research project of the University of Amsterdam, TAUS, Intel, and EC DGT.

Domain-specific or cross-domain matching
Tailored to your search requirements
Clustered search within the TAUS Data Cloud or on TAUS-crawled data

How it works

  1. Search sample submission

     The data buyer provides a search sample and criteria (domain, languages).

  2. Data matching

     TAUS runs an algorithm to identify the best-matching data in the TAUS Data Cloud repository.

  3. Selection creation

     Data selections with different matching rates are created (Compact, Medium, Large).

  4. Selection review

     The data buyer reviews the selections and chooses the best-fitting one(s).

  5. Payment and download

     The data is ready for download once payment is complete.
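
To make steps 2 and 3 more concrete, here is a minimal Python sketch of segment-level matching against a search sample, followed by grouping the results into Compact, Medium and Large selections. It is an illustration only, not the TAUS algorithm: the word-overlap (Jaccard) score, the function names and the threshold values are all assumptions chosen for the example.

```python
from typing import Dict, List, Set


def tokens(segment: str) -> Set[str]:
    """Lowercase a segment and split it into a set of word tokens."""
    return set(segment.lower().split())


def relevance(segment: str, sample: List[str]) -> float:
    """Best Jaccard word overlap between a segment and any sample segment.

    Assumption: a simple stand-in for a real relevance model.
    """
    seg = tokens(segment)
    best = 0.0
    for query in sample:
        q = tokens(query)
        union = seg | q
        if union:
            best = max(best, len(seg & q) / len(union))
    return best


# Hypothetical cut-offs: a stricter threshold gives a smaller selection
# with a higher matching rate.
THRESHOLDS: Dict[str, float] = {"Compact": 0.6, "Medium": 0.3, "Large": 0.1}


def build_selections(repository: List[str], sample: List[str]) -> Dict[str, List[str]]:
    """Score every repository segment and group it into nested selections."""
    scored = [(relevance(segment, sample), segment) for segment in repository]
    # Highest-scoring segments first; ties keep their repository order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return {
        name: [segment for score, segment in scored if score >= cutoff]
        for name, cutoff in THRESHOLDS.items()
    }


if __name__ == "__main__":
    sample = ["restart the printer driver"]
    repository = [
        "restart the printer driver now",
        "install the printer",
        "the weather is nice today",
        "quarterly revenue grew",
    ]
    for name, segments in build_selections(repository, sample).items():
        print(name, len(segments))
```

In this sketch the selections are nested: every segment in Compact also appears in Medium and Large, so a buyer reviewing them in step 4 is trading a higher matching rate against a larger volume of data.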

How to get started?

Do you have a query corpus to submit?
Request Matching Data
Contact us to get more information