E-commerce Corpus

* Special promotional price, 25% discount on all titles until May 31, 2019.
The discount is only valid for Matching Data purchases in EUR. It is not applicable to purchases with Data Cloud or Partner credits.
 Initiator: eBay
 Domain: E-commerce
 Language(s): French - Dutch, German - Polish, German - Italian, English - Italian

Reliable product descriptions and information are a crucial asset in any e-commerce environment. In these corpora you'll find carefully filtered and cleaned data on a great variety of product types, which will make it even easier for your global customers to click the 'Add to shopping cart' button!

Our e-commerce corpora range from technology items, office equipment, home furniture and pet accessories to clothing and beauty products. They also include gear for hobbies like photography, (motor)biking, fishing, skiing and gaming, and even some very special collector's items like stamps, coins, paintings, books, miniature cars, vinyl records and Panini sticker albums. These are just some examples of the products covered.

These corpora were created in collaboration with eBay Inc., which provided bilingual query data sets including a representative selection of key product descriptions. Using these as queries, we applied TAUS's proprietary Matching Data technology to extract the data from the TAUS Data Cloud, a large industry-shared repository of parallel corpora. According to eBay Inc.'s linguistic assessment, our corpora were found to be "of good quality and appropriate to consume as aligned corpora to that provided in the eBay sample".

French - Dutch
Corpus Size   Segments   Source Tokens   Target Tokens
Compact       60.5K      436K            371K
Medium        106K       809K            688K
Large         183K       1.5M            1.3M

German - Polish
Corpus Size   Segments   Source Tokens   Target Tokens
Compact       65.1K      383K            427K
Medium        110K       680K            755K
Large         187K       1.2M            1.3M

German - Italian
Corpus Size   Segments   Source Tokens   Target Tokens
Compact       109K       1.5M            1.6M
Medium        193K       2.8M            3.0M
Large         336K       5.1M            5.6M

English - Italian
Corpus Size   Segments   Source Tokens   Target Tokens
Compact       142K       1.5M            1.6M
Medium        259K       2.9M            3.2M
Large         459K       5.5M            5.9M

Testimonial from eBay

eBay Inc. has worked with TAUS on a joint pilot project to enable data discovery within TAUS's Data Cloud corpora. The process consisted of eBay supplying TAUS with a sample of approximately 58,000 French/Dutch and 80,000 German/Polish parallel text segments, representing content that is aligned to eBay projects.

TAUS used the sample to query Data Cloud, across multiple domains, and retrieve the appropriate samples in three sample sizes based on different thresholds for similarity and proximity. eBay Inc. then performed a linguistic assessment of 500 given examples of this output. Our linguistic review rendered positive results and the content supplied by TAUS was of good quality and appropriate to consume as aligned corpora to that provided in the eBay sample.

We look forward to having data search and discovery features on Data Cloud, whereby a user is capable of discovering their own project aligned content as a consumable self-service. We believe this will allow TAUS and its members to drive increased value from the TAUS data assets and in turn, will likely continue to fuel growth in the pool of data and value-add services.

Prices per language pair and corpus size. Where two Euro amounts are listed, the second is the promotional price (25% discount).

French - Dutch
                                          Compact            Medium             Large
Member Price, Euro / Partner Credits      € 3600 (€ 2700)    € 4000 (€ 3000)    € 4400 (€ 3300)
Member Price, Data Cloud Credits          8 million          9 million          10 million
Non-Member Price, Euro                    € 4320 (€ 3240)    € 4800 (€ 3600)    € 5280 (€ 3960)

German - Polish
                                          Compact            Medium             Large
Member Price, Euro / Partner Credits      € 3700 (€ 2775)    € 4000 (€ 3000)    € 4500 (€ 3375)
Member Price, Data Cloud Credits          8 million          9 million          10 million
Non-Member Price, Euro                    € 4440 (€ 3330)    € 4800 (€ 3600)    € 5400 (€ 4050)

German - Italian
                                          Compact            Medium             Large
Member Price, Euro / Partner Credits      € 5400 (€ 4050)    € 7100 (€ 5325)    € 9000 (€ 6750)
Member Price, Data Cloud Credits          12 million         16 million         20 million
Non-Member Price, Euro                    € 6480 (€ 4860)    € 8520 (€ 6390)    € 10800 (€ 8100)

English - Italian
                                          Compact            Medium             Large
Member Price, Euro / Partner Credits      € 5500 (€ 4125)    € 7300 (€ 5475)    € 9300 (€ 6975)
Member Price, Data Cloud Credits          13 million         17 million         21 million
Non-Member Price, Euro                    € 6600 (€ 4950)    € 8760 (€ 6570)    € 11160 (€ 8370)
