vAIsual Inc, the company behind the largest visual dataset collection in the world, today launched the first of its non-biometric datasets, consisting of over 130,000 images of elephants, dogs and birds.
The datasets are specially prepared to meet the needs of ML teams, offering detailed and consistent metatags, high-resolution images and, most importantly, legal clearances.
Self-service access to the datasets is via the Dataset Shop, established in 2022 by clean data specialists vAIsual Inc, and specifically catering for research and engineering teams training AI for a range of applications.
According to vAIsual CEO, Michael Osterrieder, these datasets are the first of thousands going live on the site in the next few weeks. “After concluding licensing deals for over 400 million images, we now have the largest collection of licensable images for AI training available anywhere.”
“We are excited to launch three new datasets that focus on elephants, dogs and birds respectively. Using our proprietary dataset building technology, we can now assemble datasets consisting of tens of thousands of images of a particular theme or subject.
Being able to collate and package these datasets saves hundreds of hours for engineers to prepare material for AI training,” says Osterrieder.
While reducing time is a core benefit, Osterrieder also emphasizes the importance of having full legal clearance.
“We are starting to see dataset disclosure requirements emerging in some jurisdictions, which will mean any AI model trained on scraped data will risk being blocked,” says Osterrieder.
The availability of legally clean datasets that also remunerate the original content creators is an important step toward ensuring that companies building AI technology do so ethically and responsibly.
“Offering custom-prepared datasets containing premium visual content, with the consent of the original copyright owners (or their legal representatives), is essential for the AI industry to mature into a truly commercial and viable industry,” says Osterrieder.
In the coming weeks, additional datasets will be added to datasetshop.com. The datasets are specially prepared for engineers to add to their AI training workflows and are commercially available in a variety of resolutions.
About Dataset Shop
First launched in 2022 by the “clean data guys”, vAIsual Inc’s Dataset Shop is a marketplace for visual media designed specifically for AI training purposes.
The online store initially sold the largest biometrically released human dataset, consisting of over 600,000 high-quality images, custom-shot for AI training.
The Dataset Shop is rapidly growing its collection of datasets through partnerships with stock agencies seeking to address the widespread problem of datasets scraped without the consent of copyright owners.
About vAIsual (pronounced v-eye-sual)
vAIsual was formed in 2020 by Michael Osterrieder and Nicolas Menijes, who were soon joined by industry veterans Mark Milstein and Istvan Novak. All founders are well connected to the IP licensing industry.
vAIsual covers the whole AI workflow, from dataset generation and delivery to optimizing training sessions. The company offers generated content to the commercial advertising industry as well as the machine learning industry.
vAIsual exclusively relies on ethically sourced and legally clean datasets.
vAIsual launches AI training datasets of elephants, dogs and birds on the Dataset Shop