• AutoTL;DR [bot]
    27 months ago

    This is the best summary I could come up with:


    MENLO PARK, California, Sept 28 (Reuters) - Meta Platforms (META.O) used public Facebook and Instagram posts to train parts of its new Meta AI virtual assistant, but excluded private posts shared only with family and friends in an effort to respect consumers’ privacy, the company’s top policy executive told Reuters in an interview.

    “We’ve tried to exclude datasets that have a heavy preponderance of personal information,” said Nick Clegg, Meta’s president of global affairs, adding that the “vast majority” of the data used by Meta for training was publicly available.

    The product will be able to generate text, audio and imagery and will have access to real-time information via a partnership with Microsoft’s (MSFT.O) Bing search engine.

    The public Facebook and Instagram posts were used to train Emu for the image generation elements of the product, while the chat functions were based on Llama 2 with some publicly available and annotated datasets added, a Meta spokesperson told Reuters.

    Some companies with image-generation tools facilitate the reproduction of iconic characters like Mickey Mouse, while others have paid for the materials or deliberately avoided including them in training data.

    OpenAI, for instance, signed a six-year deal with content provider Shutterstock this summer to use the company’s image, video and music libraries for training.


    The original article contains 603 words, the summary contains 201 words. Saved 67%. I’m a bot and I’m open source!